(Apologies to Mr. Kubrick)
Digital Picture Exchange (DPX) has been the industry standard file format for film scanning since it was introduced in the mid-1990s. At that time, the only way to guarantee a high quality image was to work with purely uncompressed files. But technology has, as usual, caught up. It's time to adapt, and to rethink this file format for archival uses. The downsides far outweigh the benefits of DPX, and there is a viable alternative that's not only as good as DPX, it's technically better in some respects: ProRes 4444. Below we'll explain why we think this is the ideal file format for film scanning.
But first, a little about DPX:
DPX is derived from Kodak's Cineon file format, which you could think of as a kind of digital version of a film frame. Each frame of film is scanned to its own sequentially numbered image file. It was designed to represent the density of the film being scanned, in a digital format with no compression. It's pretty flexible, allowing for both logarithmic and linear color, different color spaces (RGB, YUV, Monochrome), and it's resolution independent.
But DPX has a major problem that has become a huge stumbling block for film archives: file size. A single 4k 10 bit DPX frame is about 50 megabytes. At 24 frames per second, one second of footage is approximately 1.2 gigabytes. One minute is about 72 gigabytes. A 20-minute reel of film is roughly 1.4 terabytes. An average feature film is about 6.5 terabytes, consisting of somewhere around 130,000 image files.
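If you want to run these numbers for your own material, here is a minimal Python sketch of the arithmetic. It uses the roughly 50 MB per frame and 24 frames per second quoted above, with decimal units (1 GB = 1000 MB). The function name and the 90-minute "average feature" length are our own assumptions for illustration.

```python
FRAME_MB = 50   # approximate size of one 4k 10 bit DPX frame, per the text
FPS = 24        # frames per second


def dpx_storage(seconds, frame_mb=FRAME_MB, fps=FPS):
    """Return (frame_count, size_in_gigabytes) for a DPX image sequence."""
    frames = seconds * fps
    return frames, frames * frame_mb / 1000


# The 90-minute running time is an assumed "average feature" length.
examples = [
    ("1 second", 1),
    ("1 minute", 60),
    ("20-minute reel", 20 * 60),
    ("90-minute feature", 90 * 60),
]

for label, seconds in examples:
    frames, gigabytes = dpx_storage(seconds)
    print(f"{label}: {frames:,} files, about {gigabytes:,.1f} GB")
```

Swap in your own frame size if you scan at a different resolution or bit depth; the frame count stays the same, only the total changes.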
Let's take a real-world scenario, one that happened here just a couple weeks ago: We scanned a 96-minute film to 4k 10bit DPX for a local film archive. Once scanned, the film was to be transferred to an 8TB WD EasyStore drive (easily capable of 180-200MB/Second sustained writes). The average real-world speed was 90MB/Second, for a total file copy time of 18 hours. This drive was then transferred to the client for color correction, where the same files needed to be offloaded to a local system, a process that took about 12-14 hours (read speeds are typically faster than writes). Once done, the final files need to be copied to another drive, so assuming they're using the same type of drive, that's another 18 hours. And no, USB3 isn't the limiting factor here: the fact that the computer must write each file separately, adding a massive amount of overhead, is the problem. Each frame needs to be written to disk and then cataloged by the file system so the computer knows where to find it later.
Digital asset management systems weren’t built to work with file sets of this size, and this is an especially big problem for archives dealing with the crush of data from high resolution image sequences. Many systems simply cannot import this much data, and those that can rapidly fill up, requiring more and more expensive nearline storage capacity.
So, let's assume you keep DPX files out of your asset management system and back them up independently. You'll have to factor in data migration in the future. As we all know, hard drives will fail, so the strategy a successful archive will use is one of constant data migration: move those files to new media every few years. As an archive adds more high-resolution film scans to its collection, the time spent migrating data increases substantially. And the physical space occupied by disk-based or tape-based backups will increase as well.
Enter compression
Yes, yes. We know compression is a dirty word in the archival world, but let's take a moment to consider it in this context.
Compression comes in many forms. Some (H.264, HEVC, MPEG-2, etc.) are designed for viewing of a final product on media such as optical discs or streaming. These are not file formats that are meant to be manipulated further, they're just for viewing, and as such, they're highly compromised: These codecs use extreme temporal and spatial compression, and the removal of large amounts of color data to make for a smaller file that's easier to move around. But you wouldn't want to use a file format like these for archival purposes. Access copies? Sure. Masters? Nope.
Other forms of compression are designed to maintain quality while reducing bandwidth and storage requirements. Some of these are completely lossless (run-length encoding, or RLE, for example). The problem with truly lossless compression schemes is that they rely on data repetition within the file, removing redundancy so that the decoded image is identical to the source. This is great if your files were born digital, but it saves very little on film scans, where the grain/dye in the film is an inherently random pattern that trips up this kind of compression.
That leaves lossy compression that's "visually lossless," designed for intermediates and masters -- compression that uses a light touch, but still reduces the file size by as much as 10x. Does this mean that the pixels are not absolutely identical to the original? Yes, it does. Does it matter? We'd argue no - if you're using the right kind of compression. That compression format is Apple's ProRes 4444 Codec.
What is ProRes 4444?
Apple released the first ProRes codec in the mid-2000s. It is an intra-frame compression format, designed to provide visually lossless compression for use in post production workflows. It's not meant to be an end-user viewable format (see above), it's designed to maintain maximum picture quality in a convenient package, that can be moved between various post production systems.
ProRes 4444 uses 4:4:4 color sampling to encode RGB log film scans, and a higher bit rate (less compression) than the ProRes 422 codecs. ProRes 4444 XQ takes this further by using even less compression. (However, 4444 XQ is a newer format and isn't as widely supported as ProRes 4444 -- yet).
ProRes 4444's color depth is 12 bit, unlike 10 bit DPX. This allows for more dynamic range and color information than 10 bit DPX, something that comes into play when scanning on a machine like our Lasergraphics ScanStation, which captures 12 bits (SDR) or 14 bits (HDR) of color depth.
ProRes has been adopted by all major computer-based edit, color correction, compositing, VFX and digital restoration systems on Mac, Windows and Linux computers. It's also widely used in digital cinema cameras from the Arri Alexa to the Blackmagic Cinema cameras. It is widely considered an industry standard format in the post production world. Apple continues to develop the codec, increasing its maximum resolution and image quality. Open source tools such as ffmpeg now have wide support for both reading and writing ProRes.
The main argument we hear against ProRes is that "it's proprietary and tied to a single company." We feel that argument is a bit of a straw man. The fact is, all motion picture formats - film, video, digital - originated as proprietary products for a commercial market. We (archives, filmmakers, scanning services, post production houses) work in an industry that develops the tools for us to use and they are inherently, to some degree, proprietary. While it would be nice to have a completely robust, reliable open source toolset for the work we do, that's unrealistic and simply isn't likely to happen.
Another argument is that Apple could pull the plug on ProRes. We don't see that as likely, since they're continuing to develop it. But even if they did, there are already so many implementations of it in the wild that it really wouldn't matter. Reading and writing ProRes files can happen with or without Apple, for a very long time.
There is no viable open source alternative that ticks all the boxes ProRes 4444 does. While laudable, efforts at making such a format (see: FFV1) have no real support outside of a handful of applications, which means working with it involves multiple levels of transcoding to get into and out of the file format. There doesn't appear to be any interest on the part of the software companies that make post production tools to work with FFV1, so we think it's a non-starter until there's broad industry support.
Proving ProRes is best, empirically
Fair warning: we're going to get into the weeds here for a bit. Skip to the bottom for our conclusion if you don't want to look at fancy charts and graphs.
We scanned some 35mm film to 10bit DPX, ProRes 4444 and ProRes 4444XQ, and analyzed the images, along with some purely digital test patterns. Here is our scanned image, a standard color chart, shot on 35mm color negative and scanned at 5.8k Full Aperture, on our Lasergraphics ScanStation:
And here's a detail of the image, scaled up 300% so you can see that there is no perceptual difference between the DPX and ProRes files.
But why trust just our eyes? We analyzed three aspects of the image: DeltaE2000, PSNR and SSIM.
First up is DeltaE2000: This is a standard measure of perceptual difference between two colors. In this case, we're comparing ProRes 4444 and ProRes 4444XQ to DPX scans. A DeltaE2000 value below 2.0 is considered to be imperceptible. The chart below shows the DeltaE2000 value for each of the colors on the color chart in the reference image: Red, Green, Blue, Cyan, Magenta, Yellow, Grey. Both ProRes 4444 and ProRes 4444XQ have a DeltaE2000 value as low as 0.08 and as high as 0.33, depending on the color. This is well below the 2.0 threshold for a perceptible color change.
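For readers who want to check color differences on their own scans, here is a sketch of the standard CIEDE2000 formula in plain Python. It assumes you have already converted the two colors to CIE L*a*b* (that conversion isn't shown), and it isn't necessarily the tool we used for the chart. The function name and the 2.0 threshold constant are ours; the threshold itself comes from the paragraph above.

```python
import math

PERCEPTIBLE_THRESHOLD = 2.0  # per the text: below this is considered imperceptible


def ciede2000(lab1, lab2):
    """CIEDE2000 color difference between two (L*, a*, b*) tuples."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2

    # Step 1: adjust a* for chroma-dependent non-uniformity, then get chroma and hue.
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    g = 0.5 * (1 - math.sqrt(c_bar**7 / (c_bar**7 + 25**7)))
    a1p, a2p = (1 + g) * a1, (1 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360
    h2p = math.degrees(math.atan2(b2, a2p)) % 360

    # Step 2: differences in lightness, chroma and hue.
    d_lp = L2 - L1
    d_cp = c2p - c1p
    if c1p * c2p == 0:
        d_hp_deg = 0.0
    else:
        d_hp_deg = h2p - h1p
        if d_hp_deg > 180:
            d_hp_deg -= 360
        elif d_hp_deg < -180:
            d_hp_deg += 360
    d_hp = 2 * math.sqrt(c1p * c2p) * math.sin(math.radians(d_hp_deg / 2))

    # Step 3: means, weighting functions and the rotation term.
    l_bar = (L1 + L2) / 2
    c_barp = (c1p + c2p) / 2
    if c1p * c2p == 0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        h_bar = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        h_bar = (h1p + h2p + 360) / 2
    else:
        h_bar = (h1p + h2p - 360) / 2

    t = (1
         - 0.17 * math.cos(math.radians(h_bar - 30))
         + 0.24 * math.cos(math.radians(2 * h_bar))
         + 0.32 * math.cos(math.radians(3 * h_bar + 6))
         - 0.20 * math.cos(math.radians(4 * h_bar - 63)))
    d_theta = 30 * math.exp(-(((h_bar - 275) / 25) ** 2))
    r_c = 2 * math.sqrt(c_barp**7 / (c_barp**7 + 25**7))
    s_l = 1 + 0.015 * (l_bar - 50) ** 2 / math.sqrt(20 + (l_bar - 50) ** 2)
    s_c = 1 + 0.045 * c_barp
    s_h = 1 + 0.015 * c_barp * t
    r_t = -math.sin(math.radians(2 * d_theta)) * r_c

    return math.sqrt((d_lp / s_l) ** 2 + (d_cp / s_c) ** 2 + (d_hp / s_h) ** 2
                     + r_t * (d_cp / s_c) * (d_hp / s_h))


def is_perceptible(lab_reference, lab_test):
    """True if the difference reaches the 2.0 threshold cited in the text."""
    return ciede2000(lab_reference, lab_test) >= PERCEPTIBLE_THRESHOLD
```

In practice you would average the pixels inside each chart patch, convert that average to L*a*b*, and compare the DPX patch against the matching ProRes patch.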
The next two graphs, PSNR and SSIM, are complementary to each other. Peak Signal-to-Noise Ratio (PSNR) is an objective absolute error metric that operates pixel by pixel. The graph below shows the numerical quality difference between ProRes 4444 and 4444 XQ encodings, and summarizes artifacts and noise that arise from compression. (The DPX original is not plotted, as when you compare an image to itself the PSNR value is +Infinity). Good PSNR values should fall between 50 dB and 65 dB for 12 bit files, such as ProRes. The grain in these images lowers that number slightly.
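The PSNR calculation itself is short. Here's a minimal sketch in plain Python, assuming the two images have been flattened to equal-length lists of 12 bit code values (0 to 4095). The function name and the `max_value` parameter are ours, and this isn't necessarily how our analysis tool does it internally.

```python
import math


def psnr(reference, test, max_value=4095):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    max_value is the largest possible code value (4095 for 12 bit data).
    """
    if len(reference) != len(test):
        raise ValueError("images must have the same number of samples")
    # Mean squared error, computed pixel by pixel.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images, as noted for the DPX original
    return 10 * math.log10(max_value ** 2 / mse)
```

Comparing an image against itself returns infinity, which is why the DPX reference has no bar on the graph.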
The issue with PSNR is that it's an objective measurement and does not account for human perception. For that reason we also plotted the Structural Similarity Index Measure (SSIM). Unlike PSNR, SSIM factors in subjective qualities of human perception such as our sensitivity to detail depending on contrast and luminance.
The maximum SSIM index value is 1.0 which represents the DPX reference image. The ProRes variants are slightly below the DPX image in regard to SSIM value but pay attention to the Y axis range. The difference is ~0.001, which is a very low value. Once again, this value is so low that it's below perception.
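To give a feel for what SSIM measures, here is a simplified sketch that applies the SSIM formula once across a whole image. Real SSIM tools compute it over small local windows and average the results, so treat this single-window version as an illustration under that stated simplification, not a replacement for a proper implementation. The function name is ours; the 0.01 and 0.03 constants are the usual defaults from the SSIM definition.

```python
def global_ssim(reference, test, max_value=4095):
    """Single-window SSIM between two equal-length pixel lists (simplified).

    max_value is the largest possible code value (4095 for 12 bit data).
    """
    n = len(reference)
    mean_r = sum(reference) / n
    mean_t = sum(test) / n
    # Variances and covariance capture contrast and structure.
    var_r = sum((r - mean_r) * (r - mean_r) for r in reference) / n
    var_t = sum((t - mean_t) * (t - mean_t) for t in test) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(reference, test)) / n
    # Small stabilizing constants, scaled to the data range.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    return ((2 * mean_r * mean_t + c1) * (2 * cov + c2)) / (
        (mean_r ** 2 + mean_t ** 2 + c1) * (var_r + var_t + c2))
```

An image compared with itself scores 1.0, and any change to brightness, contrast or structure pulls the score below that.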
Next up is Modulation Transfer Function (MTF), a measure of how well a system can record spatial frequency detail. For this test we used a synthetic test pattern rather than the film scan, so that we can see the pure result of what the compression does. Here's the test image we used (converted to DPX, ProRes 4444 and ProRes 4444 XQ):
The Y axis on an MTF plot is Spatial Frequency Response (SFR), or how much of the original modulation (contrast) survives. The X axis is spatial frequency, in cycles (or line pairs) over some distance. That distance can be mm if your goal is to measure the MTF of a real camera system. Since we are interested in the MTF of a codec, we use a synthetic MTF target and measure the distance in pixels. The goal is to have the highest SFR for the highest cycles/pixel. For the purposes of examining DPX vs ProRes 4444, the absolute MTF was not of that much interest. Instead, we're looking at how much the curves deviate from each other, as that represents a sharpness degradation due to compression. Similar to the DeltaE2000 metric, there is little error and the curves sit right on top of each other. The result shows that there is no degradation to sharpness with ProRes 4444 vs DPX. All four curves are effectively identical.
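The idea behind a single point on an MTF curve can be sketched in a few lines of Python: generate a sine pattern at a known frequency, run it through some process, and compare the modulation before and after. In this sketch a simple three-pixel blur stands in for "the process". It is emphatically not ProRes, just something that visibly softens the pattern. All function names here are hypothetical, and this is not the analysis method we used for the plot.

```python
import math


def sine_row(cycles_per_pixel, width=256, mean=0.5, amplitude=0.4):
    """One row of a synthetic sine target at the given spatial frequency."""
    return [mean + amplitude * math.sin(2 * math.pi * cycles_per_pixel * x)
            for x in range(width)]


def modulation(samples):
    """Michelson contrast: (max - min) / (max + min)."""
    hi, lo = max(samples), min(samples)
    return (hi - lo) / (hi + lo)


def box_blur3(samples):
    """Three-pixel average over interior pixels (a stand-in for a lossy process)."""
    return [(samples[i - 1] + samples[i] + samples[i + 1]) / 3
            for i in range(1, len(samples) - 1)]


def response_at(cycles_per_pixel):
    """Fraction of the original modulation that survives the blur."""
    reference = sine_row(cycles_per_pixel)
    processed = box_blur3(reference)
    # Compare against the same interior pixels the blur produced.
    return modulation(processed) / modulation(reference[1:-1])


for frequency in (0.03125, 0.0625, 0.125, 0.25):
    print(f"{frequency} cycles/pixel: response {response_at(frequency):.3f}")
```

A process that preserves sharpness keeps the response near 1.0 at every frequency, which is what overlapping curves on the plot indicate.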
But, But - COMPRESSION IS BAD!
So they say. But really ...is it?
Compression on the order of what ProRes offers has been with us for a long time. While most of the broadcast and post production industry considers Digital Betacam (the highest quality Standard Definition videotape format) to be "uncompressed," it used a 2.34:1 DCT compression scheme. And HDCAM SR, the pinnacle of High Definition videotape formats, was also considered to be uncompressed. Yet, it was based on MPEG-4 intra-frame compression - a cousin of ProRes. Put simply, the amount of compression used by the highest quality tape formats, as well as intermediate formats like ProRes, is so minimal that it's effectively inconsequential. In comparison, the generation loss one sees with a duplicate film element is significantly higher than the generation loss from re-compressing a ProRes file.
One of the biggest misconceptions about ProRes is that it's subject to quality loss from multiple generations of compression. While this is of course true, (eventually, with enough generations), in real-world usage it's simply not a problem. We performed a series of tests, recompressing the same images we used above to the same flavor of ProRes, for 10 generations (Gen0->Gen1->Gen2->Gen3 and so on). The results are pretty impressive:
See for yourself. Just like the example above, here's a 300% blowup of a detail on the original scan, alongside the 10th generation ProRes 4444 file. Note that we purposely chose the ProRes 4444 flavor because it has a higher compression ratio than 4444 XQ, in the hopes that it would reveal generational loss faster.
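If you'd like to try a generation test like this yourself, the recompression chain can be scripted. The Python sketch below only builds the command lines, using ffmpeg's `prores_ks` encoder with its `4444` profile. This is an assumption for illustration and not a description of the tools we used: ffmpeg's encoder is a third-party implementation rather than Apple's, and the 10 bit `yuv444p10le` pixel format and the output file names are our own choices.

```python
def prores_generation_commands(source, generations=10, profile="4444"):
    """Build ffmpeg commands that re-encode a file to ProRes N times in a chain.

    Each generation is encoded from the previous generation's output,
    i.e. Gen0 -> Gen1 -> Gen2 and so on.
    """
    commands = []
    previous = source
    for generation in range(1, generations + 1):
        output = f"gen{generation:02d}.mov"  # hypothetical naming scheme
        commands.append([
            "ffmpeg",
            "-i", previous,
            "-c:v", "prores_ks",         # ffmpeg's ProRes encoder
            "-profile:v", profile,       # "4444" or "4444xq"
            "-pix_fmt", "yuv444p10le",   # assumed 4:4:4 pixel format
            output,
        ])
        previous = output
    return commands


for command in prores_generation_commands("gen00.mov"):
    print(" ".join(command))
```

To actually run the chain, pass each command to `subprocess.run(command, check=True)`, then compare the last generation against the first with the metrics above.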
Summary
As you can see, there is no perceptual difference between DPX and ProRes 4444. Yes, ProRes is compressed, but compression isn't an inherently bad thing if it's used judiciously. ProRes 4444 is an excellent format for high quality archival film scan masters on several levels. The minimal level of compression used by the codec is more than a fair trade-off for the broader software compatibility, smaller file size, speed of copying, ease of archiving, and really just general ease of use, that you gain over DPX. ProRes files can be imported into most Digital Asset Management systems, don't require high performance RAIDs to play smoothly, and can be viewed in almost all major video-related applications, both commercial and open source.
Here's a handy bullet list of the Pros and Cons of DPX vs ProRes
DPX

Pros | Cons |
---|---|
No Compression | Massive file sizes |
Supports Log/Linear color | Only 10 and 16 bit widely supported |
Supports up to 16bit color depth | |

ProRes 4444

Pros | Cons |
---|---|
Imperceptible Compression | No 16 bit variant |
Supports Log/Linear color | |
Approx. 8x smaller than DPX at the same resolution | |
Supports up to 12 bit color depth | |
Generation loss not a factor in real-world usage | |
Broader software support than DPX | |
Bottom line: Don't let the perfect be the enemy of the good (or in this case, the great). If media management of DPX is causing you headaches, why not work with a format that's arguably better on almost all fronts?
Let us know if you have any questions and feel free to Contact us for a quote any time!