The Raw Deal
Why You Should Use The Raw File Format With Digital Cameras


Full-color target.

The most common image format used these days with digital cameras is JPEG (Joint Photographic Experts Group). The obvious limitation of JPEG is its lossy compression: excellent at shrinking files, but even at low compression ratios the image degrades slightly. (A lossless JPEG variant exists but is rarely used in cameras.) More important, the images undergo heavy color, exposure, noise, and sharpening processing in the camera, which limits further post-processing. JPEG works best for images that need no substantial post-processing (which is rare if you are demanding) or under circumstances where such post-processing is not an option. Many photographers try to get the best possible quality out of their cameras by using TIFF or, if the camera supports it, a vendor-specific raw file format. As we will see, raw file formats allow the most post-processing flexibility.


Gray scale picture seen by the sensor.

On The Sensor Level
To better understand what these magic raw file formats are, we need to understand how most of today's digital cameras work. All our new digital cameras capture color photos, right? Yes, in the end you get your color images, but most modern digital cameras have sensors that can only record gray scale values (the Foveon X3 sensor, digital scanning backs, and multi-shot digital backs are the exceptions).

Assume we want to photograph a box of Crayola crayons.

A gray scale sensor would see the picture like this, and you would never get any color photos at all. How can we use a gray scale sensor to capture color photos? Engineers at Kodak came up with the following scheme, called the Bayer Pattern (Dr. Bryce Bayer, a Kodak scientist, invented this novel Color Filter Array configuration in the mid-1970s, hence the name Bayer Pattern). There are also other pattern variations in use:
r-g-r-g-r-g
g-b-g-b-g-b
r-g-r-g-r-g
g-b-g-b-g-b


Color mosaic seen through the color filters.

First, it is interesting to note that 50 percent of the filters are green and only 25 percent each are red and blue. The reason is that the human eye can differentiate far more shades of green than of red or blue; the eye's luminance sensitivity also peaks in the green part of the visible spectrum. Thus, the sensor captures gray values filtered through these color filters.

However, we want to have a photo with full color information for every pixel. Here a software trick comes into play called Bayer Pattern demosaicing, or color interpolation. What actually happens is that the missing RGB information is estimated from the neighboring pixels (for a more in-depth discussion go to http://ise.stanford.edu/class/psych221/98/demosaic/kodak/).
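To make the two steps concrete, here is a minimal sketch in Python/NumPy: first sampling a full-RGB image through an RGGB Bayer filter array, then a naive bilinear demosaicing that estimates each missing channel from the nearest sites of that color. This is an illustration of the principle only; real converters use far more sophisticated, edge-aware algorithms, and the function names here are our own.

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample a full-RGB image through an RGGB Bayer filter:
    each photo site keeps only one channel's gray value."""
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w))
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red sites
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green sites
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green sites
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue sites
    return mosaic

def demosaic_bilinear(mosaic):
    """Naive bilinear demosaicing: each missing channel value is
    the mean of that channel's sites in the 3x3 neighborhood."""
    h, w = mosaic.shape
    out = np.zeros((h, w, 3))
    masks = [np.zeros((h, w), bool) for _ in range(3)]
    masks[0][0::2, 0::2] = True                               # red
    masks[1][0::2, 1::2] = True; masks[1][1::2, 0::2] = True  # green
    masks[2][1::2, 1::2] = True                               # blue
    padded = np.pad(mosaic, 1)
    for c, mask in enumerate(masks):
        pm = np.pad(mask, 1)
        for y in range(h):
            for x in range(w):
                if mask[y, x]:
                    out[y, x, c] = mosaic[y, x]  # measured directly
                else:
                    win = padded[y:y + 3, x:x + 3]
                    wm = pm[y:y + 3, x:x + 3]
                    out[y, x, c] = win[wm].mean() if wm.any() else 0.0
    return out
```

On a flat, uniform subject this round-trip is lossless; the trouble starts, as described next, when detail approaches the size of a single photo site.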

A good demosaicing algorithm is actually quite complicated, and there are many proprietary solutions on the market. The core problem is to resolve fine detail while still getting the colors right. To illustrate some of the challenges, think of capturing a black and white checker pattern small enough that each square just covers a single sensor cell.

As the neighboring green-filtered photo sites do not add new information, the algorithm cannot know whether the bright squares are some kind of "red" (if the white hits a red filter) or "blue" (if the white hits a blue filter). A Foveon sensor, in contrast, would capture black and white correctly, as all three color channels are captured at the same photo site.

The resolution captured by a Bayer sensor would also drop if the subject consisted only of red and blue shades, since the green channel could not add any information. For monochromatic red or blue (very narrow wavelengths) the green sites receive absolutely no signal, but such colors are rare in real life; in practice, a very bright, saturated red still leaves some information in the green and, to a much lesser extent, even the blue channel. The problem in our example is that estimating the color correctly requires a certain amount of spatial information: if only a single photo site samples the red "information," there is no way to reconstruct the correct color for that particular photo site.
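The ambiguity is easy to demonstrate numerically. In the sketch below (our own construction, using an RGGB layout), two genuinely different scenes — white dots on black versus pure-red dots on black, each dot landing exactly on a red photo site — produce identical raw mosaic values, so no demosaicing algorithm could ever tell them apart from the sensor data alone.

```python
import numpy as np

h, w = 4, 4
# Scene 1: white dots on black, aligned to the red photo sites.
white_dots = np.zeros((h, w, 3)); white_dots[0::2, 0::2] = [1, 1, 1]
# Scene 2: pure-red dots on black, in the same positions.
red_dots = np.zeros((h, w, 3)); red_dots[0::2, 0::2] = [1, 0, 0]

def bayer_sample(rgb):
    """Gray values as seen through an RGGB color filter array."""
    m = np.zeros((h, w))
    m[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red filter passes R only
    m[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green filter passes G only
    m[1::2, 0::2] = rgb[1::2, 0::2, 1]
    m[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue filter passes B only
    return m

# Both scenes yield the exact same raw data: the color is unrecoverable.
assert np.array_equal(bayer_sample(white_dots), bayer_sample(red_dots))
```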

The above crops are from real samples we made in a studio to show the practical effect. Of course, we show an extreme situation here. In reality the failure is less dramatic but still visible to our eyes and definitely should not be ignored.

Image Artifacts
Some of these challenges result in image artifacts like moiré and color aliasing (visible as unrelated green, red, and blue pixels, or as discoloration). Most cameras fight the aliasing problem by putting an AA (Anti-Aliasing) filter in front of the sensor. This filter actually blurs the image and distributes color information to the neighboring photo sites. As you know, blurring and photography don't really mix, so finding the right balance between blurring and aliasing is a camera design challenge; in our experience the Canon EOS-1Ds does a very good job here. Finally, the image needs stronger sharpening to recover most of the original sharpness. To some extent, AA filtering degrades the effective resolution of the sensor.
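A one-dimensional sketch shows the trade-off the AA filter makes. A pattern alternating at every pixel is the worst case for aliasing; a simple two-tap averaging blur (our stand-in for the optical filter, which in reality is a birefringent element) removes that frequency entirely — it can no longer alias, but it is also no longer resolved.

```python
import numpy as np

# Detail at the pixel frequency: the worst case for a Bayer sensor.
signal = np.array([0., 1., 0., 1., 0., 1., 0., 1.])

# A crude stand-in for the AA filter: average each pair of neighbors.
aa = np.convolve(signal, [0.5, 0.5], mode='same')

# The alternating detail collapses toward a flat 0.5: no aliasing is
# possible anymore, but the original detail is gone, which is why the
# image later needs extra sharpening to recover apparent crispness.
print(aa)
```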

This sounds like a complicated mission. Indeed it is, but it works surprisingly well. Every technology has to struggle with its inherent limitations. In many aspects digital can beat film today as film has to fight its own limitations.

The Raw Deal
The raw data are simply all the gray values captured on the chip. To produce a final image, these raw data have to be processed (including the demosaicing) by a so-called "raw converter." To produce JPEG images, the camera must have a full raw converter embedded in its firmware.

To give you an idea of how JPEG stacks up against raw file formats, here's a summary of the limitations of using the camera produced JPEGs and the corresponding raw advantages:
· JPEG produces artifacts due to lossy compression.
· Although most sensors capture 12-bit color (gray scale) information, only 8 bits are used in the final JPEG file.
· The in-camera raw converter has only limited computing resources, and good raw conversion can be very complex and computing intensive. As software technology evolves, it is much more flexible to have the conversion done on the host computer instead of the non-upgradeable ASIC commonly used today.
· The in-camera set or estimated white balance is applied to the photo in the camera, and the same is true for color processing, tonal corrections, and in-camera sharpening. This limits the post-processing options, as an already corrected image needs to be corrected again, and the more processing a photo undergoes (especially at 8 bits), the more it can degrade.

Now we can explain what raw file formats are. They store only the raw data, plus some additional metadata describing the properties of the raw data in the so-called EXIF section of the file. (The EXIF section holds information like camera type, lens used, shutter speed, f/stop, and much more.) All the processing previously done in the camera can then be performed on a more powerful computing platform. The raw data offers the following advantages:
· No JPEG compression.
· Full use of the 12-bit color information.
· Use of very sophisticated raw file converters (like Adobe Camera Raw or Phase One's Capture One DSLR).
· White balance, color processing, tonal/exposure correction, sharpening, and noise suppression can be processed later on the computer.
· The raw files also act more like the digital version of an undeveloped film negative: over time, improved raw file converters will extract better and better results from the same data.

Your Own Solutions
Think of using in-camera JPEG as like shooting a Polaroid (where you just shoot and get your image processed immediately). Think of raw as being like film that can be developed and enhanced in the darkroom. Raw converters like Adobe Camera Raw or Phase One's Capture One DSLR act like your personal (and at times magical) formula for your own film developer.

What is the advantage of 12-bit data? The main advantage comes into play when you need to make major corrections to white balance, exposure, or color. During image processing you lose bits of data to rounding and clipping, and these losses accumulate over multiple steps. The more bits you start with, the more data survive into your final corrected image.
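A quick simulation makes the point. Below we quantize the same underexposed tonal ramp to 8 bits and to 12 bits, push it one stop (multiply by 2, a common exposure correction), and count how many distinct output tones survive in an 8-bit result. This is a simplified model (linear data, a single correction step, parameters of our choosing), but it shows why the extra source bits matter: the 8-bit source ends up with gaps in its histogram ("combing") that the 12-bit source avoids.

```python
import numpy as np

# An underexposed scene: a smooth ramp using only half the tonal range.
signal = np.linspace(0.0, 0.5, 10000)

def push_one_stop(signal, bits):
    """Quantize to the given bit depth, push exposure one stop,
    and return the resulting 8-bit output levels."""
    levels = 2 ** bits - 1
    q = np.round(signal * levels) / levels   # camera quantization
    pushed = np.clip(q * 2.0, 0.0, 1.0)      # exposure correction
    return np.round(pushed * 255)            # final 8-bit output

levels_8 = len(np.unique(push_one_stop(signal, 8)))
levels_12 = len(np.unique(push_one_stop(signal, 12)))
print(levels_8, levels_12)  # the 12-bit source preserves more tones
```

The 8-bit source can only produce the doubled versions of the levels it started with, so roughly every other output tone is missing; the 12-bit source has enough headroom to fill the full output range smoothly.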

What About TIFF?
What about using TIFF files in the camera? TIFF only solves the lossy compression issue; the files are still converted to 8 bits inside the camera. Most of the time TIFF files are larger than raw files (remember, raw files hold only one 12-bit gray value per pixel), and they lack the other benefits of raw. I would go as far as saying that an 8-bit, in-camera-processed TIFF file is only slightly better than a high-quality/high-resolution JPEG.

I'd like to thank the following individuals and companies for all of their help with this article: Daniel Stephens--Bayer schema pictures. Foveon--demosaicing error schema, personal discussions with Dick Merrill (X3 chip designer). Michael Jønsson--lead developer for Capture One DSLR at Phase One.
