Tuesday, January 24, 2012

Image File Formats....(brief)

This is a small attempt to introduce some of the many image formats...

Image file formats

Images!!!...there are many...
Image file formats are standardized means of organizing and storing digital images. Image files are composed of pixels, vector (geometric) data, or a combination of the two. Whatever the format, the files are rasterized to pixels when displayed on most graphic displays. The pixels that constitute an image are ordered as a grid (columns and rows); each pixel consists of numbers representing magnitudes of brightness and color.

Image file sizes

Image file size, expressed as the number of bytes, increases with the number of pixels composing an image and with the color depth of the pixels. The greater the number of rows and columns, the greater the image resolution and the larger the file. Each pixel also takes more space as its color depth increases: an 8-bit pixel (1 byte) can represent 256 colors, while a 24-bit pixel (3 bytes) can represent about 16 million colors, the latter known as truecolor.
Image compression uses algorithms to decrease the size of a file. High resolution cameras produce large image files, ranging from hundreds of kilobytes to megabytes, depending on the camera's resolution and the capacity of the image-storage format. High resolution digital cameras record images of 12 megapixels (1 MP = 1,000,000 pixels) or more in truecolor. For example, consider an image recorded by a 12 MP camera: since each pixel uses 3 bytes to record truecolor, the uncompressed image would occupy 36,000,000 bytes of memory. That is a great amount of digital storage for one image, given that cameras must record and store many images to be practical. Because of these large file sizes, both within the camera and on a storage disc, image file formats were developed to store such large images efficiently. An overview of the major graphic file formats follows below.
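
The arithmetic above can be checked with a few lines of Python. This is only a sketch; the function names `color_count` and `uncompressed_size_bytes` are made up for illustration and do not come from any library.

```python
def color_count(bits_per_pixel: int) -> int:
    """Number of distinct values a pixel of the given bit depth can hold."""
    return 2 ** bits_per_pixel


def uncompressed_size_bytes(pixel_count: int, bits_per_pixel: int) -> int:
    """Size of the raw pixel data alone (no header, no compression)."""
    # Round up to whole bytes in case the total bit count is not a multiple of 8.
    return (pixel_count * bits_per_pixel + 7) // 8


print(color_count(8))                            # 256
print(color_count(24))                           # 16777216
print(uncompressed_size_bytes(12_000_000, 24))   # 36000000
```

Real files add a header and usually compress the pixel data, so actual sizes on disk will differ from this raw figure.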

Image file compression

There are two types of image file compression algorithms: lossless and lossy.

1.      Lossless compression algorithms reduce file size without losing image quality, though they do not shrink a file as much as lossy compression does. When image quality is valued above file size, lossless algorithms are typically chosen.

2.      Lossy compression algorithms take advantage of the inherent limitations of the human eye and discard invisible information. Most lossy compression algorithms allow for variable quality levels (compression) and as these levels are increased, file size is reduced. At the highest compression levels, image deterioration becomes noticeable as "compression artifacting".
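
The difference between the two families can be sketched as follows. Two stand-ins are assumed here: zlib (DEFLATE) plays the lossless algorithm, and a crude "drop the low bits" quantizer plays the lossy step. Real lossy codecs such as JPEG work very differently; the sample data is also invented for the example.

```python
import zlib

# Hypothetical row of 8-bit grayscale samples: a ramp with slight variation.
samples = bytes((i // 4 + (i % 3)) % 256 for i in range(1024))

# Lossless: compress, decompress, and get back exactly the same bytes.
packed = zlib.compress(samples, 9)
restored = zlib.decompress(packed)
lossless_ok = restored == samples


def quantize(data: bytes, dropped_bits: int) -> bytes:
    """Toy lossy step: zero the least significant bits of every sample."""
    mask = 0xFF & ~((1 << dropped_bits) - 1)
    return bytes(b & mask for b in data)


# Lossy: throw away fine detail first, then compress what is left.
coarse = quantize(samples, 4)
packed_coarse = zlib.compress(coarse, 9)

# Largest per-sample error introduced by the quantizer (never more than 15 here).
max_error = max(abs(a - b) for a, b in zip(samples, coarse))

print(lossless_ok, len(packed), len(packed_coarse), max_error)
```

The quantized data compresses further because it contains less information, but the discarded bits cannot be recovered afterwards.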

Major graphic file formats

Including proprietary types, there are hundreds of image file types. The PNG, JPEG, and GIF formats are most often used to display images on the Internet. These graphic formats are listed and briefly described below, separated into the two main families of graphics: raster and vector.
In addition to straight image formats, metafile formats are portable formats which can include both raster and vector information. Examples are application-independent formats such as WMF and EMF. A metafile format serves as an intermediate format for exchanging graphics between applications.

I.                   Raster formats

These formats store images as bitmaps (also known as pixmaps).

1.      JPEG/JFIF
JPEG (Joint Photographic Experts Group) is a compression method; JPEG-compressed images are usually stored in the JFIF (JPEG File Interchange Format) file format. JPEG compression is (in most cases) lossy compression. The JPEG/JFIF filename extension is JPG or JPEG. Nearly every digital camera can save images in the JPEG/JFIF format, which supports 8 bits per color (red, green, blue) for a 24-bit total, producing relatively small files. When the compression is not too great, it does not noticeably detract from the image's quality, but JPEG files suffer generational degradation when repeatedly edited and saved. JPEG compression is also used for the images embedded in many PDF files.

2.      JPEG 2000
JPEG 2000 is a compression standard enabling both lossless and lossy storage. The compression methods used are different from the ones in standard JFIF/JPEG; they improve quality and compression ratios, but also require more computational power to process. JPEG 2000 also adds features that are missing in JPEG. It is not nearly as common as JPEG, but it is used currently in professional movie editing and distribution (e.g., some digital cinemas use JPEG 2000 for individual movie frames).

3.      Exif
Exif (Exchangeable image file format) is a file standard similar to the JFIF format with TIFF extensions; it is incorporated in the JPEG-writing software used in most cameras. Its purpose is to record and to standardize the exchange of images with image metadata between digital cameras and editing and viewing software. The metadata are recorded for individual images and include such things as camera settings, time and date, shutter speed, exposure, image size, compression, name of camera, and color information. When images are viewed or edited by image editing software, all of this image information can be displayed.

4.      TIFF
TIFF (Tagged Image File Format) is a flexible format that normally saves 8 bits or 16 bits per color (red, green, blue) for 24-bit and 48-bit totals, respectively, usually using either the TIFF or TIF filename extension. TIFF's flexibility can be both an advantage and a disadvantage, since a reader that reads every type of TIFF file does not exist. TIFF compression can be lossy or lossless; some variants offer relatively good lossless compression for bi-level (black & white) images. Some digital cameras can save in TIFF format, using the LZW compression algorithm for lossless storage. The TIFF image format is not widely supported by web browsers, but it remains widely accepted as a photograph file standard in the printing business. TIFF can handle device-specific color spaces, such as the CMYK defined by a particular set of printing press inks. OCR (Optical Character Recognition) software packages commonly generate some (often monochromatic) form of TIFF image for scanned text pages.
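
However much TIFF variants differ, a classic TIFF file starts with the same 8-byte header: a byte-order mark ("II" or "MM"), the number 42, and the offset of the first tag directory. A minimal sketch of reading it is shown below; `read_tiff_header` is a name chosen for this example, and the sample headers are hand-built rather than taken from real files.

```python
import struct


def read_tiff_header(header: bytes) -> tuple[str, int]:
    """Return (byte_order, first_ifd_offset) from the first 8 bytes of a TIFF file."""
    if len(header) < 8:
        raise ValueError("a TIFF header is 8 bytes long")
    order_mark = header[:2]
    if order_mark == b"II":
        fmt = "<"   # little-endian
    elif order_mark == b"MM":
        fmt = ">"   # big-endian
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")
    # 2-byte magic number followed by the 4-byte offset of the first directory
    magic, ifd_offset = struct.unpack(fmt + "HI", header[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file: magic number is not 42")
    return ("little" if fmt == "<" else "big"), ifd_offset


print(read_tiff_header(b"II*\x00\x08\x00\x00\x00"))  # ('little', 8)
print(read_tiff_header(b"MM\x00*\x00\x00\x00\x08"))  # ('big', 8)
```

Everything after the header is described by tags, which is where the flexibility (and the reader incompatibilities) come from.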

5.      RAW
RAW refers to a family of raw image formats that are options available on some digital cameras. These formats usually use a lossless or nearly-lossless compression, and produce file sizes much smaller than the TIFF formats of full-size processed images from the same cameras. Although there is a standard raw image format (ISO 12234-2, TIFF/EP), the raw formats used by most cameras are not standardized or documented, and differ among camera manufacturers.
Most camera manufacturers have their own software for decoding or developing their raw file format, but there is also a lot of third-party raw file converter software, such as Phase One's Capture One, that accepts the raw format from most cameras. Some graphic programs and image editors may not accept some or all raw file formats, and some older formats have been effectively orphaned already.
Adobe's Digital Negative (DNG) specification is an attempt at standardizing a raw image format to be used by cameras, or for archival storage of image data converted from undocumented raw image formats, and is used by several niche and minority camera manufacturers including Pentax, Leica, and Samsung. The raw image formats of more than 230 camera models, including those from manufacturers with the largest market shares such as Canon, Nikon, Phase One, Sony, and Olympus, can be converted to DNG. DNG was based on ISO 12234-2, TIFF/EP, and ISO's revision of TIFF/EP is reported to be adding Adobe's modifications and developments made for DNG into profile 2 of the new version of the standard.
As far as video cameras are concerned, ARRI's Arriflex D-20 and D-21 cameras provide raw 3K-resolution sensor data with a Bayer pattern as still images (one per frame) in a proprietary format (.ari file extension). Red Digital Cinema Camera Company, with its Mysterium sensor family of still and video cameras, uses its proprietary raw format called REDCODE (.R3D extension), which stores still images as well as audio and video information in one lossy-compressed file.

6.      PNG
The PNG (Portable Network Graphics) file format was created as the free, open-source successor to GIF. The PNG file format supports truecolor (16 million colors), while GIF supports only 256 colors. PNG excels when the image has large, uniformly colored areas. The lossless PNG format is best suited for editing pictures, while lossy formats such as JPG are best for the final distribution of photographic images, because JPG files of photographs are usually smaller than PNG files. Adam7 interlacing allows an early preview, even when only a small percentage of the image data has been transmitted.
PNG provides a patent-free replacement for GIF and can also replace many common uses of TIFF. Indexed-color, grayscale, and truecolor images are supported, plus an optional alpha channel.
PNG is designed to work well in online viewing applications like web browsers so it is fully streamable with a progressive display option. PNG is robust, providing both full file integrity checking and simple detection of common transmission errors. Also, PNG can store gamma and chromaticity data for improved color matching on heterogeneous platforms.
Some programs do not handle PNG gamma correctly, which can cause the images to be saved or displayed darker than they should be.
Animated formats derived from PNG are MNG and APNG. The latter is supported by Mozilla Firefox and Opera and is backwards compatible with PNG.
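
The integrity checking mentioned above comes from the way a PNG file is laid out: a fixed 8-byte signature followed by chunks, each carrying its own CRC-32. The sketch below builds a 1x1 red PNG by hand and then verifies every chunk. The helper names (`make_chunk`, `make_tiny_png`, `check_png`) are invented for this example, and the checker covers only what is needed to illustrate the idea.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Length + type + data + CRC-32 (the CRC covers type and data)."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def make_tiny_png() -> bytes:
    """Build a 1x1 truecolor PNG holding a single red pixel."""
    # IHDR: width, height, bit depth 8, color type 2 (truecolor),
    # compression 0, filter 0, interlace 0
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
    raw = b"\x00\xff\x00\x00"   # one scanline: filter byte 0, then R, G, B
    return (PNG_SIGNATURE
            + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", zlib.compress(raw))
            + make_chunk(b"IEND", b""))


def check_png(data: bytes) -> list[str]:
    """Return the chunk types in order; raise ValueError on a bad signature or CRC."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("bad PNG signature")
    pos, types = 8, []
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        body = data[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(chunk_type + body) & 0xFFFFFFFF != stored_crc:
            raise ValueError(f"CRC mismatch in {chunk_type!r} chunk")
        types.append(chunk_type.decode("ascii"))
        pos += 12 + length
    return types


png = make_tiny_png()
print(check_png(png))  # ['IHDR', 'IDAT', 'IEND']
```

Flipping any byte inside a chunk makes `check_png` raise, which is the kind of corruption detection the format was designed to offer.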

7.      GIF
GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors. This makes the GIF format suitable for storing graphics with relatively few colors such as simple diagrams, shapes, logos and cartoon style images. The GIF format supports animation and is still widely used to provide image animation effects. It also uses a lossless compression that is more effective when large areas have a single color, and ineffective for detailed images or dithered images.
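
GIF's lossless compression is a variant of LZW. Why flat areas compress well and dithered ones do not can be seen with a textbook LZW encoder. This is a simplified sketch: real GIF encoders add variable-width codes, clear codes and a dictionary size limit, none of which are modelled here, and the two test inputs are made up.

```python
import random


def lzw_encode(data: bytes) -> list[int]:
    """Textbook LZW: emit one dictionary code per longest already-seen byte string."""
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate               # keep extending the match
        else:
            codes.append(table[current])      # emit the longest known string
            table[candidate] = len(table)     # learn the new, longer string
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


flat = bytes([7]) * 1000                      # one color, like a logo background
rng = random.Random(0)                        # fixed seed so runs are repeatable
noisy = bytes(rng.randrange(256) for _ in range(1000))  # stand-in for dithering

print(len(lzw_encode(flat)), len(lzw_encode(noisy)))
```

The flat input needs only a few dozen codes, while the noisy input needs close to one code per byte, so it gains almost nothing from compression.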

8.      BMP
The BMP file format (Windows bitmap) handles graphics files within the Microsoft Windows OS. Typically, BMP files are uncompressed, hence they are large; the advantage is their simplicity and wide acceptance in Windows programs.
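
The simplicity of an uncompressed BMP shows in how little code is needed to write one. The sketch below assumes the common layout of a 14-byte file header followed by a 40-byte BITMAPINFOHEADER and 24-bit pixels; `make_bmp` is a name chosen for this example, and other BMP variants exist that it does not cover.

```python
import struct


def make_bmp(width: int, height: int, rgb: tuple[int, int, int]) -> bytes:
    """Return an uncompressed 24-bit BMP filled with a single color."""
    row_bytes = width * 3
    padding = (4 - row_bytes % 4) % 4      # rows are padded to a multiple of 4 bytes
    r, g, b = rgb
    row = bytes([b, g, r]) * width + b"\x00" * padding   # pixels are stored as B, G, R
    pixel_data = row * height
    offset = 14 + 40                       # file header + info header
    file_header = struct.pack("<2sIHHI", b"BM", offset + len(pixel_data), 0, 0, offset)
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40, width, height, 1, 24,          # header size, width, height, planes, bits per pixel
        0, len(pixel_data),                # compression 0 = none, size of pixel data
        2835, 2835, 0, 0,                  # about 72 dpi, no palette
    )
    return file_header + info_header + pixel_data


bmp = make_bmp(2, 2, (255, 0, 0))
print(len(bmp))  # 70
```

A 2x2 image already takes 70 bytes: 54 bytes of headers plus two padded rows of 8 bytes each.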

9.      PPM, PGM, PBM, PNM
Netpbm format is a family including the portable pixmap file format (PPM), the portable graymap file format (PGM) and the portable bitmap file format (PBM). These are either pure ASCII files or raw binary files with an ASCII header that provide very basic functionality and serve as a lowest-common-denominator for converting pixmap, graymap, or bitmap files between different platforms. Several applications refer to them collectively as PNM format (Portable Any Map).
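
The pure ASCII variant of PPM is simple enough to write by hand. A minimal sketch follows, assuming 8-bit samples (maximum value 255); `to_ascii_ppm` is a name made up for this example.

```python
def to_ascii_ppm(pixels: list[list[tuple[int, int, int]]]) -> str:
    """Serialize rows of (R, G, B) tuples as a plain ASCII ("P3") PPM image."""
    height = len(pixels)
    width = len(pixels[0])
    lines = ["P3", f"{width} {height}", "255"]   # magic number, size, maximum sample value
    for row in pixels:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"


image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
print(to_ascii_ppm(image), end="")
```

Saved to a file with a .ppm extension, the printed text is a complete 2x2 image: red and green on the top row, blue and white on the bottom.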

10.  WebP
WebP is a new image format that uses lossy compression. It was designed by Google to reduce image file size to speed up web page loading: its principal purpose is to supersede JPEG as the primary format for photographs on the web. WebP is based on VP8's intra-frame coding and uses a container based on RIFF.

11.  Others
Other image file formats of raster type include:
  1. JPEG XR (New JPEG standard based on Microsoft HD Photo)
  2. TGA (TARGA)
  3. ILBM (InterLeaved BitMap)
  4. IMG (Graphical Environment Manager image file; planar, run-length encoded)
  5. PCX (Personal Computer eXchange)
  6. ECW (Enhanced Compression Wavelet)
  7. SID (multiresolution seamless image database, MrSID)
  8. CD5 (Chasys Draw Image)
  9. FITS (Flexible Image Transport System)
  10. PGF (Progressive Graphics File)
  11. XCF (eXperimental Computing Facility format, native GIMP format)
  12. PSD (Adobe PhotoShop Document)
  13. PSP (Corel Paint Shop Pro)
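
Several of the raster formats described above can be told apart by their first few bytes, regardless of the filename extension. The sketch below checks a handful of well-known signatures; `sniff_raster_format` is a hypothetical helper, it covers only some of the formats, and it cannot distinguish the many raw camera formats that are themselves built on TIFF.

```python
def sniff_raster_format(head: bytes) -> str:
    """Guess a raster format from the first bytes of a file (12 bytes suffice here)."""
    if head.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if head[:6] in (b"GIF87a", b"GIF89a"):
        return "GIF"
    if head[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "WebP"                      # RIFF container with a WEBP form type
    if head[:2] == b"BM":
        return "BMP"
    if head[:2] in (b"P1", b"P2", b"P3", b"P4", b"P5", b"P6"):
        return "Netpbm"
    return "unknown"


print(sniff_raster_format(b"GIF89a\x01\x00\x01\x00"))       # GIF
print(sniff_raster_format(b"RIFF\x24\x00\x00\x00WEBPVP8 ")) # WebP
```

In practice one would read the first bytes of a file opened in binary mode and pass them to the function.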

II.                 Vector formats

As opposed to the raster image formats above (where the data describes the characteristics of each individual pixel), vector image formats contain a geometric description which can be rendered smoothly at any desired display size.
At some point, all vector graphics must be rasterized in order to be displayed on digital monitors. However, vector images can be displayed with analog CRT technology such as that used in some electronic test equipment, medical monitors, radar displays, laser shows and early video games. Plotters are printers that use vector data rather than pixel data to draw graphics.
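
Rasterizing can be illustrated with a shape that is stored only as geometry: a circle given by its center and radius in unit-square coordinates. The sketch below samples that description onto grids of different sizes. The function name and the one-sample-per-pixel approach are choices made for this example; real renderers use more careful sampling and anti-aliasing.

```python
def rasterize_circle(size: int, cx: float = 0.5, cy: float = 0.5,
                     radius: float = 0.4) -> list[str]:
    """Render a filled circle, given in unit-square coordinates, on a size x size grid."""
    rows = []
    for y in range(size):
        row = ""
        for x in range(size):
            # Sample at the center of each pixel, mapped back into the unit square.
            px = (x + 0.5) / size
            py = (y + 0.5) / size
            inside = (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2
            row += "#" if inside else "."
        rows.append(row)
    return rows


# The same geometric description rendered at two display sizes
for size in (8, 16):
    print("\n".join(rasterize_circle(size)))
    print()
```

The circle's description never changes; only the grid it is sampled onto does, which is why the larger rendering has a smoother edge.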

1.      CGM
CGM (Computer Graphics Metafile) is a file format for 2D vector graphics, raster graphics, and text, and is defined by ISO/IEC 8632. All graphical elements can be specified in a textual source file that can be compiled into a binary file or one of two text representations. CGM provides a means of graphics data interchange for computer representation of 2D graphical information independent from any particular application, system, platform, or device. It has been adopted to some extent in the areas of technical illustration and professional design, but has largely been superseded by formats such as SVG and DXF.

2.      Gerber Format (RS-274X)
RS-274X Extended Gerber Format was developed by Gerber Systems Corp., now Ucamco. This is a 2D bi-level image description format. It is the de-facto standard format used by printed circuit board or PCB software. It is also widely used in other industries requiring high-precision 2D bi-level images.

3.      SVG
SVG (Scalable Vector Graphics) is an open standard created and developed by the World Wide Web Consortium to address the need (and attempts of several corporations) for a versatile, scriptable and all-purpose vector format for the web and otherwise. The SVG format does not have a compression scheme of its own, but due to the textual nature of XML, an SVG graphic can be compressed using a program such as gzip. Because of its scripting potential, SVG is a key component in web applications: interactive web pages that look and act like applications.
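
Because an SVG file is plain XML text, compressing it with gzip takes one call in most languages. A small sketch in Python follows; the `make_svg` helper and the drawing it produces are invented for the example.

```python
import gzip


def make_svg(circle_count: int) -> str:
    """Build a small SVG document containing a row of circles."""
    circles = "\n".join(
        f'  <circle cx="{20 + 30 * i}" cy="20" r="10" fill="steelblue"/>'
        for i in range(circle_count)
    )
    width = 40 + 30 * (circle_count - 1)
    return (f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="40">\n'
            f"{circles}\n</svg>\n")


svg_bytes = make_svg(50).encode("utf-8")
svgz_bytes = gzip.compress(svg_bytes)

# Sizes before and after compression
print(len(svg_bytes), len(svgz_bytes))

# Decompressing restores the original markup byte for byte.
assert gzip.decompress(svgz_bytes) == svg_bytes
```

Repetitive markup like this compresses especially well; gzip-compressed SVG files are commonly given the .svgz extension.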

4.      Other 2D vector formats
  1. AI (Adobe Illustrator)
  2. CDR (CorelDRAW)
  3. DrawingML
  4. GEM metafiles (interpreted and written by the Graphical Environment Manager VDI subsystem)
  5. Graphics Layout Engine
  6. HPGL, introduced on Hewlett-Packard plotters, but generalized into a printer language
  7. HVIF (Haiku Vector Icon Format)
  8. MathML
  9. MetaPost
  10. Myv vector format
  11. NAPLPS (North American Presentation Layer Protocol Syntax)
  12. ODG (OpenDocument Graphics)
  13. !DRAW, a native vector graphic format (in several backward compatible versions) for the RISC-OS computer system begun by Acorn in the mid-1980s and still present on that platform today
  14. POV-Ray markup language
  15. PPT (Microsoft PowerPoint)
  16. Precision Graphics Markup Language, a W3C submission that was not adopted as a recommendation.
  17. PSTricks and PGF/TikZ are languages for creating graphics in TeX documents.
  18. ReGIS, used by DEC computer terminals
  19. Remote imaging protocol
  20. VML (Vector Markup Language)
  21. WMF / EMF (Windows Metafile / Enhanced Metafile)
  22. Xar format used in vector applications from Xara
  23. XPS (XML Paper Specification)

5.      3D vector formats
  1. AMF - Additive Manufacturing File Format
  2. Asymptote - A language that lifts TeX to 3D.
  3. .dwf
  4. eDrawings
  5. HSF
  6. IGES
  7. IMML - Immersive Media Markup Language
  8. IPA
  9. JT
  10. PRC
  11. STEP
  12. SKP
  13. STL - A stereolithography format.
  14. U3D - Universal 3D file format
  15. VRML - Virtual Reality Modeling Language
  16. XAML
  17. XGL
  18. XVL
  19. xVRML
  20. X3D
  21. .3D
  22. 3DF
  23. .3ds
  24. 3DXML
  25. X3D vector format used in 3D applications from Xara

III.              Compound formats

These are formats containing both pixel and vector data, and possibly other data, e.g. the interactive features of PDF.
  • EPS (Encapsulated PostScript)
  • PDF (Portable Document Format)
  • PostScript, a page description language with strong graphics capabilities
  • PICT (Classic Macintosh QuickDraw file)
  • SWF (Shockwave Flash)
  • XAML User interface language using vector graphics for images.

IV.              Stereo formats

1.      MPO
The Multi Picture Object (.mpo) format, defined by the Camera & Imaging Products Association (CIPA), consists of multiple JPEG images.

2.      PNS
The PNG Stereo (.pns) format consists of a side-by-side image based on PNG (Portable Network Graphics).

3.      JPS
The JPEG Stereo (.jps) format consists of a side-by-side image format based on JPEG.
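
Once a side-by-side stereo image has been decoded to a pixel grid, separating the two views is a matter of cutting each row in half. The sketch below works on a plain list of rows and assumes decoding has already happened; `split_side_by_side` is a hypothetical helper, and which half belongs to which eye depends on the format's convention, so the halves are only called first and second here.

```python
def split_side_by_side(rows: list[list[int]]) -> tuple[list[list[int]], list[list[int]]]:
    """Split a side-by-side frame into its two half-width views."""
    width = len(rows[0])
    if width % 2:
        raise ValueError("side-by-side frames need an even width")
    half = width // 2
    first = [row[:half] for row in rows]
    second = [row[half:] for row in rows]
    return first, second


frame = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
first_view, second_view = split_side_by_side(frame)
print(first_view)   # [[1, 2], [5, 6]]
print(second_view)  # [[3, 4], [7, 8]]
```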