In document Graphics Programming With Perl (Page 38-42)

Overview of graphics file formats

2.1 SOME GRAPHICS FORMATS

There exist a large number of formats for storing or transporting graphics information. These formats can be roughly divided into two groups: formats that store graphics as vectors, and formats that store graphics as pixel maps.¹

Vector formats store their objects as mathematical descriptions and sets of coordinates. A circle is stored as the coordinates of the center and the radius; a rectangle as the coordinates of a corner and the height and width (and possibly rotation) of the rectangle. This allows the image to scale well, but it requires the rendering engine to interpret and draw each object every time it needs to be displayed or used in any other way. Also, the calculations to switch display pixels on and off need to be performed each time.
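The trade-off described above can be sketched in a few lines. This is only an illustration, written in Python, and the function names and the 0/1 pixel grid are assumptions made for the example; real rendering engines handle anti-aliasing, color and many more object types. The circle is kept as a center and a radius, and it is turned into pixels only when it has to be drawn.

```python
def rasterize_circle(cx, cy, radius, width, height):
    """Return a height x width grid of 0/1 pixels for a filled circle."""
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            # A pixel is switched on when it lies within the radius.
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            row.append(1 if inside else 0)
        grid.append(row)
    return grid


def scale_circle(cx, cy, radius, factor):
    """Scaling a vector object only changes its three numbers."""
    return cx * factor, cy * factor, radius * factor


# The same description rendered at two sizes; the pixel calculations
# are repeated for every rendering.
small = rasterize_circle(4, 4, 3, 9, 9)
large = rasterize_circle(*scale_circle(4, 4, 3, 2), 18, 18)

for row in small:
    print("".join("#" if pixel else "." for pixel in row))
```

Scaling touches only the stored coordinates, so no detail is lost, but every display of the circle pays for the full loop over the pixels.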

Unfortunately, at present, there are very few Perl modules that work with formats such as these, at least in a native sense. Many modules can create vector graphics for output, and some can import them and translate them into an image format, but that is where it stops. Of course, when a vector graphic is stored in plain text, Perl can be easily used to create files of that type. Reading them back in is another matter, and requires a module or program with intimate knowledge of the format.

Image formats store graphics as a two-dimensional array of pixels, wherein each pixel represents one point of the image. These formats contain no information about whether sections are parts of a circle, rectangle, chair or clown face. Each pixel stands on its own, and is (largely) independent of the other pixels. Most commonly, each pixel contains a set of values that express the color, and sometimes transparency, of that particular pixel.

¹ Of course there are graphics formats and languages that allow both, but for the sake of simplicity we will just ignore that.

Some image formats allow you to store more than one image in a single file or data stream. Some will allow you to express a certain relationship between all of these images. One example of this is GIF animation, in which each image is a frame in a sequence, and there are extra data associated with each image that express how long it should be displayed and how it should be cleaned up after that time. Other examples are the layers in a GIMP or Photoshop file. The ways in which these layers can relate to each other are too numerous to list. The main relationship between the layers is how they are combined with other layers below them in the stack.
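One common way of combining a layer with the layers below it is the "over" operation, in which the alpha value of the upper pixel decides how much of the lower pixel shows through. The sketch below, in Python, is just that one case; it assumes straight (non-premultiplied) alpha and channel values between 0.0 and 1.0, and the names over and flatten are made up for the illustration. It does not describe how GIMP or Photoshop implement their layer modes.

```python
def over(top, bottom):
    """Composite one RGBA pixel over another (straight alpha, 0.0..1.0)."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels are fully transparent.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def flatten(layers):
    """Combine the pixels of a stack of layers, listed bottom first."""
    result = (0.0, 0.0, 0.0, 0.0)
    for layer in layers:
        result = over(layer, result)
    return result


blue = (0.0, 0.0, 1.0, 1.0)
half_red = (1.0, 0.0, 0.0, 0.5)
print(flatten([blue, half_red]))
```

A half-transparent red pixel over an opaque blue one gives an opaque pixel with equal parts red and blue.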

Some file formats are not well suited for storing images for further processing because they are lossy, meaning that the image data is stored in a way that causes some information to get lost. A format such as JPEG uses some clever algorithmic techniques that can dramatically diminish the amount of data to be stored for photographic images. However, what gets stored is not a full representation of the original data; some information is lost. If you read this image later, manipulate it, and then save it again, you lose even more. A few of these repetitions can result in a considerable loss in quality.

Another way to lose information is by color reduction. Some image formats store colors not per pixel, but instead store a palette of colors, and each pixel points to one of the colors in the palette. For an RGB image in which each color component can have an integer value between 0 and 255, you need at least three bytes per pixel to store the image. However, if you allow a total of only 256 colors, you only need one byte per pixel, which is the index into the palette of 256 colors.
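The arithmetic above can be checked with a small sketch. It is written in Python for illustration, the function names are assumptions, and it counts only raw pixel data plus the palette itself, ignoring file headers and any compression a real format would add.

```python
def truecolor_bytes(width, height):
    """Three bytes (red, green, blue) for every pixel."""
    return width * height * 3


def indexed_bytes(width, height, palette_size=256):
    """One index byte per pixel, plus three bytes per palette entry."""
    return palette_size * 3 + width * height


def to_indexed(pixels, max_colors=256):
    """Turn a list of (r, g, b) tuples into a palette and a byte per pixel.

    Raises ValueError when the image has more colors than the palette
    can hold; that is the point where colors would have to be reduced.
    """
    palette = []
    lookup = {}
    indices = bytearray()
    for rgb in pixels:
        if rgb not in lookup:
            if len(palette) >= max_colors:
                raise ValueError("more colors than the palette can hold")
            lookup[rgb] = len(palette)
            palette.append(rgb)
        indices.append(lookup[rgb])
    return palette, bytes(indices)


print(truecolor_bytes(640, 480))  # 921600
print(indexed_bytes(640, 480))    # 307968

palette, indices = to_indexed([(255, 0, 0), (0, 0, 255), (255, 0, 0)])
```

As long as the image has no more colors than the palette has entries, the conversion loses nothing; information is lost only when colors have to be merged to fit.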

It is, therefore, important to always store the originals of your images in a lossless format, such as PNG or TIFF, and to convert to lossy formats, such as JPEG and GIF, only as the last step in the process.

There is a wealth of information available on the net covering the various graphics formats and their use and abuse [8,9,10,11,12,13]. I will only discuss a very small subset here, mainly the formats that are most usable for the Web. The main reason for this limitation is that the use of graphics on the Web is a mess. This is due to a lack of understanding regarding which format to use for which picture, as well as the limitation in usable formats (see also section 6.2, “Suitable image formats,” on page 92).

2.1.1 GIF

The Graphics Interchange Format, GIF, was first designed by CompuServe in 1987 as version 87a, and later expanded upon with version 89a. The idea behind GIF was to facilitate the reduction of the size of bitmap files so that transport over modems would be faster.

The fact that the GIF format is so old shows in the limitation of 256 colors per data stream (or per image). However, the GIF format at the moment is the only widespread image format that supports animation of some sort.

The GIF format traditionally uses LZW (Lempel-Ziv-Welch) compression,² which is patented by Unisys. Unisys, after an extended period of allowing free use of the algorithm for nonprofit applications, decided recently to require a license for the use of LZW compression. There is much confusion about whether a license is also required for LZW decompression, although Unisys insists that a license is needed for any use of the LZW algorithm. Unisys has changed its position a few times on who needs a license and who doesn’t, and could change it again at any time.

² LZW compression is not only used in the GIF format, but also, among others, in the TIFF format.
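For readers who have not seen the algorithm, the following is a minimal sketch of textbook LZW, written in Python. It is an illustration under stated assumptions and not the exact GIF variant: it works on bytes, returns the codes as a plain list of integers, and leaves out the variable-width code packing and the clear and end-of-information codes that GIF adds. The function names are made up for the example.

```python
def lzw_compress(data):
    """Compress a bytes object into a list of integer codes."""
    # The table starts with one entry for every possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the sequence while it is already known.
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes from a list of LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)


codes = lzw_compress(b"ABABABA")
print(codes)
print(lzw_decompress(codes))
```

The decoder rebuilds the same table as the encoder, which is why no table has to be stored with the data. Repeated sequences, such as runs of identical pixels, end up as single codes.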

In practice, if you use software that incorporates the LZW algorithm in any way, you or the vendor of the software are responsible for obtaining a license from Unisys. Even if your application is made available for free, you are currently required to obtain a license. An interesting detail is that a lot of commercial software, including many Microsoft products, has been licensed, but in such a way that the user of the software still has to obtain a license from Unisys.

SEE ALSO More information on the LZW license can be found at http://www.unisys.com/unisys/lzw/ and http://burnallgifs.org/.

This license, and the fact that the folks at Unisys keep changing their minds about who should pay for one, has made it virtually impossible for many of the free products out in the Open Source community to continue supporting GIF. There is still a way to use GIF, since the GIF format does not require LZW compression. The libungif C library relies on that feature for creation of GIF images that are not subject to the patent. Of course, GIF images that don’t use a compression algorithm are very large and virtually useless for the web, which is where GIF has its largest application domain.

This, taken together with the limitations of the GIF format itself, suggests that you should try to avoid using it, if at all possible. Use PNG images instead of GIF images.

I can think of only two reasons to use the GIF format in favor of the PNG format: you need to create small and simple animations, or you need to support a user base that has obsolete software that doesn’t read the PNG format. The first reason will disappear when the MNG and SVG formats become widely supported, since they can both be used for various kinds of animations. The second reason disappears when you no longer have to cater to obsolete software.

2.1.2 JPEG, JFIF

The image format which is normally called JPEG should really be called JFIF, which is short for JPEG File Interchange Format. JPEG stands for Joint Photographic Experts Group, the group that gave us the JFIF format and the compression techniques used in it. The compression technique itself is known as JPEG compression.

JPEG compression is at its best when the image is a photo (color or B&W), or any other image that resembles a real-world scene. It is not as good with images where most of the neighboring pixels have the same color. JPEG is a lossy format, meaning you cannot restore the original information from the compressed image. JFIF images work with a 24-bit color space (16 million colors).

2.1.3 PNG

The Portable Network Graphics format (see [14]) was designed with the web in mind, specifically to replace GIF and, to a lesser degree, TIFF. The advantages of PNG over GIF are that PNG has alpha channels, gamma correction, a better interlacing method, and generally slightly better compression. One feature of GIF that PNG sadly lacks is the popular animation or multiple-image format (see MNG).

PNG is a lossless image format, unlike GIF, which normally only stores up to 256 colors, and JFIF, which loses information due to its compression technique. Together with the 48-bit color (and 16-bit grayscale) support, this makes it a very suitable format for interchange between packages.

2.1.4 MNG

The Multiple-image Network Graphics format (see [15]) is strongly based on PNG, and was in fact designed by some of the same people. MNG is the answer to the desire of web designers to include animated images on their sites, and it promises a lot more, in better ways, than GIF animation is providing.

Unfortunately, the support for MNG is still limited, and the standard for the format has not yet been officially set down. For all practical purposes, at this time, animations will have to be provided with GIF streams.

2.1.5 SVG

The Scalable Vector Graphics format is a language used to describe graphics in XML. As the name suggests, it will allow vector graphics, but can also contain text and images. The language is a W3C Recommendation, and therefore a web standard, and its specification can be found at http://www.w3.org/Graphics/SVG. SVG is clearly aimed at bringing more sane graphics to the Web by providing a standardized vector graphics format that is easy to parse and is transportable.
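Because SVG is plain XML, a small document can be put together as text and read back with any XML parser. The sketch below, in Python, illustrates this; the make_svg helper and its dictionary-based shape descriptions are assumptions made for the example, while the svg, circle and rect elements and their attributes come from the SVG specification. Fill values are inserted without escaping, so they are assumed to be simple color names.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def make_svg(width, height, shapes):
    """Build a small SVG document as a string from shape descriptions."""
    parts = ['<svg xmlns="%s" width="%d" height="%d">' % (SVG_NS, width, height)]
    for shape in shapes:
        if shape["type"] == "circle":
            parts.append('  <circle cx="%d" cy="%d" r="%d" fill="%s"/>'
                         % (shape["cx"], shape["cy"], shape["r"], shape["fill"]))
        elif shape["type"] == "rect":
            parts.append('  <rect x="%d" y="%d" width="%d" height="%d" fill="%s"/>'
                         % (shape["x"], shape["y"], shape["width"],
                            shape["height"], shape["fill"]))
        else:
            raise ValueError("unsupported shape: %r" % shape["type"])
    parts.append("</svg>")
    return "\n".join(parts)


document = make_svg(200, 100, [
    {"type": "circle", "cx": 50, "cy": 50, "r": 40, "fill": "red"},
    {"type": "rect", "x": 110, "y": 10, "width": 80, "height": 80, "fill": "blue"},
])
print(document)

# Reading it back only needs a generic XML parser.
root = ET.fromstring(document)
for element in root:
    print(element.tag, element.attrib)
```

Parsing the XML is the easy part; interpreting and drawing the shapes still requires software that understands the format.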

Support for the SVG format is still limited, but is growing fast. One Perl module that can import SVG graphics is Image::Magick, and other modules to work with SVG are appearing on CPAN. Support for this format in the major web browsers is also growing, and its acceptance promises that this well-designed and flexible format will be successful in solving many of the problems currently experienced with web graphics.

More information about SVG is available at the above-mentioned URL, and in the upcoming book Definitive SVG by Kelvin Lawrence, et al. [16].

2.1.6 TIFF

The Tag Image File Format (TIFF) is one of the most venerable of image formats. It has been around for a long time, and is supported by many pieces of software. The format allows for many image format features and compression schemes (including LZW compression). Originally, TIFF was defined by Adobe Systems Incorporated, but the format now is also defined by the Internet Engineering Task Force (IETF), and is described in RFC 2302.

The TIFF format is mainly intended for images originating from scanners and other imaging devices; hence the format is quite extensive in order to support all the various capabilities of these devices. The TIFF format specifies a baseline set of features that every compliant application should support. Apart from this baseline it also defines many extensions that applications can optionally implement.

The TIFF format allows multiple images per file, full color, grayscale and palette-based color data, as well as an alpha channel. This makes it a suitable format for storing images to use as a source, because it allows the storing of the complex information that image formats can contain, without loss, reasonably compressed, and portable between applications.
