Designing the data

In document Graphics Programming With Perl (Page 129-144)

Graphics and the Web

6.5 IMAGE COLLECTIONS AND THUMBNAILING

6.5.6 Designing the data

The first decision to be made is how to maintain the data: which text goes with which image, and in which section each picture appears. In this case, we’ll use an XML file.

This allows the data to be maintained together with processing instructions (I’ll discuss that later), which cannot be done as easily with a database-driven application. The XML file that drives the whole process looks like this:

<index title="Maxine's pictures">

<option name="source_dir" value="/spare/pictures" />

<option name="out_dir" value="OUT" />

<list description="The first eight weeks">

<picture name="newborn"

label="Maxine at 18 hours of age.">

<text>Maxine's first official photo.</text>

</picture>

<picture name="no_hand"

label="With Daddy, 7 days old." />

</list>

<!-- More lists go here -->

</index>

The XML document’s top node is the index node. This index consists of list nodes, and each list contains a set of picture nodes. Each picture node can contain text which will be displayed next to the full-sized version of the photo. Each of these elements can have several attributes set, which are fairly self-explanatory. The index has a title, which is used as the title of the HTML index pages. Each list can have a description, which is printed in the index as the section header (see figure 6.2). Each picture has a name and a label. The name is mandatory, and should be the first part of the name of the image you wish to include. The label is printed next to the thumbnail in the index pages.

In addition to the information pertaining to sections and pictures, the XML file can also contain option elements. These elements set various options in the program, and can appear anywhere in the document. For example, the source_dir option is set to a particular directory at the top of the document. This directory contains the original versions of the images you want to display, probably still in TIFF format, and at a much higher resolution than is appropriate for the web pages. If the next section has its source images somewhere else, a new option element can be included, and from then on the program will find its source images in that new directory.
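The "from then on" behavior of option elements can be illustrated with a small sketch. It assumes that each picture takes a snapshot of the options in effect at the moment it is read; the event list, the second directory and the second picture name are invented for the illustration and do not come from the program discussed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the parsed XML: options and pictures in
# document order. The second path and picture name are made up.
my @events = (
    [ option  => 'source_dir', '/spare/pictures' ],
    [ picture => 'newborn' ],
    [ option  => 'source_dir', '/spare/more_pictures' ],
    [ picture => 'first_steps' ],
);

my %current;     # options in effect at this point in the document
my @pictures;    # one hash per picture, with its own copy of the options

for my $event (@events) {
    my ($type, @args) = @$event;
    if ($type eq 'option') {
        my ($name, $value) = @args;
        $current{$name} = $value;
    }
    else {
        # Snapshot the options that are in effect when the picture is read
        push @pictures, { %current, name => $args[0] };
    }
}

printf "%-12s %s\n", $_->{name}, $_->{source_dir} for @pictures;
```

Because the options are copied rather than shared, a later option element does not retroactively change pictures that were read earlier.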

Designing the application and data flow

Now that we have defined the data we need and how it is formatted, we can design a rough outline of an application and the data streams that go with it. Figure 6.4 contains a schematic drawing of the various elements of the application we need to build.

The main application reads the XML file, and constructs an in-memory representation of the indexes, lists, and pictures that need to be created, which will be represented by Perl classes. Each of these classes will create a part of the output needed for the photo album. The Picture::Index class is responsible for managing the lists and creating the output for the indexes, with the help of the other classes. The Picture::List class is responsible for managing the individual photos and their order. The Picture class represents a single picture with all its associated information, and is responsible for the creation of the HTML pages that display a single photo, as well as the creation of the web versions of the photos from the source images.

Parsing the XML

Let’s have a look at the program that parses the XML data file, and that builds the relevant data structures.

#!/usr/local/bin/perl -w

use strict;

use Picture;

use XML::Parser;

my $DATABASE = $ARGV[0] or

die "Usage: $0 picture_data.xml\n";

my $xml = XML::Parser->new(

Handlers => {

Start => \&xml_start, End => \&xml_end, Char => \&xml_char },

);

my $picture_index = Picture::Index->new();

$xml->parsefile($DATABASE);

$picture_index->generate_output;

The program uses XML::Parser, since we don’t want to do all the XML parsing ourselves. It also uses the Picture module, which contains the code for the Picture, Picture::List and Picture::Index classes. It is not available from CPAN, but it will be discussed later in this section. It takes care of all the handling of the images and HTML. The main program only parses the XML, and passes the information to the code in Picture.

Figure 6.4 A high-level application diagram for the Web photo album generator, showing the inputs and outputs of the program and the main operative parts and data structures.

b

Read the name of the XML database file

c

And generate all HTML and images

b

The program expects as its first argument on the command line the name of an XML file of the format discussed on page 155. It instantiates an XML::Parser object, and tells it which subroutines to call when it encounters the start or end of a tag or some text data.

Following that, a Picture::Index object is created, which is defined in the Picture module.

This is the data structure used to store all the information contained in the XML file.

c

Next, the XML file is parsed, and the $picture_index object’s generate_output() method is called, which actually generates all pictures, thumbnails and HTML files.

That is all the work that occurs in the main program. Of course, there is a lot more code to write—in particular, the subroutines that the XML parser object is to call. The first of these is the subroutine that will be called when a start tag has been encountered in the input file:

my $in_picture = 0;

my $in_text = 0;

sub xml_start {

shift;

my $element = shift;

my %attrs = @_;

for ($element) {

/^index$/ and

$picture_index->set_option(%attrs);

/^option$/ and

$picture_index->set_option($attrs{name}, $attrs{value});

/^list$/ and

$picture_index->add_list(%attrs);

/^picture$/ and do {

$picture_index->add_picture(%attrs);

$in_picture = 1;

};

/^text$/ and $in_text = 1;

}

}

d

Declare state variables

e

The name of the element and any attributes

d

Two state variables are declared that will be used to track whether we are currently inside a picture tag or a text tag. We need to keep track of this so that in the text handler, xml_char(), we know whether we need to store the text fragments that are detected.

e

The xml_start() subroutine is invoked each time an opening XML tag is detected, and the attributes of that tag are passed in as arguments. When the top node element is read, the attributes are merely passed on to the $picture_index object, which was defined in the main body of the program. When an <option> tag is encountered, its name and value attributes are also passed on as options to the $picture_index object. When a <list> or <picture> is seen, the corresponding method is called. The two state variables are set so that we can keep track of whether we’re actually processing a picture or text at the moment. This allows us to ignore any textual data that isn’t part of a text block in a picture tag.

The subroutines that are called when an XML end tag or character data are encountered are called xml_end() and xml_char(), and are defined as follows:

sub xml_end {

shift;

my $element = shift;

for ($element) {

/^picture$/ and $in_picture = 0;

/^text$/ and $in_text = 0;

}

}

sub xml_char {

my $parser = shift;

my $string = shift;

if ($in_picture && $in_text) {   # only if we're processing a picture's text tag

$picture_index->add_text($string);

}

}

All that needs to be done in xml_end() is to reset the two state variables, and all that the xml_char() subroutine needs to do is to call the add_text() method if both the state variables are true.
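The effect of the two state variables can be seen in a small simulation. This is a sketch under stated assumptions: instead of running a real parser, a hand-written list of events stands in for what an event-driven parser would report, and the simplified handlers on_start(), on_end() and on_char() are hypothetical stand-ins for the three subroutines above.

```perl
use strict;
use warnings;

my ($in_picture, $in_text) = (0, 0);
my @stored;    # text fragments that would be handed to add_text()

# Simplified stand-ins for xml_start(), xml_end() and xml_char()
sub on_start {
    my $element = shift;
    $in_picture = 1 if $element eq 'picture';
    $in_text    = 1 if $element eq 'text';
}

sub on_end {
    my $element = shift;
    $in_picture = 0 if $element eq 'picture';
    $in_text    = 0 if $element eq 'text';
}

sub on_char {
    my $string = shift;
    push @stored, $string if $in_picture && $in_text;
}

# Hand-written event stream for a picture element containing one text
# element, including the whitespace between the tags.
my @events = (
    [ start => 'picture' ],
    [ char  => "\n  " ],
    [ start => 'text' ],
    [ char  => "Maxine's first official photo." ],
    [ end   => 'text' ],
    [ char  => "\n" ],
    [ end   => 'picture' ],
);

my %dispatch = (start => \&on_start, end => \&on_end, char => \&on_char);
$dispatch{ $_->[0] }->( $_->[1] ) for @events;

print "stored: '$_'\n" for @stored;
```

Only the fragment reported between the opening and closing text tags is kept; the whitespace between the other tags is ignored.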

This concludes the program that parses the XML file. It is now time to have a look at the Picture module. It contains code for three classes, Picture::Index, Picture::List, and Picture. The names of these three packages make it clear that they reflect the data structure in the XML document.


The Picture::Index class

The Picture::Index class is responsible for the creation of the index pages, and the management of photo lists. It also contains convenient methods that forward information to the lists that are managed. Let’s have a look at how Picture::Index is implemented:

package Picture::Index;

b

The Picture::Index class includes the CGI module, for generating HTML. That way we don’t have to worry as much about correctly opening and closing tags, and can let a thoroughly debugged module do all the hard work for us.

c

A package-wide %options hash is declared, which is used to initialize objects that are instantiated from this class. This can be seen in the new() method, which is defined next. The objects of the class are all instantiated with the contents of the %options hash, overridden with the arguments (stored in %attrs), and have a reference to an empty array added to store picture lists. This array will be used to store the contained photo lists, which are implemented as Picture::List objects, discussed later.
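A constructor along the lines just described might look like the following. This is a minimal sketch, not the original listing: the default values in %options are invented placeholders, the _lists key is taken from the add_list() method shown further on, and the CGI import is left out so that the fragment runs on its own.

```perl
package Picture::Index;
use strict;
use warnings;

# Package-wide defaults; the title is an invented placeholder
my %options = (title => 'Untitled index', out_dir => 'OUT');

sub new {
    my $proto = shift;
    my %attrs = @_;
    my $class = ref($proto) || $proto;
    # Defaults first, then the caller's attributes, then the empty list store
    my $self = { %options, %attrs, _lists => [] };
    return bless $self => $class;
}

package main;

my $index = Picture::Index->new(title => "Maxine's pictures");
print "$index->{title} -> $index->{out_dir}\n";
```

Listing %attrs after %options in the hash constructor is what makes the arguments override the defaults: for a repeated key, the last value wins.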

The set_option() method can be used to set individual attributes for the class as a whole, or for individual objects.

sub set_option {

my $proto = shift;

my %attrs = @_;

Picture->set_option(@_);   # Pass arguments on to the Picture class

if (ref($proto))           # Is this an object?

In any case, the call is forwarded to the Picture class, so that all attributes set in this subroutine become class defaults. If the method is called on an object reference, the object’s options are changed, otherwise the class options are changed.
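The distinction between a call on an object and a call on the class can be sketched as follows. The two branches are an assumption based on the description above, and the Picture package here is a deliberately reduced stand-in that only records class defaults.

```perl
use strict;
use warnings;

package Picture;

my %picture_defaults;    # stand-in for the Picture class defaults

sub set_option {
    my $proto = shift;
    %picture_defaults = (%picture_defaults, @_);
    return;
}

package Picture::Index;

my %options;             # class defaults for new index objects

sub new {
    my $class = shift;
    return bless { %options, @_ }, $class;
}

sub set_option {
    my $proto = shift;
    my %attrs = @_;
    Picture->set_option(@_);    # always becomes a Picture class default
    if (ref($proto)) {
        # Called on an object: change this index only
        $proto->{$_} = $attrs{$_} for keys %attrs;
    }
    else {
        # Called on the class: change the defaults for future objects
        $options{$_} = $attrs{$_} for keys %attrs;
    }
    return;
}

package main;

my $index = Picture::Index->new();
$index->set_option(out_dir => 'OUT');                          # object call
Picture::Index->set_option(source_dir => '/spare/pictures');   # class call
my $later = Picture::Index->new();

print "later index source_dir: $later->{source_dir}\n";
```

In this sketch the object call leaves the Picture::Index defaults untouched, so an index created afterwards does not inherit out_dir, while the class call affects every index created from then on.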

The add_list() method is used to create a new list in the index. This is accomplished by instantiating a new Picture::List object (discussed later), and pushing that onto the end of the array that contains all lists. At this time a file name is also assigned to this particular list, which will be used for the output phase.

sub add_list {

my $self = shift;

my %attrs = @_;

my $list = Picture::List->new(%attrs);

my $num = push(@{$self->{_lists}}, $list);

$self->last_list->{filename} = "index$num.html";

}

sub last_list {

my $self = shift;

$self->{_lists}->[-1];

}

The last_list() method can be used to get a reference to the last list added.

The add_picture() and add_text() methods just add a picture to the current list, or text to the current picture.

sub add_picture {

my $self = shift;

$self->last_list->add_picture(@_);

}

sub add_text {

my $self = shift;

$self->last_list->last_picture->add_text(@_);

}

This is done by forwarding the call to the current Picture::List or Picture object, which can be obtained through the last_list() method shown above and the last_picture() method in Picture::List, which will be discussed later. There should be no surprises in this way of delegating work for anyone who is familiar with OO programming.

The only code left to write for this package is the code that actually generates the output files from the information provided:

sub generate_output {

my $self = shift;

local(*OUT);

mkdir($self->{out_dir}) or

d

After making sure the output directory exists, a loop is set up over all the lists that were added, and for each list, the name of the output file is determined.

e

There is actually one extra run of the loop for element –1. Its purpose is to create the main index, i.e., the index that has only the headers of the sections, with all sections closed (see figure 6.2). This inclusion is also the reason that the code creating the file name looks complex. If the loop index is –1, then the file name becomes index.html, otherwise the file name becomes whatever it was set to in the add_list() method discussed earlier.

The outermost loop creates all of the index documents. Each index document needs to have information about all the lists in it. Inside this loop, we need to loop again over all the lists, and ask each list to generate the HTML for its own entry by calling the index_entry() method. We will discuss this method in full later. The only argument it expects is a boolean value that tells it whether to generate only a header, or a full set of thumbnails. It needs to generate thumbnail lists only when the document we are currently generating is the index for the list in question; in other words, when $li == $lj.
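The shape of these two nested loops, including the extra pass for index –1, can be sketched as follows. This is an illustration under stated assumptions rather than the original method: Demo::List is a hypothetical stand-in whose index_entry() returns a short marker string instead of HTML, the second list is invented, and the pages are collected in a hash instead of being written to files.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a list object: only what the loops need
package Demo::List;

sub new {
    my ($class, %attrs) = @_;
    return bless {%attrs}, $class;
}

sub index_entry {
    my ($self, $selected) = @_;
    return $selected
        ? "[$self->{description}: thumbnails]"
        : "[$self->{description}]";
}

package main;

my @lists = (
    Demo::List->new(description => 'The first eight weeks', filename => 'index1.html'),
    Demo::List->new(description => 'A later section',       filename => 'index2.html'),
);

my %pages;    # file name => page body

for my $li (-1 .. $#lists) {
    # Pass -1 is the main index, in which every section stays closed
    my $file = $li == -1 ? 'index.html' : $lists[$li]{filename};
    my @entries;
    for my $lj (0 .. $#lists) {
        # Thumbnails only for the list that this page belongs to
        push @entries, $lists[$lj]->index_entry($li == $lj);
    }
    $pages{$file} = join ' ', @entries;
}

print "$_: $pages{$_}\n" for sort keys %pages;
```

Two lists therefore produce three pages: index.html with both sections closed, and one page per list with that list opened.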

d

Make sure the output directory exists

The Picture::List class

We will now look at the code that manages the Picture::List objects. These objects represent individual lists of photos that belong together, and manage the order of these photos, as well as how they should look in an index entry.

package Picture::List;

use strict;

use CGI qw(-no_debug);

sub new {

my $proto = shift;

my %attrs = @_;

my $class = ref($proto) || $proto;

my $self = {%attrs, _pictures => []};

bless $self => $class;

}

There is nothing surprising in the package header or its constructor. We use the CGI module again for the creation of the HTML involved, to save work and debugging. A new object gets initialized by using the arguments passed into the constructor, and a reference to an empty array is added. This array will be used to store all the pictures in this list.

To add those pictures, the method add_picture() is provided. It instantiates a new Picture object and pushes it onto the end of that array.

sub add_picture {

my $self = shift;

my $picture = Picture->new(@_);

push @{$self->{_pictures}}, $picture;

}

sub last_picture {

my $self = shift;

$self->{_pictures}->[-1];

}

The method last_picture() is provided to get a reference to the current picture, so that outside code can easily obtain this information.

In the generate_output() method of the Picture::Index class, we discussed asking the Picture::List objects to generate the HTML for themselves. This is accomplished in the following code:

sub index_entry {

my $self = shift;

my $selected = shift;

my $h = CGI->new();

if ($selected)

b

Need to generate HTML for a selected list

c

Need to generate HTML for an unselected list

There are two main branches in this subroutine: (b) one to generate HTML for a selected index entry, and (c) one to generate HTML for an unselected index entry (see figure 6.2). The code in each of these branches is almost identical, and generates HTML for a table row representing the index entry for this picture list, with a little arrow picture in the left table cell, and the description of the list in the right.

If the list is currently selected, thumbnails of the photos in the list need to be printed, together with a short description. This is done by setting up a loop which calls the generate_output() method for each picture, and adds the result of that call to the current output. In this way, the Picture::List class delegates part of the work to the Picture class.

The Picture class

We will now have a look at the last package, which is responsible for all the image handling in this program.

$self->{html_name} = "$self->{name}.html";

$self->{thumb_name} = "$self->{thumb_dir}/$self->{name}.jpg";

$self->{pic_name} = "$self->{pic_dir}/$self->{name}.jpg";

bless $self => $class;

}

f

The Picture class uses the CGI module for the HTML output, but it also needs Image::Magick to be able to read the source images, and write the resized and reformatted images for the web photo album and the thumbnails.

g

As in the Picture::Index package, there is a %option hash that is used to provide defaults for the various attributes of the pictures. The attributes specify what the size of the images will be for the single pages (they will fit inside a 400 by 400 pixel square), and for the thumbnails (70 by 70 square maximum). The source images will be sought in the directory specified by source_dir, and there are three attributes that control the output directories: out_dir, thumb_dir and pic_dir. All of the HTML files will be written in out_dir, all full-size pictures in out_dir/pic_dir and all thumbnail pictures that are displayed in the indexes will be put in out_dir/thumb_dir. The overwrite attribute controls whether already existing images will be overwritten or left alone. Each of these attributes can be overridden by specifying an appropriate <option> tag in the XML file. The constructor for this class initializes an object with defaults copied from the %option hash, and sets up the file names of the various output files for this particular photo.

f

The Image::Magick module is needed for image sizing

g

Default options for the photos
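The arithmetic behind "fitting inside a square" can be worked through in a few lines. The helper fit_within() is hypothetical and is not part of the Picture module or of Image::Magick; it assumes that the aspect ratio is preserved and that images already small enough are left alone, which may differ from what the real program does.

```perl
use strict;
use warnings;

# Hypothetical helper: scale ($width, $height) so that the image fits
# inside a $max by $max square. Assumes the aspect ratio is kept and
# that smaller images are never enlarged.
sub fit_within {
    my ($width, $height, $max) = @_;
    my $longest = $width > $height ? $width : $height;
    return ($width, $height) if $longest <= $max;
    my $scale = $max / $longest;
    # Round to the nearest whole pixel
    return (int($width * $scale + 0.5), int($height * $scale + 0.5));
}

my @picture   = fit_within(3000, 2000, 400);   # a landscape source image
my @thumbnail = fit_within(3000, 2000, 70);
my @small     = fit_within(50, 40, 70);        # already fits

print "picture: @picture, thumbnail: @thumbnail, small: @small\n";
```

For a 3000 by 2000 pixel source this gives 400 by 267 pixels for the single page and 70 by 47 pixels for the thumbnail: the longest side is scaled to the limit and the other side follows.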

All these options can be changed per object, or for the whole class with the set_option() method:
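A minimal sketch of such a constructor and set_option() method is given below, assuming they mirror the behavior described for Picture::Index. The three file-name assignments match the constructor fragment shown above; the default directory names thumbs and pics are invented for the illustration, and the CGI and Image::Magick imports are left out so that the fragment runs on its own.

```perl
package Picture;
use strict;
use warnings;

# Class defaults. The keys follow the attributes named in the text;
# the values 'thumbs' and 'pics' are invented for this sketch.
my %option = (
    source_dir => '/spare/pictures',
    out_dir    => 'OUT',
    thumb_dir  => 'thumbs',
    pic_dir    => 'pics',
    overwrite  => 0,
);

sub new {
    my $proto = shift;
    my $class = ref($proto) || $proto;
    my $self  = { %option, @_ };    # defaults, overridden by the arguments
    $self->{html_name}  = "$self->{name}.html";
    $self->{thumb_name} = "$self->{thumb_dir}/$self->{name}.jpg";
    $self->{pic_name}   = "$self->{pic_dir}/$self->{name}.jpg";
    return bless $self => $class;
}

sub set_option {
    my $proto = shift;
    my %attrs = @_;
    # An object changes itself; the class changes the shared defaults
    my $target = ref($proto) ? $proto : \%option;
    $target->{$_} = $attrs{$_} for keys %attrs;
    return;
}

package main;

Picture->set_option(thumb_dir => 'small');    # applies to later pictures
my $picture = Picture->new(
    name  => 'newborn',
    label => 'Maxine at 18 hours of age.',
);

print "$picture->{html_name} $picture->{thumb_name} $picture->{pic_name}\n";
```

Because the file names are computed in the constructor, a class-level option only influences pictures that are created after it has been set, which matches the way option elements take effect from their position in the XML file onward.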

Besides its full-sized and thumbnail images, each picture can also have text associated with it. This text is added with the add_text() method:

sub add_text {

my $self = shift;

$self->{text} .= shift;

}

Each photo can appear as an entry on an index page. The Picture::List class calls the index_entry() method to get the HTML when that is necessary, and that method is implemented as follows:

$h->a({-href => $self->{html_name}},

This method generates HTML for a single table row with two cells. The left cell contains the image tag for the thumbnail, and the right side contains the picture label.

Remember that this method was called from the index_entry() method in the Picture::List object, just after a call to generate_output() (see the code on page 114). The order of those two calls is important, because generate_output() is actually responsible for calling the method that sets the size of the thumbnail image:

sub generate_output {

my $self = shift;

local(*OUT);

my $pic_dir = "$self->{out_dir}/$self->{pic_dir}";

my $thumb_dir = "$self->{out_dir}/$self->{thumb_dir}";

mkdir($pic_dir) or die "Cannot mkdir $pic_dir: $!"

unless -d "$pic_dir";

mkdir($thumb_dir) or die "Cannot mkdir $thumb_dir: $!"

unless -d "$thumb_dir";

$self->generate_images;

$self->{text} =~ s#\n\s*\n#</p><p>#g;

my $h = CGI->new();

my $html_name = "$self->{out_dir}/$self->{html_name}";

open(OUT, ">$html_name") or

die "Cannot open $html_name: $!";
