
Figure 1. Yahoo Movies Advanced Title Search

DAVE TAYLOR

that would take HTML form tags as input and produce shell script segments as output. Uh, no thanks.

Instead, with a few hacks in vi (yeah, I don’t use Emacs), I have the following, as part of a usage() function:

usage()
{
cat << EOF
USAGE: findmovie -g genre -k keywords -nrst title
Where
  -n  only match those that have news or features
  -r  only match those with reviews
  -s  only match those that have showtimes
  -t  only match those that have trailers
and genre can be one of:
  act (Action/Adventure), ada (Adaptation), ani (Animation), ...
  tee (Teen), thr (Thriller), war (War) or wes (Western).
EOF
}

This makes life easy and pushes the trick of remembering the three-letter abbreviation for the genre onto the user. Sneaky, eh? Now, to be fair, good interface design would have me writing a more sophisticated script that lets users enter a variety of abbreviations (or the full word) and converts them into the proper Yahoo-approved abbreviation, but that’s actually work, so we’ll skip that too, okay?

Now, note the actual usage I’ve created:

USAGE: findmovie -g genre -k keywords -nrst title

This means there are a couple elements of the form that we are going to ignore in the script, including which decade the film was released and some of the more obscure conditional parameters.

Still, it’s enough to keep us busy.

Parsing Parameters with getopts

I’ve talked about the splendid getopts within shell scripts before, without which parsing the six parameters (two of which have arguments, four of which don’t) would be a huge hassle. Instead, this is straightforward. Here are the first few lines to give you the idea:

while getopts "g:k:nrst" arg
do
  case "$arg" in
    g) params="${params:+$params&}gen=$OPTARG" ;;

There’s a lot to talk about here, but we have covered getopts before, and you can <cough> check the man page too, right? In a nutshell though, a letter with a trailing colon means it has a required parameter, so g and k have arguments (g:k:), while n, r, s and t do not (nrst).
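For the curious, here is a minimal sketch of how the whole loop might look. The gen=, syn= and revs=1 names come from the sample output in this column; the query names behind -n, -s and -t aren't shown here, so the sketch accepts those flags and quietly ignores them. The build_params wrapper is a made-up name, used only so the loop can be called more than once.

```shell
#!/bin/sh
# Sketch only: build_params is a hypothetical wrapper, not part of the
# original script. The -n, -s and -t flags are accepted but not mapped,
# because their query-string names are not given in the article.
build_params() {
  params=""
  OPTIND=1                        # reset so the function can be called repeatedly
  while getopts "g:k:nrst" arg
  do
    case "$arg" in
      g) params="${params:+$params&}gen=$OPTARG" ;;
      k) params="${params:+$params&}syn=$OPTARG" ;;
      r) params="${params:+$params&}revs=1" ;;
      n|s|t) : ;;                 # accepted, deliberately left unmapped here
      *) return 1 ;;              # unknown flag or missing argument
    esac
  done
  printf '%s\n' "$params"
}

build_params -g war -k peace -r
```

Run as shown, this prints gen=war&syn=peace&revs=1.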

The params expansion is a nifty little shell trick that’s worth a special mention too. The notation ${params:+$params&} expands to the value of the $params variable, plus a trailing ampersand, if the variable already has a value. Otherwise, it’s the null string. The point? To avoid leading ampersands in the URL that we’re building.
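A quick way to convince yourself is to run the same expansion used in the g) branch twice, once with params empty and once with it set. The variable names first and second are just for this demonstration.

```shell
#!/bin/sh
# Demonstration of ${var:+word}: word is used only when var is set and non-empty.
params=""
first="${params:+$params&}gen=war"      # params is empty, so nothing is prepended
params="$first"
second="${params:+$params&}revs=1"      # params is set, so "gen=war&" is prepended

printf '%s\n' "$first"
printf '%s\n' "$second"
```

The first line printed is gen=war and the second is gen=war&revs=1, with no stray ampersand at the front of either.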

Let’s have a quick peek:

$ findmovie.sh -g war -k peace -r

finished. params = gen=war&syn=peace&revs=1

As we’d hope, the params variable has been expanded to reflect the specific values that the user has specified on the command line: in this case, War films that have reviews and contain the word “peace” in the synopsis.

Building the Full URL

There’s a hiccup waiting to bite us with the code in its current state though. The problem is, what if the user specifies two words in the keywords value field or, worse, does so in the title field (remember, the last word or words are the title pattern, the core search for the Yahoo Movies system)?

The answer is that we need to convert spaces into characters that are acceptable in a URL; here, each space becomes a plus sign. That’s easily done, fortunately:

params="$(echo $params | sed 's/ /+/g')"

It’s not the most elegant solution, but it’s certainly functional!
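If you want something a touch more careful, one possible variation is to quote the value and let tr do the swap. Quoting keeps the shell from collapsing runs of spaces or expanding a stray * in the keywords before the substitution happens. The helper name plus_spaces is invented for this sketch and isn't part of the script.

```shell
#!/bin/sh
# Sketch only: plus_spaces is a hypothetical helper name.
# It replaces every space in its first argument with a plus sign.
plus_spaces() {
  printf '%s\n' "$1" | tr ' ' '+'
}

plus_spaces "syn=war and peace"
```

This prints syn=war+and+peace. Note that it only handles spaces, just like the sed version; other characters that need escaping in a URL are left alone.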

The bigger problem here is that Yahoo requires certain parameters actually be present to do a search. Choose a genre on the Web interface and click search, and you’ll see that’s not sufficient for it to proceed.

As a result, our base URL for searches is going to be a bit more complicated:

baseurl="http://movies.yahoo.com/mv/search"

baseurl="${baseurl}?yr=all&syn_match=all&"

Try that, and you’ll find it doesn’t work. Why? Because there are some hidden parameters that Yahoo has slipped into the form that are required to send to the search program. Without them, it just stops.

In fact, here’s the baseurl value we need:

baseurl="http://movies.yahoo.com/mv/search"

baseurl="${baseurl}?yr=all&syn_match=all&adv=y&type=feature&"

Now, how do we put this all together? It’s not so easy, because we still need to grab whatever’s on the end of the invocation (the title pattern), then mask the spaces:

shift $(( $OPTIND - 1 ))

Hang on, let me explain this line before we go further. OPTIND contains the index into the positional parameters of the script, indicating the first parameter that wasn’t absorbed by the getopts processing. Positional parameters are numbered from 1, so when OPTIND is, say, 4, getopts has consumed three words. The result? We have to shift by OPTIND minus one to discard exactly those words, leaving the title pattern to be picked up with the $* notation:

params="$(echo $params | sed 's/ /+/g')"

pattern="$(echo $* | sed 's/ /+/g')"

echo URL: $baseurl${params}\&p=$pattern
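To watch OPTIND and shift at work in isolation, here is a small sketch that parses the same option string, throws the options away and prints whatever is left. The function name leftover is made up for the demonstration.

```shell
#!/bin/sh
# Sketch only: leftover is a hypothetical helper that prints the words
# remaining after getopts has consumed the options.
leftover() {
  OPTIND=1                        # start parsing from the first argument
  while getopts "g:k:nrst" arg
  do
    :                             # option handling omitted in this sketch
  done
  shift $(( OPTIND - 1 ))         # drop the words getopts already consumed
  printf '%s\n' "$*"
}

leftover -g war -r love story
```

Here getopts consumes three words (-g, war and -r), so OPTIND ends up at 4, the shift drops three, and the function prints love story.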

Now, finally, armed with that, we can search for films that contain the word “love” and have reviews:

$ findmovie.sh -r love

URL: ...BASEURL...revs=1&p=love

Type that in, and you’ll find it works fine, showing 80 films where “love” appears in the title and Yahoo Movies is aware of on-line reviews of the films.

Most Linuxes and other flavors of UNIX have a way that you can launch a Web browser from the command line, with the specified URL as its home. That’s what we’ll do:

echo $baseurl${params}\&p=$pattern

open -a safari "$baseurl${params}&p=$pattern"
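The open command is the Mac way of doing it. On a Linux desktop, xdg-open usually plays the same role, so a script that has to run in both places might pick whichever one it finds. What follows is a rough sketch under that assumption: pick_opener is a made-up helper name, and the final line only prints the command instead of launching a browser.

```shell
#!/bin/sh
# Sketch only: pick_opener is a hypothetical helper. It prints the name of
# a URL-opening command found on this machine, or "none" if neither exists.
pick_opener() {
  if command -v xdg-open >/dev/null 2>&1; then
    echo xdg-open                 # common on Linux desktops
  elif command -v open >/dev/null 2>&1; then
    echo open                     # Mac OS X, as in the example above
  else
    echo none
  fi
}

# Same pieces as in the article, with sample values for params and pattern.
baseurl="http://movies.yahoo.com/mv/search"
baseurl="${baseurl}?yr=all&syn_match=all&adv=y&type=feature&"
params="revs=1"
pattern="love"

opener="$(pick_opener)"
echo "would run: $opener \"$baseurl${params}&p=$pattern\""
```

To launch for real, replace the last echo with a call to the chosen command, keeping the URL inside double quotes so the shell doesn't treat the ampersands as background operators.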

There are other things we can do now that we’ve converted the Yahoo advanced search form into a shell script, but we’ll leave those for next month!

Dave Taylor has been hacking shell scripts for a really long time, 30 years. He’s the author of the popular Wicked Cool Shell Scripts and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.

WORK THE SHELL
