Languages of the Internet: Perl and HTML

In document Oops, page not found. (Page 95-99)

It is an interesting phenomenon that most computer scientists go through a 5-year period early in their career when they think that computers and programming are really, really neat. They will kill enormous amounts of time customizing their personal computers to get everything to work just right and learn all the arcane details of the latest programming languages.

Students in this obsession phase are a joy to have around, largely because their professors have long since left it. These days, I am much more excited about finding interesting things to do with computers (like predicting jai alai matches) than I am in dealing with an upgrade from Windows 98 to Windows 2000. Fortunately, Dario did Windows, and a whole lot more.

Dario was particularly eager to learn the behind-the-scenes language that makes the Internet go, a programming language called Perl. Perl is not as much a reflection of hot new technology as it is a manifestation of old ideas freshly applicable to today’s problems.

Although the explosive growth of the Internet has clearly been the most exciting recent development in computer technology, the dirty truth is that it really doesn’t require much computing to make the Internet work.

Throughout most of the information age, computers spent the bulk of their time crunching numbers (like predicting the weather) or in business data processing (doing things like payroll and accounting). Most applications ran on expensive, mainframe computers that kept busy around the clock and charged users for every minute of computer time.

Fast forward to today. Now millions of desks across the nation contain personal computers, each of which is vastly more powerful than the “big iron” of yesteryear. And what do we do with the billions of instructions per second that we have at our disposal? We run increasingly elaborate screen-saving programs whose shimmering images decorate our desks as they protect the phosphors on our monitors.

The truth is that the Internet is really about communication, not computation. Although the Worldwide Web has been dubbed the “World-wide Wait” because of sluggish response times, the primary source of these delays is not insufficient processing power but the problem of too many people trying to use too few dedicated telephone lines – all at the same time.

An embarrassingly high percentage of the computing tasks associated with the Worldwide Web are basic bookkeeping and simple text reformatting. Perl is a language designed to make writing these conversion tasks as simple and painless as possible. Depending upon whom you believe, Perl is an acronym for either “practical extraction and report language” or “pathologically eclectic rubbish lister.” The goal of its creator, Larry Wall, was to “make the easy jobs easy, without making the hard jobs impossible.”

Perl programs are not particularly efficient, but they are particularly short. They are designed to be written quickly, plugged in place, and forgotten. No one would think of building a Monte Carlo simulation to simulate a million jai alai games in Perl because such high-performance number-crunching jobs must be carefully written to utilize the machine efficiently.

Perl is for those quick-and-dirty, hit-and-run reformatting tasks that help programmers untangle the Web.

One of the common text-processing tasks in which Perl scripts are used is preparing WWW pages on demand from databases. Look up your favorite book (ideally, look up my book) on Amazon.com or some other on-line book dealer and you will see a customized page with the title and publisher, a picture of the cover, reader-supplied reviews, and even the current rank on the company’s bestsellers’ list. This WWW page was not written by a person but by a computer program that extracts the relevant information from the database and adds formatting commands to make it look right on the reader’s screen.
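To make the idea concrete, here is a minimal sketch of such a page-generating program. It is written in Python rather than Perl, and it is only an illustration: the record fields, the sample values, and the function name `render_book_page` are all hypothetical, and no real bookseller's database or page layout is being reproduced.

```python
from html import escape


def render_book_page(record):
    """Return an HTML page (as a string) built from one database record.

    `record` is a hypothetical dictionary with the keys 'title',
    'publisher', 'rank', and 'reviews' (a list of strings).
    """
    # Escape every piece of text so characters like & and < display correctly.
    reviews = "\n".join(f"<li>{escape(text)}</li>" for text in record["reviews"])
    title = escape(record["title"])
    return (
        "<HTML>\n"
        f"<HEAD><TITLE>{title}</TITLE></HEAD>\n"
        "<BODY>\n"
        f"<h1>{title}</h1>\n"
        f"<p>Publisher: {escape(record['publisher'])}</p>\n"
        f"<p>Sales rank: {record['rank']}</p>\n"
        f"<ul>\n{reviews}\n</ul>\n"
        "</BODY>\n"
        "</HTML>\n"
    )


if __name__ == "__main__":
    # Made-up record standing in for one row of the database.
    example = {
        "title": "An Example Book",
        "publisher": "Example Press",
        "rank": 1234,
        "reviews": ["Short & sweet", "Could not put it down"],
    }
    print(render_book_page(example))
```

The program does no real computing. It simply copies fields from the record into a fixed template, which is exactly the kind of bookkeeping and text reformatting described above.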

A second language of the Internet is HTML, an abbreviation for “hypertext markup language.” HTML is the language in which all WWW pages are written, that is, the text spit out by Amazon.com’s Perl programs.

It really isn’t a computer programming language at all, for you can’t write a program in HTML to do anything. This language provides a medium for an author (or computer) to specify what a WWW page should look like to the reader.

As we saw, Milford’s schedule and results files were presented as unexciting-to-read but simple-to-parse text files. Dania Jai-Alai was more ambitious and used HTML formatting to present its results and schedule files. The following portion of a Dania schedule file illustrates HTML:

<HTML>

<HEAD>

<TITLE>Entries Shell</TITLE>

</HEAD>

<BODY BGCOLOR= "#FFFFff" TEXT = "#000000" LINK = "#FF0000"

VLINK= "#0f4504">

<font color= "#ff0000">

<center><img src= "botlogo.gif"></center>

ENTRIES DANIA JAI ALAI AFTERNOON 07/19/98 14 GAMES</font>

<table cellpadding="15" align="top">

<tr align=left valign="top">

<td>

<!--column 1 entries -->

<table valign="top">

<tr valign=top align=left>

<td><font color="#ff0000">GAME 1 - Spec 7 -Tri,DD<br></font>

<font>1 Mouhica-Oyhara<br>

2 Blanco-Verge<br>

3 Scotty-Zuri<br>

4 Arecha-Inigo<br>

5 Rocha III-Ondo<br>

6 Aymar-Eneko<br>

7 Laucirica II-Bilbao<br>

8 Andonegui-Homero<br>

SUBS: Burgo-Ulises</font></td>

</tr>

The formatting commands of HTML appear within the angle brackets such as <TITLE>. This portion starts by presenting the title of this page and then specifies the color of both the background and the text (the actual colors are described by “names” like #ff0000). It then specifies that a picture named “botlogo” should be inserted, neatly centered in the middle of the line. The schedule of each game is formatted as a table in which each row presents the post number and the two members of each doubles team.

This HTML formatting may seem ungainly, but you weren’t intended to read it – your WWW browser was. It would be tedious for a person to write all those formatting commands each day, but that was done by a Perl program, not a person. As is the case with Amazon.com, these WWW pages are produced by formatting the information in a database using a straightforward computer program. Because a computer program writes the actual HTML files, we can rely on the format to be the same day to day without any typing or formatting errors.

My student Dario did not have access to the fronton’s private database containing the unformatted schedule and result information. However, he did have access to these HTML pages. By writing his own Perl program, he could carefully strip away all that formatting the fronton’s program had diligently inserted. He could take the remaining data and format it just as we did with the Milford data, enabling us to add it each day to our library of jai alai scores. Once we had amassed enough data to work with, our fun could really begin.
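The stripping step can be sketched in a few lines. The following is not Dario's program, which was written in Perl and is not shown here. It is a minimal Python illustration under stated assumptions: the function names `strip_tags` and `parse_entries` are invented for this sketch, the sample text is a few lines taken from the Dania listing above, and the pattern for an entry line (post number, then two names joined by a hyphen) is inferred from that listing alone.

```python
import re

# A few lines from the Dania schedule listing, used as sample input.
SAMPLE = """<td><font color="#ff0000">GAME 1 - Spec 7 -Tri,DD<br></font>
<font>1 Mouhica-Oyhara<br>
5 Rocha III-Ondo<br>
SUBS: Burgo-Ulises</font></td>"""


def strip_tags(html):
    """Remove comments and tags, returning the non-empty lines of text."""
    text = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)      # drop comments
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)  # <br> ends a line
    text = re.sub(r"<[^>]+>", "", text)                           # drop all other tags
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_entries(lines):
    """Extract (post, player, player) from lines such as '1 Mouhica-Oyhara'."""
    entries = []
    for line in lines:
        # Assumed pattern: leading post number, then two names split at the
        # first hyphen. Header and SUBS lines do not start with a digit.
        match = re.match(r"^(\d+)\s*(.+?)-(.+)$", line)
        if match:
            entries.append((int(match.group(1)), match.group(2), match.group(3)))
    return entries


if __name__ == "__main__":
    for entry in parse_entries(strip_tags(SAMPLE)):
        print(entry)
```

Regular expressions are a blunt instrument for HTML in general, but they suit this job because the pages are machine-generated and keep the same format from day to day.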

Any discussion of the languages of the Internet would be incomplete without mentioning Java. At the risk of slightly oversimplifying things, Java is a programming language for writing programs that will run on somebody else’s machine, typically using an Internet browser.

For example, suppose I want to put a facility on the WWW enabling you to calculate the amount of money you will pay each month if you take out a mortgage. I could create a WWW page that would prompt you to type in the interest rate, loan amount, and term of the loan, then calculate the number on my machine, and send this number to you on your machine.

Alternatively, I could write a little program in Java that my machine could give your machine, which, when run on your machine, would prompt you for the relevant numbers and do the calculation there. This second arrangement is better for me, in that it reduces the amount of interaction on my machine, and also better for you because I don’t get to know how much money you are thinking of embezzling from the bank.
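Whichever machine does the work, the calculation itself is tiny. Here is a minimal sketch in Python rather than Java, assuming the standard fixed-rate amortization formula, payment = P·r / (1 − (1 + r)^(−n)), where P is the loan amount, r the monthly interest rate, and n the number of monthly payments. The function name and parameters are invented for this illustration.

```python
def monthly_payment(principal, annual_rate_percent, years):
    """Monthly payment on a fixed-rate loan that is fully repaid over `years`."""
    n = years * 12                      # number of monthly payments
    r = annual_rate_percent / 100 / 12  # monthly interest rate as a fraction
    if r == 0:
        return principal / n            # no interest: just split the principal
    return principal * r / (1 - (1 + r) ** -n)


if __name__ == "__main__":
    # A 100,000 loan at 6% over 30 years comes to roughly 599.55 per month.
    print(round(monthly_payment(100_000, 6.0, 30), 2))
```

Since the formula needs only three numbers and a handful of arithmetic operations, nothing is lost by running it on the reader's machine instead of the server's.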

We don’t use Java anywhere in our system because there is no program we want to run on somebody else’s machine, and because no fronton’s WWW site provides a program that we want to run (as opposed to data, which we want to read). Still, Java is a good thing. In fact, it is such a good thing that Microsoft devoted considerable energy and resources trying to kill it.
