The Document Head - MUSIC FORM Submit to Server Now

MUSIC FORM Submit to Server Now

10. The Document Head

l 10.1 Introduction

l 10.2 The title Element

l 10.3 The meta Element

l 10.4 Scripting

l 10.5 The style Element

l 10.6 The link Element

l 10.7 The base Element

10.1 Introduction

In an HTML document, the head element contains a set of elements that are not concerned with the content of the web page but give additional meta information related to styling, what the page contains, how it is related to other pages etc. The set of elements allowed in the head of the document are:

title

The title of the document.

meta

Defines a set of metadata associated with the document. It is of use to browsers, search engines, editors etc.

script

Defines a script to be associated with the Web page. Although not part of the head, the noscript element will also be described.

style

Defines a style sheet to be used by the document link

Defines document relationships base

resolves relative URLs

10.2 The title Element

<title >

The title element must be present in a document for it to be legal HTML. It serves to identify the document and it is normal for the browser to add the title to its own decoration at the top of the browser window. When a user stores a page in a bookmark (or favourites) list, the bookmark will be identified with the title of the document. In consequence, the title chosen should place the document in its context.

10.3 The meta Element

<meta name= content= http-equiv= scheme=/>

The meta element is a major way that a search engine locates pages. Browsers are not required to support the meta element so there are few hard and fast rules about what is allowed and what it does.

In many cases it may be browser or search engine specific. The two attributes name and content define a property and its value. As HTML does not define any values that the attribute name can take, it is largely up to the users of the page to decide what to do with the information. Here is an example of a typical web site aiming to get itself recognised by the search engines (http://www.equineworld,co.uk):

<html>

<head>

<title>Horses and ponies, UK</title>

<meta name="description" content="Horses and Ponies: about horses, ponies, horse riding and equestrianism in the UK. Horses for sale, equestrian news, equine care, horse chat and more.">

<meta name="keywords" content="horses, horse, equestrian, equine, ponies, pony, horses for sale, uk">

<meta name="robots" content="index, follow">

<!--Horses and Ponies: about horses, ponies, horse riding and

equestrianism in the UK. Horses for sale, equestrian news, equine care, horse chat and more.-->

<meta name="owner" content="Equine World UK Ltd">

<meta name="subject" content="horses and ponies, uk">

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

<meta http-equiv="PICS-Label" content='(PICS-1.1

"http://www.icra.org/ratingsv02.html" l gen true for

"http://www.equine-world.co.uk" r (ca 1 lz 1 nz 1 oz 1 vz 1)

"http://www.rsac.org/ratingsv01.html" l gen true for

"http://www.equine-world.co.uk" r (n 0 s 0 v 0 l 0))'>

There are two main types of search engines. One set crawl the web looking at pages while the other type tries to get some human cataloguing of the information. Many of the ones used today are a mixture of both approaches. The web crawl involves looking at pages and following links to other pages. The better ones may look at all the text on a page. The worst ones start at the top and may only look at the head of the document. The meta element is one of the elements that some search engines look for. But do not assume that by using lots of meta elements you are sure to be recognised. Search engine spamming is a problem. if you add words to get your pages listed in a search, the search engines will detect that so make sure any words you use in the meta information are appropriate. For a commercial web site, it is essential that for a relevant query, the site gets ranked in the top 10. If you land up on the second page, you are close to death! So target keywords important for your site must be in the title of the page and in the meta information. Search engines have realised that meta information can be rigged and so the use of this information by search engines is getting less and less important. For example, neither Google or Alta Vista use meta element information to rank web pages any more. On the other hand, Inktomi does. Most read and use the information in the meta elements in some way. The two most important properties defined by the name attribute are description and keywords. If the search engine believes the page is honest, it will use the description in place of one that it might try to derive from its own activities. In the example above, the description aims to give a good overview of the whole site. It is clear that the site is about both horses and ponies, that people in the UK are most likely to be interested, that you can buy a horse there and so on. The keywords property tries to guess the kinds of searches that people who need to find this site might type.

The robots property is primarily aimed at stopping robots visiting pages of no value If the content is set to noindex robots will not visit the page. In the example, the author has requested a robot to index the page and follow any links on the page. You may wonder why the author has put a comment containing all the keywords after the meta information itself. As some search engines do not look at meta information and believe information near the top of the page is most important, adding the comment effectively acts as a meta element for those search engines.

Some of the other values of the name attribute are:

author

The author's name generator

The tool used to create the page formatter

Similar to generator copyright

Equivalent to a copyright statement subject

Similar to tile

It is difficult to be precise as anybody is allowed to dream up values and there usefulness is dependent on industry using them as de facto standards. A set of standard terms would have made the meta element more useful.

A good site for further information about search engines and what they do is http://www.searchenginewatch.com/.

The attribute http-equiv effectively defines an HTTP header which controls the action of the browser receiving the page and is used to refine the information provided by the actual headers in the HTTP.

Some servers translate the meta information into actual HTTP headers automatically. In the example above, the meta elements define the content type of the page and its PICS label (this will be covered in the discussion of metadata). Other possibilities might be:

<meta http-equiv="expires" content=" Wed 28 Feb 2002 23:59:59 GMT">

<meta http-equiv="Pragma" content="no-cache">

<meta http-equiv="Refresh" content="0;URL=http://www.newurl.com">

<meta http-equiv="Set-Cookie"

content="cookievalue=xxx;expires=Wednesday, 28-Feb-02 18:12:20 GMT;

path=/">

The first states when the page is no longer valid. The second stops Netscape caching the information.

The third states when the page should be refreshed and the fourth defines a cookie. Fuller details of what is possible is defined in HTTP.

The scheme attribute has been added as a qualifier on what the meta information means. For example:

<meta scheme="ISBN" name="identifier" content="0-201-17805-2">

This states that to interpret the meaning of identifier and the number, you need to realise that the scheme being used is the ISBN classifications scheme. By knowing this, the meta information can be correctly interpreted as the book Raggett on HTML 4.

10.4 Scripting

<script type= src= >

<noscript >

To a large extent, scripting is outside the scope of this Primer. Suffice is to say that scripts can be placed in either the head or the body of the HTML document. Several scripts can be provided at different points in the page but effectively all the scripts are put together as one script associated with the page. Putting the script in the head of the page is more elegant but may not be the best place.

Remember that search engines do not normally read the whole page. In consequence if you have a long script associated with your page, it is possible that all the search engine reads is a load of non-English rubbish as far as it is concerned. An example use is:

<script type="text/vbscript"

src="http://www.w3.org/script/script.js" />

Here the script is defined external to the document. The type indicates the MIME type of the script

<script type="text/javascript">

. . . </script>

In this case the script is enclosed between the start and end tags. The body element has an attribute that can cause the script to be executed when the page is loaded or unloaded:

<body onload="processscript()">

<body onunload="processscript()">

Most elements have the ability to define a script invocation on being clicked or moved over. For example:

<h1 onclick="ps()">ABC</h1>

<p onmousedown="ps()">ABC</p>

The complete set of events are:

onload

Occurs when the browser completes the page load. Only available on the body element.

onunload

Occurs when the browser unloads the page. Only available on the body element.

onclick

Occurs when the mouse button is clicked over an element.

ondblclick

Occurs when the mouse button is double clicked over an element.

onmousedown

Occurs when the mouse button is pressed over an element.

onmouseup

Occurs when the mouse button is released over an element.

onmouseover

Occurs when the mouse is moved over an element.

onmousemove

Occurs when the mouse is moved while it is over an element.

onmouseout

Occurs when the mouse is moved away from an element.

onfocus

Occurs when an element receives focus. To be described under Accessibility onblur

Occurs when an element loses focus. To be described under Accessibility onkeypress

Occurs when a key is pressed and released over an element onkeydown

Occurs when a key is pressed over an element onkeyup

Occurs when a key is released over an element onsubmit

Occurs when a form is submitted onreset

Occurs when a form is reset

onselect

Occurs when a user selects some text in either the input or textarea elements onchange

Occurs when a device loses focus and its value has changed since gaining focus

10.5 The style Element

<style type= >

The style element defines a stylesheet to be used by the document in much the same was as the script element defines a script. An example is:

<style type="text/css" media="print"> . . . </style>

The type attribute defines the type of stylsheet following and the media attribute indicates to which media it applies.

10.6 The link Element

<link href= type= media= hreflang= charset=/>

The link element has been available in HTML almost since the beginning. the intention was that the element would give information concerning how this page relates to other pages. As two pages can each link to the other, it is not always obvious which is the parent and which is the child. The link element was designed for that purpose and the hope was that search engines would use the link information to improve search relevance. To a large extent it is not used in that role. However there are some specific uses of the link element:

<link href="../primerstyle.css" rel="stylesheet" type="text/css"

media="print"/> <link rel="alternate" hreflang="pt" type="text/html" />

<link rel="alternate" hreflang="ar" charset="ISO-8859-6"

type="text/html" />

The first example shows a link to an external stylesheet to be used when the document is printed. The second gives an alternative to the current page in Portuguese. The third links to an alternative page in Arabic. In this case, the character set needs to be defined also. Search engines may offer the linked pages when the user as requested pages in a specific language.

10.7 The base Element

<base href= />

The base element provides the base for relative URLs. For example:

<base href="http://www.w3.org/nuts/index.htm" />

. . . < a href="../brazil/origin.htm" />

This would take the relative URL and, as the base element is declared, the relative URL will be resolved to:

http://www.w3.org/brazil/origin.htm

by effectively treating the current page as though it were the one defined by the base element.

In document HTML (Page 43-48)