Parsing Variables - WSh Parser and Interpreter

5.7 WSh Language

5.7.1 WSh Parser and Interpreter

5.7.1.3 Parsing Variables

The language often has to deal with objects that have attributes. Getting regular users to understand object-oriented concepts could be harder than having them work with simple XML tags with attributes.

Hence, Web2Sh uses item as a general type, where the name or id attribute serves as the item identifier. In addition, a sub-type is decided based on the tag name. Web commands need to create and manipulate many markup tags; hence, the item structure is very useful.

    $imag = <img src="url" alt="text"/>

The example above creates an item variable of sub-type img (an HTML image). Each variable is treated as a special command. If it is followed by a pipe, the command will return its own value to the next command.

So both of these forms are valid:

    $link = <a href="url"> link text </a>
    echo $link;
    $link | echo ; // will print link text

Using any attribute name as a flag will return that attribute's value.

• Item attributes must be unique;

• An item value will be decided in this order:

o The tag content, which may contain other items, any valid UTF characters, or be empty;

o The content of the “value” attribute, which may contain any valid UTF characters or be empty.

When an item command is called, its tag content is checked first and, if found, returned. If there is no tag content, the Interpreter looks for the content of the default attribute, value. If that is found, it is returned; otherwise, an empty string is returned. If an item has both tag content and a value attribute, only the tag content is returned. To access the value attribute, a flag with the attribute name must be used. For example:

    $item = <string value="test attribute"> inner tag text</string>
    echo $item; // will print "inner tag text"
    $item -value | echo ; // will print "test attribute"

WSh variables are able to hold a list of values. This feature was needed because many Web2Sh commands produce a list of items as their result.

For example:

    fetch "http://www.bbc.co.uk" | getLinks | $bbcLinks;

This will retrieve all available links on the BBC home page and store the values as a list in the $bbcLinks variable.
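The item value rules described in this section (tag content first, then the value attribute, then an empty string, with an attribute name used as a flag returning that attribute) can be sketched in Java. This is only an illustration under stated assumptions: the Item class and its methods are hypothetical and are not taken from the Web2Sh source, and both the trimming of tag content and the empty string returned for a missing flagged attribute are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a WSh item; names are illustrative, not Web2Sh's own classes.
final class Item {
    private final String subType;      // decided by the tag name, e.g. "string" or "img"
    private final String tagContent;   // empty when the tag has no content
    private final Map<String, String> attributes = new LinkedHashMap<>();

    Item(String subType, String tagContent) {
        this.subType = subType;
        // Assumption: surrounding whitespace in the tag content is not significant.
        this.tagContent = tagContent == null ? "" : tagContent.trim();
    }

    // Adds an attribute; attribute names must be unique within an item.
    Item attr(String name, String value) {
        if (attributes.putIfAbsent(name, value) != null) {
            throw new IllegalArgumentException("duplicate attribute: " + name);
        }
        return this;
    }

    String subType() {
        return subType;
    }

    // Value of the item when called without flags:
    // tag content first, then the "value" attribute, then an empty string.
    String value() {
        if (!tagContent.isEmpty()) {
            return tagContent;
        }
        return attributes.getOrDefault("value", "");
    }

    // Value of the item when an attribute name is used as a flag.
    // Assumption: a missing attribute yields an empty string.
    String value(String flag) {
        return attributes.getOrDefault(flag, "");
    }

    public static void main(String[] args) {
        Item item = new Item("string", " inner tag text").attr("value", "test attribute");
        System.out.println(item.value());         // prints "inner tag text"
        System.out.println(item.value("value"));  // prints "test attribute"
    }
}
```

An item with no tag content, such as an img item, falls through to its value attribute and, if that is also absent, to an empty string.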

5.8 Summary

The work in this research required the implementation of a dedicated scripting language to achieve the goal of end-user Web automation. This chapter presented the process of building the Web2Sh dedicated scripting language (WSh). It explained the software tools philosophy employed by WSh and presented the background for defining new scripting languages. The design and development of the WSh language were explained, and the language’s informal and formal specifications were introduced.

Chapter Six:

WEB COMMANDS

6.1 Introduction

This chapter presents Web commands as the basic building block for writing Web automation scripts. A detailed description of the syntax, structure, and characteristics of Web commands is provided and the Web2Sh implementation of the pipeline concept is discussed.

Web commands are tools and filters created with the specific goal of providing solutions for automating a set of problems in various domains connected to user/Web interactions. Moreover, the generic approach of Web2Sh allows for connecting any set of Web commands using pipes. This provides general solutions for a large set of problems to meet Web users’ needs.

Web2Sh offers a repository of basic commands in various domains as a proof of concept. The goal is to enable end users to develop new commands and be able to customise existing ones. In addition, the collaborative nature of the framework offers good potential to utilise end users’ experiences and to develop and share commands in one location.

An important objective of the Web command class design is to offer support for different resource types. To this end, the functionality of many Web2Sh commands is designed in a way that supports different approaches based on the MIME type of the resource piped to it. Web2Sh achieves this by implementing the command pattern to handle a command execution, where the action of a command is decided by the receiver class (Web resource).

The receiver class in the command pattern can be any class representing a valid MIME type. For additional generality, as the resource type is only decided at run time, the receiver object is generated dynamically through the implementation of the factory design pattern. This thesis refers to this concept as polymorphic MIME types support. This implies that each command may function in a logically related but different way according to the type it handles.

According to this design feature (i.e. polymorphic MIME types), each Web command may support different functionality for different resources. For example, the title Web command returns the value of the title tag in the case of an HTML page, while it returns the value of the title file property in the case of a PDF file.
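The combination of the command pattern and a factory keyed on MIME type can be sketched as follows. This is a minimal illustration, not the Web2Sh implementation: every type and method name here (WebResource, HtmlPage, PdfFile, Command, TitleCommand, createResource) is hypothetical, and the PDF receiver takes its title directly as a stand-in for reading real file properties.

```java
// Hypothetical sketch of polymorphic MIME type support; not the Web2Sh classes.
final class MimeDispatchDemo {

    // Receiver: a Web resource of one MIME type decides what "title" means.
    interface WebResource {
        String title();
    }

    static final class HtmlPage implements WebResource {
        private final String html;

        HtmlPage(String html) {
            this.html = html;
        }

        // Returns the text between <title> and </title>, or "" if absent.
        public String title() {
            int start = html.indexOf("<title>");
            int from = start + "<title>".length();
            int end = html.indexOf("</title>");
            if (start < 0 || end < from) {
                return "";
            }
            return html.substring(from, end).trim();
        }
    }

    static final class PdfFile implements WebResource {
        private final String titleProperty;

        // Stand-in: a real receiver would read the title from the PDF file properties.
        PdfFile(String titleProperty) {
            this.titleProperty = titleProperty;
        }

        public String title() {
            return titleProperty;
        }
    }

    // Factory: the receiver type is chosen at run time from the MIME type.
    static WebResource createResource(String mimeType, String content) {
        switch (mimeType) {
            case "text/html":
                return new HtmlPage(content);
            case "application/pdf":
                return new PdfFile(content);
            default:
                throw new IllegalArgumentException("unsupported MIME type: " + mimeType);
        }
    }

    // Command: delegates the actual work to whichever receiver it is given.
    interface Command {
        String execute(WebResource receiver);
    }

    static final class TitleCommand implements Command {
        public String execute(WebResource receiver) {
            return receiver.title();
        }
    }

    public static void main(String[] args) {
        Command title = new TitleCommand();
        WebResource page = createResource("text/html",
                "<html><head><title>BBC - Home</title></head></html>");
        WebResource pdf = createResource("application/pdf", "Annual Report");
        System.out.println(title.execute(page)); // prints "BBC - Home"
        System.out.println(title.execute(pdf));  // prints "Annual Report"
    }
}
```

The command itself contains no type checks; supporting a further MIME type in this sketch means adding one receiver class and one factory case.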

The context of the Web command may also alter the actions it takes. For example, the getImages Web command looks ahead to see whether another Web command follows it in the pipeline. If there is none, it extracts all img tags in its input stream and writes them to the output stream. Otherwise, if the following command performs image processing, getImages retrieves the actual image, saves it, and writes its URL on the Web2Sh server to the output stream.

The super class WebCommand specifies that a Web command imports a set of built-in Java library classes, most significantly java.io.PipedInputStream and java.io.PipedOutputStream, as the standard input and standard output of each command. This allows commands to treat their input as a piped stream of bytes and to produce their output as a stream of bytes. The PipedInputStream and PipedOutputStream classes allow the OutputStream of one Web command to become the InputStream of the following command in a pipe. The idea is that, at one end of the pipe, a writer thread writes to a PipedOutputStream while, at the other end, a reader thread concurrently reads whatever is written from the connected PipedInputStream.

The PipedInputStream and PipedOutputStream classes are well suited to implementing Web2Sh pipes. A piped output stream can be connected to a piped input stream to create a communications pipe. This technique supports communication between threads, allowing each command to run as a separate thread, produce its output, and pass it to the following command in a pipe sequence while that command is already running.
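The following sketch shows two threads joined by a PipedOutputStream and PipedInputStream pair, in the way described above. It uses only standard java.io classes; the PipeDemo class, the runPipeline method, and the upper-casing "command" are hypothetical stand-ins for real Web commands.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical two-stage pipeline; each stage runs on its own thread.
final class PipeDemo {

    static String runPipeline(String input) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out); // connects both ends of the pipe

            // First "command": writes its output into the pipe, then closes its end.
            Thread producer = new Thread(() -> {
                try (PipedOutputStream o = out) {
                    o.write(input.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });

            // Second "command": reads from the pipe until end of stream, then upper-cases.
            StringBuilder result = new StringBuilder();
            Thread consumer = new Thread(() -> {
                try (PipedInputStream i = in) {
                    ByteArrayOutputStream collected = new ByteArrayOutputStream();
                    byte[] buffer = new byte[256];
                    int n;
                    while ((n = i.read(buffer)) != -1) {
                        collected.write(buffer, 0, n);
                    }
                    String text = new String(collected.toByteArray(), StandardCharsets.UTF_8);
                    result.append(text.toUpperCase(Locale.ROOT));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });

            producer.start();
            consumer.start();
            producer.join();
            consumer.join(); // after join, the consumer's writes to result are visible here
            return result.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runPipeline("link text")); // prints "LINK TEXT"
    }
}
```

Because the pipe has a bounded internal buffer, a writer that gets ahead of its reader blocks until the reader catches up, which is why the two ends need to run on different threads.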

In document A framework for interactive end-user web automation (Page 119-124)