
Scripting Language for Web Automation

Experienced programmers, as well as casual (i.e. less experienced) programmers, may use a programming language to implement algorithms that achieve a higher level of automation in their daily computer usage.

Basic programming knowledge can save non-expert computer users significant time on repetitive tasks. Almost all major platforms offer simple scripting languages that empower users to better utilise the platform and enable automation. Web2Sh follows this model by providing a scripting language that can be used to create and customise Web commands focused on Web automation. One of the main advantages of the Web2Sh model over most other platforms' models is that Web2Sh is designed explicitly for the Web.

For example, consider a user who has compiled a list of laptops of interest from a set of shopping Web sites. The user wants to check which ones have received good reviews on sites like Cnet [19], Amazon [20], and PcWorld [21]. Moreover, imagine that the user also wants to check technical performance issues of some of the laptop parts on sites such as notebookcheck.net.

This could be done by visiting each of these sites, providing details of each laptop, then manually compiling the information needed for the comparison that would inform the decision. Some Web sites, such as Dell, offer services that make this comparison easier, yet these apply only to the devices they sell. A price comparison Web site may also provide such a service. However, if the user wishes to customise the process in ways that truly reflect their own needs and mental model, it is unlikely that any Web site will provide such a specific format. For this reason, end users might need the ability to write scripts that automate their Web tasks in a personalised manner. In other words, users may often need customisable tools that simplify the process of automating the Web. The ideal approach to this objective is a scripting language in which developing and executing a script takes less time than doing the task manually.

The Web is shifting towards being a platform. Most computer users spend a significant part of their time interacting with the Web in different ways. Allowing end users to better utilise the Web is essential, and a goal that many researchers have pursued. Many available tools for automating Web tasks (Rob, Paul et al. 1997, Bolin 2005a, Jeffrey and Richard 2007) require the user to work with the raw semi-structured HTML of a page, the Document Object Model (DOM) (Hégaret 2005) of a Web page, or simply the rendered model of a Web page. There are also many Web scripting languages; however, they are still mainly targeted at developers and Web programmers. Regular Web users lack both the languages and the tools to automate their Web usage.

19 http://www.cnet.co.uk/
20 http://www.amazon.co.uk/
21 http://www.pcworld.co.uk/

A Web page can be represented as a long string of HTML, in which users identify parts of the page either by location or by regular expression matching. However, as the HTML for almost all dynamic Web sites is generated by code executed on Web servers, it is often hard for end users to understand. Writing scripts against it can therefore be both difficult and time-consuming.
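The two ways of addressing the HTML string mentioned above, by pattern and by location, can be illustrated with a minimal Python sketch. The page fragment, class names, and function names are all hypothetical, and this is not how Web2Sh itself processes pages; real server-generated markup is usually far noisier than this.

```python
import re

# Hypothetical page fragment (an assumption for illustration only).
PAGE = (
    '<html><body>'
    '<div class="item"><span class="name">Laptop A</span>'
    '<span class="price">&pound;499.00</span></div>'
    '<div class="item"><span class="name">Laptop B</span>'
    '<span class="price">&pound;649.50</span></div>'
    '</body></html>'
)

# Identify parts by pattern: capture each name and the price that follows it.
ITEM_PATTERN = re.compile(
    r'<span class="name">(.*?)</span>\s*'
    r'<span class="price">&pound;([\d.]+)</span>'
)


def extract_prices(html: str) -> dict:
    """Return a mapping of product name to price found in the HTML string."""
    return {name: float(price) for name, price in ITEM_PATTERN.findall(html)}


def first_name_by_location(html: str) -> str:
    """Identify a part by location: slice the text between two known markers."""
    marker = '<span class="name">'
    start = html.index(marker) + len(marker)
    end = html.index('</span>', start)
    return html[start:end]


if __name__ == "__main__":
    print(extract_prices(PAGE))
    print(first_name_by_location(PAGE))
```

Both functions rely on the exact spelling of the markup, which is why a user needs to read and understand the generated HTML before writing even a short script of this kind.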

Users may also access and manipulate a Web page's content through its DOM, which is a standard model for Web page structure; JavaScript is the most popular scripting language utilising the DOM. However, this model requires adequate programming experience, including a basic understanding of object-oriented concepts. In addition, users still have to understand the HTML structure of the Web page.
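As a rough illustration of DOM-based access, the sketch below uses Python's standard `xml.dom.minidom` module, which exposes DOM-style calls such as `getElementsByTagName` and `getAttribute`. The markup, the `data-rating` attribute, and the `well_rated` function are hypothetical. The sketch also assumes well-formed markup; real Web pages would generally need a tolerant HTML parser instead.

```python
from xml.dom.minidom import parseString

# Hypothetical, well-formed page fragment (an assumption for illustration).
PAGE = (
    '<html><body>'
    '<ul id="laptops">'
    '<li class="laptop" data-rating="4.5">Laptop A</li>'
    '<li class="laptop" data-rating="3.0">Laptop B</li>'
    '</ul>'
    '</body></html>'
)


def well_rated(markup: str, threshold: float) -> list:
    """Return the text of every <li> whose data-rating meets the threshold."""
    document = parseString(markup)
    names = []
    for node in document.getElementsByTagName("li"):
        rating = float(node.getAttribute("data-rating"))
        if rating >= threshold:
            # The first child of each <li> here is its text node.
            names.append(node.firstChild.data)
    return names


if __name__ == "__main__":
    print(well_rated(PAGE, 4.0))
```

Even this short example requires the user to think in terms of documents, nodes, attributes, and child relationships, in addition to knowing which tags and attributes the page uses.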

Furthermore, such tools or scripts have a fragile dependency on the structure and text of a Web page, and may break when the page changes. It is therefore crucial that users can update their scripts to adapt to changes in any Web page.
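The fragility described above can be made concrete with a small Python sketch. The two page fragments and both patterns are hypothetical: the strict pattern stops matching after a minor markup change, while a slightly more tolerant pattern survives that particular change (though not every possible one).

```python
import re

# Hypothetical markup before and after a minor site redesign.
OLD_PAGE = '<span class="price">499.00</span>'
NEW_PAGE = '<span class="price price--sale">499.00</span>'

# Depends on the class attribute being exactly "price".
STRICT = re.compile(r'<span class="price">([\d.]+)</span>')

# Accepts any class list that contains the word "price".
TOLERANT = re.compile(r'<span class="[^"]*\bprice\b[^"]*">([\d.]+)</span>')


def find_price(pattern, html):
    """Return the first price matched by the pattern, or None if it fails."""
    match = pattern.search(html)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    for label, page in (("old", OLD_PAGE), ("new", NEW_PAGE)):
        print(label, find_price(STRICT, page), find_price(TOLERANT, page))
```

Returning `None` rather than raising an error is one possible design choice: it lets a script detect that a page has changed and report it, instead of silently producing wrong data.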

Hence, to address these limitations, users need the option of working with the HTML structure, the Document Object Model, or the actual page content to achieve their goals. Users may choose a different approach depending on their programming experience and the type of task at hand.

The WSh scripting language aims to offer a wide set of tools that let its users access and manipulate Web pages using the model that best matches their requirements and experience. Accordingly, the main objectives of the language design are simplicity, generality, usability, and expressiveness. Furthermore, scripts in the language are written and tested through a Web page, as it is the most common, well-known, and simple interface to use.

Web2Sh enables users to work with different representations of the Web page. It provides tools that support the many MIME types that may be retrieved over the Web such as images, RSS, and PDF files.
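One way a tool might choose a representation for retrieved content is to dispatch on the media type in the HTTP `Content-Type` header. The Python sketch below is only an assumption about how such a dispatch could look, not Web2Sh's actual mechanism; the `HANDLERS` table, the representation labels, and both function names are hypothetical. Header parsing uses the standard library's `email.message.Message`.

```python
from email.message import Message

# Hypothetical mapping from media type to a representation label.
HANDLERS = {
    "text/html": "html",
    "application/rss+xml": "feed",
    "application/pdf": "pdf",
}


def media_type(content_type_header: str) -> str:
    """Return the lower-cased type/subtype, dropping parameters like charset."""
    message = Message()
    message["Content-Type"] = content_type_header
    return message.get_content_type()


def choose_representation(content_type_header: str) -> str:
    """Pick a representation label for a response's Content-Type header."""
    mtype = media_type(content_type_header)
    if mtype in HANDLERS:
        return HANDLERS[mtype]
    if mtype.startswith("image/"):
        return "image"
    # Anything unrecognised is treated as opaque binary data.
    return "binary"


if __name__ == "__main__":
    for header in ("text/html; charset=UTF-8", "image/png", "application/pdf"):
        print(header, "->", choose_representation(header))
```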

Web users are always searching for easier and more efficient ways to instruct the Web platform to execute useful and complex tasks. For this purpose, this thesis argues that users need new, powerful, and easy-to-use tools. Such tools should allow the user to enter, edit, translate, and run Web commands on the Web platform. Web2Sh provides such tools; all of them are Web-based, so the user needs only a modern Web browser to create, edit, share, and run Web2Sh Web commands.

The Web can be considered a high-level platform on which programs are written and executed. It comprises many technologies and Web programming languages, such as PHP, Java, Perl, and .NET, among others. Users need not worry about low-level operations, which are taken care of by protocols such as TCP/IP, HTTP, and FTP.