2.5 Dynamic Extraction of Client-Side JavaScript Code
2.5.4 Rhino and HtmlUnit Testing Framework
Rhino [73] is a JavaScript interpreter written in Java. HtmlUnit [34] is a GUI-less Java web browser that is used along with Rhino to provide a unit testing framework for web applications. Figure 2.6 shows a simple HtmlUnit testcase. The testcase first loads a web page with an HTML form, fills out the field userid in the form, and then submits the form.
To run JavaScript code, Rhino either (1) transforms the JavaScript code into Java bytecode and then executes the bytecode on the Java Virtual Machine (JVM), or (2) compiles the JavaScript code into its own bytecode and uses its own stack-based virtual machine, which we call Rhino
3 Although inlining can be problematic in the case of recursion and in the presence of deep call graphs, validation and sanitization code tends to be simple and not affected by these issues. If this were not the case, it would always be possible to stop the inlining at a given depth and introduce approximations.
@Test
public void submittingForm() throws Exception {
    final WebClient webClient = new WebClient();
    // Get the first page
    final HtmlPage page1 = webClient.getPage("http://some_url");
    // Get the form that we are dealing with and within that form,
    // find the submit button and the field that we want to change.
    final HtmlForm form = page1.getFormByName("myform");
    final HtmlSubmitInput button = form.getInputByName("submitbutton");
    final HtmlTextInput textField = form.getInputByName("userid");
    // Change the value of the text field
    textField.setValueAttribute("root");
    // Now submit the form by clicking the button and get back the second page.
    final HtmlPage page2 = button.click();
    webClient.closeAllWindows();
}
Figure 2.6: Example of an HtmlUnit testcase.
Virtual Machine (RVM), to execute the code4. When instrumenting JavaScript code, we always
use the second option. Figure 2.7 shows a simple JavaScript program along with the corresponding RVM bytecode. The bytecode for each function is stored as an instruction array. The number to the left of each instruction in the figure is the index of that instruction in the bytecode array. Note that some instructions, such as BINDNAME and SETNAME, correspond to complex read and write operations that involve traversing the prototype chain of JavaScript objects. Rhino uses an execution (call) stack where, for each JavaScript function under execution, a stack frame stores the instruction array along with the scope object and other runtime information related to this function. Note that this stack is different from the Rhino Virtual Machine (RVM) stack, which is used to store the arguments of instructions along with their results.
4 We could not find official documentation for RVM, so the information provided here is based on our own
--- (a) JavaScript code ---
var x = "foo";
var y = x + "bar";
--- (b) RVM bytecode ---
[0] LINE : 1
[3] REG_STR_C0
[4] BINDNAME //push value of x onto stack
[5] REG_STR_C1
[6] STRING //push string "foo" onto stack
[7] REG_STR_C0
[8] SETNAME //pop value of x and "foo", store "foo" into x then push value of x
[9] POP //pop value of x
[10] LINE : 2
[13] REG_STR_C2
[14] BINDNAME //push value of y onto stack
[15] REG_STR_C0
[16] NAME //push value of x onto stack
[17] REG_STR_C3
[18] STRING //push "bar" onto stack
[19] ADD //pop value of x and "bar", concatenate them then push result
[20] REG_STR_C2
[21] SETNAME //pop result and value of y, store result into y then push value of y
[22] POP //pop value of y
[23] RETURN_RESULT
Figure 2.7: (a) JavaScript code along with (b) corresponding Rhino Virtual Machine bytecode.
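To make the role of the operand stack more concrete, the following is a minimal sketch of a stack machine that executes a simplified version of the instruction sequence in Figure 2.7. It is not Rhino's interpreter: the class MiniStackMachine, the instruction encoding and the method program() are hypothetical, the REG_STR_* register loads are folded into instruction operands, and BINDNAME is omitted so that SETNAME only stores the value on top of the stack and leaves it there.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stack machine; NOT Rhino's actual interpreter.
class MiniStackMachine {
    // Variables of a single (global) scope, keyed by variable name.
    final Map<String, Object> scope = new HashMap<>();
    // Operand stack: holds instruction arguments and results.
    final Deque<Object> stack = new ArrayDeque<>();

    // Each instruction is {opcode, operand}; the operand may be null.
    void run(String[][] code) {
        for (String[] ins : code) {
            String op = ins[0];
            String arg = ins[1];
            switch (op) {
                case "STRING": // push a string constant
                    stack.push(arg);
                    break;
                case "NAME": { // push the value of a variable
                    Object value = scope.get(arg);
                    if (value == null) {
                        throw new IllegalStateException("unbound variable " + arg);
                    }
                    stack.push(value);
                    break;
                }
                case "ADD": { // pop two operands, push their concatenation
                    Object right = stack.pop();
                    Object left = stack.pop();
                    stack.push(String.valueOf(left) + right);
                    break;
                }
                case "SETNAME": // store top of stack into a variable, keep it on the stack
                    scope.put(arg, stack.peek());
                    break;
                case "POP": // discard the top of the stack
                    stack.pop();
                    break;
                default:
                    throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
    }

    // Simplified counterpart of: var x = "foo"; var y = x + "bar";
    static String[][] program() {
        return new String[][] {
            {"STRING", "foo"}, {"SETNAME", "x"}, {"POP", null},
            {"NAME", "x"}, {"STRING", "bar"}, {"ADD", null},
            {"SETNAME", "y"}, {"POP", null}
        };
    }

    public static void main(String[] args) {
        MiniStackMachine vm = new MiniStackMachine();
        vm.run(program());
        System.out.println(vm.scope);
    }
}
```

In this sketch, each statement leaves the operand stack empty once its trailing POP has executed, which mirrors the POP instructions at indices 9 and 22 in the figure.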
2.5.5 JavaScript Memory Management in Rhino
Our extraction technique tracks memory locations dynamically during JavaScript program execution. Tracking memory locations allows our analysis (1) to extract only the statements that operate on the input field that we are interested in, (2) to deal with aliasing and (3) to give unique names to different memory locations. To understand how memory location tracking works, we first need to explain briefly how JavaScript stores values during program execution and how Rhino implements this using Java.
A JavaScript program reads and writes three classes of values during execution: (1) primitive values, which are of type String, Number or Boolean; (2) special values, namely Undefined, a single-valued type whose only value, undefined, is returned when reading an uninitialized variable or a non-existent object property, and Null, a single-valued type whose only value, null, is assigned to variables and object properties that do not have a valid value of any of the other types; and (3) composite values, which are of type Array or Object. A JavaScript object is a mapping from a set of strings (property names) to a set of values (primitive, special and/or composite). JavaScript treats arrays as objects, but with a special internal length property that is updated automatically by the language itself. The JavaScript object model contains intrinsic object types such as Object (which is the supertype of all composite types), Function, Array, Date, etc., along with user-defined object types. A third class of object types, called host objects, is added to JavaScript when it runs inside a web browser. These object types are not part of the JavaScript language specification but are provided by all modern browsers. They include the DOM tree objects (along with the DOM event model) and the Window object, which represents the browser window.
Since Rhino is implemented in Java, it has to map these data types into Java data types: (1) Rhino maps the primitive types into the Java types String, double and boolean. (2) Rhino maps the JavaScript object model into a Java class hierarchy with the Java class ScriptableObject as its root (which corresponds to the JavaScript type Object). HtmlUnit extends this hierarchy by providing host objects. (3) Rhino maps the special single-valued JavaScript type Undefined to the Java class Undefined, whose singleton instance Undefined.instance represents the JavaScript value undefined, and maps the special single-valued type Null to null in Java.
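The mapping above can be illustrated with a small, self-contained sketch that classifies a Java-side value by the JavaScript type it would stand for. The names JsValueMapping, UndefinedValue and jsTypeOf are hypothetical and are not part of Rhino; UndefinedValue merely imitates the singleton pattern described for Undefined.instance, and a java.util.Map is used as a stand-in for ScriptableObject.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the type mapping; does not use Rhino itself.
class JsValueMapping {
    // Stand-in for a class with a single instance representing undefined.
    static final class UndefinedValue {
        static final UndefinedValue INSTANCE = new UndefinedValue();
        private UndefinedValue() {}
    }

    // Returns the JavaScript type that a Java-side value represents
    // under the mapping described in the text.
    static String jsTypeOf(Object v) {
        if (v == null) return "Null";                       // JavaScript null is Java null
        if (v == UndefinedValue.INSTANCE) return "Undefined";
        if (v instanceof String) return "String";
        if (v instanceof Double) return "Number";           // a double, boxed when held as Object
        if (v instanceof Boolean) return "Boolean";         // a boolean, boxed when held as Object
        if (v instanceof Map) return "Object";              // stand-in for ScriptableObject
        throw new IllegalArgumentException("no mapping for " + v.getClass());
    }

    public static void main(String[] args) {
        Object[] samples = {
            "foo", 3.0, true, null, UndefinedValue.INSTANCE, new HashMap<String, Object>()
        };
        for (Object sample : samples) {
            System.out.println(jsTypeOf(sample));
        }
    }
}
```

Comparing against the singleton by reference (==) rather than by equals reflects the idea that there is exactly one undefined value.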
During the execution of a JavaScript program, all memory reads and writes operate on primitive values, the two special values undefined and null, or object references. These are read from or written to either (1) variables or (2) object properties. JavaScript binds global variables to the global scope and local variables to the scope of the function where these variables are defined5. A JavaScript function's scope is stored in the function's frame on the execution (call) stack. Finally, objects and their properties are stored in the heap. How does Rhino implement this? Let us first explain how Rhino maps each JavaScript object into an instance of the Java class ScriptableObject. ScriptableObject stores the properties of a JavaScript object in a hashtable. For each property, the hash value of the property name (which is of type String) is used as the key. The property value (which is either a primitive value, one of the two special values or an object reference) is stored in the value field of a Slot object. Figure 2.8 shows how Rhino stores two nested JavaScript objects in two instances of class ScriptableObject. For example, the value of the property p1 of object obj, which is the string value foo, is stored in the value field of the first Slot object, which is indexed by h(p1), where h is the hash function and p1 is of type String.
In the case of values that are stored in variables, Rhino stores these values in Slot objects in the hashtable of a special scope object (of type ScriptableObject) that represents the scope of some JavaScript function func. In this case, the hash values of the local
[Figure 2.8 content. JavaScript code: var obj = { p1: "foo", p2: 3, p3: { p1: "bar", p2: "abc" }}. The ScriptableObject for JavaScript object obj holds other object info and a hashtable with keys h(p1), h(p2) and h(p3), where h is the hash function; each hashtable value is a Slot object that holds the property value (foo and 3 for p1 and p2) together with other property info. The ScriptableObject for JavaScript object obj.p3 holds other object info and a hashtable with keys h(p1) and h(p2), whose Slot objects hold the values bar and abc together with other property info.]
Figure 2.8: Structure of the two Java ScriptableObjects that Rhino uses to store the two JavaScript nested objects obj and obj.p3.
variables' names are used as keys. The scope object for some function func is stored in func's stack frame on the execution stack. The top scope object (i.e., the scope object of the top-level code), where global variables are stored, is the Window object.
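The storage scheme described in this section can be sketched in a few lines of Java. The classes SketchSlot and SketchObject below are hypothetical stand-ins for Rhino's Slot and ScriptableObject and do not reproduce their real fields or methods; a java.util.HashMap plays the role of the hashtable, with its internal hashing of the String key standing in for h. The sketch also assumes that a write to an existing property reuses the existing slot, and it returns Java null for a missing property where JavaScript would yield undefined.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of slot-based property storage; NOT Rhino's classes.
class SlotStorageSketch {
    // Stand-in for Slot: holds the value of one property or variable.
    static final class SketchSlot {
        Object value;
        SketchSlot(Object value) { this.value = value; }
    }

    // Stand-in for ScriptableObject: a hashtable from names to slots.
    static final class SketchObject {
        private final Map<String, SketchSlot> slots = new HashMap<>();

        void put(String name, Object value) {
            SketchSlot slot = slots.get(name);
            if (slot == null) {
                slots.put(name, new SketchSlot(value));
            } else {
                slot.value = value; // assumption: reuse the slot so its identity is stable
            }
        }

        Object get(String name) {
            SketchSlot slot = slots.get(name);
            return slot == null ? null : slot.value;
        }

        SketchSlot slotOf(String name) { return slots.get(name); }
    }

    // Builds a scope object holding the variable obj from Figure 2.8.
    static SketchObject buildScope() {
        SketchObject p3 = new SketchObject();
        p3.put("p1", "bar");
        p3.put("p2", "abc");
        SketchObject obj = new SketchObject();
        obj.put("p1", "foo");
        obj.put("p2", 3.0);
        obj.put("p3", p3);      // object reference stored in a slot
        SketchObject scope = new SketchObject();
        scope.put("obj", obj);  // variables live in slots of the scope object
        return scope;
    }

    public static void main(String[] args) {
        SketchObject obj = (SketchObject) buildScope().get("obj");
        System.out.println(obj.get("p1"));
        System.out.println(((SketchObject) obj.get("p3")).get("p2"));
    }
}
```

Because variables and properties are stored the same way in this model, a single mechanism that follows slot references covers both cases.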
2.5.6 Tracking Memory Locations
Unlike in C, where dynamic memory (i.e., the heap) is modeled as a byte array and pointers can be used to directly access any memory location (i.e., any byte in the array), in Java dynamic memory is modeled as a graph of objects in which each object is accessible either through a reference stored in a local variable on the stack or through a reference stored in a field of another object. Since Rhino is implemented in Java, we should expect the smallest unit of memory that we can track dynamically to be a Java object.
In fact, we define a memory location as a Java object of type Slot that is stored in some ScriptableObject and holds the value of some JavaScript variable or object property. We track memory locations by keeping track of references to Slot objects. When tracking memory locations, we only track locations that store string values. Tracking Slot objects allows us to track memory locations that store primitive values (more specifically, string values) since, as discussed above, each JavaScript primitive value and object reference is stored in the value field of the Slot object that corresponds to some JavaScript variable or object property. For example, to track an HTML form input field of type text, we track the Slot object for the text property in the JavaScript DOM object that corresponds to the input field.
As stated before, our extraction collects all input validation and sanitization operations that operate on a certain HTML form input field i. This means that we need to track the memory location that corresponds to this input field along with all other memory locations that are assigned the value of this input field during program execution. To do this, we first need to find the memory location that stores the value of the input field itself, which is done using HtmlUnit. When filling out a form (see Figure 2.6), HtmlUnit allows us to access the DOM objects that store the values of the HTML input fields in this form. This in turn allows us to initiate memory tracking in Rhino by feeding the Slot object for the target HTML input field to the MEMORYTRACKER component of the extraction framework as the initial memory location.
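As a rough illustration of tracking by slot reference, the sketch below keeps a set of tracked slots keyed by object identity, assigns each a unique name, and propagates tracking when the value of a tracked slot is assigned to another slot. It is not the MEMORYTRACKER component itself: the class MemoryTrackerSketch, its methods and the generated names (m0, m1, ...) are hypothetical, and untracking a slot that is overwritten with an untracked value is an assumption of this sketch rather than something stated in the text.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical sketch of identity-based tracking; NOT the actual MEMORYTRACKER.
class MemoryTrackerSketch {
    // Stand-in for a Slot: holds the value of one variable or property.
    static final class SketchSlot {
        Object value;
        SketchSlot(Object value) { this.value = value; }
    }

    // Maps each tracked slot, compared by reference identity, to a unique name.
    private final Map<SketchSlot, String> names = new IdentityHashMap<>();
    private int counter = 0;

    // Starts tracking a slot and returns its unique name.
    // Only slots that hold string values are tracked; others yield null.
    String track(SketchSlot slot) {
        if (!(slot.value instanceof String)) {
            return null;
        }
        String name = names.get(slot);
        if (name == null) {
            name = "m" + counter++;
            names.put(slot, name);
        }
        return name;
    }

    boolean isTracked(SketchSlot slot) { return names.containsKey(slot); }

    String nameOf(SketchSlot slot) { return names.get(slot); }

    // Models the assignment target = source.
    void assign(SketchSlot target, SketchSlot source) {
        target.value = source.value;
        if (isTracked(source)) {
            track(target);        // the target now holds the tracked value
        } else {
            names.remove(target); // assumption: overwritten with an untracked value
        }
    }

    public static void main(String[] args) {
        MemoryTrackerSketch tracker = new MemoryTrackerSketch();
        SketchSlot input = new SketchSlot("root"); // slot of the target input field
        SketchSlot copy = new SketchSlot(null);
        tracker.track(input);
        tracker.assign(copy, input);
        System.out.println(tracker.nameOf(input) + " " + tracker.nameOf(copy));
    }
}
```

Using an IdentityHashMap means that two slots holding equal strings remain distinct memory locations, which is what allows aliasing to be handled at the level of slots rather than values.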