4.3 Bytecode Engineering Instruments

4.3.1 Bytecode Instruction Level API

A bytecode instruction level API exposes the artifacts of the JAVA Virtual Machine as JAVA objects to the programmer. BCEL (Byte Code Engineering Library) is a frequently used tool for performing low-level operations on bytecode structures (Dahm, 2001). Similar libraries such as the ASM library exist, but are limited in their potential analysis scenarios. The analysis algorithm is implemented on top of a bytecode engineering library. BCEL supports a generic Visitor pattern that allows the caller to walk through the class file, which is represented in a tree-like structure. In contrast to the JAVA language perspective, bytecode engineering libraries analyze code one class file at a time. Dependencies between class files need further analysis, which a low-level library typically does not provide. This feature is therefore implemented by the user of the library, and the interpretation of the dependencies is scenario-dependent.

The API is subdivided into three parts. This is reflected by the BCEL package structure, which consists of a static part, a class generation part, and associated tools.

Static structures The first package represents the static structure of class files according to the JVM specification. The core of the package is the JavaClass class definition, which holds the fields, methods and other associated metadata needed to describe the logical structure of a JAVA class file. A JavaClass instance also has an associated org.apache.bcel.classfile.ClassParser object, which transforms existing bytecode into JAVA representation objects that may be used for analysis that goes beyond reflection. As seen above, JAVA classes store their invariant values in the constant pool. This is reflected by the ConstantPool container object, which stores the constants defined in a class file as individual Constant objects. This package also defines bit mask values that represent the access flags used for fields, methods, and classes. The Repository class provides structures and methods to look up and compare JavaClass objects.
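To make the static structure more concrete, the following minimal sketch reads the fixed leading fields of a class file and decodes a few access flag bit masks. It deliberately uses only the JDK and not BCEL, so the class name ClassFileHeader and its methods are hypothetical illustrations. The magic number and the flag values are those defined by the JVM specification.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/** JDK-only sketch (hypothetical class); BCEL's ClassParser and JavaClass are not used. */
final class ClassFileHeader {
    // Access flag bit masks as defined by the JVM specification.
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    final int minor;
    final int major;
    final int constantPoolCount; // number of constant pool entries plus one

    private ClassFileHeader(int minor, int major, int constantPoolCount) {
        this.minor = minor;
        this.major = major;
        this.constantPoolCount = constantPoolCount;
    }

    /** Reads magic number, version and constant pool count from the start of a class file. */
    static ClassFileHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file: wrong magic number");
        }
        int minor = data.readUnsignedShort();
        int major = data.readUnsignedShort();
        int count = data.readUnsignedShort();
        return new ClassFileHeader(minor, major, count);
    }

    /** Translates a flag word into the names of the flags that are set. */
    static List<String> describeAccess(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & ACC_PUBLIC) != 0) names.add("public");
        if ((flags & ACC_FINAL) != 0) names.add("final");
        if ((flags & ACC_INTERFACE) != 0) names.add("interface");
        if ((flags & ACC_ABSTRACT) != 0) names.add("abstract");
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Hand-made header: magic, minor 0, major 52 (Java 8), constant_pool_count 16.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52, 0, 16};
        ClassFileHeader h = read(new ByteArrayInputStream(header));
        System.out.println("major=" + h.major + " entries=" + (h.constantPoolCount - 1));
        System.out.println(describeAccess(ACC_PUBLIC | ACC_FINAL));
    }
}
```

A library such as BCEL continues from this point by parsing every constant pool entry, field and method into objects, which is the part omitted from the sketch.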

Code generation The second part of the BCEL API (application programming interface) provides functionality to generate and alter the contents of class files. It abstracts bytecode from the real memory representation. Special utility classes allow generating JavaClass and ConstantPool objects in an object-oriented representation. A type information framework for types like Void, Integer, etc. is used to manage field and method signatures. Fields and methods can be generated with the FieldGen and MethodGen utility classes. Fields are described by the type, the access modifier and, optionally, the initial value. Methods are more complex to analyze and alter, as they carry bytecode and exceptions. The MethodGen functionality therefore allows adding exceptions to methods. The individual bytecode instructions are defined as classes and categorized into groups as subclasses of abstract classes such as BranchInstruction. The bytecode is represented as a linked list of instruction objects, which allows the programmer to manipulate the code in its logical sequence. The allowed operations include appending, inserting, and deleting instructions. Additionally, methods exist that maintain the relative offsets to branch targets and their representation in an InstructionList object. The abstraction of jump targets allows directing branches towards instances of InstructionHandle, which are resolved to physical addresses when the method is finally transformed to bytecode.
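The benefit of symbolic jump targets can be illustrated with a toy model. The sketch below is not BCEL's InstructionList. It is a hypothetical JDK-only class that keeps instructions in a linked list, lets a goto point at a handle instead of an address, and computes the relative offset only when the list is serialized. The opcode values (nop, iconst_0, goto, return) are the real JVM ones; everything else is an assumption made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

/** Toy model (hypothetical): branches reference handles, offsets are resolved late. */
final class ToyInstructionList {
    static final int NOP = 0x00, ICONST_0 = 0x03, GOTO = 0xa7, RETURN = 0xb1;

    /** A list node whose identity stays stable while its position may change. */
    static final class Handle {
        final int opcode;
        Handle target; // only meaningful for GOTO

        Handle(int opcode, Handle target) {
            this.opcode = opcode;
            this.target = target;
        }

        int length() {
            return opcode == GOTO ? 3 : 1; // goto carries a 2-byte offset
        }
    }

    private final List<Handle> code = new LinkedList<>();

    Handle append(int opcode) {
        Handle h = new Handle(opcode, null);
        code.add(h);
        return h;
    }

    Handle appendGoto(Handle target) {
        Handle h = new Handle(GOTO, target);
        code.add(h);
        return h;
    }

    Handle insertBefore(Handle position, int opcode) {
        int index = code.indexOf(position);
        if (index < 0) throw new IllegalArgumentException("unknown handle");
        Handle h = new Handle(opcode, null);
        code.add(index, h);
        return h;
    }

    /** Resolves every handle to a byte position, then emits opcodes and offsets. */
    byte[] toByteArray() {
        Map<Handle, Integer> position = new HashMap<>();
        int pc = 0;
        for (Handle h : code) {
            position.put(h, pc);
            pc += h.length();
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Handle h : code) {
            out.write(h.opcode);
            if (h.opcode == GOTO) {
                Integer targetPc = position.get(h.target);
                if (targetPc == null) throw new IllegalStateException("dangling branch target");
                int offset = targetPc - position.get(h); // relative to the goto itself
                out.write(offset >> 8);
                out.write(offset);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        ToyInstructionList il = new ToyInstructionList();
        Handle jump = il.appendGoto(null);
        il.append(NOP);
        Handle end = il.append(RETURN);
        jump.target = end;                          // forward branch, set once the target exists
        System.out.println(il.toByteArray().length); // goto(3) + nop(1) + return(1)
        il.insertBefore(end, NOP);                  // the branch offset grows by one automatically
        System.out.println(il.toByteArray()[2]);    // low byte of the goto offset
    }
}
```

Because the goto stores a reference to the handle rather than a number, inserting an instruction in front of the target needs no manual patching of the branch.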

Use Cases The low-level APIs can be used as a foundation for a range of supporting applications, such as optimizers, analysis tools, or adaptive runtime modification. BCEL, for instance, can be integrated for runtime instrumentation with a modifying class loader. Whenever a JAVA application is started, its classes are loaded via a BCEL-enabled class loader. This class loader can modify the class files it is supposed to load in order to adapt pre-built black-box components to current requirements (Keller and Hölzle, 1998). This is done without going through the complete development cycle of programming, compilation and deployment for every minor non-functional modification that only affects bytecode semantics.

With the appropriate skill at hand, this results in better flexibility, as there is no need to modify the source code base. Furthermore, source code may not always be available, or license agreements may forbid its modification. A typical use case for modifying the standard class loader with BCE is adding debugging or profiling code to methods, or guarding methods with checks for technical preconditions, such as the AccessControlContext, to emulate the facilities of smart proxies (Santos et al., 2002).
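The hook that a modifying class loader offers can be sketched with the JDK alone. In the following hypothetical example, TransformingClassLoader obtains the original bytes of a class, passes them through a transformer function, and defines the class from the result. The transformer is the place where a library such as BCEL would rewrite the bytecode; here it only records that it was called. The in-memory byte source and the hand-built, member-less class file are assumptions that keep the sketch self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** JDK-only sketch (hypothetical names) of a class loader with a transformation hook. */
final class TransformingClassLoader extends ClassLoader {
    private final Map<String, byte[]> classBytes;    // assumed byte source, keyed by class name
    private final UnaryOperator<byte[]> transformer; // where bytecode would be rewritten

    TransformingClassLoader(Map<String, byte[]> classBytes, UnaryOperator<byte[]> transformer) {
        super(TransformingClassLoader.class.getClassLoader());
        this.classBytes = classBytes;
        this.transformer = transformer;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] original = classBytes.get(name);
        if (original == null) throw new ClassNotFoundException(name);
        byte[] adapted = transformer.apply(original); // modify before the JVM sees the class
        return defineClass(name, adapted, 0, adapted.length);
    }

    /** Builds a minimal class file without members so the demo needs no compiler. */
    static byte[] emptyClass(String internalName) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(0xCAFEBABE);
        out.writeShort(0);                                   // minor version
        out.writeShort(52);                                  // major version (Java 8)
        out.writeShort(5);                                   // constant_pool_count = entries + 1
        out.writeByte(7); out.writeShort(2);                 // #1 Class, name at #2
        out.writeByte(1); out.writeUTF(internalName);        // #2 Utf8
        out.writeByte(7); out.writeShort(4);                 // #3 Class, name at #4
        out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8
        out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
        out.writeShort(1);                                   // this_class
        out.writeShort(3);                                   // super_class
        out.writeShort(0);                                   // interfaces
        out.writeShort(0);                                   // fields
        out.writeShort(0);                                   // methods
        out.writeShort(0);                                   // attributes
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Map<String, byte[]> source = new HashMap<>();
        source.put("GeneratedDemo", emptyClass("GeneratedDemo"));
        UnaryOperator<byte[]> logging = bytes -> {
            System.out.println("transforming " + bytes.length + " bytes");
            return bytes; // identity transformation in this sketch
        };
        Class<?> loaded = new TransformingClassLoader(source, logging).loadClass("GeneratedDemo");
        System.out.println("loaded " + loaded.getName());
    }
}
```

Since loadClass delegates to the parent loader first, only classes the parent cannot find pass through the hook. A loader meant to adapt existing application classes would need a different delegation order.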

Additional checks may collect code coverage metrics, as the jcoverage (jcoverage ltd, 2005) toolset does. The data is gathered from bytecode interceptors inserted at the beginning and end of every method. The gathered data can be used for profiling runtime behavior, which is important for quality engineering, as it ensures that test cases cover a majority, or ideally all, of the code constituting an application. Ensuring broad code coverage is an important prerequisite for applying the JCHAINS toolkit, which will be used for security engineering in the refactorings chapter.
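The effect of such interceptors can be written out at source level. The sketch below is a hypothetical illustration, not the actual mechanism of jcoverage: CoverageProbe stands for the class that inserted bytecode would call, and instrumentedAbs shows what a method behaves like after calls have been inserted at its beginning and end.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical probe target; a real tool would insert the equivalent calls as bytecode. */
final class CoverageProbe {
    private static final Map<String, Integer> ENTRIES = new TreeMap<>();
    private static final Map<String, Integer> EXITS = new TreeMap<>();

    static void enter(String method) {
        ENTRIES.merge(method, 1, Integer::sum);
    }

    static void exit(String method) {
        EXITS.merge(method, 1, Integer::sum);
    }

    static int entered(String method) {
        return ENTRIES.getOrDefault(method, 0);
    }

    static int exited(String method) {
        return EXITS.getOrDefault(method, 0);
    }

    /** Returns the methods of the given list that were never entered. */
    static List<String> uncovered(Collection<String> allMethods) {
        List<String> result = new ArrayList<>();
        for (String method : allMethods) {
            if (entered(method) == 0) result.add(method);
        }
        return result;
    }

    /** Source-level equivalent of a method with probes at its beginning and end. */
    static int instrumentedAbs(int x) {
        enter("abs");
        try {
            return x < 0 ? -x : x; // original method body
        } finally {
            exit("abs");           // also runs when the body throws
        }
    }

    public static void main(String[] args) {
        System.out.println(instrumentedAbs(-3));
        System.out.println("abs entered " + entered("abs") + " time(s)");
    }
}
```

Comparing the recorded method names with the full list of methods in an application, as uncovered does, yields the parts of the code that no test case has reached.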

In addition, authorization checks could be inserted into the control flow just before instructions that access critical resources (such as database records to be updated), to flexibly enable the enforcement of a stricter security policy without the need for recompilation.