Modeling Android-Specific Concepts - Static Security Analysis for Android

11.5 Static Security Analysis for Android

11.5.1 Modeling Android-Specific Concepts

Android apps provide special functional characteristics, which may constitute challenges for static analysis approaches as discussed in Section 9.4.

Many works in the area of Android analysis, however, tackle only some of the described analysis features [306]. Comparable approaches that aim at all of the static analysis challenges mentioned in Section 9.4 are [99, 248, 305, 338, 518, 536, 537]. Additionally, many approaches deal with the problem of inter-app communication, that is, the communication between Android apps [193, 287, 385, 442, 469]. This aspect is not the focus of our approach.

Only some of the works mentioned before also consider Java-specific challenges to improve the results of the static Android app analysis. The works of Gordon et al., Huang et al., and Shen et al. model the Android-specific message mechanism for communication between threads within an app [193, 248, 442]. This mechanism is called Handler and enables threads to communicate through a shared Handler object. Another Android extension to multi-threading is called AsyncTask. This extension provides a way to communicate with the user interface thread in an asynchronous fashion and is modeled in the works of Huang et al. and Shen et al. [248, 442]. Furthermore, Huang et al. model the implicit thread execution originating from Java, e.g., by inserting a call edge from the start() method of a Thread object to the object’s run() method [248].
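The effect of such an implicit call edge can be illustrated with a minimal Python sketch. The dictionary-based call graph, the `Class.method` naming scheme, and the helper names `add_thread_edges` and `reachable` are assumptions made for illustration only; they are not taken from Huang et al. [248] or from any particular analysis framework.

```python
from collections import defaultdict

def add_thread_edges(call_graph, thread_classes):
    """Return a copy of call_graph with an edge C.start -> C.run for every
    class C in thread_classes (hypothetical model of implicit thread starts)."""
    augmented = defaultdict(set)
    for caller, callees in call_graph.items():
        augmented[caller] |= set(callees)
    for cls in thread_classes:
        augmented[f"{cls}.start"].add(f"{cls}.run")
    return dict(augmented)

def reachable(call_graph, entry):
    """Collect all methods reachable from entry via depth-first search."""
    seen, stack = set(), [entry]
    while stack:
        method = stack.pop()
        if method in seen:
            continue
        seen.add(method)
        stack.extend(call_graph.get(method, ()))
    return seen

# Toy app (assumed): onCreate starts a worker thread whose run() sends data.
graph = {
    "MainActivity.onCreate": {"Worker.start"},
    "Worker.run": {"Network.send"},
}
before = reachable(graph, "MainActivity.onCreate")
after = reachable(add_thread_edges(graph, ["Worker"]), "MainActivity.onCreate")
```

Without the added edge, the analysis stops at `Worker.start` and never sees `Network.send`; with the edge, the code executed in the new thread becomes reachable from the entry point.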

Table 11.2 sums up the published approaches comparable to our proposed tool. None of the aforementioned approaches considers the fragment concept. Only Huang et al. [248] address all the aforementioned Android-specific programming concepts that we consider in our approach. We also simulate framework features related to Java concurrency concepts, such as Runnable, ForkJoin, Callable, and ExecutorService. In addition, our tool uses android.jar files extracted from running Android emulators [187] to obtain fully implemented API information instead of the stubs that are usually included in the android.jar distributed with the Android SDK. The extracted android.jar files enhance the generated points-to analysis information for used API objects. None of the approaches listed in Table 11.2 describes whether it uses API information extracted from the Android emulator instead of Android programming stubs for its analysis.

11.5.2 Slicing

Another technique frequently used in Android analyses is program slicing. Section 11.4.2 already described the general approaches to program slicing. With regard to Android, program slicing is often used for taint analyses and sometimes for vulnerability detection. In the following, we briefly describe some approaches from this research area.

Taint Analysis

Taint analyses inspect applications with regard to implemented (potentially) malicious data flows. These analyses can be conducted on different program representations such as bytecode. For example, the Apparecium tool works directly on disassembled bytecode of Android apps (Smali) and not on an abstract code representation such as Jimple [488]. Its authors use forward and backward slices between the marked source and sink for the taint analyses. Then, they combine the slices to eliminate paths that are only contained in one slice and do not end or start with the marked source or sink. AppCaulk is based on an early version of Apparecium [438]. This tool uses static taint analysis based on backward slicing to generate relevant points inside an application. These points are instrumented with a tainting logic to perform a dynamic taint analysis.
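The idea of combining a forward slice from the source with a backward slice to the sink can be sketched in a few lines of Python. This is a simplified illustration on an assumed data-dependence graph, not the Apparecium implementation; the graph, the node names, and the function names are hypothetical.

```python
def reach(edges, start):
    """Return all nodes reachable from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return seen

def invert(edges):
    """Reverse every edge so that a backward slice becomes a forward search."""
    inverted = {}
    for src, dsts in edges.items():
        for dst in dsts:
            inverted.setdefault(dst, set()).add(src)
    return inverted

def source_to_sink_nodes(data_deps, source, sink):
    """Keep only nodes that lie in both the forward slice of the source
    and the backward slice of the sink."""
    forward = reach(data_deps, source)
    backward = reach(invert(data_deps), sink)
    return forward & backward

# Assumed toy dependences: an edge a -> b means "b depends on a".
deps = {
    "getDeviceId": {"id"},
    "id": {"msg", "log_tag"},
    "const_url": {"msg"},
    "msg": {"sendTextMessage"},
    "log_tag": {"Log.d"},
}
leak = source_to_sink_nodes(deps, "getDeviceId", "sendTextMessage")
```

In this toy graph, `log_tag` and `Log.d` appear only in the forward slice and `const_url` only in the backward slice, so the intersection drops them and retains just the nodes on a path from the source to the sink.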

The SAAF (Static Android Analysis Framework) tool provides program slicing for Smali code to perform data-flow analyses that backtrack the parameters used by a given method [242]. It generates Smali files for all classes within an Android APK, parses the Smali files, and then creates an object representation of their contents based on the Manifest file, the basic blocks of the methods, the fields, and all opcodes.

Another tool that uses backward slicing for taint analysis is Brox [317]. It operates on Dalvik opcodes and uses a self-implemented data-flow analysis framework to detect Global Positioning System (GPS) information leakage in Android apps.

Besides the approaches based on bytecode and Dalvik opcodes, some approaches build their taint analyses on existing static analysis tools. Gibler et al.’s AndroidLeaks tool uses the WALA tool to perform taint analyses with regard to permission information in order to detect leaked private information [175].

Two other tools compute slices on the internal representation of the Soot tool. Both tools convert the app’s application binary (DEX) format into Java bytecode and then translate the Java bytecode into an intermediate representation with Soot. Then, they use program slicing for the taint propagation. The Capper tool additionally rewrites Android apps selectively by inserting bytecode instructions for tracking sensitive information flows in specific portions of the program that may be involved in information leakage [546]. In contrast, the AppSealer tool automatically generates patches for potential component hijacking attacks [547]. It injects a patch before a predefined policy is violated and shows a pop-up dialog to inform the user.

Besides taint analysis, there are other approaches that use slicing to identify vulnerabilities or security issues. The tool presented by Poeplau et al. uses slices to detect code loading sites for external code, such as class loading or native code, within Android apps [367]. They use an inter-procedural control-flow graph (ICFG) for their backward slicing to trace the flow of data starting at a selected instruction related to external code loading. The AQUA (Android Query Analyzer) tool uses static analysis to reverse engineer aspects of the application’s interaction with the databases it uses [85]. It works directly with Android application binaries (DEX) and uses a lightweight program slicing technique. The slicing method extracts potentially relevant string information (nodes) that is used either directly or via content provider methods to construct SQLite queries.

Another tool that identifies security issues via taint analysis is CredMiner [551]. It programmatically identifies and recovers (obfuscated) developer credentials that are unsafely embedded in Android apps. It locates the calls to interesting functions that can contain credential information via a static backward slicing technique.

Vulnerability Detection

One work that can be directly compared to our tool is the CryptoLint tool [123]. CryptoLint finds implementation bugs in the cryptographic code of Android applications based on six rules (e.g., concerning insecure encryption modes, insecure parameters for password-based encryption, and weak random number generation) and was applied within a large-scale study. CryptoLint focuses on finding bugs at the level of single code lines as it uses cryptographic API calls and single parameters as criteria for backward slicing.
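A rule of this kind, which slices backward from a single parameter of a cryptographic API call, can be approximated by the following Python sketch. It is not CryptoLint's implementation: the definition table `defs`, the call-site labels, and all function names are hypothetical, and the check assumes symmetric block ciphers whose transformation strings follow the Java `algorithm/mode/padding` convention, where common providers fall back to ECB if no mode is given.

```python
def resolve_constants(defs, var, seen=None):
    """Follow definitions backward from var and collect string constants."""
    seen = set() if seen is None else seen
    if var in seen:  # guard against cyclic definitions
        return set()
    seen.add(var)
    constants = set()
    for kind, value in defs.get(var, ()):
        if kind == "const":
            constants.add(value)
        else:  # kind == "var": value was copied from another variable
            constants |= resolve_constants(defs, value, seen)
    return constants

def uses_ecb(transformation):
    """Flag explicit ECB mode and transformations that omit the mode
    (assumed to default to ECB, as with common Java providers)."""
    parts = transformation.upper().split("/")
    return len(parts) == 1 or parts[1] == "ECB"

def check_cipher_calls(defs, calls):
    """Return (call site, transformation) pairs that may select ECB."""
    findings = []
    for site, arg in calls:
        for transformation in sorted(resolve_constants(defs, arg)):
            if uses_ecb(transformation):
                findings.append((site, transformation))
    return findings

# Assumed toy program: variable -> list of (kind, value) definitions.
defs = {
    "t1": [("const", "AES/CBC/PKCS5Padding")],
    "t2": [("var", "mode")],
    "mode": [("const", "AES"), ("const", "AES/GCM/NoPadding")],
}
# Each entry: (call-site label, variable passed as the transformation).
calls = [("A.encrypt:12", "t1"), ("B.encrypt:30", "t2")]
findings = check_cipher_calls(defs, calls)
```

The sketch looks at one argument of one call at a time, which mirrors the single-parameter focus described above; it does not relate the key, the initialization vector, and the data of the same encryption operation to each other.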

Other approaches deal with the incorrect usage of the Android framework in apps. For example, CHEX allows an analyst to detect Android apps vulnerable to component hijacking, i.e., an attacker can gain access to an app’s Android components and use them to reach security-critical functionality [312]. CHEX uses static program analysis techniques based on system dependence graphs, which are usually employed to implement slicing. However, CHEX does not use control dependences.

All in all, none of the presented approaches deals with tracing multiple objects that together constitute the implementation of a security feature (e.g., tracing the origin of the key and the data to be encrypted), rather than single parameters of crypto API calls [123]. The discussed taint analyses only provide path information based on data flows from defined sources to sinks. In contrast, the approach in this thesis preserves lifetime information for every interacting object. Moreover, the tools are not designed to support locating security features in code and the interactions between their constituent objects, which is one of the main tasks of our approach.

In document Security-Pattern Recognition and Validation (Page 178-181)