5. Semi-Structured Interviews with Subject Matter Experts
5.6 Procedural Aspects of Understanding Programs
5.7.3 Automaticity and Tacit Knowledge
directly about processes which had become automatic for them. Afterwards, they were asked to explain what elements of reverse engineering they thought would be difficult for a novice to perform, provided the novice was given detailed instructions on how to complete the task.
Tacit knowledge is typically thought of as compiled experience that allows an expert to act and make decisions in a task more quickly and more effectively than a novice [59, 137]. The SMEs described the ability to recognize high-level programming constructs, recognize anomalies in assembly language, and recognize unique solutions to problems as the primary tacit components of expert knowledge.
5.7.3.1 Recognizing High-Level Programming Constructs. The SMEs described the ability to recognize programming constructs in assembly language as a skill that takes a great deal of experience to develop. An expert might easily recognize a certain configuration of data as a data structure in memory, a two-dimensional array, an object instance of a class, or a grouping of functions. The SMEs described being able to think about low-level machine code in terms of the higher-level code structures from which it was compiled. This knowledge allowed them to concern themselves with program behaviors rather than syntax-level details. It also helped them decide more quickly which elements of the program are important to their goals and which are not.
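As an illustration of the kind of pattern an expert might recognize, the following minimal sketch models how a two-dimensional array is commonly laid out in flat memory in row-major order. It is not drawn from the interviews: the dimensions, the 32-bit little-endian element size, and all function names are assumptions chosen for the example.

```python
import struct

ROWS, COLS = 3, 4   # hypothetical array dimensions
ELEM_SIZE = 4       # assumption: 32-bit little-endian signed integers


def build_flat_memory(grid):
    """Pack a 2D grid row by row, mimicking a row-major layout in memory."""
    flat = [value for row in grid for value in row]
    return struct.pack("<%di" % len(flat), *flat)


def element_offset(row, col, cols=COLS, elem_size=ELEM_SIZE):
    """Byte offset of grid[row][col] from the array base in row-major order."""
    return (row * cols + col) * elem_size


def read_element(memory, row, col):
    """Read grid[row][col] back out of the flat byte buffer."""
    (value,) = struct.unpack_from("<i", memory, element_offset(row, col))
    return value


# Each cell holds row * 10 + col so that reads are easy to check by eye.
grid = [[r * 10 + c for c in range(COLS)] for r in range(ROWS)]
memory = build_flat_memory(grid)
```

In disassembly, the offset computation in `element_offset` typically appears as a multiply and an add (or a scaled addressing mode) applied to a base address, which is one cue that a flat block of data is being treated as a two-dimensional array.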
5.7.3.2 Recognizing Anomalies. The SMEs also discussed their ability to recognize when code looked anomalous. Each SME mentioned having an ability to notice if a section of assembly code looks “weird” or if something in the program “just looks off.” These cues can indicate that the program is performing unexpected (and possibly malicious) functionality, or at least that the reverse engineer has to investigate further to learn what is happening.
The SMEs explained that beginners might not be able to tell the difference between code that is doing something unusual or tricky and normal compiler optimizations. Experienced reverse engineers, however, can look at the same code and recognize compiler optimizations they have seen many times. They credited these recognition capabilities with giving them an intuitive feel for what is normal code and what could be abnormal protected or malicious code.
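One well-known example of an optimization that can look anomalous to a beginner is the replacement of division by a constant with a multiplication and a shift. The sketch below is an illustrative assumption, not an example given by the SMEs: it models how an optimizing compiler commonly computes an unsigned 32-bit division by 10, and the function and constant names are hypothetical.

```python
MAGIC_DIV10 = 0xCCCCCCCD   # ceil(2**35 / 10)
SHIFT_DIV10 = 35


def div10_as_compiled(x):
    """Unsigned 32-bit x // 10 expressed as a multiply followed by a shift."""
    if not 0 <= x <= 0xFFFFFFFF:
        raise ValueError("x must fit in an unsigned 32-bit integer")
    return (x * MAGIC_DIV10) >> SHIFT_DIV10


def matches_true_division(samples):
    """Check the multiply-and-shift form against ordinary integer division."""
    return all(div10_as_compiled(x) == x // 10 for x in samples)


samples = [0, 1, 9, 10, 11, 99, 100, 12345, 2**31, 2**32 - 1]
```

A novice who encounters the constant `0xCCCCCCCD` in a disassembly may suspect obfuscation, whereas an experienced reverse engineer is likely to read the multiply and shift as an ordinary division by 10.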
5.7.3.3 Recognizing the Approach is Wrong. The SMEs also referred to having “a sixth sense” that the problem-solving effort was going in the wrong direction or “down the wrong path.” One SME compared approaches in the past to the approaches of other reverse engineers who “beat down a path until it’s dead.” This element of tacit knowledge involves being able to recognize the cues that progress is or is not being made in the task, and being able to compare that against an expectation about how the task should be progressing.
5.7.3.4 Recognizing Unique Solutions to Problems. Each of the SMEs discussed the component of “out of the box” thinking that they perform in reverse engineering. They mentioned how some reverse engineers are able to think about a problem and come up with creative solutions that are not directly apparent from the way the problem presents itself. This ability to recognize a problem as an isomorph or analogy to another problem is part of what one SME called the “dark art” of reverse engineering.
The SMEs related stories of getting what seemed like a crazy idea out of nowhere about how to approach a difficult problem. One SME mentioned going through a process of clearing out all thoughts and not thinking about the problem directly in order to allow creative solutions to come. In the stories they related, the ultimate solution incorporated elements of knowledge that were outside the scope of the reverse engineering task, but which made sense in some metaphorical way to a separate and seemingly unrelated approach, often within a different problem domain. Regardless of the approach to creativity, each SME mentioned having an intuitive feel for different ways that a complex problem could be approached, which ultimately allowed them to perform the task more quickly than other reverse engineers.
The SMEs independently referred to reverse engineering as “putting together a puzzle.” This implies that there may be abstract problem-solving and reasoning activities that are involved in both solving puzzles and performing reverse engineering work. These patterns or activities in the puzzle domain and the reverse engineering domain seem to be similar enough to make the metaphor resonate independently with each of the SMEs.
5.8 Conclusions
In this chapter, a semi-structured interview study with subject matter expert reverse engineers was described to uncover the procedural and conceptual components involved with reverse engineering executable programs. From the study, four primary work domains were defined in software reverse engineering:
• Vulnerability discovery,
• Malicious software analysis (including looking for rootkits and backdoors),
• Software protection analysis, and
• Reverse engineering unprotected software.
Across those four domain areas, the study uncovered eight primary goals that are involved with reverse engineering software. These goals are:
• Finish the analysis quickly,
• Discover general properties of the program,
• Understand how the program uses the system interface,
• Understand, abstract, and label instruction-level information,
• Understand, abstract, and label the program’s functions,
• Understand how the program uses data, and
• Construct a complete “picture” of the program.
After discussing the procedural aspects of reverse engineering, the conceptual aspects were described, including how information is used in reverse engineering tasks and what knowledge is required. The SMEs described information in reverse engineering tasks as providing the means by which they determine and manage their approach to reverse engineering. Information-seeking activities were characterized as passive, active, monitoring, and trustworthiness-related activities.
The conceptual knowledge areas of reverse engineering were described in Section 5.7.2. First, the general knowledge areas were presented, and then the following specialized areas of knowledge were described:
• Translating from assembly language into higher-level languages,
• System API functionality,
• System internals knowledge,
• How compilers generate machine code,
• Classes of vulnerabilities and exploits,
• Knowledge of and recognition of malware, and
• Knowledge of software protection techniques and how they work.
The next chapter focuses on the procedural aspects of reverse engineering through the analysis of an observational study that elicits the sensemaking process from reverse engineers performing a crackme task. The observational study was designed using information from the study in this chapter about the goals involved in reverse engineering and the processes that are shared across the four different reverse engineering problem domains.