6. Implementation
6.3. Software to Write Software
The tools used for software development are of tremendous importance for high-quality releases. In the extreme (and nowadays unlikely) case, a low-quality compiler will produce erroneous code. Tools can make certain tasks much easier, or even make them possible at all, compared to doing them by hand.
6.3.1. Development Platform
6.3.1.1. Integrated Development Environments
In the very old days, programmers punched holes in cards to write code. Fortunately, those days are long over, and there are many sophisticated tools for programmers. These tools combine many of the facilities a developer needs during the implementation phase of a project; therefore, they are called integrated development environments, or IDEs for short. A good IDE should provide:
A good source text editor with:
• Code highlighting, to make it easier to distinguish the different elements of code. Common environments normally offer only keyword highlighting; however, highlighting string constants in the source code is very important, too. A program called SourceInsight goes further: it parses the code as a compiler does, and highlights, underlines, italicizes, and emboldens elements at a very granular level.
• Basic syntax checking; this avoids spending a compile cycle only to learn that there is no keyword in C called “strct”.
• Warnings against deprecated APIs.
• Simple yet useful features such as commenting and un-commenting source code, auto-indentation, and style checking. Furthermore, it must support automatic code formatting; some tools can reformat code to a predefined style.
• A fast, optimizing (for space and/or speed) compiler capable of producing correct and meaningful error and warning messages, and of producing an error-free binary representation of the supplied source code. Run-time code checks are a very welcome improvement. We do not expect too much from an IDE’s built-in compiler; after all, release bits can be compiled with a compiler of one’s own choosing.
• A fast, optimizing linker.
• A powerful debugging engine capable of source-level and assembly-level debugging with full symbols, and of modifying source code during debugging.
• Integrated and complete help for the IDE itself, the programming language, and the supplied libraries.
A good IDE with the above-mentioned features keeps programmers concentrated on their job by making tasks shorter, more intuitive, and easier.
6.3.1.2. Simulators
The earlier the code is tested, the better, in terms of both quality and cost. Some software projects require specialized hardware for testing. However, this hardware may be too expensive to dedicate a unit to each programmer or tester, or it may not be available until later phases of the project. Using a simulator to increase the number of testers or to begin testing earlier is advised. However, simulators have several weaknesses.
First, they are slower than the original hardware. This gives misleading performance estimates and may hide some race conditions.
Second, the correctness and exactness of a simulator are essential, yet hard to test and verify; testing code against an incorrect simulator causes very unpleasant surprises towards the end of the project. Testers who distrust a simulator end up running tests against both the simulator and the actual hardware, which is time-consuming and produces confusing results.
6.3.1.3. Profilers
Readers might wonder why profilers are mentioned in a thesis about secure programming. Over-optimizing, or optimizing in the wrong places, manipulates the code extensively and can result in code defects. As a rule, each code defect can end up being a security vulnerability.
A project should be optimized during the requirements phase by cutting unused features, unneeded flexibility, and unneeded scalability. This also reduces the attack surface in the future. The design phase presents further opportunities, such as choosing better algorithms or identifying synchronization bottlenecks more precisely. During implementation, optimization can be done at two levels: algorithm and source code.
Perhaps the best summary of the trade-offs in choosing more complex algorithms is given by Rob Pike in his Notes on Programming in C:
Rule 1. You can’t tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don’t try to second guess and put in a speed hack until you have proven that’s where the bottleneck is.
Rule 2. Measure. Don’t tune for speed until you have measured, and even then don’t unless one part of the code overwhelms the rest.
Rule 3. Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know n is frequently going to be big, don’t get fancy. (Even if n does get big, use Rule 2 first.)
Rule 4. Fancy algorithms are buggier than simple ones, and they are much harder to implement. Use simple algorithms as well as simple data structures.
Rule 5. Data dominates. If you’ve chosen the right data structures and organized things well, the algorithms will almost always be self evident. Data structures, not algorithms, are central to programming.
Optimizing code at the source code level, on the other hand, is usually harder and less efficient. Over-optimizing a piece of code is particularly dangerous. Hand-optimized code becomes harder to understand, prone to bugs, and very rigid. It usually has many predefined limits and constants, several assumptions, and shortcuts; these are generally traits of fragile source code. It is very reasonable to fine-tune performance bottlenecks, even at the assembly level, to squeeze out every possible CPU cycle; however, this should be done only if it is proven to be needed. A working implementation must exist in order to prove this; otherwise there is only guessing, a method that has no place in serious engineering practice.
Profilers help developers build a profile of their application. With a better picture of the bottlenecks and the regions of code that need optimization, a developer can pinpoint what to optimize and how much. Most likely, after setting up the profiler, the developer will perform scenario-based testing; this could mean handing the application to a real-life end user, or deploying it in an environment that reflects the one in which the application will run once shipped. After several hours of data collection, the profiler reports its results. Having analyzed those results correctly, the developer is less likely to over-optimize.
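The measure-first workflow described above can be sketched with a small example. This is only an illustration using Python's standard cProfile and pstats modules, not the profiler setup of any particular project; the functions slow_sum_of_squares and scenario are hypothetical stand-ins for real application code and a real usage scenario.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    # Deliberately naive: materializes a full list before summing it.
    return sum([i * i for i in range(n)])


def scenario():
    # Hypothetical workload standing in for a realistic usage scenario.
    total = 0
    for _ in range(50):
        total += slow_sum_of_squares(2000)
    return total


# Collect timing data only while the scenario runs.
profiler = cProfile.Profile()
profiler.enable()
result = scenario()
profiler.disable()

# Sort by cumulative time and capture the report as text.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

The report lists the functions that account for most of the run time, which is the evidence Rule 2 asks for before any tuning is attempted.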
6.3.1.4. Sand Boxes
A sand box is an isolated private space for a piece of code. A sand box provides an environment with a complete set of global and local dependencies, so that the piece of code “feels” as if it were executing in real life, in the middle of the process. This capability is very useful for testing functions in customized scenarios. Sometimes it can take a long time to bring an application to the desired state. For instance, assume that the memory stress handling code is to be tested. Without any tools, a real stress scenario has to be created, which normally requires ample resources. On the other hand, isolating that piece of code and giving it an environment that suffers from memory shortage makes testing much easier.
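The memory-shortage scenario can be illustrated with a minimal sketch. It does not describe any particular sand box tool; load_records, FailingAllocator, and the alloc parameter are hypothetical names introduced here. The code under test receives its allocator as a dependency, so the test environment can supply one that starts failing after a chosen number of allocations, without putting the real machine under stress.

```python
class FailingAllocator:
    """Hypothetical stand-in for the real allocator: fails after `budget` calls."""

    def __init__(self, budget):
        self.budget = budget
        self.calls = 0

    def __call__(self, size):
        self.calls += 1
        if self.calls > self.budget:
            raise MemoryError("simulated memory shortage")
        return bytearray(size)


def load_records(sizes, alloc=bytearray):
    """Code under test: allocate one buffer per record, stop cleanly on shortage.

    Returns (buffers, complete); complete is False if an allocation failed.
    """
    buffers = []
    for size in sizes:
        try:
            buffers.append(alloc(size))
        except MemoryError:
            return buffers, False
    return buffers, True


# Normal environment: every allocation succeeds.
buffers, complete = load_records([16, 32, 64])

# Sand-boxed environment: the third allocation fails.
starved = FailingAllocator(budget=2)
partial, partial_complete = load_records([16, 32, 64], alloc=starved)
```

Passing the allocator as a parameter is one possible design choice; the same idea applies to any dependency (file system, network, clock) that the isolated environment needs to control.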
6.3.2. Debuggers
6.3.2.1. NTSD / CDB
NTSD and CDB are two very similar command-line debuggers provided by Microsoft that use the same debugger engine. They are extremely powerful and provide everything a debugger can provide. They are updated frequently and are available for different platforms. They come with a very good, genuinely helpful help file, which demystifies many hard-to-understand features of these powerful engines. On the downside, however, the command-line user interface is unattractive to many users and makes the tools harder to use than GUI tools.
6.3.2.2. WinDBG
WinDBG uses the same debugger engine as NTSD and CDB; however, it provides a graphical user interface. Although the user interface is neither very exciting nor intuitive, it still provides the flexibility of NTSD/CDB in an easier-to-use environment.
6.3.2.3. Symbol Files
Symbol files contain type, address, and line information for source files. When used with a debugger, they help the debugger provide resolved stack information (call stack, local parameters, return values) and the line number in the source of the current code. Not all symbol files present the same amount of information. While private symbol files generally provide detailed information about the binary, the stripped (public) symbol files shipped with releases typically include only function and global variable names, without type, local variable, or line information.
Full symbol file generation is crucial for in-the-field debugging. Otherwise, debugging requires a huge amount of disassembly and heuristics.
Microsoft has a public symbol server on the web. To further analyze system calls and call stacks, the symbol path environment variable (_NT_SYMBOL_PATH) can be set to:
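To make the role of address information concrete, the following sketch shows how a raw return address could be resolved to a function name and offset. It is a simplified illustration, not the actual symbol file format or debugger engine logic; the SYMBOLS table, its addresses, and the resolve function are hypothetical. Real symbol files also record function sizes, types, and line numbers, which this sketch omits.

```python
import bisect

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [
    (0x1000, "main"),
    (0x1040, "parse_input"),
    (0x10C0, "write_output"),
]
STARTS = [start for start, _ in SYMBOLS]


def resolve(address):
    """Map a raw address to 'function+0xoffset'.

    Falls back to the plain hex address if it lies below every known symbol.
    """
    index = bisect.bisect_right(STARTS, address) - 1
    if index < 0:
        return hex(address)
    start, name = SYMBOLS[index]
    return f"{name}+{address - start:#x}"


# Raw return addresses as they might appear on a stack without symbols.
raw_stack = [0x10D4, 0x1052, 0x1008]
resolved = [resolve(address) for address in raw_stack]
print(resolved)
```

Without such a table, the same stack would show only the raw addresses, which is why debugging without symbols falls back on disassembly and guesswork.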
srv*c:\cache*http://msdl.microsoft.com/download/symbols;
6.3.2.4. Effective Usage of Debuggers
Although effective usage of debuggers is an important skill in secure software development, it is beyond the scope of this thesis. However, readers are strongly encouraged to develop their debugging skills if they do not feel competent and comfortable. Debugging opens a door to the internals of the binary, which is, after all, what executes on the machine.
6.4. Libraries