3.11 Final Specification

3.11.2 Final Engine Design Decisions

This list contains the design decisions made to accomplish the features listed above:

• Architecture - A 64-bit architecture, allowing for a larger addressable memory space as well as an increased CPU register count.

• Operating System - Linux, chosen for its ease of configuration and development: it is open source, and it allowed this implementation to be included in the Linux kernel as a kernel module.

• Node Data Structure - Radix tries were chosen for this role as they allow for dynamic allocation, and the way nodes are accessed within this data structure complements the key used for each node’s data store, namely its IPv4 address.

• Packets Received - A hook onto netfilter allows for easy packet acquisition. A further motivation is the ability to copy and then drop packets of hosts that are part of the simulation, thus allowing for network isolation.

• Packets Transmitted - Simple use of sockets allows for byte buffers representing spoofed packets to be generated by this routing simulation.

• Simulated Routing - This is accomplished through replication of rule-based forwarding tables. This allows simulation of multiple interfaces as well as the ability to wormhole packets. This also means that each node can be represented as a stateless entity.

• Supported Protocols - Due to memory limitations, this routing simulator is restricted to the IPv4 address space.

• Threading - Use of work queues with individual core scheduling allows for parallel processing, as the nodes within this routing simulator are stateless and so have no interaction; this removes the need to lock variables.

• Configuration - Web-based configuration removes the need for specific software to configure this routing simulator. This also allows for easier configuration checking through languages designed for such tasks.
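The Node Data Structure decision above can be illustrated with a minimal userland sketch. It is not the author's implementation: it uses a simplified, uncompressed binary trie rather than a path-compressed radix trie, and the names trie_node, trie_new, trie_insert and trie_lookup are hypothetical. It only shows how the bits of an IPv4 address can drive both on-demand allocation and lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node type: one level per bit of the IPv4 key. */
struct trie_node {
    struct trie_node *child[2]; /* branch on the next key bit (0 or 1) */
    void *data;                 /* per-node data store, NULL if unused */
};

/* Allocate an empty, zero-initialized node (NULL on failure). */
struct trie_node *trie_new(void)
{
    return calloc(1, sizeof(struct trie_node));
}

/* Walk the 32 bits of addr from most to least significant, allocating
 * missing nodes on demand. Returns 0 on success, -1 if allocation fails. */
int trie_insert(struct trie_node *root, uint32_t addr, void *data)
{
    struct trie_node *n = root;

    assert(root != NULL);
    for (int bit = 31; bit >= 0; bit--) {
        unsigned b = (addr >> bit) & 1u;
        if (n->child[b] == NULL) {
            n->child[b] = trie_new();
            if (n->child[b] == NULL)
                return -1;
        }
        n = n->child[b];
    }
    n->data = data;
    return 0;
}

/* Follow the same bit path; returns the stored data or NULL if absent. */
void *trie_lookup(const struct trie_node *root, uint32_t addr)
{
    const struct trie_node *n = root;

    for (int bit = 31; bit >= 0 && n != NULL; bit--)
        n = n->child[(addr >> bit) & 1u];
    return n != NULL ? n->data : NULL;
}
```

Inserting a payload under 0xC0A80001 (192.168.0.1) and looking up the same address returns that payload, while a lookup of an address that was never inserted returns NULL. Freeing the trie is left out of the sketch.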

3.12 Summary

This design chapter gave a broad overview of the implementation and what it strives to achieve through its design. The desired functionality was laid out, and from there the best way to implement each part was selected to achieve each of the functionality goals in Section 3.1. In Sections 3.3 to 3.4, decisions were made, ranging from which memory structure to use to the way in which routing would occur in this implementation, in order to best utilize memory and CPU within the routing simulator’s host.

The hardware architecture and choice of OS were justified in Section 3.5, and how one would go about using this choice was covered in Section 3.6. This chapter wraps up with a look into how threads are interfaced to, what the physical network’s requirements would be and how one would configure this routing simulator in Sections 3.7 to 3.10. Finally, tools were described that were developed to aid in the success of this implementation.

In Chapter 5 one can find the results of all the testing aimed at showing that this implementation achieves all functionality, as well as the stress testing used to show peak performance on set hardware. The use of netfilter and how it can be used to implement basic firewall rules follows. Finally, an overview of how the physical network configuration should be performed and how one can better synchronize clocks for testing is detailed.

4 System Implementation Challenges

This chapter’s primary focus is to bring to light specific issues encountered during system implementation and set up. These problems range from bad mathematical theory that throws off the statistical behaviour of this routing simulator, to poor documentation that causes trouble in the use of libraries that form the core of this implementation. This chapter is intended as a complementary text for anyone who wishes to re-implement this research and have it behave in the manner that the author intended.

The topics addressed in this chapter begin in Section 4.1 with an in-depth analysis of random number generation in the Linux kernel version for which this implementation is intended. Following this, Sections 4.2 and 4.3 touch on a contrast to the successes of radix tries presented in Section 3.3.5, and detail how one should use the work queues discussed in Section 3.7 to ensure multi-core support.

The use of netfilter can also aid in configuration and isolation, and this can be achieved by adding appropriate rules, as described in Section 4.4. How the physical network configuration should be set up is explained in Section 4.5. In closing, clock synchronization for long term testing is explained in Section 4.6, as well as how one can better synchronize clocks using a time keeper to allow for more accurate results, as produced in Chapter 5.

4.1 Random Number Generation

Random number generation is used in this simulator for determining jitter, packet mangling rate and packet loss of nodes within the simulated network. For this one requires an accurate, and more importantly fair, random number generator. This is not possible with the current Linux 3.2 kernel¹. Consider the Linux 3.2 kernel source code snippets in Listings 4.1 and 4.2.

To summarize this code, get_random_int() is a function that produces a random unsigned 32-bit integer using a method whose goal is minimal entropy pool depletion. This first method is not safe for cryptography; however, the resources required to generate this integer are reduced by using this implementation (Mackall and Ts’o, 2014).

The second method, shown in Listing 4.2, is randomize_range(), which relies on get_random_int() for its functionality. With it one can generate a random number within a user-defined range, adjusted by a range modifier. The range is simply taken as the number of integers between the start and end specified; the range modifier is also applied at this point.

From this point a 32-bit integer is requested from get_random_int(), and the modulus operator is applied to it using the range that was determined from the specified start, end and length. At this point one has an integer from 0 up to, but not including, the calculated range variable. Now one simply adds the start point of the range to this number to shift it into the range requested, thus delivering a value within the requested range.
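The steps described above can be modelled in userland as follows. This is a sketch under stated assumptions, not the kernel source: the parameter rnd stands in for the value returned by get_random_int(), the range is assumed to be computed as end - len - start, any further post-processing the kernel applies to the result is left out, and the name model_randomize_range is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Userland model of the described steps. `rnd` plays the role of the
 * 32-bit value that get_random_int() would return (an assumption made
 * so that the function is deterministic and easy to inspect). */
unsigned long model_randomize_range(uint32_t rnd, unsigned long start,
                                    unsigned long end, unsigned long len)
{
    /* An empty or inverted range yields 0. */
    if (end <= start + len)
        return 0;

    /* Assumed range computation: span between start and end, reduced
     * by the range modifier `len`. */
    unsigned long range = end - len - start;

    /* Reduce the random value into 0 .. range-1, then shift by start. */
    return rnd % range + start;
}

/* Worked example: start 0, end 100,000,000, len 0, fed with the
 * largest 32-bit value. */
void check_worked_example(void)
{
    assert(model_randomize_range(4294967295u, 0, 100000000UL, 0)
           == 94967295UL);
}
```

Feeding the model the largest 32-bit value with a range of 100,000,000 returns 94,967,295, which is the upper edge of the partial final cycle discussed in the remainder of this section.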

This logic is sound until one considers the maximum value of a 32-bit integer, this being 4,294,967,295, and the range within which the start, end and length are determined.

Consider the function called with a start of 0, an end of 100,000,000, and a length of 0 so that no modifier is applied to the range. For this the range would clearly evaluate to 100,000,000, which is where the problem arises.

Applying a modulus of 100,000,000 to a number would yield a number in the range of 0 to 99,999,999. Considering that 4,294,967,295 is the maximum number a 32-bit integer can yield, this would cause an uneven distribution of integers produced by this range. The reason for this is that when the modulus operator is applied to a random integer between 0 and 4,199,999,999 it will return a number between 0 and 99,999,999. Every number in the second range can be produced by 42 separate integers in the first range, and thus the numbers will be evenly distributed. However, when one considers the range of a

¹ Comment cannot be made on other kernel versions, as this was the version used for implementation.

Listing 4.1: /Linux/drivers/char/random.c get_random_int

static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS],
                      get_random_int_hash);

randomize_range(unsigned long start,
                unsigned long end,

Figure 4.1: Modulus 100 applied to Single Byte for 10,000,000 Iterations

32-bit integer that can be produced between 4,200,000,000 and 4,294,967,295, one must notice that once the modulus operator is applied the only values produced are between 0 and 94,967,295. In short, this means that for this part of the 32-bit range integers 0 through to 94,967,295 can be selected, whereas integers 94,967,296 through to 99,999,999 cannot.

To summarize the significance of these values: when a modulus of 100,000,000 is applied to the full range of a 32-bit integer, as produced by get_random_int(), integers 0 through to 94,967,295 can each be yielded by 43 unique values in the range of a 32-bit integer, whereas integers 94,967,296 through to 99,999,999 can each be yielded by only 42 unique values.
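The counts of 43 and 42 can be checked with a short calculation. The sketch below is an illustration only; the function name preimage_count and its parameters are hypothetical. It counts how many values of a given bit width map onto a given result once a modulus is applied, which also covers the single-byte case used in Figure 4.1.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct `bits`-wide unsigned values v with
 * v % modulus == residue. */
uint64_t preimage_count(unsigned bits, uint32_t modulus, uint32_t residue)
{
    assert(bits >= 1 && bits <= 32);
    assert(modulus != 0 && residue < modulus);

    /* Total number of values: 2^bits (4,294,967,296 for 32 bits). */
    const uint64_t total = UINT64_C(1) << bits;

    /* Every residue is hit once per complete cycle of `modulus` values;
     * residues below the leftover are hit once more by the partial
     * final cycle. */
    uint64_t full_cycles = total / modulus;
    uint64_t leftover = total % modulus;

    return full_cycles + (residue < leftover ? 1 : 0);
}
```

For 32 bits and a modulus of 100,000,000 this returns 43 for residues 0 through 94,967,295 and 42 for residues 94,967,296 through 99,999,999. For a single byte and a modulus of 100 it returns 3 for residues 0 through 55 and 2 for residues 56 through 99.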

This means that unless 4,294,967,296, the number of distinct values a 32-bit integer can take, is perfectly divisible² by the range that start, end and len produce in randomize_range(), one cannot expect an evenly distributed random result from this function. For this reason a function was created to evenly produce integers between 0 and 99 by randomly producing smaller integers with even distribution and combining them. This was required, as random percentages are usually represented as a value between 0 and 100.

² Yields a result with no remainder.

Figure 4.2: Modulus 100 applied to Single Byte for 10,000,000 Iterations Using Custom Method

In implementation, single bytes were collected and masked to produce the required smaller values. Applying a bitmask of 00111111 to a byte limits the value to a range of 0 through to 63. Similarly, a mask of 00011111 yields a value from 0 to 31 and 00000011 yields a value from 0 to 3. Although the number of values that these masks can produce adds up to 100 (64 + 32 + 4), in practice their combination will never yield a number over 97 (63 + 31 + 3). For this reason two further single-bit numbers are produced, raising the maximum to 99.

The combination of these numbers, if logic is applied correctly in implementation, produces an evenly distributed integer between 0 and 99, and these results can be compared to the standard modulus function in Figures 4.1 and 4.2. This does however come at the cost of more processing time required to generate all the individual random integers. Because of this, optimization can be done by generating two random bytes and masking the high-order bits to produce one number and the low-order bits to produce another. If one wants to ensure an even distribution, no two numbers’ bit masks should overlap on the same byte.
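The masking and the two-byte optimization described above can be sketched as follows. This is an illustration under assumptions rather than the author's function: the bit layout, the mask names and combine_components are hypothetical, and the components are combined by plain addition only to show the attainable range. The combining logic used to reach an even distribution is not given in the text, and plain addition of independent values does not by itself produce one.

```c
#include <assert.h>
#include <stdint.h>

/* Masks from the text, plus a single-bit mask (names are illustrative). */
enum {
    MASK_6BIT = 0x3F, /* 00111111 -> 0..63 */
    MASK_5BIT = 0x1F, /* 00011111 -> 0..31 */
    MASK_2BIT = 0x03, /* 00000011 -> 0..3  */
    MASK_1BIT = 0x01  /* 00000001 -> 0..1  */
};

/* Split two random bytes into five components whose bit fields never
 * overlap within a byte (assumed layout):
 *   byte a: bits 0-5 -> 6-bit value, bits 6-7 -> 2-bit value
 *   byte b: bits 0-4 -> 5-bit value, bit 5 and bit 6 -> single bits
 *           (bit 7 is unused)
 * The components are then summed, giving a result in 0..99. */
unsigned combine_components(uint8_t a, uint8_t b)
{
    unsigned six  = a & MASK_6BIT;
    unsigned two  = (a >> 6) & MASK_2BIT;
    unsigned five = b & MASK_5BIT;
    unsigned bit0 = (b >> 5) & MASK_1BIT;
    unsigned bit1 = (b >> 6) & MASK_1BIT;

    return six + two + five + bit0 + bit1;
}

/* Range check: without the single bits the maximum is 97, with them 99. */
void check_component_range(void)
{
    assert(combine_components(0x00, 0x00) == 0);
    assert(combine_components(0xFF, 0x1F) == 97);
    assert(combine_components(0xFF, 0xFF) == 99);
}
```

Because each component reads its own bits, no bit of either input byte contributes to more than one component, which is the non-overlap condition stated above.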
