Adaptive Random Testing of a Trading System VICTOR CARLSSON


Master of Science Thesis

Stockholm, Sweden


Master’s Thesis in Computer Science (30 ECTS credits)
at the School of Engineering Physics
Royal Institute of Technology, year 2011
Supervisor at CSC was Karl Meinke
Examiner was Stefan Arnborg

TRITA-CSC-E 2011:067
ISRN-KTH/CSC/E--11/067--SE
ISSN-1653-5715

Royal Institute of Technology

School of Computer Science and Communication

KTH CSC

SE-100 44 Stockholm, Sweden URL: www.kth.se/csc


Abstract

This thesis aims to explore the viability of using Adaptive Random Testing and Parameterized Random Testing for functional testing of a trading system. In particular I will test a subset of the input domain that constitutes sending an order to the exchange. I use parameterized Random Testing and Adaptive Random Testing to generate input, and I use both a heuristic oracle and an exception-based oracle to verify the output of the trading system. My Random Testing efforts found faults that had previously remained undetected. Hence Random Testing has empirically been shown to be a cost-efficient supplementary testing method. Furthermore my results indicate that Adaptive Random Testing is as good as or better than Random Testing at discovering faults.


Referat

Adaptive Random Testing and Parameterized Random Testing of a Trading System

This thesis aims to explore the suitability of using Adaptive Random Testing and parameterized Random Testing for functional testing of a trading system. Specifically, I will test a subset of the input domain that involves sending orders to the exchange. I use parameterized Random Testing and Adaptive Random Testing for input generation. To verify the output of the trading system, both a heuristic oracle and an exception oracle are used. My use of Random Testing has found previously undetected faults. Thus I have shown empirically that Random Testing is a cost-efficient supplement to other testing. Furthermore, my results indicate that Adaptive Random Testing is at least as good as, or better than, Random Testing at discovering faults.


Acknowledgements

I would like to express my deep gratitude for the guidance given to me by my advisors Noah Höjeberg at Cinnober and Karl Meinke at KTH. Without their support and advice my work would not have been possible. Furthermore I would like to thank Cinnober for giving me the opportunity to work on this project. Whenever I have had any questions the people at Cinnober have been both helpful and friendly in providing assistance. In particular I would like to thank the following people at Cinnober (ordered randomly): Peter Eriksson, Amir Hossein Chini Foroushan, Björn Tennander, Magnus Melander, David Karlgren, Jonas Fügedi, Magnus Sköld, Dane Cavanagh, Anders Lindgren and Andreas Eriksson.


Glossary

ART Adaptive Random Testing

Ask Ask orders are orders that want to sell an asset

Bid Bid orders are orders that want to buy an asset

EUR Euro

FIX Financial Information eXchange protocol

FSCS-ART Fixed Size Candidate Set Adaptive Random Testing

ME Matching Engine

MTBF Mean Time Between Failure

RT Random Testing

SEK Swedish krona


In this part I will provide a background for my work. In section 2 I examine different methods of software testing as well as the theory behind them. In section 3 I give a brief overview of how an electronic exchange operates. Finally, in section 4 I explain how the system that I intend to test works.

2 Software Testing

Software testing is the practice of improving the quality of software and attempting to establish whether a piece of software behaves as expected and desired. [Spillner et al., 2006] It is important to note that whilst testing may show the existence of faults in software it is practically impossible to show their absence. [Dijkstra, 1972]

2.1 Testing Methodologies

I will now give an overview of different existing testing methodologies. Random testing will be described in detail in section 2.2. The software that is under test will be referred to as the System Under Test (SUT).

2.1.1 Black Box Testing

Black Box Testing is a general class of testing methods where the SUT is assumed to be a black box in the sense that the inner working of the system is disregarded. The SUT is seen as a box that takes input and returns output. In black box testing the SUT is studied according to the specification. [Spillner et al., 2006]

2.1.2 White Box Testing

White Box Testing is another general class of testing methods in which the source code is considered when designing test cases. [Spillner et al., 2006]

2.1.3 Static Testing

Static Testing is a kind of testing where the software is not executed. This could be syntax checking or reading the source code to see whether sound programming practices are followed. A reading of the specification for a system would also constitute static testing. [Spillner et al., 2006]

2.1.4 Manual Testing

Manual Testing is when a person executes the software and manually performs a set of operations. This type of testing can be done by following a script of what to test or by attempting to imitate the way in which a user might use the SUT. [Robinson, 2000]


2.1.5 Automatic Testing

The term Automatic Testing encompasses all testing that is not manually performed by a person. Many types of tests can be automated. Examples include, but are not limited to, unit tests, recorded manual tests and random testing. [Robinson, 2000]

2.1.6 Unit Testing

Unit Tests test small functionality in the SUT and not the full-scale system. An example would be to test that a particular function in the code behaves as it should. [Spillner et al., 2006]

2.1.7 Regression Testing

Regression Testing is the procedure of re-testing new versions of the SUT to ensure that an upgraded system still passes the same tests as the old version. By the nature of regression testing we will not be able to discover errors that exist in both versions. [Spillner et al., 2006]

2.1.8 Exploratory Testing

This approach to testing is a freer method where it is up to the tester to simultaneously learn, design and execute tests. Exploratory testing used to be known as ad hoc testing due to its seemingly unstructured style. The term ad hoc has since been replaced in order to dispel beliefs that this method was somehow completely without structure.

The main idea with exploratory testing is that the tester should be dynamic and not locked down to a static protocol of tests to perform. If there is reason to believe that a certain test would be useful that test can be immediately executed without the need to draft an entirely new testing plan.[Bach, 2002] Exploratory testing can be seen more as a general approach to testing rather than a specific method.

2.1.9 Search Based Testing (SBT)

SBT is any kind of testing where solving an optimization problem is part of the testing. First a fitness function is found. This fitness function should give a measure of how close a test is to failing. Using the fitness function an optimization problem can be solved that aims at finding tests that fail.

SBT can be used to test several different aspects of a piece of software. If performance is being tested the fitness function could be the execution time of a test and the optimization would attempt to find the slowest or fastest tests. In some cases of testing however, a good fitness function may not exist. In the worst case scenario the fitness function would be a Boolean function and the optimization problem would degenerate into a ’needle in the haystack’ search problem. [Harman et al., 2009]
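As a rough illustration of the idea, the following Python sketch uses a simple hill climber to search for the input with the highest cost. The function `cost`, the toy input domain [0, 1000] and the search parameters are hypothetical and chosen only for this example; a real fitness function would measure the SUT itself, for instance its execution time.

```python
import random

def cost(n):
    """Hypothetical fitness: a deterministic stand-in for measured execution time.
    The toy cost peaks at n = 700."""
    return -abs(n - 700)

def hill_climb(fitness, start, steps=2000, step_size=10, lo=0, hi=1000, seed=0):
    """Maximize fitness by repeatedly trying a random neighbour of the best input."""
    rng = random.Random(seed)
    best = start
    for _ in range(steps):
        # Propose a nearby input, clamped to the input domain.
        candidate = min(hi, max(lo, best + rng.randint(-step_size, step_size)))
        if fitness(candidate) > fitness(best):
            best = candidate
    return best

worst_case = hill_climb(cost, start=0)
```

With a Boolean fitness function every neighbour would look equally good until a failing input is hit, which is the degenerate case described above.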


2.1.10 Model Based Testing (MBT)

MBT can be defined as the automatable derivation of concrete test cases from abstract formal models, and their execution. A brief summary of the MBT approach is described in steps below.

1. A formal model of the SUT is created based on the specifications of the SUT. This model is generally a simplified abstract version of the SUT.

2. Test selection criteria are defined. That is, a definition on what to test. For instance one might wish to test according to operational profiles. It is also possible to attempt to cover all possible paths or states of the SUT, or at least a finite subset of those paths or states.

3. The test selection criteria are used together with the model of the SUT to define explicit specifications for the tests.

4. A test suite is generated according to the test specifications. This test suite will contain sets of both input and output as predicted by the model.

5. Once the tests have been generated they are executed on the SUT.

6. The outputs of the tests are compared with the expected output predicted by the model and a verdict is given stating if the test has passed, failed or been inconclusive.

The interested reader is recommended to read the taxonomy [Utting et al., 2006] for a more detailed explanation of MBT.

2.2 Random Testing (RT)

The idea behind Random Testing is to send random input to the SUT. The input is generated from some distribution over the input domain (possibly with a restriction to a subset of the input domain). The output is verified with an oracle (see section 2.2.2) that determines if the SUT is acting as specified and expected.

One difficulty with non-random testing is the discovery of low-frequency faults. If the SUT contains a fault that is only triggered by a very specific sequence of input it might be very difficult to discover by manually creating a test case. A major strength with RT is that it is a cost-efficient method for creating a large number of diverse test cases that would be expensive to create manually. Hence it can be efficient at finding low-frequency faults that non-random testing might not discover. [Höjeberg, 2007]

It is however important not to be lulled into a false sense of security by the sheer number of test cases that RT can generate. Even though many test cases can be generated it is possible that a number of very common test cases are not tested with RT. Furthermore, if an insufficient oracle is used RT may even miss faults triggered by a test case. Hence, whilst RT can be an efficient method, other testing is still necessary. [Höjeberg, 2007]

Another strength of RT is that it allows for statistical estimates of software reliability (see section 2.2.4).

2.2.1 RT Process

Random testing consists of three parts.

1. Generate input.

2. Execute the test case.

3. Evaluate the output.

A schematic overview of the RT process can be seen in figure 0.1.

Figure 0.1. A schematic overview of the RT process. Input is generated, sent to the SUT and evaluated by an oracle.
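The three parts of the RT process can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function `run_random_tests`, the toy SUT `buggy_abs` with its seeded fault region, and the integer input domain are all hypothetical and serve only to make the loop concrete.

```python
import random

def run_random_tests(generate, sut, oracle, n_tests, seed=0):
    """Run n_tests random test cases and return the inputs that the oracle rejects."""
    rng = random.Random(seed)
    failures = []
    for _ in range(n_tests):
        test_input = generate(rng)           # 1. generate input
        output = sut(test_input)             # 2. execute the test case
        if not oracle(test_input, output):   # 3. evaluate the output
            failures.append(test_input)
    return failures

def buggy_abs(x):
    """Hypothetical SUT: absolute value with a seeded fault region [-60, -40)."""
    if -60 <= x < -40:
        return x  # fault: the sign is not flipped here
    return abs(x)

failures = run_random_tests(
    generate=lambda rng: rng.randint(-100, 100),  # uniform over a toy input domain
    sut=buggy_abs,
    oracle=lambda x, y: y >= 0,                   # simple check on the output
    n_tests=500,
)
```

Keeping the generator, the SUT and the oracle as separate arguments mirrors the figure: each part can be replaced without touching the others.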

Generating Input In order to conduct a test we need a test case, hence we need input to the SUT. For an input to be useful it is required that it lies in the input domain. The input domain is the set of all input sequences for which the SUT has a specified output.[Hamlet, 1994] It may be difficult to fully define the set that is the input domain.

For instance the input domain may consist of an unbounded set of sequences of input. These sequences can be complex and dependent on both time and the internal state of the SUT. If the complete input domain is difficult to define we can restrict the distribution to a known subset of the domain.

Given the input domain or a subset thereof we can generate test cases. If we furthermore have an operational profile we can adjust the distribution of our test generation in order to better simulate likely use of the SUT.[Hamlet, 1994]


If the input is complicated it might be non-trivial to generate uniform input. One attempt to generate uniformly distributed input when the input takes non-trivial form is Seed [Heam and Nicaud, 2009]. It can generate input data structures defined by grammar-like rules in an XML format.

Executing the Test Case This part is rather straightforward. Assuming that we have a test case we load the input into the SUT.

Evaluating the Output After executing the test case the output must be analyzed for the test to be useful at all. Strategies for analyzing the output of a test are known as oracles. [Hoffman, 2001] Explanations of some different classes of oracles follow below (section 2.2.2).

2.2.2 Oracles

The purpose of an oracle is to verify that the SUT handles input correctly. It does this by predicting an output for a certain input and then comparing the actual result with the prediction. If the actual output matches the prediction the test case passes, otherwise it fails.

No Oracle When we don’t have any oracle we don’t know if the output is correct or not. The only behaviors that we can observe are system crashes or logged exceptions.[Hoffman, 2001] A system that, for all input, never crashes or throws exceptions but always gives incorrect output would pass a no oracle strategy with flying colors.

Even though random testing without oracle will miss subtle faults it can still have significant value. Random testing without oracle could for instance be a very cheap way to test for crashing bugs. [Höjeberg, 2007]

True Oracle On the other end of the spectrum from the no oracle strategy a true oracle will give correct output for all input that the SUT accepts. The main problems with true oracles are that they are expensive to build and that their complexity makes it very possible that they contain errors. If the specification for the SUT is equivalent to another independent system then that could be used as a cheap true oracle. However if this oracle is another piece of software its correctness cannot be guaranteed in the general case.[Hoffman, 2001]

Consistency Oracle This approach uses a previous version of the SUT as an oracle. This is a form of regression testing and is very cheap but suffers from the drawback of not being able to find errors that exist in both versions.

Heuristic Oracle Heuristics may be used to specify general characteristics [...] x ∈ {x ∈ ℝ | x > 1} may be required to be smaller than x. Simple heuristic oracles are cheap to build and they are good at finding greatly divergent output. However subtle errors are unlikely to be found with this method.[Hoffman, 2001]
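To make the difference between the no oracle strategy and a heuristic oracle concrete, here is a small Python sketch. It assumes a square root routine as the SUT, which is an illustrative choice that fits the bound stated above (an output smaller than x for x > 1). The names `exception_oracle`, `heuristic_sqrt_oracle` and `always_wrong` are hypothetical.

```python
import math

def exception_oracle(sut, test_input):
    """'No oracle' strategy: the test passes unless the SUT raises an exception."""
    try:
        sut(test_input)
    except Exception:
        return False
    return True

def heuristic_sqrt_oracle(x, output):
    """Heuristic check, assumed valid for x > 1: the output must be smaller than x."""
    return output < x

def always_wrong(x):
    """Hypothetical faulty SUT: never raises, but never returns a square root."""
    return x + 1.0

# The exception oracle accepts the faulty SUT; the heuristic oracle rejects it.
passes_without_oracle = exception_oracle(always_wrong, 4.0)
passes_heuristic = heuristic_sqrt_oracle(4.0, always_wrong(4.0))
# A subtly wrong output (1.5 instead of 2.0) still slips past the heuristic.
subtle_error_accepted = heuristic_sqrt_oracle(4.0, 1.5)
```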

2.2.3 Parameterized Random Testing

Random Testing can be parameterized to change the distribution of the input generation. A basic parameterization would be a uniform distribution over the input domain. This way all possible input will be tested with the same probability and the chance of discovering a fault is related to the size of the fault region in the input domain. Another parameterization would be to generate input from a distribution over the input domain that fits a particular operational profile.

Another useful way to use parameterization is to restrict the testing to a smaller subset of the input domain. That allows for intense testing of particular functionality which can be useful if there are suspicions that a certain part of the software contains faults.
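The parameterizations described here can be sketched as a configurable input generator. The example below is a minimal sketch in Python: the list `ORDER_TYPES`, the function `make_generator` and the weights in the example profile are hypothetical and do not describe the generator used in this thesis.

```python
import random

# Toy input domain: a handful of order types (illustrative only).
ORDER_TYPES = ["limit", "fill_and_kill", "fill_or_kill", "pegged", "iceberg"]

def make_generator(weights=None, allowed=None):
    """Return a function that draws one order type from a seeded random source.

    weights: optional dict mapping order type to relative frequency
             (an operational profile); uniform if None.
    allowed: optional subset of ORDER_TYPES to restrict testing to.
    """
    domain = [t for t in ORDER_TYPES if allowed is None or t in allowed]
    w = None if weights is None else [weights[t] for t in domain]

    def generate(rng):
        return rng.choices(domain, weights=w, k=1)[0]

    return generate

uniform = make_generator()                        # uniform over the whole domain
only_iceberg = make_generator(allowed={"iceberg"})  # intense testing of one feature
profile = make_generator(weights={                # hypothetical operational profile
    "limit": 0.9, "fill_and_kill": 0.025, "fill_or_kill": 0.025,
    "pegged": 0.025, "iceberg": 0.025,
})
```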

2.2.4 Reliability Measurements with Random Testing

The goal with measuring software reliability is to answer the following question: Given a system, what is the probability that it will fail in a given time interval, or, what is the expected duration between successive failures? [NASA, 2004]

Methodology RT can be used to measure software reliability by simulating actual use of a system and measuring how often faults occur.

The idea behind measuring software reliability with RT is based on two assumptions. First we assume that we can consider the input to the SUT to be a random variable with a certain, known, distribution. Then we assume that we have an oracle that can detect all faults. Hence we can adjust the distribution of the RT to fit the actual use of the system and notice how often faults occur.

The first assumption, that input can be considered to follow a known distribution, is crucial for getting accurate results. This input distribution is also known as an operational profile. If RT is performed with an input distribution that doesn’t match actual use of the system, then any reliability estimates could be arbitrarily inaccurate. [Hamlet, 1994] Similarly, a good oracle is important. Even if RT is used with an operational profile that perfectly matches actual use of the system, a bad oracle means that triggered faults could remain unnoticed.

Reliability Measures Given a sequence of passed and failed test cases it is possible to use statistical analysis to estimate different measures of reliability. One such measure is Mean Time Between Failure (MTBF). If we can get a good estimate for MTBF from our RT simulations it is possible to model the failure rate of the software as a Poisson process $\mathrm{Po}(\lambda)$, where $\lambda = \frac{1}{\mathrm{MTBF}}$. Such a model is sufficient to compute the probability of failure in any given time interval, or, the expected duration between failures. [Höjeberg, 2007]
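As a worked example of the Poisson model, the probability of at least one failure in an interval of length t is 1 - exp(-λt) with λ = 1/MTBF. The Python sketch below assumes that MTBF is estimated simply as the total simulated time divided by the number of observed failures; both function names and the example figures are hypothetical.

```python
import math

def estimate_mtbf(total_time, n_failures):
    """Simple point estimate: observed time divided by number of failures."""
    return total_time / n_failures

def failure_probability(mtbf, interval):
    """P(at least one failure within `interval`) for a Poisson process with rate 1/MTBF."""
    lam = 1.0 / mtbf
    return 1.0 - math.exp(-lam * interval)

# Hypothetical figures: 4 failures observed during 1000 hours of simulated use.
mtbf = estimate_mtbf(1000.0, 4)
p_day = failure_probability(mtbf, 24.0)   # chance of a failure within 24 hours
```

An interval equal to the MTBF itself gives a failure probability of 1 - 1/e, roughly 63 percent, which is a useful sanity check on the model.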

2.3 Adaptive Random Testing (ART)

A number of empirical studies have found that faults in software tend to result from contiguous fault regions in the input domain. Figure 0.2 shows abstract representations of different classes of error regions in the input domain. It has been empirically shown that box-shaped or strip-shaped error regions are more common than point-shaped error regions. [Ammann and Knight, 1988] [Bishop et al., 1993] [Chan et al., 1996] The idea behind Adaptive Random Testing is to attempt to modify RT in order to exploit this phenomenon of software faults.

Figure 0.2. Contiguous error region shapes (left, middle) have empirically been shown to be more common than point shaped (right) error domains.

If fault regions in the input domain are contiguous regions it follows that non-fault regions also have to be contiguous regions. Hence it would be expected that it is inefficient to choose test cases that are close to each other. In order to achieve a higher fault detection probability ART aims at generating test cases with an even spread in the input domain. [Chen et al., 2010]

There exist several different algorithms for generating test cases using ART principles. A selection of these algorithms is described briefly in [Chen et al., 2010]. One main point in that article is that the performance gains of the various ART algorithms compared to RT are largely similar. The main differences lie in the computational overhead for test case generation and in how suitable the different algorithms are for different characteristics of the input domain.

One interesting ART method is the Fixed Size Candidate Set ART algorithm (FSCS-ART) [Chen et al., 2005]. FSCS-ART works in the following way:

1. Generate one test case t and add it to the set of test cases T.

2. Generate k test case candidates and add them to a candidate set C.

3. Find the nearest neighbor in T for each candidate in C.

4. Select the candidate in C that is furthest from its nearest neighbor.

5. Add the selected candidate to T and discard all candidates from C.

6. Repeat from step 2 until some stopping criterion is met.

Figure 0.3 shows FSCS-ART test case generation.

Figure 0.3. Test case generation with FSCS-ART. In this example, three candidate test cases are generated at t = 0. The most different candidate (candidate 3) is selected and added to the test suite. The other candidates are rejected.
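The FSCS-ART steps listed above can be sketched in Python as follows. The sketch assumes a numeric input domain (the unit hypercube) and Euclidean distance, and uses a fixed number of test cases as the stopping criterion. These are illustrative assumptions: a non-numeric input domain such as orders would need its own distance measure.

```python
import random

def fscs_art(n_tests, k=10, dim=2, seed=0):
    """Generate n_tests points in the unit hypercube with FSCS-ART."""
    rng = random.Random(seed)

    def random_point():
        return tuple(rng.random() for _ in range(dim))

    def sq_dist(a, b):
        # Squared Euclidean distance; ordering is the same as for the true distance.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    tests = [random_point()]                               # step 1
    while len(tests) < n_tests:                            # step 6: stopping criterion
        candidates = [random_point() for _ in range(k)]    # step 2
        # Steps 3 and 4: for each candidate find the distance to its nearest
        # neighbour in T, then keep the candidate where that distance is largest.
        best = max(candidates, key=lambda c: min(sq_dist(c, t) for t in tests))
        tests.append(best)                                 # step 5: the rest are discarded
    return tests

suite = fscs_art(n_tests=20, k=10)
```

Each new test case costs k nearest-neighbour searches over T, so the work per test case grows with the size of the test suite. This is the computational overhead that distinguishes ART from plain RT.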

Another example of ART is Restricted Random Testing [Chen et al., 2009] that iteratively generates new test cases and discards any candidate that lies too close to an existing test case. The exclusion zones surrounding existing test cases also shrink as successively more test cases are generated, in order to permit input closer and closer to existing test cases. ART by partitioning [Chen et al., 2004] is another class of methods that divides the input domain into partitions and generates one test case per partition. When test cases have been constructed for all partitions it is possible to extend testing by dynamically creating more partitions. Quasi-Random Testing [Chen and Merkel, 2007] is a method where quasi-random numbers are used to generate random test cases that are permuted in a way to make them more even and less clustered than RT.

Common to all ART algorithms is that they attempt to generate test cases with some intuitive idea of even spread of the test cases. Furthermore ART has been empirically shown to be at least as effective as RT at finding faults. When faults stem from non-point shaped regions of the input domain ART has empirically been more effective than RT at finding faults. The drawback comes from the fact that ART has computational overhead in the test case selection compared to RT.

2.4 White Box Random Testing

In the Random Testing described above the SUT has been considered a black box. Several efforts have been made to incorporate RT into white box testing. Such efforts use either the source code or a formal model of the software to generate tests.


One example of RT that uses source code is DART [Godefroid et al., 2005]. DART stands for Directed Automated Random Testing and it attempts to cover different paths of the source code to trigger errors. The oracle used for DART consists of checking for crashes, failed assertions and non-termination (non-termination is checked inconclusively, of course). It is random in the sense that input is selected randomly. But DART also contains a searching component when it attempts to fulfill conditionals in order to traverse different paths of the code.

In MBT, Random Testing is used in order to reach uniform path distributions of test cases. Several such attempts have been made. [Dadeau et al., 2008] [Groce and Joshi, 2008]

3 Exchanges

In the context of this thesis an exchange is an organized marketplace where participants trade financial instruments. The financial instruments can be securities, commodities, currencies or various derivatives thereof.

The fact that an exchange is organized means that transactions occur according to specific rules. All financial instruments that are not traded on an exchange are referred to as OTC (Over the Counter).

The trading at an exchange can be electronic or conducted by people on a trading floor.[Schauer, 2006] In this thesis we are only interested in how electronic trading works.

3.1 Actors on an exchange

In order to be allowed to trade directly on an electronic exchange you need to pay fees and you also need to have technological infrastructure in place in order to connect to the exchange. If you are not able or willing to pay the high fees and have the technology you can trade through an intermediary who in turn will trade on the exchange on your behalf.[Schauer, 2006]

3.1.1 Retail and Institutional Customers

Retail customers are individuals who trade relatively low volumes and they place orders on the exchange through brokers. Institutional customers trade larger volumes than retail customers but they are not generally connected to the exchange and also trade through intermediaries.[Schauer, 2006]

3.1.2 Brokers

In this context brokers are firms or individuals at firms who trade on an exchange on behalf of their clients. The brokers do not take any risk themselves but merely work to facilitate transactions for a client. This client could be a retail or institutional investor.[Schauer, 2006]


In general a person could also be called a broker if he connects buyer and seller directly without entering into any transaction. For instance a holder of a large position might enlist a broker to find an OTC buyer. This way, the holder hopes to fetch a better price than if the position was sold through the order book on the exchange. These types of OTC brokers will not be discussed in this thesis.

3.1.3 Market Makers

Market makers commit themselves to add liquidity to a set of instruments on the exchange. They do this by always showing bid (buy price) and ask (sell price) offers for other participants in the market. They are also required to keep the spread below a certain threshold. The spread is the difference between the bid and the ask. Market makers are willing to commit to this because they can make money on the spread. They do this by buying at the bid and selling at the ask. In return for their commitment they will receive benefits from the exchange such as lower transaction fees in the security that they make a market in.[Schauer, 2006][NasdaqOMX, 2010]

3.2 Order Book

The exchanges in this thesis are auction markets. In these exchanges all trades pass through the order book. The order book of an instrument is the set of all orders placed by market participants for that instrument. The purpose of the order book is to discover the market price of an instrument as well as bringing together buyers and sellers.

A participant may place bid or ask orders of different price, volume and type into the order book. These orders will then be matched according to specific deterministic rules.[Schauer, 2006]

3.2.1 Order Types

Several different kinds of orders exist which behave in different ways in the order book. Different order types and order attributes can be combined. Which order types are available and their exact definitions may be different between exchanges.[Schauer, 2006]

Limit Order This is the most basic order type. A price and volume is specified and the order will, if matched, be fully or partially executed at the desired price or a better price.[Burgundy, 2010]

Fill-And-Kill A Fill-And-Kill order will immediately attempt to be fully or partially filled. When the attempted matching has occurred the order will immediately be canceled.[Burgundy, 2010]

Fill-Or-Kill A Fill-Or-Kill order will immediately attempt to be fully filled. If it [...]

Pegged Order A Pegged Order behaves like a limit order except that its price will change with changes to the orders in the book. The price will be set at a static offset from an anchor price. The anchor price will be the best bid, the best offer or the mid price (arithmetic mean of the best bid and the best offer). Every time the anchor price changes the Pegged Order will be refreshed to a new entry time.[Burgundy, 2010]

Dark Order A Dark Order will not be publicly visible in the order book but it can be matched with any other eligible order. A Dark Order can be matched with another Dark Order.[Burgundy, 2010]

Iceberg Order An Iceberg Order will have a subset of its volume visible and the rest of its volume will be dark. When the visible volume has been matched and unmatched dark volume remains, the visible volume will be refilled from the dark volume.[Burgundy, 2010]

3.2.2 Validity Conditions

An order will stay in the order book as long as it is valid. An order is valid as long as its Validity Condition dictates or until it has been filled, whichever occurs first.[Schauer, 2006]

Day Order Day Orders are valid for the remainder of the trading day on which they were submitted.[Burgundy, 2010]

Good-Till-Date With this validity condition the order will be valid until the date specified.[Burgundy, 2010]

Good-Till-Cancel This order will be valid until it is manually canceled.[Burgundy, 2010]

3.3 Order Matching

We will use the notation X@Y to mean X instruments at the price Y .

Every time an order is placed in the order book order matching may occur. For instance a bid order of 1@1 might be matched with an ask order of 1@1.

In general, when matching orders, the first priority is to get the best price for the order. The second priority is the time at which the order was placed. Orders with different attributes may also be handled in special ways. [Schauer, 2006][Burgundy, 2010]


(a) Order Book A
Bid: 100@100, 100@100
Ask: 100@100 (new order)

(b) Order Book B
Bid: 100@100, 100@99
Ask: none

(c) Order Book C
Bid, Ask: 100@100, 100@99

Figure 0.4. (a) The ask order would be matched with the bid order that was placed first. (b) If a limit ask order of 100@99 was placed it would be matched with the 100@100 bid and the transaction would occur at the price 100. If a limit ask order of 200@99 would be placed there would be a partial match of 100@100 and the remaining volume would be matched with the 100@99 bid. (c) No match would occur in this case.
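The price and time priorities described above can be sketched for the simple case of an incoming limit ask order. The Python below is a minimal sketch and not the matching logic of any real exchange: the function `match_ask`, the order identifiers and the tuple layout are hypothetical, and trades are assumed to execute at the price of the resting bid, as in example (b).

```python
def match_ask(bids, ask_volume, ask_price):
    """Match an incoming limit ask against resting bids.

    bids: list of (order_id, volume, price) in order of arrival.
    Returns (trades, remaining_bids, unmatched_ask_volume), where each trade
    is (order_id, volume, price) at the resting bid's price.
    """
    # Best price first; sorted() is stable, so earlier orders keep priority
    # among bids with the same price.
    book = sorted(bids, key=lambda order: -order[2])
    trades, remaining = [], []
    for order_id, volume, price in book:
        if ask_volume > 0 and price >= ask_price:
            traded = min(volume, ask_volume)
            trades.append((order_id, traded, price))
            ask_volume -= traded
            volume -= traded
        if volume > 0:
            remaining.append((order_id, volume, price))
    return trades, remaining, ask_volume

# Order Book A: two bids at the same price, the earlier one is matched.
trades_a, _, _ = match_ask([("b1", 100, 100), ("b2", 100, 100)], 100, 100)
# Order Book B with an ask of 200@99: both bids are matched, best price first.
trades_b, rest_b, left_b = match_ask([("b1", 100, 100), ("b2", 100, 99)], 200, 99)
```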

Figure 0.5. Architecture overview of TRADExpress™.

4 The Cinnober System

4.1 Architecture overview

4.1.1 TAX - Trading Application Multiplexor

The TAX is a router that handles incoming and outgoing messages. Actors at the exchange interact with the system through the TAX. Interactions with the TAX are carried out with either the FIX or the EMAPI protocol. For a more thorough explanation of the FIX protocol see section 5.


4.1.2 ME - Matching Engine

The central part of the system is the ME. This is where order books and order matching are handled.

4.1.3 CD - Common Data

The CD contains data that is used by other parts of the system. For example it contains user data, instrument definitions and trading schedules. The CD also has a backup of all data for redundancy reasons.

4.1.4 QS - Query Server

The QS keeps an active copy of the state of the order books in the ME. The QS is used as a source for information queries to lighten the load on the ME.

4.1.5 HS - History Server

The HS handles queries for historical data and stores orders that should remain in the system between trading days.

4.1.6 VS - Vote Server

A VS is used to facilitate fail overs from primary to secondary servers. If, for instance, a primary server fails the VS will instruct the secondary server to step in and take over the tasks. Each primary and backup pair of servers will have an individual VS attached to it.

The purpose of the VS is to ensure that only one server of a certain type is active at any given time. That way the risk of multiple servers attempting to accomplish the same task simultaneously is mitigated.

4.1.7 MOPS - Market Operations application

The MOPS is used by the exchange to monitor the market. From the MOPS a user can see the status of order books and orders.

4.1.8 SOPS - System Operations application

Where the MOPS is used to monitor the market activities in the system the SOPS monitors the state of the system itself. The SOPS is used by the exchange to administer and monitor the services in the system.

5 The FIX Protocol

The Financial Information eXchange (FIX) protocol is a messaging standard for real-time electronic exchange of securities transactions. The protocol is the de facto messaging standard for pre trade and trade communications in equity markets. It is specified and maintained by FIX Protocol Ltd and kept in the public domain. Hence anyone is free to implement and use FIX in communications.[FIXProtocol, 2010b]

The FIX protocol contains specifications for all aspects of communications. In this section I will focus on how orders are placed and confirmed since that is what I will focus on testing. Hence I will not discuss procedures for logon, logoff, handling of disruptions in communications etc.

5.1 FIX Message Format

A FIX message is a string of tags and values separated with a separation character. The tags are integer numbers and the values are the string representations of several different data types. Basic examples of tags used when placing orders are: price, quantity and instrument name.

There are several different versions of the FIX protocols that support different tags. Furthermore it is possible to specify custom tags that are not available in the standard FIX protocol.

Some tags are required for all FIX messages. These are grouped together in a standard header and a standard trailer. The FIX protocol also contains tags to verify the consistency of a message. The first such tag contains the length of the message. The second such tag contains a checksum computed by taking the byte sum of each character in the FIX message and then taking that value modulo 256. [FIXProtocol, 2008]
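The two consistency tags can be illustrated with a short Python sketch. In standard FIX the separation character is SOH (byte 0x01), the message length is carried in tag 9 (BodyLength) and the checksum in tag 10 (CheckSum), written as three digits and computed over every byte that precedes the checksum field. The helper names `fix_checksum` and `encode_fix` are hypothetical, and the version string "FIX.4.4" is an assumption made for the example.

```python
SOH = "\x01"  # standard FIX field separator

def fix_checksum(message_without_trailer):
    """Byte sum modulo 256, formatted as three digits."""
    return "%03d" % (sum(message_without_trailer.encode("ascii")) % 256)

def encode_fix(begin_string, body_fields):
    """Build a FIX message from (tag, value) pairs, adding tags 8, 9 and 10."""
    body = "".join(f"{tag}={value}{SOH}" for tag, value in body_fields)
    # BodyLength counts the bytes after the BodyLength field and before CheckSum.
    head = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    message = head + body
    return message + f"10={fix_checksum(message)}{SOH}"

# A minimal message with only a message type (35=0 is a Heartbeat in standard FIX).
msg = encode_fix("FIX.4.4", [(35, "0")])
```

A receiver can verify a message by recomputing the byte sum over everything before the "10=" field and comparing it with the transmitted value.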

5.2 Sending Orders with the FIX Protocol

In order to send orders over FIX a connection needs to be established and the user must log on. Furthermore the user must have permission to place orders at the exchange.

When placing orders one at a time the FIX message type New Order Single is used to fully define the order. After having sent the order an Execution Report message is returned from the server. This message is a confirmation that the order has been received and will contain sufficient information to define the order. The Execution Report will also contain the current state of the order.[Cinnober, 2010]

5.2.1 New Order Single message

The main part of the New Order Single message consists of many tags that together define the order. A list of these fields and their possible values for our exchange can be found in Appendix A. As can be seen in that list a subset of the fields are required. These are the basic fields that every order is reasonably expected to contain such as price, instrument name, side (buy/sell) etc. Other tags govern more specific characteristics of the order such as the order type and validity conditions.


5.2.2 Execution Report

The Execution Report contains many of the tags found in the New Order Single message. This is so that a newly placed order can be confirmed together with a complete definition of the order. A complete list of the available fields for Execution Report messages can be found in Appendix A.


Part II


6 Problem formulation

The task is to test Cinnober TRADExpress™. More specifically the work should be centered on testing the implementation of the FIX protocol for sending orders to the exchange. The focus should be on the adherence to functional specifications.

I will investigate the viability of using Random Testing and Adaptive Random Testing to test the system.

7 Choice of Method

Cinnober regularly tests using unit tests, random tests, manual testing and regression testing. Furthermore, performance under load is tested extensively.

I have decided to test using Random Testing with different distributions and Adaptive Random Testing.

This section will give brief descriptions of the different methods that were considered and section 8 will explain their implementations.

7.1 Random Testing - RT

Random Testing in various forms has previously been done at Cinnober with positive results. [Höjeberg, 2007][Karlgren, 2009] Both previous master’s theses used an exception based oracle. The first [Höjeberg, 2007] also had a heuristic oracle able to detect some anomalies.

What I intend to do is to expand on the oracle as explained below in subsection 7.2.

I will also restrict my testing to a subset of the input domain. The input domain chosen is the full set of input that constitutes the sending of an order. The input will be biased (but not restricted) towards valid orders.

My intention is to create code that can generate all valid (or invalid) orders with all valid (or invalid) combinations of parameters. Furthermore I will add parameterization so that I can test only particular order types or generate the input using different distributions.

7.2 Oracle

I will primarily use an exception based oracle and a heuristic oracle.

The exception based oracle will consist of checking if the SUT crashes or if the log files contain unhandled exceptions.

The heuristic oracle will verify that the system receives and interprets each order correctly. The oracle will not verify that the order behaves correctly in the system after it has been received and confirmed.

Each time that the oracle reports a failure I have to manually verify that it is indeed a failure. During implementation it happened that I sent in orders that I thought were valid but in fact they were not, hence the need for manual verification.


7.3 Simulated Trading

One issue with RT is that any estimates of the reliability of the system based on the testing can be arbitrarily incorrect if the input distribution does not follow an operational profile. [Hamlet, 1994] In an attempt to remedy this I shall create random testing that simulates actual trading on the exchange.

7.4 Adaptive Random Testing - ART

ART has been found empirically to be as good as or better than RT at detecting faults. [Chen et al., 2010] Hence I will use ART for testing and compare the results with RT. I will use the algorithm FSCS-ART described in section 2.3.

The reason for choosing FSCS-ART over other ART algorithms was that I deemed it to be rather easy to implement. The only new functionality required for FSCS-ART is the ability to measure the distance between test cases. I was able to implement this without much problem (see section 8.4).

All other methods that I have found were either grid based or partition based algorithms. With those methods it was necessary to be able to randomly generate a test case from an arbitrary, and dynamically changing, subset of the input domain. The way my order generator was coded, it was not immediately obvious how such restricted test case generation would have been performed. Hence I did not have the time to implement other ART algorithms than FSCS-ART.

7.5 Search Based Testing - SBT

I have also considered using Search Based Testing. However, according to the papers [Arcuri et al., 2010] and [Harman et al., 2009] SBT requires a good fitness function in order to be more efficient than RT. I do not believe that this particular system allows for a good fitness function. The only fitness function that I see is a Boolean function that decides if a particular test case fails or passes. Hence SBT would degenerate into a ’needle in the haystack’ search problem which would give no advantage over RT due to the size of the input domain.

Even though I could not think of a way to use SBT for finding errors I have thought of using SBT in order to automatically find performance issues. The idea would be to let the fitness function be the execution time of a particular test case. With that fitness function you could search for input that maximizes this fitness function. That way input on which the SUT performs poorly could be discovered. However my task has not been to test performance, hence I have not looked into this further but I felt that it warranted a mention.

8 Implementation

All testing is done with the JUnit testing framework. [junit.org, 2011] This allows for neat monitoring and control of the testing process.


8.1 Random Testing

The core of my testing efforts was the creation of a class that can generate random orders. The class is instantiated with a seed and can then be used to generate orders one at a time. Since a seed is used it is possible to recreate a sequence of random orders. Hence a fault found once can be triggered again at a later time. Furthermore, once a fix is in place, the same random sequence can be used to test the fix.

The actual function that generates a new order works in five steps.

1. Call the generation function with parameters specifying the generation method and possible input domain restrictions.

2. Create a placeholder order object.

3. Randomly select which optional order parameters to include subject to restrictions given at the function call.

4. Randomly set all non-static parameters.

5. Set protocol specific parameters.

At function call (step 1) parameters can be given to control aspects of the order generation. One aspect is restriction of the generation to a subset of the input domain. Restrictions can be used to either guarantee a certain order type or to specify possible order types. Another aspect is the distribution used to generate parameters. Here it is possible to specify either a uniform distribution or a normal distribution.

Step 2 is straightforward and is only used to create a blank placeholder order. The placeholder order will contain information such as username of the sender, protocol version and the name of the receiving exchange.

In step 3 the order type is decided by selecting which order types to include whilst considering specified restrictions. There are also restrictions in place to prevent too many invalid orders being generated. For instance it is sufficient if only a small number of orders are allowed to have a negative price.

In step 4 all required and the selected optional parameters are randomized according to the distribution specified at the function call. It should be noted that the normal distribution only applies to the price and the quantity parameters. Since the other parameters are discrete they are always uniformly distributed.

The last step, step 5, will generate protocol specific parameters. These would include the length of the FIX message, the sending time and the checksum. The reason that these are not randomly generated is that the subset of the input domain that includes incorrectly formatted messages is large, uninteresting and well tested.
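The five steps can be illustrated with a small Python sketch. It is not the generator used in this thesis: the order types, optional fields, value ranges, the 50 % inclusion probability and the share of invalid orders are all hypothetical, and the class and field names are chosen only for the example. What it is meant to show is the seeded generator, the restriction of order types, and the choice between a uniform and a normal distribution for price and quantity.

```python
import random
from dataclasses import dataclass, field

ORDER_TYPES = ("limit", "market", "dark")      # hypothetical subset of order types
OPTIONAL_FIELDS = ("min_qty", "max_floor")     # hypothetical optional parameters


@dataclass
class Order:
    sender: str
    exchange: str
    order_type: str = ""
    price: float = 0.0
    quantity: int = 0
    optional: dict = field(default_factory=dict)


class OrderGenerator:
    def __init__(self, seed, invalid_rate=0.02):
        self.rng = random.Random(seed)   # same seed, same order sequence
        self.invalid_rate = invalid_rate

    def _draw(self, distribution, low, high):
        if distribution == "normal":
            value = self.rng.gauss((low + high) / 2, (high - low) / 6)
            return min(max(value, low), high)  # clamp the tails into range
        return self.rng.uniform(low, high)

    def next_order(self, distribution="uniform", allowed_types=ORDER_TYPES):
        # Step 1 is the call itself, with its distribution and restrictions.
        order = Order(sender="test-user", exchange="TEST")        # step 2
        order.order_type = self.rng.choice(list(allowed_types))   # step 3
        chosen = [name for name in OPTIONAL_FIELDS if self.rng.random() < 0.5]
        order.price = round(self._draw(distribution, 1.0, 1000.0), 2)  # step 4
        order.quantity = int(self._draw(distribution, 1, 10_000))
        for name in chosen:
            order.optional[name] = self.rng.randint(1, order.quantity)
        if self.rng.random() < self.invalid_rate:
            order.price = -order.price   # only a small share gets a negative price
        # Step 5 (length, sending time, checksum) is computed, not randomized.
        return order
```

Two generators created with the same seed return the same sequence of orders, which is what makes it possible to replay a failing run and to retest after a fix.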

The actual procedure for testing can be summarized in the following way:

1. Establish a connection to the exchange and log in.

2. Create an order generation object and initialize it with a seed.

3. Generate a new order and send it to the exchange.

4. Validate the response from the exchange with the oracle.

5. Repeat from step 3 until the oracle finds a fault or until a predetermined number of test cases have been sent.

This process is also described in Figure 0.6.

Figure 0.6. The RT process for testing the sending of orders to the exchange.
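The test loop itself is short. The Python sketch below shows its shape under simplifying assumptions: the connection and login of step 1 are left out, the exchange is replaced by a local function, and the generator, the oracle and the planted fault are toy stand-ins invented for the example.

```python
import random


def run_random_test(generate, send, oracle, seed, max_cases):
    """Return (case number, order) for the first failure, or None if all pass."""
    rng = random.Random(seed)                      # step 2: seeded generator
    for case_number in range(1, max_cases + 1):
        order = generate(rng)                      # step 3: generate and send
        response = send(order)
        if not oracle(order, response):            # step 4: validate the response
            return case_number, order              # step 5: stop at the first failure
    return None


# Toy stand-ins, all hypothetical.
def generate(rng):
    return {"price": rng.randint(-5, 100), "quantity": rng.randint(1, 1000)}


def oracle(order, response):
    return response == ("accepted" if order["price"] > 0 else "rejected")


def correct_exchange(order):
    return "accepted" if order["price"] > 0 else "rejected"


def faulty_exchange(order):
    # Planted fault: a price of zero is wrongly accepted.
    return "accepted" if order["price"] >= 0 else "rejected"


print(run_random_test(generate, faulty_exchange, oracle, seed=1, max_cases=10_000))
```

Because the loop is driven by the seed alone, rerunning it with the same seed stops at the same test case, which is the reproducibility property described in section 8.1.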

8.2 Oracle

The exception based Oracle consists of manually checking log files for unhandled exceptions in the system.

Every time the exchange receives a FIX message it shall respond. The heuristic Oracle looks at the responses to sent orders to validate if they were correctly received or not.

The heuristic Oracle can be described in the following way:

1. The order is sent to the exchange.

2. The generated order is analyzed to decide if it is valid or invalid.

3. A subset of the expected response from the system is generated.

4. The actual response from the system is compared to the generated expected response.

5. If the responses match the test case passes, otherwise it fails.

Step 1 is described in section 8.1 above.

Step 2 contains a number of heuristic validations to determine if the order follows the requirements of the system. For instance negative prices or certain combinations of parameters are invalid. If the order is found to be invalid it is noted.

In step 3 a subset of the response from the system is generated. This generated information contains the parameters of the order such as price, quantity and order type. Furthermore it specifies if the order is expected to be accepted or rejected at the exchange. If the order was determined to be valid in step 2 it is expected to be accepted, if it was determined to be invalid it is expected to be rejected.

In step 4 the Oracle compares the response received from the system with the generated response. Hence the Oracle verifies that the order parameters are correctly interpreted by the system and that orders are correctly deemed valid or invalid.

If the comparison in step 4 finds the system response to match the expected response then the test case passes. If there is any discrepancy between the actual and expected responses then the test case fails, and the failing test case has to be manually analyzed to determine if it constitutes an error or if the Oracle is incorrect.
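Steps 2 to 5 can be sketched in Python as follows. The two validity rules, the field names and the response format are assumptions made for the example and do not reproduce the rules of the real system. The point of the sketch is that the oracle predicts only a subset of the response and ignores every field it cannot predict.

```python
def is_valid(order):
    """Step 2: heuristic validity rules; both rules here are illustrative."""
    if order["price"] <= 0 or order["quantity"] <= 0:
        return False
    if order.get("min_qty", 0) > order["quantity"]:   # hypothetical combination rule
        return False
    return True


def expected_response(order):
    """Step 3: build the subset of the response that the oracle can predict."""
    return {
        "price": order["price"],
        "quantity": order["quantity"],
        "order_type": order["order_type"],
        "status": "accepted" if is_valid(order) else "rejected",
    }


def mismatches(order, actual):
    """Steps 4 and 5: list the predicted fields that differ; empty means pass."""
    expected = expected_response(order)
    return sorted(key for key, value in expected.items() if actual.get(key) != value)
```

A non-empty result only says that the oracle and the system disagree. As noted above, a manual check is still needed to decide which of the two is wrong.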

8.3 Simulated Trading

In order to simulate trading I used the result of [Karlgren, 2009] that the mid price during a trading day can be approximated as following an AR(1)-process. The way I have implemented this is by letting the price of each new order be equal to the mid price with an added random noise. This way the order book during testing will resemble an actual order book with the orders clustered around one price. Furthermore the noise added to each new order price will ensure that the mid price moves.
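The mechanism can be illustrated with a toy order book in Python. This is a sketch under strong assumptions, not the simulation used in the thesis: the book keeps only price levels, an order that crosses the spread simply removes the best opposing level, and the starting prices, the noise level and the equal split between buy and sell orders are invented for the example.

```python
import random


def simulate_mid_prices(seed, n_orders, start_mid=100.0, noise_sd=0.5):
    """Return the mid price observed before each new order is placed."""
    rng = random.Random(seed)
    bids, asks = [start_mid - 0.5], [start_mid + 0.5]   # one resting order per side
    mids = []
    for _ in range(n_orders):
        mid = (max(bids) + min(asks)) / 2
        mids.append(mid)
        price = round(mid + rng.gauss(0, noise_sd), 2)  # mid price plus random noise
        if rng.random() < 0.5:                          # buy order
            if price < min(asks):
                bids.append(price)                      # rests in the book
            elif len(asks) > 1:
                asks.remove(min(asks))                  # crosses the spread: trade
        else:                                           # sell order
            if price > max(bids):
                asks.append(price)
            elif len(bids) > 1:
                bids.remove(max(bids))
    return mids
```

Each resting order that improves the best bid or ask, and each trade that removes it, shifts the mid price, so the orders stay clustered around a price that moves over time.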

8.4 Adaptive Random Testing

I implemented the algorithm FSCS-ART described in 2.3.

In order to implement the algorithm there was a need to measure the distance between test cases. I did this by creating a map from the input domain to the unit hypercube in R^n, where n is the number of interesting parameters. The map works by taking each interesting dimension of the input domain and mapping it to the closed range [0, 1]. The resulting values then make up the coordinates of the test case. Once a test case is represented as a point in R^n it is possible to use the Euclidean distance as a measure of how much test cases differ.
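A minimal Python sketch of such a map is given below. The choice of three dimensions, the value bounds and the way the discrete order type is spread over [0, 1] are assumptions for the example; the text does not specify which parameters were treated as interesting or how discrete parameters were scaled.

```python
import math

BOUNDS = {"price": (0.0, 1000.0), "quantity": (1, 10_000)}   # hypothetical ranges
ORDER_TYPES = ["limit", "market", "dark"]                    # hypothetical ordering


def to_unit_point(order):
    """Map an order to a point in the unit hypercube, one coordinate per dimension."""
    point = []
    for name, (low, high) in BOUNDS.items():
        scaled = (order[name] - low) / (high - low)
        point.append(min(1.0, max(0.0, scaled)))   # clamp out-of-range values
    # A discrete parameter is spread evenly over [0, 1] by its index.
    point.append(ORDER_TYPES.index(order["order_type"]) / (len(ORDER_TYPES) - 1))
    return tuple(point)


def distance(a, b):
    """Euclidean distance between two orders in the unit hypercube."""
    return math.dist(to_unit_point(a), to_unit_point(b))
```

With n dimensions the largest possible distance is the square root of n, reached between opposite corners of the hypercube.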


The actual implementation is a subclass of the random order generation class. It is initialized with a seed and generates a fixed number of orders at a time, using a given number of candidates.

The order generation starts with generating one random order. After that, new orders are created sequentially by generating candidate orders and selecting the one that is “furthest” from the existing orders. The “furthest” order is decided by finding the candidate with the maximal distance to its nearest neighbor.

There is one important thing that could be improved in my implementation. The algorithm used for finding the nearest neighbor is the naïve algorithm. That is, for each candidate I iterate through all existing points in order to find the nearest neighbor. The naïve solution is O(n) per candidate, where n is the number of existing orders, but there exist algorithms that are O(log n) [Knuth, 1973]. I have not implemented any such algorithm. For test suites with running times over an hour or two this gives significant overhead in the test case generation.
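The selection rule can be written compactly. The Python sketch below works directly on points in the unit hypercube rather than on orders, and the function names are chosen for the example. It uses the same naïve nearest neighbor search as described above, so every candidate is compared against every existing point.

```python
import math
import random


def pick_farthest(candidates, executed):
    """Return the candidate whose nearest executed neighbour is farthest away."""
    return max(candidates, key=lambda c: min(math.dist(c, e) for e in executed))


def fscs_art(n_tests, n_candidates, dims, seed):
    """Generate n_tests points using a fixed-size candidate set per test case."""
    rng = random.Random(seed)

    def random_point():
        return tuple(rng.random() for _ in range(dims))

    executed = [random_point()]                    # the first test case is purely random
    while len(executed) < n_tests:
        candidates = [random_point() for _ in range(n_candidates)]
        executed.append(pick_farthest(candidates, executed))  # naive scan of all points
    return executed
```

With a single candidate per step the procedure reduces to plain Random Testing, which makes the number of candidates a convenient parameter when comparing ART with RT.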

The ART process for testing the sending of orders is described in the diagram in Figure 0.7.


9 Results

9.1 Error Found Manually

During implementation I managed to find one error. When an incorrect order is sent to the exchange the response is an execution report that informs the sender that the order has been rejected. I found, by manually comparing this message with the specification, that execution reports that were rejects were missing two fields that all execution reports are required to contain.

9.2 Errors Found with Random Testing

I have generated several thousand orders and sent them to the system. In each case I have checked the heuristic oracle and the exception based oracle. When a test case fails I first read the logs and specification to verify that I have indeed encountered a true error. If it was an error I then used manual tests as well as parameterized random tests similar to the failing input in order to explore the exact trigger of the error.

All errors found with Random Testing were found with the input being generated with a uniform distribution.

9.2.1 Incorrect Order Classification

The error was that orders with a particular combination of parameters would behave as expected at the exchange, but the execution report confirming the orders would describe them incorrectly.

This error could have been found by unit testing or manual testing. But it would likely have taken many test cases before the specific combination of parameters that could trigger the error would have been discovered.

In finding this error the heuristic oracle proved its value. Without that oracle I would not have been able to notice that the system interpreted the order incorrectly.

9.2.2 Integer Overflow

Dark orders need to be above a certain size to be valid. That is, the order volume multiplied by the price has to be above a fixed threshold. The error was that if the order volume and price were large enough their product would result in an integer overflow. Hence the size of the order would appear to be below the threshold, even though it was far above it.
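The effect can be reproduced in a few lines. The Python sketch below assumes that the product was computed in a 32-bit signed integer and that prices are whole numbers; neither the integer width nor the threshold value is stated in the text, so both are assumptions made for the example. Python integers do not overflow, so the wrap-around is emulated explicitly.

```python
THRESHOLD = 500_000   # hypothetical size threshold


def to_int32(value):
    """Wrap an integer the way a 32-bit signed integer would overflow."""
    return (value + 2**31) % 2**32 - 2**31


def order_value_int32(volume, price):
    """Order value as it would appear after a 32-bit multiplication."""
    return to_int32(volume * price)


volume, price = 50_000, 50_000
true_value = volume * price                      # 2 500 000 000, far above the threshold
wrapped_value = order_value_int32(volume, price)
print(true_value >= THRESHOLD, wrapped_value >= THRESHOLD)
```

For this pair the true value exceeds the largest 32-bit signed integer, so the wrapped product becomes negative and the order looks too small. Pairs of small values are unaffected, which is why the fault only shows up for large volumes and prices.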

This class of error could have been found using conventional boundary value testing with unit tests. But it would not have been trivial since small variations in the order volume or price would result in large differences of their product. Hence RT helped by allowing automated tests of many different values.


9.2.3 Minimum Volume Rematch

It was found that an order with a required minimum volume would under certain circumstances fail to be matched even though there were sufficient orders in the order book. The exchange tries to find the combination of orders that will give the best price under the condition that the volume must be above a minimum volume and below a maximum volume. The algorithm in the ME (see 4.1.2) that solves this knapsack problem had a subtle error that was triggered for certain specific order book configurations. The result was that the response from the server was a reject message with an additional text explaining that an exception had occurred.

The fault was triggered whilst using Random Testing with a uniform distribution. The error was noticed by both the heuristic oracle and the exception based oracle. The bug was found after hundreds of thousands of tests and it is very unlikely that it would ever have been discovered by manually written unit tests. The fact that the bug was found in the ME is significant since this is the most critical and also the most heavily tested part of the exchange.

9.3 Errors Found with Adaptive Random Testing

Except for the Minimum volume rematch error above (9.2.3), all faults found with RT were also found with ART. The only fault that ART, but not RT, was able to discover was an error in the heuristic oracle.

A dark order needs to be large in scale, i.e. the volume multiplied by the price needs to be above a certain threshold. If the oracle believes that the order value is below the threshold it will expect the system to reject the order.

What happened in a few cases during ART testing was that the oracle decided that some dark orders were not large in scale (that is, not above the threshold) whilst the exchange decided that these orders were in fact large enough. The error in the oracle stemmed from an incorrect assumption about the large in scale threshold. In the oracle the threshold was set to 500.000 SEK whilst it was set to 50.000 EUR in the exchange. Furthermore the EUR/SEK exchange rate in the system was set such that the difference between these two thresholds, when converted to the same currency, was quite low.

After finding the inconsistency it was easy to update the oracle and continue testing.
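The narrow window in which the oracle and the exchange disagreed can be made concrete with a short Python sketch. The two thresholds are taken from the text above, but the exchange rate is hypothetical, since the rate configured in the system is not given, and whether the comparison is strict or inclusive at the boundary is also an assumption.

```python
ORACLE_THRESHOLD_SEK = 500_000     # threshold assumed by the oracle (from the text)
EXCHANGE_THRESHOLD_EUR = 50_000    # threshold configured in the exchange (from the text)
SEK_PER_EUR = 9.5                  # hypothetical rate; the actual rate is not given


def oracle_expects_accept(value_sek):
    return value_sek >= ORACLE_THRESHOLD_SEK


def exchange_accepts(value_sek):
    return value_sek >= EXCHANGE_THRESHOLD_EUR * SEK_PER_EUR


def disagreement(value_sek):
    """True when the oracle and the exchange classify the order differently."""
    return oracle_expects_accept(value_sek) != exchange_accepts(value_sek)
```

With this assumed rate only order values between 475 000 SEK and 500 000 SEK trigger a disagreement, so a test case has to land in a small part of the input domain to expose the incorrect assumption.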


Part III

Conclusion


10 Conclusion

I have found that Random Testing with a simple heuristic oracle can be a viable method for finding errors in a well tested system. The errors that were found would not have been easy to find with manually written unit tests.

I did not find more faults in the software with ART than I did with RT. However ART managed to uncover a slightly incorrect assumption in the oracle code that RT was not able to uncover. This suggests to me that ART is at least as good as RT at uncovering faults, if not better.

I believe that randomly simulating actual use of the system is an important way to test it. Any errors found with such a simulation would, by the construction of the test, be likely to occur in production. This could be used to make estimates of the reliability of a system as long as one is careful not to rely too heavily on such estimates.

Even if the RT is not perfectly written it is such a cheap way to generate a large number of tests that it is hard to argue against. But it is important not to be lulled into a false sense of security by a large number of test cases and neglect other testing. Hence Random Testing should be performed in addition to other testing.

If one is careful not to jump to conclusions based on the results of the testing, and if one does not conduct random testing to the exclusion of other testing, the only bad random testing is no random testing.

11 Further Research

11.1 Oracle

The heuristic oracle used is good at validating the confirmation responses from the exchange. Hence it is a good oracle for testing a subset of the FIX implementation of the exchange. If one would like to also test that the system handles the orders correctly after confirming them, an additional oracle could be built for this purpose. Such an oracle has been built for a subset of the possible orders during a previous master’s thesis at Cinnober [Lozano, 2010] and it would be interesting to combine this with my heuristic oracle to simultaneously test a larger subset of the functional requirements of the system.

11.2 ART

The ART implementation uses an inefficient algorithm for solving the nearest neighbor problem. This algorithm could be changed into a more efficient one. Such an optimization would enable larger test suites to be run without the current overhead burden.


Bibliography

[Ammann and Knight, 1988] P.E. Ammann and J.C. Knight. Data diversity: An approach to software fault tolerance. IEEE Transactions on Computers, 37:418–425, 1988.

[Arcuri et al., 2010] Andrea Arcuri, Muhammad Zohaib Iqbal, and Lionel Briand. Black-box system testing of real-time embedded systems using random and search-based testing. In IFIP International Conference on Testing Software and Systems (ICTSS), 2010.

[Bach, 2002] James Bach. Exploratory testing explained, 2002. http://www.satisfice.com/articles/et-article.pdf.

[Bishop et al., 1993] P.G. Bishop, Coborn House, and E Da. The variation of software survival time for different operational input profiles (or why you can wait a long time for a big bug to fail). In Proc. FTCS-23, pages 98–107, 1993.

[Burgundy, 2010] Burgundy. Trading functionality guide v.1 november 2010, 2010. http://www.burgundy.se/documents.

[Chan et al., 1996] F. T. Chan, T. Y. Chen, I. K. Mak, and Y. T. Yu. Proportional sampling strategy: guidelines for software testing practitioners. Information and Software Technology, 38(12):775–782, 1996.

[Chen and Merkel, 2007] Tsong Yueh Chen and R. Merkel. Quasi-random testing. Reliability, IEEE Transactions on, 56(3):562–568, September 2007.

[Chen et al., 2004] T. Y. Chen, R. Merkel, G. Eddy, and P. K. Wong. Adaptive random testing through dynamic partitioning. In Proceedings of the Quality Software, Fourth International Conference, QSIC ’04, pages 79–86, Washington, DC, USA, 2004. IEEE Computer Society.

[Chen et al., 2005] T.Y. Chen, H. Leung, and I.K. Mak. Adaptive random testing. In Michael Maher, editor, Advances in Computer Science - ASIAN 2004, volume 3321 of Lecture Notes in Computer Science, pages 3156–3157. Springer Berlin / Heidelberg, 2005. 10.1007/978-3-540-30502-6_23.


[Chen et al., 2009] Tsong Yueh Chen, Fei-Ching Kuo, and Huai Liu. Adaptive random testing based on distribution metrics. Journal of Systems and Software, 82(9):1419–1433, 2009. SI: QSIC 2007.

[Chen et al., 2010] T. Y. Chen, Fei-Ching Kuo, Robert G. Merkel, and T. H. Tse. Adaptive random testing: the art of test case diversity. Journal of Systems and Software, 83(1):60–66, 2010.

[Cinnober, 2010] Cinnober. Burgundy fix specification 4.4, November 2010.

[Dadeau et al., 2008] Frédéric Dadeau, Pierre-Cyrille Héam, and J. Levrey. A combination of model-based testing and random testing approaches using automata. Research Report RR2008-10, LIFC - Laboratoire d’Informatique de l’Université de Franche Comté, October 2008. 21 pages.

[Dijkstra, 1972] Edsger W. Dijkstra. The humble programmer. Commun. ACM, 15(10):859–866, 1972. Turing Award lecture.

[FIXProtocol, 2008] FIXProtocol. Financial information exchange protocol (fix), version 1.1 errata, fix session protocol, March 2008. http://fixprotocol.org/what-is-fix.shtml.

[FIXProtocol, 2010a] FIXProtocol. Fiximate3.0, 2010. http://www.fixprotocol.org/FIXimate3.0/.

[FIXProtocol, 2010b] FIXProtocol. What is fix?, 2010. http://fixprotocol.org/what-is-fix.shtml.

[Godefroid et al., 2005] Patrice Godefroid, Nils Klarlund, and Koushik Sen. Dart: directed automated random testing. In PLDI, pages 213–223, 2005.

[Groce and Joshi, 2008] Alex Groce and Rajeev Joshi. Random testing and model checking: building a common framework for nondeterministic exploration. ACM Press, 2008.

[Hamlet, 1994] Richard Hamlet. Random testing. In Encyclopedia of Software Engineering, pages 970–978. Wiley, 1994.

[Harman et al., 2009] Mark Harman, S. Afshin Mansouri, and Yuanyuan Zhang. Search based software engineering: A comprehensive analysis and review of trends techniques and applications, April 2009. Department of Computer Science, King’s College London.

[Heam and Nicaud, 2009] Pierre-Cyrille Heam and Cyril Nicaud. Seed: an easy to use random generator of recursive data structures for testing. Research Report, 2009.

[Hoffman, 2001] Douglas Hoffman. Using oracles in test automation. www.SoftwareQualityMethods.com/H-Papers.html, 2001. 2001 Pacific Northwest Software Quality Conference (PNSQC 2001).


[Höjeberg, 2007] Noah Höjeberg. Random tests in a trading system. Master’s thesis, School of Computer Science and Communication, Royal Institute of Technology, Stockholm, Sweden, 2007.

[junit.org, 2011] junit.org, 2011. http://www.junit.org/.

[Karlgren, 2009] David Karlgren. Random testing of a market place system. Master’s thesis, Division of Mathematical Statistics, Royal Institute of Technology, Stockholm, Sweden, 2009.

[Knuth, 1973] D.E. Knuth. Sorting and searching. Art of Computer Programming. Addison-Wesley, 1973.

[Lozano, 2010] Roberto Castaneda Lozano. Constraint programming for random testing of a trading system. Master’s thesis, Stockholm, Sweden, 2010. School of Information and Communication Technology, Royal Institute of Technology.

[NASA, 2004] NASA. NASA Software Safety Guidebook. 2004.

[NasdaqOMX, 2010] NasdaqOMX. Market maker obligations, 2010. http://nordic.nasdaqomxtrader.com/trading/optionsfutures/Market_Making/Market_Maker_Obligations/SEK/.

[Robinson, 2000] Harry Robinson. Intelligent test automation. Software Testing & Quality Engineering, 5:24–32, 2000.

[Schauer, 2006] Philipp Martin Schauer. Market Architecture of The Largest Stock Exchanges. PhD thesis, Institut für Banken und Finanzen, Fakultät für Betriebswirtschaft, 2006.

[Spillner et al., 2006] Andreas Spillner, Tilo Linz, and Hans Schaefer. Software Testing Foundations. dpunkt.verlag, Heidelberg, Germany, 2006.

[Utting et al., 2006] Mark Utting, Alexander Pretschner, and Bruno Legeard. A taxonomy of model-based testing. Technical report, April 2006.


1 Appendix 1 - FIX Message Definitions

1.1 NewOrderSingle FIX Message Definition

NewOrderSingle

Tag or

Compo-nent

Field Name Req’d Comments

Component StandardHeader y MsgType = D

11 ClOrdID y Unique identifier of the

or-der as assigned by institution or by the intermediary (CIV term, not a hub/service bu-reau) with

closest association with the investor.

526 SecondaryClOrdID

583 ClOrdLinkID

Component Parties Insert here the set of "Parties"

(firm identification) fields de-fined in "Common Compo-nents of Application Messages" 229 TradeOriginationDate 75 TradeDate 1 Account 660 AcctIDSource

581 AccountType Type of account associated

with the order (Origin)

589 DayBookingInst

590 BookingUnit

591 PreallocMethod

70 AllocID Used to assign an overall

allo-cation id to the block of pre-allocations

Component PreAllocGrp Number of repeating groups

for pre-trade allocation

63 SettlType

64 SettlDate Takes precedence over

Settl-Type value and conditionally required/omitted for specific SettlType values.

544 CashMargin

(47)

21 HandlInst

18 ExecInst Can contain multiple

instruc-tions, space delimited. If Or-dType=P, exactly one of the following values (ExecInst = L, R, M, P, O, T, W, a, d) must be specified. 110 MinQty 111 MaxFloor 100 ExDestination

Component TrdgSesGrp Specifies the number of

re-peating TradingSessionIDs

81 ProcessCode Used to identify soft trades at

order entry.

Component Instrument y Insert here the set of

"In-strument" (symbology) fields defined in "Common Compo-nents of Application

Messages"

Component FinancingDetails Insert here the set of "Financ-ingDetails" (symbology) fields defined in "Common Compo-nents of

Application Messages"

Component UndInstrmtGrp Number of underlyings

140 PrevClosePx Useful for verifying security

identification

54 Side y

114 LocateReqd Required for short sell orders

60 TransactTime y Time this order request was

initiated/released by the trader, trading system, or intermediary.

Component Stipulations Insert here the set of

"Stip-ulations" (repeating group of Fixed Income stipulations) fields defined in "Common Components of Application Messages"

(48)

Component OrderQtyData y Insert here the set of "Or-derQtyData" fields defined in "Common Components of Ap-plication Messages"

40 OrdType y

423 PriceType

44 Price Required for limit OrdTypes.

For F/X orders, should be the "all-in" rate (spot rate adjusted for forward points). Can be used

to specify a limit price for a pegged order, previously indi-cated, etc.

99 StopPx Required for OrdType =

"Stop" or OrdType = "Stop limit".

Component SpreadOrBenchmarkCurveData Insert here the set of "SpreadOrBenchmarkCurve-Data" (Fixed Income spread or

benchmark curve) fields de-fined in "Common Compo-nents of Application Mes-sages"

Component YieldData Insert here the set of

"Yield-Data" (yield-related) fields de-fined in "Common Compo-nents of Application

Messages"

15 Currency

376 ComplianceID

377 SolicitedFlag

23 IOIID Required for Previously

Indi-cated Orders (OrdType=E)

117 QuoteID Required for Previously

Quoted Orders (OrdType=D)

59 TimeInForce Absence of this field indicates

Day order

168 EffectiveTime Can specify the time at which

the order should be considered valid

(49)

432 ExpireDate Conditionally required if

TimeInForce = GTD and

ExpireTime is not specified.

126 ExpireTime Conditionally required if

TimeInForce = GTD and

ExpireDate is not specified.

427 GTBookingInst States whether executions are

booked out or accumulated on a partially filled GT order

Component CommissionData Insert here the set of

"Com-missionData" fields defined in "Common Components of Ap-plication

Messages"

528 OrderCapacity

529 OrderRestrictions

582 CustOrderCapacity

121 ForexReq Indicates that broker is

re-quested to execute a Forex accommodation trade in con-junction with the security trade.

120 SettlCurrency Required if ForexReq = Y.

775 BookingType Method for booking out this

order. Used when notifying a broker that an order to be set-tled by that broker is to be booked out as an OTC deriva-tive (e.g. CFD or similar). Absence of this field implies regular booking.

58 Text

354 EncodedTextLen Must be set if EncodedText

field is specified and must im-mediately precede it.

355 EncodedText Encoded (non-ASCII

charac-ters) representation of the Text field in the encoded for-mat specified via the

MessageEncoding field.

193 SettlDate2 Can be used with OrdType

= "Forex - Swap" to specify the "value date" for the future portion of a F/X swap.

(50)

192 OrderQty2 Can be used with OrdType = "Forex - Swap" to specify the order quantity for the future portion of a F/X swap.

640 Price2 Can be used with OrdType =

"Forex - Swap" to specify the price for the future portion of a F/X swap which is also a limit

order. For F/X orders, should be the "all-in" rate (spot rate adjusted for forward points).

77 PositionEffect For use in derivatives omnibus

accounting

203 CoveredOrUncovered For use with derivatives, such

as options

210 MaxShow

Component PegInstructions Insert here the set of

"Pe-gInstruction" fields defined in "Common Components of Ap-plication Messages"

Component DiscretionInstructions Insert here the set of "Discre-tionInstruction" fields defined in "Common Components of Application

Messages"

847 TargetStrategy The target strategy of the

or-der

848 TargetStrategyParameters For further specification of the TargetStrategy

849 ParticipationRate Mandatory for a TargetStrategy=Participate order and specifies the target participation rate. For other order types optionally specifies a volume limit (i.e. do not be more than this percent of the market volume)

480 CancellationRights For CIV - Optional

481 MoneyLaunderingStatus

513 RegistID Reference to Registration Instructions message for this Order.


494 Designation Supplementary registration information for this Order

Component StandardTrailer y

Table .1: Definition of the message type NewOrderSingle in FIX 4.4. [FIXProtocol, 2010a]
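Several rows of the table above state conditional requirements rather than unconditional ones, for example that SettlCurrency (120) is required if ForexReq (121) = Y, and that EncodedTextLen (354) must immediately precede EncodedText (355). As an illustration only, the following Python sketch shows how such rules could be checked on a generated NewOrderSingle message. The function names and the example message are hypothetical and are not taken from the system under test; the sketch also assumes that field values do not themselves contain the SOH delimiter.

```python
SOH = "\x01"  # standard FIX field delimiter


def parse_fields(raw):
    """Split a raw tag=value FIX string into an ordered list of (tag, value) pairs."""
    pairs = []
    for part in raw.split(SOH):
        if not part:
            continue  # skip the empty piece after a trailing delimiter
        tag, _, value = part.partition("=")
        pairs.append((int(tag), value))
    return pairs


def check_conditional_fields(pairs):
    """Return descriptions of violated conditional rules (hypothetical helper)."""
    tags = [tag for tag, _ in pairs]
    values = dict(pairs)
    problems = []

    # 120 SettlCurrency: required if 121 ForexReq = Y.
    if values.get(121) == "Y" and 120 not in values:
        problems.append("SettlCurrency (120) missing although ForexReq (121) = Y")

    # 354 EncodedTextLen: must be set if 355 EncodedText is specified
    # and must immediately precede it.
    if 355 in values:
        position = tags.index(355)
        if position == 0 or tags[position - 1] != 354:
            problems.append("EncodedTextLen (354) must immediately precede EncodedText (355)")

    return problems


# Example with a made-up message fragment (MsgType D = NewOrderSingle).
example = SOH.join(["35=D", "121=Y", "355=abc", "354=3"]) + SOH
for problem in check_conditional_fields(parse_fields(example)):
    print(problem)
```

A check of this kind only covers the two rules named in the comments; the remaining conditional rows of the table would need their own clauses.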

1.2 ExecutionReport FIX Message Definition

ExecutionReport

Tag or Component Field Name Req’d Comments

Component StandardHeader y MsgType = 8

37 OrderID y OrderID is required to be unique for each chain of orders.

198 SecondaryOrderID Can be used to provide order id used by exchange or executing system.

526 SecondaryClOrdID

527 SecondaryExecID

11 ClOrdID Required for executions against electronically submitted orders which were assigned an ID by the institution or intermediary. Not required for orders manually entered by the broker or fund manager (for CIV orders).

41 OrigClOrdID Conditionally required for response to an electronic Cancel or Cancel/Replace request (ExecType=PendingCancel, Replace, or Canceled). ClOrdID of the previous accepted order (NOT the initial order of the day) when canceling or replacing an order.
