Smart contracts analysis - How solid is solidity? An in-dept study of solidity’s type safety.

Many recent works address smart contract implementations from a formal point of view. There are two possible approaches: analyzing the EVM bytecode, or working at a higher level, for example on Solidity code. Both have pros and cons. The former has the advantage that the bytecode is stable: it does not change over time, or does so only in a quite limited way. Many high-level languages can be designed but, in order to run on Ethereum, all of them must ultimately be “translated” into bytecode. Programmers, however, do not develop programs in bytecode; they use high-level languages such as Solidity. Hence, formalizing and working on these languages narrows the gap between programmers and Ethereum. Patterns and best practices can be tailored to one language, and formal methods can help programmers write code that respects these patterns. Nonetheless, languages like Solidity are less mature, and changes may introduce deep modifications from one release to another. Furthermore, the compiler also has to be analyzed and proven bug-free. Currently, the Solidity compiler is written in C++, and importing its definition into a theorem prover is nearly impossible. In fact, the whole C++11 language has not been formalized yet, although some of its hardest aspects, such as concurrency [7] or inheritance [41], have been addressed. Hence, formal verification of Solidity code has to operate outside theorem provers: this is possible, but rather difficult, since many aspects of Solidity’s semantics might change over time. The approach we chose is to formalize a calculus modeling the core part of this language, proving properties of its type system and proposing extensions that address some of its flaws.

^5 https://isabelle.in.tum.de/

Program analysis can be either static or dynamic. The latter is basically like testing, and can reveal only the presence of bugs, not prove their absence. The former, instead, examines the code without running it. The process provides an understanding of the code structure, and can help ensure that the code adheres to certain properties. Usually, static analysis behaves as follows:

1. an intermediate representation (IR), such as an abstract syntax tree, is built from the source code;

2. the IR is enriched with additional information, using algorithms such as control- and data-flow analysis, taint analysis, symbolic execution, or abstract interpretation, depending on what the IR is used for;

3. vulnerabilities are detected by matching the IR against a database of patterns, which define vulnerability criteria in IR terms.
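The three steps above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the one-statement-per-line toy language, the `Node` class, the taint rule, and the single pattern are hypothetical and do not reproduce any of the tools discussed in this section.

```python
from dataclasses import dataclass

TAINT_SOURCE = "block.timestamp"  # assumed taint source for this toy example

@dataclass
class Node:
    """One IR node: a statement kind, its arguments, and a taint annotation."""
    kind: str
    args: list
    tainted: bool = False

def build_ir(source_lines):
    """Step 1: build the IR from toy source lines (stand-in for a real parser)."""
    ir = []
    for line in source_lines:
        kind, *args = line.split()
        ir.append(Node(kind, args))
    return ir

def enrich(ir):
    """Step 2: a simple forward taint analysis (variables are never untainted)."""
    tainted = {TAINT_SOURCE}
    for node in ir:
        if node.kind == "assign":
            target, *sources = node.args
            if any(s in tainted for s in sources):
                tainted.add(target)
        node.tainted = any(a in tainted for a in node.args)
    return ir

# Step 3: the pattern database maps a name to a predicate over enriched nodes.
PATTERNS = {
    "timestamp-dependent-transfer": lambda n: n.kind == "transfer" and n.tainted,
}

def detect(ir):
    """Return (pattern name, node index) for every node matching a pattern."""
    return [(name, i) for i, n in enumerate(ir)
            for name, matches in PATTERNS.items() if matches(n)]

toy = [
    "assign seed block.timestamp",
    "assign winner seed",
    "assign fee amount",
    "transfer winner",
    "transfer fee",
]
findings = detect(enrich(build_ir(toy)))
print(findings)
```

Only the transfer whose argument derives from the timestamp is reported; the transfer of `fee` is not, because `amount` was never tainted.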

3.3.1 Static analysis

Le et al. [29] address the conditional termination of smart contracts. Even though Solidity uses gas to make sure that every function eventually terminates (possibly with a revert due to an out-of-gas exception), a mechanism to prove conditional termination can be useful in other languages that compile down to EVM bytecode. Furthermore, it can also be applied to Solidity contracts to solve the problem by construction: letting a contract compile when one or more of its functions are guaranteed to terminate only thanks to gas is a waste of money, and also an error that should be corrected as soon as possible. FS assumes that every program terminates, but this work could be included to prove conditional termination.

One of the first static analysis tools for Solidity is OYENTE [31]. Luu et al. [31] provided a list of common vulnerabilities, proposing a better design (which requires all clients in the network to upgrade) and a tool to help programmers develop better contracts. OYENTE is based on symbolic execution, which represents each program variable as a symbolic expression. Each execution path is then expressed as a logic formula built over the symbolic expressions. For the execution to follow a path, the actual values must satisfy that formula; if no values satisfy those constraints, the path will never be taken at run-time. Symbolic execution can achieve, in principle, better precision and a lower false-positive rate (with respect to, for example, taint or data-flow analysis) but, in general, it also gets lower code coverage. OYENTE’s main aim was to show that the semantics of Solidity is subtle, and that the vulnerabilities Luu et al. identified actually occur in practice. Their expectations were confirmed: out of 19366 analyzed contracts, 8833 contained at least one vulnerability (according to OYENTE). It was possible to collect the source code of only 175 of these contracts, and the reported false-positive rate was 6.4% (10 cases).
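The notion of a path condition can be made concrete with a small Python sketch. It is a simplification under stated assumptions: the toy program, the `PATHS` table, and `find_witness` are hypothetical, and satisfiability is checked by brute force over a small integer domain, whereas real symbolic executors delegate this check to a constraint solver.

```python
from itertools import product

# Toy program over two integer inputs x and y:
#   if x > 5:
#       if x + y == 3: A  else: B
#   else:
#       if x > 10: C      else: D
# Each path is described by the list of branch conditions it must satisfy.
PATHS = {
    "A": [lambda x, y: x > 5, lambda x, y: x + y == 3],
    "B": [lambda x, y: x > 5, lambda x, y: x + y != 3],
    "C": [lambda x, y: x <= 5, lambda x, y: x > 10],
    "D": [lambda x, y: x <= 5, lambda x, y: x <= 10],
}

def find_witness(path_condition, domain=range(-8, 9)):
    """Return inputs satisfying every constraint, or None if the domain has none.

    Brute force stands in for a solver here, so None only means
    "no witness in the bounded domain".
    """
    for x, y in product(domain, repeat=2):
        if all(constraint(x, y) for constraint in path_condition):
            return (x, y)
    return None

feasible = {name: find_witness(pc) for name, pc in PATHS.items()}
for name, witness in feasible.items():
    print(name, "infeasible" if witness is None else f"reachable with {witness}")
```

Path C requires both x <= 5 and x > 10, so no input can drive execution along it; the other three paths each come with concrete inputs that exercise them.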

OYENTE has been extended in many ways. One of them is ETHIR [2], by Albert et al., a tool for decompilation of EVM bytecode into a high-level representation in a rule-based form. This form makes it easier to apply the existing tools to infer properties of the bytecode, because the control and the data flow are explicit. It does so by initially using OYENTE to produce a set of blocks that store the information needed to represent the control flow graph of a set of EVM instructions. It then translates this graph (or better, each block of this graph) into a rule-based representation.

Rosu [43] collects the recent progress in using the K framework to verify smart contract semantics. The work formalized the semantics of Vyper in K, and found several bugs and inconsistencies. Vyper^7 is a novel programming language for smart contracts, compiling to EVM, that aims for increased security, simplicity, and human readability. Along with Vyper, a novel consensus protocol, Casper [11], is being developed. It is meant to save wasteful electricity expenditures and at the same time provide greatly increased security. Verifying the behavior of Casper and Vyper code is very important, since they will play a key role for Ethereum in the near future. The actual protocol is also being formalized in Coq^8 and Isabelle.

Chen et al. have identified first 7 [13] and then 24 [12] anti-patterns in smart contract design. They define an anti-pattern as an EVM operation sequence that can be replaced with another one that has the same semantics but needs less gas. Hence, their work focuses on finding methods to detect and reduce the waste of gas, making Ethereum users spend less money. It might seem that detecting gas waste is not as crucial as detecting other vulnerabilities, but this is not true. Smart contracts are so critical that their implementation should be deeply reasoned about and optimized, so as not to perform unnecessary operations that may introduce bugs. Chen et al. [12] have developed GASREDUCER, a tool that analyzes contracts looking for these anti-patterns. They have analyzed all the deployed smart contracts (i.e., 599,959 as of 10 June 2017), and detected 9,490,768 instances of anti-patterns wasting 2,040,892,224 units of gas. The calculus we propose can integrate these anti-pattern definitions to prove that a given contract does not suffer from any of them. In fact, GASREDUCER currently works on the bytecode. “Merging” it with Featherweight Solidity would mean reducing the gap between EVM code and Solidity code.
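The idea of replacing an operation sequence with a cheaper equivalent can be sketched as a peephole pass in Python. The three rules below are illustrative assumptions chosen because they are stack no-ops; they are not claimed to be the anti-patterns catalogued by Chen et al. The sketch also ignores that removing instructions shifts code offsets, so in real bytecode jump targets would have to be adjusted, and it assumes the original code does not rely on stack underflow or overflow exceptions.

```python
# Static gas cost of a few EVM opcodes (POP is in the "base" tier, the rest "very low").
GAS = {"PUSH1": 3, "POP": 2, "SWAP1": 3, "DUP1": 3, "ADD": 3}

# Illustrative rules: opcode sequences that leave the stack unchanged and
# can therefore be deleted.
RULES = [
    ("SWAP1", "SWAP1"),
    ("PUSH1", "POP"),
    ("DUP1", "POP"),
]

def gas(code):
    """Sum the static gas cost of a list of (opcode, *operands) tuples."""
    return sum(GAS[ins[0]] for ins in code)

def optimize(code):
    """Remove every rule match, re-checking after each removal for new matches."""
    out = []
    for ins in code:
        out.append(ins)
        changed = True
        while changed:
            changed = False
            for pattern in RULES:
                n = len(pattern)
                if tuple(i[0] for i in out[-n:]) == pattern:
                    del out[-n:]
                    changed = True
    return out

code = [("PUSH1", 1), ("PUSH1", 2), ("SWAP1",), ("PUSH1", 0), ("POP",),
        ("SWAP1",), ("ADD",)]
optimized = optimize(code)
print(optimized, gas(code) - gas(optimized))
```

Deleting the `PUSH1 0, POP` pair makes the two `SWAP1` instructions adjacent, so the second rule application only becomes possible after the first; this is why the pass re-checks the tail of the output after every removal.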

Tikhomirov et al. [49] provide a static analysis tool, SMARTCHECK, using lexical and syntactical analysis on Solidity source code. It generates an XML parse tree as an intermediate representation, and detects vulnerability patterns by running XPath^9 queries on it. The tool thus provides full coverage: the analyzed code is fully translated to the IR, and all its elements can be reached with XPath matching. The advantage of this method is that new languages can be added leaving the IR-level algorithms unchanged. On the other hand, XPath queries can easily lead to false positives (when applied, for instance, to reentrancy or timestamp dependencies). The result of their experiment, conducted on 4600 contracts, reveals that SMARTCHECK yields more true positives than OYENTE and SECURIFY. This means that SMARTCHECK is well suited to detecting certain kinds of vulnerabilities, but it also incurs more false positives due to the use of XPath.
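Pattern detection by XPath matching on an XML parse tree can be sketched with Python's standard `xml.etree.ElementTree` module, which supports a limited XPath subset. The XML schema below (element and attribute names such as `memberAccess`, `object`, `member`, `line`) and the two rules are assumptions made for illustration; they are not the IR or the rule set actually used by SMARTCHECK.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML parse tree of a small contract.
PARSE_TREE = """
<contract name="Lottery">
  <function name="draw">
    <memberAccess object="block" member="timestamp" line="12"/>
    <memberAccess object="msg" member="sender" line="13"/>
  </function>
  <function name="withdraw">
    <memberAccess object="tx" member="origin" line="20"/>
  </function>
</contract>
"""

# Each rule is an XPath query describing a suspicious syntactic pattern.
RULES = {
    "timestamp-dependence": ".//memberAccess[@object='block'][@member='timestamp']",
    "tx-origin": ".//memberAccess[@object='tx'][@member='origin']",
}

def scan(xml_text):
    """Run every XPath rule on the tree and collect (rule, line) findings."""
    root = ET.fromstring(xml_text)
    findings = []
    for rule, xpath in RULES.items():
        for node in root.findall(xpath):
            findings.append((rule, int(node.get("line"))))
    return findings

findings = scan(PARSE_TREE)
print(findings)
```

The sketch also hints at why this approach is prone to false positives: the first rule flags every syntactic occurrence of `block.timestamp`, whether or not the value influences a critical operation.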

In blockchain, an invocation is a run of a smart contract. Depending on the actual input values, the execution path may vary. Hence, looking at a single invocation is not enough to discover all the vulnerabilities a contract can suffer from. An alternative approach consists of looking at a trace of invocations. Nikolic et al. [35] have developed a tool, MAIAN, using systematic techniques to find contracts that violate specific properties of traces. Violations are either of safety properties, asserting that there exists a trace from a specified blockchain state that causes the contract to violate certain conditions, or of liveness properties, asserting whether some actions cannot be taken in any execution starting from a specified blockchain state. They defined three categories of contracts:

^7 https://github.com/ethereum/vyper
^8 https://coq.inria.fr/

• the greedy, those contracts that remain alive and lock Ether indefinitely, allowing it to be released under no conditions;

• the prodigal, those contracts that return funds to an arbitrary address, that is, to accounts with which they had no prior relation;

• the suicidal, those contracts that invoke the SUICIDE instruction (which terminates a contract’s life), transferring Ether to an account with which they had no prior relation.

Their approach extends OYENTE by adding a semantics that takes invocation traces into account. They also formalize, as logical formulas, the properties characterizing greedy, prodigal, and suicidal contracts, and use symbolic execution to check their satisfiability. They analyzed 970,898 smart contracts, obtained by downloading the Ethereum blockchain from the first block until block number 4,799,998. Out of these, it was only possible to download the source code of 9,825 contracts (about 1% of the total), highlighting the usefulness of analyzing the bytecode. The percentage of true positives, with respect to prodigal and suicidal contracts, is 97% and 99%, respectively. On the other hand, greedy contracts have a false-positive rate of 31%, which is quite high. This is due to many causes, but in general finding a trace that leads to an Ether transfer may require three or more invocations.
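Why single invocations are not enough, and why the trace length matters, can be shown with a toy Python model. It is a sketch under strong assumptions: the contract, its three functions, the `ATTACKER` address, and the state fields are hypothetical, and traces are enumerated concretely up to a bound instead of being explored symbolically as MAIAN does.

```python
from itertools import product

ATTACKER = "0xattacker"  # hypothetical address with no prior relation to the contract

def initial_state():
    return {"owner": "0xowner", "balance": 10, "alive": True, "leaked": 0}

# Each toy function takes (state, sender) and returns the next state.
# A revert is modeled by returning the state unchanged.
def init_owner(s, sender):
    # Deliberate bug: anyone may claim ownership.
    return {**s, "owner": sender}

def withdraw(s, sender):
    if sender != s["owner"]:
        return s
    gain = s["balance"] if sender == ATTACKER else 0
    return {**s, "balance": 0, "leaked": s["leaked"] + gain}

def kill(s, sender):
    if sender != s["owner"]:
        return s
    return {**s, "alive": False}

FUNCTIONS = {"init_owner": init_owner, "withdraw": withdraw, "kill": kill}

def find_trace(violates, max_depth):
    """Return the first shortest trace of attacker invocations reaching a violating state."""
    for depth in range(1, max_depth + 1):
        for trace in product(FUNCTIONS, repeat=depth):
            state = initial_state()
            for name in trace:
                state = FUNCTIONS[name](state, ATTACKER)
            if violates(state):
                return trace
    return None

prodigal = find_trace(lambda s: s["leaked"] > 0, max_depth=3)
suicidal = find_trace(lambda s: not s["alive"], max_depth=3)
print(prodigal, suicidal)
```

With `max_depth=1` neither violation is found, because both require the attacker to call `init_owner` first; a bound that is too small therefore hides the problem.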

Another static analyzer is SECURIFY [51], by Tsankov et al. It was born from two key observations: first, symbolic execution, on which other tools are based, has many false positives, requires a long time to inspect large contracts, and usually gets low code coverage; secondly, many security properties can be expressed as patterns on the data flow graph. SECURIFY states such properties with two kinds of patterns: compliance patterns, which imply the satisfaction of the property, and violation patterns, which imply its negation. This tool works as follows: first it parses the EVM code, decompiling it into static single-assignment form, then it looks for violation patterns. Such patterns are, for example, writing to storage after having invoked another function, or not validating the arguments. The results of their experiments show that this approach is more effective than symbolic execution.
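The interplay of compliance and violation patterns can be sketched in Python for the property "no storage write after an external call". This is a simplified reading and an assumption of this sketch, not SECURIFY's actual pattern language: each external call is described by the (over-approximated, possibly infeasible) instruction paths that may follow it, and the function name `classify` and the three verdict strings are hypothetical.

```python
def classify(paths_after_call):
    """Classify one external call given the instruction paths that may follow it.

    - compliance pattern: no path contains a storage write (a write cannot follow);
    - violation pattern: every path contains a storage write (a write must follow);
    - otherwise neither pattern matches and the result is left undecided.
    """
    writes = ["SSTORE" in path for path in paths_after_call]
    if not any(writes):
        return "compliant"
    if all(writes):
        return "violation"
    return "warning"

print(classify([["PUSH1", "RETURN"]]))                     # no write may follow
print(classify([["SSTORE", "STOP"], ["PUSH1", "SSTORE"]]))  # a write must follow
print(classify([["SSTORE", "STOP"], ["STOP"]]))             # undecided
```

The third verdict is what distinguishes this scheme from a single pattern check: when only some of the approximated paths write to storage, the analysis reports neither a proof of safety nor a definite violation.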

Alongside these tools there are many other static analyzers well suited to certain security properties. Examples are Zeus [27], by Kalra et al., and EtherTrust [22], by Grishchenko, Maffei, and Schneidewind. They operate at different levels, analyzing Solidity code, abstract implementations of the latter, or EVM bytecode. However, static analysis often relies on heuristics, leaving room for false positives or negatives. Furthermore, the adoption of these tools could be limited: many programmers could ignore them, making the research progress useless. We think that security properties, as well as contract invariants or safety properties, have to be checked directly by the compiler whenever possible. Smart contracts are not like normal programs that may be patched if something turns out to be wrong. Once a contract is deployed, no modifications or patches can be applied. These programs must be correct by construction, and we strongly believe that analysis tools should play a fundamental role during the development phase. This is the reason why we decided to rely on the compiler. By operating on the type system we make the compilation (as well as the compiler itself) more complex and selective but, on the other hand, we are able to rule out or detect dangerous patterns. Of course, not every property is enforceable at compile-time, at least not without making the language excessively complex. “Heavy” type systems have the only effect of making a language too verbose, thus giving programmers a reason to abandon it. As we saw in this section, there are many static analysis tools working on the code after its development, but none of them targets the type system. Solidity has the precise and explicit aim of being a type-safe language and, to the best of our knowledge, this is the first work aiming to prove so, while also proposing some modifications to make the language sounder.

3.3.2 Dynamic analysis

Jiang, Liu, and Chan [26] have developed CONTRACTFUZZER, a tool analyzing smart contracts with fuzzing in order to detect common vulnerabilities. Fuzzing is based on the dynamic generation of a set of values used as input for the program under test. Along with reentrancy and gasless send (i.e., invoking a fallback function that requires more than 2300 gas), it is designed to detect other, less common, vulnerabilities, such as timestamp or block number dependency (i.e., using the block timestamp or number, respectively, for critical operations, such as random number generation) as well as the dangerous use of delegatecall (discussed in Section 2.4.2). The results of CONTRACTFUZZER, applied to 6991 contracts, are very good: it produces false positives very rarely, and only when detecting timestamp or block number dependency. When compared to OYENTE, CONTRACTFUZZER has a lower rate of false positives and a higher rate of false negatives. The former is due to the difficulty, for OYENTE, of symbolically analyzing certain types of operations. The latter arises because CONTRACTFUZZER relies on dynamically generated inputs to test a smart contract, a process that could require a long time to detect something. Hence, with a limited analysis time, some bugs are not detected. Furthermore, dynamic analysis can only prove the presence of vulnerabilities, not their absence, and this is an important limitation in this context.
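The principle of fuzzing with a test oracle can be sketched in a few lines of Python, using the gasless send case mentioned above. The model is an assumption made for illustration: `send` and `fuzz` are hypothetical helpers, callees are reduced to the gas their fallback function needs, and inputs are drawn uniformly at random, which is far simpler than the input generation of CONTRACTFUZZER.

```python
import random

GAS_STIPEND = 2300  # gas available to the callee's fallback function on a send

def send(fallback_gas_needed):
    """Toy model of send: it succeeds only if the fallback fits in the stipend."""
    return fallback_gas_needed <= GAS_STIPEND

def fuzz(trials, seed=0):
    """Generate random callees and return the first input that makes send fail.

    The test oracle is "a send failed for lack of gas". Returning None only
    means that no failing input was generated within the given budget.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        fallback_gas = rng.randint(0, 10_000)
        if not send(fallback_gas):
            return fallback_gas
    return None

hit = fuzz(trials=200)
print("gasless send triggered with fallback gas =", hit)
```

Calling `fuzz(trials=0)` returns `None` even though the vulnerability exists, which mirrors the limitation stated above: a fuzzer with an insufficient budget reports nothing, and its silence proves nothing.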

A similar work, even though very preliminary and limited to reentrancy vulnerabilities, comes from Liu et al. [30]. They have developed a fuzzing tool, REGUARD, focused on finding reentrancy, and analyzed 5 contracts, each for 20 minutes, comparing their results with those of OYENTE. It turned out that REGUARD suffers from fewer false positives and false negatives than OYENTE. Even though the results are encouraging, the test is not significant enough to conclude that REGUARD may be a useful (or complementary) tool.

Fuzzing (and dynamic analysis in general) is very dependent on the amount of test time, which cannot be too high. However, tools like REGUARD or CONTRACTFUZZER may serve as a first alarm. They are certainly more lightweight than more complex tools operating statically on the code (source code or bytecode), but they are not a replacement for them.
