Chapter 12 Formal Specification

12.7 INFORMATION-FLOW ANALYSIS

The concept of information flow was introduced in section 9.6 as a way of addressing deficiencies in the state-machine modeling technique, where the concept of a secure state and constraints on state transitions are insufficient to prevent certain nonsecure information flows, such as covert channels, while permitting legitimate functions. Information-flow analysis is a general technique for analyzing leakage paths in a system (Lampson 1973; Denning 1983); it is applicable to any security model. The technique can be applied to programs or to specifications, although the rules governing the two applications are different. At present, we shall discuss how to apply information-flow analysis to nonprocedural formal specifications, in order to support the proof that a specification meets a mandatory multilevel security policy. Later, in section 12.7.2, we shall briefly discuss the use of flow analysis with programs.

Before beginning any flow analysis effort, you must realize that the flow analysis of the specification, like any other proof of the specification, is only meaningful to the extent that the implementation corresponds to the specification. While this should be an obvious point, many people seem to focus on flow analysis as being particularly vulnerable to deficiencies in the state of the art of proving correspondence, when in fact flow analysis is no more vulnerable than other techniques.

You might convince yourself of the need for flow analysis by noting that our example specification in figure 12-4 has several covert channels. The example allows for a number of write-downs (see section 6.4.4), by permitting the actions of a process at a high access class to be detected by a process at a lower access class. One such case is in the file_exists array, where a high process can create a file and a lower process can determine that the file already exists by trying to recreate the file and noting that the access array did not change. (Although they are not shown, we presume that the complete system has functions that return information about what accesses are allowed, either by asking directly or by attempting an access and getting a failure.)

Using the multilevel security policy as our requirement, we find that the complete statement of an information-flow policy is very obvious:

Flow Policy: If information flows from object A to object B in a state transition, the access class of B must dominate the access class of A.

It seems apparent that this policy fulfills the intent of the multilevel security policy.
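The flow policy can be sketched in a few lines of Python. This is only an illustration: it assumes an access class consists of a numeric level plus a set of categories, and the names AccessClass, dominates, and flow_is_safe are hypothetical, not taken from the specification in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessClass:
    # Assumed representation: a hierarchical level plus a category set.
    level: int
    categories: frozenset = frozenset()

    def dominates(self, other: "AccessClass") -> bool:
        # Dominance requires a level at least as high and a superset of categories.
        return self.level >= other.level and self.categories >= other.categories


def flow_is_safe(source: AccessClass, target: AccessClass) -> bool:
    """Flow policy: a flow A -> B is safe only if class(B) dominates class(A)."""
    return target.dominates(source)


# Hypothetical access classes used only for this illustration.
LOW = AccessClass(0)
SECRET_NATO = AccessClass(2, frozenset({"NATO"}))
SECRET_CRYPTO = AccessClass(2, frozenset({"CRYPTO"}))

print(flow_is_safe(LOW, SECRET_NATO))            # upward flow
print(flow_is_safe(SECRET_NATO, LOW))            # write-down
print(flow_is_safe(SECRET_NATO, SECRET_CRYPTO))  # incomparable classes
```

Under these assumptions, a flow between two incomparable classes is unsafe in both directions, because neither class dominates the other.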

In theory, if you can eliminate all flow violations in a system (or in a model of a system), the system (or model) has neither covert nor overt channels, and there is no need to perform any of the invariant or constraint proofs about secure states and state transitions.[2] Unfortunately, deciding what is and what is not a flow is not always easy; and tools that perform flow analysis, because they are ultraconservative in finding flows, are usually insufficient to justify our declaring a specification completely clean. You usually have to carry out an error-prone informal analysis to vindicate the apparent flow violations. For these practical reasons, the invariant and correspondence proofs add considerably to the assurance in the security of the specification, even though flow analysis theoretically might be sufficient. (Real systems are also never completely free of real flow violations, so the manual analysis would be required even if the tools were perfect.)

An information flow can be viewed as a cause-and-effect relationship between two variables w and v. In any function where v is modified and w is referenced, there is flow from variable w to variable v (written w → v) if any information about the value of w in the old state can be deduced by observing the value of v in the new state. For simplicity, we do not explicitly show the new value in the notation (as in w → 'v), but the understanding is that the flow always moves from a variable in an old state to a variable in a new state.

When analyzing functions in a model or specification, if we cannot tell ahead of time whether a particular function will result in a flow, we play it safe and flag it anyway. Such is the case when the flow occurs only under certain conditions that are not explicit in the definition of the function being analyzed. In fact, when looking at isolated functions, we can never tell whether a potential flow is an actual flow. Only by looking at the system as a whole can we identify the real flows. Thus, when we talk about a flow in a function, we almost always mean a potential flow. Sometimes it is possible to rewrite the function or specification so as to eliminate the potential flow. In such a case, the potential flow is called a formal flow because it appears only as a result of the form in which the specification is written.

The process of flow analysis includes both finding the flows and proving that they do not violate flow policy. The functions are observed one at a time, each expression in the function is analyzed, and each flow between a pair of variables is written as a flow statement. (Rules for finding the flows from expressions are covered in section 12.7.1.) A given function may yield many flow statements. A flow may occur only under certain conditions, depending on the values of other variables, so in general a flow statement has the following form:

Flow Statement: If condition, then A → B

where condition is some expression, and A and B are variables.

To decide whether a flow expressed in a flow statement is safe according to the flow policy, we generate from each flow statement a flow formula having the following form:

Flow Formula: If condition, then class(B) ≥ class(A)

where condition is the same as in the flow statement, class(x) means “the access class of x,” and ≥ is a symbol meaning dominates. Proving that there are no flow violations in a function requires proving that each flow formula is true. If the formula cannot be proved, it may represent a real or formal flow violation that must then be justified. To assist you in proving the flow formulas, you may use invariants or constraints in the specification provided that the specification has already been proved to satisfy the invariants and constraints, or you may write new invariants that you subsequently have to prove.

[2] As we shall see later, sometimes the proof of a flow formula requires you to write and prove an invariant as a

Notice that the flow formula is defined in terms of the access classes of variables. Probably the most restrictive aspect of information-flow analysis for multilevel security is the need to define an access class manually for every variable in the specification—even for internal state variables that are not objects according to the security policy. If you choose the wrong access class, a flow violation will show up, so you do not have to worry about introducing an undetected error in this process. But in many cases, no matter what access class you pick, a formal flow violation will be committed in some function somewhere, even though the specification may be secure and may exhibit no covert channels. Sometimes you can eliminate a flow by rewriting the specification, but that may make the specification so obscure that correspondence to the code is extremely difficult to demonstrate.
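The step from flow statements to flow formulas can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not part of the text: access classes are plain integers assigned per variable, "dominates" is numeric ≥, and a formula is checked by enumerating a small set of states rather than by proof. The names FlowStatement and violations are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]


@dataclass
class FlowStatement:
    condition: Callable[[State], bool]  # the "if condition" part
    source: str                         # A in "A -> B"
    target: str                         # B in "A -> B"


def violations(flows: List[FlowStatement],
               classes: Dict[str, int],
               states: List[State]) -> List[Tuple[str, str]]:
    """Return (A, B) pairs whose flow formula
    "if condition then class(B) >= class(A)" fails in some listed state."""
    bad = []
    for f in flows:
        dominated = classes[f.target] >= classes[f.source]
        if not dominated and any(f.condition(s) for s in states):
            bad.append((f.source, f.target))
    return bad


# Flow statements for "if a = 1 then 'v = w else 'v = x".
flows = [
    FlowStatement(lambda s: s["a"] == 1, "w", "v"),
    FlowStatement(lambda s: s["a"] != 1, "x", "v"),
    FlowStatement(lambda s: True, "a", "v"),
]
# Hypothetical manual class assignment: x is classified above v.
classes = {"a": 0, "w": 1, "x": 2, "v": 1}

all_states = [{"a": a} for a in (0, 1, 2)]
# Suppose an invariant "a = 1" has already been proved for the specification.
invariant_states = [s for s in all_states if s["a"] == 1]

print(violations(flows, classes, all_states))        # x -> v is flagged
print(violations(flows, classes, invariant_states))  # nothing is flagged
```

With the assumed invariant, the condition of the offending flow statement can never hold, so its flow formula is vacuously true. This is the sense in which a proved invariant can assist in proving a flow formula.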

Information-flow analysis is something of an art. The rules for deciding when information flow is possible are complex and difficult to apply by hand. In practice, flow analysis is rarely done on a system at the level of an abstract model. While a flow analysis of a model can indeed catch many potential flow violations, it will also miss most of the interesting ones. This is because a model leaves out many details of a system, such as state variables and functions that do not affect the security state of the system as represented in the access matrices. Yet it is precisely these internal state variables that provide the paths for covert channels. Flow analysis on a model can catch these only if the operations on such variables are represented in the functions of the model.

12.7.1 Flow Rules

At the current state of the art, automated flow tools work syntactically. Semantic assumptions that the flow tool makes about a specification are based solely on the syntactic style in which the specification is written, not on what the specification says: if you write the same secure function in two different ways you may get different flow formulas, some of which are true and some of which are false. The false formulas are due to the ultraconservative nature of the analysis, which finds all possible flows but also flags many formal flows.

Syntactic flow analysis is based on a number of simple rules. Given a form of expression in a specification, a flow rule specifies the potential flows. Following are examples of two simple flow rules:

Flow Rule 1. In the equality statement with a single new-value operator,

'v = expression

where expression is an arbitrary expression containing no new values, there is an unconditional flow from all variables mentioned in expression to v. This includes all variables appearing as parameters of functions and indices of arrays in expression.

Flow Rule 2. To find the flows in the statement,

if condition then statement-1 else statement-2

where each statement-i has the form of the statement in flow rule 1, analyze statement-1 and statement-2 for flows according to flow rule 1. When condition is true, all the flows in statement-1 occur; when condition is false, those in statement-2 occur. There are also unconditional flows from all variables mentioned in condition to all variables that are the target of flows in statement-1 and statement-2.

The preceding rules, though too simple to take care of all cases (especially those where the new value of a variable appears in the expression), can be used to analyze some of the expressions in the examples in table 12-3.
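The two flow rules are simple enough to mechanize. The following Python sketch is one possible reading of them, not a description of any actual flow tool: statements are represented by hypothetical Assign and IfElse records, with the variables of each expression listed by hand rather than parsed. One interpretation is assumed and marked in the code: the condition variables flow to every variable assigned in a branch, even when the right-hand side is a constant.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Assign:
    target: str           # v in "'v = expression"
    variables: List[str]  # old-value variables mentioned in expression


@dataclass
class IfElse:
    condition: str             # text of the condition, e.g. "a = 1"
    cond_variables: List[str]  # variables mentioned in the condition
    then_branch: Assign
    else_branch: Assign


# (condition text or None for unconditional, source, target)
Flow = Tuple[Optional[str], str, str]


def rule1(stmt: Assign) -> List[Flow]:
    """Unconditional flow from every variable in the expression to the target."""
    return [(None, w, stmt.target) for w in stmt.variables]


def rule2(stmt: IfElse) -> List[Flow]:
    """Conditional flows from each branch, plus flows from the condition."""
    flows: List[Flow] = []
    for guard, branch in ((stmt.condition, stmt.then_branch),
                          (f"not ({stmt.condition})", stmt.else_branch)):
        flows += [(guard, src, dst) for _, src, dst in rule1(branch)]
    # Assumption: use the assigned variables as targets, so that a branch
    # assigning only a constant still receives a flow from the condition.
    targets = sorted({stmt.then_branch.target, stmt.else_branch.target})
    flows += [(None, c, t) for c in stmt.cond_variables for t in targets]
    return flows


# 'v = var(w): flows from the subscript and from the array element.
print(rule1(Assign("v", ["w", "var(w)"])))
# if a = 1 then 'v = w else 'v = x
print(rule2(IfElse("a = 1", ["a"], Assign("v", ["w"]), Assign("v", ["x"]))))
```

Because the sketch works only on the listed variable names, it shares the ultraconservative character described above: it reports every flow the syntax permits, whether or not the flow is real.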

Examples 1 and 2 illustrate flow rule 1, where a flow occurs from any variable in an expression, even when it is a parameter of a function or an array index, to the new value of the variable. We do not bother with flows from constants: constants are considered to have SYSTEM LOW access class, so any flow from a constant is safe. In example 1, the function f(x) is a constant function of the variable x. In example 2, we have an array var that is a variable; consequently, we have to show that a flow occurs from the specific array element to the new value, as well as from the variable used as the array subscript. Example 3 illustrates flow rule 2, where the flows in each branch of a conditional statement are conditional and where an unconditional flow occurs from the variable a mentioned in the condition. The latter flow occurs because information about the value of a can be deduced from the new value of v. Example 4 shows the same effect with w in the condition. You may argue that, if in example 4 we end up with 'v = 6, we do not know much about w, but information-flow analysis does not try to quantify the amount of flow: that is a job for a covert channel analysis of the resultant system, which serves to determine the bandwidth of any covert channels revealed by information-flow analysis (see section 7.2).

In example 6, a flow tool operating according to our rules would indicate a flow that was not there. According to flow rule 2, we should indicate the flow w → v; but no such flow exists, since v is set to the same value regardless of w. By moving the assignment to 'v outside the if statement, we can make the formal flow disappear.
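The difference between a formal flow and an actual flow can be seen by testing dependence directly. The Python sketch below is a simplified stand-in for the definition of flow, under assumptions of its own: variables range over a small value domain, each state transition is written as an ordinary function from the old state to the new values, and "flow from w to v" is taken to mean that changing only w in the old state can change the new v. The names has_flow, example_4, and example_6_style are hypothetical.

```python
from itertools import product

DOMAIN = (0, 1, 2)  # small illustrative value domain (an assumption)


def example_4(old):
    # if w = 1 then 'v = 5 else 'v = 6
    return {"v": 5 if old["w"] == 1 else 6}


def example_6_style(old):
    # A conditional in which 'v = x appears in both branches;
    # only the new value of c depends on w.
    if old["w"] == 1:
        return {"c": 1, "v": old["x"]}
    return {"c": 3, "v": old["x"]}


def has_flow(transition, names, source, target):
    """True if changing only `source` in the old state can change new `target`."""
    others = [n for n in names if n != source]
    for values in product(DOMAIN, repeat=len(others)):
        base = dict(zip(others, values))
        outcomes = {transition({**base, source: s})[target] for s in DOMAIN}
        if len(outcomes) > 1:
            return True
    return False


print(has_flow(example_4, ["w"], "w", "v"))             # real flow from the condition
print(has_flow(example_6_style, ["w", "x"], "w", "v"))  # formal flow only
print(has_flow(example_6_style, ["w", "x"], "w", "c"))  # real flow w -> c
```

A syntactic tool following flow rule 2 would report w → v for both functions. The exhaustive check reports it only for the first, which is why the second report has to be vindicated by hand or removed by rewriting the statement.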

Example 7 contains statements in which the new value of a variable appears in places other than on the left-hand side of an = sign, making our flow rules inappropriate for such cases. The example illustrates that flows only originate from old values of variables, not from new values. It also shows that, even though v is not the target of any flows according to flow rule 1, it still is the target of a flow from w, thereby violating flow rule 2. Syntactically, examples 6 and 7 are nearly identical, yet the flows they exhibit are different.

| Example | Flows | Rationale |
|---|---|---|
| 1. 'v = w + f(x) + 5 [f is a constant function] | w → v; x → v | Flow from old values to new values in expression; no flows from constants. |
| 2. 'v = var(w) | w → v; var(w) → v | |
| 3. if a = 1 then 'v = w else 'v = x | if a = 1 then w → v; if a ≠ 1 then x → v; a → v | Unconditional a → v because 'v depends on a. |
| 4. if w = 1 then 'v = 5 else 'v = 6 | w → v | |
| 5. if w = 1 then 'v = 2 else 'v = w | w → v | |
| 6. if w = 1 then 'c = 1 and 'v = x else 'c = 3 and 'v = x | w → c; x → v | No w → v because 'v = x unconditionally. |
| 7. if w = 1 then 'c = 1 and 'v = 'c else 'c = 3 and 'v = 'c | w → c; w → v | No c → v because old value of c is irrelevant. |
| 8. if w = 1 then 'c = 2 and 'v = 'c + 1 else 'v = 'c - 1 | if w ≠ 1 then c → v; w → v; w → c | |
| 9. if w = 1 then 'c = a and 'v = 'c + 1 else 'c = a and 'v = 'c - 1 | a → c; a → v; w → v | No w → c because 'c = a unconditionally. |
| 10. 'v = a | a → v | |
| 11. 'a = c | c → a | |
| 12. if a = b then 'w > v | “everything” → w | Nondeterministic assignment is flow from all variables. |

Table 12-3. Examples of Flow Analysis. This table illustrates the flows that result from various types of expressions that might appear in a specification.

The rules for finding flow depend not only on the specification language but on the specific security properties that the flow analysis is intended to support. In particular, net flow after a succession of state transitions depends on the order in which the functions are invoked. For