3. Theoretical Methods
3.2 Structural identifiability
3.2.1 Taylor series approach
To determine whether a model is structurally globally identifiable, it must be shown that the observations determine a unique parameter vector. The output function y(t,p) can be expanded as a Taylor series in t, since f(x,p) is analytic. Expanding about a time t = a gives the unique expression
$$y(t,p) = \sum_{i=0}^{\infty} y^{(i)}(a,p)\,\frac{(t-a)^{i}}{i!}, \tag{3.5}$$
in terms of derivatives of the output at that point. Using Equation (3.1), these derivatives can be written in terms of elements of the state vector and elements of p. The coefficients of a Taylor series are unique for a given output, so by solving these equations for p it is possible to determine whether there is a unique p for the output structure used [163]. If a unique solution exists, the model is structurally globally identifiable. However, this technique provides an infinite number of coefficients. For a linear system without input (or with a single impulsive input), it is known that at most 2n−1 independent equations (where n is the state space dimension) are required to determine the possible solutions for p [164]. For general nonlinear systems no strict upper bound has been determined, although a loose upper bound exists [165]. As such, it can be difficult to prove unidentifiability using this technique: the failure of a finite set of coefficients to determine p uniquely does not exclude the possibility that higher-order coefficients would. Moreover, in either case, high-order coefficients are very complex, making the resulting equations difficult to solve even with symbolic algebra packages. In some cases it may be possible to construct an inductive argument that describes the form of all the Taylor series coefficients and thus proves unidentifiability.
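As an illustration of this argument, consider a hypothetical one-compartment model that is not one of the models analyzed in this work: a single state x with elimination rate constant k, initial value x_0, and output gain c (all of these symbols are assumptions introduced for this sketch). Every Taylor series coefficient at t = 0 can be written in closed form by induction:

```latex
\begin{align*}
\dot{x}(t) &= -k\,x(t), \qquad x(0) = x_0, \qquad y(t,p) = c\,x(t), \\
y^{(i)}(0,p) &= (-k)^{i}\, c\, x_0, \qquad i = 0, 1, 2, \dots
\end{align*}
```

If x_0 is known and both x_0 and c are nonzero, the coefficients for i = 0 and i = 1 already determine c and k uniquely, so p = (k, c) is globally identifiable. If x_0 is also unknown, every coefficient depends on c and x_0 only through the product c x_0, so any pair with the same product gives the same output and the model is unidentifiable. This is a small instance of an inductive argument over the form of the coefficients.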
While some systems can be analyzed with this approach by hand, many are too complex for this to be practical. In either case, it is preferable to implement the approach in a computer algebra package in order to avoid errors. In this work it was implemented in Maple as follows [158]. The approach is applied to a simple model in Section 3.2.4.
Step 1: Definition of model
Model constants, initial conditions, and differential equations were defined as follows:
assume(constant_name, constant):
variable_name(0) := initial_value;
D(variable_name(t), t) := function;
The use of the assume function when defining constants ensures that Maple will not attempt to differentiate them. Initial values are either zero or a constant that has been defined. There is no intrinsic reason to expand at the initial time rather than at a nonzero time. However, for the models analyzed the most appropriate time at which to expand was t = 0, since this was the time at which concentrations could be controlled. The D function in Maple denotes the differential operator and can be applied to a function. The definitions used cause Maple to replace variables to which the operator has been applied with the function given. This function should thus be the right-hand side of the relevant differential equation, expressed in terms of the variable_name(t) and the constants. The output function was defined as follows:
y := [a_1*variable_name_1(t), ... , a_n*variable_name_n(t)];
SUBS1 := a_1 = value_1, ... , a_n = value_n;
f_0_i := subs(SUBS1, y)[i];
g_0_i := eval(f_0_i, t=0);
For maximum flexibility, a matrix A could be defined and multiplied by the state vector x to obtain the output functions. However, for the models analyzed the outputs did not consist of linear combinations of species, so the definition above was sufficient. SUBS1 determines which variables were measured: each value_i was either 1 or 0, indicating whether or not the corresponding variable was measured. The f_0_i are the output functions and the g_0_i are the zeroth-order Taylor series coefficients.
Step 2: Calculation of Taylor series coefficients
The Taylor series coefficients were calculated as follows:
f_i_j := D(f_i-1_j):
g_i_j := eval(f_i_j, t=0):
The Taylor series coefficients are determined inductively, starting from the zeroth-order coefficients defined above. The differential operator D is applied to the previous derivative of the output function, and the eval function is then used to obtain the corresponding Taylor series coefficient. Since there is no strict upper limit on the number of Taylor series coefficients needed for a nonlinear system, an arbitrary number of coefficients was computed. Calculating these coefficients in Maple requires large amounts of memory and can take significant computational time, and in practice these factors typically determine the number of coefficients calculated.
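The growth in complexity can be seen even for a single-state nonlinear example. The sketch below assumes a hypothetical model with Michaelis-Menten elimination (parameters V and K, known initial value x_0, output y = x), which is not taken from this work. Each derivative is obtained by differentiating the previous one and replacing the derivative of x by the right-hand side of the differential equation, which is the substitution that the D definitions are intended to perform:

```latex
\begin{align*}
\dot{x} &= -\frac{V x}{K + x}, \qquad x(0) = x_0, \qquad y = x, \\
y^{(1)} &= -\frac{V x}{K + x}
  &&\Rightarrow\; y^{(1)}(0) = -\frac{V x_0}{K + x_0}, \\
y^{(2)} &= -\frac{V K}{(K + x)^{2}}\,\dot{x} = \frac{V^{2} K x}{(K + x)^{3}}
  &&\Rightarrow\; y^{(2)}(0) = \frac{V^{2} K x_0}{(K + x_0)^{3}}.
\end{align*}
```

The zeroth-order coefficient is simply y(0) = x_0. Even here the power of the denominator rises with each differentiation, which suggests why higher-order coefficients of multi-state models quickly become expensive to compute and store.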
Step 3: Definition of an alternative parameter vector
Suppose there exists an alternative parameter vector, $\bar{p}$, which produces the same output as the starting parameters. If it can be shown that $\bar{p} = p$, then the model is globally identifiable. If there are countably many solutions, the model is locally identifiable. The alternative parameter vector was defined and used to create equations from the Taylor series coefficients as follows:
SUBS2 := parameter_name_1 = alt_parameter_name_1, ... , parameter_name_m = alt_parameter_name_m;
ga_i_j := subs(SUBS2, g_i_j):
eqn_i_j := g_i_j - ga_i_j:
The subs function uses the list defined by SUBS2 to replace the names on the left with the names on the right. This creates the alternative Taylor series coefficients. Equations are then defined by subtracting these alternative coefficients from the original coefficients. When an expression rather than an equation is passed to solve, Maple assumes that it is equal to zero.
Step 4: Solving for the alternative parameters
The equations derived above are now solved for the alternative parameters:
s := solve({eqn_1_1, ... , eqn_a_b}, {alt_parameter_name_1, ... , alt_parameter_name_m}):
s1 := alt_parameter_name_1 = solve(eqn_1_1, alt_parameter_name_1):
eqn_j_k_1 := simplify(subs(s1, ... , si-1, eqn_j_k)):
si := alt_parameter_name_i = solve(eqn_j_k_1, alt_parameter_name_i):
There are two possibilities. In some cases Maple is able to solve all of the equations for the alternative parameters simultaneously. In this case the first line of code is used, and the solution can be displayed by entering s;. However, this calculation, like the determination of the Taylor series coefficients, consumes memory and computational time. If a solution cannot be obtained within a reasonable time, or without exhausting the available memory, an iterative approach is possible. The first equation is solved for one of the alternative parameters, and this solution is stored as an equation that can be passed to subs. Proceeding one equation at a time, further alternative parameters are determined by substituting all known solutions into the next equation and then solving it for another alternative parameter. This approach is more manageable, although it may still exhaust the available resources.
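The one-equation-at-a-time strategy can be followed by hand for a small case. The sketch below assumes a hypothetical single-state model, with the state x eliminated at rate V x/(K + x), output y = x, and known x(0) = x_0 ≠ 0, whose first- and second-order Taylor series coefficients at t = 0 are −V x_0/(K + x_0) and V² K x_0/(K + x_0)³. The model and symbols are illustrative assumptions and are not part of this work. Barred symbols denote the alternative parameters. The first equation is solved for the alternative V, the result is substituted into the second, and the second is then solved for the alternative K:

```latex
\begin{align*}
\frac{V x_0}{K + x_0} &= \frac{\bar{V} x_0}{\bar{K} + x_0}
  \;\Rightarrow\; \bar{V} = V\,\frac{\bar{K} + x_0}{K + x_0}, \\
\frac{V^{2} K x_0}{(K + x_0)^{3}} &= \frac{\bar{V}^{2} \bar{K} x_0}{(\bar{K} + x_0)^{3}}
  = \frac{V^{2} \bar{K} x_0}{(K + x_0)^{2}(\bar{K} + x_0)}
  \;\Rightarrow\; K(\bar{K} + x_0) = \bar{K}(K + x_0)
  \;\Rightarrow\; \bar{K} = K, \\
\bar{V} &= V\,\frac{K + x_0}{K + x_0} = V.
\end{align*}
```

Under these assumptions (V ≠ 0, x_0 ≠ 0, positive parameters so that no denominator vanishes), the only solution is that the alternative parameters equal the original ones, so this example would be globally identifiable. For larger models the same sequence of substitutions is what the iterative Maple commands carry out symbolically.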