CUNY Academic Works
Dissertations, Theses, and Capstone Projects CUNY Graduate Center
5-2019
A Differential Algebra Approach to Commuting Polynomial Vector Fields and to Parameter Identifiability in ODE Models
Peter Thompson
The Graduate Center, City University of New York
More information about this work at: https://academicworks.cuny.edu/gc_etds/3229 Discover additional works at: https://academicworks.cuny.edu
This work is made publicly available by the City University of New York (CUNY).
Contact: [email protected]
A DIFFERENTIAL ALGEBRA APPROACH TO COMMUTING POLYNOMIAL VECTOR FIELDS AND TO PARAMETER IDENTIFIABILITY IN ODE MODELS
by
PETER A. THOMPSON
A dissertation submitted to the Graduate Faculty in Mathematics in partial fulfillment of the requirements for the degree of Doctor of Philosophy, The City University of New York
2019
© 2019
PETER A. THOMPSON
All Rights Reserved
This manuscript has been read and accepted by the Graduate Faculty in Mathematics in satisfaction of the dissertation requirement for the degree of Doctor of Philosophy.
Professor Alexey Ovchinnikov
Date Chair of Examining Committee
Professor Ara Basmajian
Date Executive Officer
Professor Richard C. Churchill Professor Russell Miller
Professor Alexey Ovchinnikov Supervisory Committee
THE CITY UNIVERSITY OF NEW YORK
Abstract
A DIFFERENTIAL ALGEBRA APPROACH TO COMMUTING POLYNOMIAL VECTOR FIELDS AND TO PARAMETER IDENTIFIABILITY IN ODE MODELS
by
PETER A. THOMPSON
Adviser: Professor Alexey Ovchinnikov
In the first part, we study the problem of characterizing polynomial vector fields that commute with a given polynomial vector field. One motivating factor is that we can write down solution formulas for an ODE that corresponds to a planar vector field that possesses a linearly independent commuting vector field. This problem is also central to the question of linearizability of vector fields. We first show that a linear vector field admits a full complement of commuting vector fields.
Then we study a type of planar vector field for which there exists an upper bound on the degree of a commuting polynomial vector field. Finally, we turn our attention to conservative Newton systems, which form a special class of Hamiltonian systems, and show the following result. Let $f \in K[x]$, where K is a field of characteristic zero, and let d be the derivation that corresponds to the differential equation $\ddot{x} = f(x)$ in a standard way. We show that if $\deg f \geq 2$, then any K-derivation commuting with d is equal to d multiplied by a conserved quantity. For example, the classical elliptic equation $\ddot{x} = 6x^2 + a$, where $a \in \mathbb{C}$, falls into this category.
In the second part, we study structural identifiability of parameterized ordinary differential
equation models of physical systems, for example, systems arising in biology and medicine. A
parameter is said to be structurally identifiable if its numerical value can be determined from perfect
observation of the observable variables in the model. Structural identifiability is necessary for
practical identifiability. We study structural identifiability via differential algebra. In particular, we
use characteristic sets. A system of ODEs can be viewed as a set of differential polynomials in a
differential ring, and the consequences of this system form a differential ideal. This differential ideal
can be described by a finite set of differential equations called a characteristic set. The technique of
studying identifiability via a set of special equations, sometimes called “input-output” equations, has
been in use for the past thirty years. However it is still a challenge to provide rigorous justification
for some conclusions that have been drawn in published studies. Our main result is on linear
systems, which are a topic of current interest. We show that for a linear system of ODEs with one
output, the coefficients of a monic characteristic set are identifiable. This result is then generalized,
with additional hypotheses, to nonlinear systems.
Acknowledgments
The results of Chapter 1 are joint work with Alexey Ovchinnikov and Joel Nagloo. The results of Chapter 2 are joint work with Alexey Ovchinnikov and Gleb Pogudin.
This work was partially supported by the NSF grants CCF-1563942, CCF-0952591, DMS-1700336, DMS-1606334, and DMS-1760448, by the NSA grant #H98230-15-1-0245, by CUNY CIRG #2248, and by PSC-CUNY grants #69827-00 47, #60456-00 48, and #60098-00 48. I am grateful to the CCiS at CUNY Queens College for the computational resources.
Contents
1 Commuting polynomial vector fields
1.1 Introduction
1.2 Basic terminology and related results
1.3 The linear case
1.4 A class of derivations admitting upper bounds on the degree of a commuting derivation
1.4.1 The utility of upper bounds
1.4.2 Main result
1.5 Conservative Newton Systems
2 Identifiability for polynomial ODE models
2.1 Introduction
2.2 Notation and definitions
2.3 Identifiability of coefficients of a characteristic set
2.3.1 Definitions and basic properties
2.3.2 Results on identifiability
2.4 Identifiability for input-output equations in systems with one output
Bibliography
Chapter 1
Commuting polynomial vector fields
1.1 Introduction
We study the problem of characterizing polynomial vector fields that commute with a given polynomial vector field. One motivating factor is that we can write down solution formulas for an ODE that corresponds to a planar vector field that possesses a linearly independent (transversal) commuting vector field (see Theorem 1.2.1). This problem is also central to the question of linearizability of vector fields (cf. Giné and Grau (2006) and Sabatini (1997)). In what follows, we will use the standard correspondence between (polynomial) vector fields and derivations on (polynomial) rings.
In Section 1.3, we show that a K-derivation on $K[x_1, \ldots, x_n]$ defined by linear polynomials admits a full complement of commuting K-linearly independent K-derivations. In Section 1.4, we prove a bound on the degree of any derivation commuting with a K-derivation on $K[x, y]$ of the form
$$d = f_1 \cdot \frac{\partial}{\partial x} + f_2 \cdot \frac{\partial}{\partial y}$$
satisfying $f_1 f_2 \neq 0$, $\deg_y \frac{\partial f_2}{\partial x} < \deg_y f_2$, $\deg_y(y f_1) < \deg_y f_2$, $\deg_x \frac{\partial f_1}{\partial y} < \deg_x f_1$, and $\deg_x(x f_2) < \deg_x f_1$. In Section 1.5, we show that a nonlinear planar polynomial derivation corresponding to a conservative Newton system does not admit a linearly independent commuting derivation. Let
$$d = y \frac{\partial}{\partial x} + f(x) \frac{\partial}{\partial y} \tag{1.1}$$
be a K-derivation, where f is a polynomial with coefficients in a field K of zero characteristic. This derivation corresponds to a conservative Newton system, and so to the differential equation $\ddot{x} = f(x)$.
Observe that d is a special type of Hamiltonian derivation. That is, $d(x) = \frac{\partial H}{\partial y}$ and $d(y) = -\frac{\partial H}{\partial x}$, where $H = \frac{1}{2} y^2 - \int f(x)\,dx$. It is shown in (Nowicki, 1994, Corollary 7.1.5) that the set of all polynomial derivations that commute with d forms a $K[H]$-module. In this paper, we show that, for every such d, the module $M_d$ is of rank 1 if and only if $\deg f \geq 2$. For example, the classical elliptic equation $\ddot{x} = 6x^2 + a$, where $a \in \mathbb{C}$, falls into this category.
A characterization of commuting planar derivations in terms of a common Darboux polynomial is given in (Petravchuk (2010)). This was generalized to higher dimensions in (Li and Du (2012)).
In (Choudhury and Guha (2013)), Darboux polynomials are used to find linearly independent commuting vector fields and to construct linearizations of the vector fields. In the case in which K is the real numbers, our result generalizes a result on conservative Newton systems with a center to the case in which a center may or may not be present. A vector field has a center at point P if there is a punctured neighborhood of P in which every solution curve is a closed loop. A center is called isochronous if every such loop has the same period. It was proven in (Villarini, 1992, Theorem 4.5) that, if $D_1$ and $D_2$ are commuting vector fields orthogonal at noncritical points, then any center of $D_1$ is isochronous. The hypothesis of this result can be relaxed to the case in which $D_2$ is transversal to $D_1$ at noncritical points (cf. (Sabatini, 1997, Theorem, p. 92)). In light of this result, one approach to showing the nonexistence of a vector field commuting with D is to show that D has a non-isochronous center. In fact, Amel’kin (Amel’kin, 1977, Theorem 11) has shown that if the system of ordinary differential equations (ODEs) corresponding to derivation (1.1) is not linear and has a center at the origin, then there is no transversal vector field that commutes with d.
As far as we are aware, there has not been a standard method to show the nonexistence of a transversal polynomial vector field in the absence of a nonisochronous center. We develop our own method to do this, which includes building a triangular system of differential equations. One technique we use in approaching this system involves constructing a family of pairs of commuting derivations on rings of the form $K[x^{1/t}, x^{-1/t}, y]$ (see Lemma 1.5.7) and using recurrence relations.
It is impossible to remove the condition $\deg f \geq 2$ from the statement of our main result, as every non-zero derivation of degree less than 2 commutes with another transversal derivation (see Proposition 1.2.1). The form of d in our main result implies that d is divergence free (which is the same as Hamiltonian in the planar case). It is not possible to strengthen our result to the case in which d is merely assumed to be divergence free of degree at least 2, as shown in Example 1.2.1 and Proposition 1.2.2.
1.2 Basic terminology and related results
We direct the reader to Kaplansky (1957) and Kolchin (1973) for the basics of a ring with a derivation.
Definition 1.2.1. An S-derivation on a commutative ring R with subring S is a map d : R → R such that d(S) = 0 and for all a, b ∈ R,
d(a + b) = d(a) + d(b) and d(ab) = d(a) · b + a · d(b).
Definition 1.2.2. Let K be a field. A non-zero K-derivation d on $K[x_1, \ldots, x_n]$ is called integrable if there exist commuting K-derivations $\delta_1, \ldots, \delta_{n-1}$ on $K[x_1, \ldots, x_n]$ that are linearly independent from d over $K(x_1, \ldots, x_n)$, and commute with d, that is, for all $a \in K[x_1, \ldots, x_n]$ and $i, j$, $1 \leq i, j \leq n - 1$,
$$d(\delta_i(a)) = \delta_i(d(a)) \quad \text{and} \quad \delta_i(\delta_j(a)) = \delta_j(\delta_i(a)).$$
The following is a result that follows easily from classical theory, although to the best of our knowledge it is not explicitly stated in this form.
Theorem 1.2.1. Let d and δ be $\mathbb{R}$-derivations on $\mathbb{R}(x, y)$ defined by
$$d(x) = f_1(x, y), \quad d(y) = f_2(x, y), \quad \delta(x) = g_1(x, y), \quad \delta(y) = g_2(x, y).$$
Let $(x_0, y_0) \in \mathbb{R}^2$. Suppose that d and δ commute and there is no $(\lambda_1, \lambda_2) \in \mathbb{R}^2 \setminus \{(0, 0)\}$ such that
$$\lambda_1 \begin{pmatrix} f_1(x_0, y_0) \\ f_2(x_0, y_0) \end{pmatrix} = \lambda_2 \begin{pmatrix} g_1(x_0, y_0) \\ g_2(x_0, y_0) \end{pmatrix}.$$
Then the initial value problem
$$\dot{x} = f_1(x, y), \quad \dot{y} = f_2(x, y), \quad x(0) = x_0, \quad y(0) = y_0$$
has a solution given by
$$(x(t), y(t)) = F^{-1}(t, 0),$$
where
$$F\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \displaystyle\int_{x_0}^{x} \frac{g_2(r, y)}{\Delta(r, y)}\,dr + \int_{y_0}^{y} \frac{-g_1(x_0, s)}{\Delta(x_0, s)}\,ds \\[2ex] \displaystyle\int_{x_0}^{x} \frac{-f_2(r, y)}{\Delta(r, y)}\,dr + \int_{y_0}^{y} \frac{f_1(x_0, s)}{\Delta(x_0, s)}\,ds \end{pmatrix},$$
and $\Delta(x, y) = f_1(x, y) g_2(x, y) - f_2(x, y) g_1(x, y)$.
Proof. Suppose $(x(t), y(t))$ is a solution to the initial value problem. A straightforward calculation shows that $F(x(t), y(t)) = (t, 0)$. Observing that the Jacobian determinant of F does not vanish at $(x_0, y_0)$, we see that F is a diffeomorphism in a neighborhood of $(x_0, y_0)$. We conclude that $(x(t), y(t)) = F^{-1}(t, 0)$.
Example 1.2.1. Consider the initial value problem
$$\dot{x} = 1 + x^2, \quad \dot{y} = -2xy, \quad x(0) = x_0, \quad y(0) = y_0,$$
where $x_0$ and $y_0$ are real numbers and $y_0 \neq 0$. The corresponding derivation is
$$d(x) = 1 + x^2, \quad d(y) = -2xy,$$
and we observe that the derivation
$$\delta(x) = 0, \quad \delta(y) = y$$
commutes with d, and that d and δ are independent at $(x_0, y_0)$. Using the above formula, we obtain the solution
$$x(t) = \tan(t + \tan^{-1} x_0), \quad y(t) = y_0 (1 + x_0^2) \cos^2(t + \tan^{-1} x_0).$$
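The closed-form solution of Example 1.2.1 can be checked numerically. The following Python sketch is an illustration only and is not part of the original argument: the function names `solution` and `residual`, the step size `h`, and the sample values of $x_0$, $y_0$, and t are assumptions made here. It compares a central finite difference of $(x(t), y(t))$ with the right-hand side $(1 + x^2, -2xy)$.

```python
import math

def solution(t, x0, y0):
    """Closed-form solution from Example 1.2.1 (valid while t + atan(x0) < pi/2)."""
    phase = t + math.atan(x0)
    x = math.tan(phase)
    y = y0 * (1 + x0 ** 2) * math.cos(phase) ** 2
    return x, y

def residual(t, x0, y0, h=1e-6):
    """Absolute defect of the ODE, using central differences of step h (an assumed value)."""
    xp, yp = solution(t + h, x0, y0)
    xm, ym = solution(t - h, x0, y0)
    x, y = solution(t, x0, y0)
    dx = (xp - xm) / (2 * h)
    dy = (yp - ym) / (2 * h)
    return abs(dx - (1 + x * x)), abs(dy - (-2 * x * y))

# Sample data chosen for illustration only.
rx, ry = residual(0.3, 0.5, 2.0)
print(rx < 1e-6, ry < 1e-6)
```

A small residual at sample points is consistent with the formula but is of course not a proof; the proof is the calculation in Theorem 1.2.1.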
We make some observations, in the form of the following propositions:
Proposition 1.2.1. Let K be a field. Every non-zero K-derivation of degree less than or equal to 1 on K[x, y] is integrable.
A proof for n variables is given in Proposition 1.3.1. We give a more explicit proof for the case of 2 variables here.
Proof. We will consider the following cases. The symbols a, b, c, e, f, and g are taken to be elements of K.
Case 0: $d(x) = c$, $d(y) = g$. Observe that d commutes with any constant derivation.
Case 1: $d(x) = ax$, $d(y) = ay$, $a \neq 0$. Observe that d commutes with δ, where $\delta(x) = y$, $\delta(y) = x$.
Case 2: $d(x) = ax + by$, $d(y) = ex + fy$, different from Case 1. Observe that d commutes with δ, where $\delta(x) = x$, $\delta(y) = y$.
Case 3: $d(x) = ax + by + c$, $d(y) = ex + fy + g$, $af - be \neq 0$. In this case, d is equivalent to a derivation from Case 1 or Case 2 via a linear change of coordinates. Let $(x_0, y_0)$ be the solution to the system $ax + by + c = ex + fy + g = 0$. Now let $u = x - x_0$ and $v = y - y_0$, so that $d(u) = au + bv$ and $d(v) = eu + fv$.
Case 4: $d(x) = ax + by + c$, $d(y) = ex + fy + g$, $af - be = 0$.
(a) $a = b = 0$, different from Case 0. If $e \neq 0$, then d commutes with and is transversal to δ given by $\delta(x) = -\frac{g}{e}$, $\delta(y) = 0$. If $f \neq 0$, then d commutes with and is transversal to δ given by $\delta(x) = 0$, $\delta(y) = -\frac{g}{f}$.
(b) At least one of a and b is not 0. First assume $a \neq 0$. If $f = e = 0$, then this is equivalent to Case 4a by swapping the roles of x and y. Assume at least one of f and e is not 0. By the condition $af - be = 0$, it must be that $e \neq 0$. Using the coordinate $z = ex - ay$ instead of x puts this into the form of Case 4a. Next, assume $b \neq 0$. If $f = e = 0$, then this is equivalent to Case 4a. Assume at least one of f and e is not 0. By the condition $af - be = 0$, it must be that $f \neq 0$. Using the coordinate $z = fx - by$ instead of x puts this into the form of Case 4a.
Definition 1.2.3. Let K be a field and let d be a K-derivation on $K[x_1, \ldots, x_n]$. We say d is divergence-free if
$$\sum_{i=1}^{n} \frac{\partial}{\partial x_i} d(x_i) = 0.$$
Proposition 1.2.2. Let K be a field of characteristic 0. There exist integrable divergence-free K-derivations on K[x, y] that are not coordinate-change equivalent to a derivation of degree less than or equal to 1.
Proof. The K-derivation defined by the same equations as d from Example 1.2.1 is divergence-free and integrable. Note that the vector field corresponding to d vanishes only at the points $(\sqrt{-1}, 0)$ and $(-\sqrt{-1}, 0)$ in $K^2$. Since char K = 0, these points are distinct. After a coordinate change, the number of points in $K^2$ at which a vector field vanishes does not change. The vector field of any derivation of degree less than or equal to 1 vanishes at zero, one, or infinitely many points. We conclude that d is not coordinate-change equivalent to a derivation of degree no greater than 1.
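The two facts used in this proof, that d from Example 1.2.1 is divergence-free and that its vector field vanishes at $(\pm\sqrt{-1}, 0)$, can be confirmed by a short computation. The sketch below is an illustration under assumptions made here: polynomials are stored as Python dictionaries mapping exponent pairs to coefficients, the helper names (`diff`, `add`, `divergence`, `evaluate`) are hypothetical, and the field K is modeled by Python's complex numbers.

```python
# A polynomial in x, y is a dict {(i, j): coeff} standing for sum coeff * x^i * y^j.

def diff(p, var):
    """Partial derivative with respect to x (var = 0) or y (var = 1)."""
    out = {}
    for (i, j), c in p.items():
        e = (i, j)[var]
        if e > 0:
            key = (i - 1, j) if var == 0 else (i, j - 1)
            out[key] = out.get(key, 0) + c * e
    return out

def add(p, q):
    """Sum of two polynomials, dropping zero coefficients."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def divergence(d):
    """Divergence of the derivation d = (d(x), d(y)) as in Definition 1.2.3."""
    return add(diff(d[0], 0), diff(d[1], 1))

def evaluate(p, x, y):
    return sum(c * x ** i * y ** j for (i, j), c in p.items())

# d(x) = 1 + x^2, d(y) = -2xy, as in Example 1.2.1.
d = ({(0, 0): 1, (2, 0): 1}, {(1, 1): -2})

print(divergence(d) == {})  # the zero polynomial is the empty dict
for point in [(1j, 0), (-1j, 0)]:
    print(evaluate(d[0], *point) == 0, evaluate(d[1], *point) == 0)
```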
1.3 The linear case
We show in Proposition 1.3.1 that every nonzero K-derivation defined by polynomials of degree no greater than 1 on $K[x_1, \ldots, x_n]$ is integrable (see Definition 1.2.2). We will make use of the following lemma.
Lemma 1.3.1. Let K be a field. Let ∂ be a non-zero K-derivation on the polynomial ring $K[x_1, \ldots, x_n]$ such that
$$\partial(x) = Cx + a,$$
where $x = (x_1, \ldots, x_n)^T$, C is the companion matrix of a polynomial over K of degree n, and a is an n × 1 matrix with entries in K. Then there exist K-derivations $\delta_2, \ldots, \delta_n$ such that
1. $\forall i, j$: $\delta_i(x_j)$ has degree at most 1,
2. $\forall i$: $\delta_i \circ \partial = \partial \circ \delta_i$,
3. $\forall i, j$: $\delta_i \circ \delta_j = \delta_j \circ \delta_i$, and
4. $\{\partial, \delta_2, \ldots, \delta_n\}$ is K-linearly independent.
Proof. Write
$$C = \begin{pmatrix} 0 & & & c_0 \\ 1 & 0 & & c_1 \\ & \ddots & \ddots & \vdots \\ & & 1 & c_{n-1} \end{pmatrix}, \quad a = \begin{pmatrix} a_0 \\ \vdots \\ a_{n-1} \end{pmatrix}.$$
Case 1: $a_0 = 0$ or $c_0 \neq 0$.
If $c_0 \neq 0$, let $v = C^{-1} a$. If $c_0 = 0$, let $v = (a_1, a_2, \ldots, a_{n-1}, 0)^T$. Observe that in either case, $Cv = a$. Now for $i = 0, \ldots, n - 1$ define $\delta_i$ to be the K-derivation given by
$$\delta_i(x) = C^i x + C^i v$$
and note that $\partial = \delta_1$.
We first show that for all i and j, $\delta_i \circ \delta_j = \delta_j \circ \delta_i$. We have $\delta_i(\delta_j(x)) = \delta_i(C^j x + C^j v) = C^j (C^i x + C^i v) = C^{i+j} x + C^{i+j} v$. We also have $\delta_j(\delta_i(x)) = \delta_j(C^i x + C^i v) = C^i (C^j x + C^j v) = C^{i+j} x + C^{i+j} v$.
We now show that $\{\delta_0, \ldots, \delta_{n-1}\}$ is K-linearly independent. Suppose $C^0 x, Cx, \ldots, C^{n-1} x$ are not K-linearly independent. Then there exist $b_0, \ldots, b_{n-1} \in K$ not all 0 such that $b_0 C^0 x + \ldots + b_{n-1} C^{n-1} x = (b_0 C^0 + \ldots + b_{n-1} C^{n-1}) x = (0, \ldots, 0)^T$. Since $x_1, \ldots, x_n$ are algebraically independent over K, the only way this could happen is if $b_0 C^0 + \ldots + b_{n-1} C^{n-1}$ is the zero matrix. Since C is a companion matrix of a degree n polynomial, the minimal polynomial of C has degree n (cf. (Hoffman and Kunze, 1971, Corollary, p. 230)). Therefore $b_0 = \ldots = b_{n-1} = 0$. We conclude that $\{C^0 x, \ldots, C^{n-1} x\}$ is K-linearly independent. It follows that $\{C^0 x + C^0 v, \ldots, C^{n-1} x + C^{n-1} v\}$ is K-linearly independent.
Define $\delta_n$ to be $\delta_0$. Now we have shown that $\{\delta_2, \ldots, \delta_n\}$ satisfy the properties in the statement of the lemma.
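The commutation computation in Case 1 can be replayed on a concrete instance. The following Python sketch is an illustration only: the sample values of c and v, and the helper names (`companion`, `compose`, `all_commute`, and so on), are assumptions made here. An affine derivation $\delta(x) = Ax + a$ is stored as the pair (A, a), and composition follows the rule used above, $\delta_1(\delta_2(x)) = A_2 (A_1 x + a_1)$. To avoid inverting C, the sketch picks v first and sets $a = Cv$.

```python
def companion(c):
    """Matrix with ones on the subdiagonal and last column c = (c_0, ..., c_{n-1})."""
    n = len(c)
    return [[(1 if r == col + 1 else 0) if col < n - 1 else c[r]
             for col in range(n)] for r in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def power(A, e):
    n = len(A)
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(e):
        P = matmul(P, A)
    return P

def compose(d1, d2):
    """Matrix and constant part of x -> d1(d2(x)) for affine derivations (A, a)."""
    (A1, a1), (A2, _) = d1, d2
    return matmul(A2, A1), matvec(A2, a1)

def all_commute(ds):
    return all(compose(p, q) == compose(q, p) for p in ds for q in ds)

def first_column(A):
    return [row[0] for row in A]

c = [2, 0, -1]        # sample coefficients with c_0 != 0, so Case 1 applies
C = companion(c)
v = [1, 3, 5]         # sample v; the constant term is then a = C v
a = matvec(C, v)
deltas = [(power(C, i), matvec(power(C, i), v)) for i in range(len(c))]

print(deltas[1] == (C, a))   # delta_1 is the original derivation
print(all_commute(deltas))
print([first_column(power(C, i)) for i in range(len(c))])
```

The last line displays the first columns of $C^0, \ldots, C^{n-1}$, which are the standard basis vectors; this is one way to see that these powers are K-linearly independent for the sample matrix.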
Case 2: $a_0 \neq 0$ and $c_0 = 0$.
For $i = 1, \ldots, n$ let $\delta_i$ be the K-derivation defined by
$$\delta_i(x) = C^i x + C^{i-1} a$$
and note that $\delta_1 = \partial$.
We show that for all i and j, $\delta_i \circ \delta_j = \delta_j \circ \delta_i$. We have $\delta_i(\delta_j(x)) = \delta_i(C^j x + C^{j-1} a) = C^j (C^i x + C^{i-1} a) = C^{i+j} x + C^{i+j-1} a$. We also have $\delta_j(\delta_i(x)) = \delta_j(C^i x + C^{i-1} a) = C^i (C^j x + C^{j-1} a) = C^{i+j} x + C^{i+j-1} a$.
Next we show that the set $\{\delta_1, \ldots, \delta_n\}$ is K-linearly independent. Suppose $(b_1, \ldots, b_n) \in K^n \setminus \{(0, \ldots, 0)\}$ is such that
$$b_1 (Cx + a) + b_2 (C^2 x + Ca) + \ldots + b_n (C^n x + C^{n-1} a) = 0_{n \times 1}. \tag{1.2}$$
It follows that
$$b_1 Cx + b_2 C^2 x + \ldots + b_n C^n x = 0_{n \times 1}.$$
Since $x_1, \ldots, x_n$ are K-algebraically independent, and hence K-linearly independent, it follows that $b_1 C + \ldots + b_n C^n = 0_{n \times n}$. Since C is a companion matrix and $c_0 = 0$, the minimal polynomial of C is $p(X) = X^n - c_{n-1} X^{n-1} - \ldots - c_1 X$. Hence there exists $r \in K \setminus \{0\}$ such that $b_n = r$ and for $i = 1, \ldots, n - 1$, $b_i = -r c_i$. It follows from this and (1.2) that
$$(-c_1 I - c_2 C - \ldots + C^{n-1}) a = 0_{n \times 1},$$
where I is the n × n identity matrix. Let $D = -c_1 I - c_2 C - \ldots + C^{n-1}$. Since CD is the 0 matrix, we see that the image of D, as a K-linear map from $K^{n \times 1}$ to $K^{n \times 1}$, lies in the kernel of C. Observe that since $c_0 = 0$, the kernel of C has dimension 1. Because D is a K-linear combination of $C^0, \ldots, C^{n-1}$, D is not the zero matrix. Hence, the image of D has positive dimension and thus the image of D has dimension 1. Therefore the kernel of D has dimension n − 1. Let $e_1, \ldots, e_n$ be the basis for $K^{n \times 1}$ where $e_i$ has 1 in the i-th entry and 0 elsewhere. Observe that since the first column of $C^i$ has a 1 in the (i + 1)-th entry and 0 in all other entries, $D e_1 = (-c_1, -c_2, \ldots, -c_{n-1}, 1)^T \neq 0_{n \times 1}$. We now argue that for $i = 2, \ldots, n$, $D e_i = 0_{n \times 1}$. To do this, we work over the field $L := K(\tilde{c}_1, \ldots, \tilde{c}_{n-1})$, where $\tilde{c}_1, \ldots, \tilde{c}_{n-1}$ are K-algebraically independent, and consider the matrices $\tilde{C}$, defined as the companion matrix of $X^n - \tilde{c}_{n-1} X^{n-1} - \ldots - \tilde{c}_1 X$, and $\tilde{D} := -\tilde{c}_1 I - \tilde{c}_2 \tilde{C} - \ldots + \tilde{C}^{n-1}$. Viewing $\tilde{C}$ and $\tilde{D}$ as L-linear maps on $L^n$, we have that $\ker \tilde{C}$ is the L-span of $(-\tilde{c}_1, -\tilde{c}_2, \ldots, -\tilde{c}_{n-1}, 1)^T$ and that $\operatorname{im} \tilde{D} = \ker \tilde{C}$. Thus, each column of $\tilde{D}$ is of the form $(-\tilde{c}_1 r, -\tilde{c}_2 r, \ldots, -\tilde{c}_{n-1} r, r)^T$, where $r \in L$.
Since for $i \geq 1$ each element of the top row of $\tilde{C}^i$ is 0, we see that the top row of $\tilde{D}$ is $(-\tilde{c}_1, 0, \ldots, 0)$.
Thus, we have
$$\tilde{D} = \begin{pmatrix} -\tilde{c}_1 & 0 & \cdots & 0 \\ -\tilde{c}_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \end{pmatrix}.$$
Observing that D is the specialization of $\tilde{D}$ at $\tilde{c}_1 = c_1, \ldots, \tilde{c}_{n-1} = c_{n-1}$ gives us
$$D = \begin{pmatrix} -c_1 & 0 & \cdots & 0 \\ -c_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \end{pmatrix},$$
and therefore $D e_i = 0_{n \times 1}$ for $i > 1$. Writing $a = a_0 e_1 + a_1 e_2 + \ldots + a_{n-1} e_n$ and recalling that $a_0 \neq 0$, we see that $Da \neq 0_{n \times 1}$. This contradicts that (1.2) holds. Therefore $\{\delta_1, \ldots, \delta_n\}$ is K-linearly independent.
Proposition 1.3.1. Let K be a field. Let ∂ be a non-zero K-derivation on the polynomial ring $R = K[x_1, \ldots, x_n]$ such that each $\partial(x_i)$ has degree at most 1. Then there exist K-derivations $\delta_2, \ldots, \delta_n$ on R such that
1. $\forall i, j$: $\delta_i(x_j)$ has degree at most 1,
2. $\forall i$: $\delta_i \circ \partial = \partial \circ \delta_i$,
3. $\forall i, j$: $\delta_i \circ \delta_j = \delta_j \circ \delta_i$, and
4. $\{\partial, \delta_2, \ldots, \delta_n\}$ is K-linearly independent.
Proof. Write
$$\partial(x) = Ax + a,$$
where $A \in K^{n \times n}$ and $a \in K^{n \times 1}$. First, we show that without loss of generality we can assume A is in rational canonical form. By (Hungerford, 1974, Theorem 4.6(ii), p. 360), there exists $P \in GL_n(K)$ such that $\hat{A} = PAP^{-1}$ is in rational canonical form. Letting $\hat{x} = (\hat{x}_1, \ldots, \hat{x}_n)^T = Px$, we have $K[x_1, \ldots, x_n] = K[\hat{x}_1, \ldots, \hat{x}_n]$ and $\partial(\hat{x}) = \hat{A} \hat{x} + Pa$.
Henceforth, we assume that A is in rational canonical form. Write
$$A = \begin{pmatrix} C_1 & & \\ & \ddots & \\ & & C_k \end{pmatrix},$$
where for all i, $C_i$ is the companion matrix of a polynomial of degree $d_i$. For $i = 1, \ldots, k$ define $l_i$ as follows. Let $l_1 = 0$ and for $i > 1$ let $l_i = l_{i-1} + d_{i-1}$. For $i = 1, \ldots, k$ and for $j = 1, \ldots, d_i$ we define the K-derivation $\delta_{i,j}$ as follows. Lemma 1.3.1 for the ring $K[x_{l_i+1}, \ldots, x_{l_i+d_i}]$ and the K-derivation
$$\partial_i (x_{l_i+1}, \ldots, x_{l_i+d_i})^T = C_i (x_{l_i+1}, \ldots, x_{l_i+d_i})^T + (a_{l_i+1}, \ldots, a_{l_i+d_i})^T$$
guarantees the existence of K-derivations $\delta_2, \ldots, \delta_{d_i}$ on $K[x_{l_i+1}, \ldots, x_{l_i+d_i}]$ such that the set $\{\partial_i, \delta_2, \ldots, \delta_{d_i}\}$ is commutative and K-linearly independent. Let $\delta_{i,1}$ be the extension of $\partial_i$ to $K[x]$ by
$$\delta_{i,1}(x_r) = \begin{cases} \partial_i(x_r) & \text{if } l_i < r \leq l_i + d_i \\ 0 & \text{otherwise} \end{cases}$$
and for $j = 2, \ldots, d_i$ let $\delta_{i,j}$ be the extension of $\delta_j$ to $K[x]$ by
$$\delta_{i,j}(x_r) = \begin{cases} \delta_j(x_r) & \text{if } l_i < r \leq l_i + d_i \\ 0 & \text{otherwise.} \end{cases}$$
Observe that $\partial = \delta_{1,1} + \ldots + \delta_{k,1}$. If k = 1, then the proposition follows from Lemma 1.3.1. Assume k > 1. Now consider the set
$$S := \{\partial, \delta_{1,1}, \ldots, \delta_{1,d_1}, \delta_{2,1}, \ldots, \delta_{2,d_2}, \ldots, \delta_{k-1,1}, \ldots, \delta_{k-1,d_{k-1}}, \delta_{k,2}, \ldots, \delta_{k,d_k}\} = \left(\{\partial\} \cup \{\delta_{i,j} \mid i = 1, \ldots, k;\ j = 1, \ldots, d_i\}\right) \setminus \{\delta_{k,1}\}.$$
Observe that S contains n elements. We now show that S is commutative. Fix i, j, p, q, r such that $1 \leq i \leq k$, $1 \leq j \leq d_i$, $1 \leq p \leq k$, $1 \leq q \leq d_p$, and $1 \leq r \leq n$. If i = p, then $\delta_{p,q} \circ \delta_{i,j} = \delta_{i,j} \circ \delta_{p,q}$. Suppose $i \neq p$. Since $\delta_{i,j}(x_r) \in K[x_{l_i+1}, \ldots, x_{l_i+d_i}]$ we have $\delta_{p,q}(\delta_{i,j}(x_r)) = 0$. Similarly, $\delta_{p,q}(x_r) \in K[x_{l_p+1}, \ldots, x_{l_p+d_p}]$ and hence $\delta_{i,j}(\delta_{p,q}(x_r)) = 0$. We conclude that $\delta_{i,j}$ commutes with $\delta_{p,q}$. Since $\partial = \delta_{1,1} + \ldots + \delta_{k,1}$, we see that ∂ commutes with $\delta_{i,j}$.
Now we show that S is K-linearly independent. Suppose $b, b_{1,1}, \ldots, b_{1,d_1}, b_{2,1}, \ldots, b_{k,d_k} \in K$ are such that
$$b\partial + b_{1,1}\delta_{1,1} + \ldots + b_{k,d_k}\delta_{k,d_k} = 0.$$
(Note that $\delta_{k,1}$ is not included.)
Since $\partial = \delta_{1,1} + \ldots + \delta_{k,1}$, this implies
$$(b_{1,1} + b)\delta_{1,1} + \ldots + (b_{k-1,d_{k-1}} + b)\delta_{k-1,d_{k-1}} + b\delta_{k,1} + (b_{k,2} + b)\delta_{k,2} + \ldots + (b_{k,d_k} + b)\delta_{k,d_k} = 0. \tag{1.3}$$
Equation (1.3) implies that for all $i = 1, \ldots, k - 1$ and for all r such that $l_i < r \leq l_i + d_i$,
$$(b_{i,1} + b)\delta_{i,1}(x_r) + \ldots + (b_{i,d_i} + b)\delta_{i,d_i}(x_r) = 0.$$
It follows that
$$\forall i = 1, \ldots, k - 1 \quad (b_{i,1} + b)\delta_{i,1} + \ldots + (b_{i,d_i} + b)\delta_{i,d_i} = 0. \tag{1.4}$$
Equation (1.3) also implies that for all r such that $l_k < r \leq l_k + d_k$,
$$b\delta_{k,1}(x_r) + (b_{k,2} + b)\delta_{k,2}(x_r) + \ldots + (b_{k,d_k} + b)\delta_{k,d_k}(x_r) = 0.$$
It follows that
$$b\delta_{k,1} + (b_{k,2} + b)\delta_{k,2} + \ldots + (b_{k,d_k} + b)\delta_{k,d_k} = 0. \tag{1.5}$$
Since for all i, $\delta_{i,1}, \ldots, \delta_{i,d_i}$ are K-linearly independent, (1.4) implies that $b_{i,j} = -b$ for $i = 1, \ldots, k - 1$ and $j = 1, \ldots, d_i$, and (1.5) implies that $b = 0$ and $b_{k,2} = \ldots = b_{k,d_k} = -b$. We conclude that $b = 0$ and $b_{i,j} = 0$ for all i and j. Therefore S is K-linearly independent.
1.4 A class of derivations admitting upper bounds on the degree of a commuting derivation
1.4.1 The utility of upper bounds
Let $d(x, y) = (f_1, f_2)$ be a K-derivation on $K[x, y]$. Suppose $b \in \mathbb{N}$ is such that the following statement is true: “If $\delta(x, y) = (g_1, g_2)$ is a K-derivation on $K[x, y]$ that commutes with and is transversal to d, then the degrees of $g_1$ and $g_2$ are no greater than b.” Such a b is sometimes called an upper bound.
We can use this information to determine whether d is integrable. Write $g_i = \sum_{j,k;\ j+k \leq b} a_{i,j,k} x^j y^k$. Now the equations $d(\delta(x)) = \delta(d(x))$ and $d(\delta(y)) = \delta(d(y))$ form a system of two polynomial equations, and thus a finite system of equations on elements of K obtained by equating like coefficients. These equations are linear in the variables $a_{i,j,k}$. Hence the problem of determining whether d is integrable has been reduced to studying a finite system of linear equations over K.
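The reduction to linear algebra described above can be sketched in a few lines of Python. This is an illustration under assumptions made here, not the author's implementation: K is taken to be the rational numbers, polynomials are stored as dictionaries from exponent pairs to coefficients, and all function names are hypothetical. For each unknown $a_{i,j,k}$ the sketch computes the commutator of d with the corresponding monomial derivation, collects the coefficients into a matrix, and reports the dimension of the solution space of the resulting linear system.

```python
from fractions import Fraction

# A polynomial in x, y is a dict {(i, j): coeff}; a derivation is a pair (d(x), d(y)).

def add(p, q, sign=1):
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    out = {}
    for (i, j), c in p.items():
        for (k, l), e in q.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * e
    return {m: c for m, c in out.items() if c != 0}

def diff(p, var):
    out = {}
    for (i, j), c in p.items():
        e = (i, j)[var]
        if e > 0:
            key = (i - 1, j) if var == 0 else (i, j - 1)
            out[key] = out.get(key, 0) + c * e
    return out

def apply_derivation(der, p):
    """der(p) = der(x) * dp/dx + der(y) * dp/dy."""
    return add(mul(der[0], diff(p, 0)), mul(der[1], diff(p, 1)))

def bracket(d, delta):
    """Return (d(delta(x)) - delta(d(x)), d(delta(y)) - delta(d(y)))."""
    return tuple(add(apply_derivation(d, delta[v]),
                     apply_derivation(delta, d[v]), sign=-1) for v in (0, 1))

def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [u - factor * w for u, w in zip(m[i], m[r])]
        r += 1
    return r

def commuting_space_dimension(d, b):
    """Dimension of the space of derivations of degree <= b commuting with d."""
    monomials = [(j, k) for j in range(b + 1) for k in range(b + 1 - j)]
    columns = []
    for comp in (0, 1):
        for mono in monomials:
            delta = [{}, {}]
            delta[comp] = {mono: Fraction(1)}  # one unknown a_{comp, j, k}
            br = bracket(d, delta)
            columns.append({(v, m): Fraction(c)
                            for v in (0, 1) for m, c in br[v].items()})
    equations = sorted({key for col in columns for key in col})
    matrix = [[col.get(key, Fraction(0)) for col in columns] for key in equations]
    return len(columns) - rank(matrix)

newton = ({(0, 1): 1}, {(2, 0): 1})   # d(x) = y, d(y) = x^2
euler = ({(1, 0): 1}, {(0, 1): 1})    # d(x) = x, d(y) = y
print(commuting_space_dimension(newton, 2))
print(commuting_space_dimension(euler, 1))
```

The sketch only counts commuting derivations up to the given degree; it does not test transversality. A dimension of 1 means that, within that degree, only scalar multiples of d itself commute with d, so no transversal commuting derivation exists below the bound.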
1.4.2 Main result
We present a class of derivations and give an upper bound for each element of this class.
Notation.
• Define $\deg_y(0) := -\infty$, so that for all $n \in \mathbb{Z}$, $\deg_y(0) < n$.
• Let P and Q be elements of $K[x, y]$. Define $\deg_y(P/Q) = \deg_y(P/\gcd(P, Q)) - \deg_y(Q/\gcd(P, Q))$.
• Let U be a matrix with entries in $K(x, y)$. Define
$$\deg_y(U) := \max\{\deg_y(u) \mid u \text{ is an entry of } U\}.$$
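Proposition 1.4.1. Let K be a field of characteristic 0. Let d be a K-derivation on $K[x, y]$ given by
$$d\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \end{pmatrix}$$
satisfying the conditions
• $f_2 \neq 0$,
• $\deg_y \frac{\partial f_2}{\partial x} < \deg_y f_2$, and
• $\deg_y(y f_1) < \deg_y f_2$.
If δ is a K-derivation on $K[x, y]$ defined by
$$\delta\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} g_1 \\ g_2 \end{pmatrix}$$
and δ commutes with d, then $\max\{\deg_y g_1, \deg_y g_2\} \leq \deg_y f_2$.
Proof. The equations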
d(δ(x)) = δ(d(x)) and d(δ(y)) = δ(d(y))
yield
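$$f_1 \frac{\partial g_1}{\partial x} + f_2 \frac{\partial g_1}{\partial y} = g_1 \frac{\partial f_1}{\partial x} + g_2 \frac{\partial f_1}{\partial y} \quad \text{and} \quad f_1 \frac{\partial g_2}{\partial x} + f_2 \frac{\partial g_2}{\partial y} = g_1 \frac{\partial f_2}{\partial x} + g_2 \frac{\partial f_2}{\partial y}, \tag{1.6}$$
which we rearrange as
$$\begin{pmatrix} -\frac{y f_1}{f_2} \frac{\partial g_1}{\partial x} \\[1ex] -\frac{y f_1}{f_2} \frac{\partial g_2}{\partial x} \end{pmatrix} - y \frac{\partial}{\partial y} \begin{pmatrix} g_1 \\ g_2 \end{pmatrix} + \begin{pmatrix} \frac{y}{f_2} \frac{\partial f_1}{\partial x} & \frac{y}{f_2} \frac{\partial f_1}{\partial y} \\[1ex] \frac{y}{f_2} \frac{\partial f_2}{\partial x} & \frac{y}{f_2} \frac{\partial f_2}{\partial y} \end{pmatrix} \begin{pmatrix} g_1 \\ g_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
For conciseness of notation, we define the matrices
• $g := \begin{pmatrix} g_1 \\ g_2 \end{pmatrix}$,
• $N := \begin{pmatrix} -\frac{y f_1}{f_2} \frac{\partial g_1}{\partial x} \\[1ex] -\frac{y f_1}{f_2} \frac{\partial g_2}{\partial x} \end{pmatrix}$, and
• $M := \begin{pmatrix} \frac{y}{f_2} \frac{\partial f_1}{\partial x} & \frac{y}{f_2} \frac{\partial f_1}{\partial y} \\[1ex] \frac{y}{f_2} \frac{\partial f_2}{\partial x} & \frac{y}{f_2} \frac{\partial f_2}{\partial y} \end{pmatrix}$,
so that this equation is written
$$N - y \cdot \frac{\partial}{\partial y} g + M \cdot g = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
Let $M_i$ denote the i-th row of M, and let
$$\alpha_i = \max\{\deg_y(M_i), 0\}.$$
Let
$$D = \operatorname{diag}(y^{-\alpha_1}, y^{-\alpha_2}), \quad A = D \cdot M, \quad \text{and} \quad B = D \cdot N.$$
Now we have
$$B - D \cdot y \cdot \frac{\partial}{\partial y} g + A \cdot g = 0. \tag{1.7}$$
Note that by the construction of D, $\deg_y(A) \leq 0$, so the entries of D and A are elements of $K(x)[[\frac{1}{y}]]$. Hence we can write
$$D = D_0 + D_1 y^{-1} + \ldots, \quad A = A_0 + A_1 y^{-1} + \ldots,$$
where each $D_i$ is in $M_{2 \times 2}(K)$, each $A_i$ is in $M_{2 \times 2}(K(x))$, and the series for A is possibly infinite.
Let $\mu = \deg_y(g)$ and $\nu = \deg_y(B)$. Recall that since the entries of g are polynomials, $\mu \geq 0$, whereas ν may be negative. Thus, we can write
$$g = \begin{pmatrix} c_\mu \\ d_\mu \end{pmatrix} y^\mu + \text{lower degree terms},$$
where $\begin{pmatrix} c_\mu \\ d_\mu \end{pmatrix} \in M_{2 \times 1}(K[x])$ and at least one of $c_\mu$ and $d_\mu$ is non-zero. Now equation (1.7) becomes
$$\operatorname{lc}(B) \cdot y^\nu - (\mu \cdot D_0 - A_0) \cdot \begin{pmatrix} c_\mu \\ d_\mu \end{pmatrix} \cdot y^\mu + \text{terms of degree lower than } \max\{\nu, \mu\} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
Let $\gamma = \deg_y \frac{y f_1}{f_2} = \deg_y(y f_1) - \deg_y(f_2)$. We see from the definition of B that $\nu \leq \gamma + \mu$. Since we have assumed $\gamma < 0$, we have that $\nu < \mu$. It follows that $(c_\mu, d_\mu)^T$ is a non-zero element of the null space of $\mu D_0 - A_0$, so $\det(\mu D_0 - A_0) = 0$. Therefore µ belongs to the set
$$R = \{n \in \mathbb{N} : \det(n \cdot D_0 - A_0) = 0\}.$$
Observe that if $\det(\lambda D_0 - A_0) \neq 0$ as a polynomial in λ, then R is finite and $\deg_y g \in R$.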
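We first examine the first row of M. It follows from the hypotheses that
$$\deg_y\left(\frac{y}{f_2} \cdot \frac{\partial f_1}{\partial x}\right) < 0 \quad \text{and} \quad \deg_y\left(\frac{y}{f_2} \cdot \frac{\partial f_1}{\partial y}\right) < 0.$$
Hence, $\alpha_1 = 0$.
Now we consider the second row. Observe that $\gamma < 0$ implies $\deg_y f_2 \geq 2$, so
$$\deg_y\left(\frac{y}{f_2} \frac{\partial f_2}{\partial y}\right) = 0.$$
Since $\deg_y \frac{\partial f_2}{\partial x} < \deg_y f_2$, it follows that
$$\deg_y\left(\frac{y}{f_2} \frac{\partial f_2}{\partial x}\right) \leq 0.$$
Thus $\alpha_2 = 0$ and it follows that $D = \operatorname{diag}(1, 1)$ and $A = M$.
Write $f_2 = a y^b + {}$ terms of lower degree in y, where $b \in \mathbb{N}$ and $a \in K$. We see that
$$A_0 = \begin{pmatrix} 0 & 0 \\ * & b \end{pmatrix}.$$
Now
$$\lambda D_0 - A_0 = \begin{pmatrix} \lambda & 0 \\ * & \lambda - b \end{pmatrix}.$$
Now $R = \{0, b\}$, so $\deg_y g = b$ or 0.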
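Corollary 1.4.1. Let K be a field of characteristic 0. Let d be a K-derivation on $K[x, y]$ given by
$$d\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \end{pmatrix}$$
satisfying the conditions
• $f_1 \neq 0$,
• $\deg_x \frac{\partial f_1}{\partial y} < \deg_x f_1$, and
• $\deg_x(x f_2) < \deg_x f_1$.
If δ is a K-derivation on $K[x, y]$ defined by
$$\delta\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} g_1 \\ g_2 \end{pmatrix}$$
and δ commutes with d, then $\max\{\deg_x g_1, \deg_x g_2\} \leq \deg_x f_1$.
Proof. This is identical to the proof of Proposition 1.4.1 but with the roles of x and y switched.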
Corollary 1.4.2. Let K be a field of characteristic 0. Let $d = f_1 \frac{\partial}{\partial x} + f_2 \frac{\partial}{\partial y}$