A New Method for Computing Elimination Ideals of Likelihood
Equations
Xiaoxian Tang∗
School of Mathematics and Systems Science, Beihang University, Beijing, China
[email protected]
Timo de Wolff
Department für Mathematik, Technische Universität Braunschweig
Braunschweig, Germany
t.de-wolff@tu-braunschweig.de
Rukai Zhao
Computer Science & Engineering,
Texas A&M University
College Station, Texas
ABSTRACT
We develop a probabilistic algorithm for computing elimination ideals of likelihood equations. We show experimentally that, for medium to large size models, it is far more efficient than directly computing Gröbner bases or the interpolation method proposed in [39, 40]. Furthermore, we deduce discriminants of the elimination ideals, which play a central role in real root classification. In particular, we can compute the discriminant of one Jukes-Cantor model in phylogenetics (stored as a text file of size 8 GB).
CCS CONCEPTS
•Computing methodologies → Symbolic and algebraic
manipulation; Algebraic algorithms.
KEYWORDS
Maximum likelihood estimation, Likelihood equation, Real root classification, Discriminant, Elimination ideal
ACM Reference Format:
Xiaoxian Tang, Timo de Wolff, and Rukai Zhao. 2019. A New Method for Computing Elimination Ideals of Likelihood Equations. In International Symposium on Symbolic and Algebraic Computation (ISSAC ’19), July 15–18, 2019, Beijing, China. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3326229.3326241
1 INTRODUCTION
This work is motivated by the maximum likelihood estimation problem in statistics:
Which probability distribution describes a given data set optimally for a chosen statistical model?
A standard way to answer this question is to determine a point in the model that maximizes a likelihood function; see (1). When the model is algebraic (see Definition 2.2) and the data is discrete, one finds all critical points of the likelihood function by solving a system of likelihood equations (2), obtained by applying Lagrange multipliers.
∗ Corresponding author
ISSAC ’19, July 15–18, 2019, Beijing, China © 2019 Association for Computing Machinery. ACM ISBN 978-1-4503-6084-5/19/07. . . $15.00 https://doi.org/10.1145/3326229.3326241
Solving likelihood equations motivates an important branch in algebraic statistics [1, 13, 14, 24–27, 31].
Likelihood equations form an algebraic system in probability variables $p_0, \ldots, p_n$, Lagrange multipliers $\lambda_1, \ldots, \lambda_{s+1}$, and parameters $u_0, \ldots, u_n$ representing the data obtained from statistical experiments. Given such a system with a generically chosen data vector $(u_0, \ldots, u_n)$, the number of complex solutions is a finite non-negative constant, called the maximum-likelihood-degree (ML-degree); see Definition 2.5 and [27, 31]. Since the variables $p_i$ represent probabilities, one is especially interested in a real solution classification [15, 33, 47] of likelihood equations. Unfortunately, this classification is very challenging, since it is a specific real quantifier elimination problem [2–5, 7–11, 16, 17, 23, 28–30, 34–38, 41, 42, 45], a fundamental problem in computational real algebraic geometry. The number of real solutions only changes when the parameters (data) pass through a set called the discriminant variety; see [33, Definition 1]. Hence, the discriminant varieties of likelihood equations play a core role in real solution classification. In [40], Rodriguez and the first author studied how to compute discriminant varieties for a likelihood equation system efficiently. Experiments [40, Tables 2–3] suggest that Gröbner bases [12, 21, 22] cannot be computed directly for medium to large size models. Rodriguez and the first author proposed in [40, Algorithm 2] a probabilistic algorithm based on evaluation/interpolation techniques, which, in theory, works for arbitrary general zero-dimensional systems. In practice, however, it is limited to small algebraic models with ML-degrees not greater than 6. The main bottleneck is that the discriminants we are trying to compute are huge; for instance, a model with ML-degree 6 has a discriminant with total degree 12 and thousands of terms.
The key idea of this article is to exploit the special structure of likelihood equations, both to improve the computational efficiency and to allow ourselves to compute generators of elimination ideals instead of computing discriminants directly. We summarize the entire challenge in Figure 1. Thus, the task of this paper is to efficiently compute the elimination ideal with respect to all parameters (data) and one variable for a given system of likelihood equations.
More precisely, we have the following problem statement:
Input: Likelihood equations $f_0, \ldots, f_{n+s+1} \in \mathbb{Q}[u_0, \ldots, u_n, p_0, \ldots, p_n, \lambda_1, \ldots, \lambda_{s+1}]$.
Output: A generator of the elimination ideal with respect to one probability variable $p_i$ for some $i$ between $0$ and $n$.
[Figure 1 diagram: the tasks LE, RSC, DV, and EI (the goal), grouped under "Algebraic Statistics" and "Computational Algebraic Geometry" and linked by arrows labeled "motivates" and "fundamental".]
LE: Solving Likelihood Equations; RSC: Real Solution Classification; DV: Computing Discriminant Varieties; EI: Computing Elimination Ideals.
Figure 1: A visualization of our motivation.
                      Timings
Models  #p_i  MLD     Standard   Interpolation   Algorithm 1
1       4     3       0.046 s    1.831 s         0.525 s
2       6     2       0.524 s    24.983 s        2.310 s
3       6     4       3.211 s    282.425 s       16.174 s
4       6     6       ∞          7933.230 s      782.676 s
5       5     12      ∞          10726.268 s     761.257 s
6       9     10      ∞          > 1583 d        14 d
7       5     23      ∞          9919.260 s      4624.575 s
8       8     14      ∞          > 4667 d        > 15 d
9       8     9       ∞          > 39 d          2 d
Table 1: Runtimes for computing elimination ideals (s: seconds; d: days). The models are listed in the same order as in [40, Table 3]. The column “Standard” contains the runtimes of a regular FGb Gröbner basis computation, the column “Interpolation” contains the runtimes for [40, Algorithm 2], and the last column contains the runtimes for our Algorithm 1.
Our main contribution is the development of a probabilistic algorithm, Algorithm 1, for computing elimination ideals of Lagrange likelihood equations. We implemented the algorithm in Maple. Our experiments, summarized in Table 1, show that for statistical models beyond small size, Algorithm 1 is significantly more efficient than the standard approach of directly computing Gröbner bases, or the evaluation/interpolation method in [40]; see Section 6 for further details. Our crucial idea to save computational time is to exploit the observation that the elimination ideals have a special structure; see (3).
In particular, we are able to compute the discriminant of one Jukes-Cantor model [27,Pcomb, Example 15] in phylogenetics [44, Chapter 15]. This was impossible before. We point out that this is a gigantic polynomial, whose total degree is 176, and which takes several GB of memory when stored in a text file; see Table 2 (Model 9) for further details.
The article is organized as follows. Section 2 contains preliminaries. In Section 3, we discuss the specialization properties of (radical) elimination ideals and multivariate factorization. In Section 4, we introduce general zero-dimensional systems. In Section 5, we introduce Algorithm 1 together with a list of sub-algorithms for computing elimination ideals of likelihood equations. In Section 6, we explain the implementation details and compare the efficiency of our code with existing methods.
2 PRELIMINARIES
We assume that the reader is familiar with the fundamental concepts of computational algebraic geometry. For a general overview, we refer the reader to [18] and [43].
2.1 Notation
Throughout the paper, we use bold letters for vectors or finite sets of polynomials, e.g., $\mathbf{z} = (z_1, \ldots, z_n)$ and $\mathbf{H} = \{h_1, \ldots, h_m\}$.
For $h \in \mathbb{Q}[\mathbf{z}]$, we denote the total degree of $h$ by $\deg(h)$ and the degree of $h$ with respect to a particular variable $z_j$ by $\deg(h, z_j)$. We denote by $\mathrm{coeff}(h, z_j^i)$ the coefficient of $h$ with respect to the monomial $z_j^i$. If $N = \deg(h, z_j)$, then we simply denote $\mathrm{coeff}(h, z_j^N)$ by $\mathrm{lcoeff}(h, z_j)$. For $\mathbf{H} \subseteq \mathbb{Q}[\mathbf{z}]$, we denote by $\langle \mathbf{H} \rangle$ the ideal generated by $\mathbf{H}$ in $\mathbb{Q}[\mathbf{z}]$, and by $V(\mathbf{H})$ the affine variety $\{\mathbf{z} \in \mathbb{C}^n \mid h(\mathbf{z}) = 0 \text{ for all } h \in \mathbf{H}\}$. For any ideal $I \subset \mathbb{Q}[\mathbf{z}]$, we denote by $\sqrt{I}$ the radical ideal of $I$, and by $V(I)$ the affine variety defined by the generator polynomials of $I$. For any subset $S \subseteq \mathbb{C}^n$, we denote by $I(S)$ the ideal of polynomials vanishing on $S$, i.e., $I(S) = \{h \in \mathbb{Q}[\mathbf{z}] \mid h(\mathbf{z}^*) = 0 \text{ for all } \mathbf{z}^* \in S\}$, and we denote by $\overline{S}$ the Zariski closure $V(I(S))$ of $S$ in $\mathbb{C}^n$. For a positive integer $n$ and any $1 \le i \le n$, we denote the canonical projection by $\mathrm{proj}_i : \mathbb{C}^n \to \mathbb{C}^i$. Finally, we denote by $\gcd(a, b)$ the greatest common divisor of two integers $a, b$.
2.2 Algebraic Statistics
In this section, we recall the basic notions from algebraic statistics that we need in this article.
Definition 2.1 (Probability Simplex). We define the $n$-dimensional probability simplex as $\Delta_n = \{(p_0, \ldots, p_n) \in \mathbb{R}^{n+1} \mid p_0 > 0, \ldots, p_n > 0,\ p_0 + \cdots + p_n = 1\}$.
Definition 2.2 (Algebraic Statistical Model and Model Invariant). Given homogeneous polynomials $g_1, \ldots, g_s \in \mathbb{Q}[p_0, \ldots, p_n]$ such that $V(g_1, \ldots, g_s) \subsetneq \mathbb{C}^{n+1}$ is irreducible and generically reduced, we define an algebraic statistical model as $M = V(g_1, \ldots, g_s) \cap \Delta_n$. Each $g_i$ is called a model invariant of $M$. If $V(g_1, \ldots, g_s)$ has codimension $s$, then we say $\{g_1, \ldots, g_s\}$ is a set of independent model invariants.
Given an algebraic statistical model $M$ and a data vector $u = (u_0, \ldots, u_n) \in \mathbb{R}^{n+1}_{\ge 0}$, the maximum likelihood estimation (MLE) problem is the optimization problem
\[ \max \prod_{k=0}^{n} p_k^{u_k} \quad \text{subject to } p \in M, \tag{1} \]
which is fundamental in statistics [20, Chapter 2]. One way to solve the MLE problem is to solve a system of likelihood equations [27] formulated by the Lagrange multiplier method. We give the explicit formulation of such a system:
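Definition 2.3 (Lagrange Likelihood Equations). Given an algebraic statistical model $M$ with a set of independent model invariants $\{g_1, \ldots, g_s\}$, the polynomial set $\{f_0, \ldots, f_{n+s+1}\}$ below is said to be the system of Lagrange likelihood equations of $M$ when its elements are set equal to zero:
\[
\begin{aligned}
f_0(u, p, \lambda) &= p_0 \left( \frac{\partial g_1}{\partial p_0} \lambda_1 + \cdots + \frac{\partial g_s}{\partial p_0} \lambda_s + \lambda_{s+1} \right) - u_0, \\
&\ \ \vdots \\
f_n(u, p, \lambda) &= p_n \left( \frac{\partial g_1}{\partial p_n} \lambda_1 + \cdots + \frac{\partial g_s}{\partial p_n} \lambda_s + \lambda_{s+1} \right) - u_n, \\
f_{n+1}(u, p, \lambda) &= g_1(p_0, \ldots, p_n), \\
&\ \ \vdots \\
f_{n+s}(u, p, \lambda) &= g_s(p_0, \ldots, p_n), \\
f_{n+s+1}(u, p, \lambda) &= g_{s+1}(p_0, \ldots, p_n) := p_0 + \cdots + p_n - 1,
\end{aligned} \tag{2}
\]
with indeterminates $u = (u_0, \ldots, u_n)$, $p = (p_0, \ldots, p_n)$, and $\lambda = (\lambda_1, \ldots, \lambda_{s+1})$. More specifically, $u_0, \ldots, u_n$ are parameters, and $p_0, \ldots, p_n, \lambda_1, \ldots, \lambda_{s+1}$ are variables.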
Theorem 2.4. [27] Given a system of Lagrange likelihood equations $f_0, \ldots, f_{n+s+1}$ defined in (2), there exist an affine variety $V \subsetneq \mathbb{C}^{n+1}$ and a non-negative integer $N$ such that for any $b \in \mathbb{C}^{n+1} \setminus V$, the equations
\[ f_0(b, p, \lambda) = \cdots = f_{n+s+1}(b, p, \lambda) = 0 \]
have $N$ common complex solutions in $\mathbb{C}^{n+1} \times \mathbb{C}^{s+1}$.
Definition 2.5 (Maximum-Likelihood-Degree). [27] Given an algebraic statistical model $M$ with a system of Lagrange likelihood equations defined as in (2), the non-negative integer $N$ stated in Theorem 2.4 is called the maximum-likelihood-degree, or ML-degree for short, of $M$.
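Definition 2.6 (Mixed Discriminant). [40, Definition 4] Given an algebraic statistical model $M$ with a system of Lagrange likelihood equations $f = \{f_0, \ldots, f_{n+s+1}\}$ defined in (2), we define
\[ L_{MJ} = \mathrm{proj}_{n+1}(V(f) \cap V(J)), \]
where $J$ denotes the determinant of the Jacobian matrix of $f$ with respect to $(p, \lambda)$:
\[
J = \det \begin{pmatrix}
\frac{\partial f_0}{\partial p_0} & \cdots & \frac{\partial f_0}{\partial p_n} & \frac{\partial f_0}{\partial \lambda_1} & \cdots & \frac{\partial f_0}{\partial \lambda_{s+1}} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_{n+s+1}}{\partial p_0} & \cdots & \frac{\partial f_{n+s+1}}{\partial p_n} & \frac{\partial f_{n+s+1}}{\partial \lambda_1} & \cdots & \frac{\partial f_{n+s+1}}{\partial \lambda_{s+1}}
\end{pmatrix}.
\]
If $I(L_{MJ})$ is principal, then a generator polynomial of $I(L_{MJ})$, denoted by $D_{MJ}$, is said to be a mixed discriminant of $f$.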
It is important to highlight that $L_{MJ}$ is a component of the discriminant variety [33] of Lagrange likelihood equations, which plays a central role in real root classification; see e.g., [40, Theorem 2]. Notice that if $I(L_{MJ})$ is principal, then the mixed discriminant $D_{MJ}$ is homogeneous in $\mathbb{Q}[u]$; see [40, Proposition 2]. By Definition 2.6, $D_{MJ}$ can be computed by computing the elimination ideal $\langle f, J \rangle \cap \mathbb{Q}[u]$. That means computing a Gröbner basis of $\langle f, J \rangle$ with respect to a lexicographic order, where the determinant $J$ of the Jacobian matrix, defined as in Definition 2.6, can be huge. So, computing $D_{MJ}$ is usually challenging in practice; see [40, Tables 2–3].
According to [40, Algorithm 2, Strategy 3], one way to improve the efficiency of computing $D_{MJ}$ is to compute the elimination ideal with respect to one probability variable first; for instance, $\langle f \rangle \cap \mathbb{Q}[u, p_0]$. If $\sqrt{\langle f \rangle \cap \mathbb{Q}[u, p_0]} = \langle E_f \rangle$, then it is well known that $D_{MJ}$ is a factor of the discriminant of $E_f$ with respect to $p_0$; see [40, Lemma 3]. However, $E_f$ is hard to obtain via directly computing Gröbner bases; see the column “Standard” in Table 1. Therefore, the goal of the rest of the paper is to compute $E_f$ more efficiently.
3 SPECIALIZATION PROPERTIES
In this section, we discuss a selection of specialization properties, which guarantee the correctness of our Algorithm 1 in Section 5. Roughly speaking, elimination ideals (and their radicals) “specialize well” over a Zariski open set (see Proposition 3.2), and multivariate factorization behaves similarly for suitably chosen rational points (see Proposition 3.3).
In what follows, we consider polynomial rings with at least two variables, i.e., $\mathbb{Q}[z_1, \ldots, z_n]$ with $n \ge 2$. Given $h \in \mathbb{Q}[z_1, \ldots, z_n]$, we denote, for every $1 \le i < n$, by $\mathrm{lm}_i(h)$ and $\mathrm{lcoeff}_i(h)$ the leading monomial and leading coefficient of $h$ with respect to $z_{i+1}, \ldots, z_n$, when $h$ is considered in $\mathbb{Q}(z_1, \ldots, z_i)[z_{i+1}, \ldots, z_n]$ with the lexicographic order $z_{i+1} < \cdots < z_n$. For every $b = (b_1, \ldots, b_i) \in \mathbb{C}^i$, we define the polynomial $h(b) = h|_{z_1 = b_1, \ldots, z_i = b_i} \in \mathbb{C}[z_{i+1}, \ldots, z_n]$. For every polynomial set $H \subseteq \mathbb{Q}[z_1, \ldots, z_n]$, we define $H(b) = \{h(b) \in \mathbb{C}[z_{i+1}, \ldots, z_n] \mid h \in H\}$.
Definition 3.1. [32, Definition 4.1] Given $H \subseteq \mathbb{Q}[z_1, \ldots, z_n]$, for any $1 \le i < n$, a subset $\mathbf{g}$ of $H$ is a noncomparable subset of $H$ with respect to $z_{i+1}, \ldots, z_n$ if
(1) for every $h \in H$, there exists a $g \in \mathbf{g}$ such that $\mathrm{lm}_i(h)$ is a multiple of $\mathrm{lm}_i(g)$, and
(2) for any $g_1, g_2 \in \mathbf{g}$ with $g_1 \ne g_2$, the leading monomial $\mathrm{lm}_i(g_1)$ is not a multiple of $\mathrm{lm}_i(g_2)$, and $\mathrm{lm}_i(g_2)$ is not a multiple of $\mathrm{lm}_i(g_1)$.
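Proposition 3.2. Given $H \subseteq \mathbb{Q}[z_1, \ldots, z_n]$, for any $1 \le i < n$, if the elimination ideal
\[ \langle H \rangle \cap \mathbb{Q}[z_1, \ldots, z_i, z_{i+1}] = \langle q \rangle \quad \text{with } \deg(q, z_{i+1}) > 0, \]
and if $\sqrt{\langle q \rangle} = \langle g \rangle$, then
(1) there exists an affine variety $V \subsetneq \mathbb{C}^i$ such that for any $b \in \mathbb{C}^i \setminus V$, $\langle H(b) \rangle \cap \mathbb{C}[z_{i+1}] = \langle q(b) \rangle$, and
(2) there exists an affine variety $W \subsetneq \mathbb{C}^i$ such that for any $b \in \mathbb{C}^i \setminus W$, $\sqrt{\langle H(b) \rangle \cap \mathbb{C}[z_{i+1}]} = \langle g(b) \rangle$.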
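Proof. Let $G$ be a Gröbner basis of $\langle H \rangle$ with respect to the lexicographic order $z_1 < \cdots < z_n$. For any $1 \le i < n$, let $N$ be a noncomparable subset of $G$ with respect to $z_{i+1}, \ldots, z_n$.
Part (1): If $\langle H \rangle \cap \mathbb{Q}[z_1, \ldots, z_{i+1}] = \langle q \rangle$, then by [18, page 121, Theorem 2], $G \cap \mathbb{Q}[z_1, \ldots, z_{i+1}]$ is a Gröbner basis of $\langle q \rangle$. So $G \cap \mathbb{Q}[z_1, \ldots, z_{i+1}]$ contains only one element, say $h$, and hence $h = c \cdot q$ where $c \in \mathbb{Q}$. Also, $G_i = G \cap \mathbb{Q}[z_1, \ldots, z_i] = \emptyset$ since $\deg(h, z_{i+1}) = \deg(q, z_{i+1}) > 0$, and hence $V(G_i) = \mathbb{C}^i$. By [32, Theorem 4.3], there exists an affine variety $V \subsetneq \mathbb{C}^i$ such that for any $b \in \mathbb{C}^i \setminus V$, $N(b)$ is a Gröbner basis of $\langle H(b) \rangle$. By [18, page 121, Theorem 2], $N(b) \cap \mathbb{C}[z_{i+1}]$ is a Gröbner basis of $\langle H(b) \rangle \cap \mathbb{C}[z_{i+1}]$. Notice that $N(b) \cap \mathbb{C}[z_{i+1}] = \{h(b)\}$. So, we have $\langle H(b) \rangle \cap \mathbb{C}[z_{i+1}] = \langle h(b) \rangle = \langle q(b) \rangle$.
Part (2): If $\sqrt{\langle q \rangle} = \langle g \rangle$, then $V(q) = V(g)$. So, for any $b \in \mathbb{C}^i$, $V(q(b)) = V(g(b))$. Hence, we have $\sqrt{\langle q(b) \rangle} = \sqrt{\langle g(b) \rangle}$. Note that $\langle g \rangle$ is a radical ideal. Then it is a basic fact that there exists an affine variety $V_1 \subsetneq \mathbb{C}^i$ such that for any $b \in \mathbb{C}^i \setminus V_1$, $\langle g(b) \rangle$ is still radical, and hence $\sqrt{\langle q(b) \rangle} = \sqrt{\langle g(b) \rangle} = \langle g(b) \rangle$. By part (1), there exists an affine variety $V_2 \subsetneq \mathbb{C}^i$ such that for any $b \in \mathbb{C}^i \setminus V_2$, we have the equality in part (1). Let $W = V_1 \cup V_2$. Then for any $b \in \mathbb{C}^i \setminus W$, we have $\sqrt{\langle H(b) \rangle \cap \mathbb{C}[z_{i+1}]} = \sqrt{\langle q(b) \rangle} = \langle g(b) \rangle$. □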
Proposition 3.3. Let $g \in \mathbb{Q}[z_1, \ldots, z_n]$. If $g = \prod_{k=1}^{r} g_k^{m_k}$, where every $g_k$ is irreducible in $\mathbb{Q}[z_1, \ldots, z_n]$, and $g_j \ne g_k$ for any $j \ne k$, then for any $1 \le i < n$ there exists an infinite subset $\Gamma \subseteq \mathbb{Q}^i$ such that for any $b \in \Gamma$ we have $g(b) = \prod_{k=1}^{r} g_k(b)^{m_k}$, where $g_k(b)$ is irreducible in $\mathbb{Q}[z_{i+1}, \ldots, z_n]$, and $g_j(b) \ne g_k(b)$ for any $j \ne k$.
Proof. Given that $g_1, \ldots, g_r$ are irreducible, by Hilbert’s irreducibility theorem, see e.g., [46, Theorem 1], for any $1 \le i < n$, there exists an infinite subset $\Theta \subseteq \mathbb{Q}^i$ such that for any $b \in \Theta$, $g_1(b), \ldots, g_r(b)$ are irreducible in $\mathbb{Q}[z_{i+1}, \ldots, z_n]$. Now consider an arbitrary pair $g_j, g_k$ with $j \ne k$, and thus $g_j \ne g_k$. For each such pair, let
\[ W_{j,k} = \{ b \in \mathbb{C}^i \mid g_j(b) - g_k(b) = 0 \}. \]
Obviously, $W_{j,k}$ is an affine variety, which does not equal $\mathbb{C}^i$. Then let $\Gamma = \Theta \cap \bigcap_{k=1}^{r} \bigcap_{j=1}^{k-1} \left( \mathbb{Q}^i \setminus W_{j,k} \right)$, and we are done. □
4 GENERAL ZERO-DIMENSIONAL SYSTEMS
Throughout the rest of the paper, we always assume that a system of Lagrange likelihood equations is general zero-dimensional; see Definition 4.1. A general zero-dimensional system has a nice structure, see [19, Theorem 6.10], which leads us to analyze its elimination ideals further. The relation between the Shape Lemma [6] and [19, Theorem 6.10] was discussed in [19].
Definition 4.1 (General Zero-Dimensional System). A polynomial set $H = \{h_1, \ldots, h_m\} \subseteq \mathbb{Q}[a_1, \ldots, a_k, y_1, \ldots, y_m]$ is called a general zero-dimensional system if there exists an affine variety $V \subsetneq \mathbb{C}^k$ such that for any $b = (b_1, \ldots, b_k) \in \mathbb{C}^k \setminus V$, the equations $h_1(b) = \cdots = h_m(b) = 0$ satisfy:
(1) the number of complex solutions is a positive constant, denoted by $N(H)$;
(2) all complex solutions are distinct;
(3) for every pair of distinct complex solutions $y^* = (y_1^*, \ldots, y_m^*)$ and $z^* = (z_1^*, \ldots, z_m^*)$, it holds that $y_1^* \ne z_1^*$.
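As a small sanity check of what these conditions mean on a single specialization, the sketch below (Python, standard library only) takes the monic cubic in $p_0$ that the running example of Section 5.1 obtains for the data vector $(5, 6, 11, 32)$ and confirms, via the classical cubic discriminant, that its three roots are distinct. The function name `cubic_discriminant` is ours, and one sample does not establish genericity; it only illustrates the expected behavior for a model of ML-degree 3.

```python
from fractions import Fraction as F

def cubic_discriminant(a, b, c, d):
    """Discriminant of the cubic a*x^3 + b*x^2 + c*x + d."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)

# Coefficients of the monic specialization from Section 5.1,
# obtained there for the data vector (5, 6, 11, 32).
disc = cubic_discriminant(F(1), F(-7, 5), F(481, 1458), F(-5, 243))

# A nonzero discriminant means the cubic has 3 distinct complex roots,
# i.e., 3 solutions with pairwise distinct p0-coordinates.
print("distinct roots:", disc != 0)
```

Exact rational arithmetic is used so that the test for a vanishing discriminant is not affected by rounding.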
Proposition 4.2. Consider a general zero-dimensional system $H \subset \mathbb{Q}[a_1, \ldots, a_k, y_1, \ldots, y_m]$. If the elimination ideal
\[ \langle H \rangle \cap \mathbb{Q}[a_1, \ldots, a_k, y_1] \]
is principal, then its radical ideal is generated by a polynomial $g \in \mathbb{Q}[a_1, \ldots, a_k, y_1]$ such that $\deg(g, y_1) = N(H)$.
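Proof. Let $G$ be a Gröbner basis of $\langle H \rangle$ with respect to the lexicographic order $a_1 < \cdots < a_k < y_1 < \cdots < y_m$. Since $H$ is general zero-dimensional, by [19, Theorem 6.10], there exists
\[ T_1 = C_N y_1^N + C_{N-1} y_1^{N-1} + \cdots + C_1 y_1 + C_0 \in G, \]
where $N = N(H)$, and $C_i \in \mathbb{Q}[a_1, \ldots, a_k]$. By [18, page 121, Theorem 2], $G \cap \mathbb{Q}[a_1, \ldots, a_k, y_1]$ is a Gröbner basis of $\langle H \rangle \cap \mathbb{Q}[a_1, \ldots, a_k, y_1]$. By the hypothesis that $\langle H \rangle \cap \mathbb{Q}[a_1, \ldots, a_k, y_1]$ is principal, $G \cap \mathbb{Q}[a_1, \ldots, a_k, y_1]$ contains only one element. Notice that $T_1 \in G \cap \mathbb{Q}[a_1, \ldots, a_k, y_1]$. So, we know $\{T_1\} = G \cap \mathbb{Q}[a_1, \ldots, a_k, y_1]$, and hence $\langle H \rangle \cap \mathbb{Q}[a_1, \ldots, a_k, y_1] = \langle T_1 \rangle$. By [18, page 187, Proposition 12], $\sqrt{\langle T_1 \rangle} = \langle g \rangle$, where
\[ g = \frac{T_1}{\gcd\left(T_1, \frac{\partial T_1}{\partial a_1}, \ldots, \frac{\partial T_1}{\partial a_k}, \frac{\partial T_1}{\partial y_1}\right)}. \]
Hence, $\deg(g, y_1) \le \deg(T_1, y_1) = N(H)$.
Below, we prove $\deg(g, y_1) \ge N(H)$. By Definition 4.1, there exists an affine variety $V_1 \subsetneq \mathbb{C}^k$ such that for any $b \in \mathbb{C}^k \setminus V_1$, $V(H(b))$ has $N(H)$ distinct complex points with distinct $y_1$-coordinates. By Proposition 3.2 (2), there exists an affine variety $V_2 \subsetneq \mathbb{C}^k$ such that for any $b \in \mathbb{C}^k \setminus V_2$,
\[ \sqrt{\langle H(b) \rangle \cap \mathbb{C}[y_1]} = \langle g(b) \rangle. \]
Let $b \in \mathbb{C}^k \setminus (V_1 \cup V_2)$. Then $g(b) = 0$ has $N(H)$ distinct complex solutions, which are the $y_1$-coordinates of the points in $V(H(b))$. So $\deg(g, y_1) \ge \deg(g(b), y_1) \ge N(H)$. □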
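The formula for the radical generator used in the proof can be illustrated in the simplest setting, a single variable and no parameters, where it reduces to $g = T / \gcd(T, T')$. The following sketch (Python, standard library only) is an illustration under that simplifying assumption and not the multivariate computation performed by the algorithm; polynomials are coefficient lists with the lowest degree first, and all helper names are ours.

```python
from fractions import Fraction

def poly(coeffs):
    """Normalize a coefficient list (lowest degree first) over Q."""
    p = [Fraction(c) for c in coeffs]
    while p and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    return poly([k * c for k, c in enumerate(p)][1:])

def divmod_poly(a, b):
    """Quotient and remainder of a divided by a nonzero polynomial b."""
    a, b = poly(a), poly(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        q[shift] = a[-1] / b[-1]
        for i, c in enumerate(b):
            a[shift + i] -= q[shift] * c
        a = poly(a)  # drop the cancelled leading term(s)
    return poly(q), a

def gcd_poly(a, b):
    """Monic greatest common divisor by the Euclidean algorithm."""
    a, b = poly(a), poly(b)
    while b:
        a, b = b, divmod_poly(a, b)[1]
    return [c / a[-1] for c in a]

def squarefree_part(t):
    """T / gcd(T, T'): generator of the radical of <T> in Q[y]."""
    return divmod_poly(t, gcd_poly(t, derivative(t)))[0]

# T = (y - 1)^2 (y + 2) = y^3 - 3y + 2, whose radical is generated by
# (y - 1)(y + 2) = y^2 + y - 2.
g = squarefree_part([2, -3, 0, 1])
print(g)
```

In this example the degree drops from 3 to 2 because of the double root, which is exactly the situation excluded for general zero-dimensional systems by condition (2) of Definition 4.1.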
5 ALGORITHM
Given an algebraic model $M$, let its Lagrange likelihood equation system be $f = \{f_0, \ldots, f_{n+s+1}\} \subseteq \mathbb{Q}[u, p, \lambda]$. Assuming $\langle f \rangle \cap \mathbb{Q}[u, p_0]$ is principal, we propose a probabilistic algorithm for computing the polynomial $E_f$ generating $\sqrt{\langle f \rangle \cap \mathbb{Q}[u, p_0]}$. We simply denote $\mathrm{coeff}(E_f, p_0^i)$ by $A_i(u)$. Then $E_f = \sum_{i=0}^{N} A_i(u)\, p_0^i$, where by Theorem 2.4, Definition 4.1 and Proposition 4.2, $N$ is the ML-degree of $M$. First, we highlight a fact:
(F1) $E_f$ is homogeneous with respect to $u$, and hence each $A_i$ is homogeneous with the same total degree in $\mathbb{Q}[u]$.
We omit the proof of (F1) since the argument is similar to [40, Proposition 2], which is based on a basic fact implied by (2): for every $(u, p_0) \in \mathrm{proj}_{n+2}(V(f))$ and for any complex scalar $\gamma \ne 0$, $(\gamma u, p_0)$ is also in $\mathrm{proj}_{n+2}(V(f))$.
Besides observing (F1), we make the following assumptions to simplify our algorithm:
(A1) Assume $\deg(A_N, u_0) = \deg(A_N)$, i.e., $A_N$ contains a term $u_0^{\deg(A_N)} \in \mathbb{Q}[u_0]$.
(A2) Assume $A_N$ is monic with respect to $u_0$, which normalizes our output polynomial $E_f$.
If (A1) does not hold, then we apply an invertible linear change to the parameters $u_j$ such that (A1) holds for the new parameters (similar to [40, Algorithm 4]). For instance, obtain new parameters $v_j$ as
\[ v_0 = u_0, \quad \text{and} \quad v_j = b_j u_j + u_0 \ \text{ for } j = 1, \ldots, n, \]
where the $b_j$ are randomly chosen rational numbers. By [40, Lemmas 1–2], $\deg(A_N(v), v_0)$ will be equal to $\deg(A_N(v))$.
Let
\[ S(u) = \sum_{i=0}^{n} u_i. \]
The key idea of our algorithm is an observation from experiments: $S(u)$ appears as a factor in some coefficients of $E_f$ with respect to $p_0$. So, we further write
\[ E_f(u, p_0) = \sum_{i=0}^{N} A_i(u)\, p_0^i = \sum_{i=0}^{N} S(u)^{\alpha_i} R_i(u)\, p_0^i, \tag{3} \]
where $R_i \in \mathbb{Q}[u] \setminus \langle S(u) \rangle$. In a separate paper, we intend to prove for a general model that at least one $\alpha_i$ in (3) is nonzero. The main algorithm has three steps; see Algorithm 1:
Step 1 Compute $N$, $(\alpha_0, \ldots, \alpha_N)$, and the degree of every $u_j$ in each coefficient $A_i(u)$.
Algorithm 1: (Main Algorithm)
input : Lagrange likelihood equations $f_0, \ldots, f_{n+s+1} \in \mathbb{Q}[u, p, \lambda]$
output : A generator of $\sqrt{\langle f_0, \ldots, f_{n+s+1} \rangle \cap \mathbb{Q}[u, p_0]}$: $E_f(u, p_0) = \sum_{i=0}^{N} A_i(u)\, p_0^i$
1 $N, (\alpha_0, \ldots, \alpha_N), L, \Omega \leftarrow$ Degrees($f_0, \ldots, f_{n+s+1}$)
2 $A_N(u) \leftarrow$ LeadingCoefficient($f_0, \ldots, f_{n+s+1}, \alpha_N, L$)
3 $A_0(u), \ldots, A_{N-1}(u) \leftarrow$ Coefficients($f_0, \ldots, f_{n+s+1}, A_N(u), \alpha_0, \ldots, \alpha_{N-1}, \Omega$)
4 Return $\sum_{i=0}^{N} A_i(u)\, p_0^i$
Algorithm 2: (Sub-Algorithm of Algorithm 1) Degrees
input : Lagrange likelihood equations $f_0, \ldots, f_{n+s+1} \in \mathbb{Q}[u, p, \lambda]$
output : $N, (\alpha_0, \ldots, \alpha_N), L, \Omega$, where
• $N = \deg(E_f, p_0)$,
• $\alpha_i$ is the multiplicity of the factor $\sum_{k=0}^{n} u_k$ appearing in $\mathrm{coeff}(E_f, p_0^i)$,
• $L$ is a list with length $n + 1$, whose $(j + 1)$-th entry is $\deg(\mathrm{lcoeff}(E_f, p_0), u_j)$ for $j = 0, \ldots, n$,
• $\Omega$ is an $N \times (n + 1)$ matrix, whose $(i + 1, j + 1)$-entry is $\deg(\mathrm{coeff}(E_f, p_0^i), u_j)$ for $i = 0, \ldots, N - 1$ and for $j = 0, \ldots, n$.
1 $f_0^*, \ldots, f_{n+s+1}^* \leftarrow$ replace $u_1, \ldots, u_n$ in $f_0, \ldots, f_{n+s+1}$ with some rational numbers $b_1, \ldots, b_n$
2 $g(u_0, p_0) \leftarrow$ generator of $\sqrt{\langle f_0^*, \ldots, f_{n+s+1}^* \rangle \cap \mathbb{Q}[u_0, p_0]}$
3 $N \leftarrow \deg(g, p_0)$
4 for $i$ from $0$ to $N$ do
5     $\alpha_i \leftarrow$ multiplicity of the factor $u_0 + \sum_{k=1}^{n} b_k$ in $\mathrm{coeff}(g, p_0^i)$
6 $L(1) \leftarrow \deg(\mathrm{lcoeff}(g, p_0), u_0)$
7 for $i$ from $0$ to $N - 1$ do
8     $\Omega(i + 1, 1) \leftarrow \deg(\mathrm{coeff}(g, p_0^i), u_0)$
9 for $j$ from $1$ to $n$ do
10     $f_0^*, \ldots, f_{n+s+1}^* \leftarrow$ replace $u_0, \ldots, u_{j-1}, u_{j+1}, \ldots, u_n$ in $f_0, \ldots, f_{n+s+1}$ with some rational numbers
11     $g(u_j, p_0) \leftarrow$ generator of $\sqrt{\langle f_0^*, \ldots, f_{n+s+1}^* \rangle \cap \mathbb{Q}[u_j, p_0]}$
12     $L(j + 1) \leftarrow \deg(\mathrm{lcoeff}(g, p_0), u_j)$
13     for $i$ from $0$ to $N - 1$ do
14         $\Omega(i + 1, j + 1) \leftarrow \deg(\mathrm{coeff}(g, p_0^i), u_j)$
15 return $N, (\alpha_0, \ldots, \alpha_N), L, \Omega$
Step 2 Compute the leading coefficient $A_N(u)$ by interpolating $R_N(u)$.
Step 3 Compute the coefficients $A_i(u)$ by interpolating $R_i(u)$ for $i = 0, \ldots, N - 1$.
We present the pseudocode in Algorithm 1 and its sub-algorithms Algorithms 2–6, and a running example in Section 5.1. Algorithm 1 is guaranteed to terminate since it contains only finite loops. The algorithm is probabilistic, but Propositions 3.2–3.3 guarantee that it provides the correct output for a generic choice of random rational numbers.
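5.1 Running Example
In this subsection, we illustrate how Algorithm 1 works using the linear model $M$ below, given by a weighted four-sided die [40, Example 1], for which we know the ML-degree is 3:
\[ M = V(p_0 + 2p_1 + 3p_2 - 4p_3) \cap \Delta_3, \]
where $\Delta_3 = \{(p_0, p_1, p_2, p_3) \in \mathbb{R}^4_{>0} \mid p_0 + p_1 + p_2 + p_3 = 1\}$. The input Lagrange likelihood equations (2) are
\[
\begin{aligned}
f_0 &= p_0\lambda_1 + p_0\lambda_2 - u_0, & f_1 &= p_1\lambda_1 + 2p_1\lambda_2 - u_1, \\
f_2 &= p_2\lambda_1 + 3p_2\lambda_2 - u_2, & f_3 &= p_3\lambda_1 - 4p_3\lambda_2 - u_3, \\
f_4 &= p_0 + 2p_1 + 3p_2 - 4p_3, & f_5 &= p_0 + p_1 + p_2 + p_3 - 1,
\end{aligned}
\]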
Algorithm 3: (Sub-Algorithm of Algorithm 1) LeadingCoefficient
input : Lagrange likelihood equations $f_0, \ldots, f_{n+s+1}$, and $\alpha_N, L$, where
• $\alpha_N$ is the multiplicity of the factor $\sum_{k=0}^{n} u_k$ appearing in $\mathrm{lcoeff}(E_f, p_0)$,
• $L$ is a list, whose $(j + 1)$-th entry is $\deg(\mathrm{lcoeff}(E_f, p_0), u_j)$ for $j = 0, \ldots, n$.
output : $\mathrm{lcoeff}(E_f, p_0)$: $A_N(u)$
1 $d \leftarrow L(1) - \alpha_N$   # Here, $d = \deg(R_N, u_0)$, and by (A1), $\deg(R_N, u_0) = \deg(R_N)$
2 for $i$ from $0$ to $d - 1$ do
3     Enumerate all the monomials in the set $\{u_1^{\beta_1} \cdots u_n^{\beta_n} \mid \sum_{j=1}^{n} \beta_j = d - i,\ 0 \le \beta_j \le L(j + 1) - \alpha_N\}$ as $U_{i,1}, \ldots, U_{i,t_i}$
4 $t \leftarrow \max(t_0, \ldots, t_{d-1})$
5 for $k$ from $1$ to $t$ do
6     $b_{k,1}, \ldots, b_{k,n} \leftarrow$ some rational numbers
7     $q(u_0) \leftarrow$ IntersectForLC($f_0, \ldots, f_{n+s+1}, b_{k,1}, \ldots, b_{k,n}, \alpha_N$)
8     $C^*_{0,k}, \ldots, C^*_{d-1,k} \leftarrow$ the coefficients of $q(u_0)$ with respect to $u_0^0, \ldots, u_0^{d-1}$
9 for $i$ from $0$ to $d - 1$ do
10     $M_i \leftarrow$ the $t_i \times t_i$ matrix whose $(k, r)$-entry is $U_{i,r}|_{u_1 = b_{k,1}, \ldots, u_n = b_{k,n}}$
11     $C_i(u_1, \ldots, u_n) \leftarrow (U_{i,1}, \ldots, U_{i,t_i})\, M_i^{-1}\, (C^*_{i,1}, \ldots, C^*_{i,t_i})^T$
12 Return $\left(\sum_{k=0}^{n} u_k\right)^{\alpha_N} \left(u_0^d + \sum_{i=0}^{d-1} C_i(u_1, \ldots, u_n)\, u_0^i\right)$
Algorithm 4: (Sub-Algorithm of Algorithm 3) IntersectForLC
input : Lagrange likelihood equations $f_0, \ldots, f_{n+s+1}$, some rational numbers $b_1, \ldots, b_n$ and $\alpha_N$, where $\alpha_N$ is the multiplicity of the factor $\sum_{k=0}^{n} u_k$ appearing in $\mathrm{lcoeff}(E_f, p_0)$.
output : $\left.\dfrac{\mathrm{lcoeff}(E_f, p_0)}{\left(\sum_{k=0}^{n} u_k\right)^{\alpha_N}}\right|_{u_1 = b_1, \ldots, u_n = b_n}$
1 $f_0^*, \ldots, f_{n+s+1}^* \leftarrow$ replace $u_1, \ldots, u_n$ in $f_0, \ldots, f_{n+s+1}$ with $b_1, \ldots, b_n$, respectively
2 $g(u_0, p_0) \leftarrow$ generator of the radical of the elimination ideal $\langle f_0^*, \ldots, f_{n+s+1}^* \rangle \cap \mathbb{Q}[u_0, p_0]$
3 $q(u_0) \leftarrow$ divide $\mathrm{lcoeff}(g, p_0)$ by $\left(u_0 + \sum_{i=1}^{n} b_i\right)^{\alpha_N}$
4 Make $q(u_0)$ monic with respect to $u_0$, and return $q(u_0)$
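Algorithm 5: (Sub-Algorithm of Algorithm 1) Coefficients
input : Lagrange likelihood equations $f_0, \ldots, f_{n+s+1}$, and $A_N(u), \alpha_0, \ldots, \alpha_{N-1}, \Omega$, where
• $A_N(u) = \mathrm{lcoeff}(E_f, p_0)$,
• $\alpha_i$ is the multiplicity of the factor $\sum_{k=0}^{n} u_k$ appearing in $\mathrm{coeff}(E_f, p_0^i)$,
• $\Omega$ is an $N \times (n + 1)$ matrix, whose $(i + 1, j + 1)$-entry is $\deg(\mathrm{coeff}(E_f, p_0^i), u_j)$.
output : $\mathrm{coeff}(E_f, p_0^i)$ for $i = 0, \ldots, N - 1$: $A_0(u), \ldots, A_{N-1}(u)$
1 $d \leftarrow \deg(A_N(u))$
2 for $i$ from $0$ to $N - 1$ do
3     Enumerate all the monomials in $\{u_0^{\beta_0} \cdots u_n^{\beta_n} \mid \sum_{j=0}^{n} \beta_j = d - \alpha_i,\ 0 \le \beta_j \le \Omega(i + 1, j + 1) - \alpha_i\}$ as $U_{i,1}, \ldots, U_{i,t_i}$
4 $t \leftarrow \max(t_0, \ldots, t_{N-1})$
5 for $k$ from $1$ to $t$ do
6     $b_{k,0}, \ldots, b_{k,n} \leftarrow$ some rational numbers
7     $g(p_0) \leftarrow$ Intersect($f_0, \ldots, f_{n+s+1}, b_{k,0}, \ldots, b_{k,n}$)
8     $C^*_{0,k}, \ldots, C^*_{N-1,k} \leftarrow$ the coefficients of $g(p_0)$ with respect to $p_0^0, \ldots, p_0^{N-1}$
9 for $i$ from $0$ to $N - 1$ do
10     $M_i \leftarrow$ the $t_i \times t_i$ matrix whose $(k, r)$-entry is $\left.\dfrac{U_{i,r}}{A_N(u)}\right|_{u_0 = b_{k,0}, \ldots, u_n = b_{k,n}}$
11     $R_i(u) \leftarrow (U_{i,1}, \ldots, U_{i,t_i})\, M_i^{-1}\, (C^*_{i,1}, \ldots, C^*_{i,t_i})^T$
12 Return $\left(\sum_{k=0}^{n} u_k\right)^{\alpha_0} R_0(u), \ldots, \left(\sum_{k=0}^{n} u_k\right)^{\alpha_{N-1}} R_{N-1}(u)$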
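where $p_0, p_1, p_2, p_3, \lambda_1, \lambda_2$ are variables, and $u_0, u_1, u_2, u_3$ are parameters. The output is a generator of $\sqrt{\langle f_0, \ldots, f_5 \rangle \cap \mathbb{Q}[u, p_0]}$. We write the generator in the form (3).
Step 1. First, we compute $N$, $(\alpha_0, \ldots, \alpha_N)$, and $\deg(A_i, u_j)$ for $j = 0, \ldots, 3$ and for $i = 0, \ldots, N$. For each $u_j \ne u_0$, substitute $u_j = b_j$ into $f_0, \ldots, f_5$, where $b_j$ is a random rational number.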
Algorithm 6: (Sub-Algorithm of Algorithm 5) Intersect
input : Lagrange likelihood equations $f_0, \ldots, f_{n+s+1}$, and some rational numbers $b_0, b_1, \ldots, b_n$
output : $\left.\dfrac{E_f(u, p_0)}{\mathrm{lcoeff}(E_f, p_0)}\right|_{u_0 = b_0, \ldots, u_n = b_n}$
1 $\tilde{f}_0, \ldots, \tilde{f}_{n+s+1} \leftarrow$ replace $u_0, \ldots, u_n$ in $f_0, \ldots, f_{n+s+1}$ with $b_0, \ldots, b_n$, respectively
2 $g(p_0) \leftarrow$ generator of the radical of the elimination ideal $\langle \tilde{f}_0, \ldots, \tilde{f}_{n+s+1} \rangle \cap \mathbb{Q}[p_0]$
3 Make $g(p_0)$ monic with respect to $p_0$, and return $g(p_0)$
For instance, we choose $b = (b_1, b_2, b_3) = (2, 12, 7)$. We substitute $u_j = b_j$, and rename the resulting polynomials as $f_0^*, \ldots, f_5^*$. Note that $f_k^* = f_k(u_0, b, p, \lambda)$. We obtain a generator of $\sqrt{\langle f_0^*, \ldots, f_5^* \rangle \cap \mathbb{Q}[u_0, p_0]}$ by computing a Gröbner basis:
\[ g^*(u_0, p_0) = 10(u_0 + 21)^2 p_0^3 - (u_0 + 21)(43u_0 + 276)\, p_0^2 + 2u_0(29u_0 + 396)\, p_0 - 24u_0^2. \]
If $b$ is generic in the parameter space $\mathbb{C}^3$, then by Proposition 3.2 (2), $g^*(u_0, p_0) = E_f(u_0, b, p_0)$ up to a nonzero rational factor. So, we have
\[ N = \deg(E_f(u, p_0), p_0) = \deg(E_f(u_0, b, p_0), p_0) = \deg(g^*, p_0) = 3. \]
And, for $i = 0, \ldots, N\ (= 3)$, we have
\[ \deg(A_i(u), u_0) = \deg(A_i(u_0, b), u_0) = \deg(\mathrm{coeff}(g^*, p_0^i), u_0) = 2. \]
So, we record $L(1) = \deg(A_3, u_0) = 2$ and $\Omega(i + 1, 1) = \deg(A_i, u_0) = 2$ for $i = 0, 1, 2$. Similarly, we compute the degrees of the other parameters, and have
\[ L = [2, 2, 2, 2], \quad \text{and} \quad \Omega = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 2 & 1 & 1 & 1 \\ 2 & 2 & 2 & 2 \end{pmatrix}, \]
where $L(j + 1)$ records $\deg(A_3, u_j)$, and $\Omega(i + 1, j + 1)$ records $\deg(A_i, u_j)$ for $i = 0, 1, 2$. Notice $S(u_0, b) = u_0 + 21$. By Proposition 3.3, checking the multiplicity of the factor $u_0 + 21$ in each $\mathrm{coeff}(g^*, p_0^i)$ for $i = 0, \ldots, 3$, we have $\alpha_0 = \alpha_1 = 0$, $\alpha_2 = 1$, and $\alpha_3 = 2$.
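The multiplicity check at the end of Step 1 can be reproduced with a few lines of exact arithmetic. The sketch below (Python, standard library only) expands the four coefficients of $g^*$ as polynomials in $u_0$ and counts how often $u_0 + 21$ divides each of them by repeated synthetic division. The helper name `multiplicity_of_root` and the list-based polynomial representation are ours; the implementation described in Section 6 works in Maple.

```python
from fractions import Fraction

def multiplicity_of_root(coeffs, root):
    """Multiplicity of (x - root) in a nonzero polynomial.

    `coeffs` lists the coefficients with the lowest degree first.
    """
    coeffs = [Fraction(c) for c in coeffs]
    if not any(coeffs):
        raise ValueError("the zero polynomial has no finite multiplicity")
    mult = 0
    while True:
        # Synthetic (Horner) division by (x - root), highest degree first.
        partial, acc = [], Fraction(0)
        for c in reversed(coeffs):
            acc = acc * root + c
            partial.append(acc)
        remainder = partial.pop()
        if remainder != 0:
            return mult
        mult += 1
        coeffs = list(reversed(partial))  # quotient, lowest degree first

# Coefficients of g* with respect to p0^0, ..., p0^3, expanded in u0:
g_star = [
    [0, 0, -24],          # -24 u0^2
    [0, 792, 58],         # 2 u0 (29 u0 + 396)
    [-5796, -1179, -43],  # -(u0 + 21)(43 u0 + 276)
    [4410, 420, 10],      # 10 (u0 + 21)^2
]

# S(u0, b) = u0 + 21 vanishes at u0 = -21.
alphas = [multiplicity_of_root(c, -21) for c in g_star]
print(alphas)  # [0, 0, 1, 2]
```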
Step 2. The second step is to recover the leading coefficient $A_N(u)$. By Step 1, we know $N = 3$ and $\alpha_3 = 2$. We write $A_N$ as $A_3(u) = S(u)^2 R_3(u)$. By the degrees recorded in $L$, we know the degrees of $u_0, u_1, u_2, u_3$ in $A_3(u)$ are all 2. So, $R_3(u) \in \mathbb{Q}$. According to the assumption (A2), $A_3(u)$ is monic with respect to $u_0$. Hence, $R_3(u) = 1$ and $A_3(u) = S(u)^2$.
Step 3. The last step is to interpolate $A_0(u)$, $A_1(u)$ and $A_2(u)$. As an example, we show how to interpolate $A_2(u)$ in detail. By Step 1, we have $\alpha_2 = 1$. So we write $A_2(u) = S(u) R_2(u)$. By the last row of $\Omega$, the degrees of $u_0, u_1, u_2, u_3$ in $A_2(u)$ are $2, 2, 2, 2$. Thus, the degrees of $u_0, u_1, u_2, u_3$ in $R_2(u)$ are $1, 1, 1, 1$, respectively. By (F1), $A_2$ is homogeneous, and we have $\deg(A_2) = \deg(A_3) = 2$. So $R_2$ is also homogeneous, and $\deg(R_2) = \deg(A_2) - \deg(S(u)) = 1$. Then we can assume $R_2(u) = \sum_{k=0}^{3} C_k u_k$, where $C_k \in \mathbb{Q}$. In order to determine the four coefficients $C_k$, we establish four linear equations by sampling four times. The correctness of this sampling step is guaranteed by Proposition 3.2 (2). We show below how to do the sampling and establish the first linear equation (4) in detail. The other equations (5)–(6) are obtained similarly.
For every $u_j$, substitute $u_j = b_j$ into $f_0, \ldots, f_5$, where $b_j$ is a random rational number. For instance, we choose $b = (b_0, b_1, b_2, b_3) = (5, 6, 11, 32)$. We substitute $u_j = b_j$, and rename the resulting polynomials as $\tilde{f}_0, \ldots, \tilde{f}_5$. Note $\tilde{f}_k = f_k(b, p, \lambda)$. We compute a generator of $\sqrt{\langle \tilde{f}_0, \ldots, \tilde{f}_5 \rangle \cap \mathbb{Q}[p_0]}$ and make it monic:
\[ \tilde{g}^{(1)}(p_0) = p_0^3 - \frac{7}{5} p_0^2 + \frac{481}{1458} p_0 - \frac{5}{243}. \]
By Proposition 3.2 (2), if $b$ is generic in $\mathbb{C}^4$, then $\tilde{g}^{(1)}(p_0) = \dfrac{E_f(b, p_0)}{A_3(b)}$. So $\mathrm{coeff}(\tilde{g}^{(1)}, p_0^2) = \dfrac{A_2(b)}{A_3(b)}$. By the discussion above, we have $A_2(u) = S(u) \sum_{k=0}^{3} C_k u_k$, and by Step 2, $A_3(u) = S(u)^2$. So, we obtain
\[ -\frac{7}{5} = \mathrm{coeff}(\tilde{g}^{(1)}, p_0^2) = \frac{A_2(b)}{A_3(b)} = \frac{5C_0 + 6C_1 + 11C_2 + 32C_3}{54}. \tag{4} \]
Similarly, we obtain the other linear equations by sampling:
\[ -\frac{311}{120} = \frac{11C_0 + 2C_1 + 3C_2 + 8C_3}{24}, \qquad -\frac{244}{115} = \frac{7C_0 + 2C_1 + 5C_2 + 9C_3}{23}, \tag{5} \]
\[ -\frac{181}{110} = \frac{7C_0 + 3C_1 + 13C_2 + 21C_3}{44}. \tag{6} \]
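The four sampled equations form a small linear system in $C_0, \ldots, C_3$, in which each denominator is $S(b)$ for the corresponding sample $b$. As a minimal sketch of this interpolation step (Python, standard library only; the helper `solve_linear` is ours and merely stands in for whatever linear solver an implementation uses), the system can be solved exactly over the rationals:

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination over Q.

    Assumes the matrix is square and invertible.
    """
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = 1 / aug[col][col]
        aug[col] = [v * inv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Sample points b = (b0, b1, b2, b3) and the sampled values A2(b)/A3(b)
# taken from equations (4)-(6).
samples = [(5, 6, 11, 32), (11, 2, 3, 8), (7, 2, 5, 9), (7, 3, 13, 21)]
values = [Fraction(-7, 5), Fraction(-311, 120),
          Fraction(-244, 115), Fraction(-181, 110)]

# A2(b)/A3(b) = R2(b)/S(b), so each sample gives sum_k C_k b_k = value * S(b).
rhs = [v * sum(b) for v, b in zip(values, samples)]
C = solve_linear(samples, rhs)
print(C)  # C0 = -43/10, C1 = -2, C2 = -3/2, C3 = -4/5
```

The sample matrix here has determinant $-666$, so the solution is unique.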
Solving for $C_0, \ldots, C_3$ from the four equations (4)–(6), we have
\[ C_0 = -\frac{43}{10}, \quad C_1 = -2, \quad C_2 = -\frac{3}{2}, \quad C_3 = -\frac{4}{5}, \]
and hence $A_2(u) = -S(u)\left(\frac{43}{10}u_0 + 2u_1 + \frac{3}{2}u_2 + \frac{4}{5}u_3\right)$. One can similarly interpolate $A_0(u)$ and $A_1(u)$. Finally, the output is
\[ E_f = S(u)^2 p_0^3 - S(u)\left(\tfrac{43}{10}u_0 + 2u_1 + \tfrac{3}{2}u_2 + \tfrac{4}{5}u_3\right) p_0^2 + \tfrac{1}{5}u_0(29u_0 + 23u_1 + 21u_2 + 14u_3)\, p_0 - \tfrac{12}{5}u_0^2. \]
Also, it is straightforward to check that the discriminant of $E_f$ with respect to $p_0$ is, up to a nonzero constant factor,
\[
\begin{aligned}
4u_0^2 (u_0 + u_1 + u_2 + u_3)^2 \big( & 441u_0^4 + 4998u_0^3u_1 + 20041u_0^2u_1^2 + 33320u_0u_1^3 + 19600u_1^4 \\
& - 756u_0^3u_2 + 20034u_0^2u_1u_2 + 83370u_0u_1^2u_2 + 79800u_1^3u_2 \\
& - 5346u_0^2u_2^2 + 55890u_0u_1u_2^2 + 119025u_1^2u_2^2 + 4860u_0u_2^3 + 76950u_1u_2^3 + 18225u_2^4 \\
& - 1596u_0^3u_3 - 11116u_0^2u_1u_3 - 17808u_0u_1^2u_3 + 4480u_1^3u_3 \\
& + 7452u_0^2u_2u_3 - 7752u_0u_1u_2u_3 + 49680u_1^2u_2u_3 \\
& - 17172u_0u_2^2u_3 + 71460u_1u_2^2u_3 + 27540u_2^3u_3 \\
& + 2116u_0^2u_3^2 + 6624u_0u_1u_3^2 - 4224u_1^2u_3^2 - 9528u_0u_2u_3^2 + 15264u_1u_2u_3^2 + 14724u_2^2u_3^2 \\
& - 1216u_0u_3^3 - 512u_1u_3^3 + 3264u_2u_3^3 + 256u_3^4 \big),
\end{aligned}
\]
where the last factor is the mixed discriminant of $f_0, \ldots, f_5$.
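The output of the running example can be cross-checked without any Gröbner basis computation. For this linear model, the equations $f_0 = \cdots = f_3 = 0$ can be solved for $u$ once a point $p$ on the model and multipliers $\lambda_1, \lambda_2$ are chosen, which yields exact points of $V(f)$ on which $E_f$ must vanish. The sketch below (Python, standard library only) performs this check at two hand-picked rational points, compares the specialization at $b = (5, 6, 11, 32)$ with $\tilde{g}^{(1)}$, and tests the homogeneity stated in (F1). The function names and the chosen sample points are ours.

```python
from fractions import Fraction as F

def coefficients(u):
    """(A0, A1, A2, A3) of E_f at the data vector u, as reported in the text."""
    u0, u1, u2, u3 = u
    s = sum(u)
    a3 = s ** 2
    a2 = -s * (F(43, 10) * u0 + 2 * u1 + F(3, 2) * u2 + F(4, 5) * u3)
    a1 = F(1, 5) * u0 * (29 * u0 + 23 * u1 + 21 * u2 + 14 * u3)
    a0 = -F(12, 5) * u0 ** 2
    return (a0, a1, a2, a3)

def E_f(u, p0):
    return sum(a * p0 ** i for i, a in enumerate(coefficients(u)))

def data_from_point(p, lam1, lam2):
    """Solve f0 = f1 = f2 = f3 = 0 for u, given p and the multipliers."""
    p0, p1, p2, p3 = p
    return (p0 * (lam1 + lam2), p1 * (lam1 + 2 * lam2),
            p2 * (lam1 + 3 * lam2), p3 * (lam1 - 4 * lam2))

# Two rational points satisfying f4 = f5 = 0, with arbitrary multipliers.
points = [((F(9, 16), F(1, 8), F(1, 16), F(1, 4)), F(3), F(1)),
          ((F(1, 6), F(1, 3), F(1, 6), F(1, 3)), F(2), F(-1))]
residuals = []
for p, lam1, lam2 in points:
    assert p[0] + 2 * p[1] + 3 * p[2] - 4 * p[3] == 0 and sum(p) == 1
    u = data_from_point(p, lam1, lam2)
    residuals.append(E_f(u, p[0]))  # should vanish on V(f)

# Specialization at b = (5, 6, 11, 32), made monic in p0.
b = (5, 6, 11, 32)
a = coefficients(b)
monic = [F(c) / a[3] for c in a]

# (F1): E_f is homogeneous of degree 2 in u.
gamma = F(7, 3)
scaled = tuple(gamma * x for x in b)
homogeneous = E_f(scaled, F(1, 2)) == gamma ** 2 * E_f(b, F(1, 2))

print(residuals, monic, homogeneous)
```

Such a check only tests necessary conditions at a few points; it does not replace the genericity arguments of Section 3.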
6 COMPUTATIONAL RESULTS
In this section, we explain the implementation details, and compare the timings of Algorithm 1 with those of existing methods by testing a list of interesting algebraic models.
6.1 Implementation
First, we explain the implementation and experimental details. Testing models, Maple code and computational results are available online via:
https://sites.google.com/site/rootclassification/publications/supplementary-materials/lle2018.
Software We implemented Algorithm 1 in Maple 2018, where we use the FGb command fgb_gbasis_elim for computing elimination ideals, for instance, in Algorithm 2-Lines 2, 11, Algorithm 4-Line 2 and Algorithm 6-Line 2.
Hardware and System We used a 3.2 GHz Intel Core i5 processor (8 GB of RAM) under OS X 10.9.3.
Testing Models The testing models are chosen from the literature [20, 27] and have been tested by both the standard elimination method and Algorithm 1.
6.2
Computing elimination ideals
We have computed the radical elimination ideals $E_f$ for the testing models by standard elimination, [40, Algorithm 2], and Algorithm 1. Table 1 compares the timings of the three methods.
Conclusion from Table 1: For smaller models with ML-degree less than 5, computing Gröbner bases directly (standard elimination) is the fastest method; for larger models with ML-degree greater than 5, Algorithm 1 is the fastest. In particular, comparing the columns “Interpolation” and “Algorithm 1”, we see that exploiting the structure of elimination ideals indeed improves the efficiency significantly.
Instructions for Table 1:
(1) The columns “#p_i” and “ML-Degree” give the number of probability variables $n$ and the ML-degree $N$, respectively.
(2) In the column “standard” we record the time to compute the elimination ideal $\langle f \rangle \cap \mathbb{Q}[u, p_0]$. When FGb returned no output before running out of memory, we record “∞”.
(3) We record the timings of [40, Algorithm 2] and Algorithm 1 in the columns “Interpolation” and “Algorithm 1”. A timing in italic font means that the computation did not finish within two weeks; in that case we report an estimate of the sampling time, which provides a lower bound; see Example 6.1.
Example 6.1. In Table 1, the estimated total timings for sampling (see Step 3 in Section 5.1) are displayed in italics. We explain how to estimate these timings using the running example in Section 5.1. There are 4 parameters $u_i$ ($i = 0, 1, 2, 3$). We know that $\deg(A_2, u_i) = 2$ for every $i$. Also, by (F1) and (A1), $A_2$ is homogeneous and $\deg(A_2) = \deg(A_2, u_0) = 2$. So $A_2$ is a linear combination of 10 monomials. [40, Algorithm 2] interpolates $A_2(u)$ directly without exploiting any structure, so we need to sample 10 times. However, Algorithm 1 interpolates $R_2(u)$, which is a factor of $A_2(u)$ as shown in the running example, so we only need to sample 4 times since there are 4 possible monomials in $R_2(u)$. We checked in Maple that a single sample in Step 3 takes 0.02 seconds. We therefore estimate the sampling times of [40, Algorithm 2] and Algorithm 1 to be 0.02 × 10 = 0.2 seconds and 0.02 × 4 = 0.08 seconds, respectively.
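The sample counts in Example 6.1 are simply the numbers of candidate monomials. The sketch below is illustrative Python, not the authors' Maple code: it enumerates the monomials of a homogeneous form in 4 parameters and multiplies the count by the per-sample time of 0.02 seconds quoted in the example. The helper name `monomials` is hypothetical, and $R_2(u)$ is assumed to be a linear form, as in the running example.

```python
from itertools import combinations_with_replacement

def monomials(num_vars, degree):
    """Exponent vectors of all monomials of exactly `degree` in `num_vars` variables."""
    result = []
    for combo in combinations_with_replacement(range(num_vars), degree):
        exps = [0] * num_vars
        for i in combo:
            exps[i] += 1
        result.append(tuple(exps))
    return result

SECONDS_PER_SAMPLE = 0.02  # time of one Step 3 sample, as quoted in Example 6.1

quadratic = monomials(4, 2)  # candidate monomials of A_2(u): homogeneous, degree 2
linear = monomials(4, 1)     # candidate monomials of R_2(u): assumed linear

print(len(quadratic), len(quadratic) * SECONDS_PER_SAMPLE)
print(len(linear), len(linear) * SECONDS_PER_SAMPLE)
```

One sample is needed per unknown coefficient, so the counts are 10 and 4, and the estimated sampling times scale linearly with them.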
6.3
Computing discriminants
Given $E_f(u, p_0)$, we can directly compute the discriminant of $E_f$ with respect to $p_0$, denoted by $\mathrm{discr}(E_f; p_0)$, by eliminating $p_0$ from $E_f$ and $\partial E_f / \partial p_0$. We summarize the computational timings of this method for computing discriminants in Table 2. We make the following remarks on Table 2:
(1) We cannot save the large result for Model 6 into a text file. The size of the temporary file when we interrupted the saving process was 32 GB.
(2) For Model 6 and Model 9, the estimated timings for [40, Algorithm 2] to compute $D_{MJ}$ are 13374 and 454833 days, respectively. Note that $D_{MJ}$ is a factor of $\mathrm{discr}(E_f; p_0)$.
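To illustrate what eliminating $p_0$ from $E_f$ and $\partial E_f / \partial p_0$ means for a univariate polynomial, here is a small Python sketch using the Sylvester resultant. It is not the FGb-based computation used in the experiments. It takes the polynomial $E_f$ of the running example in Section 5.1, assumes $S(u) = u_0 + u_1 + u_2 + u_3$, and evaluates the discriminant at a few integer points with exact arithmetic. All function names are illustrative.

```python
from fractions import Fraction as F

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n, sign, result = len(m), 1, F(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return F(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result

def sylvester(f, g):
    """Sylvester matrix of f and g (coefficient lists, highest degree first)."""
    df, dg = len(f) - 1, len(g) - 1
    size = df + dg
    rows = [[F(0)] * i + f + [F(0)] * (size - df - 1 - i) for i in range(dg)]
    rows += [[F(0)] * i + g + [F(0)] * (size - dg - 1 - i) for i in range(df)]
    return rows

def discriminant(f):
    """disc(f) = (-1)^(n(n-1)/2) * Res(f, f') / lc(f), for f of degree n."""
    n = len(f) - 1
    derivative = [c * (n - i) for i, c in enumerate(f[:-1])]
    res = det(sylvester(f, derivative))
    return (-1) ** (n * (n - 1) // 2) * res / f[0]

def ef_coeffs(u):
    """Coefficients of E_f in p0 (highest first) for the running example."""
    u0, u1, u2, u3 = (F(x) for x in u)
    s = u0 + u1 + u2 + u3  # assumption: S(u) is the sum of the parameters
    return [
        s**2,
        -s * (F(43, 10) * u0 + 2 * u1 + F(3, 2) * u2 + F(4, 5) * u3),
        F(1, 5) * u0 * (29 * u0 + 23 * u1 + 21 * u2 + 14 * u3),
        -F(12, 5) * u0**2,
    ]

def scaled_quotient(u):
    """10^4 * disc(E_f) / (4 * u0^2 * S(u)^2); requires u0 != 0 and S(u) != 0."""
    d = discriminant(ef_coeffs(u))
    return d * 10**4 / (4 * F(u[0])**2 * F(sum(u))**2)

for point in [(1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 1)]:
    print(point, scaled_quotient(point))
```

At these points the quotient evaluates to 441, 78400, and 662388. These are the coefficient of $u_0^4$, the sum of the five coefficients involving only $u_0$ and $u_1$, and the sum of all 35 coefficients of the last factor displayed in Section 5.1. In these spot checks the displayed factorization therefore agrees with the discriminant of $E_f$ up to the constant factor $10^4$, which would arise from clearing the denominators in $E_f$.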
ACKNOWLEDGMENTS
We thank David A. Cox, Hoon Hong, Anne Shiu, Frank Sottile, and Bernd Sturmfels for their support and advice. TdW was partially supported by the DFG grant WO 2206/1-1. XT was partially supported by the NSF (DMS-1513364, DMS-1752672, and CCF-1708884).
Models     Degree   Size       Our Method
                               E_f          discr(E_f; p0)   Total
Model 4    110      7.5 MB     782.676 s    0.027 s          783 s
Model 6    342      >32 GB     14 d         11.379 s         14 d
Model 9    176      8.68 GB    2 d          81.015 s         2 d
Table 2: Runtimes for computing discriminants
REFERENCES
[1] C. Amendola, N. Bliss, I. Burke, C. R. Gibbons, M. Helmer, S. Hoşten, E. D. Nash, J. I. Rodriguez, and D. Smolkin. 2018. The maximum likelihood degree of toric varieties. (2018). Accepted by J. Symb. Comput. arXiv:1703.02251.
[2] D. S. Arnon and S. Dennis. 1988. A cluster-based cylindrical algebraic decomposition algorithm. J. Symb. Comput. 5, 1 (1988), 189–212.
[3] S. Basu, R. Pollack, and M. F. Roy. 1996. On the combinatorial and algebraic complexity of quantifier elimination. Journal of ACM 43, 6 (1996), 1002–1045.
[4] S. Basu, R. Pollack, and M. F. Roy. 1999. Computing roadmaps of semi-algebraic sets on a variety. Journal of the AMS 3, 1 (1999), 55–82.
[5] S. Basu, R. Pollack, and M. F. Roy. 2006. Algorithms in Real Algebraic Geometry. Springer-Verlag.
[6] E. Becker, M. G. Marinari, T. Mora, and C. Traverso. 1994. The shape of the Shape Lemma. In Proceedings of ISSAC’94. ACM New York, 129–133.
[7] C. W. Brown. 2001. Improved projection for cylindrical algebraic decomposition. J. Symb. Comput. 32, 5 (2001), 447–465.
[8] C. W. Brown. 2001. Simple CAD construction and its applications. J. Symb. Comput. 31, 5 (2001), 521–547.
[9] C. W. Brown. 2003. QEPCAD B: a program for computing with semi-algebraic sets using CADs. ACM SIGSAM Bulletin 37, 4 (2003), 97–108.
[10] C. W. Brown. 2012. Fast simplifications for Tarski formulas based on monomial inequalities. J. Symb. Comput. 47, 7 (2012), 859–882.
[11] C. W. Brown. 2013. Constructing a single open cell in a cylindrical algebraic decomposition. In ISSAC Proceedings of the International Symposium on Symbolic and Algebraic Computation. ACM, 133–140.
[12] B. Buchberger. 1965. An algorithm for finding the basis elements of the residue class ring of a zero dimensional polynomial ideal. J. Symb. Comput. 41, 3 (1965), 475–511.
[13] M.-L. G. Buot, S. Hoşten, and D. Richards. 2007. Counting and locating the solutions of polynomial systems of maximum likelihood equations, II: The Behrens-Fisher problem. Statistica Sinica 17 (2007), 1343–1354.
[14] F. Catanese, S. Hoşten, A. Khetan, and B. Sturmfels. 2006. The maximum likelihood degree. Amer. J. Math. 128, 3 (2006), 671–697.
[15] C. Chen, J. H. Davenport, J. P. May, M. M. Maza, B. Xia, and R. Xiao. 2010. Triangular decomposition of semi-algebraic systems. In ISSAC’10 Proceedings of the 35th International Symposium on Symbolic and Algebraic Computation. ACM New York, 187–194.
[16] G. E. Collins. 1975. Quantifier Elimination for the Elementary Theory of Real Closed Fields by Cylindrical Algebraic Decomposition. In Lecture Notes In Computer Science, Vol. 33. Springer-Verlag, Berlin, 134–183.
[17] G. E. Collins and H. Hong. 1991. Cylindrical algebraic decomposition for quantifier elimination. J. Symb. Comput. 12, 3 (1991), 299–328.
[18] D. A. Cox, J. Little, and D. O'Shea. 2015. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Springer.
[19] A. Dickenstein, M. P. Millán, A. Shiu, and X. Tang. 2019. Multistationarity in structured reaction networks. Bulletin of Mathematical Biology 81, 5 (2019), 1527–1581.
[20] M. Drton, B. Sturmfels, and S. Sullivant. 2009. Lectures on algebraic statistics. Springer.
[21] J. C. Faugère. 1999. A new efficient algorithm for computing Gröbner bases (F4). Journal of Pure and Applied Algebra 139, 1 (1999), 61–88.
[22] J. C. Faugère, P. Gianni, D. Lazard, and T. Mora. 1993. Efficient computation of zero-dimensional Gröbner bases by change of ordering. J. Symb. Comput. 16, 4 (1993), 329–344.
[23] D. Grigoriev. 1988. Complexity of deciding Tarski algebra. J. Symb. Comput. 5, 1–2 (1988), 65–108.
[24] E. Gross, M. Drton, and S. Petrović. 2012. Maximum likelihood degree of variance component models. Electronic Journal of Statistics 6 (2012), 993–1016.
[25] E. Gross and J. I. Rodriguez. 2014. Maximum likelihood geometry in the presence of data zeros. In ISSAC’14 Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation. ACM New York, 232–239.
[26] J. Hauenstein, J. I. Rodriguez, and B. Sturmfels. 2012. Maximum likelihood for matrices with rank constraints. Journal of Algebraic Statistics (2012). To appear.
[27] S. Hoşten, A. Khetan, and B. Sturmfels. 2005. Solving the likelihood equations.
[28] H. Hong. 1990. An improvement of the projection operator in cylindrical algebraic decomposition. In ISSAC Proceedings of the International Symposium on Symbolic and Algebraic Computation. ACM, 261–264.
[29] H. Hong. 1992. Simple solution formula construction in cylindrical algebraic decomposition based quantifier elimination. In ISSAC Proceedings of the International Symposium on Symbolic and Algebraic Computation. ACM, 177–188.
[30] H. Hong and M. Safey El Din. 2012. Variant Quantifier Elimination. J. Symb. Comput. 47, 7 (2012), 883–901.
[31] J. Huh and B. Sturmfels. 2014. Likelihood geometry. Number 63–117. Springer International Publishing.
[32] D. Kapur, Y. Sun, and D. K. Wang. 2010. A new algorithm for computing comprehensive Gröbner systems. In ISSAC’10 Proceedings of the 35th International Symposium on Symbolic and Algebraic Computation. 29–36.
[33] D. Lazard and F. Rouillier. 2005. Solving parametric polynomial systems. Journal of Symbolic Computation 42, 6 (2005), 636–667.
[34] S. McCallum. 1988. An improved projection operation for cylindrical algebraic decomposition of three-dimensional space. J. Symb. Comput. 5, 1 (1988), 141–161.
[35] S. McCallum. 1999. On projection in CAD-Based quantifier elimination with equational constraints. In ISSAC Proceedings of the International Symposium on Symbolic and Algebraic Computation. ACM, 145–149.
[36] J. Renegar. 1992. On the computational complexity and geometry of the first-order theory of the reals. Part I. J. Symb. Comput. 13, 3 (1992), 255–299.
[37] J. Renegar. 1992. On the computational complexity and geometry of the first-order theory of the reals. Part II. J. Symb. Comput. 13, 3 (1992), 301–327.
[38] J. Renegar. 1992. On the computational complexity and geometry of the first-order theory of the reals. Part III. J. Symb. Comput. 13, 3 (1992), 329–352.
[39] J. I. Rodriguez and X. Tang. 2015. Data-Discriminants of Likelihood Equations. In ISSAC’15 Proceedings of the 40th International Symposium on Symbolic and Algebraic Computation. ACM New York, 307–314.
[40] J. I. Rodriguez and X. Tang. 2017. A Probabilistic Algorithm for Computing Data-Discriminants of Likelihood Equations. Journal of Symbolic Computation 83 (2017), 342–364.
[41] M. Safey El Din and E. Schost. 2003. Polar varieties and computation of one point in each connected component of a smooth real algebraic set. In ISSAC’03 Proceedings of International Symposium on Symbolic and Algebraic Computation. 224–231.
[42] M. Safey El Din and E. Schost. 2004. Properness defects of projections and computation of at least one point in each connected component of a real algebraic set. Discrete and Computational Geometry 32, 3 (2004), 417–430.
[43] B. Sturmfels. 2002. Solving systems of polynomial equations. In Regional conference series in mathematics. Vol. 97. American Mathematical Society, Providence, R. I.
[44] S. Sullivant. 2018. Algebraic Statistics. Graduate Studies in Mathematics, Vol. 194. American Mathematical Society.
[45] A. Tarski. 1951. A decision method for elementary algebra and geometry. University of California Press.
[46] M. B. Villarino, W. Gasarch, and K. W. Regan. 2018. Hilbert’s proof of his irreducibility theorem. The American Mathematical Monthly 125, 6 (2018), 513–530.
[47] L. Yang, X. Hou, and B. Xia. 2001. A complete algorithm for automated discovering of a class of inequality-type theorems. Science in China Series F: Information Sciences 44, 1 (2001), 33–49.