Real Polynomial Conversion (RPC) - Techniques for Polynomial Conversion

Techniques Using Mixed Integer Linear Programming

5.2 Techniques for Polynomial Conversion

5.2.3 Real Polynomial Conversion (RPC)

In [31], J. Borghoff et al. proposed a method that converts polynomial equations over F₂ into polynomial equations over R. We call this method Real Polynomial Conversion (RPC). They studied the method for the systems of polynomial equations arising from the Bivium cipher, but an algorithm for general systems of polynomial equations is still missing. We provide such an algorithm. The first ingredient that we need is the following definition.

Definition 5.2.5. The standard conversion is given by the map φ : F₂ = {0, 1} → {0, 1} ⊂ R defined by φ(0) = 0 and φ(1) = 1. The map φ can be extended to a map Φ : F₂[x₁, . . . , x_n] → R[X₁, . . . , X_n] defined by

c ↦ φ(c),    x_i ↦ X_i,

where c ∈ F₂. Then the standard representation of a polynomial f ∈ F₂[x₁, . . . , x_n] is Φ(f).

So the task of solving the polynomial equation system f₁ = · · · = f_m = 0 can be rephrased as follows: Find a tuple (a₁, . . . , a_n) ∈ {0, 1}ⁿ such that

F₁(a₁, . . . , a_n) ≡ 0 (mod 2)
        ⋮
F_m(a₁, . . . , a_n) ≡ 0 (mod 2)        (5.4)

where the F_i ∈ R[X₁, . . . , X_n] are the standard representations of the polynomials f_i. Thus we are looking for an integer solution (a₁, . . . , a_n) of the system (5.4) which satisfies 0 ≤ a_i ≤ 1. So the idea is to formulate this system as a system of linear equalities and inequalities over R and solve it using an IP solver.
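The reformulation (5.4) can be checked on a small case. The following Python sketch is only an illustration: the polynomial f = x₀x₁ + x₂ + 1 and the list-of-monomials encoding are assumptions made for this example and do not come from [31]. It evaluates f over F₂ and its coefficient-wise lift Φ(f) over the integers, and confirms that the two agree modulo 2 on every binary point.

```python
from itertools import product

# A polynomial over F2 is modelled as a list of monomials; each monomial is a
# tuple of variable indices, and the empty tuple stands for the constant 1.
# Hypothetical example polynomial: f = x0*x1 + x2 + 1.
f = [(0, 1), (2,), ()]

def eval_f2(poly, a):
    """Evaluate the polynomial in F2, where addition is XOR."""
    value = 0
    for mono in poly:
        value ^= int(all(a[i] for i in mono))
    return value

def eval_lifted(poly, a):
    """Evaluate Phi(poly) over the integers: same monomials, ordinary sum."""
    return sum(int(all(a[i] for i in mono)) for mono in poly)

def agrees_mod_2(poly, n):
    """True if poly(a) in F2 equals Phi(poly)(a) mod 2 for all binary a."""
    return all(eval_f2(poly, a) == eval_lifted(poly, a) % 2
               for a in product((0, 1), repeat=n))

print(agrees_mod_2(f, 3))
print(eval_lifted(f, (1, 1, 0)))  # an even integer, so (1, 1, 0) is a zero of f
```

At the point (1, 1, 0) the lifted polynomial takes the value 2 rather than 0, which is why the system is stated as a congruence modulo 2 and not as an equality.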

Example 5.2.6. Consider the polynomial f = x₁x₂+x₃x₄+x₅+x₆+1 ∈ F2[x₁, . . . , x₆].

In the following we explain explicitly how to lift this polynomial over R using the standard representation in such a way that the residue class of a zero of Φ(f) in F₂ represents a zero of f. We use the following conversion rules for addition and multiplication.

Φ(x_ix_j) = X_iX_j

Φ(x_i+ x_j) = X_i+ X_j − 2X_iX_j

Considering each term as a node we apply the map Φ once for each pair of nodes. This results in the following conversion steps.

1) f = (x₁x₂+ x₃x₄) + (x₅+ x₆) + 1.

2) Taking standard representation we have

(X₁X₂ + X₃X₄ − 2X₁X₂X₃X₄) + (X₅ + X₆ − 2X₅X₆) + 1.

3) Let f′ = X₁X₂ + X₃X₄ − 2X₁X₂X₃X₄ and f″ = X₅ + X₆ − 2X₅X₆. Now the polynomial in step 2) becomes (f′) + (f″) + 1.

4) Taking standard representation we have f′ + f″ − 2f′f″ + 1.

5) Let f‴ = f′ + f″ − 2f′f″. Now the polynomial in step 4) becomes f‴ + 1.

6) Finally, taking standard representation we have f‴ + 1 − 2f‴ = 1 − f‴.

By substituting the values of f′, f″ and f‴ we obtain the polynomial

F = 8X₁X₂X₃X₄X₅X₆ − 4X₁X₂X₃X₄X₅ − 4X₁X₂X₃X₄X₆ + 2X₁X₂X₃X₄
    − 4X₁X₂X₅X₆ − 4X₃X₄X₅X₆ + 2X₁X₂X₅ + 2X₃X₄X₅ + 2X₁X₂X₆ + 2X₃X₄X₆
    − X₁X₂ − X₃X₄ + 2X₅X₆ − X₅ − X₆ + 1 ∈ R[X₁, . . . , X₆].        (5.5)

The polynomial F has 16 terms in its support and degree 6.
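The expansion in this example can be reproduced mechanically. The sketch below is an illustration only: the dictionary encoding of multilinear polynomials and the helper names `add`, `mul`, `xor` and `evaluate` are assumptions of this example, not part of the method in [31]. It applies the conversion rule for addition pairwise, as in steps 1) to 6), and then inspects the result.

```python
from itertools import product

# A real multilinear polynomial is a dict {frozenset of variable indices: coeff}.
def var(i):
    return {frozenset([i]): 1}

def add(p, q, scale=1):
    """Return p + scale*q, dropping terms whose coefficient becomes zero."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + scale * c
        if out[mono] == 0:
            del out[mono]
    return out

def mul(p, q):
    """Multiply two polynomials, using X_i^2 = X_i (valid on binary points)."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = m1 | m2
            out[m] = out.get(m, 0) + c1 * c2
            if out[m] == 0:
                del out[m]
    return out

def xor(p, q):
    """Conversion rule for addition: P + Q - 2PQ."""
    return add(add(p, q), mul(p, q), scale=-2)

def evaluate(p, a):
    """Evaluate p at the point a (a[i] is the value of X_i)."""
    return sum(c * all(a[i] for i in mono) for mono, c in p.items())

X = {i: var(i) for i in range(1, 7)}
f1 = xor(mul(X[1], X[2]), mul(X[3], X[4]))   # f'
f2 = xor(X[5], X[6])                         # f''
f3 = xor(f1, f2)                             # f'''
F = xor(f3, {frozenset(): 1})                # 1 - f'''

print(len(F))                        # number of terms: 16
print(max(len(m) for m in F))        # degree: 6
print(F[frozenset(range(1, 7))])     # leading coefficient: 8

# On binary points, F takes the same 0/1 value as f does in F2.
same_values = all(
    evaluate(F, (0,) + b) == (b[0] * b[1] + b[2] * b[3] + b[4] + b[5] + 1) % 2
    for b in product((0, 1), repeat=6)
)
print(same_values)
```

The term count and degree agree with the values stated for (5.5), and the final check confirms that F and f have the same zeros in {0, 1}⁶.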

The effect of the standard representation is that every tuple (a₁, . . . , a_n) ∈ {0, 1}ⁿ at which F vanishes corresponds uniquely to a zero of f in F₂ⁿ, that is, the residue class of (a₁, . . . , a_n) in F₂ⁿ represents a zero of f. To see this it suffices to observe the standard conversion rule for addition, which is given by the following table.

x₁    x₂    x₁ + x₂    X₁ + X₂ − 2·X₁·X₂
0     0     0          0
0     1     1          1
1     0     1          1
1     1     0          0

The standard representation increases both the degree and the number of terms of the polynomial over the real domain.

Remark 5.2.7. (Splitting)

To keep the degrees of the converted polynomials low, we introduce new auxiliary variables. These split a long polynomial into shorter polynomials, whose standard representations are then taken. The number of terms of a polynomial over F₂ should not exceed four if the real polynomial is to remain quadratic. For instance, the equation x₁x₂ + x₃x₄ + x₅ + x₆ + 1 = 0 can be split up into the two equations y₁ + x₁x₂ = x₃x₄ + x₅ and y₁ = x₆ + 1, each having at most four terms. The variable y₁ is the splitting variable.

To keep the degree of the real polynomials at two, we introduce two more variables y₂ and y₃ as follows:

y₁ + y₂ = y₃ + x₅,    y₁ = x₆ + 1,    y₂ = x₁x₂,    y₃ = x₃x₄.

Now the standard representation results in the following four quadratic equations, which hold over the reals.

Y₁ + Y₂ − 2Y₁Y₂ = Y₃ + X₅ − 2Y₃X₅
Y₁ = 1 − X₆
Y₂ − X₁X₂ = 0
Y₃ − X₃X₄ = 0
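A brute-force check makes the effect of splitting concrete. The sketch below is illustrative only, and the function names are chosen for this example. For every binary assignment of x₁, . . . , x₆ it sets the auxiliary variables as the last three equations dictate and tests whether the remaining quadratic equation holds exactly when f = x₁x₂ + x₃x₄ + x₅ + x₆ + 1 vanishes over F₂.

```python
from itertools import product

def f(x1, x2, x3, x4, x5, x6):
    """The original polynomial, evaluated in F2."""
    return (x1 * x2 + x3 * x4 + x5 + x6 + 1) % 2

def split_system_holds(x1, x2, x3, x4, x5, x6):
    """Check the four real quadratic equations obtained after splitting."""
    y1 = 1 - x6        # Y1 = 1 - X6
    y2 = x1 * x2       # Y2 - X1*X2 = 0
    y3 = x3 * x4       # Y3 - X3*X4 = 0
    return y1 + y2 - 2 * y1 * y2 == y3 + x5 - 2 * y3 * x5

points = list(product((0, 1), repeat=6))
mismatches = [x for x in points if (f(*x) == 0) != split_system_holds(*x)]
zeros = sum(1 for x in points if f(*x) == 0)

print(mismatches)   # an empty list means both formulations have the same zeros
print(zeros)        # number of zeros of f in {0, 1}^6
```

Each of the four real equations has degree at most 2, compared with degree 6 for the unsplit polynomial F.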

While converting a Boolean equation, we ensure that the new equations are defined over R. The only requirement we have is that every solution of the system over F₂ is also a solution of the real system. The additional non-binary solutions of the real system can be ignored.

In the following we abuse the notation Tⁿ. Since the monoid of terms Tⁿ does not depend on the ring of coefficients, we consider Tⁿ as the monoid of terms of both F₂[x₁, . . . , x_n] and R[X₁, . . . , X_n]. The only distinction we make is the following. An element of the monoid of terms of F₂[x₁, . . . , x_n] will be denoted by t and an element of the monoid of terms of R[X₁, . . . , X_n] will be denoted by T. The following proposition turns the above ideas into an effective algorithm.

Proposition 5.2.8. (Real Polynomial Conversion (RPC) )

Let f₁, . . . , f_m ∈ P = F₂[x₁, . . . , x_n]. Then the following instructions define an algorithm which computes a tuple (a₁, . . . , a_n) ∈ {0, 1}ⁿ whose residue class in F₂ⁿ represents a zero of the 0-dimensional radical ideal I = ⟨f₁, . . . , f_m, x₁² + x₁, . . . , x_n² + x_n⟩.

1) Reduce f₁, . . . , f_m modulo the field equations, i.e. make their support squarefree. For i = 1, . . . , m, let S_i be the set of terms of degree ≥ 2 in f_i. Let S = ⋃_{i=1}^{m} S_i and s = |S|.

2) For every t_j ∈ S, introduce a new indeterminate x_{n+j} and form the equation f′_{m+j} : x_{n+j} = t_j. For i = 1, . . . , m, write f_i = Σ_j t_j + ℓ_i, where the sum extends over all j such that t_j ∈ S_i and where ℓ_i ∈ P_{≤1}. Form the equation f′_i : Σ_j x_{n+j} + ℓ_i = 0.

3) For i = 1, . . . , m + s, let F_i be the equation which is the standard representation of f′_i. Let S′_i be the set of terms of degree ≥ 2 in F_i and let S′ = ⋃_{i=1}^{m+s} S′_i.

4) For every T_k ∈ S′, introduce a new real indeterminate X_{n+s+k}. For i = 1, . . . , m + s, replace T_k ∈ S′_i by X_{n+s+k} in the support of F_i. This makes F_i linear.

5) For T_k ∈ S′, write T_k = ∏_{α∈N_k} X_α. Form the linear inequalities I_{n+s+k} : Σ_{α∈N_k} X_α − X_{n+s+k} ≤ |N_k| − 1, and I_{kα} : X_α ≥ X_{n+s+k} for all α ∈ N_k.

6) For all α ∈ {1, . . . , n}, let I_α : X_α ≤ 1.

7) Choose a linear polynomial C ∈ Q[X_α, X_{n+j}, X_{n+s+k}] and use an IP solver to find a tuple of natural numbers (a_α, a_{n+j}, a_{n+s+k}) which solves the system of equations and inequalities {F_i, I_{n+s+k}, I_{kα}, I_α} and minimizes (or maximizes) C.

8) Return (a₁, . . . , a_n) and stop.
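Steps 1) and 2) of the proposition are purely combinatorial and can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: polynomials are encoded as lists of monomials (tuples of 1-based variable indices, with the empty tuple for the constant 1), the terms in S are numbered in sorted order, and the function names are hypothetical. The input is the system treated in Example 5.2.10.

```python
def reduce_squarefree(poly):
    """Step 1): reduce modulo x_i^2 + x_i and cancel repeated terms mod 2."""
    parity = {}
    for mono in poly:
        key = tuple(sorted(set(mono)))          # x_i^2 becomes x_i
        parity[key] = parity.get(key, 0) ^ 1    # equal terms cancel in F2
    return [mono for mono, bit in parity.items() if bit]

def substitute_terms(polys, n):
    """Step 2): replace every term of degree >= 2 by a new indeterminate."""
    polys = [reduce_squarefree(p) for p in polys]
    S = sorted({m for p in polys for m in p if len(m) >= 2})
    index = {t: n + j for j, t in enumerate(S, start=1)}
    linear = [[(index[m],) if len(m) >= 2 else m for m in p] for p in polys]
    definitions = {index[t]: t for t in S}      # x_{n+j} = t_j
    return linear, definitions

# f1 = x1x2 + x1x3 + 1, f2 = x1x3 + x2x3 + x1, f3 = x1x2 + x1x3 + x2 + 1
system = [
    [(1, 2), (1, 3), ()],
    [(1, 3), (2, 3), (1,)],
    [(1, 2), (1, 3), (2,), ()],
]
linear, definitions = substitute_terms(system, n=3)

print(definitions)   # which product each new indeterminate stands for
print(linear)        # the linear equations f'_1, ..., f'_m (each set to 0)
```

With this numbering the new indeterminates x₄, x₅, x₆ stand for x₁x₂, x₁x₃, x₂x₃, and the three linear equations read x₄ + x₅ + 1 = 0, x₅ + x₆ + x₁ = 0 and x₄ + x₅ + x₂ + 1 = 0.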

Proof. For α = 1, . . . , n, we are looking for natural numbers a_α for which I_α holds, therefore we have a_α ∈ {0, 1}. Similarly, we have a_{n+j} ∈ {0, 1} by I_α and F_{m+j}, where j = 1, . . . , s. Also we have a_{n+s+k} ∈ {0, 1} by I_α, F_{m+j} and I_{kα}. Moreover, if T_k = ∏_{α∈N_k} X_α ∈ S′ and if one of the numbers a_α for α ∈ N_k is zero, then I_{kα} implies a_{n+s+k} = 0. On the other hand, if a_α = 1 for all α ∈ N_k, then I_{n+s+k} implies a_{n+s+k} ≥ 1. Altogether, this means that a_{n+s+k} equals ∏_{α∈N_k} a_α, the value of T_k at (a₁, . . . , a_n, a_{n+1}, . . . , a_{n+s}).

Next it follows from the standard representation (Definition 5.2.5) that F_i ∈ {0, 1}. In this way the solutions of the IP problem correspond uniquely to the tuples (a₁, . . . , a_n) ∈ {0, 1}ⁿ which satisfy the above reformulation of the given polynomial system.

Assume that we are in the setting of the algorithm in Proposition 5.2.8. Note that if max{deg(f_i) | i ∈ {1, . . . , m}} ≤ 2 and for i = 1, . . . , m, the maximum number of terms in the support of f_i does not exceed 4, the algorithm works with quadratic polynomials in all of its iterations.

Remark 5.2.9. Assume that we are in the setting of the algorithm in Proposition 5.2.8. As in Remark 5.2.3, if we can find a feasible binary/integer-valued solution of the MILP for an arbitrary objective function, this solution can be converted into a solution of the original system. Hence it is not important to find a minimal (or maximal) solution, only a feasible point. But again we have three natural questions.

Which linear function might be a good objective function? Which variables should be restricted to be binary or integers? Which optimization direction (maximize or minimize) should we choose?

An objective function can strongly affect the running time of an IP solver. We try to study this with the help of computational experiments in Section 5.2.4. A partial answer to the second question could be the following. The difficulty of solving a mixed integer program depends more on the number of integer variables than on the number of continuous variables (see [87]). Therefore our intuition tells us to keep as many variables continuous as we can. As proposed by F. Glover and E. Woolsey in [87], the linear inequalities in step 5) of the algorithm allow the variables X_{n+s+k} to be kept continuous. It is however necessary to keep upper bounds of 1 on these variables, as noted by A.J. Goldman [88].

In view of these remarks we fix variables as follows. The initial state variables X₁, . . . , X_n will be forced to take on binary values. All other newly introduced variables will be kept continuous in the interval [0, 1]. These variables depend on the initial state variables. This means that we do not have to restrict them to be integer or binary. In Section 5.2.4 we confirm our intuition by experiments.
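The reason a product variable may stay continuous can be seen from the interval that the inequalities of step 5), together with the bounds 0 and 1, leave for it. The sketch below is an illustration only, and the function name `product_variable_bounds` is hypothetical. It computes that interval for given factor values and checks that it collapses to the single value of the product whenever all factors are binary.

```python
from itertools import product
from math import prod

def product_variable_bounds(values):
    """Feasible interval for X_{n+s+k}, given the values of its factors X_alpha.

    Lower bound: sum(X_alpha) - X_{n+s+k} <= |N_k| - 1, and X_{n+s+k} >= 0.
    Upper bound: X_alpha >= X_{n+s+k} for every factor, and X_{n+s+k} <= 1.
    """
    lower = max(0, sum(values) - (len(values) - 1))
    upper = min(1, *values)
    return lower, upper

# For binary factors the interval is the single point prod(values).
collapses = all(
    product_variable_bounds(vals) == (prod(vals), prod(vals))
    for k in (2, 3, 4)
    for vals in product((0, 1), repeat=k)
)
print(collapses)

# For fractional factors the interval has positive length.
print(product_variable_bounds((0.5, 0.5)))
```

Fractional factors leave slack, so the argument relies on the factors themselves being binary, either by declaration or because the equations force them to be.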

Again we do not have an answer to the third question at this stage, but we remark that the optimization direction can affect the running time of an IP solver in certain cases. We try to study this with the help of computational experiments in Section 5.2.4.

To understand Proposition 5.2.8 better, we now apply it in a concrete case.

Example 5.2.10. Over the field K = F2, consider f₁, f₂, f₃ ∈ K[x₁, x₂, x₃], where f₁ = x₁x₂+ x₁x₃ + 1, f₂ = x₁x₃+ x₂x₃ + x₁, and f₃ = x₁x₂+ x₁x₃ + x₂+ 1. Let us follow the steps of the algorithm in Proposition 5.2.8.

1) Let S1 = {x1x2, x1x3}, S2 = {x1x3, x2x3}, and S3 = {x1x2, x1x3}. Let S = {x1x2, x1x3, x2x3} and s = 3.

2) Introduce new indeterminates x₄, x₅, x₆. Form the equations f′₄ : x₄ = x₁x₂, f′₅ : x₅ = x₁x₃ and f′₆ : x₆ = x₂x₃. Form the equations f′₁ : x₄ = x₅ + 1, f′₂ : x₅ = x₆ + x₁ and f′₃ : x₄ + x₅ = x₂ + 1.

3) The standard representations of the equations f′₁, . . . , f′₆ are:

F₁ : X₄ + X₅ − 1 = 0,
F₂ : X₅ − X₆ − X₁ + 2X₁X₆ = 0,
F₃ : X₄ + X₅ − 2X₄X₅ + X₂ − 1 = 0,
F₄ : X₄ − X₁X₂ = 0,
F₅ : X₅ − X₁X₃ = 0,
F₆ : X₆ − X₂X₃ = 0.

Let S′₁ = ∅, S′₂ = {X₁X₆}, S′₃ = {X₄X₅}, S′₄ = {X₁X₂}, S′₅ = {X₁X₃} and S′₆ = {X₂X₃}. Let S′ = {X₁X₂, X₁X₃, X₁X₆, X₂X₃, X₄X₅}.

4) Introduce new real indeterminates X₇, . . . , X₁₁ for X₁X₂, X₁X₃, X₁X₆, X₂X₃, X₄X₅ respectively. Using the new real indeterminates, linearize the F_i as follows:

F₁ : X₄ + X₅ − 1 = 0,
F₂ : X₅ − X₆ − X₁ + 2X₉ = 0,
F₃ : X₄ + X₅ − 2X₁₁ + X₂ − 1 = 0,
F₄ : X₄ − X₇ = 0,
F₅ : X₅ − X₈ = 0,
F₆ : X₆ − X₁₀ = 0.

5) Form the linear inequalities below. The inequality I_{kα} of the proposition is written I_{k,α} here, so that I_{1,1} is not confused with I₁₁.

I₇ : X₁ + X₂ − X₇ ≤ 1,      I_{1,1} : X₁ ≥ X₇,     I_{1,2} : X₂ ≥ X₇,
I₈ : X₁ + X₃ − X₈ ≤ 1,      I_{2,1} : X₁ ≥ X₈,     I_{2,3} : X₃ ≥ X₈,
I₉ : X₁ + X₆ − X₉ ≤ 1,      I_{3,1} : X₁ ≥ X₉,     I_{3,6} : X₆ ≥ X₉,
I₁₀ : X₂ + X₃ − X₁₀ ≤ 1,    I_{4,2} : X₂ ≥ X₁₀,    I_{4,3} : X₃ ≥ X₁₀,
I₁₁ : X₄ + X₅ − X₁₁ ≤ 1,    I_{5,4} : X₄ ≥ X₁₁,    I_{5,5} : X₅ ≥ X₁₁.

6) Let I₁ : X₁ ≤ 1, I₂ : X₂ ≤ 1 and I₃ : X₃ ≤ 1.

7) Let C = X₁ + X₂ + X₃. Now use an IP solver to minimize C subject to {F₁, . . . , F₆, I₇, . . . , I₁₁, I_{1,1}, I_{1,2}, I_{2,1}, I_{2,3}, I_{3,1}, I_{3,6}, I_{4,2}, I_{4,3}, I_{5,4}, I_{5,5}, I₁, I₂, I₃}.

8) Read off the values of X₁, X₂ and X₃ from the solution provided by the IP solver. This returns (1, 0, 1).
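The linear system of this example is small enough to check by exhaustive search. The sketch below uses plain enumeration as a stand-in for an IP solver, which is an assumption made purely for illustration, and it restricts all eleven variables to binary values. It tests every candidate against F₁, . . . , F₆ and the inequalities of step 5), then minimizes C = X₁ + X₂ + X₃ over the feasible points.

```python
from itertools import product

# Index pairs of the product replaced by each new indeterminate X_k in step 4).
PRODUCTS = {7: (1, 2), 8: (1, 3), 9: (1, 6), 10: (2, 3), 11: (4, 5)}

def satisfies_all(X):
    """X[1..11] is a candidate assignment; X[0] is unused padding."""
    equalities = [
        X[4] + X[5] - 1,                        # F1
        X[5] - X[6] - X[1] + 2 * X[9],          # F2
        X[4] + X[5] - 2 * X[11] + X[2] - 1,     # F3
        X[4] - X[7],                            # F4
        X[5] - X[8],                            # F5
        X[6] - X[10],                           # F6
    ]
    if any(e != 0 for e in equalities):
        return False
    for k, (a, b) in PRODUCTS.items():
        if X[a] + X[b] - X[k] > 1:              # sum inequality for X_k
            return False
        if X[a] < X[k] or X[b] < X[k]:          # each factor bounds X_k
            return False
    return True                                 # upper bounds hold for binary values

solutions = []
for bits in product((0, 1), repeat=11):
    X = (0,) + bits
    if satisfies_all(X):
        solutions.append(X)

best = min(solutions, key=lambda X: X[1] + X[2] + X[3])
print(len(solutions))   # number of feasible binary points
print(best[1:4])        # values of (X1, X2, X3)
```

The search finds exactly one feasible point, so in this example the choice of objective function and optimization direction does not change the result.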

Remark 5.2.11. Integer Polynomial Conversion (IPC) introduces one new integer variable per term and per equation. In the hope of getting more and stronger constraints one can do the following. Apply RPC to equations with no more than three terms. In this case the number of terms per equation and the number of new variables are the same as when using IPC. But by replacing a quadratic term by a new variable we get three constraints instead of only the restriction that the variable is binary. It appears that RPC yields stronger constraints in these cases. For equations with more than three terms we use IPC. We call this strategy Mixed Polynomial Conversion (MPC); a detailed description is omitted. However, computational experience shows that MPC does not provide any improvement.

In document New Techniques for Polynomial System Solving (Page 133-139)