Exact Floating-Point Algorithms in RealAlgebraic

In this chapter we discuss, implement and evaluate strategies to integrate error-free transformations into an expression dag based number type, using RealAlgebraic as the example. We have seen in Section 4.1.3 that exact floating-point algorithms based on error-free transformations can lead to very fast and exact implementations of arithmetic. But error-free transformations also have quite a few limitations. Basically, only ring operations can be implemented exactly, and even these may fail if overflow or underflow occurs.

Utilizing error-free transformations directly is far from simple or straightforward. To allow a non-expert to benefit from their efficiency, it is necessary to wrap them into a more user-friendly solution, for example a number type. The main goal of expression dag based number types is to integrate fast evaluation strategies, which may occasionally fail to compute a sign, with more expensive but also more conservative strategies into a user-friendly solution that eventually computes all signs correctly. Error-free transformations, with their high speed but also their apparent limitations, fit well into this scheme.

In the first part of this chapter, we discuss our LocalPolicy model Local_double_sum, which places exact arithmetic based on error-free transformations at the very first evaluation stage. This way, dag creation is avoided and deferred to the point where error-free transformations cannot provide an exact result for an operation. Since there are many ways to implement exact arithmetic based on error-free transformations, Local_double_sum is again configurable by several policies. We present our concepts and models, and the rationale behind them.

In the second part, we compare different variants of Local_double_sum by means of experiments, with the goal of finding an optimal Local_double_sum variant. We find that, in general, the choice of implementation parameters depends on the geometric problem being solved. We therefore give guidelines on how to choose these parameters a priori, without resorting to experiments, and provide a default variant which should perform well in most circumstances.

We then compare the default Local_double_sum variant to other RealAlgebraic variants and to other expression dag based number types. It turns out that placing exact arithmetic based on error-free transformations at the very first evaluation stage improves performance for geometric problems and input data with many near-degenerate configurations, but is less advisable for non-degenerate data. Therefore we close with a short discussion of how error-free transformations may be integrated into the later stages of expression dag evaluation.

5.1. Deferring Dag Construction

In this section we catch up on the discussion of the LocalPolicy model Local_double_sum, which we omitted previously in Chapter 3. The goal behind Local_double_sum is to let RealAlgebraic benefit from the speed of error-free transformations and exact floating-point algorithms. To this end, Local_double_sum represents a number as the sum of a sequence of floating-point numbers and provides basic arithmetic operations on this representation, as required by the LocalPolicy concept. Just like RealAlgebraic itself, Local_double_sum is configurable by means of policies, where each policy reflects a single implementation alternative. The design and implementation are based on the following considerations.

When we first started work on Local_double_sum, our RealAlgebraic implementation was much slower than exact predicate implementations based on error-free transformations for all types of input data. Therefore we decided to bring error-free transformations to the very first evaluation stage, i.e., to the LocalPolicy level and before dag creation. One source of overhead in the first evaluation stages is dynamic memory management for dag node creation. But if the floating-point filter can determine the sign, dag creation is pointless. By deferring dag creation we postpone and potentially eliminate this source of overhead. To avoid memory management within Local_double_sum, too, we limit the number of summands to some small constant. The decision to represent a number as a sum of floating-point numbers does not permit operations such as division or radicals. For the remaining operations, addition, subtraction, and multiplication, it is far from clear how to implement them optimally. Finally, a framework like RealAlgebraic promises its user nearly unconditional correctness of computations, i.e., the only limit should be memory or time constraints. Inexactness arising from overflow or underflow is not acceptable and must therefore be taken care of without the user noticing. With this in mind, we identify four orthogonal design decisions, which are reflected in four concepts that govern the behavior of our implementation.

DoubleSumOperations: Models for this concept provide a set of raw operations on sums of floating-point numbers, namely ring operations, sign computation and compression. A compression transforms a sum of floating-point numbers into an equivalent sum with potentially fewer summands.

DoubleSumMaxLength: Models provide a limit on the number of summands allowed.

DoubleSumCompression: Models decide when and how to apply compression to reduce the number of summands.

DoubleSumProtection: Models provide a systematic way to handle overflow or underflow.

[Figure 5.1: diagram relating concepts to their models. ExpressionDagPolicy: Basic_expression_dags. DataMediator: Local_double_sum_to_expression_dag_node_mediator, Expansion_to_expression_dag_node_mediator, Local_double_sum_to_expression_dag_tree_mediator, Local_double_sum_to_expression_dag_mediator_statistics. LocalPolicy: Local_double_sum (drawn together with Double_sum_storage). DoubleSumOperations: Double_sum_plain_operations, Double_sum_expansion_zeroelim_operations, Double_sum_expansion_zeroelim_selfprotect_operations. DoubleSumProtection: Double_sum_no_protection, Double_sum_warning_protection, Double_sum_restoring_protection. DoubleSumCompression: Double_sum_no_compression, Double_sum_lazy_compression, Double_sum_lazy_aggressive_compression, Double_sum_permanent_compression. DoubleSumMaxLength: Int_to_type<N>. Legend: closely coupled, host policy.]

Figure 5.1. Concepts and models in Local_double_sum.

These four policies are complemented by the DataMediator policy of RealAlgebraic. Contrary to the other LocalPolicy models, for Local_double_sum there are several interesting ways to convert a sum of floating-point numbers into an expression dag. Here we benefit from our earlier decision to separate conversion to an expression dag from the LocalPolicy concept. An overview of the five relevant concepts and their models is given in Figure 5.1, which complements Figure 3.1.
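To make the policy structure more concrete, the following C++ sketch shows one way a host class could be parameterized by four orthogonal policies. It is a minimal illustration under assumptions: all names (Double_sum_sketch, Plain_operations_sketch, and so on) and all interfaces are hypothetical and are not taken from the RealAlgebraic sources. Only the lazy addition of plain sums is shown.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a DoubleSumMaxLength model.
template <std::size_t N>
struct Max_length_sketch { static constexpr std::size_t value = N; };

// Hypothetical stand-in for a DoubleSumOperations model (plain sums).
struct Plain_operations_sketch {
    // Lazy addition: the result is the concatenation of both summand lists.
    static std::size_t add(const double* a, std::size_t na,
                           const double* b, std::size_t nb, double* r) {
        std::size_t n = 0;
        for (std::size_t i = 0; i < na; ++i) r[n++] = a[i];
        for (std::size_t j = 0; j < nb; ++j) r[n++] = b[j];
        return n;
    }
};

// Hypothetical stand-in for a DoubleSumCompression model that never compresses.
struct No_compression_sketch {
    static void on_overlong(double*, std::size_t&) {}
};

// Hypothetical stand-in for a DoubleSumProtection model without any checks.
struct No_protection_sketch {
    static bool exception_occurred() { return false; }
};

template <class Operations, class MaxLength, class Compression, class Protection>
class Double_sum_sketch {
public:
    Double_sum_sketch() = default;
    explicit Double_sum_sketch(double d) : size_(1) { terms_[0] = d; }

    std::size_t size() const { return size_; }
    double term(std::size_t i) const { assert(i < size_); return terms_[i]; }

    // Returns false if the sum cannot be represented locally; a caller
    // would then fall back to creating an expression dag node.
    static bool add(Double_sum_sketch a, Double_sum_sketch b, Double_sum_sketch& r) {
        if (a.size_ + b.size_ > MaxLength::value) {
            Compression::on_overlong(a.terms_.data(), a.size_);
            Compression::on_overlong(b.terms_.data(), b.size_);
        }
        if (a.size_ + b.size_ > MaxLength::value) return false;
        r.size_ = Operations::add(a.terms_.data(), a.size_,
                                  b.terms_.data(), b.size_, r.terms_.data());
        return !Protection::exception_occurred();
    }

private:
    std::array<double, MaxLength::value> terms_{};  // fixed storage, no heap use
    std::size_t size_ = 0;
};
```

The fixed-size array mirrors the decision to avoid dynamic memory management; the boolean return value stands in for the point at which dag creation can no longer be deferred.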

5.1.1. Basic Arithmetic Operations. Our decision to implement a LocalPolicy based on error-free transformations, as well as the decision of how to implement arithmetic operations, are based on the experiments from Section 4.1.3. We have the following three models for DoubleSumOperations.

Double_sum_plain_operations (plaiO): This model is built around the Signk algorithm, which leads to the fastest predicates in Section 4.1.3. Arithmetic operations are implemented as plainly as possible. Addition and subtraction simply copy summands; multiplication performs TwoProduct for all pairs of summands. To compress a sum, we perform VecSum, but additionally eliminate zero summands on the fly. Sign computation is performed using Signk.

In Figure 4.3, sign computation based on AccSum shows a performance similar to Signk; however, Signk is better suited in our case for practical reasons. Signk improves the representation of a sum with each VecSum step and never increases the number of summands. We can thus let it work on the original sum. AccSum, on the other hand, produces an additional summand in the extraction step, which complicates its use in a framework with a limited number of summands.

Double_sum_expansion_zeroelim_operations (expaO): This model is based on arithmetic with floating-point expansions as described by Shewchuk [104, 105]. We maintain zero-free, strongly non-overlapping expansions. The basic algorithms for arithmetic on this type of expansions are FastExpansionSum, the corresponding subtraction routine, and ScaleExpansion. We use an implementation of these algorithms that additionally eliminates zero summands on the fly. We implement missing functionality, i.e., operations involving expansions with only one summand and a general multiplication routine, following suggestions by Shewchuk [104, section 2.8]. For compression, we use Compress. The sign of a zero-free expansion is always the sign of the most significant summand and can simply be read off.

Double_sum_expansion_zeroelim_selfprotect_operations (protO): This model derives from expaO and performs all operations in exactly the same manner, but is free from overflow and underflow. We discuss the way this is achieved in Section 5.1.3 below.

Note that the models employ two opposing strategies. With plain sums, arithmetic operations are lazy, at the cost of unstructured sums and potentially many summands. Compression and sign computation do all the work. With expansions, a normal form is maintained by arithmetic operations. This makes them more expensive, but also reduces the number of summands. The sign of an expansion can be determined at no extra cost!
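The two opposing strategies can be sketched in a few lines of C++. The sketch rests on several assumptions: IEEE 754 double arithmetic in round-to-nearest, std::vector in place of the fixed-size storage used by Local_double_sum, and hypothetical function names throughout. For the expansion side it uses the simpler GrowExpansion of Shewchuk rather than FastExpansionSum, so it illustrates the idea of maintaining a normal form, not the routines of expaO themselves. It also does not implement Signk.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Sum = std::vector<double>;

// TwoSum (Knuth): s == fl(a + b) and s + e == a + b, barring overflow.
inline void two_sum(double a, double b, double& s, double& e) {
    s = a + b;
    const double bv = s - a;
    e = (a - (s - bv)) + (b - bv);
}

// TwoProduct via fused multiply-add: p == fl(a * b) and p + e == a * b,
// barring overflow and underflow.
inline void two_product(double a, double b, double& p, double& e) {
    p = a * b;
    e = std::fma(a, b, -p);
}

// Plain strategy: addition only copies summands.
Sum plain_add(const Sum& a, const Sum& b) {
    Sum r(a);
    r.insert(r.end(), b.begin(), b.end());
    return r;
}

// Plain strategy: multiplication applies TwoProduct to all pairs of summands.
Sum plain_multiply(const Sum& a, const Sum& b) {
    Sum r;
    for (double ai : a)
        for (double bj : b) {
            double p, e;
            two_product(ai, bj, p, e);
            r.push_back(p);
            r.push_back(e);
        }
    assert(r.size() == 2 * a.size() * b.size());
    return r;
}

// Plain strategy: one VecSum pass, in place, dropping zero summands.
// The represented value is unchanged.
void vec_sum_zeroelim(Sum& x) {
    if (x.empty()) return;
    std::size_t out = 0;
    double acc = x[0];
    for (std::size_t i = 1; i < x.size(); ++i) {
        double s, e;
        two_sum(acc, x[i], s, e);
        if (e != 0.0) x[out++] = e;  // out < i, so unread input is never overwritten
        acc = s;
    }
    if (acc != 0.0) x[out++] = acc;
    x.resize(out);
}

// Expansion strategy: GrowExpansion with zero elimination. The input e must be
// a non-overlapping expansion sorted by increasing magnitude.
Sum grow_expansion_zeroelim(const Sum& e, double b) {
    Sum h;
    double q = b;
    for (double ei : e) {
        double s, err;
        two_sum(q, ei, s, err);
        if (err != 0.0) h.push_back(err);
        q = s;
    }
    if (q != 0.0) h.push_back(q);
    return h;
}

// Sign of a zero-free expansion: read off the most significant summand.
int expansion_sign(const Sum& e) {
    if (e.empty()) return 0;
    assert(e.back() != 0.0);
    return e.back() > 0.0 ? 1 : -1;
}
```

With plain sums, plain_add and plain_multiply do almost no work and leave it to vec_sum_zeroelim to shorten the result later; with expansions, every call to grow_expansion_zeroelim pays for the normal form, after which expansion_sign is a single comparison.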

5.1.2. Number of Summands and Compression. Arithmetic operations on sums of floating-point numbers rapidly increase the number of summands. For polynomial expressions $c$ over floating-point numbers, define $\#s(c)$ by $\#s(f) = 1$ for $f \in \tilde{F}$, and

\[ \#s(a \pm b) = \#s(a) + \#s(b), \qquad \#s(a \times b) = 2\,\#s(a)\,\#s(b). \]

Then $\#s(c)$ is an upper bound on the number of floating-point summands required to represent $c$. Note that $\#s(c)$ grows exponentially in the degree of the expression $c$. The two individual rules are sharp in general. To see this in the case of addition, let

\[ e = \left(\varepsilon_m^{-2i}\right)_{i=1}^{n}, \qquad f = \left(\varepsilon_m^{-2n-2j}\right)_{j=1}^{m}. \]

Then the sequence $e_1, e_2, \ldots, e_n, f_1, f_2, \ldots, f_m$ is a non-adjacent and maximally non-overlapping expansion representing $e + f$, and no sequence representing $e + f$ with fewer summands exists. To show the claim for multiplication, we have to choose the $e_i$ and $f_j$ such that each product $e_i \times f_j$ requires two summands for storage. Furthermore, we have to space the $e_i$ and $f_j$ sufficiently, such that the $mn$ individual products do not interfere with each other. This is achieved, for example, by setting

\[ e = \left(\varepsilon_m^{-3i}(1 + 2\varepsilon_m)\right)_{i=1}^{n}, \qquad f = \left(\varepsilon_m^{-4nj}(1 + 2\varepsilon_m)\right)_{j=1}^{m}. \]

The arithmetic operations implemented in plaiO attain the given bounds. On the other hand, we know that polynomial expressions of degree $d$ over $b$-bit integers can be evaluated exactly with $d(b + O(1))$ bits of precision [56]. This translates to roughly $(d \times b)/p$ summands, which is linear in the degree $d$. The arithmetic operations of expaO may show better behavior than predicted by $\#s$ in many cases, but we already observed in Section 4.2.3 that, without compression, expansions often carry only a few bits of information per summand.
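To get a feeling for how quickly the bound $\#s$ grows, the two rules can be evaluated directly. The following C++ sketch does this for two small examples. The function names are hypothetical, and the choice of the 2D orientation determinant as an example is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// #s for a single floating-point input.
constexpr std::uint64_t s_leaf() { return 1; }

// #s(a + b) = #s(a - b) = #s(a) + #s(b)
constexpr std::uint64_t s_add(std::uint64_t a, std::uint64_t b) { return a + b; }

// #s(a * b) = 2 * #s(a) * #s(b)
constexpr std::uint64_t s_mul(std::uint64_t a, std::uint64_t b) { return 2 * a * b; }

// #s of x^d, evaluated as ((x * x) * x) * ... with d factors.
std::uint64_t s_power(unsigned d) {
    assert(d >= 1 && d <= 63);  // keeps the result within 64 bits
    std::uint64_t s = s_leaf();
    for (unsigned k = 1; k < d; ++k) s = s_mul(s, s_leaf());
    return s;  // equals 2^(d-1)
}

// #s of (ax - cx) * (by - cy) - (ay - cy) * (bx - cx),
// the 2D orientation determinant on floating-point inputs.
constexpr std::uint64_t s_orientation_2d() {
    return s_add(s_mul(s_add(s_leaf(), s_leaf()), s_add(s_leaf(), s_leaf())),
                 s_mul(s_add(s_leaf(), s_leaf()), s_add(s_leaf(), s_leaf())));
}
```

Under these rules a single power $x^d$ already needs up to $2^{d-1}$ summands, and the degree-two orientation determinant is bounded by 16 summands, which shows why a small constant limit on the number of summands is reached quickly without compression.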

For these reasons, it is likely that often a more compact representation of a number with fewer summands can be computed. Fewer summands make further operations on a number cheaper and, since we limit the maximum number of summands, enable us to defer dag creation for larger expressions. Since it is unclear when to attempt compression in an optimal way, we implement the following schemes.

Double_sum_no_compression (noC): No compression is triggered.

Double_sum_lazy_compression (lazyC): Triggers a single compression step on the operands of an arithmetic operation if the number of summands of the result, as predicted by $\#s$, is larger than the maximum number of summands.

Double_sum_lazy_aggressive_compression (laagC): Initially behaves like lazyC, but triggers additional compression steps as long as the number of summands was decreased in the previous step.

Double_sum_permanent_compression (permC): Triggers a single compression step on each result of an arithmetic operation.

From top to bottom, these policies provide an increasing amount of additional compression. Due to the different approaches in our DoubleSumOperations models, we can expect that plaiO will benefit more from additional compression than expaO.
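The four compression schemes differ only in when a compression step is triggered, which the following C++ sketch tries to capture. It is an illustration under assumptions: the hook names before_operation and after_operation, the Compress_step callback, and all struct names are hypothetical, and the compression step itself (for instance a VecSum pass or Compress) is supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Sum = std::vector<double>;
// One value-preserving compression step, supplied by the caller.
using Compress_step = std::function<void(Sum&)>;

struct No_compression_sketch {
    static void before_operation(Sum&, Sum&, std::size_t, std::size_t,
                                 const Compress_step&) {}
    static void after_operation(Sum&, const Compress_step&) {}
};

struct Lazy_compression_sketch {
    // One step per operand, but only if the predicted result is too long.
    static void before_operation(Sum& a, Sum& b, std::size_t predicted,
                                 std::size_t max_length, const Compress_step& compress) {
        assert(max_length > 0);
        if (predicted > max_length) { compress(a); compress(b); }
    }
    static void after_operation(Sum&, const Compress_step&) {}
};

struct Lazy_aggressive_compression_sketch {
    // Repeats compression as long as the previous step removed a summand.
    static void repeat(Sum& s, const Compress_step& compress) {
        std::size_t before;
        do {
            before = s.size();
            compress(s);
        } while (s.size() < before);
    }
    static void before_operation(Sum& a, Sum& b, std::size_t predicted,
                                 std::size_t max_length, const Compress_step& compress) {
        assert(max_length > 0);
        if (predicted > max_length) { repeat(a, compress); repeat(b, compress); }
    }
    static void after_operation(Sum&, const Compress_step&) {}
};

struct Permanent_compression_sketch {
    static void before_operation(Sum&, Sum&, std::size_t, std::size_t,
                                 const Compress_step&) {}
    // One step on every result, regardless of its length.
    static void after_operation(Sum& r, const Compress_step& compress) { compress(r); }
};
```

Passing the compression step as a callback keeps the question of when to compress separate from the question of how to compress, in the spirit of the orthogonal concepts described above.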

5.1.3. Handling Floating-Point Exceptions. Error-free transformations are, despite their name, not completely free from errors or inexactness. Any such error or inexactness is, however, linked to a floating-point exception. Since we consider polynomial expressions only, and make sure that input numbers are always elements from F, underflow and overflow are the only relevant exceptions to us.

The IEEE 754 standard requires the availability of a set of flags that are raised when an exception occurs and may be checked and reset by the user. This provides us with a way to be notified of floating-point exceptions after the fact. Based on this mechanism we provide the following three policies for handling floating-point exceptions. For these to be effective, a basic condition is that operations terminate in all cases, especially if floating-point exceptions occur. Hence our rigorous discussion that Signk always terminates.
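In C++, the exception flags mentioned above are accessible through the standard header <cfenv>. The following sketch shows the basic pattern of clearing the flags, performing an operation, and testing the flags afterwards. It is a minimal illustration, not the mechanism used by the protection policies; the function name is hypothetical, and the volatile temporaries are an assumption made to keep an optimizing compiler from folding the multiplication or moving it across the flag accesses.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <limits>

// Multiplies a * b and reports whether the multiplication raised neither
// overflow nor underflow. Inputs are expected to be finite numbers.
bool product_without_exception(double a, double b, double& p) {
    assert(std::isfinite(a) && std::isfinite(b));
    volatile double va = a, vb = b, vp = 0.0;
    std::feclearexcept(FE_OVERFLOW | FE_UNDERFLOW);  // reset the two flags
    vp = va * vb;                                    // operation under observation
    const bool ok = std::fetestexcept(FE_OVERFLOW | FE_UNDERFLOW) == 0;
    p = vp;
    return ok;
}
```

The check happens after the fact: the possibly inexact result has already been computed when the flag is read, so a caller has to discard it or repeat the computation by other means.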

Double_sum_restoring_protection (restP): This model stores a backup of
