Chapter 12. Composite Objects

12.3. Underlying Type

In Chapters 2 through 5 we studied algorithms on mathematical values and saw how equational reasoning as enabled by regular types applies to algorithms as well as to proofs. In Chapters 6 through 11 we studied algorithms on memory and saw how equational reasoning remains useful in a world with changing state. We dealt with small objects, such as integers and pointers, which are cheaply assigned and copied. In this chapter we introduced composite objects that satisfy the requirements of regular types and can thus be used as elements of other composite objects. Dynamic sequences and other composite objects that separate the header from the remote parts allow for an efficient way to implement rearrangements: moving headers without moving the remote parts.

To understand the problem of an inefficient rearrangement involving composite objects, consider the swap_basic procedure defined as follows:

template<typename T>
    requires(Regular(T))
void swap_basic(T& x, T& y)
{
    T tmp = x;
    x = y;
    y = tmp;
}

Suppose that we call swap_basic(a, b) to interchange two dynamic sequences. The copy construction and the two assignments it performs take linear time. Furthermore, an out-of-memory exception could occur even though no net increase of memory is needed.
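To make the cost concrete, the following sketch counts the elements copied when swap_basic exchanges two sequences. The type counting_seq, the counter elements_copied, and the helper elements_copied_by_swap are hypothetical names introduced only for illustration, and swap_basic is written without its requires clause so that it compiles as ordinary C++.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical counter: total elements copied by counting_seq operations.
std::size_t elements_copied = 0;

// Hypothetical toy sequence that owns a heap array (its remote part).
struct counting_seq {
    int* data;
    std::size_t n;

    explicit counting_seq(std::size_t size) : data(new int[size]()), n(size) {}

    counting_seq(const counting_seq& x) : data(new int[x.n]), n(x.n) {
        std::copy(x.data, x.data + n, data);
        elements_copied += n;
    }

    counting_seq& operator=(const counting_seq& x) {
        if (this != &x) {
            int* fresh = new int[x.n];  // allocation may fail, even during a swap
            std::copy(x.data, x.data + x.n, fresh);
            delete[] data;
            data = fresh;
            n = x.n;
            elements_copied += n;
        }
        return *this;
    }

    ~counting_seq() { delete[] data; }
};

template <typename T>
void swap_basic(T& x, T& y) {
    T tmp = x;
    x = y;
    y = tmp;
}

// Returns how many elements were copied while swapping two sequences
// of the given sizes with swap_basic.
std::size_t elements_copied_by_swap(std::size_t size_a, std::size_t size_b) {
    counting_seq a(size_a), b(size_b);
    elements_copied = 0;
    swap_basic(a, b);
    return elements_copied;
}
```

Under these assumptions, swapping sequences of 1000 and 10 elements copies 2010 elements: the first sequence is copied twice (into tmp and back out of it) and the second once.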

We could avoid this expensive copying by specializing swap_basic to swap the headers of the specific dynamic sequence type and, if necessary, update links from the remote parts to the header. There are, however, problems with specializing swap_basic. First, it needs to be repeated for each data structure. More important, many rearrangement algorithms are not based on swap_basic, including in-place permutations, such as cycle_from, and algorithms that use a buffer, such as merge_n_with_buffer. Finally, there are situations, such as reallocating a single-extent array, in which objects are moved from an old extent to a new one.

We want to generalize the idea of swapping headers to arbitrary rearrangements, to allow the use of buffer memory and reallocation, and to continue to write abstract algorithms that do not depend on the implementation of the objects they manipulate. To accomplish this, we associate every regular type T with its underlying type, U = UnderlyingType(T). The type U is identical to the type T when T has no remote parts or has remote parts with links back to the header.[7] Otherwise U is identical to type T in every respect except that it does not maintain ownership: Destruction does not affect the remote parts, and copy construction and assignment simply copy the header without copying the remote parts. When the underlying type is different from the original type, it has the same layout (bit pattern) as the header of the original type.

[7] This explains the warning against links from remote parts to the header in our discussion of doubly linked lists.

The fact that the same bit pattern could be interpreted as an object of a type and of its underlying type allows us to view the memory as one or the other, using the built-in reinterpret_cast function template. Objects of UnderlyingType(T) may only be used to hold temporary values while implementing a rearrangement of objects of type T. The complexity of copy construction and assignment for a proper underlying type (one that is not identical to the original type) is proportional to the size of the header of type T. An additional benefit in this case is that copy construction and assignment for UnderlyingType(T) never throw an exception.

The implementation of the underlying type for an original type T is straightforward and could be automated. U = UnderlyingType(T) always has the same layout as the header of T. The copy constructor and assignment for U just copy the bits; they do not construct a copy of the remote parts of T. For example, the underlying type of pair<T0, T1> is a pair whose members are the underlying types of T0 and T1; similarly for other tuple types. The underlying type of array_k<k, T> is an array_k whose elements are the underlying type of T.
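The memberwise rule for pairs and arrays can be sketched as a type trait. The names underlying_type, underlying_type_t, seq, and seq_header below are assumptions made for this illustration, and std::pair and std::array stand in for the pair and array_k types of the text.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical trait standing in for UnderlyingType(T).
// Default: a type with no remote parts is its own underlying type.
template <typename T>
struct underlying_type { using type = T; };

template <typename T>
using underlying_type_t = typename underlying_type<T>::type;

// The underlying type of a pair is the pair of the underlying types.
template <typename T0, typename T1>
struct underlying_type<std::pair<T0, T1>> {
    using type = std::pair<underlying_type_t<T0>, underlying_type_t<T1>>;
};

// The underlying type of a fixed-size array is an array of underlying types.
template <typename T, std::size_t k>
struct underlying_type<std::array<T, k>> {
    using type = std::array<underlying_type_t<T>, k>;
};

// Toy header and toy owning type (owning copy, assignment, and destruction
// are omitted here; only the layout matters for this sketch).
struct seq_header { int* data; std::size_t n; };
struct seq { seq_header h; };

template <>
struct underlying_type<seq> { using type = seq_header; };

static_assert(std::is_same<underlying_type_t<int>, int>::value,
              "a type without remote parts is its own underlying type");
static_assert(std::is_same<underlying_type_t<std::pair<seq, int>>,
                           std::pair<seq_header, int>>::value,
              "pairs map memberwise");
static_assert(std::is_same<underlying_type_t<std::array<seq, 4>>,
                           std::array<seq_header, 4>>::value,
              "arrays map elementwise");
static_assert(sizeof(seq) == sizeof(seq_header),
              "the underlying type has the layout of the header");
static_assert(std::is_trivially_copyable<seq_header>::value,
              "copying the underlying type just copies the bits");
```

All checks here happen at compile time; the sketch says nothing about how ownership is restored after a rearrangement.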

Once UnderlyingType(T) has been defined, we can cast a reference to T into a reference to UnderlyingType(T), without performing any computation, with this procedure:

template<typename T>
    requires(Regular(T))
UnderlyingType(T)& underlying_ref(T& x)
{
    return reinterpret_cast<UnderlyingType(T)&>(x);
}

Now we can efficiently swap composite objects by rewriting swap_basic as follows:

template<typename T>
    requires(Regular(T))
void swap(T& x, T& y)
{
    UnderlyingType(T) tmp = underlying_ref(x);
    underlying_ref(x) = underlying_ref(y);
    underlying_ref(y) = tmp;
}

which could also be accomplished with:

swap_basic(underlying_ref(x), underlying_ref(y));
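A minimal runnable sketch of the same idea follows. It departs from the text in one respect, which should be read as an assumption of this example: instead of reinterpret_cast, the toy type seq stores its header as a member and exposes it through underlying_ref, which keeps the sketch within well-defined standard C++. The names seq, seq_header, and swap_headers are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Header of a toy dynamic sequence; copying it copies only the bits.
struct seq_header {
    int* data;
    std::size_t n;
};

// Hypothetical owning sequence: copy and assignment duplicate the remote
// part, and destruction frees it.
class seq {
public:
    explicit seq(std::size_t n, int value = 0) : h{new int[n], n} {
        std::fill(h.data, h.data + n, value);
    }
    seq(const seq& x) : h{new int[x.h.n], x.h.n} {
        std::copy(x.h.data, x.h.data + x.h.n, h.data);
    }
    seq& operator=(const seq& x) {
        if (this != &x) {
            seq tmp(x);            // linear-time copy of the remote part
            std::swap(h, tmp.h);   // tmp now frees the old remote part
        }
        return *this;
    }
    ~seq() { delete[] h.data; }

    const int* begin() const { return h.data; }
    std::size_t size() const { return h.n; }

    // Stand-in for underlying_ref: exposes the header without a cast.
    friend seq_header& underlying_ref(seq& x) { return x.h; }

private:
    seq_header h;
};

// Swap by moving headers only. Ownership is relaxed while tmp is live
// and holds again once the three header assignments have completed.
void swap_headers(seq& x, seq& y) {
    seq_header tmp = underlying_ref(x);
    underlying_ref(x) = underlying_ref(y);
    underlying_ref(y) = tmp;
}

// Returns true if swapping leaves both remote parts at their old addresses.
bool remote_parts_stay_in_place() {
    seq a(3, 1), b(5, 2);
    const int* pa = a.begin();
    const int* pb = b.begin();
    swap_headers(a, b);
    return a.begin() == pb && b.begin() == pa;
}
```

In this sketch the swap performs three header copies regardless of the sequence lengths, allocates nothing, and leaves each remote part at its original address.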

Many rearrangement algorithms can be modified for use with underlying type simply by reimplementing exchange_values and cycle_from the same way we reimplemented swap.

To handle other rearrangement algorithms, we use an iterator adapter. Such an adapter has the same traversal operations as the original iterator, but the value type is replaced by the underlying type of the original value type; source returns underlying_ref(source(x.i)), and sink returns underlying_ref(sink(x.i)), where x is the adapter object, and i is the original iterator object inside x.
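The shape of such an adapter can be sketched for the simplest case, in which the original iterator is a plain pointer. This is only an illustration under stated assumptions: the trait underlying_type is a hypothetical stand-in for UnderlyingType, only successor, equality, source, and sink are provided, and the example exercises the adapter with int, whose underlying type is int itself. It is not a solution for all iterator concepts.

```cpp
#include <cassert>

// Hypothetical trait; by default a type is its own underlying type.
template <typename T>
struct underlying_type { typedef T type; };

// Mirrors underlying_ref from the text. For a proper underlying type the
// cast relies on the two types sharing a layout, which the language does
// not check; with the default trait it is an identity conversion.
template <typename T>
typename underlying_type<T>::type& underlying_ref(T& x) {
    return reinterpret_cast<typename underlying_type<T>::type&>(x);
}

// Adapter over a pointer iterator T*: same traversal, different value type.
template <typename T>
struct underlying_iterator {
    typedef typename underlying_type<T>::type value_type;
    T* i;  // the original iterator
};

template <typename T>
underlying_iterator<T> successor(underlying_iterator<T> x) {
    return underlying_iterator<T>{x.i + 1};
}

template <typename T>
bool operator==(underlying_iterator<T> x, underlying_iterator<T> y) {
    return x.i == y.i;
}

// source and sink view the original object as its underlying type.
template <typename T>
typename underlying_iterator<T>::value_type& source(underlying_iterator<T> x) {
    return underlying_ref(*x.i);
}

template <typename T>
typename underlying_iterator<T>::value_type& sink(underlying_iterator<T> x) {
    return underlying_ref(*x.i);
}

// Copies the first element of a three-element array into the second
// through the adapter and returns the resulting second element.
int copy_first_to_second(int (&a)[3]) {
    underlying_iterator<int> f{a};
    sink(successor(f)) = source(f);
    return a[1];
}
```

A rearrangement algorithm written against successor, source, and sink can then be handed the adapter in place of the original iterator, so that every assignment it performs copies headers only.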

Exercise 12.9.

Now we can reimplement reverse_n_with_temporary_buffer as follows:

template<typename I>
    requires(Mutable(I) && ForwardIterator(I))
void reverse_n_with_temporary_buffer(I f, DistanceType(I) n)
{
    // Precondition: mutable_counted_range(f, n)
    temporary_buffer<UnderlyingType(ValueType(I))> b(n);
    reverse_n_adaptive(underlying_iterator<I>(f), n,
                       begin(b), size(b));
}

where underlying_iterator is the adapter from Exercise 12.9.

Project 12.5. Use underlying type systematically throughout a major C++ library, such as STL, or design a new library based on the ideas in this book.

Algorithms C++ Software Engineering Programming Alexander Stepanov Paul McJones Addison-Wesley Professional Elements of Programming

12.4. Conclusions

We extended the structure types and constant-size array types of C++ to dynamic data structures with remote parts. The concepts of ownership and regularity determine treatment of parts by copy construction, assignment, equality, and total ordering. As we showed for the case of dynamic sequences, useful varieties of data structures should be carefully implemented, classified, and documented so that programmers can select the best one for each application. Rearrangements on nested data structures are efficiently implemented by temporarily relaxing the ownership invariant.

Afterword

We recap the main themes of the book: regularity, concepts, algorithms and their interfaces, programming techniques, and meanings of pointers. For each theme, we also discuss its particular limitations.

Regularity

Regular types define copy construction and assignment in terms of equality. Regular functions return equal results when applied to equal arguments. For example, regularity of transformations allowed us to define and reason about algorithms for analyzing orbits. Regularity was in fact relied on throughout the book by ordering relations, the successor function for forward iterators, and many others.

When we work with built-in types, we usually treat the complexity of equality, copying, and assignment as constant. When we deal with composite objects, the complexity of these operations is expected to be linear in the area of objects: the total amount of memory, including remote as well as local parts. Our expectation, however, that equality is at worst linear in the area of its arguments cannot always be met in practice.

For example, consider representing a multiset, or unordered collection of potentially repeated elements, as an unsorted dynamic sequence. Although inserting a new element takes constant time, testing two multisets for equality takes O(n log n) time to sort them and then compare them lexicographically. If equality testing is infrequent, this is a good tradeoff; however, putting such multisets into a sequence to be searched with find could lead to unacceptable performance. For an extreme example, consider a situation in which the equality for a type must be implemented with graph isomorphism, a problem for which no polynomial-time algorithm is known.
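The multiset case can be sketched as follows, with std::vector<int> standing in for an unsorted dynamic sequence. The function name multiset_equal is an assumption of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Equality of two multisets stored as unsorted sequences.
// The arguments are taken by value so that sorting works on private
// copies and leaves the callers' sequences in their original order.
bool multiset_equal(std::vector<int> a, std::vector<int> b) {
    if (a.size() != b.size()) return false;  // cheap early exit
    std::sort(a.begin(), a.end());           // O(n log n)
    std::sort(b.begin(), b.end());           // O(n log n)
    return a == b;                           // linear elementwise comparison
}
```

Each call costs two sorts, so a linear search that applies this equality to every element of a sequence of multisets pays that cost at every step.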

We noted in Section 1.2 that when implementing behavioral equality on values is not feasible, we can often implement representational equality. For composite objects, we often implement representational equality with the techniques of Section 7.4. Such structural equality is often useful in giving the semantics of copy construction and assignment and may be useful for other purposes. Recall that representational equality implies behavioral equality. Similarly, while a natural total ordering is not always realizable, a default total ordering based on structure (e.g., lexicographical ordering for sequences) allows us to efficiently sort and search. There are, of course, objects for which neither copy construction nor assignment, nor even equality, makes sense, because they own a unique resource.
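As an illustration of structural equality and a default total ordering, consider the following sketch. The type record and its members are hypothetical; equality compares the parts memberwise, and the ordering compares the name first and then the scores lexicographically.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical composite type with two parts.
struct record {
    std::string name;
    std::vector<int> scores;
};

// Structural equality: two records are equal when their parts are equal.
bool operator==(const record& x, const record& y) {
    return x.name == y.name && x.scores == y.scores;
}

// Default total ordering based on structure: order by name, then by the
// lexicographical ordering of the score sequences.
bool operator<(const record& x, const record& y) {
    if (x.name != y.name) return x.name < y.name;
    return std::lexicographical_compare(x.scores.begin(), x.scores.end(),
                                        y.scores.begin(), y.scores.end());
}

// Sorts the records with the default ordering, then searches for key
// in logarithmically many comparisons.
bool sorted_contains(std::vector<record> v, const record& key) {
    std::sort(v.begin(), v.end());
    return std::binary_search(v.begin(), v.end(), key);
}
```

The ordering need not carry any meaning for the application; it only has to be total and consistent with equality so that sorting and binary search behave as expected.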

Concepts

We use concepts from abstract algebra—semigroups, monoids, and modules—to describe such algorithms as power, remainder, and gcd. In many cases we need to adapt standard mathematical concepts to fit algorithms. Sometimes, we introduce new concepts, such as HalvableMonoid, to strengthen requirements. Sometimes, we relax requirements, as with the partially associative property. Often we deal with partial domains, as with the definition-space predicate passed to collision_point. Mathematical concepts are tools to be used and freely modified. It is the same with concepts originating in computer science. The iterator concepts describe fundamental properties of certain algorithms and data structures; however, there are other coordinate structures described by concepts yet to be discovered. It is a task of the programmer to determine whether a given concept is useful.
