Fold Expressions in C++17


CHRISTOPH HONAL, Technical University of Munich, Germany

In this report, we explore the new fold expression feature and syntax in C++17, in conjunction with the existing template and parameter pack functionality. We also discuss implementations in earlier versions of C++ and look at several usage examples.

CCS Concepts: • Software and its engineering → Language features;

Additional Key Words and Phrases: C++, Fold Expressions, C++17

ACM Reference Format:

Christoph Honal. 2018. Fold Expressions in C++17. 1, 1 (June 2018), 9 pages.

1 VARIADIC EXPRESSIONS

To understand fold expressions in C++, we first have to look at the underlying data structures these expressions operate on. The seasoned programmer may already be familiar with folds in other languages, such as Haskell’s foldl or OCaml’s fold_left, which in most cases operate on arbitrary data structures. In C++17, however, fold expressions work exclusively on parameter packs, such as the arguments of a variadic function template.

A function in C++ is called variadic when it accepts an arbitrary number of arguments or, more formally, when its arity is not fixed.

1.1 Variadic Arguments

The first implementation of a mechanism to provide the developer with the option to write variadic functions reaches all the way back to the C origins of C++, and is generally referred to as the varargs construct.

This mechanism operates entirely at runtime and typically works by reading the arguments directly from the call stack, using the va_list type together with the va_start, va_arg and va_end macros from <cstdarg>. This comes with several downsides, however, such as the lack of type safety and the unknown length of the actual argument list.

A C-style variadic function is denoted by using three dots as the last parameter in the signature:

    int printf(const char* format, ...);

Because of the mentioned downsides and the availability of improved mechanisms in C++, this construct should never be used in new C++ code.
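To make these downsides concrete, the following minimal sketch shows a C-style variadic summation. The function name sum_ints and the convention of passing the number of arguments as the first parameter are assumptions made for illustration only; they do not appear in this report.

```cpp
#include <cstdarg>

// Hypothetical helper: sums `count` int arguments passed through "...".
// The callee has no way to check that the caller really passed `count` ints.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);           // start reading after `count`
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);  // the type is asserted, not verified
    }
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) yields 6. A call such as sum_ints(3, 1, 2) also compiles, although it reads an argument that was never passed, which is undefined behaviour.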

Author’s address: Christoph Honal, Technical University of Munich, Arcisstraße 21, Munich, BY, 80333, Germany, [email protected].

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).


1.2 Variadic Templates

Starting with C++11, it became possible to use templates with an arbitrary number of arguments. Because this way of defining variadic functions is implemented using templates, the compiler is able to guarantee type safety and to deduce the number of arguments actually passed in a function call. The whole construct is thus a compile-time-only feature.

1.3 Parameter Packs

In C++, a list of arbitrarily many parameters is represented by a parameter pack. For this “special” parameter to be deduced from the call arguments, it has to be last in the function signature; it is denoted by three dots after the type specifier of the pack. The pack will expand when necessary, but always at compile time.

The syntax becomes apparent in this listing of a type-safe version of the above example:

    template<typename... Params>
    void printf(const std::string& format, Params... parameters);

To access the length of the parameter pack (parameters in the example), the sizeof...(parameters) operator may be used. The size will be zero when the parameter pack contains no parameters.
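As a small illustration of the sizeof... operator, consider the following sketch. The helper count_args is hypothetical and only serves to expose the length of the pack.

```cpp
#include <cstddef>

// Hypothetical helper: returns the number of arguments it was called with.
template<typename... Params>
constexpr std::size_t count_args(const Params&... parameters) {
    // sizeof... accepts the function parameter pack as well as the
    // template parameter pack; both always have the same length.
    static_assert(sizeof...(parameters) == sizeof...(Params),
                  "both packs have the same length");
    return sizeof...(parameters);
}

static_assert(count_args() == 0, "an empty pack has size zero");
static_assert(count_args(42, 13.5, "Bees") == 3, "three arguments");
```

Since the length is known at compile time, both checks are evaluated by the compiler and no code has to run.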

1.3.1 Expansion. To better understand the static expansion of parameter packs at compile time, this example demonstrates expansion on the above printf function:

    printf("I have %d and %f %s", 42, 13.5, "Bees");

This call would force the compiler to generate a function signature like this:

    void printf(const std::string& format,
                int param1, double param2, const char* param3);

It is important to understand that a pack expansion applies to the whole pattern preceding the three dots, not just to the name of the pack. This behavior is demonstrated here:

    template<typename... Params>
    struct VectorTuple {
        std::tuple<std::vector<Params>...> base;
    };

The base object will be a std::tuple containing one std::vector for each type argument passed in the parameter pack Params. We see that the whole pattern containing the parameter pack gets expanded, in this case std::vector<Params>.
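The effect of the expansion can be checked at compile time. The sketch below repeats the VectorTuple definition and adds a static check for one concrete instantiation; the choice of int and double is arbitrary.

```cpp
#include <tuple>
#include <type_traits>
#include <vector>

template<typename... Params>
struct VectorTuple {
    // The pattern std::vector<Params> is repeated once per pack element.
    std::tuple<std::vector<Params>...> base;
};

// VectorTuple<int, double> stores a
// std::tuple<std::vector<int>, std::vector<double>>.
static_assert(
    std::is_same<decltype(VectorTuple<int, double>::base),
                 std::tuple<std::vector<int>, std::vector<double>>>::value,
    "each pack element is wrapped in its own std::vector");
```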

2 FOLD EXPRESSIONS

Generally speaking, a fold expression reduces a traversable structure of values into a single accumulated value using a binary operator. It does so by recursively combining the accumulator with the next element in the structure using the given operator. This mechanism can be seen very clearly in the Haskell reference implementation [5] of a left fold on lists:

    foldl f z []     = z
    foldl f z (x:xs) = foldl f (f z x) xs
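For readers less familiar with Haskell, the same reduction can be sketched as a runtime function in C++. The name fold_left and the restriction to std::vector are assumptions of this sketch; it is not one of the C++17 fold expressions discussed below, which operate at compile time.

```cpp
#include <vector>

// Hypothetical runtime counterpart of foldl: f is the binary operator,
// z the initial accumulator and xs the list being reduced.
template<typename F, typename Acc, typename T>
Acc fold_left(F f, Acc z, const std::vector<T>& xs) {
    for (const T& x : xs) {
        z = f(z, x);  // same step as "foldl f (f z x) xs"
    }
    return z;         // same as "foldl f z [] = z" once xs is exhausted
}
```

Folding subtraction over {1, 2, 3, 4} with the initial value 0 computes (((0 - 1) - 2) - 3) - 4.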


2.1 Folding in C++11

In C++11, folding may be implemented using recursive templates. The following example of a function which computes the sum of all values in a parameter pack demonstrates this pattern (for brevity, the listing uses the deduced return type auto, which strictly requires C++14):

    template<typename T1, typename... Tn>
    auto sum(T1 arg1, Tn... argn) {
        // Add the first element to the sum of the remaining elements
        return arg1 + sum(argn...);
    }

We notice the implicit destructuring of the pack by passing the head of it as a separate parameter to the function. This is roughly equivalent to Haskell’s (x:xs) pattern above. To correctly terminate the recursion, a base case in the form of an identity function is needed; it has to be declared before the variadic overload so that the recursive call can find it:

    // The "tail" identity function, which terminates the recursion
    template<typename T>
    auto sum(T arg) { return arg; }

This directly corresponds to the first line of the Haskell definition above.

However, the recursive template pattern stresses the compiler, because it has to generate a function signature for each step.

For example, sum(1, 2, 3, 4) would expand to 1 + (2 + (3 + 4)) like this:

    sum(1, 2, 3, 4)
    1 + sum(2, 3, 4)
    1 + (2 + sum(3, 4))
    1 + (2 + (3 + sum(4)))
    1 + (2 + (3 + 4))

This forces the compiler to generate the following functions:

    sum(int, int, int, int)
    sum(int, int, int)
    sum(int, int)
    sum(int)

As the reader may have noticed, the number of instantiations grows with the length of the pack and with every distinct combination of argument types, which can increase both binary size and compile time.
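The pieces of this section can be assembled into one translation unit. The sketch below is one way to do so; the constexpr specifiers and the static check are additions for illustration and are not part of the listings above.

```cpp
// Base case first, so that it is visible to the recursive overload below.
template<typename T>
constexpr T sum(T arg) {
    return arg;
}

// Recursive case: peel off the head and recurse on the tail.
// The deduced return type relies on C++14.
template<typename T1, typename... Tn>
constexpr auto sum(T1 arg1, Tn... argn) {
    return arg1 + sum(argn...);
}

static_assert(sum(1, 2, 3, 4) == 10, "1 + (2 + (3 + 4))");
```

For a call with a single argument both overloads match; the non-variadic one is preferred because it is more specialized, which is what ends the recursion.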


2.2 Fold Expressions in C++17

To circumvent the problem of recursive template-folding in C++11, a new syntactic feature was introduced in C++17: dedicated fold expressions.

We look at the same example as before, but this time the implementation uses a fold expression:

    template<typename... T>
    auto sum(T... args) {
        // Use a fold expression over the "+" operator
        return (... + args);
    }

Notice the new syntactic feature (... op args).

In this example, with four arguments, (... + args) would expand to ((arg1 + arg2) + arg3) + arg4. On the other hand, (args + ...) would expand to arg1 + (arg2 + (arg3 + arg4)). Just as with parameter packs, the expansion of the fold expression happens purely at compile time.
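For addition on integers both directions give the same result, so the difference is easier to observe with an operator that is not associative. The helpers subtract_left and subtract_right below are hypothetical and exist only to make the grouping visible.

```cpp
// Left fold: the grouping starts at the first argument.
template<typename... T>
constexpr auto subtract_left(T... args) {
    return (... - args);   // ((a1 - a2) - a3) - a4
}

// Right fold: the grouping starts at the last argument.
template<typename... T>
constexpr auto subtract_right(T... args) {
    return (args - ...);   // a1 - (a2 - (a3 - a4))
}

static_assert(subtract_left(1, 2, 3, 4) == -8, "((1 - 2) - 3) - 4");
static_assert(subtract_right(1, 2, 3, 4) == -2, "1 - (2 - (3 - 4))");
```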

2.3 Fold Expression Syntax

As seen before, there are two kinds of unary folds:

∙ Unary left folds: (... op args) → ((arg1 op arg2) op arg3) op ...

∙ Unary right folds: (args op ...) → arg1 op (arg2 op ... (argN-1 op argN))

Syntax-wise, the parentheses are mandatory, and op has to be a binary operator that is well-defined for the supplied arguments.

Notice that the folding direction matters for packs with heterogeneous types, as can be seen in this example, which uses the left-fold sum from section 2.2:

sum(std::string("hello"), "world", "!") will expand to ((std::string("hello") + "world") + "!"), but sum("hello", "world", std::string("!")) will not compile, because the plus operator is only defined on strings when at least one operand is a std::string. This is not the case with ("hello" + "world").
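The following sketch puts both directions side by side. The names concat_left and concat_right are assumptions of this example; they are the left-fold and right-fold variants of the summation applied to strings.

```cpp
#include <string>

// Left fold: the first two arguments are combined first.
template<typename... T>
auto concat_left(T... args) {
    return (... + args);   // (a1 + a2) + a3
}

// Right fold: the last two arguments are combined first.
template<typename... T>
auto concat_right(T... args) {
    return (args + ...);   // a1 + (a2 + a3)
}
```

With these definitions, concat_left(std::string("hello"), "world", "!") and concat_right("hello", "world", std::string("!")) both compile, because the std::string takes part in the innermost operation. Swapping the two argument lists would make the innermost operation an addition of two character pointers, which is rejected by the compiler.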

2.4 Empty Parameter Packs

As noted in section 1.3, parameter packs may be empty. To use fold expressions with empty parameter packs, there are several things to consider.

For example, what would be the result of sum() from the previous example? To answer this question, we have to examine the default value of the operator used. According to the C++17 reference [6], the default values are as follows:

∙ && → true (Logical and)

∙ || → false (Logical or)

∙ , → void() (Function sequencing)

All other operators do not have a default value, thus sum() from our example (which operates on +) is ill-formed.
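The two logical defaults can be verified directly. The helper names all_true and any_true in this sketch are hypothetical.

```cpp
// Unary left fold over &&; yields true for an empty pack.
template<typename... T>
constexpr bool all_true(T... args) {
    return (... && args);
}

// Unary left fold over ||; yields false for an empty pack.
template<typename... T>
constexpr bool any_true(T... args) {
    return (... || args);
}

static_assert(all_true(), "an empty && fold is true");
static_assert(!any_true(), "an empty || fold is false");
```

These defaults are the neutral elements of the respective operators: adding true to a conjunction or false to a disjunction never changes its result.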


2.5 Binary Folds

Based on our example, we now want to change our implementation such that the result of sum() is 0, as we assume it only operates on numerical values. To achieve this, we can use a binary fold.

Binary fold operations incorporate a default value as follows:

∙ Binary left folds: (val op ... op args) → (((val op arg1) op arg2) op arg3) op ...

∙ Binary right folds: (args op ... op val) → arg1 op (arg2 op ... (argN op val))

Notice how val acts as “arg0” in the left fold (and as an additional last argument in the right fold). In the case of our updated summation example, this would be 0. It now looks like this:

    template<typename... T>
    auto sum(T... args) {
        // Use a fold expression over the "+" operator
        // and 0 as a default value
        return (0 + ... + args);
    }
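The initial value of a binary fold also determines the type of the first intermediate result. As a sketch of how this can be used, the hypothetical function concat below starts from an empty std::string, so that plain string literals can be concatenated regardless of their position in the argument list.

```cpp
#include <string>

// Hypothetical helper: the std::string initial value makes every
// intermediate result a std::string, so string literals are accepted.
template<typename... T>
std::string concat(const T&... args) {
    return (std::string{} + ... + args);   // ((init + a1) + a2) + ...
}
```

Here concat("hello", "world", "!") returns "helloworld!", and concat() returns the empty string, just as the summation with the initial value 0 returns 0 for an empty pack.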

2.6 Advanced Usage Examples

2.6.1 Traversing A Tree [1]. Of course, fold expressions can be used to solve a much broader set of problems than just summing up values as seen here. One such problem is traversing the nodes of a binary tree.

Consider an unsorted binary tree, implemented using raw pointers:

    struct Node {
        int value;
        Node* left;
        Node* right;
    };

We would like to select a node in this tree by calling a function like

    Node* node = traverse(root, left, right, left);

To implement this functionality, we declare two pointers to the data members of Node:

    auto left = &Node::left;
    auto right = &Node::right;

Then we fold over the pointer-to-member operator ->*:

    template<typename T, typename... TD>
    Node* traverse(T start, TD... paths) {
        return (start ->* ... ->* paths);
    }
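Put together, the traversal can be tried out on a small tree. The sketch below repeats the definitions from this section in one piece; declaring left and right as constexpr is a choice of this sketch rather than a requirement.

```cpp
struct Node {
    int value;
    Node* left;
    Node* right;
};

// Pointers to data members: they name a member, not an object.
constexpr auto left = &Node::left;
constexpr auto right = &Node::right;

template<typename T, typename... TD>
Node* traverse(T start, TD... paths) {
    // Binary left fold: ((start ->* p1) ->* p2) ->* ...
    return (start ->* ... ->* paths);
}
```

For a tree in which the left child of root has a right child, traverse(&root, left, right) returns a pointer to that grandchild, and traverse(&root) with an empty path returns &root itself. No step checks for null pointers, so a path that continues past a missing child has undefined behaviour.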


2.6.2 Apply Function. We would like to apply a unary function to several arguments:

    template<typename F, typename... T>
    void apply(const F func, const T&... args);

We use the function sequencing operator (the “comma” operator) to fold over a sequence of function calls:

    template<typename F, typename... T>
    void apply(const F func, const T&... args) {
        (func(args), ...);
    }

In this example, apply(func, 3.9f, 42, "Apple", 5.3f) would expand to ((func(3.9f), func(42)), func("Apple")), func(5.3f). Because the comma operator evaluates its operands from left to right regardless of the grouping, this sequences the calls to func.

2.6.3 Printing Function [1]. Furthermore, we might want to write a printing function which would output “helloworld5” for the call print("hello", "world", 5).

    template<typename... T>
    void print(const T&... args) {
        (std::cout << ... << args) << '\n';
    }

Notice how std::cout was used as the initial value of a binary fold. Otherwise, the output of print(1, 8) would not be the desired result, as the following implementation shows:

    template<typename... T>
    void print(const T&... args) {
        std::cout << (... << args) << '\n';
    }

Using this implementation, the result of the call above would be “256”, because 1 << 8 = 256.

Now, to demonstrate further complexity, we would like to add a space between each parameter of the print() function. The helper has to be declared before print() so that it can be found for arguments of built-in types:

    template<typename T>
    const T& output_space(const T& arg) {
        // This is an identity function with a side effect
        std::cout << ' ';
        return arg;
    }

    template<typename... T>
    void print(const T&... args) {
        (std::cout << ... << output_space(args)) << '\n';
    }


In this example, print("hello", "world", 5) would expand to

((std::cout << output_space("hello")) << output_space("world")) << output_space(5), which outputs “ hello world 5”. Each space is written just before its argument, so the line starts with a space. Notice again that the parameter pack expansion happens on the whole expression containing the pack.

3 CONCLUSION

∙ Fold expressions reduce a variadic parameter pack into a single value using an operator

∙ The expansion is purely textual and happens during compile time

∙ Use std::accumulate to do the same with STL containers (e.g. std::initializer_list or std::vector)

∙ Replace C++11 recursive templates with fold expressions to relieve the compiler

∙ Use fold expressions to implement variadic type traits (See Appendix A)
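As a brief illustration of the std::accumulate point, the sketch below folds the elements of a std::vector at runtime. The function names sum_of and product_of are hypothetical.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Runtime left fold over a container: ((0 + v[0]) + v[1]) + ...
inline int sum_of(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// Any binary operation can replace the default operator+.
inline int product_of(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 1,
                           std::multiplies<int>());
}
```

As with a binary fold, the initial value is returned unchanged when the container is empty.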

REFERENCES

[1] Nicolai M. Josuttis. 2018. C++17 - The Complete Guide. Leanpub. 101–111 pages.

[2] CPP Reference. 2017. Variadic functions. Retrieved May 28, 2018 from http://en.cppreference.com/w/cpp/utility/variadic

[3] CPP Reference. 2018. Parameter pack. Retrieved May 28, 2018 from http://en.cppreference.com/w/cpp/language/parameter_pack

[4] CPP Reference. 2018. Variadic arguments. Retrieved May 28, 2018 from http://en.cppreference.com/w/cpp/language/variadic_arguments

[5] Haskell Reference. 2016. Fold. Retrieved June 10, 2018 from https://wiki.haskell.org/Fold


A APPENDIX: EXERCISES

A.1 Check Positive

Write a function

    template<typename... T>
    constexpr bool arePositive(const T&... args);

which returns whether all parameters are greater than or equal to zero. Hint: Use a mapping function to project T& into a new type.

A.2 Add Many

Write a function

    template<typename T, typename... K>
    void push_back_many(std::vector<T>& vec, const K&... args);

which pushes all arguments into the supplied vector.

A.3 Check Type Equality [1]

Write a function

    template<typename T1, typename... TN>
    constexpr bool isHomogeneous(T1, TN...);

which returns whether all parameters have the same type. Hint: Use std::is_same as a mapping function.


B APPENDIX: SOLUTIONS

B.1 Check Positive

    template<typename... T>
    constexpr bool arePositive(const T&... args) {
        return ((args >= 0) && ...);
    }

Notice the anonymous mapping function f(p) → p >= 0, which is of type const T& → bool.

B.2 Add Many

    template<typename T, typename... K>
    void push_back_many(std::vector<T>& vec, const K&... args) {
        (vec.push_back(args), ...);
    }

Notice the comma operator ",".

B.3 Check Type Equality [1]

    template<typename T1, typename... TN>
    constexpr bool isHomogeneous(T1, TN...) {
        return (std::is_same<T1, TN>::value && ...);
    }

Notice that this can even be implemented as a custom type trait:

    template<typename T1, typename... TN>
    struct is_homogeneous {
        static constexpr bool value = (std::is_same<T1, TN>::value && ...);
    };
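Because the trait is evaluated entirely at compile time, its behaviour can be checked with static assertions. The sketch below repeats the trait and adds three checks; the chosen types are arbitrary.

```cpp
#include <type_traits>

template<typename T1, typename... TN>
struct is_homogeneous {
    // Compare every remaining type with the first one and fold over &&.
    static constexpr bool value = (std::is_same<T1, TN>::value && ...);
};

static_assert(is_homogeneous<int, int, int>::value,
              "all types are int");
static_assert(!is_homogeneous<int, int, char>::value,
              "char differs from int");
static_assert(is_homogeneous<double>::value,
              "a single type is trivially homogeneous");
```

The last check relies on the rule from section 2.4: with only one type, the pack TN is empty and the fold over && yields true.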