Lightweight Abstractions in C++, An Introduction To CRTP and Expression Templates
Tip 29
Peter Steinbach
Scientific Computing Facility, Max Planck Institute of Molecular Cell Biology and Genetics
Outline
1. Motivation
2. Performant Inheritance
3. Expressive And Fast Calculations
4. The Case of Eigen
5. Summary
6. Appendix
7. Literature
LightAbstractions
2 / 28
Disclaimer
one-dimensional histogram with double precision data
extensive use of virtual inheritance
overloaded with responsibilities
root.cern.ch[2]
    class AbstrBase {
    public:
        virtual unsigned update(const unsigned& _in) const = 0;
    };

    class Derived : public AbstrBase {
    public:
        unsigned update(const unsigned& _in) const { return _in - 1; }
    };
dynamic polymorphism
function pointers to the available implementations are stored in a per-class table (vtable)
every object carries a pointer to that table; calls are resolved through it at run time
P. Steinbach (MPI CBG) LightAbstractions Dec 12nd, 2013 8 / 28
    unsigned count_down(AbstrBase* _updater, const unsigned& _start_index) {
        unsigned i = _start_index;
        for (; i > 0;) {
            i = _updater->update(i);
        }
        return i;
    }

t = 2.1 s
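A minimal, self-contained sketch of the virtual-dispatch version, assuming C++11. The `override` keyword, the virtual destructor and the helper `timed_count_down` (a hypothetical name built on `std::chrono`) are additions for illustration. The 2.1 s quoted on the slide came from the author's own setup, which is not reproduced here.

```cpp
#include <cassert>
#include <chrono>

class AbstrBase {
public:
    virtual ~AbstrBase() = default;  // added: safe destruction through a base pointer
    virtual unsigned update(const unsigned& _in) const = 0;
};

class Derived : public AbstrBase {
public:
    unsigned update(const unsigned& _in) const override { return _in - 1; }
};

// Every iteration calls update() through the vtable of *_updater.
unsigned count_down(AbstrBase* _updater, const unsigned& _start_index) {
    unsigned i = _start_index;
    while (i > 0) i = _updater->update(i);
    return i;
}

// Hypothetical helper: wall-clock seconds spent in one count_down call.
double timed_count_down(AbstrBase* _updater, unsigned _start_index) {
    const auto begin = std::chrono::steady_clock::now();
    const unsigned result = count_down(_updater, _start_index);
    const auto end = std::chrono::steady_clock::now();
    assert(result == 0);  // the loop only exits once i has reached 0
    return std::chrono::duration<double>(end - begin).count();
}
```

Calling `timed_count_down(&d, n)` on a `Derived d` with a large `n` gives a number to compare against other dispatch schemes; the absolute value depends on compiler, flags and hardware.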
    template <typename Daughter>
    struct CRTPBase {
        int update(const int& _in) const {
            // do_update is a non-static member, so it is called on the derived object
            return static_cast<const Daughter*>(this)->do_update(_in);
        }
    };

    struct CRTPDerived : public CRTPBase<CRTPDerived> {
        int do_update(const int& _in) const { return _in - 1; }
    };

first discussed in 1995 [3]
sub-templated Design-by-Policy implementation [4]
Order of events at compile time:
1. CRTPDerived declared (the name is known, the type is still incomplete)
2. CRTPBase<CRTPDerived> instantiated
3. CRTPDerived defined
The body of CRTPBase::update is only instantiated when it is first used, and by then CRTPDerived is complete.
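A compilable sketch of the pattern, assuming the base class forwards the call with a `static_cast` on `this`, which is the usual way to reach a non-static member of the derived class. `CRTPHalve` and `apply_twice` are hypothetical names added only to show that one base template serves several implementations without any virtual function.

```cpp
#include <cassert>

template <typename Daughter>
struct CRTPBase {
    int update(const int& _in) const {
        // Daughter is complete by the time this body is instantiated,
        // so the downcast and the call are resolved at compile time.
        return static_cast<const Daughter*>(this)->do_update(_in);
    }
};

struct CRTPDerived : public CRTPBase<CRTPDerived> {
    int do_update(const int& _in) const { return _in - 1; }
};

// Hypothetical second implementation, for illustration only.
struct CRTPHalve : public CRTPBase<CRTPHalve> {
    int do_update(const int& _in) const { return _in / 2; }
};

// Accepts any type that derives from CRTPBase of itself.
template <typename T>
int apply_twice(const CRTPBase<T>& _updater, int _value) {
    return _updater.update(_updater.update(_value));
}
```

With these definitions `apply_twice(CRTPDerived{}, 10)` yields 8 and `apply_twice(CRTPHalve{}, 10)` yields 2.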
    template <typename T>
    unsigned count_down(const T& _updater, const unsigned& _start_index) {
        unsigned i = _start_index;
        for (; i > 0;)
            i = _updater.update(i);
        return i;
    }

Any guesses?
t = 0 s
0 s . . . a bug?
compiler simply emitted no code for the CRTP count down
optimisation schemes dropped redundant operations
impossible with dynamic polymorphism: the vtable incurs branching (the virtual version executes about 1000x more branches than the CRTP version)
with virtual dispatch the for loop is actually performed, including the indirection through inheritance
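One way to make the point tangible is to let the compiler evaluate the whole count down itself. The sketch below assumes C++14 `constexpr`, which postdates the talk and is not what produced the 0 s measurement (that came from the optimiser); it only illustrates that every call in the CRTP version is visible at compile time. Virtual functions cannot be `constexpr` before C++20.

```cpp
#include <cassert>

template <typename Daughter>
struct CRTPBase {
    constexpr int update(int _in) const {
        return static_cast<const Daughter*>(this)->do_update(_in);
    }
};

struct CRTPDerived : CRTPBase<CRTPDerived> {
    constexpr int do_update(int _in) const { return _in - 1; }
};

template <typename T>
constexpr unsigned count_down(const T& _updater, unsigned _start_index) {
    unsigned i = _start_index;
    while (i > 0) i = _updater.update(i);
    return i;
}

// The loop is executed by the compiler; no run-time code is needed for it.
static_assert(count_down(CRTPDerived{}, 100u) == 0u, "folded at compile time");
```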
CRTP: Wrap-Up
    template <typename Daughter>
    class CRTPBase {
    public:
        int update(const int& _in) const {
            return static_cast<const Daughter*>(this)->do_update(_in);
        }
    };
    double a[4] = {2, 3, 4, 5};
    double b[4] = {5, 4, 9, 2};
    // b <- 1.0 * a + b
    cblas_daxpy(4, 1.0, a, 1, b, 1);
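For readers without a BLAS at hand, the operation behind the call can be spelled out as a plain loop. `axpy` is a hypothetical stand-in that assumes unit strides (the two `1` arguments of `cblas_daxpy`); it is not part of any library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cblas_daxpy with unit strides: y <- alpha * x + y.
void axpy(std::size_t n, double alpha, const double* x, double* y) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}
```

Applied to the arrays above with `alpha = 1.0`, `b` becomes {7, 7, 13, 7}. The call is fast but says little about the mathematics it performs, which is the motivation for operator syntax.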
    inline const Vector operator+(const Vector& _first, const Vector& _second) {
        Vector temporary(_first.size());
        for (size_t index = 0; index < _first.size(); ++index)
            temporary[index] = _first[index] + _second[index];
        return temporary;
    }

Issues
large vectors: temporary becomes the bottleneck
large vectors: the sum is always carried out over all elements, even if only a few are needed
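The cost of the temporaries can be made visible with a counter. The sketch below assumes a small `Vector` class backed by `std::vector<double>`; the static counter `sized_constructions` and the function `temporaries_in_chained_sum` are hypothetical additions for illustration, and the operator returns a non-const `Vector`, unlike the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Vector {
    std::vector<double> v_;
public:
    static int sized_constructions;  // hypothetical counter, illustration only
    explicit Vector(std::size_t _size) : v_(_size) { ++sized_constructions; }
    std::size_t size() const { return v_.size(); }
    double& operator[](std::size_t _index) { return v_[_index]; }
    const double& operator[](std::size_t _index) const { return v_[_index]; }
};
int Vector::sized_constructions = 0;

inline Vector operator+(const Vector& _first, const Vector& _second) {
    Vector temporary(_first.size());  // one full-size allocation per '+'
    for (std::size_t index = 0; index < _first.size(); ++index)
        temporary[index] = _first[index] + _second[index];
    return temporary;
}

// Counts the full-size temporaries created by one chained sum.
int temporaries_in_chained_sum() {
    Vector a(4), b(4), c(4), d(4);
    const int before = Vector::sized_constructions;
    d = a + b + c;  // (a + b) is temporary 1, (... + c) is temporary 2
    return Vector::sized_constructions - before;
}
```

Each additional `+` in the chain adds another full-size temporary, so the overhead grows with the length of the expression.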
    template <typename A, typename B>
    class sum {
        const A& left_;
        const B& right_;
    public:
        explicit sum(const A& _left, const B& _right) : left_(_left), right_(_right) {}
        size_t size() const { return left_.size(); }
        double operator[](size_t _index) const { return left_[_index] + right_[_index]; }
    };

sum is cheap in terms of memory (2 references inside)
stores and passes on the operation for one element
    // ...
            v_[index] = _in[index];
        return (*this);
    }
    // ...
    };

added new assignment operator to Vector
just an example of interfacing Vector with sum
required for (c = a + b)
no memory overhead by temporaries
achieved lazy evaluation (the operation is carried out only once it is really needed)
intermediate abstraction has a small memory footprint
syntax is expressive and readable
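Putting the pieces together, a complete expression-template round trip might look as follows. This is a sketch under assumptions: the namespace `sketch`, the `std::vector<double>` storage, the fill-value constructor and the unconstrained `operator+` template are choices made here for brevity, not taken from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Expression node: remembers its operands, computes one element on demand.
template <typename A, typename B>
class sum {
    const A& left_;
    const B& right_;
public:
    sum(const A& _left, const B& _right) : left_(_left), right_(_right) {}
    std::size_t size() const { return left_.size(); }
    double operator[](std::size_t _index) const { return left_[_index] + right_[_index]; }
};

class Vector {
    std::vector<double> v_;
public:
    explicit Vector(std::size_t _size, double _value = 0.0) : v_(_size, _value) {}
    std::size_t size() const { return v_.size(); }
    double operator[](std::size_t _index) const { return v_[_index]; }
    double& operator[](std::size_t _index) { return v_[_index]; }

    // Evaluation happens here, element by element, without a temporary Vector.
    template <typename A, typename B>
    Vector& operator=(const sum<A, B>& _in) {
        for (std::size_t index = 0; index < _in.size(); ++index)
            v_[index] = _in[index];
        return *this;
    }
};

// Building the expression only stores two references; nothing is added yet.
template <typename A, typename B>
sum<A, B> operator+(const A& _left, const B& _right) {
    return sum<A, B>(_left, _right);
}

}  // namespace sketch
```

An expression such as `c = a + b + d` then builds a `sum<sum<Vector, Vector>, Vector>` and runs a single loop in the assignment. Because `sum` holds references, the expression object must not outlive the full expression it was created in.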
There is more . . .
expression templates first reported in 1995 [7]
the same scheme is applicable in many more fields
expression templates allow modular jump-in of accelerator code
used by: Eigen [8], Blaze [9], MTL4 [10] . . .
for linear algebra, matrix and vector operations, numerical solvers and related algorithms
header-only library complying with the C++98 standard
explicit vectorization performed for SSE 2/3/4, ARM NEON, and AltiVec instruction sets, with non-vectorized implementations as fallback
open source under Mozilla Public License v2
for details visit [8]
[Class diagram: MatrixBase<Derived> with operator=(); internal::assign_impl<Derived, OtherDerived> with run(dst: Derived, other: OtherDerived); internal::assign_selector<Derived, OtherDerived> with run(dst: Derived, other: OtherDerived); internal::pload, internal::padd, internal::pstore; VectorXf and CwiseBinaryOp attached to MatrixBase via CRTP.]
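The combination of CRTP and expression templates named in the diagram can be mimicked in a few lines. This is a toy sketch, not Eigen's code: `ToyBase`, `ToySum` and `ToyVector` are hypothetical names standing in for the roles of MatrixBase, CwiseBinaryOp and VectorXf, and the assignment dispatch layers and SIMD packet operations (pload, padd, pstore) are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace toy {

// CRTP base: plays the role MatrixBase<Derived> has in the diagram.
template <typename Derived>
struct ToyBase {
    const Derived& derived() const { return static_cast<const Derived&>(*this); }
};

// Expression node: plays the role of CwiseBinaryOp for a sum.
template <typename Lhs, typename Rhs>
struct ToySum : ToyBase<ToySum<Lhs, Rhs>> {
    const Lhs& lhs;
    const Rhs& rhs;
    ToySum(const Lhs& _lhs, const Rhs& _rhs) : lhs(_lhs), rhs(_rhs) {}
    std::size_t size() const { return lhs.size(); }
    float coeff(std::size_t _i) const { return lhs.coeff(_i) + rhs.coeff(_i); }
};

// Storage type: plays the role of VectorXf.
struct ToyVector : ToyBase<ToyVector> {
    std::vector<float> data;
    explicit ToyVector(std::size_t _n, float _value = 0.0f) : data(_n, _value) {}
    std::size_t size() const { return data.size(); }
    float coeff(std::size_t _i) const { return data[_i]; }

    // Accepts any expression deriving from ToyBase and evaluates it here.
    template <typename Other>
    ToyVector& operator=(const ToyBase<Other>& _other) {
        const Other& expr = _other.derived();
        for (std::size_t i = 0; i < expr.size(); ++i) data[i] = expr.coeff(i);
        return *this;
    }
};

// Only types deriving from ToyBase take part, so the overload is not greedy.
template <typename Lhs, typename Rhs>
ToySum<Lhs, Rhs> operator+(const ToyBase<Lhs>& _lhs, const ToyBase<Rhs>& _rhs) {
    return ToySum<Lhs, Rhs>(_lhs.derived(), _rhs.derived());
}

}  // namespace toy
```

The CRTP base is what lets `operator+` and `operator=` accept "anything that is an expression" while still knowing the concrete type, so the element loop compiles down to direct calls.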
Summary
C++ template engine provides tools for lightweight abstractions
CRTP can replace run-time virtual dispatch with compile-time dispatch
expression templates reduce the need for temporaries and more
revisit your code and ask which portions are compile-time constants
Tip 29
Final Word
from [11]
    template <typename T>
    T add(const T& _first, const T& _second) {
        return _first + _second;
    }

C++ templates are processed by the template engine before compilation into binary
templates and template engine form a Turing-complete programming language [12]
using C++ templates for meta programming: C++ Template Meta Programming
    unsigned int runtime_factorial(unsigned int n) {
        if (n == 0)
            return 1;
        else
            return n * runtime_factorial(n - 1);
    }
input parameters required at compile time (implies static constness)
result known at compile time already, no runtime investment
these days CPUs are so fast that this example is more academic
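The compile-time counterpart of the runtime factorial is the classic template metaprogramming example; a sketch is given here because the corresponding listing did not survive in this text. The name `compiletime_factorial` is an assumption made for illustration.

```cpp
#include <cassert>

// Runtime version, evaluated when the program runs.
unsigned int runtime_factorial(unsigned int n) {
    return n == 0 ? 1 : n * runtime_factorial(n - 1);
}

// Compile-time version: the recursion happens during template instantiation.
template <unsigned int N>
struct compiletime_factorial {
    static const unsigned int value = N * compiletime_factorial<N - 1>::value;
};

// The explicit specialisation for 0 terminates the recursion.
template <>
struct compiletime_factorial<0> {
    static const unsigned int value = 1;
};

// The result is a constant in the binary; nothing is computed at run time.
static_assert(compiletime_factorial<5>::value == 120, "5! == 120");
```

The argument must be a compile-time constant, which is the "static constness" mentioned above; `runtime_factorial` accepts any value known only at run time.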
Appendix: VTable
clang -cc1 -fdump-record-layouts virtual_sizeof.cpp
    *** Dumping AST Record Layout
       0 | class Derived
       0 |   class AbstrBase (primary base)
       0 |     (AbstrBase vtable pointer)
       0 |     (AbstrBase vftable pointer)
         | [sizeof=8, dsize=8, align=8
         |  nvsize=8, nvalign=8]
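The same observation can be checked from within the language. The sketch below assumes a mainstream ABI, where a polymorphic object stores one vtable pointer and an empty class occupies a single byte; neither size is guaranteed by the C++ standard, so the assertions are illustrative rather than portable guarantees.

```cpp
#include <cassert>

class AbstrBase {
public:
    virtual unsigned update(const unsigned& _in) const = 0;
};

class Derived : public AbstrBase {
public:
    unsigned update(const unsigned& _in) const override { return _in - 1; }
};

template <typename Daughter>
struct CRTPBase {
    int update(const int& _in) const {
        return static_cast<const Daughter*>(this)->do_update(_in);
    }
};

struct CRTPDerived : CRTPBase<CRTPDerived> {
    int do_update(const int& _in) const { return _in - 1; }
};

// The polymorphic object has to hold at least the vtable pointer ...
static_assert(sizeof(Derived) >= sizeof(void*), "room for a vtable pointer");
// ... while the CRTP object carries no per-object dispatch data.
static_assert(sizeof(CRTPDerived) < sizeof(Derived), "no vtable pointer in CRTP object");
```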
Appendix: Disclaimer
All images (unless stated otherwise) in this presentation were taken from the OpenClipArt gallery and are in the public domain. See http://openclipart.org/share for more information on sharing terms.
Literature
[1] A. Hunt and D. Thomas, The Pragmatic Programmer. Addison-Wesley, 2000.
[2] R. Brun and F. Rademakers, Root - an object oriented data analysis framework, in Proceedings AIHENP96 Workshop, vol. A of Nucl. Inst. & Meth. in Phys. Res., pp. 81-86, September 1996. http://root.cern.ch.
[3] J. O. Coplien, Curiously recurring template pattern, in C++ Report, pp. 24-27, February 1995.
[4] A. Alexandrescu, Modern C++ Design: Generic Programming and Design Patterns Applied. Addison-Wesley, 2001.
[5] E. Bendersky, http://eli.thegreenplace.net/2013/12/05/the-cost-of-dynamic-virtual-calls-vs-static-crtp-dispatch-in-c/. Interesting performance evaluation of CRTP.
[6] K. Iglberger, G. Hager, J. Treibig, and U. Rüde, Expression templates revisited: A performance analysis of current methodologies, SIAM Journal on Scientific Computing, vol. 34, no. 2, pp. C42-C69, 2012.
[7] T. Veldhuizen, Expression templates, in C++ Report, vol. 5, pp. 26-31, June 1995.
[8] Eigen. http://eigen.tuxfamily.org/index.php?title=Main_Page. C++ library of template headers for linear algebra, matrix and vector operations, numerical solvers and related algorithms.
[9] Blaze-lib. http://code.google.com/p/blaze-lib/.
[10] Mtl4 - matrix template library. http://www.simunova.com/de/node/24.
[11] K. Rocki, M. Burtscher, and R. Suda, The future of accelerator programming: Abstraction, performance or can we have both?, 2014. http://olab.is.s.u-tokyo.ac.jp/~kamil.rocki/pub.html.
[12] Proving Turing-completeness of C++ templates. http://matt.might.net/articles/c++-template-meta-programming-with-lambda-calculus/.
[13] Template metaprogramming. http://en.wikipedia.org/wiki/Template_metaprogramming.
[14] http://talesofcpp.fusionfenix.com/post-12/episode-eight-the-curious-case-of-the-recurring-template-pattern.