Advanced C++
January 28, 2019
Mike Spertus
mike_spertus@[Link]
SORTING PERFORMANCE
Advanced: Were you surprised by the
different median performances?
⚫ When I first built and ran median_test.cpp (on
Canvas), it showed that nth_element was
the fastest, then partial_sort, and sort
was slowest
⚫ Seems logical
⚫ “The less you need sorted, the less time it takes”
⚫ Unfortunately, that is not correct
Oops! Forgot to optimize!
⚫ Those last results were from a debug build,
which is worthless for benchmarking
⚫ If we switch to a release build, the results are
much different
⚫ Now partial_sort is the slowest!
The moral: Always benchmark
Don’t just assume
⚫ Implementers use different algorithms for sort and
partial_sort
⚫ The problem with using partial_sort in the
homework is that sorting half the range is pretty
close to sorting all of the range, so the most
efficient overall sorting algorithm is best
⚫ partial_sort is really designed for getting, say, a top 10
out of a million elements
⚫ This isn’t relevant to nth_element, and indeed its
linear-on-average performance guarantee (28.7.2
[[Link]]) gives the best performance
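The three approaches being compared can be sketched roughly as follows. This is only an illustration: the function names are hypothetical and are not taken from median_test.cpp, the input is assumed to be non-empty, and "median" here simply means the element that would land at index size()/2 after a full sort.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helpers (not from median_test.cpp). Each takes the vector
// by value, so the caller's data is left untouched. All assume v is non-empty.

double median_sort(std::vector<double> v) {
    std::sort(v.begin(), v.end());                   // sorts everything
    return v[v.size() / 2];
}

double median_partial_sort(std::vector<double> v) {
    auto mid = v.begin() + v.size() / 2;
    std::partial_sort(v.begin(), mid + 1, v.end());  // sorts only [begin, mid]
    return *mid;
}

double median_nth_element(std::vector<double> v) {
    auto mid = v.begin() + v.size() / 2;
    std::nth_element(v.begin(), mid, v.end());       // only puts *mid in its sorted place
    return *mid;
}

// Quick self-check: all three must agree on the same input.
void check_medians_agree() {
    std::vector<double> data{5.0, 1.0, 4.0, 2.0, 3.0};
    assert(median_sort(data) == 3.0);
    assert(median_partial_sort(data) == 3.0);
    assert(median_nth_element(data) == 3.0);
}
```

All three return the same value; only the amount of work differs, which is why the timings have to be measured in an optimized build rather than guessed.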
PASSING ARGUMENTS
TO FUNCTIONS AND METHODS
Passing arguments to
⚫ Consider the declaration of the function for
printing Triangles in
class_oriented_pascal.cpp
⚫ inline ostream&
operator<<(ostream &os, Triangle triangle)
⚫ What are those mysterious &s, and which
arguments should have one?
How to pass arguments
⚫ In C++, there are three ways to pass arguments to a
function
⚫ By value
⚫ void f(X x);
⚫ x is a copy of the caller’s object
⚫ Any changes f makes to x are not seen by the caller
⚫ By reference
⚫ void g(X &x);
⚫ x refers to the caller’s existing object without creating a new copy
⚫ Any changes g makes to x also change what the caller sees
⚫ By move
⚫ void h(X &&x);
⚫ h takes over x from the caller, who should not ever use that
object again
Example:
void f(int x) { x = 3; cout << "f’s x is " << x << endl; }
void g(int &x) { x = 4; cout << "g’s x is " << x << endl; }
void h(int &&x) { x = 5; cout << "h’s x is " << x << endl; }
int main() {
int x = 2;
cout << x << endl; // prints 2
f(x); // prints f’s x is 3
cout << x << endl; // prints 2
g(x); // prints g’s x is 4
cout << x << endl; // prints 4
h(x); // Illegal! x may be used again
h(5*5); // prints h’s x is 5
return 0;
}
Why does only C++ make you
specify how an object is passed?
⚫ Most languages have only one way of
passing arguments
⚫ In C, arguments are always passed by value
⚫ The function always gets its own copy
⚫ In Java, arguments are always passed by
reference
⚫ The function sees the same object the caller passed
⚫ Exception: built-in numeric types like int are passed by
value
⚫ As usual, C++ lets you choose which you want
(and adds a new “move” mode)
Passing by value:
pros and cons
⚫ Pro: It is very safe to call a function that takes its
arguments by value because the function you called can’t
actually see the object you passed it
⚫ Con: Copying an object may be very expensive
⚫ Imagine you are copying a map with a million key-value pairs
⚫ Con: Doesn’t work with inheritance/OO
⚫ If I copy an Animal, but it’s really a Gorilla, my copy will only
have Animal fields and behave like a plain Animal: the Gorilla
fields and virtual function overrides are silently lost (“slicing”)
⚫ Con: Sometimes I may want to modify the caller’s object.
E.g., they want me to clean the data in a vector.
Passing by reference:
pros and cons
⚫ Con: It is dangerous to call a function that takes its arguments by
reference because it may unexpectedly modify your object
⚫ Pro: You can fix this by passing the argument by “const
reference”, which says the function is not allowed to modify it
(e.g., by calling a non-const method on it)
⚫ void f(int const &i) { i = 3; } // illegal: i is const
⚫ You can say either X const or const X. Which is better? HW!
⚫ Pro: Efficient, since object doesn’t need to be copied
⚫ Pro: Works with inheritance/OO
⚫ No slicing because I am working with the original object
⚫ Pro: I can modify the object if desired and appropriate
⚫ Con: Managing memory is difficult
⚫ If my object has many owners, how do I know when it is safe to delete it?
⚫ We will learn best practices around this later
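A minimal sketch of the slicing issue and of how a const reference avoids it. The Animal and Gorilla classes and both function names are made up for illustration; they only stand in for whatever hierarchy you actually have.

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy, for illustration only.
struct Animal {
    virtual ~Animal() = default;
    virtual std::string name() const { return "Animal"; }
};

struct Gorilla : Animal {
    std::string name() const override { return "Gorilla"; }
};

// By value: the parameter is a brand-new Animal copied from the argument,
// so the Gorilla part is sliced away.
std::string describe_by_value(Animal a) { return a.name(); }

// By const reference: no copy, no slicing, and the function cannot
// modify the caller's object.
std::string describe_by_const_ref(Animal const &a) { return a.name(); }

void check_slicing() {
    Gorilla g;
    assert(describe_by_value(g) == "Animal");       // sliced copy
    assert(describe_by_const_ref(g) == "Gorilla");  // original object
}
```

The by-value version compiles without complaint, which is what makes slicing easy to miss.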
Passing by move
Pros and Cons
⚫ Sometimes you need your own copy of an object
(by value!) but the caller doesn’t need it any
more, so it seems wasteful to copy the object
⚫ For example, if it is a binary tree, you could just copy the
root and take over ownership of the leaves
⚫ As we will see later, this can make
algorithms like sort as much as
100x faster
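As a rough sketch of what "taking over" an object looks like at the call site, the hypothetical Archive class below accepts a vector by rvalue reference and move-constructs its own member from it. The class and function names are assumptions made for this example only.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink: takes over the caller's vector instead of copying it.
struct Archive {
    std::vector<std::string> rows;
    explicit Archive(std::vector<std::string> &&incoming)
        : rows(std::move(incoming)) {}   // move-construct: steals the buffer
};

void check_move() {
    std::vector<std::string> data{"a", "b", "c"};
    std::string const *buffer = data.data();   // where the elements live now

    // Archive copy_attempt(data);       // would not compile: data is an lvalue
    Archive archive(std::move(data));    // caller gives up data's contents

    assert(archive.rows.size() == 3);
    assert(archive.rows.data() == buffer);   // same storage: no element was copied
}
```

std::move does not move anything by itself; it only marks data as something the callee is allowed to take over.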
So what should our Triangle
printer look like?
⚫ In the code, it is
⚫ ostream&
operator<<(ostream &os, Triangle triangle)
⚫ What is best?
⚫ We want to print to the original ostream, so better not copy it
⚫ (In fact, the compiler won’t even let us make a copy of an ostream, as
we will see)
⚫ Using ostream& is indeed best here
⚫ Copying a (possibly big) Triangle just so our printing function
can see it seems wasteful, so we would be better off with a
reference
⚫ But we also want to be able to print Triangles that we don’t have
the right to modify, which leads us to:
⚫ ostream&
operator<<(ostream &os, Triangle const &triangle)
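A possible shape for such a printer is sketched below. The Triangle struct here is a hypothetical stand-in (a vector of rows); the real class in class_oriented_pascal.cpp may be laid out quite differently.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in; the real Triangle may differ.
struct Triangle {
    std::vector<std::vector<int>> rows;
};

// The stream is taken by (non-const) reference because printing changes it;
// the Triangle is taken by const reference because printing only reads it.
inline std::ostream &operator<<(std::ostream &os, Triangle const &triangle) {
    for (auto const &row : triangle.rows) {
        for (int value : row) {
            os << value << ' ';
        }
        os << '\n';
    }
    return os;   // returning the stream lets callers chain: os << a << b
}

void check_print() {
    Triangle const t{{{1}, {1, 1}, {1, 2, 1}}};   // a const object prints fine
    std::ostringstream out;
    out << t;
    assert(out.str() == "1 \n1 1 \n1 2 1 \n");
}
```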
CONSTRUCTORS AND
DESTRUCTORS
How does C++ know how to
copy and move an object?
⚫ A class’ “copy constructor” explains how to copy the class
⚫ struct X {
X(X const &x) { /* how to copy */ }
};
⚫ Not to worry. If you don’t specify a copy constructor, the compiler will
automatically generate one for you
⚫ It calls the copy constructors of all of the base classes and members in the same order
we discussed last week
⚫ Still, sometimes you need to explain how to copy your class, so the ability to
customize is good
⚫ If you are familiar with clone() methods in Java, this is very similar
⚫ Later this quarter, you will write a binary tree that does a deep copy
⚫ Sometimes, you don’t want a class to be copyable
⚫ struct X { X(X const &x) = delete; }; // Don’t copy X objects
⚫ (That is how ostreams prevent themselves from being copied)
⚫ Move construction is handled analogously
⚫ struct X { X(X &&x) { /* how to move */ } };
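To see which constructor the compiler picks, here is a small sketch with a hypothetical Tracked class that records how each instance was created, plus a hypothetical NoCopy class with a deleted copy constructor. Note that a move constructor conventionally takes a non-const X&&, since it usually needs to modify the object it takes over.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical class that records how each instance came to be.
struct Tracked {
    enum class Origin { fresh, copied, moved };
    Origin origin = Origin::fresh;

    Tracked() = default;
    Tracked(Tracked const &) : origin(Origin::copied) {}     // copy constructor
    Tracked(Tracked &&) noexcept : origin(Origin::moved) {}  // move constructor
};

struct NoCopy {
    NoCopy() = default;
    NoCopy(NoCopy const &) = delete;   // copying is a compile-time error
};

static_assert(!std::is_copy_constructible<NoCopy>::value,
              "NoCopy cannot be copied");

void check_copy_and_move() {
    Tracked original;
    Tracked copy(original);               // lvalue argument: copy constructor
    Tracked stolen(std::move(original));  // rvalue argument: move constructor
    assert(original.origin == Tracked::Origin::fresh);
    assert(copy.origin == Tracked::Origin::copied);
    assert(stolen.origin == Tracked::Origin::moved);
}
```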
invoking a constructor to
create an object
// [Link] on chalk
#include<initializer_list>
using std::initializer_list;
struct A {
A(int i, int j = 0); // #1
explicit A(int i); // #2
};
struct B {
B(int i, int j); // #3
B(initializer_list<int> l); // #4
};
int main() {
A a0 = 3; // A(3, 0) #1
A a1(3); // A(3) #2
A a2{3}; // A(3) #2. Uniform init: best practice but
B b0(3, 5); // #3
B b1 = { 3, 5}; // #4
B b2{3, 5}; // #4 Initializer list is preferred
}
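The difference between parentheses and braces for struct B can be checked with a small sketch like the one below, where each constructor records its own number in a member (the which member is an addition for illustration, not part of the slide's code).

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

struct B {
    int which;                                     // records the constructor used
    B(int, int) : which(3) {}                      // #3
    B(std::initializer_list<int>) : which(4) {}    // #4
};

void check_constructor_choice() {
    B b0(3, 5);       // parentheses: #3
    B b1 = {3, 5};    // braces prefer the initializer_list constructor: #4
    B b2{3, 5};       // #4
    assert(b0.which == 3 && b1.which == 4 && b2.which == 4);

    // The same rule explains a classic std::vector surprise.
    std::vector<int> v1(3, 5);   // three elements, each 5
    std::vector<int> v2{3, 5};   // two elements: 3 and 5
    assert(v1.size() == 3 && v2.size() == 2);
}
```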
Constructor Signatures
struct A {
A(int _i = 0) : i(_i) {}
// Alternate
// A(int _i = 0) : i{_i} {}
// Alternate 2
// A(int _i = 0) { i = _i; }
int i;
};
Implicit conversions
⚫ Built-in
⚫ int i = 780;
long l = i;
char c = 7;
char c2 = i; // No warning, but dangerous!
⚫ Polymorphism
⚫ Animal *ap = new Dog;
⚫ Animal a = Dog(); // Compiles, but dangerous! Slicing
⚫ User-defined
⚫ Constructors
⚫ Operator overloading
⚫ “Standard Conversions”
⚫ Defined in clause 4 of the standard
Constructors and typecasts
struct A {
A();
A(int i);
A(int i, string s);
explicit A(double d);
};
A a;
A a0(1, "foo");
A aa = { 1, "foo"};
A a1(7); // Calls A(int)
a1 = 77;// ok
A a3(5.4); // Calls A(double)
a3 = 5.5; // Calls A(int)!!
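The surprising last line can be verified with a sketch like this one, in which each constructor records its own name (the made_by member is added purely for illustration). A double variable is used instead of a literal only to keep compilers from warning about the literal conversion.

```cpp
#include <cassert>
#include <string>

struct A {
    std::string made_by;   // records which constructor ran
    A() : made_by("A()") {}
    A(int) : made_by("A(int)") {}
    A(int, std::string) : made_by("A(int, string)") {}
    explicit A(double) : made_by("A(double)") {}
};

void check_explicit() {
    A a3(5.4);   // direct initialization may use explicit constructors
    assert(a3.made_by == "A(double)");

    double d = 5.5;
    a3 = d;      // implicit conversion: explicit A(double) is skipped,
                 // so d is converted to int and A(int) runs
    assert(a3.made_by == "A(int)");

    A aa = {1, "foo"};
    assert(aa.made_by == "A(int, string)");
}
```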
Type conversion operators
struct seven {
operator int() { return 7; }
};
struct A { A(int); };
int i = seven();
A a = 7;
A a2 = seven(); // Illegal, two user-
// defined conversions not allowed
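The "at most one user-defined conversion" rule can be checked at compile time with type traits. The sketch below gives A a value member so the result can be inspected; that member is an addition for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct seven {
    operator int() const { return 7; }
};

struct A {
    int value;
    A(int i) : value(i) {}
};

// seven -> int -> A would need two user-defined conversions, so there is
// no implicit conversion from seven to A...
static_assert(!std::is_convertible<seven, A>::value, "no implicit seven -> A");
// ...but direct initialization only needs seven -> int for the argument.
static_assert(std::is_constructible<A, seven>::value, "A(seven) is fine");

void check_conversions() {
    int i = seven();                   // one user-defined conversion
    A a = 7;                           // one user-defined conversion
    A b(static_cast<int>(seven()));    // spell out the first step explicitly
    assert(i == 7 && a.value == 7 && b.value == 7);
}
```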
Explicit conversions
⚫ Old-style C casts (Legal but bad!)
⚫ char *f(void *vp) { return (char *)vp; }
⚫ New template casting operators
⚫ static_cast<T>
⚫ Like C casts, but only makes conversions that are always valid. E.g, convert
one integral type to another (truncation may still occur).
⚫ dynamic_cast<T*>
⚫ Casts between pointer types. Can even cast a Base* to a Derived* but only
does the cast if the target object really is a Derived.
⚫ Only works when the base class has a vtable (because the compiler adds a
secret virtual function that keeps track of the real run-time type of the object).
⚫ If the object is not really a T, dynamic_cast<T*> returns nullptr.
⚫ reinterpret_cast<T*>
⚫ Does a bitwise reinterpretation between any two pointer types, even for
unrelated types. Never changes the raw address stored in the pointer. Also
can convert between integral and pointer types.
⚫ const_cast<T>
⚫ Can change constness or volatileness only
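A short sketch of three of these casts in action (dynamic_cast is left out because HW 3.1 explores it). The truncation result assumes an 8-bit unsigned char, which holds on essentially all current platforms.

```cpp
#include <cassert>
#include <cstdint>

void check_casts() {
    // static_cast between integral types: always allowed, but may truncate.
    int i = 780;
    auto byte = static_cast<unsigned char>(i);   // 780 mod 256, assuming 8-bit char
    assert(byte == 12);

    // const_cast changes constness only. Writing through the result is safe
    // here because the underlying object (i) was not declared const.
    int const *read_only = &i;
    int *writable = const_cast<int *>(read_only);
    *writable = 781;
    assert(i == 781);

    // reinterpret_cast between pointer and integral types: a round trip
    // through std::uintptr_t gives back the original pointer.
    auto bits = reinterpret_cast<std::uintptr_t>(&i);
    int *again = reinterpret_cast<int *>(bits);
    assert(again == &i);
}
```

Each named cast can do only one kind of conversion, so the reader (and the compiler) can tell what was intended, unlike a C-style cast that silently picks whichever one compiles.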
Copy constructors
⚫ Classes can have constructors that show
how to make copies.
⚫ Signature is T(T const &)
⚫ A default copy constructor is almost always
generated
⚫ Calls the copy constructors of all the base classes
and members in the order we will discuss
⚫ T(T const &) = delete;
Default constructor
⚫ The “No argument” constructor is called the default constructor
⚫ struct A {
A() : i{5} {} // Default constructor
int i;
};
A a; // Calls default constructor
⚫ If you don’t define any constructors in your class, the compiler will
generate a default constructor for you
⚫ struct B {
string s;
};
B b; // Calls compiler-generated default constructor
⚫ If you define any constructors, the compiler will not generate a default
constructor
⚫ struct C {
C(double d) : d{d} {}
double d;
};
C c; // Ill-formed! C has no default constructor
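These rules can be confirmed with std::is_default_constructible. The C2 struct below is a hypothetical variant showing one common way to get the default constructor back: declare it with = default.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct B {
    std::string s;            // no constructors declared: compiler generates B()
};

struct C {
    C(double d) : d{d} {}     // a user-declared constructor suppresses C()
    double d;
};

struct C2 {                   // hypothetical variant of C
    C2() = default;           // ask for the generated default constructor back
    C2(double d) : d{d} {}
    double d{0.0};
};

static_assert(std::is_default_constructible<B>::value, "B b; is fine");
static_assert(!std::is_default_constructible<C>::value, "C c; is ill-formed");
static_assert(std::is_default_constructible<C2>::value, "C2 c; is fine again");

void check_default_construction() {
    B b;
    C2 c;
    assert(b.s.empty() && c.d == 0.0);
}
```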
Review: Order of construction
⚫ Virtual base classes first
⚫ Even if not immediate
⚫ Then the remaining (non-virtual) base class constructors
are run in the order they are declared
⚫ Next, member constructors are run in the order of
declaration
⚫ This is defined, but very complicated
⚫ Best practice: Don’t rely on it
⚫ Good place for a reminder: Best practice: don’t use virtual
functions in constructors
Constructor ordering
class A {
public:
A(int i) : y(i++), x(i++) {}
int x, y;
int f() { return x*y*y; }
};
⚫ What is A(2).f()?
Answer: 18! (x is initialized first, because it was declared
first. Order in constructor initializer list doesn’t matter)
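The answer is easy to confirm by running the class as written. Many compilers warn (for example with -Wall) that the initializer list order does not match the declaration order, which is a useful hint that something is off.

```cpp
#include <cassert>

class A {
public:
    // Initializers are written y-then-x, but members are initialized in
    // declaration order: x first, then y.
    A(int i) : y(i++), x(i++) {}
    int x, y;
    int f() { return x * y * y; }
};

void check_order() {
    A a(2);
    assert(a.x == 2 && a.y == 3);   // x took i first, then y
    assert(a.f() == 18);            // 2 * 3 * 3
}
```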
Destructor ordering
⚫ Reverse of constructor ordering
⚫ Begin by calling total object destructor
⚫ Then members in reverse order of
declaration
⚫ Then non-virtual base classes in reverse
order
⚫ Finally, virtual base classes
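Both orderings can be observed with a small tracing sketch. All of the names here (trace, Noisy, Base, Derived) are hypothetical, and the example sticks to a single non-virtual base to keep the output short.

```cpp
#include <cassert>
#include <string>

std::string trace;   // records the order of events

struct Noisy {
    char const *name;
    explicit Noisy(char const *n) : name(n) { trace += name; trace += ' '; }
    ~Noisy() { trace += '~'; trace += name; trace += ' '; }
};

struct Base {
    Base() { trace += "Base "; }
    ~Base() { trace += "~Base "; }
};

struct Derived : Base {
    Noisy first{"first"};
    Noisy second{"second"};
    Derived() { trace += "Derived "; }
    ~Derived() { trace += "~Derived "; }
};

void check_lifetime_order() {
    trace.clear();
    { Derived d; }   // constructed, then destroyed at the closing brace
    assert(trace ==
           "Base first second Derived ~Derived ~second ~first ~Base ");
}
```

The destruction half of the trace is exactly the construction half reversed.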
Non-static Data Member
Initializers
⚫ We can simplify constructors by giving initializers to members that usually have the same value
⚫ Bad:
struct A {
A(): a(7), b(5), hash_algorithm("MD5"), s("Constructor run") {}
A(int a_val) : a(a_val), b(5), hash_algorithm("MD5"), s("Constructor run") {}
A(D d) : a(f(d)), b(g(d)), hash_algorithm("MD5"), s("Constructor run") {}
int a;
int b;
// Cryptographic hash to be applied to all A instances
HashingFunction hash_algorithm;
std::string s; // String indicating state in object lifecycle
};
⚫ Good:
struct A {
A() {}
A(int a_val) : a(a_val) {}
A(D d) : a(f(d)), b(g(d)) {}
int a{7};
int b{5};
// Cryptographic hash to be applied to all A instances
HashingFunction hash_algorithm{"MD5"};
std::string s{"Constructor run"}; // String indicating state in object lifecycle
};
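A compilable sketch of the same idea follows. Since D, f, g, and HashingFunction are not defined on the slide, this version uses a hypothetical Settings struct with a plain std::string standing in for the hashing function and a two-int constructor standing in for A(D).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the slide's struct A.
struct Settings {
    Settings() = default;                         // every member uses its default
    explicit Settings(int a_val) : a(a_val) {}    // overrides a only
    Settings(int a_val, int b_val) : a(a_val), b(b_val) {}

    int a{7};
    int b{5};
    std::string hash_algorithm{"MD5"};            // stand-in for HashingFunction
    std::string s{"Constructor run"};
};

void check_member_initializers() {
    Settings d;
    Settings one(1);
    Settings two(1, 2);
    assert(d.a == 7 && d.b == 5 && d.hash_algorithm == "MD5");
    assert(one.a == 1 && one.b == 5 && one.s == "Constructor run");
    assert(two.a == 1 && two.b == 2);
}
```

A constructor's initializer list only needs to mention the members whose value differs from the default; for those members the default initializer is ignored.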
Default construction of
primitive types
⚫ Primitive types are not initialized when default
constructed unless an empty initializer is
explicitly passed
int i; // i contains garbage.
// Efficient but unsafe
int j{}; // Initialized to 0
struct A {
int j;
};
A a; // a.j could be anything
// How would you fix?
// More on this in a few slides
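Two possible fixes are sketched below, using hypothetical struct names: value-initialize the object with empty braces at the point of use, or give the member a default initializer so it can never be left uninitialized.

```cpp
#include <cassert>

struct Uninit {
    int j;       // left indeterminate by "Uninit u;"
};

struct WithDefault {
    int j{};     // member initializer: j starts at 0 however the object is created
};

void check_zero_init() {
    int k{};          // empty braces value-initialize to 0
    Uninit u{};       // empty braces also zero the members of an aggregate
    WithDefault w;    // no braces needed thanks to the member initializer
    assert(k == 0 && u.j == 0 && w.j == 0);
}
```

The second form is usually more robust, because it does not rely on every caller remembering the braces.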
HW 3.1
⚫ People tend to use C-style casts because they are shorter
and more convenient
⚫ Let’s try to see if C++-style casts are really better
⚫ Define classes D and B such that D inherits from B and create
a B *b, such that dynamic_cast<D*>(b) and the C-style
cast (D*)b give different results.
⚫ You can demonstrate they give different results simply by
printing them as pointers:
cout << dynamic_cast<D*>(b) << endl;
cout << (D*)b;
⚫ Which one is better?
⚫ If you wanted to get the C-style behavior but still don’t want to
use “bad” C-style casts, what C++ cast would you use?
HW 3.2 – More practice with
classes
⚫ Your assignment is to implement the "Animal Game." The idea is that you choose a secret animal. The computer
then asks you questions about the animal, terminating with a guess. If the guess is right, the computer wins; if it is
wrong, you win. But as part of winning, you have to provide your animal, and a differentiating yes/no question,
which the program uses to build a decision tree.
⚫ While the program only starts out knowing one animal, the longer it runs, the smarter it gets:
⚫ Think of an animal.
Is it a cow? No
You sure fooled me. What was it? butterfly
Tell me a question that distinguishes a butterfly from a cow
Does it fly?
Thanks, let’s play again.
Think of an animal.
Does it fly? No
Is it a cow? No
You sure fooled me. What was it? fish
Tell me a question that distinguishes a fish from a cow
Does it swim?
Thanks, let’s play again.
Think of an animal.
Does it fly? No
Does it swim? Yes
Is it a fish? Yes
Hooray, I won! Let’s play again ^C
⚫ You may find std::getline to be helpful
⚫ For extra credit, persist your tree so the program doesn’t start over every time. [Link] can help with
this.
⚫ As we discussed in class, Boost is an incubator for high-quality libraries, many of which are eventually adopted into the standard
⚫ Note: to read and write from a file, you can use std::ifstream and std::ofstream
HW 3.3 – Extra credit
⚫ C++’s facility for creating your own implicit type conversions is very powerful but it
has potential for misuse
⚫ Consider an Employee class
struct Employee {
Employee(int empID) {
/* initialize info by looking up their ID */
}
/* ... */
};
⚫ This allows you to conveniently call functions that expect employees by just
passing in their ID
⚫ unsigned getSalary(Employee const &e);
/* ... */
cout << "Employee 10’s salary is " << getSalary(10);
⚫ Do you think it is a good idea to allow this? Why? If not, what would you
change?
⚫ If you wanted to prevent such implicit conversions but still allow construction of
Employees from ints, how would you change the Employee class?