Before you decide to use virtual functions everywhere without much thinking, I have to make you aware of the size and speed considerations. The space overhead of polymorphism is one pointer per instance of the class. There is an additional per-class overhead of a vtable, but it’s not that important. One more pointer per object is not a lot when dealing with a small number of relatively large objects. It becomes significant, however, when dealing with a large number of lightweight objects. A virtual function call is slower than a direct call and significantly slower than the use of an inline function. Again, you can safely ignore this overhead when calling a heavyweight member function, but turning an inline
function such as AtEnd () into a virtual function may significantly slow down your loops.
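To get a feel for the per-object cost, here is a minimal sketch that compares two otherwise identical classes. The names PlainCounter, VirtualCounter and PerObjectOverhead are made up for this illustration. The exact numbers depend on the compiler and platform; the language standard doesn’t prescribe how virtual functions are implemented, so treat the result as typical rather than guaranteed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical lightweight class with no virtual functions.
class PlainCounter {
public:
    int Get () const { return _count; }
private:
    int _count = 0;
};

// The same data, but with virtual functions added.
class VirtualCounter {
public:
    virtual ~VirtualCounter () {}
    virtual int Get () const { return _count; }
private:
    int _count = 0;
};

// Extra bytes carried by every VirtualCounter object.
// On mainstream compilers this is the hidden vtable pointer plus padding.
std::size_t PerObjectOverhead () {
    assert (sizeof (VirtualCounter) > sizeof (PlainCounter));
    return sizeof (VirtualCounter) - sizeof (PlainCounter);
}
```

Adding a second or third virtual function to VirtualCounter would not change its size on such compilers, because all virtual functions of a class share the same vtable pointer.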
So don’t even think of the idea of creating the class Object—the mother of all objects—with a handy virtual destructor (and maybe one more integer field for some kind of a class ID [run-time typing!], plus some
conditionally compiled debugging devices, etc.). Don’t try to make it the root of all classes. In fact, if you hear
somebody complaining about how slow C++ is, he or she is probably a victim of this Smalltalk syndrome in C++. Not
that Smalltalk is a poor language. When size and speed are of no concern, Smalltalk beats C++ on almost all fronts. It is a truly object oriented language with no shameful
heritage of the hackers’ C. It unifies built-in types with user defined types much better than C++. It has a single-rooted hierarchy of objects. All methods are virtual (you can even override them at runtime!). Java, on the other hand, tries to strike a balance between "objectivity" and performance. In Java all methods are virtual, except when explicitly declared final (and then they cannot be
overridden).
If speed and size are of concern to you, stick to C++ and use polymorphism wisely. Well designed polymorphic classes will lead to C++ code that is as fast as the
equivalent C code (and much better from the maintenance point of view). This is possible when polymorphism is used to reduce a string of conditionals (or a switch statement) to a single virtual function call.
The rule of thumb for those coming from the C background is to be on the lookout for the switch statements and
complicated conditionals. It is natural in C++ to use polymorphism in their place.
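As an illustration of this rule of thumb, the following sketch puts the two styles side by side. The shapes, the ShapeKind tag and all function names are hypothetical and serve only to show the pattern: the C version needs a type tag and a switch in every function that depends on the type, while the C++ version reduces it to a single virtual call.

```cpp
#include <cassert>

// C style: a type tag plus a switch wherever behavior depends on the type.
enum ShapeKind { kindSquare, kindRect };

struct ShapeRec {
    ShapeKind kind;
    double    a;   // side of a square, or width of a rectangle
    double    b;   // height of a rectangle (unused for a square)
};

double AreaSwitch (ShapeRec const & s) {
    switch (s.kind) {
    case kindSquare: return s.a * s.a;
    case kindRect:   return s.a * s.b;
    }
    return 0.0;    // not reached for valid tags
}

// C++ style: each class supplies its own implementation.
class Shape {
public:
    virtual ~Shape () {}
    virtual double Area () const = 0;
};

class Square: public Shape {
public:
    Square (double side) : _side (side) {}
    double Area () const { return _side * _side; }
private:
    const double _side;
};

class Rect: public Shape {
public:
    Rect (double width, double height) : _width (width), _height (height) {}
    double Area () const { return _width * _height; }
private:
    const double _width;
    const double _height;
};

// The switch is gone: one virtual call selects the right code.
double AreaVirtual (Shape const & s) { return s.Area (); }

// Both styles give the same answer for a 2 by 3 rectangle.
void CheckSameAnswer () {
    ShapeRec rec = { kindRect, 2.0, 3.0 };
    Rect rect (2.0, 3.0);
    assert (AreaSwitch (rec) == AreaVirtual (rect));
}
```

Adding a new shape in the C version means revisiting every switch; in the C++ version it means writing one new class.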
Parse Tree
Virtual member functions, virtual destructors, pure virtual functions, protected data members.
I will demonstrate the use of polymorphism in an example of a data structure—the arithmetic tree. An arithmetic expression can be converted into a tree structure whose nodes are
arithmetic operators and leaf nodes are numbers. Figure 2-3 shows the example of a tree that corresponds to the
expression 2 * (3 + 4) + 5. Analyzing it from the root towards the leaves we first encounter the plus node, whose children are the two terms that are to be added. The left child is a product of two factors. The left factor is number 2 and the right factor is the sum of 3 and 4. The right child of the top level plus node is number 5. Notice that the tree
representation doesn’t require any parentheses or the
knowledge of operator precedence. It uniquely describes the calculation to be performed.
Figure 2-3 The arithmetic tree corresponding to the expression 2 * (3 + 4) + 5.
We will represent the nodes of the arithmetic tree as objects inheriting from a single class Node. The direct descendants of the Node are NumNode representing a number and BinNode representing a binary operator. For simplicity, we will restrict ourselves to only two classes derived from BinNode, the
AddNode and the MultNode. Figure 2-4 shows the class hierarchy I have just described. Abstract classes are
classes that cannot be instantiated; they only serve as parents for other classes. I’ll explain this term in a moment.
Figure 2-4 The class hierarchy of nodes.
What are the operations we would like to perform on a node? We would like to be able to calculate its value and, at some point, destroy it. The Calc method returns a double as the result of the calculation of the node’s value. Of course, for some nodes the calculation may involve the recursive
calculations of its children. The method is const since it
doesn’t change the node itself. Since each type of node has to provide its own implementation of the Calc method, we make this function virtual. However, there is no "default"
implementation of Calc for an arbitrary Node. A function that has no implementation (inherited or otherwise) is called
pure virtual. That’s the meaning of = 0 in the declaration of Calc.
A class that has one or more pure virtual functions is called an
abstract class and it cannot be instantiated (no object of this class can be created). Only classes that are derived from it, and which provide their own implementations of all the pure virtual functions, can be instantiated. Notice that our sample arithmetic tree has instances of AddNodes, MultNodes and NumNodes, but no instances of Nodes or BinNodes.
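The compiler enforces this rule, and a small sketch can make it visible. It repeats the Node class and a NumNode without the output statement, and uses std::is_abstract from the standard header <type_traits> (available since C++11) to check both classes at compile time. The helper CalcThroughBase is a name invented for this example.

```cpp
#include <cassert>
#include <type_traits>

class Node {
public:
    virtual ~Node () {}
    virtual double Calc () const = 0;   // pure virtual
};

class NumNode: public Node {
public:
    NumNode (double num) : _num (num) {}
    double Calc () const { return _num; }
private:
    const double _num;
};

// Compile-time checks: Node is abstract, NumNode is not.
static_assert (std::is_abstract<Node>::value,
               "Node has a pure virtual function");
static_assert (!std::is_abstract<NumNode>::value,
               "NumNode implements Calc");

double CalcThroughBase () {
    // Node node;             // would not compile: Node is abstract
    NumNode num (5.0);
    Node const & node = num;  // references and pointers to Node are fine
    assert (node.Calc () == 5.0);
    return node.Calc ();
}
```

Although no Node object can be created, pointers and references to Node are perfectly legal, and that is exactly how the tree will hold its children.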
A rule of thumb is that, if a class has a virtual function, it probably needs a virtual destructor as well. Once we
decide to pay the overhead of a vtable pointer, additional virtual functions don’t increase the size of the object, so in such a case adding a virtual destructor doesn’t add any significant overhead.
In our case we can anticipate that some of the descendant nodes will have to destroy their children in their destructors, so we really need a virtual destructor. A destructor cannot be left without a body, even if it is declared pure virtual, because it is actually called by the
destructors of the derived classes. That’s why I gave it an empty body. (Even though I made it inline, the compiler will create a function body for it, because it needs to stick a
pointer to it into the virtual table.)
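The effect of a virtual destructor can be observed with a small experiment. In this sketch the class OwnerNode, the counter g_destroyed and the function DestroyThroughBase are hypothetical names introduced only for the demonstration; Node is the same class as in the text.

```cpp
#include <cassert>

int g_destroyed = 0;   // counts calls to ~OwnerNode (demonstration only)

class Node {
public:
    virtual ~Node () {}
    virtual double Calc () const = 0;
};

// Hypothetical derived class whose destructor has real work to do.
class OwnerNode: public Node {
public:
    ~OwnerNode () { ++g_destroyed; }
    double Calc () const { return 0.0; }
};

// Deletes a derived object through a pointer to the base class.
int DestroyThroughBase () {
    g_destroyed = 0;
    Node * pNode = new OwnerNode;
    delete pNode;      // virtual call: ~OwnerNode runs first, then ~Node
    assert (g_destroyed == 1);
    return g_destroyed;
}
```

Without the keyword virtual on the destructor of Node, deleting a derived object through a Node pointer would be undefined behavior; in practice the derived destructor would typically be skipped.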
class Node {
public:
    virtual ~Node () {}
    virtual double Calc () const = 0;
};
NumNode stores a double value that is initialized in its
constructor. It also overrides the Calc virtual function. In this case, Calc simply returns the value stored in the node.
#include <iostream>
using std::cout;
using std::endl;

class NumNode: public Node {
public:
    NumNode (double num) : _num (num) {}
    double Calc () const;
private:
    const double _num;
};

double NumNode::Calc () const {
    cout << "Numeric node " << _num << endl;
    return _num;
}
BinNode has two children that are pointers to nodes. They are initialized in the constructor and deleted in the destructor and never change in between; this is why I could make them const pointers. (I didn’t make them pointers to const, since BinNode owns its children and is responsible for destroying them.) The Calc method is still pure virtual, inherited from Node; only the descendants of BinNode will know how to implement it.
class BinNode: public Node {
public:
    BinNode (Node * pLeft, Node * pRight)
        : _pLeft (pLeft), _pRight (pRight) {}
    ~BinNode ();
protected:
    Node * const _pLeft;
    Node * const _pRight;
};

BinNode::~BinNode () {
    delete _pLeft;
    delete _pRight;
}
This is where you first see the advantage of polymorphism. A binary node can have children which are arbitrary nodes. Each of them can be a number node, an addition node, or a
multiplication node. There are nine possible combinations of children—it would be silly to make separate classes for each of them (consider, for instance,
AddNodeWithLeftMultNodeAndRightNumberNode). We had no choice but to accept and store pointers to children as more general pointers to Nodes. Yet, when we call destructors through them, we need to call different functions to destroy different nodes. For instance, AddNode has a different
destructor than a NumNode (which has an empty one), and so on. This is why we had to make the destructors of Nodes
virtual.
Notice that the two data members of BinNode are not private; they are protected. This qualification is slightly weaker than private. A private data member or method cannot be accessed from any code outside of the class itself (or its friends), not even from the code of a derived class. A protected member can also be accessed from the code of the derived class. Had we made _pLeft and _pRight private, we’d have to provide public methods to set and get them. That would be tantamount to exposing them to everybody. By making them protected we are letting classes derived from BinNode manipulate them while barring anybody else from doing so.
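To see protected access at work, here is a sketch that puts the pieces together. Node, NumNode and BinNode repeat the definitions above (with the output statement left out). The bodies of AddNode and MultNode are an assumption: this is one plausible way to write them, not necessarily the final version. The function EvalSampleTree is a name invented for this example; it builds the tree for 2 * (3 + 4) + 5.

```cpp
#include <cassert>

class Node {
public:
    virtual ~Node () {}
    virtual double Calc () const = 0;
};

class NumNode: public Node {
public:
    NumNode (double num) : _num (num) {}
    double Calc () const { return _num; }
private:
    const double _num;
};

class BinNode: public Node {
public:
    BinNode (Node * pLeft, Node * pRight)
        : _pLeft (pLeft), _pRight (pRight) {}
    ~BinNode () { delete _pLeft; delete _pRight; }
protected:
    Node * const _pLeft;
    Node * const _pRight;
};

// Assumed implementation: a derived class may use the protected members.
class AddNode: public BinNode {
public:
    AddNode (Node * pLeft, Node * pRight) : BinNode (pLeft, pRight) {}
    double Calc () const { return _pLeft->Calc () + _pRight->Calc (); }
};

// Assumed implementation, analogous to AddNode.
class MultNode: public BinNode {
public:
    MultNode (Node * pLeft, Node * pRight) : BinNode (pLeft, pRight) {}
    double Calc () const { return _pLeft->Calc () * _pRight->Calc (); }
};

// Builds the tree for 2 * (3 + 4) + 5, evaluates it and destroys it.
double EvalSampleTree () {
    Node * pSum  = new AddNode (new NumNode (3), new NumNode (4));
    Node * pProd = new MultNode (new NumNode (2), pSum);
    Node * pRoot = new AddNode (pProd, new NumNode (5));
    // pProd->_pLeft;   // would not compile here: _pLeft is protected
    double result = pRoot->Calc ();
    assert (result == 19.0);
    delete pRoot;       // virtual destructors tear down the whole tree
    return result;
}
```

Code outside the hierarchy, such as EvalSampleTree, can only go through the public interface: Calc and the destructor. Only the classes derived from BinNode get to touch the child pointers.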
Table 1
Access specifier Who can access such member?
public
anybody
protected