Type erasure in C++

Polymorphism is the ability to use objects of different types in a generic way, using the same notation. In C++ there are two forms of polymorphism, dynamic and static polymorphism. Let’s examine both.

Dynamic polymorphism

C++ supports dynamic polymorphism through inheritance and virtual (member) functions: we can define classes (derived classes) that derive from other classes (base classes); we override the behavior of the base classes in the derived classes by declaring a member function virtual in the base class, and redefining the same function in the derived class, with a different implementation (giving a different function body). An object or a pointer to a derived type can be assigned to a reference or a pointer to a base type (if the implicit derived to base conversion is accessible): if a virtual function is called on a reference or a pointer to a base class, the virtual dispatch mechanism will choose the correct implementation, based on the actual (run-time) type of the referenced (or pointed to) object.

This is called dynamic binding (or late, or run-time binding), because the call is resolved at run-time, in contrast with the mechanism that resolves the call at compile time, which is called static binding (or early, or compile-time binding):

For the virtual dispatch to work, it is necessary that the called function is declared virtual, and that the call is performed through a reference or a pointer, since this is the only case in which the static and dynamic type can differ (otherwise the type is known during compilation and the call is statically bound). There’s no implicit conversion between objects, and when we assign a derived object to a base class object slicing occurs (the base class’ copy constructor is called).

If we have a reference or a pointer to a base type, we can use all the operations declared in the public interface of that base type (for example we can call all the public methods that are exposed through that interface). This kind of interface is sometimes called an  explicit interface, because it is explicitly declared in the definition of the class and we can see it in the source code.

Since we can treat different types as one, as long as they derive from a common base class, we can also store pointers to different objects in a container or array (reference are not objects, so we cannot have containers or arrays in which the elements are references):

Dynamic polymorphism is by nature invasive and intrusive: we need to implement explicit interfaces (read: derive from common base classes) to be able to treat a derived type as a base type. This results in a strong coupling between two pieces of code, something that we, in general, try to avoid.

Static polymorphism

In C++, templates also allow us to use objects of different types with the same notation. Unlike dynamic polymorphism, the common interface is not represented by a (possibly abstract) base class from which all other types must derive. Instead, the interface is implicit and is expressed by a series of valid operations/expressions inside the template when it is instantiated:

Unlike dynamic polymorphism, classes for which the template can be instantiated are not related: again, the interface commonality is not expressed through a common base class, the template works for every type that implements the necessary (implicit) interface (in this case, every type that has a Print() function).

Please note that with dynamic polymorphism we have a single function Display at run-time whereas with static polymorphism there is a different function for each type with which the template is instantiated (in our code there is a Display<A>, Display<B> and Display<C> function), which increases code size. It’s important to remember that both explicit and implicit interfaces are checked at compile time: if a function tries to use an operation that is not part of an explicit interface for that type the compiler issues an error; likewise, if we instantiate a template with a type that doesn’t implement the operations used within it (the  interface implicitly expressed by the set of expressions used by the template instance), again the program does not compile.

A limit of static polymorphism is that all types must be known at compile time: for example, we can’t have eterogenous collections of objects:

Despite these limitations, static polymorphism is not invasive nor intrusive: the types involved are unrelated and independent of each other.

It would be nice if we could use some unrelated types that implement some common operation (i.e conform to the same interface, or partial interface) without the need for deriving from a base class (as in static polymorphism) as the same type (as with pointer and reference to base classes in dynamic polymorphism), allowing us to store them in etherogeneous containers. Let’s see how this can be achieved in C++.

Bridging static and dynamic polymorphism: type erasure

Type erasure is a pattern/idiom/technique that provides benefits from both models of polymorphism: unrelated types can be used as the same type through a common interface without the need to derive from a common base class (as with templates); heterogeneous collection of objects of different unrelated types are also possible (as with dynamic polymorphism).

A simple implementation for a type erasure class looks like this:

Let’s break the implementation apart. The magic of type erasure happens thanks to the Concept base class and the derived class Model, which is a class template. These classes are nested inside the type erasure class Object, which is an ordinary class, but its constructor is templatized on the type of the object that we want to type-erase. The constructor perfectly forwards the instance to the Model constructor, and the Model class stores a (copy-constructed or move-constructed) copy  of the object.

The Object class stores a pointer to an instance of Concept, and simply delegates every operation to it . The interface of the object (here, the Print method) is replicated in the Object class, the Concept abstract base class (as pure virtual methods) and the Model class.

We can pass any object that implements the same interface (in this case a Print member function taking no arguments and returning void) to an Object constructor, and we can also have heterogeneous collections of type-erased objects:

This is type erasure in a nutshell. A more realistic implementation would provide copy and move operations (construction and assignment):

Implementation for copy and move constructors follows:

Note that if we want to provide normal copy semantics, we need to implement a copy constructor that takes a non-const lvalue reference to Object. Without it, the forwarding construtor would be a better match for a non const lvalue, so the non-const Object copy constructor simply casts the argument to a const one, and delegates the construction to the normal copy constructor.

Copy and move assignment operators are similarly implemented:

The copy-and-swap idiom is used here to implement the copy assignment operator, and the Swap function simply exchanges the pointers inside the Object instances:





2 Responses

  1. Hi Luca, thank you very much for the post, very useful!

    Just a question: shouldn’t it be:

    void Print() { mConcept->Print(); }

    instead of

    void Invoke() { mConcept->Print(); }

    in the Object class interface?

Leave a Reply

Your email address will not be published. Required fields are marked *