Type erasure in C++

Polymorphism is the ability to use objects of different types in a generic way, using the same notation. In C++ there are two forms of polymorphism, dynamic and static polymorphism. Let’s examine both.

Dynamic polymorphism

C++ supports dynamic polymorphism through inheritance and virtual (member) functions: we can define classes (derived classes) that derive from other classes (base classes); we override the behavior of the base classes in the derived classes by declaring a member function virtual in the base class, and redefining the same function in the derived class, with a different implementation (giving a different function body). An object or a pointer to a derived type can be assigned to a reference or a pointer to a base type (if the implicit derived to base conversion is accessible): if a virtual function is called on a reference or a pointer to a base class, the virtual dispatch mechanism will choose the correct implementation, based on the actual (run-time) type of the referenced (or pointed to) object.

This is called dynamic binding (or late, or run-time binding), because the call is resolved at run-time, in contrast with the mechanism that resolves the call at compile time, which is called static binding (or early, or compile-time binding):

C++
#include <iostream>

class Base
{
public:
    virtual void Print() { std::cout << "in base class" << std::endl; }
};

class Derived1 : public Base
{
public:
    void Print() override { std::cout << "in derived1 class" << std::endl; }
};

class Derived2 : public Base
{
public:
    void Print() override { std::cout << "in derived2 class" << std::endl; }
};

// pass by reference (or pointer) : dynamic dispatch
void DisplayPoly(Base &base)
{
    base.Print();
}

// pass by value : statically bound
void Display(Base base)
{
    base.Print();
}

int main(int argc, char **argv)
{
    Base b;
    Derived1 d1;
    Derived2 d2;

    DisplayPoly(b);       // calls Base::Print
    DisplayPoly(d1);      // calls Derived1::Print
    DisplayPoly(d2);      // calls Derived2::Print

    Display(b);   // calls Base::Print
    Display(d1);  // calls Base::Print (slicing, the base copy-ctor is called)
    Display(d2);  // calls Base::Print (slicing, the base copy-ctor is called)

    return 0;
}

For the virtual dispatch to work, it is necessary that the called function is declared virtual, and that the call is performed through a reference or a pointer, since this is the only case in which the static and dynamic type can differ (otherwise the type is known during compilation and the call is statically bound). There’s no implicit conversion between objects, and when we assign a derived object to a base class object slicing occurs (the base class’ copy constructor is called).

If we have a reference or a pointer to a base type, we can use all the operations declared in the public interface of that base type (for example we can call all the public methods that are exposed through that interface). This kind of interface is sometimes called an  explicit interface, because it is explicitly declared in the definition of the class and we can see it in the source code.

Since we can treat different types as one, as long as they derive from a common base class, we can also store pointers to different objects in a container or array (reference are not objects, so we cannot have containers or arrays in which the elements are references):

C++
void DisplayVec(std::vector<Base*> const &vec)
{
    for (Base *base : vec)
        base.Print();
}

int main(int argc, char **argv)
{
    std::vector<Base*> vec;

    Base b;
    Derived1 d1;
    Derived2 d2;

    vec.push_back(&b);
    vec.push_back(&d1);
    vec.push_back(&d2);

    DisplayVec(vec);   // prints "in base class", "in derived1 class", "in derived2 class" 

    return 0;
}

Dynamic polymorphism is by nature invasive and intrusive: we need to implement explicit interfaces (read: derive from common base classes) to be able to treat a derived type as a base type. This results in a strong coupling between two pieces of code, something that we, in general, try to avoid.

Static polymorphism

In C++, templates also allow us to use objects of different types with the same notation. Unlike dynamic polymorphism, the common interface is not represented by a (possibly abstract) base class from which all other types must derive. Instead, the interface is implicit and is expressed by a series of valid operations/expressions inside the template when it is instantiated:

C++
class A
{
public:
    void Print() { std::cout << "in base class" << std::endl; }
};

class B
{
public:
    void Print() { std::cout << "in base class" << std::endl; }
};

class C
{
public:
    void Print() { std::cout << "in base class" << std::endl; }
};

template <typename T>
void Display(T const t)
{
    t.Print();
}

int main(int argc, char **argv)
{
    A a;
    B b;
    C c;

    Display(a);   // instantiates Display<A>
    Display(b);   // instantiates Display<B>
    Display(c);   // instantiates Display<C>

    return 0;
}

Unlike dynamic polymorphism, classes for which the template can be instantiated are not related: the interface commonality is not expressed through a common base class, the template works for every type that implements the necessary (implicit) interface (in this case, every type that has a Print() function). This is somethimes called “duck typing” (if it quacks like a duck, it must be a duck).

Please note that with dynamic polymorphism we have a single function Display at run-time whereas with static polymorphism there is a different function for each type with which the template is instantiated (in our code there is a Display<A>, Display<B> and Display<C> function), which increases code size. It’s important to remember that both explicit and implicit interfaces are checked at compile time: if a function tries to use an operation that is not part of an explicit interface for that type the compiler issues an error; likewise, if we instantiate a template with a type that doesn’t implement the operations used within it (the  interface implicitly expressed by the set of expressions used by the template instance), again the program does not compile.

A limit of static polymorphism is that all types must be known at compile time: for example, we can’t have eterogenous collections of objects:

C++
template <typename T>
void DisplayVec(const std::vector<T> &vec)
{
    for (T &t : vec)
        t.Print();
}

int main(int argc, char **argv)
{
    A a;

    std::vector<A> vec;  // a vector of elements of type A

    DisplayVec(vec);   

    return 0;
}

Despite these limitations, static polymorphism is not invasive nor intrusive: the types involved are unrelated and independent of each other.

It would be nice if we could use some unrelated types that implement some common operation (i.e conform to the same interface, or partial interface) without the need for deriving from a base class (as in static polymorphism) as the same type (as with pointer and reference to base classes in dynamic polymorphism), allowing us to store them in etherogeneous containers. Let’s see how this can be achieved in C++.

Bridging static and dynamic polymorphism: type erasure

Type erasure is a pattern/idiom/technique that provides benefits from both models of polymorphism: unrelated types can be used as the same type through a common interface without the need to derive from a common base class (as with templates); heterogeneous collection of objects of different unrelated types are also possible (as with dynamic polymorphism).

A simple implementation for a type erasure class looks like this:

C++
#include <utility>

class Object
{
private:
    class Concept
    {
    public:
        virtual ~Concept() = default;
        virtual void Print() = 0;
        virtual Concept *Clone() = 0;
    protected:
        Concept() = default;
    };  
    
    template <typename T>
    class Model : public concept
    {
    public:
        template <typename U>
        Model(U &&u) : mInstance(std::forward<U>(u)) {}
        void Print() override { mInstance.Print(); }
    private:
        T mInstance;
    };
    
public:
    template <typename T>
    Object (T&& t) : mConcept(new Model<T>(std::forward<T>(t))) {}    // forwarding constructor
    ~Object() { delete mConcept; }                                    // destructor
    void Print() { mConcept->Print(); }
private:
    Concept *mConcept;
};

Let’s break the implementation apart. The magic of type erasure happens thanks to the Concept base class and the derived class Model, which is a class template. These classes are nested inside the type erasure class Object, which is an ordinary class, but its constructor is templatized on the type of the object that we want to type-erase. The constructor perfectly forwards the instance to the Model constructor, and the Model class stores a (copy-constructed or move-constructed) copy  of the object.

The Object class stores a pointer to an instance of Concept, and simply delegates every operation to it . The interface of the object (here, the Print method) is replicated in the Object class, the Concept abstract base class (as pure virtual methods) and the Model class.

We can pass any object that implements the same interface (in this case a Print member function taking no arguments and returning void) to an Object constructor, and we can also have heterogeneous collections of type-erased objects:

C++
#include <iostream>
#include <vector>

class C
{
public:
    void Print() { std::cout << "hello from C" << std::endl; }
};

class A 
{ 
public: 
    void Print() { std::cout << "hello from A" << std::endl; } 
};

void PrintVec(const std::vector<Object> &vec)
{
    for (Object &object : vec)
        object.Print();
}

int main(int argc, char **argv)
{
    C c;
    A a;

    Object o1(c);
    Object o2(a);

    o1.Print();  // prints "hello from C"
    o2.Print();  // prints "hello from A"

    std:::vector<Object> vec;

    vec.push_back(C());
    vec.push_back(A());

    PrintVec(vec);   // prints "hello from C" and "hello from A"

    return 0;
}

This is type erasure in a nutshell. A more realistic implementation would provide copy and move operations (construction and assignment):

C++
#include <utility>

class Object
{
private:
    class Concept
    {
    public:
        virtual ~Concept() = default;
        virtual Concept *Clone() = 0;
        virtual void Print() = 0;
    protected:
        Concept() = default;
    };
    
    template <typename T>
    class Model : public Concept
    {
    public:
        template <typename U>
        Model(U &&u) : mInstance(std::forward<U>(u)) {}
        Model *Clone() override { return new Model(mInstance); }
        void Print() override { mInstance.Print(); }
    private:
        T mInstance; 
    };
    
public:
    ~Object() { delete mConcept; }
    Object(const Object&);        // copy ctor
    Object(Object &);             // copy ctor for non const Objects (inhibits forwarding ctor)
    Object(Object&&);             // move ctor
    template <typename T>         // forwarding ctor
    Object(T&&);
    Object &operator=(const Object&);   // copy-assignment
    Object &operator=(Object&);         // assignment operator for non-const Objects (inhibits forwarding copy-assignment operator)
    Object &operator=(Object&&);        // move-assignment
    template <typename T>               // forwarding assignment operator
    Object &operator=(T&&);
    void Print() const;
    void Swap(Object &other);           // used in copy assignment operator
private:
    Concept *mConcept;
};

Implementation for copy and move constructors follows:

C++
Object::Object(const Object &other) : mConcept(other.mConcept->Clone())
{
}

Object::Object(Object &other) : Object(static_cast<const Object&>(other))  // calls copy ctor taking const 
{
}

Object::Object(Object &&other) : mConcept(other.mConcept)
{
    other.mConcept = nullptr;
}

template <typename T>
Object::Object(T &&t) : mConcept(new Model<T>(std::forward<T>(t))) 
{
}

Note that if we want to provide normal copy semantics, we need to implement a copy constructor that takes a non-const lvalue reference to Object. Without it, the forwarding construtor would be a better match for a non const lvalue, so the non-const Object copy constructor simply casts the argument to a const one, and delegates the construction to the normal copy constructor. This can be avoided with a little use of SFINAE:

C++
template <typename T, typename = typename std::enable_if<!std::is_same<T, Object>::value>::type>
Object(T &&t) : mConcept(new Model<T>()std::forward<T>(t)) {}

With this declaration the forwarding constructor will not participate in overload resolution when the type of the argument to the constructor is an (even non-const) Object, so the regular copy contructor will be called. There’s no need to declare a non-const copy constructor.

Copy and move assignment operators are similarly implemented:

C++
Object &Object::operator=(const Object &other)
{   
    Object temp(other);    // copy and swap
    Swap(temp);
    return *this;
}

Object &Object::operator=(Object &other)
{
    *this = static_cast<const Object&>(other);   // calls copy assignment operator taking const
}

Object &Object::operator=(Object &&other)
{
    Swap(other);
    return *this;
}

template <typename T>
Object &Object::operator=(T &&t)
{
    Object temp(std::forward<T>(t));
    Swap(temp);
    return *this;
}

Similarly to the case of the forwarding constructor, SFINAE can be used instead of declaring a non-const copy-assignment operator:

C++
template <typename T>
typename std::enable_if<std::is_same<T, Object>::value,Object>::type &Object::operator=(T &&t)
{
    Object temp(std::forward<T>(t));
    Swap(temp);
    return *this;
}

The copy-and-swap idiom is used here to implement the copy assignment operator, and the Swap function simply exchanges the pointers inside the Object instances:

C++
void Object::Swap(Object&other)
{
    Concept *temp = mConcept;
    mConcept = other.mConcept;
    other.mConcept = temp;
}

This kind of type erasure is used in the delegate implementation which uses polymorphism and dynamic allocations.

Leave a Reply

Your email address will not be published. Required fields are marked *