3 Understanding Smart Pointers in C++: History, Problems Solved, and Best Practices

3.1 Background: Origins of Smart Pointers in C++

In the early days of C++, memory management was done with raw pointers and explicit new/delete calls. This manual approach was error-prone – forgetting to delete memory led to memory leaks, and deleting twice or using an invalid pointer led to crashes. To address these issues, the concept of smart pointers was developed. Smart pointers are objects that simulate raw pointers but also automatically manage memory (and other resources) to prevent common bugs. They were popularized in C++ in the early 1990s as a response to criticism that C++ lacked automatic garbage collection. In other words, C++ smart pointers were created to provide safer, automated memory management (RAII) in a language without built-in garbage collection.

The first smart pointer in C++’s standard library was std::auto_ptr, standardized in C++98 and carried over into C++03. auto_ptr was a simple RAII pointer that automatically deleted its owned object when it went out of scope. However, it had flawed copy semantics: copying an auto_ptr transferred ownership (leaving the source empty) instead of duplicating, which was confusing and made it unsafe to store in standard containers. Due to these issues, auto_ptr was deprecated in C++11 and ultimately removed in C++17.

Modern C++ (since C++11) introduced a new suite of smart pointers – std::unique_ptr, std::shared_ptr, and std::weak_ptr – largely inspired by prior practice in the Boost library. C++11 “fixed” auto_ptr by replacing it with unique_ptr for single-ownership and adding shared_ptr/weak_ptr for shared-ownership use cases. These modern smart pointers leverage C++11 features (like move semantics) to manage ownership more safely and efficiently than auto_ptr did.

3.2 The Problems with Raw Pointers (and How Smart Pointers Fix Them)

Raw pointers in C/C++ are powerful but come with no automatic memory management. Developers must manually delete any new’d memory, which is error-prone. Memory leaks occur if a code path skips a needed delete (e.g. due to an exception or logic error). Dangling pointers occur if an object is deleted while someone still has a pointer to it. Double deletions or freeing invalid memory can happen if multiple pointers mistakenly manage the same object. These issues made robust memory management difficult in large, complex code.

Smart pointers address these problems using RAII (Resource Acquisition Is Initialization) and other mechanisms:

Automatic Deallocation (RAII): A smart pointer object owns a heap allocation and is responsible for deleting it when the smart pointer goes out of scope. This guarantees that allocated memory is freed exactly once, at the right time, even if exceptions are thrown. For example, std::unique_ptr and std::shared_ptr will call delete on their managed object in their destructors, preventing leaks. By tying the object’s lifetime to a smart pointer’s scope, we avoid both leaks and premature frees.
Unique Ownership or Reference Counting: Smart pointers keep track of who “owns” an object:
- Unique ownership: std::unique_ptr represents sole ownership. It cannot be copied (only moved), so there is always exactly one owner responsible for deletion. This prevents the ambiguity of multiple deletes – only the unique owner deletes the object.
- Shared ownership: std::shared_ptr allows multiple pointers to share ownership of the same object via an internal reference count. The object is deleted when the last shared_ptr owning it is destroyed. This prevents dangling pointers in common scenarios – if two parts of code share an object, it won’t be freed until both have finished using it. However, shared pointers introduce a small overhead (a control block with the ref-count) and require care to avoid reference cycles (which std::weak_ptr solves; see below).
Exception Safety: With raw pointers, an exception thrown between a new and delete can leak memory. Smart pointers guard against this by ensuring the delete happens in their destructor no matter what. For example, if you allocate an object into a unique_ptr or shared_ptr, you don’t need a delete statement at all – the smart pointer’s destructor will clean up even if an exception is thrown, thus preventing leaks in error paths.
Better Semantics: Smart pointers make ownership explicit. Code that uses a unique_ptr or shared_ptr clearly communicates if a function or class assumes ownership of a resource. This clarity can prevent bugs. For instance, when you see a function taking a unique_ptr parameter, you know it expects to take ownership (as opposed to a raw pointer parameter, where it’s unclear if the function will copy, delete, or just use the pointer). In essence, smart pointers encode the ownership contract in the type system.
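
The exception-safety point above can be made concrete with a small sketch. Everything named here (Tracked, mayThrow, leaky, safe, leakedBy) is hypothetical and exists only to make the cleanup observable; it is an illustration under those assumptions, not a prescribed pattern.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource type that counts live instances so cleanup is observable.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Stand-in for any operation that can fail partway through a function.
void mayThrow(bool fail) {
    if (fail) throw std::runtime_error("simulated failure");
}

// Raw pointer version: the delete is skipped when mayThrow() throws.
void leaky(bool fail) {
    Tracked* t = new Tracked();
    mayThrow(fail);
    delete t;  // never reached on the exception path
}

// RAII version: the unique_ptr destructor runs during stack unwinding.
void safe(bool fail) {
    auto t = std::make_unique<Tracked>();
    mayThrow(fail);
}   // t is destroyed here on both the normal and the exception path

// Calls fn so that it fails, and reports how many Tracked objects it left behind.
int leakedBy(void (*fn)(bool)) {
    const int before = Tracked::live;
    try { fn(true); } catch (const std::runtime_error&) {}
    assert(Tracked::live >= before);  // nothing that existed before was destroyed
    return Tracked::live - before;
}
```

With these definitions, leakedBy(safe) returns 0, while leakedBy(leaky) returns 1 because the delete statement is never reached.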

In summary, smart pointers were introduced to make memory management safer and easier in C++. They automate object destruction (preventing leaks) and coordinate ownership (preventing double-delete and dangling pointers). In languages with garbage collection (Java, C#, etc.), the runtime would handle this; in C++, smart pointers are a way to achieve deterministic, safer memory cleanup without a garbage collector.

3.3 Types of Smart Pointers in Modern C++

Modern C++ provides several smart pointer classes in the <memory> header, each addressing different use cases:

std::unique_ptr<T>: A unique_ptr owns a heap-allocated object exclusively. It cannot be copied, only moved, meaning you can transfer ownership but never accidentally have two unique_ptrs owning the same object. When a unique_ptr is destroyed (or reset/reassigned), it deletes the object it manages. This is ideal for the common case of a single owner. unique_ptr is lightweight (typically the size of a raw pointer) since it doesn’t require a reference count. It’s the go-to smart pointer for exclusive ownership and replaces the old auto_ptr (which had flawed copying behavior). Best practices in modern C++ dictate using unique_ptr (usually created with std::make_unique) for owning dynamically allocated objects, instead of raw new/delete calls. Example usage: auto ptr = std::make_unique<Foo>(); gives you a unique_ptr that will delete the Foo when it goes out of scope.
std::shared_ptr<T>: A shared_ptr holds a pointer to a heap object and allows multiple owners via reference counting. Copies of a shared_ptr point to the same object and increase the internal count; when a shared_ptr is destroyed or reset, it decreases the count, and when the count drops to zero the object is deleted. This is useful when you truly need shared ownership semantics – e.g. an object that is used by multiple parts of a program with independent lifetimes. Shared pointers incur some overhead for thread-safe reference counting and a control block to manage the object’s lifetime. They also can suffer from cyclic references (if two objects reference each other via shared_ptr, they won’t free unless broken manually), which is what std::weak_ptr addresses. In modern C++, you typically use std::make_shared to create shared_ptrs (which efficiently allocates the control block and object in one go). Use shared_ptr only when ownership must be shared; otherwise prefer unique_ptr for simplicity and performance.
std::weak_ptr<T>: weak_ptr is a companion to shared_ptr. It is not an owner, but rather a safe observer of an object managed by shared_ptr. A weak_ptr does not increase the reference count, so it won’t prolong the object’s lifetime. However, you can check if the object still exists (weak_ptr.expired()) or obtain a temporary shared_ptr (weak_ptr.lock()) if it’s still alive. weak_ptrs are used to break reference cycles and to safely refer to objects that might be deleted. For example, in a graph or observer pattern, you might have a weak_ptr to avoid keeping an object alive unintentionally. If you attempt to lock a weak_ptr to a destroyed object, you get an empty shared_ptr instead of dereferencing a dangling raw pointer – this avoids dangling pointer issues.
(Historical) std::auto_ptr<T>: As mentioned, auto_ptr was the first attempt at a smart pointer (standardized in C++98/03) but had issues. It’s now obsolete, superseded by unique_ptr. Modern code should not use auto_ptr (it’s removed in C++17). We mention it only for historical context – use unique_ptr instead.
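
As a rough illustration of how the three pointer types behave, the following sketch exercises a move of a unique_ptr, the owner count of a shared_ptr, and the lifetime check of a weak_ptr. Foo and the three function names are placeholders invented for this example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Foo { int value = 0; };  // hypothetical payload type

// unique_ptr: ownership moves, it is never copied.
bool uniqueOwnershipMoves() {
    std::unique_ptr<Foo> a = std::make_unique<Foo>();
    std::unique_ptr<Foo> b = std::move(a);  // a gives up ownership and becomes null
    return a == nullptr && b != nullptr;
}

// shared_ptr: each copy adds an owner; the object dies with the last one.
long sharedOwnerCount() {
    std::shared_ptr<Foo> first = std::make_shared<Foo>();
    std::shared_ptr<Foo> second = first;    // copy: two owners now
    return first.use_count();
}

// weak_ptr: observes without owning, and reports when the object is gone.
bool weakObservesLifetime() {
    std::weak_ptr<Foo> observer;
    {
        auto owner = std::make_shared<Foo>();
        observer = owner;
        assert(owner.use_count() == 1);     // the weak_ptr added no owner
        assert(observer.lock() != nullptr); // still alive: lock() succeeds
    }   // owner destroyed here, so the Foo is deleted
    return observer.expired() && observer.lock() == nullptr;
}
```

Here sharedOwnerCount() returns 2, and the other two functions return true.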

Beyond these, libraries offer other specialized smart pointers (such as boost::scoped_ptr in Boost), and the standard ones can be customized (for example, a std::shared_ptr with a custom deleter, or the array form std::unique_ptr<T[]>), but the core three (unique, shared, weak) cover most needs. With these tools, manual new and delete are largely unnecessary in modern C++: you allocate with make_unique/make_shared and let the smart pointers manage deletion.

3.4 Polymorphism, Dynamic Dispatch, and Smart Pointers

Smart pointers work seamlessly with C++’s polymorphism (inheritance and virtual functions), just like raw pointers do. Polymorphic dynamic dispatch in C++ requires using pointers or references to base classes so that virtual function calls resolve to the derived class implementations at runtime. You can use smart pointers to hold polymorphic objects and still get proper dynamic dispatch. For example:

```
struct Base { virtual void doSomething(); virtual ~Base() {} };
struct Derived : public Base { void doSomething() override; /* ... */ };

std::unique_ptr<Base> ptr = std::make_unique<Derived>();
ptr->doSomething();  // calls Derived::doSomething() via Base pointer (dynamic dispatch)
```

In the above snippet, ptr is a unique_ptr<Base> that actually holds a Derived. Calling ptr->doSomething() correctly invokes the overridden method in Derived thanks to the virtual function. The important thing to remember is to give your base class a virtual destructor if you plan to delete derived objects through a base pointer (raw or smart) – this ensures the derived destructor runs. In the case of unique_ptr, its destructor will call delete on a Base*, so Base::~Base must be virtual to allow Derived to be destroyed fully. (With shared_ptr, a similar caution applies, though make_shared<Derived> internally knows to call the correct destructor; still, it’s good practice to have a virtual destructor in polymorphic base classes.)
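
To make the destructor rule observable, here is a self-contained variant of the snippet that records each call. The callLog helper and runPolymorphicDemo function are assumptions added for illustration; the Base and Derived bodies are filled in only so the example can be compiled.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical helper: records the order of calls so it can be inspected.
std::string& callLog() { static std::string s; return s; }

struct Base {
    virtual void doSomething() { callLog() += "Base::doSomething;"; }
    virtual ~Base() { callLog() += "~Base;"; }   // virtual: required for delete via Base*
};
struct Derived : Base {
    void doSomething() override { callLog() += "Derived::doSomething;"; }
    ~Derived() override { callLog() += "~Derived;"; }
};

std::string runPolymorphicDemo() {
    callLog().clear();
    {
        std::unique_ptr<Base> ptr = std::make_unique<Derived>();
        ptr->doSomething();                      // dynamic dispatch to Derived
        assert(callLog() == "Derived::doSomething;");
    }   // unique_ptr deletes through Base*; the virtual destructor runs ~Derived first
    return callLog();
}
```

runPolymorphicDemo() returns "Derived::doSomething;~Derived;~Base;". If ~Base were not virtual, deleting the Derived object through the Base pointer would be undefined behavior.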

For casting between polymorphic types, C++ provides dynamic_cast (runtime-checked downcast) and static_cast (compile-time, unchecked). With smart pointers:

For unique_ptr, there isn’t a built-in casting function. If you need to treat a unique_ptr<Base> as a Derived, you have to do it manually (and carefully). The simplest approach is to apply dynamic_cast to the raw pointer from get() and use the result as a non-owning view, leaving ownership with the original unique_ptr, e.g.:
```
std::unique_ptr<Base> basePtr = std::make_unique<Derived>();
if (Derived* d = dynamic_cast<Derived*>(basePtr.get())) {
    // Use d, but basePtr still owns the object.
}
```
This lets you observe the object as a Derived if it really is that type. If you actually need to transfer ownership as a different type, you could release the pointer and recapture it in a new unique_ptr of the derived type, but that’s rarely needed and can be dangerous if the cast is wrong. In practice, frequent casting might indicate a design issue – consider using virtual functions or a variant instead of downcasting.
For shared_ptr, there are standard casting utilities: std::dynamic_pointer_cast<T>(sp) and std::static_pointer_cast<T>(sp). These create a new shared_ptr<T> from a shared_ptr<U> by casting the underlying pointer. For example, if you have std::shared_ptr<Base> sp = std::make_shared<Derived>();, you can do:
```
auto derivedSP = std::dynamic_pointer_cast<Derived>(sp);
if (derivedSP) {
    // cast succeeded, use derivedSP as shared_ptr<Derived>
    derivedSP->someDerivedMethod();
}
```
dynamic_pointer_cast will return an empty shared_ptr if the object isn’t actually of the target type. Notably, this new shared_ptr shares ownership with the original – the reference count is not duplicated but shared, so you won’t double-delete. This is safer than extracting a raw pointer and using dynamic_cast on it, which gives you a raw pointer that you might accidentally delete or not know if it’s owned. With dynamic_pointer_cast, the object stays managed by shared_ptr and the ref-count is properly maintained (the example above would increment the use count while derivedSP exists).
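
For the rarer case where ownership of a unique_ptr really has to move to the derived type, the release-and-recapture idea mentioned above might look like the sketch below. dynamic_unique_cast is a hypothetical helper, not a standard library function, and this version assumes the default deleter. The types Base, Derived, and Other are placeholders.

```cpp
#include <cassert>
#include <memory>

struct Base { virtual ~Base() = default; };
struct Derived : Base { int payload = 42; };
struct Other : Base {};

// Hypothetical helper (not part of the standard library): moves ownership into
// a unique_ptr<To> only if the dynamic type matches; otherwise src keeps it.
template <typename To, typename From>
std::unique_ptr<To> dynamic_unique_cast(std::unique_ptr<From>& src) {
    if (To* casted = dynamic_cast<To*>(src.get())) {
        (void)src.release();                 // give up ownership without deleting
        return std::unique_ptr<To>(casted);  // same object, new owner
    }
    return nullptr;                          // cast failed: src is untouched
}

bool castDemo() {
    std::unique_ptr<Base> b = std::make_unique<Derived>();
    auto wrong = dynamic_unique_cast<Other>(b);   // type mismatch
    assert(!wrong && b);                          // b still owns the object
    auto d = dynamic_unique_cast<Derived>(b);     // type matches
    return d && !b && d->payload == 42;
}
```

Checking the cast before calling release() is the important design choice: releasing first and casting afterwards would leak the object whenever the cast fails.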

In summary, smart pointers support polymorphism just as raw pointers do. Use a unique_ptr<Base> or shared_ptr<Base> to point to derived instances when you need polymorphic behavior. Just remember the rule about virtual destructors for base classes. And if downcasting is needed, prefer the smart pointer casting functions for shared_ptr, or avoid downcasting if possible by designing appropriate virtual functions.

3.5 Modern Best Practices for Using Smart Pointers

With the introduction of smart pointers, C++ developers have converged on some best practices to write safer and clearer code:

Use Smart Pointers for Ownership: In modern C++, you should rarely (if ever) call delete explicitly. If you allocate something with new, immediately put the raw pointer into a unique_ptr or shared_ptr so it will be correctly deleted later. The C++ Core Guidelines advise using unique_ptr or shared_ptr to represent ownership, treating a raw pointer (T*) as non-owning, and more generally using objects (including smart pointers) to manage resources. Raw pointers are now typically used only for non-owning references or in low-level code, not for expressing ownership.
std::unique_ptr as the Default Choice: Use unique_ptr for dynamically allocated objects whenever you have a single owning reference. With the default deleter it has essentially no overhead compared with a raw pointer, and it clearly conveys sole ownership. You can still transfer ownership by using std::move on the unique_ptr (for example, when passing it to a function that takes ownership or storing it in another unique_ptr). Always favor unique_ptr unless you specifically need shared ownership, that is, several copies of the pointer that each keep the object alive, which only shared_ptr provides. In short, unique_ptr is the go-to smart pointer for most cases of heap allocation.
std::shared_ptr for Shared Ownership: Use shared_ptr only when you truly need multiple parts of the code to own a pointer (meaning the object should persist as long as anyone needs it). Classic use cases might be in observer patterns, caches, or passing objects across threads where ownership is shared. Be mindful of performance and lifetime: shared_ptr uses atomic reference counting, which has a cost. If you use shared_ptr everywhere by default, you may introduce unnecessary overhead and complexity. So, use shared_ptr sparingly, and document why an object has to have shared ownership. When you do use shared_ptr, also consider whether a design could lead to cyclic references – if so, use weak_ptr for the back-reference to break the cycle.
Use std::make_unique and std::make_shared: These factory functions (C++14 for make_unique, C++11 for make_shared) should be used instead of calling new directly. For example, do auto p = std::make_unique<Foo>(args); rather than Foo* p = new Foo(args);. Not only do they make code concise, they also avoid any window in which a raw pointer exists without an owner and, in the case of make_shared, can be more efficient (one allocation for both control block and object). The one caveat: because make_shared places the object and the control block in a single allocation, the object’s destructor still runs when the last shared_ptr goes away, but its memory is not returned until the last weak_ptr is gone too. This rarely matters unless the object is large and weak_ptrs outlive it for a long time. Overall, use these helpers to create smart pointers.
Avoid auto_ptr, raw new, and raw owning pointers: As mentioned, auto_ptr is deprecated/removed. Likewise, avoid using naked new and delete in application code – they should be wrapped in smart pointers or other RAII containers. If you find yourself writing new a lot, ask if a smart pointer or a standard container (like std::vector or std::string which manage their memory internally) would do the job. Raw pointers can be used for non-owning access (more on that below), but never let raw pointers be the sole owner of dynamic memory in modern code unless you have a very good reason.
Custom Deleters and Other Resources: Smart pointers are not limited to managing new allocations. You can supply a custom deleter to unique_ptr/shared_ptr to manage other resources (files, sockets, malloc/free, etc.). This is an advanced but powerful feature: for example, std::unique_ptr<FILE, decltype(&fclose)> fileptr(fopen(...), &fclose); will call fclose when it goes out of scope (if fopen fails and returns a null pointer, unique_ptr does not invoke the deleter). This turns many C-style resource management tasks into safe RAII patterns using the same smart pointer concept.
Thread Safety: shared_ptr’s reference-count updates are thread-safe, so different threads can each copy and destroy their own shared_ptr instances that refer to the same object. That guarantee covers only the control block: the pointed-to object is not protected, and a single shared_ptr instance that is read and written by several threads still needs synchronization. A unique_ptr cannot be copied, so to use it from another thread you either move it there or synchronize access to the one instance. Usually each thread has its own unique_ptr, or threads share a common object via shared_ptr. The atomic reference counting is part of the overhead shared_ptr pays to support this.
Don’t Overuse Smart Pointers: While smart pointers help manage dynamic memory, remember that not everything needs to be on the heap. Use stack objects or plain values where appropriate. Overusing heap allocations (even with smart pointers) can lead to performance issues. Smart pointers are a tool to manage necessary dynamic objects, not a mandate to heap-allocate everything.
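
The advice to use weak_ptr for back-references can be sketched as follows. Parent, Child, and the destroyed counter are hypothetical names chosen for the example; the point is only that the weak back-reference does not keep the parent alive.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter so the destructor calls can be observed.
static int destroyed = 0;

struct Parent;
struct Child {
    std::weak_ptr<Parent> parent;   // non-owning back-reference breaks the cycle
    ~Child() { ++destroyed; }
};
struct Parent {
    std::shared_ptr<Child> child;   // owning forward reference
    ~Parent() { ++destroyed; }
};

int destroyedAfterScope() {
    destroyed = 0;
    {
        auto p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;        // weak: does not add an owner to the Parent
        assert(p.use_count() == 1);
    }   // p released: the Parent is destroyed, which then destroys the Child
    return destroyed;
}
```

destroyedAfterScope() returns 2. If Child::parent were a shared_ptr<Parent> instead, each object would keep the other alive and neither destructor would run.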

In essence, modern C++ style = use RAII and smart pointers for resource management. This greatly reduces memory bugs and makes ownership semantics explicit. The remaining question is how to use smart pointers in function interfaces, which we cover next.

3.6 Using Pointers and Smart Pointers in Function Interfaces

One area that can be confusing is how to pass smart pointers to functions or return them, especially with const and reference qualifiers involved. The guiding principle is: only pass or return a smart pointer if you need to communicate ownership semantics. If not, prefer using a raw pointer or reference to refer to the object. Let’s break down scenarios for function parameters and return types:

3.6.1 Passing Smart Pointers to Functions (Parameters)

Ask, “Does the function need to take ownership of the object? Or share ownership? Or neither?” The answer determines how you should pass the argument:

Function takes ownership (sink): If the function is expected to assume ownership of a dynamically allocated object (meaning the caller should give up ownership), use a smart pointer parameter by value. Typically, this means a parameter of type std::unique_ptr<T> (not a reference) so that the function receives and owns the pointer, and the caller must std::move their unique_ptr when calling. For example: void f(std::unique_ptr<Widget> ptr). The caller does f(std::move(myPtr)); and after the call, myPtr is null, because the parameter ptr was move-constructed from it and owned the object for the duration of f. This pattern clearly indicates a transfer of ownership. (For shared ownership transfer, you could also pass a std::shared_ptr by value, which would increment the ref count, meaning the function now co-owns the object. But if the function is meant to exclusively take a newly created object, unique_ptr is more appropriate.)
Function reseats or modifies the smart pointer: If the function needs to modify the caller’s smart pointer itself – e.g. perhaps set it to null or make it point to a different object – then pass the smart pointer by non-const reference. For unique_ptr, this would be void f(std::unique_ptr<T>& ptr). This tells the caller that the function might change their unique_ptr (reset it or replace it). The function can do ptr.reset() or ptr = std::make_unique<T>(...) internally, affecting the caller’s unique_ptr. (You cannot pass by value here, because that would only modify a copy.) Similarly, for shared_ptr, f(std::shared_ptr<T>& p) would allow the function to reassign the caller’s shared_ptr (though this is less common). Note: you cannot bind an rvalue (temporary) to a non-const lvalue reference, so the caller must pass an actual lvalue shared_ptr/unique_ptr variable to these functions (not a temporary or std::move, which is good because we don’t want to accidentally move in this case).
Function shares ownership (wants to use and possibly keep a copy): If a function is not taking exclusive ownership but wants to participate in shared ownership, you can pass a std::shared_ptr<T> by value. This way, the function gets its own shared_ptr pointing to the object (bumping the ref count), and can store it or use it freely without worrying if the caller’s copy goes away. For example void f(std::shared_ptr<Foo> sp). When the caller passes a copy (rather than moving its shared_ptr in), sp.use_count() inside f is at least 2 (one in the caller, one in the callee) until f returns. This strategy ensures the object stays alive for the duration of the function (and beyond, if f stores the shared_ptr). Keep in mind copying a shared_ptr is relatively heavy (atomic refcount increment/decrement), so do this only if needed.
Function only needs to observe/use the object (no ownership change): This is the most common case: the function just needs to read or manipulate the object without taking ownership. In this scenario, do not pass a smart pointer at all; instead, pass a raw pointer or reference to the object. For example, void f(const Widget* w) or void f(Widget& w) if you expect a valid object. This makes it clear that f is not responsible for lifetime management; it’s just using the object. Passing a smart pointer here would add no benefit and would obscure the fact that you only needed to use the object. In fact, passing a const std::unique_ptr<T>& or const std::shared_ptr<T>& to just use the object is considered needlessly convoluted, because the smart pointer is then just “a useless wrapper around a naked pointer” in that context. The C++ Core Guidelines put it plainly: “For general use, take T* or T& arguments rather than smart pointers” when you don’t need to manipulate ownership. This way, your function’s interface focuses on the object itself, not how it’s managed.

In practice, if you have a std::unique_ptr<Foo> myPtr and you want to call a function that just uses Foo, you’d do something like void useFoo(const Foo& obj); and call useFoo(*myPtr); – or if the function takes a Foo*, call useFoo(myPtr.get());. This does the right thing: it passes a pointer/reference to the actual Foo without transferring ownership, and the lifetime is guaranteed because the unique_ptr remains in scope keeping the object alive.

What about const correctness? If the function shouldn’t modify the object, use a pointer-to-const or reference-to-const (const Foo* or const Foo&). If the function shouldn’t (or needn’t) modify the smart pointer itself, you might be tempted to pass a const std::unique_ptr<Foo>&. As discussed, that prevents the function from moving or resetting the unique_ptr, but it still allows modifying the pointed object (since the unique_ptr would give access to a non-const Foo). This is another reason to prefer passing the object or a raw pointer directly – e.g. f(const Foo* ptr) expresses that the function won’t modify the Foo via this pointer. In short, use const on the pointer or reference to signify read-only access to the pointee, rather than wrapping the pointer in a const smart pointer reference, which complicates matters. One Stack Overflow answer nicely summarizes: passing const unique_ptr<T>& “is just using the unique_ptr as a useless wrapper around a naked pointer,” so you might as well pass a naked pointer or reference to begin with. This yields clearer semantics and solves any const-correctness issues.
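
The const-correctness point can be checked with a short sketch. Foo and the function names below are placeholders; the commented-out lines mark what the compiler would reject.

```cpp
#include <cassert>
#include <memory>

struct Foo { int value = 0; };  // hypothetical type

// const applies to the smart pointer, not to the Foo it points at,
// so this compiles and modifies the caller's object.
void viaConstUniquePtr(const std::unique_ptr<Foo>& p) {
    p->value = 1;        // allowed: operator-> yields Foo*, not const Foo*
    // p.reset();        // would not compile: reset() is a non-const member
}

// const applies to the pointee: read-only access is enforced.
int viaConstRef(const Foo& f) {
    // f.value = 2;      // would not compile: f is const
    return f.value;
}

int constDemo() {
    auto owner = std::make_unique<Foo>();
    viaConstUniquePtr(owner);     // silently mutates *owner
    assert(owner->value == 1);
    return viaConstRef(*owner);   // can only read
}
```

constDemo() returns 1: the function taking const std::unique_ptr<Foo>& changed the object even though its parameter looks read-only.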

To recap parameter guidelines:

Use unique_ptr<T> by value when the function takes ownership (caller gives it up).
Use unique_ptr<T>& (rarely const&) if the function needs to modify or reseat the caller’s smart pointer.
Use shared_ptr<T> by value if the function should share ownership (it will make a copy to keep the object alive).
Use shared_ptr<T>& if the function might reseat the shared pointer (e.g. assign a new shared_ptr to it).
Do not use smart pointers in the param list just to access the object – in that case, use raw T* or const T& (and document that the function does not take ownership). The raw pointer can be seen as “I just need to observe or use this object, it must outlive the call but I’m not owning it.”
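
The parameter guidelines recapped above could be sketched in one place as follows. Widget, Registry, and the free functions are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Widget { int id = 0; };  // hypothetical type

// Sink: takes ownership; the caller must std::move its unique_ptr in.
class Registry {
public:
    void adopt(std::unique_ptr<Widget> w) { items_.push_back(std::move(w)); }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<std::unique_ptr<Widget>> items_;
};

// Reseat: may replace the caller's pointer, so it takes a non-const reference.
void replaceWidget(std::unique_ptr<Widget>& w, int newId) {
    w = std::make_unique<Widget>();
    w->id = newId;
}

// Share: the by-value parameter is an owner for the duration of the call.
long ownersDuringCall(std::shared_ptr<Widget> w) { return w.use_count(); }

// Observe: no ownership involved, so a plain reference is enough.
int readId(const Widget& w) { return w.id; }

void parameterDemo() {
    Registry registry;
    auto a = std::make_unique<Widget>();
    registry.adopt(std::move(a));
    assert(a == nullptr && registry.size() == 1);  // ownership moved into registry

    std::unique_ptr<Widget> b;
    replaceWidget(b, 7);                           // b now points at a new Widget
    assert(readId(*b) == 7);

    auto s = std::make_shared<Widget>();
    assert(ownersDuringCall(s) == 2);              // caller + parameter
    assert(ownersDuringCall(std::move(s)) == 1);   // ownership handed over instead
}
```

The last two assertions show why the owner count inside a by-value shared_ptr function depends on whether the caller copies or moves its pointer.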

3.6.2 Returning Pointers or Smart Pointers from Functions

For function return types, the choice of raw pointer vs smart pointer also boils down to ownership semantics:

Returning a new heap object: If a function is creating a new object that the caller will own, the modern best practice is to return a std::unique_ptr<T> rather than a raw T*. For example, a factory function might be std::unique_ptr<Foo> createFoo(args...) { return std::make_unique<Foo>(args...); }. This makes it clear the caller takes ownership via the unique_ptr (and no manual delete is needed). It also provides exception safety (if creation fails, no need to worry about delete in the caller). In older C++, such functions would often return a raw pointer and put the burden on the caller to eventually delete it, which is less safe. Returning unique_ptr communicates the transfer of ownership clearly and is the recommended approach. The caller can then decide to keep it in a unique_ptr or even convert to a shared_ptr if needed (by moving it into a shared_ptr). The key is that using unique_ptr in the interface prevents forgetting to manage the memory.
Returning shared ownership: If a function is returning an object that it will share ownership of with the caller, return a std::shared_ptr<T>. An example might be a function that retrieves an object from a cache or subsystem where the object is managed by a shared_ptr internally; returning a shared_ptr lets the caller hold a reference without worrying that the object might disappear. It also makes it clear that the object could be shared elsewhere. Keep in mind, returning shared_ptr increases the ref count (or constructs a new shared_ptr) so there’s some overhead. Only use shared_ptr in returns if sharing is intended. If the function always creates a fresh object just for the caller, unique_ptr is usually sufficient (the caller can share it later if needed).
Returning a reference or raw pointer to an existing object: Sometimes a function returns a reference or pointer to an object that it still owns. For example, Foo& getFoo() or Foo* getFoo() might return a pointer/reference to an internal object (like an element of a container or a singleton). In this case, you are not transferring ownership, just providing access. Be very careful with this – you must ensure that the object outlives the return and clearly document that the caller should not delete it. A raw pointer return can be appropriate to indicate “non-owning reference” (in fact, the Core Guidelines say returning a T* can be used to indicate the caller should not free it – it’s just a “position” or reference). If you do this, consider using tools like gsl::not_null or at least assert that the returned pointer isn’t null if that’s expected. Never return a pointer or reference to a local object (that leads to dangling pointers).
Returning nullptr to signal “not found” or similar: If using raw pointers for non-owning returns, a common pattern is returning nullptr to indicate a missing result (e.g., search functions returning T*). This is fine for non-owning scenarios. If using smart pointers, returning an empty unique_ptr or shared_ptr can also indicate a null result. But be mindful that returning a shared_ptr just to use its null to indicate “not found” is overkill if ownership isn’t needed – a raw pointer can do the same with less overhead.

In summary, match the return type to the ownership semantics:

Use unique_ptr<T> to return a newly created object that the caller will own (exclusive ownership transfer).
Use shared_ptr<T> to return an object that is shared between caller and callee.
Use raw T* or T& to return references to objects managed elsewhere (but document lifetime assumptions and never return pointers to locals). If a raw pointer is returned from a function that created an object, that’s typically a code smell – it implies the caller must delete it, which is not exception-safe or clear (prefer returning unique_ptr in that case).
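
A compact sketch of the three return styles, under the assumption of a hypothetical Foo type and FooCache class, might look like this:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct Foo { std::string name; };  // hypothetical type

// New object for the caller: return unique_ptr.
std::unique_ptr<Foo> createFoo(const std::string& name) {
    auto foo = std::make_unique<Foo>();
    foo->name = name;
    return foo;   // moved (or elided) out; no std::move needed
}

// Hypothetical cache that co-owns its entries and lends non-owning views.
class FooCache {
public:
    // Shared ownership: the entry stays alive while the caller holds the result.
    std::shared_ptr<Foo> getOrCreate(const std::string& name) {
        auto& slot = entries_[name];
        if (!slot) slot = createFoo(name);   // unique_ptr converts to shared_ptr
        return slot;
    }
    // Non-owning lookup: nullptr means "not found"; the caller must not delete.
    const Foo* find(const std::string& name) const {
        auto it = entries_.find(name);
        return it == entries_.end() ? nullptr : it->second.get();
    }
private:
    std::map<std::string, std::shared_ptr<Foo>> entries_;
};

void returnDemo() {
    FooCache cache;
    assert(cache.find("a") == nullptr);      // nothing cached yet
    std::shared_ptr<Foo> held = cache.getOrCreate("a");
    assert(held.use_count() == 2);           // cache + caller
    assert(cache.find("a") == held.get());   // same object, no ownership taken
}
```

The pointer returned by find() is only valid while the cache (or some other shared_ptr) keeps the entry alive, which is the lifetime assumption a real interface would need to document.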

Finally, regarding const-correctness in returns: returning a const T* or const T& can indicate the caller should not modify the object through that pointer/reference. Smart pointers can also be const, e.g. returning std::shared_ptr<const T> if you want to present a read-only shared object. This will prevent the caller from modifying T (the pointed-to object) via that pointer (they’d only have const access). Such usage is less common but can be used to enforce an object’s immutability from the caller’s perspective.
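
As a small illustration of the read-only shared view, the sketch below returns std::shared_ptr<const Config> from an owner that keeps mutable access for itself. Settings and Config are hypothetical names.

```cpp
#include <cassert>
#include <memory>

struct Config { int retries = 3; };  // hypothetical type

class Settings {
public:
    // Callers share ownership but only get const access to the Config.
    std::shared_ptr<const Config> current() const { return config_; }
    void setRetries(int n) { config_->retries = n; }  // the owner can still modify
private:
    std::shared_ptr<Config> config_ = std::make_shared<Config>();
};

int readOnlyDemo() {
    Settings settings;
    std::shared_ptr<const Config> view = settings.current();
    assert(view.use_count() == 2);   // settings + view own the same Config
    // view->retries = 5;            // would not compile: the pointee is const
    settings.setRetries(5);
    return view->retries;            // the view observes the owner's change
}
```

readOnlyDemo() returns 5, which shows that the const view restricts what the caller can do but does not make the object itself immutable.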

To wrap up, smart pointers in C++ were introduced to make memory management safer by automating resource cleanup and clarifying ownership semantics. They originated in the 90s (with early reference-counting ideas), entered the standard library in C++98 with auto_ptr, and evolved into the robust unique_ptr/shared_ptr/weak_ptr family introduced in C++11. Smart pointers solve problems of memory leaks and dangling pointers by ensuring objects are properly destroyed when no longer needed. Modern best practices suggest using unique_ptr as a default for owning pointers, shared_ptr only when needed for sharing, and using raw pointers/references for non-owning uses to keep interfaces clear. When using inheritance, smart pointers behave like raw pointers (respecting polymorphism), and C++ provides casting utilities for shared_ptr to navigate class hierarchies safely. By following these practices (choosing the right smart pointer and passing/returning it correctly) you can write C++ code that is both safer (less prone to memory errors) and easier to understand in terms of object ownership and lifespan.

3.7 Factory Methods, Modern C++ Practices, and Pybind11 for Polymorphic Interfaces

3.7.1 Factory Methods in C++ Design

What are Factory Methods? – A Factory Method is a creational design pattern that provides an interface for creating objects, but allows the subclasses or separate functions to decide which concrete class to instantiate. In essence, you call a factory method instead of calling a constructor directly. This indirection lets you create objects without specifying the exact class of the object being created. The factory method typically returns a pointer or smart pointer to an abstract base class or interface that the concrete products implement. This promotes loose coupling: client code calls the factory interface and doesn’t need to know about the concrete subclasses, making it easier to extend or change implementations later.

How are they used in C++? – In C++, factory methods can be implemented in different styles:

Static factory functions in a class: A class can provide static methods that construct instances in specialized ways. For example, consider a 2D vector class that can be initialized either from Cartesian coordinates (x,y) or polar coordinates (angle, magnitude). You cannot have two constructors with the same signature, so you can use named static factory methods:
```
#include <cmath>  // std::cos, std::sin

struct Vec2 {
    float x, y;
    // private constructor to force use of factories
private:
    Vec2(float x_val, float y_val) : x(x_val), y(y_val) {}
public:
    static Vec2 fromLinear(float x, float y) {
        return Vec2(x, y);
    }
    static Vec2 fromPolar(float angle, float magnitude) {
        return Vec2(magnitude * std::cos(angle), magnitude * std::sin(angle));
    }
};
```
Here, Vec2::fromLinear and Vec2::fromPolar are factory methods that create Vec2 objects in different ways. The caller can clearly indicate which construction they want, improving code clarity over using constructors (and avoiding the need for an impossible constructor overload). This is a simple form of the Factory Method pattern where the class is its own factory.
Factory function or class hierarchy: More commonly associated with the formal Factory Method pattern (from the GoF design patterns), you might have an abstract creator with a virtual factory method, and multiple concrete creators. For example, an abstract ShapeFactory class could declare virtual Shape* createShape() = 0, and derived factory classes like CircleFactory override it to instantiate a Circle, while SquareFactory creates a Square. Client code uses a ShapeFactory* interface, not knowing which concrete factory it’s given, to obtain a new Shape and use it polymorphically. The key benefit is that the decision of which Shape subclass to create can be deferred until runtime (for example, based on user input or configuration) and encapsulated in the factory. The client just knows it’s getting a Shape pointer.

As a concrete example, if you have an abstract product class Shape (with a method like draw()), and concrete products Circle and Square implementing Shape, you can have:
```
class ShapeFactory {
public:
    virtual Shape* createShape() = 0;
    virtual ~ShapeFactory() {}
};
class CircleFactory : public ShapeFactory {
public:
    Shape* createShape() override { return new Circle(); }
};
class SquareFactory : public ShapeFactory {
public:
    Shape* createShape() override { return new Square(); }
};
```
Now the client can do:
```
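// A ?: expression cannot mix CircleFactory* and SquareFactory*, so branch explicitly.
std::unique_ptr<ShapeFactory> factory;
if (userWantsCircle) factory = std::make_unique<CircleFactory>();
else factory = std::make_unique<SquareFactory>();
std::unique_ptr<Shape> shp(factory->createShape());  // take ownership of the raw pointer
shp->draw();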
```
The client code didn’t need to know which concrete Shape was created – that decision was made inside the factory method at runtime. This is the classic Factory Method pattern usage. (In practice, you might not even need separate CircleFactory classes; a simpler approach is often to use a single free function or static method with a switch or if-else to choose the concrete type based on a parameter.)
Free function factory: In many C++ projects, a simpler approach is just a free function that returns a base-class pointer. For example, a factory function for a game AI might be:
```
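std::unique_ptr<Enemy> createEnemy(EnemyType type) {
    switch (type) {
        case EnemyType::Goblin: return std::make_unique<Goblin>();
        case EnemyType::Dragon: return std::make_unique<Dragon>();
        // ...
    }
    // Every path must return or throw; falling off the end is undefined behavior.
    throw std::invalid_argument("unknown EnemyType");
}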
```
This function encapsulates the logic of which subclass to create. The key point is that the correct type to create is determined at runtime (here by the EnemyType value). The caller just gets a std::unique_ptr<Enemy> and doesn’t have to deal with the details. This is often called a factory method as well (though it might not involve a whole class hierarchy of factories). It’s perfectly fine in C++ to implement the factory pattern with free functions or static methods if you don’t need a full Factory class hierarchy – in many cases a simple function is sufficient.
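To make the free-function approach concrete, here is a minimal, self-contained sketch. The Enemy, Goblin, and Dragon definitions and the name() method are assumptions made for illustration; the text above only names the types.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical product hierarchy (not defined in the text above).
enum class EnemyType { Goblin, Dragon };

class Enemy {
public:
    virtual ~Enemy() = default;            // allows safe deletion through Enemy*
    virtual std::string name() const = 0;
};

class Goblin : public Enemy {
public:
    std::string name() const override { return "Goblin"; }
};

class Dragon : public Enemy {
public:
    std::string name() const override { return "Dragon"; }
};

// Free-function factory: the concrete type is chosen at runtime from `type`.
std::unique_ptr<Enemy> createEnemy(EnemyType type) {
    switch (type) {
        case EnemyType::Goblin: return std::make_unique<Goblin>();
        case EnemyType::Dragon: return std::make_unique<Dragon>();
    }
    // Reached only if an out-of-range value was cast to EnemyType.
    throw std::invalid_argument("unknown EnemyType");
}
```

A caller would write something like auto e = createEnemy(EnemyType::Dragon); and then use e->name() without ever naming Dragon. Returning std::unique_ptr makes the ownership transfer explicit.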

When (and when not) to use factory methods: Use factory methods when object creation is non-trivial or you want to decouple what you create from how it’s created. Scenarios include: when you have a family of related classes and you need to choose one at runtime (e.g., different game State objects based on a config string), or when construction involves complex setup that you want to centralize and possibly reuse. Factories can also improve code readability by giving descriptive names to creation processes (like Vec2::fromPolar) and by avoiding exposure of new and delete in user code.
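The "choose one at runtime based on a config string" scenario is often handled with a small registry that maps names to creator functions. The sketch below is one possible shape under assumed names (State, MenuState, PlayState, createState); it is not taken from any particular library.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical state hierarchy used only for this illustration.
struct State {
    virtual ~State() = default;
    virtual std::string id() const = 0;
};
struct MenuState : State { std::string id() const override { return "menu"; } };
struct PlayState : State { std::string id() const override { return "play"; } };

using StateCreator = std::function<std::unique_ptr<State>()>;

// Registry mapping config strings to creator functions.
const std::map<std::string, StateCreator>& stateRegistry() {
    static const std::map<std::string, StateCreator> registry = {
        {"menu", []() -> std::unique_ptr<State> { return std::make_unique<MenuState>(); }},
        {"play", []() -> std::unique_ptr<State> { return std::make_unique<PlayState>(); }},
    };
    return registry;
}

// Factory: looks up the name and invokes the matching creator.
std::unique_ptr<State> createState(const std::string& name) {
    const auto& registry = stateRegistry();
    const auto it = registry.find(name);
    if (it == registry.end()) {
        throw std::invalid_argument("unknown state: " + name);
    }
    return it->second();
}
```

Adding a new state then means adding one registry entry. Code that calls createState does not change.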

However, do not overuse factories when they’re not needed. If object construction is simple and doesn’t vary, calling a constructor directly is clearer. Introducing a factory for a class that could be directly constructed just adds indirection and complexity for no gain. In particular, avoid factories for “trivially easy to construct objects”; that is over-engineering. For example, if you find yourself writing a factory that just does return new X(args); and nothing else, consider whether a direct new X(args) (or better, a smart pointer or stack allocation) in the caller would be sufficient. Factories also come with a slight runtime cost (dynamic dispatch or branching) and additional classes to maintain, so use them only when they buy you clear benefits (like flexibility or hiding complexity). As one source notes: “Avoid using the Factory Method pattern when your object creation process is straightforward and doesn’t require additional complexity… Don’t implement it if you don’t need to hide concrete implementation details, as this can lead to unnecessary overhead.”

In summary, factory methods are a powerful design technique in C++ (and other OOP languages) to abstract away and manage object creation. They shine when the creation logic is complex or when the code needs to work with a base interface while deferring the choice of concrete subclass to runtime. But if those conditions don’t apply, simple construction is usually preferable for clarity and performance.

Concrete example in practice: A real-world use might be something like OpenSpiel’s game loaders: OpenSpiel defines a base Game class and various derived game classes. It provides a factory function LoadGame(std::string game_name) which internally decides which derived game class to instantiate (Poker, Chess, etc.) based on the string and returns a shared pointer to Game. The user simply calls auto game = LoadGame("chess") and gets a polymorphic Game object without needing to know the exact class name for chess. This is a typical factory pattern usage: the library can add new games (new classes) without changing the code that calls LoadGame, since that code only deals with the base Game interface.

3.8 Modern C++ Tools and Practices for Safe, Performant Code

Moving beyond “bare bones” C++ (manual arrays, raw pointers, etc.), modern C++ development involves a rich toolchain and set of practices to write safer and faster code. Here are some important tools and concepts you should be aware of:

Smart Pointers and RAII: In modern C++, you should almost never use new and delete directly. Instead, use smart pointers (std::unique_ptr, std::shared_ptr, etc.) and the RAII idiom (Resource Acquisition Is Initialization) to manage resources. RAII means owning resources in objects that automatically release them in their destructors. For memory, unique_ptr is a unique-ownership pointer that deletes the object when it goes out of scope; shared_ptr is a reference-counted pointer for shared ownership. These eliminate most memory leaks and make lifetime management easier. In fact, a common guideline is “Avoid raw new/delete, C-style arrays, and manual memory management unless absolutely necessary – use smart pointers and RAII for all resources (memory, file handles, sockets, etc.)”. This greatly reduces memory errors and makes code more robust.
Use STL Containers instead of raw arrays: The C++ Standard Library provides containers like std::vector, std::array, std::string, std::map, etc., which handle memory management and sizing for you. Prefer these over raw C arrays or manual malloc. For example, std::vector<int> data(n); manages a dynamic array of int without you worrying about deleting it. These containers offer bounds-checked access (with .at()), iterators, and work with the standard algorithms. Similarly, use the functions in <algorithm> (such as std::sort) rather than hand-writing loops for common tasks; they are well-tested and often optimized. Don’t reinvent the wheel: “Prefer the Standard Library, especially the STL, which provides highly optimized and well-tested components. For example, use std::vector or std::array over raw C-style arrays, and use STL algorithms over hand-written loops where applicable.” This leads to safer and often faster code due to decades of optimization of these components.
Modern C++ Language Features: Make use of language enhancements from C++11/C++14/C++17/C++20 which improve both safety and performance:
- Auto and ranged for-loops: auto helps avoid type mismatches and makes code cleaner when types are long. Range-based for loops (for(auto& x : container)) avoid index errors and make loops more expressive.
- Const-correctness: Use const pervasively for variables and function parameters that shouldn’t change. This catches accidental modifications at compile time, documents intent, and occasionally gives the optimizer more information.
- Move semantics (rvalue references): Modern C++ allows you to move (steal) resources instead of copying, which is crucial for performance with large data (e.g., moving a large vector instead of copying it). Understand std::move and move constructors to avoid unnecessary deep copies.
- Concurrency libraries: C++11 introduced <thread>, <future>, <mutex> etc. If your computations can benefit from parallelism, you can use threads or higher-level tools like thread pools, or even parallel algorithms (C++17) to utilize multiple cores. However, be careful with thread safety and data races; use synchronization primitives or message-passing patterns as needed. Also consider high-level parallel frameworks or libraries (TBB, OpenMP) for numerically intensive code.
- Error handling: Prefer exceptions for error cases instead of error codes, in most high-level application logic (in performance-critical lower-level code, sometimes people avoid exceptions, but generally modern C++ uses exceptions for robustness). Also consider using standard types like std::optional or std::variant for representing optional values or variant types safely.
Build Systems and Package Managers: Unlike small single-file programs, larger C++ projects use build systems. You’ve discovered CMake, which is the de facto standard for C++ builds. CMake lets you manage compilation of multiple source files, manage dependencies, set up proper compiler flags for optimization, etc. For interfacing with Python, tools like scikit-build-core integrate CMake with Python’s packaging, making it easier to build C++ extensions as Python modules. This is very useful for projects that have a C++ back-end and a Python front-end (common in ML and scientific computing). Apart from CMake, be aware of package managers like Conan or vcpkg, which help manage C++ library dependencies (similar to pip in Python). They can save you from the pain of manually building and linking many libraries in large projects.
Libraries for Scientific Computing: Given your interest in linear algebra and deep learning, you should leverage existing high-performance libraries:
- Eigen – a hugely popular C++ template library for linear algebra (vectors, matrices, solvers). It’s header-only (no separate linking needed) and uses expression templates and SIMD instructions to achieve high performance on CPUs. Eigen is so well-regarded that it’s used inside major projects like TensorFlow and Stan for their linear algebra needs. In fact, “Eigen has been adopted for use within both the TensorFlow machine learning library and the Stan Math library, as well as at CERN”, which speaks to its performance and reliability. Using Eigen, you get Python-like matrix operations (+, -, * etc. overloaded for matrices) but with C++ performance (including vectorized operations with SSE/AVX). It saves you from writing low-level loops and enables writing math computations in a high-level, safer manner (no explicit memory allocation for intermediate results, since it cleverly avoids temporaries in many cases).
- BLAS/LAPACK and others: For certain heavy linear algebra tasks, you might use platform-optimized BLAS libraries (like Intel MKL, OpenBLAS). However, Eigen often is sufficient and more convenient. Other libraries like Armadillo (another C++ linear algebra lib), Boost.uBlas, or GPU-accelerated ones (CUDA libraries like cuBLAS, or Tensor libraries) are also available. For deep learning, you wouldn’t typically write everything from scratch – you might interface with libraries like PyTorch (libtorch) or TensorFlow C++ API, but if you do need to write custom C++ computations, use these established libraries for the heavy lifting.
- Visualization and I/O: If your project grows, also be aware of libraries for tasks like parsing (e.g., JSON libraries), visualization (maybe writing data to files that Python can plot, or using something like OpenCV for images), etc.
Performance Tools: To write performant C++ code, knowing the language and libraries is step one; step two is measuring and tuning. Get familiar with profilers (such as Valgrind’s callgrind, perf on Linux, or Visual Studio Profiler on Windows) to find bottlenecks. Also learn to use optimization flags (-O2, -O3) and how to measure the effect of changes. For numerical code, techniques like ensuring memory is contiguous and aligned (which libraries like Eigen handle for you) and minimizing allocations (reusing buffers, etc.) can be important. Modern C++ also has tools for heterogeneous computing (CUDA for GPU, SYCL for multi-backend), if you venture there.
Safety and Debugging Tools: Modern C++ encourages writing safe code by design, but using tools is essential. For example, run your code with AddressSanitizer and UndefinedBehaviorSanitizer (compiler flags like -fsanitize=address,undefined) to catch memory errors (out-of-bounds, use-after-free, etc.) early. Use Valgrind to detect memory leaks. These tools are invaluable as projects scale. Writing unit tests (e.g., with Google Test) is also a best practice to ensure each component works and to catch regressions early. Static analysis tools (like clang-tidy or Cppcheck) and linters can automatically flag potential issues and enforce style or best practices. Many C++ projects also follow the C++ Core Guidelines which codify best practices; there are even checker tools for some of those rules.
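The RAII and bounds-checking items in the list above can be seen working together in a small sketch. The Buffer type and the global counter are hypothetical, added only so that cleanup is observable; real code would not normally count instances this way.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical instrumentation: number of Buffer objects currently alive.
int g_live_buffers = 0;

// Hypothetical resource whose construction and destruction are counted.
struct Buffer {
    std::vector<double> data;
    explicit Buffer(std::size_t n) : data(n, 0.0) { ++g_live_buffers; }
    ~Buffer() { --g_live_buffers; }
};

// vector::at() throws std::out_of_range here; the unique_ptr still frees the
// Buffer during stack unwinding, so nothing leaks.
double readPastEnd() {
    auto buf = std::make_unique<Buffer>(4);
    return buf->data.at(10);   // bounds-checked access, index is out of range
}

// Returns how many Buffers are still alive after the failure was handled.
int liveBuffersAfterFailure() {
    try {
        readPastEnd();
    } catch (const std::out_of_range&) {
        // `buf` has already been destroyed by the time we get here
    }
    return g_live_buffers;
}
```

With a raw new and operator[] instead, the same mistake would be an out-of-bounds read (undefined behavior) rather than a clean exception. Any exception thrown before the matching delete would also leak the buffer. That is the kind of bug AddressSanitizer and Valgrind are meant to find.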

In summary, beyond the basics of arrays and pointers, modern C++ development involves using the standard library for common data structures and algorithms, leveraging smart pointers and RAII for memory safety, using tools like CMake for builds, and harnessing specialized libraries for heavy tasks like linear algebra. It also means keeping up with language improvements (C++17/20 features) that let you write cleaner and faster code. By combining these, you get the performance of C++ with much less of the traditional pitfalls of memory management and complexity. As one guideline succinctly puts it: use the “modern subset” of C++ – e.g., C++20+, no raw pointers if possible, no manual memory management, and plenty of help from libraries and static analysis. This will let you build performant, safe, and maintainable C++ projects, especially for computational tasks that interface with Python.
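As a small illustration of two of the language features mentioned above, the sketch below uses std::optional (C++17) for a value that may be absent and std::move to hand a vector to a new owner without copying. The names firstNegative and Dataset are made up for this example.

```cpp
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Returns the index of the first negative value, or std::nullopt if there is none.
std::optional<std::size_t> firstNegative(const std::vector<int>& values) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < 0) return i;
    }
    return std::nullopt;
}

// Takes the vector by value and moves it into the member; callers that pass an
// rvalue (for example via std::move) avoid a deep copy.
class Dataset {
public:
    explicit Dataset(std::vector<int> values) : values_(std::move(values)) {}
    std::size_t size() const { return values_.size(); }
private:
    std::vector<int> values_;
};
```

A caller can write Dataset d(std::move(v)); to transfer the buffer. After that, v is in a valid but unspecified state and should not be read until it has been reassigned.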

3.9 Best Practices for Pybind11 with Polymorphic C++ Interfaces

Interfacing C++ and Python using pybind11 is a powerful way to get the best of both worlds (C++ speed, Python ease). However, it can get tricky when you need to expose complex C++ designs – especially class hierarchies (inheritance), factory functions, and ownership of objects across the language boundary. Let’s break down the best practices for a scenario like OpenSpiel’s, where we have C++ polymorphic classes, some factory functions, and we want to use them from Python (possibly even subclass them in Python). We’ll cover how to expose classes and inheritance, manage object lifetimes safely, and allow method overrides between C++ and Python.

3.9.1 Exposing C++ Classes and Inheritance to Python

Pybind11 makes it straightforward to expose classes. For a simple class with no inheritance, you use py::class_<T>(module, "ClassName") and add .def() for its methods and constructors (.def(py::init<...>())). When dealing with inheritance (a base class and derived classes), pybind11 allows you to specify the base in the template parameters. For example, suppose we have:

```
// C++ code
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>

struct Game {
    virtual ~Game() = default;
    virtual std::string name() const = 0;
    virtual void play() = 0;
};
struct ChessGame : public Game {
    std::string name() const override { return "Chess"; }
    void play() override { std::cout << "Playing chess\n"; }
};
struct PokerGame : public Game {
    std::string name() const override { return "Poker"; }
    void play() override { std::cout << "Playing poker\n"; }
};
std::shared_ptr<Game> LoadGame(const std::string& game_type) {
    if (game_type == "chess") return std::make_shared<ChessGame>();
    if (game_type == "poker") return std::make_shared<PokerGame>();
    throw std::runtime_error("Unknown game type");
}
```

Here Game is an abstract base class, and ChessGame/PokerGame are concrete derived classes. We also have a factory LoadGame that returns a shared_ptr<Game>.

To bind these with pybind11:

```
#include <pybind11/pybind11.h>

namespace py = pybind11;
PYBIND11_MODULE(mygames, m) {
    // Bind the base class Game, use shared_ptr as holder type
    py::class_<Game, std::shared_ptr<Game>>(m, "Game")
        .def("name", &Game::name)
        .def("play", &Game::play);

    // Bind derived classes, specifying Game as the base
    py::class_<ChessGame, Game, std::shared_ptr<ChessGame>>(m, "ChessGame")
        .def(py::init<>());
    py::class_<PokerGame, Game, std::shared_ptr<PokerGame>>(m, "PokerGame")
        .def(py::init<>());

    // Bind the factory function
    m.def("load_game", &LoadGame, py::arg("game_type"));
}
```

A few important things to note in this binding code:

We specified std::shared_ptr<...> as the holder type for these classes (one of the template arguments of py::class_ after the bound type). By default, pybind11 uses std::unique_ptr<T> as the holder for classes, but when dealing with class hierarchies and factory functions, using std::shared_ptr is often more convenient. It allows Python to share ownership of objects and easily handle polymorphic conversions. In the above, because Game uses std::shared_ptr<Game> as holder, any function returning a std::shared_ptr<Game> (like LoadGame) will automatically convert to a Python object that holds a shared pointer. Also, a ChessGame can be passed wherever a Game is expected, since pybind11 knows ChessGame derives from Game and both use compatible holders.
We listed Game as a base for ChessGame in the binding (py::class_<ChessGame, Game, ...>). This tells pybind11 about the inheritance relationship so that upcasts (ChessGame -> Game) are understood. After this, load_game("chess") is declared to return a Game, but because Game is polymorphic (it has virtual functions), pybind11 inspects the dynamic type and hands Python a mygames.ChessGame instance, which is also an instance of mygames.Game. Dynamic dispatch works as usual: calling game.name() in Python will invoke ChessGame::name() in C++ due to C++ virtual dispatch.
We exposed Game’s methods (which are virtual) to Python. Pybind11 will allow calls on the base class instance, which actually invoke the derived override, as expected. This is straightforward since C++ handles virtual dispatch natively.

When not to expose derived classes separately: In some cases, you might choose not to expose ChessGame and PokerGame as Python-visible classes at all, if you want to treat them abstractly. You could just expose Game and the load_game function, and users get Game objects. This is fine if Python code never needs to explicitly construct or refer to ChessGame. In OpenSpiel’s case, they likely don’t expose each game class individually; they provide a factory to get games by name. Exposing the base class and factory is sufficient for usage. However, exposing the derived classes can be useful if you anticipate subclassing them in Python or wanting to inspect the concrete type on the Python side.

3.9.2 Memory Management and Ownership between C++ and Python

One of the trickiest aspects is managing object lifetimes across the boundary. Here are best practices:

Use smart pointers as holders: As shown, using std::shared_ptr (or the newer py::smart_holder) is recommended for classes, especially if instances may be created in C++ and passed to Python or vice versa. A shared pointer holder ensures that the C++ object isn’t deleted as long as Python has a reference to it (because the Python object keeps a shared_ptr). It also handles multiple references gracefully. With the default unique_ptr holder you can still return objects to Python and call methods on them, but you cannot pass them to C++ functions that take a std::shared_ptr<T> or that take ownership through a std::unique_ptr<T> parameter. py::smart_holder (also available through the py::classh<T> alias) was developed on a separate smart_holder branch of pybind11 and is part of the mainline releases as of v3.0; check the changelog of the version you build against. It is more flexible: it can manage both unique and shared pointers and avoids some pitfalls like slicing. For example, py::class_<Game, PyGame, py::smart_holder> (if we also want trampolines, see below) would be a safe default. Smart holder automatically keeps a Python subclass instance alive when it is passed to C++, and supports conversions of both unique_ptr and shared_ptr. If using plain shared_ptr as we did above, it works in most cases but be mindful that passing a unique_ptr from C++ to Python wouldn’t be supported in that setup.
Factory function return value policy: When binding a factory like load_game that returns a raw pointer or a smart pointer, you need to tell pybind11 how to convert it. If you return a std::shared_ptr<Game> directly (as in our example), and Game is registered with shared_ptr holder, everything is automatic. If instead your factory returned a raw Game*, you should specify a return value policy, typically return_value_policy::take_ownership (if the factory allocates a new object and transfers ownership to Python) or reference/reference_internal if appropriate. In practice, prefer returning smart pointers – it’s clearer and less error-prone since pybind11 can handle them directly.
Keep-alive for cross-language references: A crucial scenario is when Python passes an object (especially a Python subclass of a C++ class) into a C++ function that stores it for later use. For instance, suppose Game had a method registerCallback(GameObserver* obs) to store an observer pointer. If a Python class extends GameObserver and you pass an instance, you must ensure Python’s object doesn’t get garbage-collected while C++ still holds the pointer. Pybind11 offers the keep_alive<> policy for this. For example:
```
.def("registerCallback", &Game::registerCallback, py::arg("obs"), py::keep_alive<1, 2>());
```
The keep_alive<1,2> tells pybind11 that the object at argument 2 (the observer) should be kept alive at least as long as the object at argument 1 (the Game this pointer) remains alive. This effectively increases the refcount or holds a reference internally to prevent premature deletion. In a Stack Overflow example, passing a Python-derived object into C++ and retrieving it later failed when not using keep_alive, because the temporary Python object was destroyed too soon. The solution was either to hold a Python reference or use keep_alive. So, whenever a C++ function stores a pointer to a Python object, use keep_alive (or manage the lifetimes by ensuring the Python object is held in a variable on the Python side).
Avoiding slicing and multiple inheritance issues: If you allow Python subclasses (see next section), note that C++ code holding a base pointer to a Python-derived object can be left with only the C++ part once the Python part has been garbage-collected. This is a form of slicing. py::smart_holder and py::trampoline_self_life_support (a base class for trampolines) work together to avoid this by ensuring the actual Python object stays around. Without going too deep: if you use smart_holder and Python-derived objects may be handed to C++ through a std::unique_ptr, make your trampoline classes inherit from py::trampoline_self_life_support. With a plain std::shared_ptr holder that base class is not what keeps the Python part alive, so rely on keep_alive or on a reference held on the Python side.
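The lifetime hazard behind keep_alive can be reproduced in plain C++, without pybind11. The sketch below is an analogy only, not pybind11's implementation: it stores observers in a std::shared_ptr, so that registering an observer extends its lifetime in the same spirit as keep_alive<1, 2>. The notify(), observerCount(), and notifyFirst() members are assumptions added for the demonstration.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct GameObserver {
    virtual ~GameObserver() = default;
    virtual std::string notify() const { return "observed"; }
};

// Hypothetical Game that stores its observers. Holding a shared_ptr means a
// registered observer cannot be destroyed while the Game still refers to it.
class Game {
public:
    void registerCallback(std::shared_ptr<GameObserver> obs) {
        observers_.push_back(std::move(obs));
    }
    std::size_t observerCount() const { return observers_.size(); }
    std::string notifyFirst() const { return observers_.front()->notify(); }
private:
    std::vector<std::shared_ptr<GameObserver>> observers_;
};

// Registers an observer whose only other owner goes out of scope, then reports
// whether the observer is still alive and usable through the Game.
bool observerSurvivesTemporary() {
    Game game;
    std::weak_ptr<GameObserver> watcher;
    {
        auto obs = std::make_shared<GameObserver>();
        watcher = obs;
        game.registerCallback(obs);
    }   // the caller's reference is gone here
    return !watcher.expired() && game.notifyFirst() == "observed";
}
```

If Game stored a raw GameObserver* instead, the pointer would dangle as soon as the inner scope ended. A C++ object that keeps a bare pointer to a temporary Python object is in the same position unless something holds a reference for it.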

3.9.3 Allowing Python to Override C++ Virtual Functions (Trampoline Classes)

If your library expects users to possibly subclass your C++ classes in Python (e.g., to implement a callback interface or an abstract class in Python), you need to use trampoline classes in pybind11. A trampoline class is a C++ class that inherits your C++ base and overrides virtual methods to redirect calls to Python.

For example, if Game had a virtual method virtual void on_event(int x), and you want Python subclasses of Game to be able to override on_event, you would do something like:

```
struct PyGame : Game, pybind11::trampoline_self_life_support {
    using Game::Game; // inherit constructors if any

    // Override virtual methods to delegate to Python
    void on_event(int x) override {
        PYBIND11_OVERRIDE(void, Game, on_event, x);
    }
    std::string name() const override {
        PYBIND11_OVERRIDE_PURE(std::string, Game, name, /* no args */);
    }
    // etc. for each virtual function you want Python to be able to override
};
```

Then bind Game with this trampoline:

```
py::class_<Game, PyGame, std::shared_ptr<Game>>(m, "Game")
    .def(py::init<>())  // required so that Python subclasses can call Game.__init__
    .def("on_event", &Game::on_event)
    .def("name", &Game::name)
    // ... other defs ...
;
```

Key points about trampolines:

The trampoline overrides the base class virtuals using the PYBIND11_OVERRIDE macro (or the _PURE variant if the base method is pure virtual). The macro checks whether the Python object defines an override for the method. If it does, the macro calls it; otherwise PYBIND11_OVERRIDE falls back to the C++ base implementation, and PYBIND11_OVERRIDE_PURE throws because there is nothing to fall back to.
The Game class in the binding is told that its alias (trampoline) is PyGame by that template parameter. That way, if a Python subclass is created, pybind11 will actually allocate a PyGame C++ object to back it, which can call back into Python.
Notice we included pybind11::trampoline_self_life_support as a base of PyGame. As mentioned, it exists to handle lifetime issues when a Python-derived object is passed to C++ through a std::unique_ptr under py::smart_holder. With the std::shared_ptr holder used here it is not strictly needed, but including it is harmless and keeps the trampoline ready for a later switch to smart_holder.
When binding, the class being bound comes first in the py::class_<> template parameters, followed by the trampoline: py::class_<Base, PyBase> means Base is the actual type for Python, and PyBase is the trampoline. All method definitions still refer to &Game::on_event etc., not the trampoline’s methods.

After this setup, Python can do:

```
class MyGame(Game):
    def __init__(self):
        Game.__init__(self)  # call base constructor if needed
    def on_event(self, x):
        print("Python handling event", x)
    def name(self):
        return "MyGame"
g = MyGame()
g.on_event(42)       # calls MyGame.on_event in Python
cpp_call_somehow(g)  # if C++ calls Game::on_event on g, it will route to Python override
```

This bridging is complex under the hood, but pybind11 takes care of it via the trampoline. Note that if C++ will store g (as a Game* or shared_ptr<Game>), we must use the earlier-mentioned keep_alive or smart holder approach to ensure the MyGame Python object isn’t destroyed too early.

One limitation: when you create Python subclasses like MyGame, pybind11 has to allocate a C++ PyGame object (since the Python object needs a C++ backend). This means Python may succeed in instantiating a class that is abstract in C++ without implementing all pure virtuals. In our example, name() is pure virtual, so what happens if the Python class doesn’t override it? The trampoline’s PYBIND11_OVERRIDE_PURE will throw a runtime error when the method is called. But Python could still instantiate the class (because from Python’s perspective, it’s not abstract once bound). So the design principle “you cannot instantiate an abstract class” isn’t enforced on the Python side. This is a minor quirk: you have to rely on runtime errors if a pure virtual isn’t overridden. In practice, it’s not a big issue, but it’s good to be aware that Python classes can be created even if they don’t override everything (they just can’t successfully call the missing methods).

Trampolines vs alternative approach: If you don’t need Python to subclass your C++ classes, you can avoid trampolines. For example, if Game is meant to be subclassed only in C++ and just used in Python, you can bind it without a trampoline. Only use trampolines if you want Python-side inheritance of that class. Trampolines do have a slight overhead, so pybind11 by default only initializes them when needed (like when a Python subclass is actually created) to avoid unnecessary cost.
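To build intuition for what a trampoline does, here is a plain C++ stand-in. It is not how pybind11 implements trampolines. The std::function members model the question "did the Python subclass define this method?", and describe() is an assumed non-pure virtual added to show the fallback path.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

struct Game {
    virtual ~Game() = default;
    virtual std::string name() const = 0;                       // pure virtual
    virtual std::string describe() const { return "a game"; }   // has a C++ default
};

// Stand-in for a trampoline: forwards each virtual call to an optional
// "override" and otherwise behaves like the two override macros described above.
struct ForwardingGame : Game {
    std::function<std::string()> name_override;
    std::function<std::string()> describe_override;

    std::string name() const override {
        if (name_override) return name_override();
        // Mirrors the _PURE variant: no fallback exists, so fail at call time.
        throw std::runtime_error("pure virtual Game::name was not overridden");
    }
    std::string describe() const override {
        if (describe_override) return describe_override();
        return Game::describe();   // mirrors the non-pure variant: use the C++ default
    }
};

// C++ code that only knows the base interface.
std::string announce(const Game& g) { return g.name() + ": " + g.describe(); }
```

Constructing a ForwardingGame without a name_override succeeds, and the error appears only when name() is called. That matches the quirk described above, where Python can instantiate a subclass that leaves a pure virtual unimplemented.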

3.9.4 End-to-End Example and Best Practices Summary

Putting it all together with an example (combining the above ideas):

Suppose we are designing a C++ library with an abstract base class Game and multiple game types. We want to expose this to Python such that users can load games by name, call methods on them, and even implement their own game in Python by subclassing Game (perhaps for quick prototyping).

C++ side design:

Game is an abstract class with some virtual methods (like play()).
Concrete games like ChessGame and PokerGame derive from Game.
A LoadGame factory returns std::shared_ptr<Game> so that it can hand out either a ChessGame or PokerGame as a Game.

Pybind11 binding:

```
// Trampoline class for Game to allow Python overrides (defined at namespace scope)
struct PyGame : Game, py::trampoline_self_life_support {
    using Game::Game; // inherit constructors if any

    void play() override {
        PYBIND11_OVERRIDE_PURE(void, Game, play, /* no args */);
    }
    std::string name() const override {
        PYBIND11_OVERRIDE_PURE(std::string, Game, name, /* no args */);
    }
};

PYBIND11_MODULE(mygames, m) {
    py::class_<Game, PyGame, std::shared_ptr<Game>>(m, "Game")
        .def(py::init<>())  // needed so Python subclasses can call Game.__init__
        .def("play", &Game::play)
        .def("name", &Game::name);

    py::class_<ChessGame, Game, std::shared_ptr<ChessGame>>(m, "ChessGame")
        .def(py::init<>());  // assuming it’s default constructible
    py::class_<PokerGame, Game, std::shared_ptr<PokerGame>>(m, "PokerGame")
        .def(py::init<>());

    m.def("load_game", &LoadGame, py::arg("name"));
}
```

What this achieves:

You can call in Python:

```
import mygames
game = mygames.load_game("chess")   # returns a mygames.ChessGame instance (a subclass of mygames.Game)
print(game.name())                 # "Chess"  (calls ChessGame::name)
game.play()                        # invokes ChessGame::play(), prints "Playing chess"
isinstance(game, mygames.Game)     # True
isinstance(game, mygames.ChessGame) # True as well, since we bound ChessGame class
```

The object is an instance of ChessGame (and also recognized as a Game since ChessGame is subclass of Game in Python too). If we hadn’t exposed ChessGame in pybind, it would appear only as Game type to Python, which is fine because the methods are all on Game. Exposing the derived class allows for isinstance checks or downcasting in Python if needed.

If the Python user tries to create their own game:
```
class MyGame(mygames.Game):
    def __init__(self):
        mygames.Game.__init__(self)  # required: constructs the C++ (PyGame) part; Game must have a bound constructor
    def name(self):
        return "Mine"
    def play(self):
        print("Playing my custom game")

g2 = MyGame()
print(g2.name())  # "Mine"
g2.play()         # prints "Playing my custom game"
```
This works because our binding used the trampoline PyGame which routes virtual calls. If C++ code (in the library) later calls a virtual method on a Game pointer that actually points to a MyGame (Python) instance, it will call into Python. For example, if there’s a C++ function:
```
void Tournament(Game* game1, Game* game2) {
    std::cout << "Starting games: " << game1->name() << " vs " << game2->name() << "\n";
    game1->play();
    game2->play();
}
```
and we bind that as m.def("tournament", &Tournament), then in Python:
```
mygames.tournament(g2, mygames.load_game("poker"))
```
will call MyGame.name() (Python) for the first game and PokerGame::name() (C++) for the second, etc. Both arguments stay alive for the duration of the call because Python holds references to them while the call runs, and the trampoline makes the virtual dispatch work correctly. (Under the hood, tournament receives a Game*. If the object was created in Python, pybind11 passes a pointer to the PyGame object, whose virtual play() uses PYBIND11_OVERRIDE_PURE to call back into Python.)

Lifetime considerations: In the above, we used std::shared_ptr<Game> as the holder everywhere, so C++ and Python share ownership. If load_game("chess") creates a shared_ptr and returns it to Python, the Python object holds one reference; if C++ also keeps one (in a global registry, for example), the object lives until both are released. If Python drops its reference (the object goes out of scope) while C++ still holds one, the object stays alive, although Python can no longer reach it unless C++ passes it back. This shared-ownership model is usually what you want for objects such as game environments, because it rules out premature deletion.
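The shared-ownership behavior can be observed in plain C++, with no pybind11 involved. In this minimal sketch, python_side plays the role of the holder inside the Python object and cpp_side a reference kept by the library; load_game_sketch and survives_first_release are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Game {
    virtual ~Game() = default;
    virtual std::string name() const = 0;
};

struct ChessGame : Game {
    std::string name() const override { return "chess"; }
};

// Hypothetical factory returning shared ownership, as load_game does in the text.
std::shared_ptr<Game> load_game_sketch() {
    return std::make_shared<ChessGame>();
}

// Returns true if the object outlives the first owner's release and is
// destroyed once the second owner releases it as well.
bool survives_first_release() {
    std::shared_ptr<Game> python_side = load_game_sketch();  // "Python" owner
    std::shared_ptr<Game> cpp_side = python_side;            // "C++" owner
    std::weak_ptr<Game> observer = python_side;              // non-owning probe

    python_side.reset();                        // Python drops its reference
    bool alive_after_python = !observer.expired();

    cpp_side.reset();                           // C++ drops its reference
    bool destroyed_after_both = observer.expired();

    return alive_after_python && destroyed_after_both;
}
```

The weak_ptr is used only to probe whether the object still exists; it does not extend the lifetime.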

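If your design instead has unique ownership (say the C++ side strictly manages lifetime and Python should not extend it), you could use a non-deleting holder such as std::unique_ptr<Game, py::nodelete>, or return references with py::return_value_policy::reference. Both leave it to you to guarantee that the C++ object outlives every Python reference, so this is an advanced technique that is rarely needed in typical use.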

Summary of best practices for pybind11 in this context:

Use appropriate holder types (std::shared_ptr or py::smart_holder) for classes to simplify memory management and polymorphism. This avoids manual new/delete management and makes C++ polymorphic objects behave well in Python.
Expose base classes and derived classes with proper inheritance in bindings so that Python knows the relationships. This enables Python to upcast automatically and call the correct methods.
Bind factory functions in a way that returns ownership to Python. If using shared_ptr, it’s seamless. If using raw pointers, use return_value_policy to avoid memory leaks or double frees (e.g., return_value_policy::take_ownership if the function returns a new heap object).
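Trampolines for virtual overrides: Use them if and only if you need Python to override C++ virtual methods. The trampoline must override every virtual function that Python may override, using PYBIND11_OVERRIDE_PURE for pure virtuals and PYBIND11_OVERRIDE otherwise. If you use py::smart_holder, also derive the trampoline from py::trampoline_self_life_support so that the Python part of a Python-derived object is kept alive when ownership passes to C++, which prevents inheritance slicing.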
Keep alive any cross-boundary pointers: If C++ holds onto a Python-created object, use keep_alive or ensure the Python object is referenced somewhere in Python. If Python holds a C++ object created via factory, use smart pointers (as we did) so that C++ doesn’t accidentally free it while Python still uses it.
Testing the interface: It’s helpful to write some test code in Python to ensure that methods dispatch correctly (especially virtuals) and that no lifetime issues appear (e.g., use a Python subclass in a C++ function and see if it works, as in the tournament example).
Documentation and clarity: Follow Python naming conventions in the Python API, e.g., factory function names such as load_game should be lowercase with underscores, as we did. Pybind11 also lets you attach docstrings to functions and classes. Report errors (such as an unknown game type) by throwing C++ exceptions: pybind11 translates them into Python exceptions, mapping std::exception to RuntimeError by default and more specific types to closer matches (std::invalid_argument becomes ValueError, std::out_of_range becomes IndexError).
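As a sketch of the error-handling advice in the list above, here is one way a factory like load_game could reject unknown game types. The function body and the two concrete classes are assumptions for illustration; the text only establishes that load_game takes a game name and returns a std::shared_ptr<Game>. The exception type is chosen so that, once bound, Python callers would see a ValueError.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

struct Game {
    virtual ~Game() = default;
    virtual std::string name() const = 0;
};

struct ChessGame : Game {
    std::string name() const override { return "chess"; }
};

struct PokerGame : Game {
    std::string name() const override { return "poker"; }
};

// Assumed implementation: returns shared ownership of a new game, or
// throws std::invalid_argument when the requested kind is not known.
std::shared_ptr<Game> load_game(const std::string& kind) {
    if (kind == "chess") return std::make_shared<ChessGame>();
    if (kind == "poker") return std::make_shared<PokerGame>();
    throw std::invalid_argument("unknown game type: " + kind);
}
```

Throwing instead of returning a null pointer keeps the Python side simple: callers get either a usable object or an exception, never a None they have to remember to check.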

By following these practices, you can create Python bindings that feel natural to Python users while harnessing a robust C++ backend. A user can create and use Game objects in Python without worrying that they’re C++ under the hood – method calls Just Work, polymorphism Just Works. Meanwhile, you maintain the performance-critical parts in C++, and you can even allow power-users to extend functionality in Python via subclassing, thanks to trampolines and pybind11’s support.

This setup (C++ core + pybind11 interface) is common in many advanced projects (OpenSpiel, PyTorch, etc.). It does have a learning curve, but once mastered, it provides an “orienting view” of designing software that spans C++ and Python: write the heavy logic in C++ (with modern C++ best practices as discussed), expose a clean API to Python, and manage lifetimes carefully so that the two languages interact safely. With these tools – smart pointers, CMake build with scikit-build, pybind11 for binding, and good software design patterns – you’ll be well-equipped to develop performant, safe, and user-friendly C++/Python hybrid projects.

Sources:

Factory Method pattern concept and usage; when (not) to use factories.
Modern C++ safe practices: prefer high-level constructs (STL containers/algorithms) over low-level pointers; avoid manual memory management, use RAII and smart pointers.
Eigen library for linear algebra (popular in ML, used by TensorFlow/Stan) and high-performance kernels.
Pybind11 advanced features: smart_holder for safe pointer passing; keep_alive usage to maintain object lifetimes; trampoline (virtual override) setup; general pybind11 class binding mechanics.

Miscellany