.. cpp:namespace:: nanobind

Object ownership

Python and C++ don't manage the lifetime and storage of objects in the same way. Consequently, two questions arise whenever an object crosses the language barrier:

  • Who actually owns this object? C++? Python? Both?!
  • Can we safely determine when it is no longer needed?

This is important: we must exclude the possibility that Python destroys an object that is still being used by C++ (or vice versa).

The :ref:`previous section <exchange>` introduced three ways of exchanging information between C++ and Python: :ref:`type casters <type_casters>`, :ref:`bindings <bindings>`, and :ref:`wrappers <wrappers>`. It is specifically :ref:`bindings <bindings>` for which these two questions must be answered.

A problematic example

Consider the following problematic example to see what can go wrong:

#include <nanobind/nanobind.h>
namespace nb = nanobind;

struct Data { };
Data data; // Data global variable & function returning a pointer to it
Data *get_data() { return &data; }

NB_MODULE(my_ext, m) {
    nb::class_<Data>(m, "Data");

    // KABOOM, calling this function will crash the Python interpreter
    m.def("get_data", &get_data);
}

The bound function my_ext.get_data() returns a Python object of type my_ext.Data that wraps the pointer &data and takes ownership of it.

When Python eventually garbage collects the object, nanobind will try to free the (non-heap-allocated) C++ instance via operator delete, causing a segmentation fault.

To avoid this problem, we can

  1. Provide more information: the problem was that nanobind incorrectly transferred ownership of a C++ instance to the Python side. To fix this, we can add a :ref:`return value policy <rvp>` annotation that clarifies what to do with the return value.
  2. Make ownership transfer explicit: C++ types passed via :ref:`unique pointers <unique_ptr>` (std::unique_ptr<T>) make the ownership transfer explicit in the type system, which would have revealed the problem in this example.
  3. Switch to shared ownership: C++ types passed via :ref:`shared pointers <shared_ptr>` (std::shared_ptr<T>), or types that use :ref:`intrusive reference counting <intrusive_intro>`, can be shared by C++ and Python. The whole issue disappears because ownership transfer is no longer needed.

The remainder of this section goes through each of these options.

Return value policies

nanobind provides several return value policy annotations that can be passed to :func:`module_::def`, :func:`class_::def`, and :func:`cpp_function`. The default policy is :cpp:enumerator:`rv_policy::automatic`, which is usually a reasonable default (but not in this case!).

In the :ref:`problematic example <ownership_problem>`, the policy :cpp:enumerator:`rv_policy::reference` should have been specified explicitly so that the global instance is only referenced without any implied transfer of ownership, i.e.:

m.def("get_data", &get_data, nb::rv_policy::reference);
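The difference between owning and merely referencing an instance can be sketched in plain C++. The following is a toy model, not nanobind's implementation: ``Wrapper``, ``Policy``, ``Tracked``, and ``g_live`` are hypothetical names introduced only for illustration, and the two ``Policy`` values loosely mirror the idea of transferring ownership versus referencing.

```cpp
// Hypothetical counter of live instances, so that destruction is observable.
int g_live = 0;

// Hypothetical payload type that updates the counter.
struct Tracked {
    Tracked() { ++g_live; }
    ~Tracked() { --g_live; }
};

// Hypothetical stand-ins for "transfer ownership" and "only reference".
enum class Policy { take_ownership, reference };

// Toy wrapper: deletes the wrapped instance on destruction only if it owns it.
class Wrapper {
public:
    Wrapper(Tracked *ptr, Policy policy) : m_ptr(ptr), m_policy(policy) { }
    Wrapper(const Wrapper &) = delete;
    Wrapper &operator=(const Wrapper &) = delete;
    ~Wrapper() {
        if (m_policy == Policy::take_ownership)
            delete m_ptr; // only valid for instances created with 'new'
    }
private:
    Tracked *m_ptr;
    Policy m_policy;
};

// Heap instance + ownership: the wrapper frees the instance when it dies.
int live_after_wrapping_heap_instance() {
    { Wrapper w(new Tracked(), Policy::take_ownership); }
    return g_live;
}

// Static instance + reference: the wrapper leaves the instance alone.
int live_after_referencing_static_instance() {
    static Tracked global;
    { Wrapper w(&global, Policy::reference); }
    return g_live;
}
```

Wrapping ``&global`` with ``Policy::take_ownership`` in this model would call ``delete`` on an object that was never allocated with ``new``, which is the same mistake as in the problematic example.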

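On the other hand, :cpp:enumerator:`rv_policy::reference` is not the right policy for many other situations, where ignoring ownership could lead to resource leaks. As a developer using this library, it is important that you familiarize yourself with the available policies: :cpp:enumerator:`rv_policy::take_ownership`, :cpp:enumerator:`rv_policy::copy`, :cpp:enumerator:`rv_policy::move`, :cpp:enumerator:`rv_policy::reference`, :cpp:enumerator:`rv_policy::reference_internal`, :cpp:enumerator:`rv_policy::none`, :cpp:enumerator:`rv_policy::automatic`, and :cpp:enumerator:`rv_policy::automatic_reference`.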

Unique pointers

Passing an STL unique pointer embodies an ownership transfer, so a return value policy annotation is not needed. To bind functions that receive or return std::unique_ptr<..>, add the extra include directive

#include <nanobind/stl/unique_ptr.h>

Note

While this header file technically contains a :ref:`type caster <type_casters>`, it is not affected by their usual limitations (mandatory copy/conversion, inability to mutate function arguments).

Example: The following example binds two functions that create and consume instances of a C++ type Data via unique pointers.

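#include <memory>
#include <nanobind/nanobind.h>
#include <nanobind/stl/unique_ptr.h>

namespace nb = nanobind;

NB_MODULE(my_ext, m) {
    struct Data { };
    nb::class_<Data>(m, "Data");

    // Ownership of the newly created instance passes to Python
    m.def("create", []() { return std::make_unique<Data>(); });

    // Ownership passes back to C++; 'x' is destroyed when the lambda returns
    m.def("consume", [](std::unique_ptr<Data> x) { /* no-op */ });
}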

Calling a function taking a unique pointer from Python invalidates the passed Python object. nanobind will refuse further use of it:

Python 3.11.1 (main, Dec 23 2022, 09:28:24) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import my_ext

>>> x = my_ext.create()
>>> my_ext.consume(x)

>>> my_ext.consume(x)
<stdin>:1: RuntimeWarning: nanobind: attempted to access an uninitialized instance of type 'my_ext.Data'!

TypeError: consume(): incompatible function arguments. The following argument types are supported:
    1. consume(arg: my_ext.Data, /) -> None

Invoked with types: my_ext.Data
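The same hand-over can be observed in plain C++, where passing a std::unique_ptr by value requires ``std::move`` and leaves the caller's pointer empty. The sketch below is independent of nanobind; ``consume`` mirrors the lambda bound above, and ``caller_is_empty_after_consume`` is a hypothetical helper added for illustration.

```cpp
#include <memory>
#include <utility>

struct Data { };

// Takes ownership: the instance is destroyed when 'x' goes out of scope.
void consume(std::unique_ptr<Data> x) { (void) x; }

// Returns true if the caller's pointer is empty after the transfer.
bool caller_is_empty_after_consume() {
    std::unique_ptr<Data> x = std::make_unique<Data>();
    consume(std::move(x)); // ownership moves into the callee
    return x == nullptr;   // a moved-from unique_ptr is guaranteed to be null
}
```

In C++ the moved-from pointer simply becomes null. On the Python side there is no equivalent of a null object, which is why nanobind instead marks the instance as unusable and rejects further calls.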

We strongly recommend that you replace all uses of std::unique_ptr<T> with std::unique_ptr<T, nb::deleter<T>> in your code. Without the latter type declaration, which references a custom nanobind-provided deleter :cpp:class:`nb::deleter\<T\> <deleter>`, nanobind cannot transfer ownership of objects constructed using :cpp:class:`nb::init\<...\> <init>` to C++ and will refuse to do so with an error message. Further detail on this special case can be found in the advanced :ref:`section <unique_ptr_adv>` on object ownership.

Shared ownership

In a shared ownership model, an object can have multiple owners that each register their claim by holding a reference. The system keeps track of the total number of references and destroys the object once the count reaches zero. Passing such an object in a function call shares ownership between the caller and callee. nanobind makes this behavior seamless so that everything works regardless of whether caller/callee are written in C++ or Python.

Shared pointers

STL shared pointers (std::shared_ptr<T>) allocate a separate control block to keep track of the reference count, which makes them very general but also slightly less efficient than other alternatives.

nanobind's support for shared pointers requires an extra include directive:

#include <nanobind/stl/shared_ptr.h>

Note

While this header file technically contains a :ref:`type caster <type_casters>`, it is not affected by their usual limitations (mandatory copy/conversion, inability to mutate function arguments).

You don't need to specify a return value policy annotation when a function returns a shared pointer.

nanobind's implementation of std::shared_ptr support typically allocates a new shared_ptr control block each time a Python object must be converted to std::shared_ptr<T>. The new shared_ptr "owns" a reference to the Python object, and its deleter drops that reference. This has the advantage that the Python portion of the object will be kept alive by its C++-side references (which is important when implementing C++ virtual methods in Python), but it can be inefficient when passing the same object back and forth between Python and C++ many times, and it means that the use_count() method of std::shared_ptr will return a value that does not capture all uses. Some of these problems can be mitigated by modifying T so that it inherits from std::enable_shared_from_this<T>. See the :ref:`advanced section <shared_ptr_adv>` on object ownership for more details on the implementation.
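The bookkeeping described above can be imitated with the standard library alone. The following sketch is a simplified model under stated assumptions, not nanobind's code: ``FakePyObject`` stands in for a Python object with its own reference count, and ``to_shared``, ``Observation``, and ``observe`` are hypothetical names.

```cpp
#include <memory>

// Hypothetical stand-in for a Python object wrapping a C++ payload.
struct FakePyObject {
    int refcount = 1; // one reference held by the "Python side"
    int payload = 42;
};

// Creates a fresh shared_ptr (and control block) on every call. The
// shared_ptr holds one reference to the wrapper object, and its deleter
// drops that reference instead of deleting the payload.
std::shared_ptr<int> to_shared(FakePyObject *obj) {
    ++obj->refcount;
    return std::shared_ptr<int>(&obj->payload,
                                [obj](int *) { --obj->refcount; });
}

struct Observation {
    long use_count_a;
    long use_count_b;
    int refcount_while_held;
    int refcount_after_release;
};

Observation observe() {
    FakePyObject obj;
    Observation o{};
    {
        std::shared_ptr<int> a = to_shared(&obj);
        std::shared_ptr<int> b = to_shared(&obj);
        o.use_count_a = a.use_count(); // each control block sees one owner
        o.use_count_b = b.use_count();
        o.refcount_while_held = obj.refcount;
    }
    o.refcount_after_release = obj.refcount;
    return o;
}
```

In this model, both ``a`` and ``b`` report a use count of 1 even though three parties hold the object, which illustrates why use_count() does not capture all uses when every conversion creates a new control block.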

nanobind has limited support for objects that inherit from std::enable_shared_from_this<T> to allow safe conversion of raw pointers to shared pointers. The safest way to deal with these objects is to always use std::make_shared<T>(...) when constructing them in C++, and always pass them across the Python/C++ boundary wrapped in an explicit std::shared_ptr<T>. If you do this, then there shouldn't be any surprises. If you will be passing raw T* pointers around, then read the :ref:`advanced section on object ownership <enable_shared_from_this>` for additional caveats.
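For reference, this is how std::enable_shared_from_this<T> behaves in plain C++. The sketch uses only the standard library; ``Node`` and the helper functions are hypothetical names, and the exception in the second helper is guaranteed only from C++17 onward.

```cpp
#include <memory>

// Hypothetical type that opts in to shared_from_this().
struct Node : std::enable_shared_from_this<Node> { };

// Recovers an owning pointer from a raw pointer to an object that is already
// managed by a shared_ptr, and reports the resulting use count.
long use_count_after_recovering(Node *raw) {
    std::shared_ptr<Node> again = raw->shared_from_this();
    return again.use_count();
}

// The object is created via make_shared, so both pointers share one control block.
long use_count_for_make_shared_object() {
    std::shared_ptr<Node> first = std::make_shared<Node>();
    return use_count_after_recovering(first.get());
}

// An object that no shared_ptr owns cannot be converted (C++17: throws).
bool throws_without_owner() {
    Node local;
    try {
        local.shared_from_this();
    } catch (const std::bad_weak_ptr &) {
        return true;
    }
    return false;
}
```

The second helper shows why constructing such objects with std::make_shared<T>(...) is the safe route: a raw pointer to an instance that no shared pointer owns cannot be turned into one.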

Intrusive reference counting

Intrusive reference counting is the most flexible and efficient way of handling shared ownership. The main downside is that you must adapt the base class of your object hierarchy to the needs of nanobind.

The core idea is to define a base class (e.g., Object) common to all bound types requiring shared ownership. That class contains a built-in atomic counter (e.g., m_ref_count) and a Python object pointer (e.g., m_py_object).

class Object {
...
private:
    mutable std::atomic<size_t> m_ref_count { 0 };
    PyObject *m_py_object = nullptr;
};

Such Object instances can be managed either by C++ or by Python. In the former case, the m_ref_count field keeps track of the number of outstanding references. In the latter case, reference counting is handled by Python, and the m_ref_count field remains unused.
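The C++-managed case can be sketched as follows. This is a minimal illustration that omits the Python side entirely; ``inc_ref``, ``dec_ref``, ``ref_count``, the ``Ref`` smart pointer, and ``g_destroyed`` are hypothetical names chosen for this example and are not taken from nanobind.

```cpp
#include <atomic>
#include <cstddef>

int g_destroyed = 0; // hypothetical counter so that destruction is observable

class Object {
public:
    virtual ~Object() { ++g_destroyed; }

    // Register a new owner.
    void inc_ref() const noexcept {
        m_ref_count.fetch_add(1, std::memory_order_relaxed);
    }

    // Drop an owner; the last one to leave destroys the object.
    void dec_ref() const noexcept {
        if (m_ref_count.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete this;
    }

    std::size_t ref_count() const noexcept { return m_ref_count.load(); }

private:
    mutable std::atomic<std::size_t> m_ref_count { 0 };
};

// Minimal smart pointer that adjusts the embedded counter.
template <typename T> class Ref {
public:
    explicit Ref(T *ptr) : m_ptr(ptr) { if (m_ptr) m_ptr->inc_ref(); }
    Ref(const Ref &other) : m_ptr(other.m_ptr) { if (m_ptr) m_ptr->inc_ref(); }
    Ref &operator=(const Ref &) = delete;
    ~Ref() { if (m_ptr) m_ptr->dec_ref(); }
    T *get() const { return m_ptr; }
private:
    T *m_ptr;
};

// Two owners share one instance; it is destroyed once both are gone.
std::size_t count_with_two_refs() {
    Ref<Object> a(new Object());
    Ref<Object> b(a);
    return a.get()->ref_count();
}
```

Because the counter lives inside the object, any raw ``Object *`` can be turned back into an owning reference without a separate control block, which is what distinguishes this design from std::shared_ptr.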

Storing both a counter and a Python object pointer is a little wasteful. nanobind therefore ships with a more efficient reference counter sample implementation that supports both use cases while requiring only sizeof(void*) bytes of storage:

#include <nanobind/intrusive/counter.h>

class Object {
...
private:
    intrusive_counter m_ref_count;
};

Please read the dedicated :ref:`section on intrusive reference counting <intrusive>` for more details on how to set this up.