Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: Storing const reference to Eigen types in a class when working with nb::Dref causes data corruption. #802

Closed
MarcelFerrari opened this issue Nov 27, 2024 · 4 comments

Comments

@MarcelFerrari
Copy link

MarcelFerrari commented Nov 27, 2024

Problem description

Storing a const reference to an Eigen object as part of a class/structure that is initialised with a nb::Dref object compiles without any issues but results in data corruption.
If this is intended behaviour, then a compile-time error should be thrown.

Reproducible example code

C++ file

#include <nanobind/nanobind.h>
#include <nanobind/eigen/dense.h>
#include <iostream>

#include <Eigen/Dense>

namespace nb = nanobind;

struct ArrayRef{
    const Eigen::ArrayXi & ref;
    ArrayRef(const Eigen::ArrayXi & arr) : ref(arr) {}
    int operator[](int i) const { return ref[i]; }
    int size() const { return ref.size(); }
};

void correct(const nb::DRef<Eigen::ArrayXi> & b) {
    const Eigen::ArrayXi & ref = b;

    // Print contentents of the array
    for (int i = 0; i < ref.size(); i++) {
        std::cout << ref[i] << std::endl;
    }
}

void corrupt(const nb::DRef<Eigen::ArrayXi> & b) {
    const ArrayRef ref(b);

    // Print contents of the array
    for (int i = 0; i < ref.size(); i++) {
        std::cout << ref[i] << std::endl;
    }
}

// Wrapper functions for Python
NB_MODULE(array_ref, m) {
    m.def("correct", &correct, 
          "Works with Eigen Ref.");
    
    m.def("corrupt", &corrupt,
          "Crashes with custom ArrayRef.");
}

Python file

import array_ref as ar
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=np.int32)

# This will work
print("Correct data")
ar.correct(x)
print()
# This will corrupt the data
print("Corrupt data")
ar.corrupt(x)

Result

Correct data
1
2
3
4
5

Corrupt data
1431658479
5
-1276353212
514520861
5
@wjakob
Copy link
Owner

wjakob commented Dec 10, 2024

I'm surprised that you can just cast a nb::DRef<Eigen::ArrayXi> into a Eigen::ArrayXi & reference (in either of the two versions). One is a potentially non-contiguous slice into a external data structure, and the other is a Eigen-managed contiguous array. I would expect this to fail with a type casting issue. In any case, I don't think that the problem here is with nanobind (which takes care of the Python..C++ boundary), but rather with your code. CCing @WKarel for extra Eigen expertise.

@MarcelFerrari
Copy link
Author

@wjakob thanks for your reply.

The get started documentation states the following:

f3() uses nb::DRef to support any memory layout (row-major, column-major, slices) without copying. It may still perform an implicit conversion when called with the wrong data type—for example, the function expects a single precision array, but NumPy matrices often use double precision.

I assumed that in case of non-contiguous data, a copy operation was performed in order to enable the many Eigen optimisations that can only be performed on contiguous memory. However, I now see that "slices" are mentioned along side row-major and column-major layouts.

Nevertheless, it took me a surprising amount of time to debug this issue since I was testing with const Eigen::ArrayXi & ref = b; and the data looked fine. It would probably be best if a compile-time error is raised for this, especially since I was not working with slices or strange data layouts.

@WKarel
Copy link
Contributor

WKarel commented Dec 10, 2024

void correct(const nb::DRef<Eigen::ArrayXi> & b) {
    const Eigen::ArrayXi & ref = b;
}

Compiles fine for 2 reasons:

  1. Eigen::Array offers a converting constructor (that is enabled for your specialization of nb::DRef).
  2. The reference on the left side is const, and not mutable.

Because the reference is local, it extends the lifetime of the temporary used to initialize it.

void corrupt(const nb::DRef<Eigen::ArrayXi> & b) {
    const ArrayRef ref(b);
    ref.size();
}

Also compiles fine, but accesses invalid memory, because ArrayRef's constructor extends the lifetime of the temporary only until it returns. And its data member does not, because it is initialized from another reference. This is plain C++, nothing special to Eigen or even nanobind.

@wjakob
Copy link
Owner

wjakob commented Dec 12, 2024

Thank you @WKarel, I will close this issue.

@wjakob wjakob closed this as completed Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants