Improve type_caster for complex number types.#854

Merged
wjakob merged 3 commits into wjakob:master from hpkfft:complex
Feb 2, 2025

Conversation

@hpkfft
Contributor

hpkfft commented Jan 12, 2025

Results for this PR were obtained with Python 3.12.8 and g++ 14.2.0.

Currently, after importing the module defined in tests/test_stl.cpp as t and defining

    class MyComplex:
        def __init__(self, real, imag):
            self.re = real
            self.im = imag
        def __complex__(self):
            return complex(self.re, self.im)

the following raises a TypeError:

    assert t.complex_value_double(MyComplex(1, 2)) == complex(1.0, 2.0)

Instead, the __complex__ method should be called and the assertion should pass.
Note that the Stable ABI for PyComplex_RealAsDouble and PyComplex_ImagAsDouble was fixed in version 3.13. See CPython issue 109598.
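For reference, Python's built-in complex() constructor already follows this protocol, which is the behavior the caster should mirror (a pure-Python sketch repeating the class above):

```python
class MyComplex:
    def __init__(self, real, imag):
        self.re = real
        self.im = imag

    def __complex__(self):
        # invoked by complex(obj); since CPython 3.13 the stable-ABI
        # PyComplex accessor functions honor this protocol as well
        return complex(self.re, self.im)

z = complex(MyComplex(1, 2))
assert z == complex(1.0, 2.0)
```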

Also, the current complex type_caster does not honor nb::arg().noconvert().
The following test passes

    with pytest.raises(TypeError, match="incompatible function arguments"):
        t.complex_value_double_nc(0)

but given

    class MyInt(int):
        def __new__(cls, value):
            return super().__new__(cls, value)

the following tests fail with DID NOT RAISE <class 'TypeError'>:

    with pytest.raises(TypeError, match="incompatible function arguments"):
        t.complex_value_double_nc(MyInt(7))

    with pytest.raises(TypeError, match="incompatible function arguments"):
        t.complex_value_double_nc(0.0)

    with pytest.raises(TypeError, match="incompatible function arguments"):
        t.complex_value_float_nc(1.1 + 2.0j)  # Inexact narrowing conversion
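The float narrowing case can be reproduced without nanobind; a sketch using the stdlib struct module to round-trip a value through IEEE single precision:

```python
import struct

def to_float32(x: float) -> float:
    # round-trip a Python float (IEEE double) through IEEE single precision
    return struct.unpack("f", struct.pack("f", x))[0]

# 1.1 is not exactly representable in single precision, so converting
# 1.1 + 2.0j to complex<float> would silently lose precision...
assert to_float32(1.1) != 1.1
# ...while 2.0 converts exactly.
assert to_float32(2.0) == 2.0
```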

This PR re-writes the complex type_caster.

Regarding performance, given the extension

    m.def("cadd", [](std::complex<double> x, std::complex<double> y) {
                          return x + y; });

    m.def("caddf", [](std::complex<float> x, std::complex<float> y) {
                          return x + y; });

and measuring time with

    python3 -m timeit -n 100000000 -r 10 \
        -s "import numpy as np, my_extension as me; x = complex(1); y = complex(2)" \
        "me.cadd(x, y)"

I obtained the following results:

                             old 312   new 312   old abi3   new abi3
    cadd(complex)            27.6 ns   23.7 ns   31.0 ns    27.5 ns
    caddf(complex)           28.0 ns   26.5 ns   29.2 ns    29.8 ns
    cadd(np.complex128)      42.8 ns   30.0 ns   45.7 ns    44.1 ns
    caddf(np.complex64)      393. ns   81.1 ns   403. ns    284. ns

For the last two lines above, note that

    >>> issubclass(np.complex128, complex)
    True
    >>> issubclass(np.complex64, complex)
    False

I would expect that compiling new abi3 with -DPy_LIMITED_API=0x030D0000 would reduce caddf(np.complex64) by 2X, but I don't have Python 3.13 handy.

@wjakob
Owner

wjakob commented Jan 14, 2025

Like with the floating point type casters, could you move the (now fairly large) body of the converter to the static library component?

@hpkfft
Contributor Author

hpkfft commented Jan 14, 2025

Sure. Do you prefer not to include <complex> in the static library, or is it OK to do so for a single cpp file?

@wjakob
Owner

wjakob commented Jan 14, 2025

It's fine to include in the cpp file.

@hpkfft
Contributor Author

hpkfft commented Jan 15, 2025

I ran the caddf(np.complex64) benchmark with Python 3.13.1:

                             (nsec)
    old 313 non-STABLE         380.
    new 313 non-STABLE          82.1
    old abi3 0x030C0000        384.
    new abi3 0x030C0000        274.
    new abi3 0x030D0000        136.

The old complex caster used the C API functions from the stable ABI regardless of whether STABLE_ABI was specified.
Prior to version 3.13, these functions had a bug (or, at least, some undesirable behavior) when the Python object was neither complex nor a subclass of complex. Since np.complex64 is important to nanobind users, a workaround was needed. The default from_python</*Recursive=*/true> made a temporary Python object and called from_python</*Recursive=*/false> on it. It did this using the heuristic that an object with an attribute named imag should probably be converted to complex. This is not foolproof, but it works for np.complex64.
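That heuristic can be sketched in pure Python (hypothetical helper and class names; the real logic lives in the C++ caster):

```python
def looks_complex_like(obj) -> bool:
    # hypothetical sketch of the old workaround's heuristic: anything that
    # is not already a complex (or subclass) but exposes an ``imag``
    # attribute is probably convertible to complex
    return not isinstance(obj, complex) and hasattr(obj, "imag")

class Complex64Stub:
    # stand-in for np.complex64, which is NOT a subclass of complex
    imag = 2.0
    def __complex__(self):
        return complex(1.0, 2.0)

assert looks_complex_like(Complex64Stub())
assert not looks_complex_like(1 + 2j)      # already complex: no conversion
# a plain int also exposes ``imag``, which is why the heuristic
# is not foolproof
assert looks_complex_like(7)
```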

The new complex caster uses PyComplex_AsCComplex(), which is not in the stable ABI, when STABLE_ABI is not specified. This is faster, and no workaround is needed.

The new caster uses the same complex C API functions as the old one did when STABLE_ABI is specified. A workaround is applied when the object is not complex, not a subtype of complex, and has the attribute __complex__. Also, the workaround was made faster. The function from_python() is no longer templated.

Also, templating is removed from from_cpp() since there seems to be no reason to distinguish rvalues from lvalues.

For non-STABLE_ABI builds, the new caster is much faster, and for STABLE_ABI builds it's faster.

As this Note mentions, the behavior of a stable ABI function can change. In particular, the workaround is no longer needed in version 3.13. But it is still perfectly correct to define Py_LIMITED_API=0x030C0000 and build with the workaround code in place.

wjakob force-pushed the master branch 2 times, most recently from 98def19 to 92d9cb3 on January 27, 2025
wjakob merged commit 90ef5bc into wjakob:master on Feb 2, 2025
@wjakob
Owner

wjakob commented Feb 2, 2025

Thanks!

@wjakob
Owner

wjakob commented Feb 2, 2025

Actually, one question I realized after merging: is the && !PyType_IsSubtype(Py_TYPE(ob), &PyComplex_Type) check really needed? I'm thinking that calling the complex() constructor should still do the right cast from a subtype.

@hpkfft
Contributor Author

hpkfft commented Feb 3, 2025

Yes, the code would still be correct without that check, but it would be slower. Referring to the table in my original comment for this PR, the case cadd(np.complex128) is 44 ns for abi3 because NumPy's complex128 is a subtype of Python's complex. Without this check, that would increase to about 280 ns.

The old code effectively had this check, because it used PyComplex_Check() to test whether to avoid calling the complex() constructor. This check (after macro expansion) is found in Python's object.h:

static inline int PyObject_TypeCheck(PyObject *ob, PyTypeObject *type) {
    return Py_IS_TYPE(ob, type) || PyType_IsSubtype(Py_TYPE(ob), type);
}

The new code already has the boolean is_complex for an exact type match, so I used PyType_IsSubtype() directly to retain this optimization. I think Numpy complex double precision, np.complex128, is an important enough use case to justify avoiding a performance regression.
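The exact-match versus subtype distinction can be illustrated in pure Python (MyComplex128 below is a hypothetical stand-in for np.complex128):

```python
class MyComplex128(complex):
    # stand-in for np.complex128, which IS a subclass of Python's complex
    pass

z = MyComplex128(1, 2)

# exact-type check (what PyComplex_CheckExact / ``is_complex`` tests) fails
assert type(z) is not complex
# subtype check (what the PyType_IsSubtype branch restores to the
# fast path) succeeds
assert isinstance(z, complex)
assert z == complex(1.0, 2.0)
```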

@wjakob
Owner

wjakob commented Feb 4, 2025

Thank you for clarifying!

hpkfft deleted the complex branch on February 4, 2025
@miklhh

miklhh commented Jun 4, 2025

Dear @hpkfft and @wjakob,

We just bumped our project from using Nanobind 2.4.0 to Nanobind 2.7.0. I was surprised to learn that

nb::isinstance<std::complex<double>>(python_obj)

no longer matches the complex-valued type numpy.complex128. I have verified that before the version bump, nb::isinstance<std::complex<double>> did indeed match numpy.complex128, but after the version bump, it does not.

After looking in the Nanobind changelog and browsing the source code a bit, I seem to have narrowed this change in behaviour down to this PR. I guess it might have something to do with the fact that the type_caster for complex numbers was changed with this PR, although I might be wrong. Was this an intended change in behaviour or an accidental one? Do you intend for nb::isinstance<std::complex<double>> to match the type numpy.complex128? Right now, that seems not to be the case.

@wjakob
Owner

wjakob commented Jun 4, 2025

@miklhh Do you include nanobind/stl/complex.h?

@hpkfft
Contributor Author

hpkfft commented Jun 5, 2025

This PR was primarily intended to avoid unintentional value-changing narrowing conversions (and the resulting loss of numerical precision). Something similar was done for real-valued types; see the nice table in #829.
This intentionally affects overload resolution of functions. See the changelog at the top (the "Files Changed" tab above).

I'm not sure what python_obj is exactly in your example above, or how it is bound using nanobind.

As a wild guess, I see the following:
https://github.com/apytypes/apytypes/blob/c99152b7221667d926e8353696cf55ec90f5d382/src/apycfixed_wrapper.cc#L371
and wonder if it should be &APyCFixed::from_complex.

@hpkfft
Contributor Author

hpkfft commented Jun 5, 2025

If I'm understanding the nanobind code correctly, nb::isinstance() does not allow the type caster to perform conversions.
In your case, a conversion is needed since the type numpy.complex128 is not exactly complex.
This PR intentionally checks for an exact type (Line 905): bool is_complex = PyComplex_CheckExact(ob);
This is consistent with the type caster for Python float (Line 953): bool is_float = PyFloat_CheckExact(o);

As an alternative to nb::isinstance(), perhaps consider cast() or try_cast()?
https://nanobind.readthedocs.io/en/latest/api_core.html#casting

@miklhh

miklhh commented Jun 8, 2025

Thank you both for your answers.

@miklhh Do you include nanobind/stl/complex.h?

Yes we do.

I'm not sure what python_obj is exactly in your example above, or how it is bound using nanobind.

I should have been clearer. The python_obj is of type nb::object in this context. We use nb::object because we want the Python-exported function to match any argument.

As an alternative to nb::isinstance(), perhaps consider cast() or try_cast()? https://nanobind.readthedocs.io/en/latest/api_core.html#casting

We ended up using try_cast<std::complex<double>>(python_obj, cpp_obj) as a "last resort" when nb::isinstance<std::complex<double>>(python_obj) does not match a np.complex128. The only downside I see in using try_cast() is that it matches significantly more types than just numpy.complex128. For example, try_cast<std::complex<double>>() matches the Python type float and np.int16 as well. I would have liked to be able to use nb::isinstance<std::complex<double>>() to match exactly the Python types complex or numpy.complex128. Anyway, for now, we have something that works. Thank you for your detailed answer @hpkfft.
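The breadth of convert-enabled casting mirrors what Python's own complex() constructor accepts (a pure-Python sketch; no nanobind involved):

```python
# complex() accepts any real number, which is why a convert-enabled cast
# to std::complex<double> also matches float and integer inputs
assert complex(1.0) == 1 + 0j      # float converts
assert complex(7) == 7 + 0j        # int (and NumPy integer scalars) convert
assert complex(1 + 2j) == 1 + 2j   # complex passes through unchanged
```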

@hpkfft
Contributor Author

hpkfft commented Jun 8, 2025

@miklhh, @oscargus,
The test nb::isinstance<std::complex<double>>() tries the type caster without allowing conversions.
As you've observed, nb::try_cast<std::complex<double>>() allows all conversions (because the second parameter defaults to bool convert = true).

There's also https://nanobind.readthedocs.io/en/latest/api_core.html#_CPPv4N8nanobind10isinstanceE6handle6handle
This has nothing to do with the nanobind type caster.

Note that numpy.complex128 is a subclass of Python's complex, while numpy.complex64 is not:

    >>> z = np.complex128(1+2j)
    >>> isinstance(z, complex)
    True
    >>> z = np.complex64(1+2j)
    >>> isinstance(z, complex)
    False

The following, I think, will match Python's complex or any subclass:

    m.def("iscomplex", [](const nb::object& obj) {
            return nb::isinstance(obj, nb::handle(&PyComplex_Type)); });
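At the Python level, that binding should behave like a plain isinstance check against complex (a sketch; MyComplex128 is a hypothetical subclass standing in for np.complex128):

```python
def iscomplex(obj) -> bool:
    # Python-level equivalent of the nb::isinstance(obj, &PyComplex_Type)
    # binding sketched above
    return isinstance(obj, complex)

class MyComplex128(complex):
    # hypothetical stand-in for np.complex128
    pass

assert iscomplex(1 + 2j)
assert iscomplex(MyComplex128(1, 2))   # subclasses match too
assert not iscomplex(1.0)              # float does not
```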
