Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Array4: __array_interface__ #17

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
160 changes: 160 additions & 0 deletions src/Base/Array4.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,160 @@
/* Copyright 2021 The AMReX Community
*
* Authors: Axel Huebl
* License: BSD-3-Clause-LBNL
*/
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <pybind11/stl.h>

#include <AMReX_Config.H>
#include <AMReX_Array4.H>
#include <AMReX_IntVect.H>

#include <sstream>
#include <type_traits>

namespace py = pybind11;
using namespace amrex;


template< typename T >
void make_Array4(py::module &m, std::string typestr)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in #19: we want to make sure that users do not construct Array4s on their own and only get them from MFIter loops

{
// dispatch simpler via: py::format_descriptor<T>::format() naming
auto const array_name = std::string("Array4_").append(typestr);
py::class_< Array4<T> >(m, array_name.c_str(), py::buffer_protocol())
.def("__repr__",
[](Array4<T> const & a4) {
std::stringstream s;
s << a4.size();
return "<amrex.Array4 of size '" + s.str() + "'>";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should indicate the polymorphic type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in #19

}
)
#if defined(AMREX_DEBUG) || defined(AMREX_BOUND_CHECK)
.def("index_assert", &Array4<T>::index_assert)
#endif

.def_property_readonly("size", &Array4<T>::size)
.def_property_readonly("nComp", &Array4<T>::nComp)

.def(py::init< >())
.def(py::init< Array4<T> const & >())
.def(py::init< Array4<T> const &, int >())
.def(py::init< Array4<T> const &, int, int >())
//.def(py::init< T*, Dim3 const &, Dim3 const &, int >())

.def(py::init([](py::array_t<T> & arr) {
py::buffer_info buf = arr.request();

auto a4 = std::make_unique< Array4<T> >();
a4.get()->p = (T*)buf.ptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read this correctly, this aliases the Array4 to the python buffer. Idk enough about the buffer protocol, so I am asking here: will arr be kept alive as long as this instance of pyamrex.Array4?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what happens when the user changes to contents of arr?

Copy link
Member Author

@ax3l ax3l Feb 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, at the moment (and I am just trying the design of the data access) I made the Python Array4 behave like the C++ Array4: it's purely a view to data that is managed elsewhere.

If you create a numpy array and pass it to an Array4, the data is owned by Python. If you return an Array4 that is created on from C++ data, then the C++ side owns the data.

This might not be the final design because it can shoot Python users in the foot and I'll discuss with @sayerhs (feel free to join!) how to design this in the end. In his prior design, he basically did not expose Array4 as a user-creatable type at all but instead used it to return views to C++ data only. This sounds to me like something we might want to do again - but I need to catch up with him to see the details.

a4.get()->begin = Dim3{0, 0, 0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs to be adjusted to a box index space

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this function here is mainly to support views into numpy and other python buffer objects via Array4.
In #19 currently mapped to a zero-based index space for incoming buffer protocols.

Potentially can implement shift-like functions later on to adjust the index space if we like to.

// TODO: likely C->F index conversion here
// p[(i-begin.x)+(j-begin.y)*jstride+(k-begin.z)*kstride+n*nstride];
a4.get()->end.x = (int)buf.shape.at(0);
a4.get()->end.y = (int)buf.shape.at(1);
a4.get()->end.z = (int)buf.shape.at(2);
a4.get()->ncomp = 1;
// buffer protocol strides are in bytes, AMReX strides are elements
a4.get()->jstride = (int)buf.strides.at(0) / sizeof(T);
a4.get()->kstride = (int)buf.strides.at(1) / sizeof(T);
a4.get()->nstride = (int)buf.strides.at(2) * (int)buf.shape.at(2) / sizeof(T);
return a4;
}))


.def_property_readonly("__array_interface__", [](Array4<T> const & a4) {
auto d = py::dict();
auto const len = length(a4);
// TODO: likely F->C index conversion here
// p[(i-begin.x)+(j-begin.y)*jstride+(k-begin.z)*kstride+n*nstride];
auto shape = py::make_tuple( // Buffer dimensions
len.x < 0 ? 0 : len.x,
len.y < 0 ? 0 : len.y,
len.z < 0 ? 0 : len.z//, // zero-size shall not have negative dimension
//a4.ncomp
);
Comment on lines +72 to +77
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this handle ghost cells?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet at all, let's have a VC about how you solved it in the past.

So far I am mainly experimenting with the data interface (trying the GPU version next)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled in #19

// buffer protocol strides are in bytes, AMReX strides are elements
auto const strides = py::make_tuple( // Strides (in bytes) for each index
sizeof(T) * a4.jstride,
sizeof(T) * a4.kstride,
sizeof(T)//,
//sizeof(T) * a4.nstride
);
d["data"] = py::make_tuple(long(a4.dataPtr()), false);
d["typestr"] = py::format_descriptor<T>::format();
d["shape"] = shape;
d["strides"] = strides;
// d["strides"] = py::none();
d["version"] = 3;
return d;
})

// not sure if useful to have this implemented on top
/*
.def_buffer([](Array4<T> & a4) -> py::buffer_info {
auto const len = length(a4);
// TODO: likely F->C index conversion here
// p[(i-begin.x)+(j-begin.y)*jstride+(k-begin.z)*kstride+n*nstride];
auto shape = { // Buffer dimensions
len.x < 0 ? 0 : len.x,
len.y < 0 ? 0 : len.y,
len.z < 0 ? 0 : len.z//, // zero-size shall not have negative dimension
//a4.ncomp
};
// buffer protocol strides are in bytes, AMReX strides are elements
auto const strides = { // Strides (in bytes) for each index
sizeof(T) * a4.jstride,
sizeof(T) * a4.kstride,
sizeof(T)//,
//sizeof(T) * a4.nstride
};
return py::buffer_info(
a4.dataPtr(),
shape,
strides
);
})
*/
;
}

void init_Array4(py::module &m) {
make_Array4< float >(m, "float");
make_Array4< double >(m, "double");
make_Array4< long double >(m, "longdouble");

make_Array4< short >(m, "short");
make_Array4< int >(m, "int");
make_Array4< long >(m, "long");
make_Array4< long long >(m, "longlong");

make_Array4< unsigned short >(m, "ushort");
make_Array4< unsigned int >(m, "uint");
make_Array4< unsigned long >(m, "ulong");
make_Array4< unsigned long long >(m, "ulonglong");

// std::complex< float|double|long double> ?

/*
py::class_< PolymorphicArray4, Array4 >(m, "PolymorphicArray4")
.def("__repr__",
[](PolymorphicArray4 const & pa4) {
std::stringstream s;
s << pa4.size();
return "<amrex.PolymorphicArray4 of size '" + s.str() + "'>";
}
)
;
*/

// free standing C++ functions:
/*
contains
lbound
ubound
length
makePolymorphic
*/
}
1 change: 1 addition & 0 deletions src/Base/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
target_sources(pyAMReX
PRIVATE
AMReX.cpp
Array4.cpp
Box.cpp
Dim3.cpp
IntVect.cpp
Expand Down
2 changes: 2 additions & 0 deletions src/pyAMReX.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ namespace py = pybind11;

// forward declarations of exposed classes
void init_AMReX(py::module&);
void init_Array4(py::module&);
void init_Box(py::module &);
void init_Dim3(py::module&);
void init_IntVect(py::module &);
Expand All @@ -37,6 +38,7 @@ PYBIND11_MODULE(amrex_pybind, m) {
init_AMReX(m);
init_Dim3(m);
init_IntVect(m);
init_Array4(m);
init_Box(m);

// API runtime version
Expand Down
86 changes: 86 additions & 0 deletions tests/test_array4.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
# -*- coding: utf-8 -*-

import pytest
import numpy as np
import amrex

def test_array4_empty():
empty = amrex.Array4_double()

# Check properties
assert(empty.size == 0)
assert(empty.nComp == 0)

# assign empty
emptyc = amrex.Array4_double(empty)
# Check properties
assert(emptyc.size == 0)
assert(emptyc.nComp == 0)

def test_array4():
# from numpy (also a non-owning view)
x = np.ones((2, 3, 4,))
arr = amrex.Array4_double(x)

x[1, 1, 1] = 42
# TypeError: 'amrex.amrex_pybind.Array4_double' object is not subscriptable
# assert(arr[1, 1, 1] == 42)

# copy to numpy
c_arr2np = np.array(arr, copy=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dumb question -- but I need to learn eventually -- is this the "official" way to define a amrex -> python deepcopy (i.e. using the np.array constructor)? Or do we need to define a dedicated deepcopy also?

Copy link
Contributor

@JBlaschke JBlaschke Feb 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Followup question: do we need to define a non-aliasing constructor (i.e. python -> amrex deepcopy)?

I can imaging that for data analytics applications we might want to keep a time series in python (looking at you machine learning folks) -- for that we need the former (amrex -> python) which is covered by this.

For experimental science applications we frequently have python provide a data source. A python -> amrex deepcopy (probably a simple copy constructor?) would be helpful in ensuring that the data is owned by amrex.

Come to think of it: A lot of algorithms in amrex are involve creating copies of data and then applying functions to them. Users might want to preserve this workflow.

My opinion is that this case best approach would be to expose the Array4 copy constructor (assuming that hasn't been deleted), and settle on the workflow: i) shallow-copy into amrex; ii) call copy constructor to create a deep copy.

Copy link
Member Author

@ax3l ax3l Feb 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some details above. This is just a view at the moment, the same way as the C++ Array4 is a plain view to non-owned data.

I just added the numpy overload so I can test around with it, maybe in the end we do not let users create this type directly because it will easily lead to memory issues if used wrongly (and segfaulting interpreters are never well received by users, even if they caused it by not following an API contract).

assert(c_arr2np.ndim == 3)
#assert(c_arr2np.ndim == 4)
assert(c_arr2np.dtype == np.dtype("double"))
np.testing.assert_array_equal(x, c_arr2np)
assert(c_arr2np[1, 1, 1] == 42)

# view to numpy
v_arr2np = np.array(arr, copy=False)
assert(v_arr2np.ndim == 3)
#assert(c_arr2np.ndim == 4)
assert(v_arr2np.dtype == np.dtype("double"))
np.testing.assert_array_equal(x, v_arr2np)
assert(v_arr2np[1, 1, 1] == 42)

# change original buffer once more
x[1, 1, 1] = 43
assert(v_arr2np[1, 1, 1] == 43)

# copy array4 (view)
c_arr = amrex.Array4_double(arr)
v_carr2np = np.array(c_arr, copy=False)
x[1, 1, 1] = 44
assert(v_carr2np[1, 1, 1] == 44)

# from cupy

# to numpy

# to cupy

return

# Check indexing
assert(obj[0] == 1)
assert(obj[1] == 2)
assert(obj[2] == 3)
assert(obj[-1] == 3)
assert(obj[-2] == 2)
assert(obj[-3] == 1)
with pytest.raises(IndexError):
obj[-4]
with pytest.raises(IndexError):
obj[3]

# Check assignment
obj[0] = 2
obj[1] = 3
obj[2] = 4
assert(obj[0] == 2)
assert(obj[1] == 3)
assert(obj[2] == 4)

#def test_iv_conversions():
# obj = amrex.IntVect.max_vector().numpy()
# assert(isinstance(obj, np.ndarray))
# assert(obj.dtype == np.int32)