Omp target offload #1212
@@ -34,20 +34,25 @@

namespace galsim
{

    class PUBLIC_API Silicon
    {
    public:
        Silicon(int numVertices, double numElec, int nx, int ny, int qDist,
                double diffStep, double pixelSize, double sensorThickness, double* vertex_data,
                const Table& tr_radial_table, Position<double> treeRingCenter,
                const Table& abs_length_table, bool transpose);

        template <typename T>
        bool insidePixel(int ix, int iy, double x, double y, double zconv,
                         ImageView<T> target, bool* off_edge=0) const;

        void scaleBoundsToPoly(int i, int j, int nx, int ny,
        ~Silicon();

        bool insidePixel(int ix, int iy, double x, double y, double zconv,
                         Bounds<int>& targetBounds, bool* off_edge,
                         int emptypolySize,
                         Bounds<double>* pixelInnerBoundsData,
                         Bounds<double>* pixelOuterBoundsData,
                         Position<float>* horizontalBoundaryPointsData,
                         Position<float>* verticalBoundaryPointsData,
                         Position<double>* emptypolyData) const;

        void scaleBoundsToPoly(int i, int j, int nx, int ny,
                               const Polygon& emptypoly, Polygon& result,
                               double factor) const;

@@ -72,6 +77,8 @@ namespace galsim
        template <typename T>
        void initialize(ImageView<T> target, Position<int> orig_center);

        void finalize();

        template <typename T>
        double accumulate(const PhotonArray& photons, int i1, int i2,
                          BaseDeviate rng, ImageView<T> target);

@@ -249,6 +256,12 @@ namespace galsim

        void updatePixelBounds(int nx, int ny, size_t k);

        void updatePixelBoundsGPU(int nx, int ny, size_t k,
                                  Bounds<double>* pixelInnerBoundsData,
                                  Bounds<double>* pixelOuterBoundsData,
                                  Position<float>* horizontalBoundaryPointsData,
                                  Position<float>* verticalBoundaryPointsData);

        Polygon _emptypoly;
        mutable std::vector<Polygon> _testpoly;

@@ -265,6 +278,19 @@ namespace galsim
        Table _abs_length_table;
        bool _transpose;
        ImageAlloc<double> _delta;
        std::shared_ptr<bool> _changed;
I'm pretty sure this is wrong. This used to be a local vector, and _changed.get() is still being used as a bare C array. This might explain the crashes you were seeing. After fixing this, you could probably put back the min/max usage, which IMO is more readable than the two-step version you switched to.

I moved _changed to be a member variable rather than a local one to avoid the overhead of allocating it on the GPU every time update is called. The reason it's now a shared_ptr to a bare array instead of a vector is that bool vectors have an optimised implementation that packs multiple bools into each byte, so you can't get a pointer to the raw data, which the GPU requires. I agree that std::min and std::max would be more readable, but surprisingly the GPU doesn't support them (I was using them initially, but they caused weird bugs that took a long time to track down). We could implement custom, GPU-friendly min and max functions rather than writing out the algorithm, though.
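To make the std::vector<bool> point in this thread concrete, here is a minimal C++ sketch. It is not code from this PR: makeBoolBuffer, makeFlagVector, and countChanged are hypothetical names. It assumes the array is allocated with new bool[n]; in that case a std::shared_ptr<bool> needs an explicit array deleter, because the default deleter would call delete rather than delete[]. The thread does not say whether this is what the reviewer objected to, so treat it as one possible pitfall rather than a diagnosis.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper (not from the PR): allocate a zero-initialised bool
// array owned by a shared_ptr<bool>. The explicit array deleter ensures the
// memory is released with delete[] rather than delete.
inline std::shared_ptr<bool> makeBoolBuffer(std::size_t n)
{
    assert(n > 0);
    return std::shared_ptr<bool>(new bool[n](), std::default_delete<bool[]>());
}

// Hypothetical alternative: a byte vector stores one element per flag in
// contiguous memory (unlike the packed std::vector<bool>), so data() gives
// a raw pointer to the flags.
inline std::vector<unsigned char> makeFlagVector(std::size_t n)
{
    return std::vector<unsigned char>(n, 0);
}

// Counts set flags through a raw pointer, which is how an offloaded loop
// would have to access the buffer.
inline std::size_t countChanged(const bool* changed, std::size_t n)
{
    std::size_t count = 0;
    for (std::size_t k = 0; k < n; ++k) {
        if (changed[k]) ++count;
    }
    return count;
}
```

The byte-vector variant is one option if vector-style ownership is preferred over a bare array, at the cost of storing unsigned char rather than bool.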

        // GPU data
        std::vector<double> _abs_length_table_GPU;
        std::vector<Position<double> > _emptypolyGPU;
        double _abs_length_arg_min, _abs_length_arg_max;
        double _abs_length_increment;
        int _abs_length_size;

        // need to keep a pointer to the last target image's data and its data type
        // so we can release it on the GPU later
        void* _targetData;
        bool _targetIsDouble;
    };

    PUBLIC_API int SetOMPThreads(int num_threads);
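
The reply on _changed above suggests custom, GPU-friendly min and max functions in place of std::min and std::max. A minimal sketch of what such helpers might look like follows. The names gpuMin, gpuMax, and growRange are hypothetical and do not appear in the PR. The declare target directives are guarded so that the code also compiles without OpenMP.

```cpp
#include <cassert>

#ifdef _OPENMP
#pragma omp declare target
#endif

// Hypothetical helpers: comparison-only min and max that do not depend on
// <algorithm>, so they can be compiled for the device as well as the host.
template <typename T>
inline T gpuMin(T a, T b) { return (b < a) ? b : a; }

template <typename T>
inline T gpuMax(T a, T b) { return (a < b) ? b : a; }

#ifdef _OPENMP
#pragma omp end declare target
#endif

// Host-side usage example: widen the range [lo, hi] so that it contains x.
inline void growRange(double& lo, double& hi, double x)
{
    assert(lo <= hi);
    lo = gpuMin(lo, x);
    hi = gpuMax(hi, x);
}
```

Written this way, the call sites keep the one-line min/max form that the reviewer found more readable.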
Let's not have two ways to do this please. Rewrite the existing updatePixelBounds to work with the GPU. Don't repeat everything in a second function.
This is not fixed yet. You still have both of these functions.
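
As an illustration of the single-function approach requested above, the sketch below shows one loop body that serves both host and device. It is a simplified stand-in, not GalSim's bounds algorithm: updateBoundsAll and its parameters are hypothetical, and the per-pixel work is reduced to taking the minimum and maximum of m values. Without OpenMP the pragma is skipped and the loop runs serially; with OpenMP but no available device, the target region is expected to fall back to the host under the default offload policy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a per-pixel bounds update. pts holds nPix * m
// values (m per pixel); lo[k] and hi[k] receive the min and max for pixel k.
// Only raw pointers are used so the same body can be offloaded.
inline void updateBoundsAll(const double* pts, int nPix, int m,
                            double* lo, double* hi)
{
    assert(nPix > 0 && m > 0);
#ifdef _OPENMP
#pragma omp target teams distribute parallel for \
    map(to: pts[0:nPix*m]) map(from: lo[0:nPix], hi[0:nPix])
#endif
    for (int k = 0; k < nPix; ++k) {
        double mn = pts[k * m];
        double mx = pts[k * m];
        for (int p = 1; p < m; ++p) {
            const double v = pts[k * m + p];
            if (v < mn) mn = v;  // written out by hand instead of std::min
            if (v > mx) mx = v;  // written out by hand instead of std::max
        }
        lo[k] = mn;
        hi[k] = mx;
    }
}
```

Because the loop touches only raw arrays, a separate GPU variant of the function would not be needed in this simplified setting; whether the same holds for the real updatePixelBounds depends on details not shown in this diff.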