-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented batch processing for the indexed raster class. #433
Conversation
@elidwa I did not realize that std::unordered_map<std::string, bool> was thread safe. Can you confirm that? |
@jpswinski, std::unordered_map is not thread-safe when multiple threads are modifying it. However, in my context, it is safe because only the main thread populates the map, and the worker threads only read from it. This read-only operation is thread-safe. The SR dictionary was intended to behave the same way, but it did not. Without a mutex protecting access to the dictionary, the application was crashing, which did not make sense to me since the threads were not modifying the dictionary. |
@elidwa That is good to know, thanks! Independent of this PR, as you mentioned, the intended behavior of the SR dictionary is that reads would be thread safe. So I need to investigate why that is not the case (because it is certainly assumed elsewhere). |
I am using the spatial filter recommended by @dshean. After identifying all unique rasters within the AOI and the points that intersect each raster, batch-threaded readers process each raster for all its points. No raster is sampled by more than one thread, which greatly improves GDAL's performance.
Through extensive testing on an 8-core machine, I found that 20 reader threads is the optimal number. Adding more threads slows down performance, likely due to GDAL's global cache locks.
Prior to this change, sampling 64k points in 3DEP rasters took 140 seconds. Now it takes 35. Similarly, processing 525k points used to take 22 minutes, but now takes 4 minutes.
There is still a lot of room for further improvements, such as replacing vectors of features with a GEOS R-tree. I will continue working on these optimizations, but the current implementation is ready for use. I will also add more tests.
Note: These optimizations only apply to indexed class datasets. For single index files, such as VRTs, the code remains unchanged. We create multiple sampler objects and allow them to read the same VRT in parallel, each processing a different range of points. Once all the threads complete, the results are combined and returned. This part of the code has not been optimized yet.
I also resolved issue #430 by replacing the SR dictionary with std::unordered_map<std::string, bool> bandsDict. This fixed the problem, and the mutex is no longer required.