Add benchmarks for GeometryOps.jl (a Julia package) #12

Merged
merged 16 commits into from
Jul 12, 2024


asinghvi17
Contributor

@asinghvi17 asinghvi17 commented Apr 6, 2024

https://github.com/asinghvi17/GeometryOps.jl is a Julia package for (mostly) vector geometry operations. It's still pretty early stage, but I realized it could do half the operations in this benchmark, so wanted to get a foundation of code going.

This PR adds Julia capability to the run_benchmarks.sh, and a folder geometryops which contains:

  • Benchmark files for GeometryOps.jl.
  • A Julia Project.toml which defines a list of packages which must be installed, that Julia can be pointed to.
  • (Optional) A Julia plotting file in geometryops/plots.jl.

Here are the comparisons with GeometryOps:
[image: download-5]
(edited from the original)
This PR is still WIP, but is in a runnable state now.

@kadyb
Owner

kadyb commented Apr 7, 2024

Thanks, great idea! I will be happy to add this as several people have asked to include Julia in benchmarks. I don't know Julia personally, but I think I will test it during the holidays.

@evetion, what do you think?

@evetion

evetion commented Apr 7, 2024

This stuff is great, finally having generic native Julia code for things we would otherwise use GDAL/GEOS for (as do most libraries). In that sense, would be good to benchmark that as well (LibGEOS.jl).

I think we discussed benchmarking at least some file loading in Julia last summer, would be good to eventually include that here as well, but that's not the point of GeometryOps.jl.

PS. Where has GEOS gone in the graph?

@asinghvi17
Contributor Author

asinghvi17 commented Apr 7, 2024

Good point @evetion - I think I hadn't installed the R GEOS package on my machine then, so it didn't run. Posting updated benchmarks here (plus GeometryOps calling out to GEOS's buffer):
[image: download-12]

[image: download-10]

@kadyb
Owner

kadyb commented Apr 8, 2024

BTW: Have you seen this year's edition of Spatial Data Science across Languages organized by Martin in Prague? Maybe you will be interested as Julia programmers.

@martinfleis
Contributor

I've shared the invitation with @evetion I believe (and Martijn and Fabian) but happy to extend it! My knowledge of Julia-land is limited, so feel free to throw names at me.

@rafaqz

rafaqz commented Apr 11, 2024

I'll most likely be there; I'm also an author of GeometryOps.jl and based in the EU.

This ensures that LibGEOS functionality like buffer is available to GeometryOps directly.
@asinghvi17
Contributor Author

We have just released a new version of GeometryOps with support for buffer - this PR should be ready to merge after that!

@asinghvi17 asinghvi17 changed the title [WIP] Add benchmarks for GeometryOps.jl (a Julia package) Add benchmarks for GeometryOps.jl (a Julia package) Jun 17, 2024
@asinghvi17
Contributor Author

asinghvi17 commented Jul 11, 2024

I've just run and updated the PR with the latest changes to GeoDataFrames.jl, which uses GDAL's chunked writes to get some more speedup.

@kadyb this should now run with no additional setup, so the PR is good to merge from my end!

[image: comparison]

@kadyb
Owner

kadyb commented Jul 12, 2024

Thank you very much! I haven't had time to sit down with it yet, but I will look into it during the holidays. (There is one problem: I no longer have access to the machine on which I tested this, but I will ask someone to help.) The second issue is that we also need to update geopandas, because it now uses a new, faster engine (pyogrio) to load and save data.

So overall, based on the new results, Julia outperforms the R and Python packages and the GEOS binding. It would also be interesting to see what the performance of a binding to georust looks like (rsgeo).

And one more question that I am curious about. Will the geometryops binding from R/Python be the fastest of all the packages tested? If so, maybe Julia will eventually replace Rust and C++ in the future?

@kadyb kadyb merged commit e4c374b into kadyb:main Jul 12, 2024
@evetion

evetion commented Jul 12, 2024

> Thank you very much! I haven't had time to sit down with it yet, but I will look into it during the holidays. (There is one problem: I no longer have access to the machine on which I tested this, but I will ask someone to help.) The second issue is that we also need to update geopandas, because it now uses a new, faster engine (pyogrio) to load and save data.

I expect that pyogrio will get the read/write times to at least the same level as Julia. In the end, they all should be pretty similar (and limited by IO).

> So overall, based on the new results, Julia outperforms the R and Python packages and the GEOS binding. It would also be interesting to see what the performance of a binding to georust looks like (rsgeo).

Yeah, we should test it. Like pyogrio, I expect georust to be on par with Julia.

> And one more question that I am curious about. Will the geometryops binding from R/Python be the fastest of all the packages tested? If so, maybe Julia will eventually replace Rust and C++ in the future?

Not sure what you mean by that sentence. Julia is not a generic replacement for Rust and C++ (Rust might be for C++ though), but it certainly is easy to implement new algorithms in it, probably for a wider audience than if you did it in Rust or C++ (none of the linked authors is proficient in those languages).

@kadyb
Owner

kadyb commented Jul 12, 2024

> Not sure what you mean by that sentence.

I saw some benchmarks in which Julia demonstrated the same speed as low-level languages. If Julia has easier syntax and a lower entry barrier, then I think it could be a very good choice for writing geoprocessing algorithms compared to C++ or Rust. Moreover, we can see that geometryops is faster than the R binding to GEOS (probably the same is true for pygeos). Hence, I am also curious what the overhead of calling Julia from R looks like.

> Julia is not a generic replacement for Rust and C++

What are the limitations? Or why would Rust / C++ be better?

@evetion

evetion commented Jul 13, 2024

> I saw some benchmarks in which Julia demonstrated the same speed as low-level languages. If Julia has easier syntax and a lower entry barrier, then I think it could be a very good choice for writing geoprocessing algorithms compared to C++ or Rust. Moreover, we can see that geometryops is faster than the R binding to GEOS (probably the same is true for pygeos). Hence, I am also curious what the overhead of calling Julia from R looks like.

Agreed! Calling other languages will always cause overhead, and I'm not sure what that will be from R/Python to Julia. Much also has to do with the geometry types used. Seems like a good experiment for SDSL.

> What are the limitations? Or why would Rust / C++ be better?

Julia is dynamically typed (like Python/R), whereas Rust/C++ are statically typed. Julia has a garbage collector (like Python/R), whereas the other languages do not. So that makes Julia very easy and similar to Python and R, but we can't (yet) make small executables/libraries, or guarantee that the memory footprint is known beforehand and small enough for embedded systems.

@rafaqz

rafaqz commented Jul 13, 2024

Julia should have small compiled binaries soonish (currently they're too big).

We will experiment with calling GeometryOps.jl from R and python. If R/Python packages are wrapping GEOS we may be able to just rewrap the same C objects as Julia LibGEOS.jl objects, as GeometryOps.jl already accepts them without conversion (as a short term experiment with minimal changes).

For the most part GeometryOps.jl isn't actually dynamically typed; its algorithms are statically known (hence this performance). So in theory we will be able to compile good static binaries. But practically not yet.
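
(Illustration, not part of the original comment.) A minimal sketch of what "statically known" can mean in Julia, under stated assumptions: this is not GeometryOps.jl code, and `Point2` and `signed_area` are hypothetical names invented for the example. With concretely typed inputs, every type inside the function can be inferred at compile time, so the loop compiles to specialized machine code with no runtime dispatch.

```julia
# Hypothetical illustration only; not taken from GeometryOps.jl.
# `Point2` and `signed_area` are made-up names.

# A concretely typed, immutable point. The field types are fixed by the
# type parameter `T`, so the compiler knows the memory layout in advance.
struct Point2{T<:Real}
    x::T
    y::T
end

# Signed area of a closed ring via the shoelace formula.
# The ring is given without repeating the first point at the end.
function signed_area(ring::Vector{Point2{T}}) where {T<:Real}
    n = length(ring)
    acc = zero(T)                   # accumulator starts with the coordinate type
    for i in 1:n
        p = ring[i]
        q = ring[mod1(i + 1, n)]    # wrap around to the first point
        acc += p.x * q.y - q.x * p.y
    end
    return acc / 2
end

# Counter-clockwise unit square with Float64 coordinates.
square = [Point2(0.0, 0.0), Point2(1.0, 0.0), Point2(1.0, 1.0), Point2(0.0, 1.0)]
println(signed_area(square))
```

In a Julia REPL, `@code_warntype signed_area(square)` is one way to check that all intermediate types were inferred as concrete. Whether a given GeometryOps.jl algorithm has this property depends on the geometry types passed in.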

@asinghvi17 asinghvi17 deleted the as/geometryops branch August 5, 2024 20:43
5 participants