This repo is a testbed for benchmarking basic geospatial queries across a range of data stores.
It is written in Ruby and organized as a series of Rake tasks.
1 second of PostGIS time ≈ 6 seconds of Elasticsearch time ≈ 7 seconds of Mongo time
Well, sort of. On my machine, PostGIS and Mongo display steady throughput, while Elasticsearch is more erratic, possibly due to JVM garbage collection.
Between Mongo and Elastic, it seems that Elastic has better peak throughput, but Mongo has better average throughput.
Postgres is still the hands-down winner, nearly an order of magnitude faster.
Perform a set of 100 bounding box queries that represent a moving window, as might be fetched from a front-end map client, over a swath of the United States extending from New York to Florida.
Repeat this 100 times for each data store.
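The procedure above can be sketched in Ruby roughly as follows. This is a minimal illustration, not the repo's actual implementation: the endpoint coordinates are approximate, the window size is an arbitrary assumption, and `moving_windows` and `run_benchmark` are hypothetical helper names. The block passed to `run_benchmark` stands in for whatever query a given data store would execute.

```ruby
require "benchmark"

# Approximate endpoints (longitude/latitude in degrees); illustrative values only.
new_york = { lon: -74.0, lat: 40.7 }
florida  = { lon: -80.2, lat: 25.8 }

# Hypothetical helper: build `count` bounding boxes whose centers are evenly
# spaced along a straight line between two points.
# Each box is [min_lon, min_lat, max_lon, max_lat]; width and height are in degrees.
def moving_windows(from, to, count: 100, width: 2.0, height: 1.0)
  (0...count).map do |i|
    t = count == 1 ? 0.0 : i.to_f / (count - 1)
    lon = from[:lon] + (to[:lon] - from[:lon]) * t
    lat = from[:lat] + (to[:lat] - from[:lat]) * t
    [lon - width / 2, lat - height / 2, lon + width / 2, lat + height / 2]
  end
end

# Hypothetical helper: run every window `repetitions` times, yielding each
# bounding box to the caller's block, and return the elapsed seconds.
def run_benchmark(windows, repetitions: 100)
  Benchmark.realtime do
    repetitions.times do
      windows.each { |bbox| yield bbox }
    end
  end
end

windows = moving_windows(new_york, florida)
```

With 100 windows and 100 repetitions, `run_benchmark(windows) { |bbox| ... }` would issue 10,000 queries in total against whichever store the block talks to.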
The spatial dataset for this benchmark is the Museum Universe Data File, published by the Institute of Museum and Library Services, a collection of ~33,000 museums and related organizations in the United States.
See it on a map
For N=100, i.e. 10,000 bounding box queries:
Data store | Version | Index | Elapsed | Normalized | Throughput |
---|---|---|---|---|---|
Postgres / PostGIS | 10.1 / 2.4 | GiST | 2.38 sec | 1.0 | 4,210 queries/sec |
Elasticsearch | 2.4 | n/a | 14.90 sec | 6.27 | 671 queries/sec |
MongoDB | 3.4 | 2dsphere | 16.77 sec | 7.06 | 596 queries/sec |
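
The Normalized and Throughput columns follow directly from the elapsed times. As a rough sketch (the variable names here are illustrative, and the inputs are the rounded elapsed times from the table), they can be recomputed like this:

```ruby
# Total queries per data store: 100 windows x 100 repetitions.
total_queries = 10_000

# Elapsed seconds, copied from the table above (already rounded).
elapsed = {
  "Postgres / PostGIS" => 2.38,
  "Elasticsearch"      => 14.90,
  "MongoDB"            => 16.77
}

# Normalize against the fastest store and derive queries per second.
baseline = elapsed.values.min
summary = elapsed.map do |store, seconds|
  {
    store: store,
    normalized: (seconds / baseline).round(2),
    throughput: (total_queries / seconds).round
  }
end

summary.each do |row|
  puts format("%-20s %5.2f %6d queries/sec", row[:store], row[:normalized], row[:throughput])
end
```

Because the inputs are already rounded, the recomputed figures can differ from the table in the last digit or two; the table was presumably derived from unrounded timings.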
Hardware note: This is on my 2014-vintage Mac laptop:
- MacBook Pro
- Intel i7, quad-core, 2.3GHz
- 16 GB RAM
You will need working installations of:
- PostgreSQL with the PostGIS spatial extensions
- MongoDB
- Elasticsearch
With Homebrew, this would be something like:
brew install postgresql postgis
brew install mongodb
brew install elasticsearch
# follow post-install instructions
brew services start postgresql
brew services start mongodb
brew services start elasticsearch
This project will take care of creating the necessary databases and indexes when you run rake load.
You can configure the services and databases in databases.yml.
- Clone this project and install its dependencies
$ git clone https://github.com/anandaroop/spatial-benchmarks.git
$ cd spatial-benchmarks
$ bundle install
- Obtain the MUDF csv datafile:
$ rake get_csv
- Load up the data
$ rake load
- Run the benchmarks
$ rake benchmark
If you run the benchmarks, why not open an issue or PR with the results? 😀