
Mímirsbrunn

Mimirsbrunn is an independent geocoding and reverse-geocoding system written in Rust and built upon Elasticsearch. It can handle addresses, streets, points of interest (POI), administrative regions, and public transport stops. In particular, Navitia uses it as its global geocoding service.

Getting Started

Mimirsbrunn is composed of several parts: some of them manage the data import into Elasticsearch, while a web service (Bragi) wraps Elasticsearch interactions in order to return formatted responses (using geocodejson as the response format).

Install

  • To use the Mimirsbrunn components you need an Elasticsearch database (Elasticsearch version 2.x).
  • To build, you must first install Rust by following these instructions.
  • You also need some development packages: make, libssl1.0-dev, libgeos-dev.
  • Then build Mimirsbrunn with:
cargo build --release

Data Input

Mimirsbrunn relies on geographical datasets to find what users are looking for. These locations belong to different data types and come from various sources. To import them, Mimirsbrunn ships with the following dedicated tools:

| Data Types | Data Sources | Import Tools |
|---|---|---|
| Addresses | OpenAddresses or BANO (the French open-data dataset) | openaddresses2mimir or bano2mimir |
| Streets | OpenStreetMap | osm2mimir |
| POI | OpenStreetMap | osm2mimir |
| Public Transport Stops | Navitia.io data platform or any GTFS data repository | ntfs2mimir or stops2mimir |
| Administrative Regions | OpenStreetMap or Cosmogony | osm2mimir or cosmogony2mimir |

To use another data source you have to write your own data importer. See for instance Fafnir, an external component that imports POIs from another database.

There are several components in Mimirsbrunn. Most of them are dedicated to importing data, while others are web services (Bragi) that wrap Elasticsearch interactions. All the Mimirsbrunn components described below implement the --help (or -h) argument to explain their usage.

Import Tools

Before using Bragi, you have to import data into Elasticsearch. The default and easiest way to import data is to use the docker_mimir tool. The following import tools remain available as an alternative.

osm2mimir

  • This tool imports OpenStreetMap data into Mimir. You can get OpenStreetMap data from Geofabrik, for instance:
curl -O http://download.geofabrik.de/europe/france-latest.osm.pbf
  • Then, to import all those data into Mimir, you only have to do:
cargo run --release --bin osm2mimir -- --input=france-latest.osm.pbf --level=8 --level=9 --import-way --import-admin --import-poi --dataset=france --connection-string=http://localhost:9200
  • The --level parameter refers to administrative levels in OpenStreetMap and controls which Admins are imported.
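In OpenStreetMap, the meaning of each admin_level value is country-specific. As an illustration, here is the mapping commonly used for France (an OSM tagging convention, not something enforced by osm2mimir; check the OSM wiki for other countries):

```python
# Common OSM admin_level meanings for France (OSM tagging convention;
# values differ per country).
FRANCE_ADMIN_LEVELS = {
    2: "country",
    4: "region",
    6: "department",
    8: "commune",   # --level=8 above imports communes
    9: "district",  # e.g. municipal arrondissements in Paris, Lyon, Marseille
}

def describe(level: int) -> str:
    """Return the French meaning of an OSM admin_level, or 'unknown'."""
    return FRANCE_ADMIN_LEVELS.get(level, "unknown")

print(describe(8))  # commune
```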

bano2mimir

  • This tool imports BANO data into Mimir. It is recommended to run the BANO integration after the OSM or Cosmogony one, so that addresses are attached to admins. You can get BANO data from OpenStreetMap, for instance:
curl -O http://bano.openstreetmap.fr/data/full.csv.gz
gunzip full.csv.gz
  • To import all those data into Mimir, you only have to do:
cargo run --release --bin bano2mimir -- -i full.csv --dataset=france --connection-string=http://localhost:9200/
  • The --connection-string argument refers to the Elasticsearch URL.
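The BANO file is a plain CSV of addresses with WGS84 coordinates. The following sketch parses one record; the sample line and the column order shown are assumptions for illustration only, to be checked against the BANO documentation and the file you actually downloaded:

```python
import csv
import io

# Hypothetical BANO record (the id and column order are assumptions --
# verify them against the downloaded full.csv before relying on this).
sample = "97612_zzzz_00064,64,rte Nationale 2,97660,Bandrele,CAD,-12.893,45.201\n"

fields = ["id", "number", "street", "postcode", "city", "source", "lat", "lon"]
row = dict(zip(fields, next(csv.reader(io.StringIO(sample)))))
print(row["street"], row["postcode"])
```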

ntfs2mimir

  • This tool imports data from NTFS files into Mimir. It is recommended to run the NTFS integration after the OSM or Cosmogony one, so that stops are attached to admins. You can get such data from Navitia.

  • To import all those data into Mimir, you only have to do:

cargo run --release --bin ntfs2mimir -- -i <path_to_folder_with_ntfs_file> --dataset=idf --connection-string=http://localhost:9200/
  • The --connection-string argument refers to the Elasticsearch URL.

  • The ntfs input file needs to match the NTFS specification.
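An NTFS dataset is a folder of CSV files; the stop definitions live in a stops.txt file whose layout resembles GTFS. The sketch below reads a minimal stops.txt excerpt (illustrative only; the real NTFS specification defines many more columns):

```python
import csv
import io

# Minimal NTFS-like stops.txt excerpt (illustrative; see the NTFS
# specification for the full column set).
stops_txt = """stop_id,stop_name,stop_lat,stop_lon,location_type
SP:01,Gare de Lyon,48.8443,2.3743,0
SA:01,Gare de Lyon,48.8443,2.3743,1
"""

# As in GTFS, location_type 1 marks a stop area and 0 a stop point.
stops = list(csv.DictReader(io.StringIO(stops_txt)))
stop_areas = [s["stop_name"] for s in stops if s["location_type"] == "1"]
print(stop_areas)
```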

stops2mimir

  • This import tool is still available but is now deprecated, because ntfs2mimir already imports stops.

Bragi

Bragi is the web service built around Elasticsearch. Its purpose is to hide the Elasticsearch complexity and to return consistently formatted responses. Its response format follows the geocodejson-spec, a format also used by other geocoding APIs such as Addok or Photon.

  • To run Bragi:
cargo run --release --bin bragi -- --connection-string=http://localhost:9200/munin
  • Then you can call the API (Bragi's default listening port is 4000):
curl "http://localhost:4000/autocomplete?q=rue+hector+malot"
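Per the geocodejson-spec, the answer is a GeoJSON FeatureCollection in which each feature carries a geocoding block. The sketch below extracts the labels from a truncated, hand-written response (illustrative; a live Bragi answer carries more properties):

```python
import json

# Truncated geocodejson-style response (field set follows the
# geocodejson-spec; values here are made up for illustration).
response = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [2.376, 48.846]},
      "properties": {
        "geocoding": {"type": "street", "label": "Rue Hector Malot (Paris)"}
      }
    }
  ]
}
""")

# Each feature's human-readable name lives under properties.geocoding.label.
labels = [f["properties"]["geocoding"]["label"] for f in response["features"]]
print(labels[0])
```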

Contribute

Integration tests

To run the tests, you need to manually build Mimir and then simply launch:

cargo test

The integration tests spawn one Elasticsearch Docker container, so you'll need a recent Docker version. Only one container is spawned, so the ES base has to be cleaned before each test.

To write a new test:

  • write your test in a separate file in tests/
  • add a call to your test in tests/tests.rs::test_all()
  • pass a new ElasticSearchWrapper to your test method to get the right connection string for the ES base
  • the creation of this ElasticSearchWrapper automatically cleans the ES base (you can also refresh the ES base, clean up during tests, etc.)

Geocoding tests

We use geocoder-tester to run real search queries and check the output against expected results, in order to prevent regressions.

Feel free to add some test cases here.

When a new pull request is submitted, it is manually tested using this repo, which loads a bunch of data into the geocoder, runs geocoder-tester, and then adds the results as a comment in the PR.

More documentation

For more precise documentation on usage, troubleshooting, and development, please check the documentation directory.
