A project to generate point of interest (POI) data sourced primarily from major websites with 'store location' pages. The project uses scrapy, a popular Python-based web scraping framework, to write individual site spiders that retrieve POI data and publish the results in a standard format. There are various scrapy tutorials; this series on YouTube is reasonable.
Windows users may need to follow some extra steps; please consult the scrapy docs for up-to-date details.
- Clone a copy of the project from the GitHub All The Places repo (or your own fork if you are considering contributing to the project):
  $ git clone git@github.com:alltheplaces/alltheplaces.git
- If you haven't done so already, install pipenv and check that it runs:
  $ pipenv --version
  pipenv, version 2022.8.30
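  If pipenv is not already available on your system, one common way to get it (a hedged suggestion assuming a working Python and pip; your platform's package manager may offer its own pipenv package) is:
  $ pip install --user pipenv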
- Use pipenv to install the project dependencies:
  $ cd alltheplaces
  $ pipenv install
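  Prefixing commands with pipenv run, as in the next step, runs them inside the project's virtual environment. If you prefer an activated shell instead, pipenv provides one (standard pipenv behaviour, not specific to this project):
  $ pipenv shell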
- Test for a successful project installation:
  $ pipenv run scrapy
  If the above runs without complaint, then you have a functional installation and are ready to run and write spiders.
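A typical next step is to run a single spider and write its output to a local file. A hedged example, assuming a spider named example_store exists (substitute any name reported by scrapy list; the project may also support output formats beyond scrapy's built-in JSON feed export):
$ pipenv run scrapy list                                        # show all available spider names
$ pipenv run scrapy crawl example_store -O example_store.json   # run one spider, overwriting the output file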
Many of the sites provide their data in a standard format. Others export their data via simple APIs. We have a number of guides to help you develop spiders (a minimal illustrative spider sketch follows this list):
- What should I call my spider?
- Using Wikidata and the Name Suggestion Index
- Sitemaps make finding POI pages easier
- Data from many POI pages can be extracted without writing code
- What is expected in a pull request?
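To give a flavour of what a spider can look like, here is a minimal sketch. It is not taken from the project: the class name, domain, sitemap URL, field names, and CSS selectors are hypothetical placeholders, and real spiders should follow the conventions described in the guides above.

```python
from scrapy.spiders import SitemapSpider


class ExampleStoreSpider(SitemapSpider):
    # Hypothetical example: the name, domain, URLs, and selectors are placeholders.
    name = "example_store"
    allowed_domains = ["example.com"]
    # SitemapSpider downloads the sitemap and follows the URLs it lists.
    sitemap_urls = ["https://www.example.com/sitemap.xml"]
    # Only send URLs that look like individual store pages to parse_store().
    sitemap_rules = [(r"/stores/", "parse_store")]

    def parse_store(self, response):
        # Extract a handful of fields from the page. Real spiders emit the
        # project's standard POI attributes rather than an ad-hoc dict.
        yield {
            "ref": response.url,
            "name": response.css("h1::text").get(),
            "addr_full": response.css(".address::text").get(),
            "lat": response.css("[data-lat]::attr(data-lat)").get(),
            "lon": response.css("[data-lon]::attr(data-lon)").get(),
        }
```

A spider like this can then be run locally, for example with pipenv run scrapy crawl example_store as shown earlier.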
The output from running the project is published on a regular cadence to our website, alltheplaces.xyz. Please do not run all the spiders yourself just to pick up the output: the less the project "bothers" a website, the more we will be tolerated.
Communication is primarily through tickets on the project GitHub issue tracker. Many contributors are also present on OSM US Slack; in particular, we watch the #poi channel.
The data generated by our spiders is provided on our website and released under Creative Commons’ CC-0 waiver.
The spider software that produces this data (this repository) is licensed under the MIT license.