v1.2 released! This closes #2, #7 and #8.
thampiman committed Mar 30, 2015
1 parent db3b61b commit 2cf0b19
Showing 11 changed files with 140 additions and 69 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -13,4 +13,6 @@ Thumbs.db
*.pyc
admin1CodesASCII.txt
admin2Codes.txt
cities1000.zip
cities1000.zip
.python-version
__pycache__
45 changes: 30 additions & 15 deletions README.md
@@ -2,7 +2,7 @@ Reverse Geocoder
=================
A Python library for offline reverse geocoding. It improves on an existing library called [reverse_geocode](https://pypi.python.org/pypi/reverse_geocode/1.0) developed by [Richard Penman](https://bitbucket.org/richardpenman/reverse_geocode).

*Update*: v1.1 released! Python 3 not supported still.
*UPDATE*: v1.2 released with Python3 support and more accurate geocoding! See release notes below.

### About
Ajay Thampi | [@thampiman](https://twitter.com/thampiman) | [opensignal.com](http://opensignal.com) | [ajaythampi.com](http://ajaythampi.com)
@@ -14,13 +14,22 @@ Ajay Thampi | [@thampiman](https://twitter.com/thampiman) | [opensignal.com](htt
The K-D tree is populated with cities that have a population > 1000. The source of the data is [GeoNames](http://download.geonames.org/export/dump/).

## Installation
For first time installation use,
```
$ pip install reverse_geocoder
```

Or upgrade an existing installation using,
```
$ pip install --upgrade reverse_geocoder
```

Package can be found on [PyPI](https://pypi.python.org/pypi/reverse_geocoder/).

*Update*: v1.1 released containing [Brandon](https://github.com/bdon)'s and [David](https://github.com/DavidJFelix)'s fixes
### Release Notes
1. v1.0 - First version with support for only Python2
2. v1.1 - Fix for issue [#1](https://github.com/thampiman/reverse-geocoder/issues/1) by [Brandon](https://github.com/bdon)
3. v1.2 - Support for Python 3, conversion of [Geodetic](http://en.wikipedia.org/wiki/Geodetic_datum) coordinates to [ECEF](http://en.wikipedia.org/wiki/ECEF) for use in K-D trees to find nearest neighbour using the Euclidean distance function. This release fixes issues [#2](https://github.com/thampiman/reverse-geocoder/issues/2) and [#8](https://github.com/thampiman/reverse-geocoder/issues/8). Special thanks to [David](https://github.com/DavidJFelix) for his help in partly fixing [#2](https://github.com/thampiman/reverse-geocoder/issues/2).
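
The motivation for the ECEF conversion in v1.2 can be illustrated with a minimal sketch. This is not the library's code: it uses only the standard `math` module, assumes zero height above the WGS84 ellipsoid, and the names `geodetic_to_ecef` and `chord_km` are hypothetical. Two points on either side of the 180° meridian differ by almost 360 when raw longitudes are compared, yet they are only about 22 km apart once converted to ECEF.

```python
import math

A = 6378.137           # WGS84 semi-major axis in km
E2 = 0.00669437999014  # WGS84 first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to ECEF (x, y, z) in km.

    Hypothetical helper; assumes the point lies on the ellipsoid (height 0).
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime vertical radius of curvature at this latitude
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    x = n * math.cos(lat) * math.cos(lon)
    y = n * math.cos(lat) * math.sin(lon)
    z = n * (1.0 - E2) * math.sin(lat)
    return (x, y, z)

def chord_km(p, q):
    """Straight-line (Euclidean) distance in km between two (lat, lon) points."""
    a = geodetic_to_ecef(*p)
    b = geodetic_to_ecef(*q)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

# Two points on the equator, 0.2 degrees apart across the antimeridian
east = (0.0, 179.9)
west = (0.0, -179.9)
naive_gap = abs(east[1] - west[1])  # difference of raw longitudes, in degrees
ecef_gap = chord_km(east, west)     # chord length in km
```

Because a K-D tree compares points with a Euclidean metric, indexing the ECEF coordinates keeps neighbours across the antimeridian and near the poles close together, which raw latitude/longitude pairs do not.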

## Usage
The library supports two modes:
@@ -31,7 +40,7 @@ The library supports two modes:
```python
import reverse_geocoder as rg

coordinates = (51.5214588,-0.1729636),(13.9280531,100.3735803)
coordinates = (51.5214588,-0.1729636),(9.936033, 76.259952),(37.38605,-122.08385)

results = rg.search(coordinates) # default mode = 2

@@ -40,18 +49,24 @@ print results

The above code will output the following:
```
[{'admin1': 'England',
'admin2': 'Greater London',
'cc': 'GB',
'lat': '51.51116',
'lon': '-0.18426',
'name': 'Bayswater'},
{'admin1': 'Nonthaburi',
'admin2': '',
'cc': 'TH',
'lat': '13.91783',
'lon': '100.42403',
'name': 'Bang Bua Thong'}]
[{'name': 'Barbican',
'cc': 'GB',
'lat': '51.51988',
'lon': '-0.09446',
'admin1': 'England',
'admin2': 'Greater London'},
{'name': 'Cochin',
'cc': 'IN',
'lat': '9.93988',
'lon': '76.26022',
'admin1': 'Kerala',
'admin2': 'Ernakulam'},
{'name': 'Mountain View',
'cc': 'US',
'lat': '37.38605',
'lon': '-122.08385',
'admin1': 'California',
'admin2': 'Santa Clara County'}]
```

If you'd like to use the single-threaded K-D tree, set mode = 1 as follows:
4 changes: 3 additions & 1 deletion README.txt
@@ -6,6 +6,8 @@ This library improves on an existing library called reverse_geocode developed by
2. The performance is much faster since a parallelized K-D tree is implemented
(See https://github.com/thampiman/reverse-geocoder for performance comparison)

Supports Python 2 and 3.

Example usage:
>>> import reverse_geocoder as rg
>>> coordinates = (51.5214588,-0.1729636),(13.9280531,100.3735803)
@@ -21,4 +23,4 @@ Example usage:
'cc': 'TH',
'lat': '13.91783',
'lon': '100.42403',
'name': 'Bang Bua Thong'}]
'name': 'Bang Bua Thong'}]
Binary file added dist/reverse_geocoder-1.2.tar.gz
Binary file not shown.
Binary file modified performance.png
Binary file added performance_logscale.png
74 changes: 51 additions & 23 deletions reverse_geocoder/__init__.py
@@ -5,14 +5,11 @@
import os
import sys
import csv
csv.field_size_limit(sys.maxint)
import urllib
csv.field_size_limit(sys.maxsize)
import zipfile
import collections
from scipy.spatial import cKDTree as KDTree
import time
import cKDTree_MP as KDTree_MP
import multiprocessing as mp
from reverse_geocoder import cKDTree_MP as KDTree_MP
import numpy as np

GN_URL = 'http://download.geonames.org/export/dump/'
GN_CITIES1000 = 'cities1000'
@@ -59,6 +56,9 @@

RG_FILE = 'rg_cities1000.csv'

A = 6378.137 # WGS84 semi-major axis in km
E2 = 0.00669437999014 # WGS84 first eccentricity squared

def singleton(cls):
instances = {}
def getinstance(mode=2):
@@ -92,7 +92,7 @@ def query(self,coordinates):
def extract(self,local_filename):
if os.path.exists(local_filename):
print('Loading formatted geocoded file...')
rows = csv.DictReader(open(local_filename,'rb'))
rows = csv.DictReader(open(local_filename,'rt'))
else:
gn_cities1000_url = GN_URL + GN_CITIES1000 + '.zip'
gn_admin1_url = GN_URL + GN_ADMIN1
@@ -103,29 +103,37 @@ def extract(self,local_filename):

if not os.path.exists(cities1000_zipfilename):
print('Downloading files from Geoname...')
urllib.urlretrieve(gn_cities1000_url,cities1000_zipfilename)
urllib.urlretrieve(gn_admin1_url,GN_ADMIN1)
urllib.urlretrieve(gn_admin2_url,GN_ADMIN2)
try: # Python 3
import urllib.request
urllib.request.urlretrieve(gn_cities1000_url,cities1000_zipfilename)
urllib.request.urlretrieve(gn_admin1_url,GN_ADMIN1)
urllib.request.urlretrieve(gn_admin2_url,GN_ADMIN2)
except ImportError: # Python 2
import urllib
urllib.urlretrieve(gn_cities1000_url,cities1000_zipfilename)
urllib.urlretrieve(gn_admin1_url,GN_ADMIN1)
urllib.urlretrieve(gn_admin2_url,GN_ADMIN2)


print('Extracting cities1000...')
z = zipfile.ZipFile(open(cities1000_zipfilename,'rb'))
open(cities1000_filename,'wb').write(z.read(cities1000_filename))

print('Loading admin1 codes...')
admin1_map = {}
t_rows = csv.reader(open(GN_ADMIN1,'rb'),delimiter='\t')
t_rows = csv.reader(open(GN_ADMIN1,'rt'),delimiter='\t')
for row in t_rows:
admin1_map[row[ADMIN_COLUMNS['concatCodes']]] = row[ADMIN_COLUMNS['asciiName']]

print('Loading admin2 codes...')
admin2_map = {}
for row in csv.reader(open(GN_ADMIN2,'rb'),delimiter='\t'):
for row in csv.reader(open(GN_ADMIN2,'rt'),delimiter='\t'):
admin2_map[row[ADMIN_COLUMNS['concatCodes']]] = row[ADMIN_COLUMNS['asciiName']]

print('Creating formatted geocoded file...')
writer = csv.DictWriter(open(local_filename,'wb'),fieldnames=RG_COLUMNS)
writer = csv.DictWriter(open(local_filename,'wt'),fieldnames=RG_COLUMNS)
rows = []
for row in csv.reader(open(cities1000_filename,'rb'),delimiter='\t',quoting=csv.QUOTE_NONE):
for row in csv.reader(open(cities1000_filename,'rt'),delimiter='\t',quoting=csv.QUOTE_NONE):
lat = row[GN_COLUMNS['latitude']]
lon = row[GN_COLUMNS['longitude']]
name = row[GN_COLUMNS['asciiName']]
@@ -154,26 +162,46 @@ def extract(self,local_filename):
os.remove(cities1000_filename)

# Load all the coordinates and locations
coordinates,locations = [],[]
geo_coords,locations = [],[]
for row in rows:
coordinates.append((row['lat'],row['lon']))
geo_coords.append((row['lat'],row['lon']))
locations.append(row)
return coordinates,locations
ecef_coords = geodetic_in_ecef(geo_coords)
return ecef_coords,locations

def geodetic_in_ecef(geo_coords):
    # Rows are (lat, lon) in degrees; values may arrive as strings from the CSV
    geo_coords = np.asarray(geo_coords).astype(float)
    lat = geo_coords[:,0]
    lon = geo_coords[:,1]

    lat_r = np.radians(lat)
    lon_r = np.radians(lon)
    # Prime vertical radius of curvature
    normal = A / (np.sqrt(1 - E2*(np.sin(lat_r) ** 2)))
    x = normal * np.cos(lat_r) * np.cos(lon_r)
    y = normal * np.cos(lat_r) * np.sin(lon_r)
    z = normal * (1 - E2) * np.sin(lat_r) # latitude in radians, not degrees
    return np.column_stack([x,y,z])

def rel_path(filename):
return os.path.join(os.getcwd(), os.path.dirname(__file__), filename)

def get(coordinate,mode=2):
def get(geo_coord,mode=2):
rg = RGeocoder(mode=mode)
return rg.query([coordinate])[0]
return rg.query(geodetic_in_ecef([geo_coord]))[0]

def search(coordinates,mode=2):
def search(geo_coords,mode=2):
rg = RGeocoder(mode=mode)
return rg.query(coordinates)
return rg.query(geodetic_in_ecef(geo_coords))

if __name__ == '__main__':
print('Testing single coordinate...')
city = (37.38605,-122.08385)
print('Reverse geocoding 1 city...')
result = get(city)
print(result)

print('Testing coordinates...')
cities = [(-37.81, 144.96),(31.76, 35.21)]

cities = [(51.5214588,-0.1729636),(9.936033, 76.259952),(37.38605,-122.08385)]
print('Reverse geocoding %d cities...' % len(cities))
results = search(cities)
print(results)
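
The Python 2/3 download fallback in `extract` repeats the three `urlretrieve` calls in both branches. A possible simplification, offered only as a sketch and not as part of this commit, is to resolve the function once at import time and call it through a single name. The helper `download_all` and its `fetch` parameter are hypothetical and exist only to make the example testable without network access.

```python
try:  # Python 3
    from urllib.request import urlretrieve
except ImportError:  # Python 2
    from urllib import urlretrieve

def download_all(url_to_path, fetch=urlretrieve):
    """Download each (url, local_path) pair with the resolved fetch function.

    Hypothetical helper: `fetch` defaults to whichever urlretrieve was
    imported above and can be replaced with a stub in tests.
    """
    for url, path in url_to_path:
        fetch(url, path)
```

With this shape, the body of `extract` would need only one list of URL and filename pairs, regardless of the interpreter version.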
28 changes: 25 additions & 3 deletions reverse_geocoder/cKDTree_MP.py
@@ -82,7 +82,7 @@ def pquery(self,x_list,k=1,eps=0,p=2,
for p in pool: p.start()
for p in pool: p.join()
if ierr.value != 0:
raise RuntimeError, ('%d errors in worker processes' % (ierr.value))
raise RuntimeError('%d errors in worker processes' % (ierr.value))

return _d.copy(),_i.astype(int).copy()

@@ -98,7 +98,7 @@ def __init__(self,ndata,nprocs):
def __iter__(self):
return self

def next(self):
def next(self): # Python 2 support
self._lock.acquire()
ndata = self._ndata.value
start = self._start.value
@@ -117,4 +117,26 @@ def next(self):
return slice(s0, s1)
else:
self._lock.release()
raise StopIteration
raise StopIteration

def __next__(self): # Python 3 support
self._lock.acquire()
ndata = self._ndata.value
start = self._start.value
chunk = self._chunk
if ndata:
if chunk > ndata:
s0 = start
s1 = start + ndata
self._ndata.value = 0
else:
s0 = start
s1 = start + chunk
self._ndata.value = ndata - chunk
self._start.value = start + chunk
self._lock.release()
return slice(s0, s1)
else:
self._lock.release()
raise StopIteration
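
The `next` and `__next__` methods above have identical bodies. One way to support both Python versions without duplicating the logic is to define `__next__` once and alias `next` to it. The sketch below illustrates the pattern with a hypothetical `ChunkScheduler` class; it assumes a `threading.Lock` and plain integers in place of the shared `multiprocessing` values used in `cKDTree_MP.py`, so it is a simplified stand-in rather than a drop-in replacement.

```python
import threading

class ChunkScheduler(object):
    """Hypothetical stand-in: hands out slices of at most `chunk` items."""

    def __init__(self, ndata, chunk):
        self._ndata = ndata   # items still to be handed out
        self._start = 0       # index where the next slice begins
        self._chunk = chunk   # maximum slice length
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):  # Python 3 iterator protocol
        # The with-statement releases the lock on return and on StopIteration
        with self._lock:
            if not self._ndata:
                raise StopIteration
            size = min(self._chunk, self._ndata)
            s0 = self._start
            self._start += size
            self._ndata -= size
            return slice(s0, s0 + size)

    next = __next__  # Python 2 iterator protocol uses the same method

slices = list(ChunkScheduler(10, 4))
```

Here `slices` covers the ranges 0 to 4, 4 to 8 and 8 to 10, so the final slice is shorter than `chunk` when the data does not divide evenly.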
