Merge pull request #165 from urbanbigdatacentre/dev-morphological_informality_model

Morphological model code and documentation
Showing 14 changed files with 815 additions and 1 deletion.
# IDEAMAPS Data Ecosystem: Models of Deprivation
*(previous title: `# ideamaps-models`)*
## 📚 Introduction

The [IDEAMAPS Data Ecosystem project](https://www.ideamapsnetwork.org/project/ideamaps-data-ecosystem) is co-designing and developing a participatory data-modelling ecosystem to produce deprived-area maps routinely and accurately at scale across cities in low- and middle-income countries, supporting multiple local stakeholders in their decision-making.

In this repository, we store several models developed within the project that address different [domains of deprivation](https://doi.org/10.1016/j.compenvurbsys.2022.101770), including morphological informality and barriers to healthcare. Additionally, we provide tools to deploy our models to new cities.

## 🌍 Available Models

The following deprivation models have been developed within the project:
| Domain of Deprivation | City | Folder | Version | Documentation |
|:---------------------:|:----:|:------:|:-------:|:--------------|
| Morphological Informality | Nairobi (Kenya) | [Link](https://github.com/urbanbigdatacentre/ideamaps-models/tree/main/Sub-domains/MorphologicalInformality/Nairobi_v3) | V3 | |
| Morphological Informality | Lagos (Nigeria) | [Link](https://github.com/urbanbigdatacentre/ideamaps-models/tree/main/Sub-domains/MorphologicalInformality/Lagos_v3) | V3 | |
| Morphological Informality | Kano (Nigeria) | [Link](https://github.com/urbanbigdatacentre/ideamaps-models/tree/main/Sub-domains/MorphologicalInformality/Kano_v3) | V3 | |
| Barriers to Healthcare | Kano (Nigeria) | [Link](https://github.com/urbanbigdatacentre/ideamaps-models/tree/main/Sub-domains/BarriersHealthCareAccess/Kano_v1.1) | V1.1 | |
## ⚙️ Model Deployment

We also provide code to deploy our models to new cities. To do so, please follow the instructions in the respective model directories.

## 🗺️ Model Validation

Our model outputs can also be validated using the [IDEAMAPS Data Ecosystem platform](https://www.ideamapsdataecosystem.org/). The validation data will be used to iteratively improve our models.

## ✏️ Contributing

We appreciate all contributions. Please refer to the Contributing Guidelines.

## 📝 References

If you find this work useful, please cite our IDEAMAPS Data Ecosystem umbrella paper:

```
```
5 files renamed without changes.
**Sub-domains/MorphologicalInformality/Sourcecode/V3/README.md** (77 additions, 0 deletions)
# Deploying the Morphological Informality Model (V3)

This folder contains all the code required to model morphological informality based on building footprint data.

We refer to our publication for a detailed description of the methodology: [preprint]().

## 🛠️ Setup

1. **Clone the repository**:
   ```
   git clone https://github.com/urbanbigdatacentre/ideamaps-models.git
   cd ideamaps-models/Sub-domains/MorphologicalInformality/Sourcecode/V3
   ```
2. **Create a virtual environment using Conda**:
   ```
   conda create -n ideamaps-models python=3.10
   conda activate ideamaps-models
   ```
3. **Install the dependencies from the requirements.txt file using pip**:
   ```
   pip install -r requirements.txt
   ```

## 🏚️ Prepare Building Footprint Data

Our model requires building footprints as input data. There are several providers of open building footprint data. We recommend using data from the [Overture Maps Foundation](https://overturemaps.org/).
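The downstream scripts read the footprints with `geopandas` and, as `aggregation.py` shows, expect a (Geo)Parquet file with a `geometry` column and a unique building identifier `uID`. A minimal preparation sketch (file names are placeholders, and adding `uID` at this stage is an assumption; `geoelements.py` may assign it as part of its own processing):

```python
import geopandas as gpd

# Load the downloaded footprints (e.g. an Overture export); hypothetical input path.
buildings = gpd.read_parquet('buildings_raw.parquet')

# Keep only valid, non-empty geometries (the scripts assert validity).
buildings = buildings[buildings.is_valid & ~buildings.is_empty].reset_index(drop=True)

# Add the unique building identifier expected downstream.
buildings['uID'] = range(len(buildings))

# Write as GeoParquet so it can be read with gpd.read_parquet().
buildings[['uID', 'geometry']].to_parquet('buildings.parquet')
```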
## ⚙️ Run Model

Follow these steps to obtain clusters of similar urban form types.

1. **Create the basic urban form elements**
   ```
   python geoelements.py -e *path to the file* -b *path to the building footprints file* -o *path to the output dir*
   ```
2. **Compute the building-level morphometrics**
   ```
   python morphometrics.py -b *path to the building footprints file* -t *path to the tessellation file* -o *path to the output dir*
   ```
3. **Aggregate the morphometrics to the grid**

   The morphometrics dir corresponds to the output dir used in step 2.
   ```
   python aggregation.py -m *path to the morphometrics dir* -b *path to the building footprints file* -g *path to the grid file* -o *path to the output dir*
   ```
4. **Cluster the grid cells into urban form types**
   ```
   python clustering.py -m *path to the morphometrics grid file (output of step 3)* -o *path to the output dir*
   ```

The resulting urban form clusters can be linked to morphological informality.
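For example, once `clustering.py` has written `clustering.pq`, the cluster labels it adds (`isl_c6/8/10` for irregular layout, `sds_c6/8/10` for small, dense structures) can be inspected per grid cell. A minimal sketch, assuming the output directory used above and the 8-cluster solution:

```python
import geopandas as gpd

# Load the clustered grid written by clustering.py (hypothetical output path).
grid = gpd.read_parquet('output/clustering.pq')

# How many grid cells fall into each cluster of the k=8 solutions?
print(grid['isl_c8'].value_counts())   # irregular layout clusters
print(grid['sds_c8'].value_counts())   # small, dense structures clusters

# Export for visual inspection in a GIS, where clusters can be
# linked to morphological informality.
grid.to_file('clusters_k8.gpkg', driver='GPKG')
```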
## 📝 Reference

If you find this work useful, please cite:

```
```
**Sub-domains/MorphologicalInformality/Sourcecode/V3/aggregation.py** (94 additions, 0 deletions)
import geopandas as gpd
import pandas as pd
import numpy as np
from pathlib import Path

from parsers import aggregation_parser as argument_parser


if __name__ == '__main__':
    args = argument_parser().parse_known_args()[0]
    # TODO: add ltcWRB

    # Load the building footprints and keep only the identifier and geometry
    umm = gpd.read_parquet(args.building_file)
    umm = umm[['uID', 'geometry']]
    assert np.all(umm.is_valid)

    # Load the urban morphometrics (UMM) computed by morphometrics.py and join them on uID
    metrics = ['sdbAre', 'ssbElo', 'stbOri', 'stcOri', 'ssbCCD', 'sdcAre', 'sscERI', 'sicCAR', 'mtbAli', 'mtbNDi',
               'mtcWNe', 'mdcAre', 'ltcBuA', 'ltbIBD', 'ltcWRB']

    for metric in metrics:
        metric_values = pd.read_parquet(Path(args.morphometrics_dir) / f'{metric}.pq')
        umm = pd.merge(umm, metric_values, on='uID', how='inner')

    # Represent each building by its centroid for the spatial join with the grid
    umm = gpd.GeoDataFrame(umm, geometry='geometry')
    umm = umm.to_crs("EPSG:4326")
    umm['centroid'] = umm.geometry.centroid
    umm = gpd.GeoDataFrame(umm, geometry='centroid').drop(columns='geometry')

    # Load the grid the morphometrics are aggregated to
    grid = gpd.read_file(args.grid_file)
    grid = grid[['geometry']]
    grid['grid_id'] = range(1, len(grid) + 1)  # unique running id for each grid cell
    grid = grid.to_crs("EPSG:4326")
    assert np.all(grid.is_valid)

    # Spatial join: assign each building centroid to the grid cell it intersects
    umm_grid = gpd.sjoin(grid, umm, how='inner', predicate='intersects')

    # Handle missing data
    has_missing_values = umm_grid.isnull().values.any()
    umm_grid = umm_grid.dropna()

    # Aggregation scheme per grid cell: median for most metrics,
    # standard deviation for the orientation metrics, sum for the cell areas
    median_metrics = ['sdcAre', 'ssbElo', 'ssbCCD', 'mtbAli', 'mtbNDi', 'ltcBuA', 'sdbAre', 'sscERI', 'sicCAR',
                      'mtcWNe', 'mdcAre', 'ltbIBD', 'ltcWRB']
    sd_metrics = ['stbOri', 'stcOri']
    sum_metrics = ['sdcAre']

    # Set the grid geometry as the active geometry
    print(umm_grid.columns)
    umm_grid = umm_grid.set_geometry('geometry')

    # Group by grid cell and compute the aggregated statistics
    median_values = umm_grid.groupby('grid_id')[median_metrics].median().add_prefix('md_')
    sd_values = umm_grid.groupby('grid_id')[sd_metrics].std().fillna(0).add_prefix('sd_')
    sum_values = umm_grid.groupby('grid_id')[sum_metrics].sum().add_prefix('sum_')

    building_counts = umm_grid.groupby('grid_id').size().rename('bcount')
    single_building_grids = building_counts[building_counts == 1]

    # NaN standard deviations occur where a grid cell contains only one building
    sd_values.isnull().sum()

    # Combine the aggregated statistics into a single table indexed by grid_id
    merge_stats = pd.merge(median_values, sd_values, on='grid_id', how='inner')
    merge_stats = pd.merge(merge_stats, sum_values, on='grid_id', how='inner')
    merge_stats = pd.merge(merge_stats, building_counts, on='grid_id', how='inner')

    merge_stats.isnull().sum()

    if grid.index.name != 'grid_id':
        grid = grid.set_index('grid_id')

    if merge_stats.index.name != 'grid_id':
        merge_stats = merge_stats.set_index('grid_id')

    # Attach the aggregated statistics back to the grid geometries
    df_stats = pd.merge(grid, merge_stats, on='grid_id', how='inner')
    gdf_stats = gpd.GeoDataFrame(df_stats, geometry='geometry', crs='EPSG:4326')

    # Drop any duplicated columns introduced by the merges
    duplicate_columns = gdf_stats.columns[gdf_stats.columns.duplicated()]
    print(duplicate_columns)
    gdf_stats = gdf_stats.loc[:, ~gdf_stats.columns.duplicated()]

    # Export the aggregated grid as a GeoParquet file
    gdf_stats.to_parquet(Path(args.output_dir) / 'morphometrics_grid.pq')
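`aggregation.py` expects the analysis grid as a file readable by `geopandas.read_file` (the `-g` argument); it uses only the geometries and assigns its own `grid_id`. If no grid file is at hand, a simple regular grid over the footprint extent can serve for testing. A sketch under that assumption (cell size and file names are placeholders, not the project's grid definition):

```python
import numpy as np
import geopandas as gpd
from shapely.geometry import box

# Extent of the building footprints (hypothetical input path).
buildings = gpd.read_parquet('buildings.parquet').to_crs('EPSG:4326')
minx, miny, maxx, maxy = buildings.total_bounds

# Square cells of roughly 100 m expressed in degrees near the equator; an assumption.
cell = 0.001
cells = [box(x, y, x + cell, y + cell)
         for x in np.arange(minx, maxx, cell)
         for y in np.arange(miny, maxy, cell)]

grid = gpd.GeoDataFrame(geometry=cells, crs='EPSG:4326')
grid.to_file('grid.gpkg', driver='GPKG')  # aggregation.py loads it with gpd.read_file
```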
**Sub-domains/MorphologicalInformality/Sourcecode/V3/clustering.py** (57 additions, 0 deletions)
import geopandas as gpd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from pathlib import Path

from parsers import clustering_parser as argument_parser

SEED = 7

if __name__ == '__main__':
    args = argument_parser().parse_known_args()[0]

    # Load the grid-level morphometrics produced by aggregation.py
    gdf = gpd.read_parquet(args.morphometrics_file)
    print(gdf.columns)
    gdf.head()
    # TODO: only cluster cells with buildings

    # TODO: add md_ltcWRB
    # Metrics describing the Irregular Layout dimension
    morph_isl = ['md_ssbCCD', 'md_mtbAli', 'md_ltcBuA', 'md_mtcWNe', 'md_ltcWRB', 'sd_stbOri', 'sd_stcOri']

    # TODO: add md_ltcWRB and md_ltbIBD
    # Metrics describing the Small, Dense Structures dimension
    morph_sds = ['md_sdcAre', 'md_ssbElo', 'md_mtbNDi', 'md_ltbIBD', 'md_ltcBuA', 'md_sdbAre', 'md_sscERI',
                 'md_sicCAR', 'md_mtcWNe', 'md_mdcAre', 'md_ltcWRB', 'sum_sdcAre']

    gdf_isl = gdf[morph_isl]
    gdf_sds = gdf[morph_sds]

    # Standardize the features (zero mean, unit variance) before clustering
    scaler = StandardScaler()
    data_isl = scaler.fit_transform(gdf_isl)
    data_sds = scaler.fit_transform(gdf_sds)

    # Elbow check for the Irregular Layout clustering:
    # collect the sum of squared distances (inertia) for k = 6, 8, 10
    ssd = []
    for k in [6, 8, 10]:
        km = KMeans(n_clusters=k, random_state=SEED)
        km = km.fit(data_isl)
        ssd.append(km.inertia_)
        gdf[f'isl_c{k}'] = km.labels_

    # Elbow check for the Small, Dense Structures clustering
    ssd = []
    for k in [6, 8, 10]:
        km = KMeans(n_clusters=k, random_state=SEED)
        km = km.fit(data_sds)
        ssd.append(km.inertia_)
        gdf[f'sds_c{k}'] = km.labels_

    # Export the grid with the cluster labels for each k
    gdf.to_parquet(Path(args.output_dir) / 'clustering.pq')
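The script collects the sum of squared distances (`ssd`, the KMeans inertia) for each k but does not visualise it. A minimal elbow-plot sketch, assuming `matplotlib` is installed (it is not listed here as a dependency) and with placeholder inertia values standing in for the `ssd` list computed above:

```python
import matplotlib.pyplot as plt

ks = [6, 8, 10]                 # candidate cluster counts tested in clustering.py
ssd = [1520.3, 1180.7, 960.2]   # placeholder values; reuse the ssd list from the script

# Plot inertia against k; the 'elbow' suggests a reasonable number of clusters.
plt.plot(ks, ssd, marker='o')
plt.xlabel('number of clusters k')
plt.ylabel('sum of squared distances (inertia)')
plt.savefig('elbow.png', dpi=150)
```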