generated from worldbank/template
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Implement API design with Postgres Backend (#8)
* Initial project structure for API
  Create fastapi application skeleton with some initial description of the summary endpoint. Includes field validation for aoi polygon.
* Add summary response with mocked data
* Add duckdb data ingestion
* Adapt api to read from duckdb
* Fix and cleanup h3 generation functionality
* Add tests on api
* Adapt api for storing data within postgres
  Load sample data in NYC for development. Modify api to use postgres. Visualization notebook with lonboard for quick QA.
* Add configuration variable for table name
* Add unit tests for db_utils and update existing API tests
  Modified app/routers/api.py to utilize get_available_fields and get_summaries from db_utils.py, updated tests/test_api.py to align with the refactored API logic, added app/utils/db_utils.py with utility functions for database operations, including get_available_fields and get_summaries, and added tests/test_db_utils.py with unit tests for the new database utility functions using pytest and unittest.mock to ensure functionality without a real database connection.
* Fix definition of environment variables error
  Add notebook visualization that includes the available fields endpoint. Fix discovered bug in order of environment variables defined before being loaded from env file.
* Update field validation to use geojson_pydantic
  Required shifting some types around for the aoi. Shapely is still used in h3_utils.py.
* Remove error handling on HTTPException
* Format with black and remove print statements
* Add pydantic settings
* Update db_utils tests to match new summaries output that reflects df structure
* Add github action CI for space2stats api
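The commit message mentions testing the database utilities with `unittest.mock` so that no real database connection is needed. The sketch below illustrates that general pattern only. The function body, its signature, and the SQL are assumptions made for illustration; they are not the contents of `app/utils/db_utils.py`, which this page does not show.

```python
from unittest.mock import MagicMock


def get_available_fields(conn, table_name):
    """Hypothetical stand-in: list the columns of `table_name` except hex_id."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT column_name FROM information_schema.columns "
            "WHERE table_name = %s",
            (table_name,),
        )
        return [row[0] for row in cur.fetchall() if row[0] != "hex_id"]


def make_mock_connection(rows):
    """Build a MagicMock that mimics a DB-API connection returning `rows`."""
    conn = MagicMock()
    # `with conn.cursor() as cur` binds cur to __enter__'s return value.
    cursor = conn.cursor.return_value.__enter__.return_value
    cursor.fetchall.return_value = rows
    return conn, cursor


conn, cursor = make_mock_connection(
    [("hex_id",), ("sum_pop_f_0_2020",), ("sum_pop_m_0_2020",)]
)
fields = get_available_fields(conn, "space2stats")
print(fields)
```

Passing the connection in as an argument is a design choice of this sketch: it lets a test hand over a mock directly instead of patching a module-level connection factory.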
Showing 25 changed files with 1,228 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
name: Run Tests | ||
|
||
on: [push, pull_request] | ||
|
||
jobs: | ||
test: | ||
runs-on: ubuntu-latest | ||
|
||
env: | ||
DB_HOST: localhost | ||
DB_PORT: 5432 | ||
DB_NAME: mydatabase | ||
DB_USER: myuser | ||
DB_PASSWORD: mypassword | ||
DB_TABLE_NAME: space2stats | ||
|
||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v2 | ||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: 3.11 | ||
|
||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install -r space2stats_api/requirements.txt | ||
- name: Set PYTHONPATH | ||
run: echo "PYTHONPATH=$(pwd)/space2stats_api" >> $GITHUB_ENV | ||
|
||
- name: Run tests | ||
run: pytest space2stats_api/tests |
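The workflow above passes the database configuration to the tests through `DB_*` environment variables. As a rough illustration of how such variables might be consumed, here is a standard-library sketch. The `DbSettings` class and `load_db_settings` function are hypothetical names; the commit message says the project itself uses pydantic settings, which this sketch does not reproduce.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DbSettings:
    """Hypothetical container for the DB_* variables set in the workflow."""
    host: str
    port: int
    name: str
    user: str
    password: str
    table_name: str

    @property
    def dsn(self) -> str:
        # Plain formatting; a real password may need URL-encoding.
        return (
            f"postgresql://{self.user}:{self.password}"
            f"@{self.host}:{self.port}/{self.name}"
        )


def load_db_settings(env=None) -> DbSettings:
    """Read settings from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    return DbSettings(
        host=env["DB_HOST"],
        port=int(env["DB_PORT"]),
        name=env["DB_NAME"],
        user=env["DB_USER"],
        password=env["DB_PASSWORD"],
        table_name=env["DB_TABLE_NAME"],
    )


# Values copied from the workflow's env block.
ci_env = {
    "DB_HOST": "localhost",
    "DB_PORT": "5432",
    "DB_NAME": "mydatabase",
    "DB_USER": "myuser",
    "DB_PASSWORD": "mypassword",
    "DB_TABLE_NAME": "space2stats",
}
settings = load_db_settings(ci_env)
print(settings.dsn)
```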
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -93,4 +93,11 @@ target/ | |
_build/ | ||
|
||
# python-dotenv | ||
.env | ||
.env | ||
wb_aws.env | ||
db.env | ||
|
||
# data | ||
*.parquet | ||
*.duckdb | ||
.pgdata |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
version: '3' | ||
|
||
services: | ||
database: | ||
image: ghcr.io/stac-utils/pgstac:v0.8.5 | ||
environment: | ||
- POSTGRES_USER=username | ||
- POSTGRES_PASSWORD=password | ||
- POSTGRES_DB=postgis | ||
- PGUSER=username | ||
- PGPASSWORD=password | ||
- PGDATABASE=postgis | ||
ports: | ||
- "${MY_DOCKER_IP:-127.0.0.1}:5439:5432" | ||
command: postgres -N 500 | ||
volumes: | ||
- ./.pgdata:/var/lib/postgresql/data |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
hex_id,fields | ||
862a1008fffffff,"{'__index_level_0__': 2436230, 'ogc_fid': 11, 'sum_pop_f_0_2020': 3355.96850585938, 'sum_pop_f_10_2020': 12371.3955078125, 'sum_pop_f_15_2020': 15563.8896484375, 'sum_pop_f_1_2020': 12494.43359375, 'sum_pop_f_20_2020': 30224.130859375, 'sum_pop_f_25_2020': 42427.28125, 'sum_pop_f_30_2020': 34711.5625, 'sum_pop_f_35_2020': 25574.31640625, 'sum_pop_f_40_2020': 20973.458984375, 'sum_pop_f_45_2020': 18116.025390625, 'sum_pop_f_50_2020': 18691.546875, 'sum_pop_f_55_2020': 21246.267578125, 'sum_pop_f_5_2020': 12426.166015625, 'sum_pop_f_60_2020': 22672.314453125, 'sum_pop_f_65_2020': 20404.287109375, 'sum_pop_f_70_2020': 17031.431640625, 'sum_pop_f_75_2020': 11438.015625, 'sum_pop_f_80_2020': 17598.7109375, 'sum_pop_m_0_2020': 3499.36499023438, 'sum_pop_m_10_2020': 12757.32421875, 'sum_pop_m_15_2020': 14690.669921875, 'sum_pop_m_1_2020': 13028.8857421875, 'sum_pop_m_20_2020': 24383.478515625, 'sum_pop_m_25_2020': 36570.5, 'sum_pop_m_30_2020': 33656.46875, 'sum_pop_m_35_2020': 25711.21875, 'sum_pop_m_40_2020': 21449.7265625, 'sum_pop_m_45_2020': 18393.3671875, 'sum_pop_m_50_2020': 17531.7890625, 'sum_pop_m_55_2020': 18519.333984375, 'sum_pop_m_5_2020': 12790.00390625, 'sum_pop_m_60_2020': 17991.046875, 'sum_pop_m_65_2020': 15532.0927734375, 'sum_pop_m_70_2020': 12730.666015625, 'sum_pop_m_75_2020': 8303.8662109375, 'sum_pop_m_80_2020': 9250.68359375}" | ||
862a100d7ffffff,"{'__index_level_0__': 2436238, 'ogc_fid': 19, 'sum_pop_f_0_2020': 3169.98120117188, 'sum_pop_f_10_2020': 11957.44140625, 'sum_pop_f_15_2020': 14855.185546875, 'sum_pop_f_1_2020': 11801.9931640625, 'sum_pop_f_20_2020': 27791.64453125, 'sum_pop_f_25_2020': 38212.46875, 'sum_pop_f_30_2020': 31441.3046875, 'sum_pop_f_35_2020': 23516.759765625, 'sum_pop_f_40_2020': 19418.1640625, 'sum_pop_f_45_2020': 16919.677734375, 'sum_pop_f_50_2020': 17537.130859375, 'sum_pop_f_55_2020': 19871.884765625, 'sum_pop_f_5_2020': 11981.1435546875, 'sum_pop_f_60_2020': 21095.8203125, 'sum_pop_f_65_2020': 18827.03125, 'sum_pop_f_70_2020': 15865.4326171875, 'sum_pop_f_75_2020': 10600.365234375, 'sum_pop_f_80_2020': 16381.328125, 'sum_pop_m_0_2020': 3306.43798828125, 'sum_pop_m_10_2020': 12306.724609375, 'sum_pop_m_15_2020': 14101.4326171875, 'sum_pop_m_1_2020': 12310.578125, 'sum_pop_m_20_2020': 22694.451171875, 'sum_pop_m_25_2020': 32823.91015625, 'sum_pop_m_30_2020': 30206.892578125, 'sum_pop_m_35_2020': 23337.458984375, 'sum_pop_m_40_2020': 19612.8359375, 'sum_pop_m_45_2020': 16889.306640625, 'sum_pop_m_50_2020': 16264.783203125, 'sum_pop_m_55_2020': 17167.7890625, 'sum_pop_m_5_2020': 12373.455078125, 'sum_pop_m_60_2020': 16715.095703125, 'sum_pop_m_65_2020': 14265.3515625, 'sum_pop_m_70_2020': 11801.837890625, 'sum_pop_m_75_2020': 7603.23095703125, 'sum_pop_m_80_2020': 8574.2255859375}" | ||
862a100dfffffff,"{'__index_level_0__': 2436239, 'ogc_fid': 20, 'sum_pop_f_0_2020': 5147.3330078125, 'sum_pop_f_10_2020': 21324.5234375, 'sum_pop_f_15_2020': 22045.63671875, 'sum_pop_f_1_2020': 19163.771484375, 'sum_pop_f_20_2020': 27736.291015625, 'sum_pop_f_25_2020': 35799.5234375, 'sum_pop_f_30_2020': 33238.703125, 'sum_pop_f_35_2020': 27172.52734375, 'sum_pop_f_40_2020': 23106.755859375, 'sum_pop_f_45_2020': 21305.44921875, 'sum_pop_f_50_2020': 22005.82421875, 'sum_pop_f_55_2020': 23561.41796875, 'sum_pop_f_5_2020': 20879.88671875, 'sum_pop_f_60_2020': 23332.68359375, 'sum_pop_f_65_2020': 18850.859375, 'sum_pop_f_70_2020': 17307.85546875, 'sum_pop_f_75_2020': 11436.3251953125, 'sum_pop_f_80_2020': 17651.9609375, 'sum_pop_m_0_2020': 5364.07861328125, 'sum_pop_m_10_2020': 22061.70703125, 'sum_pop_m_15_2020': 22792.357421875, 'sum_pop_m_1_2020': 19971.615234375, 'sum_pop_m_20_2020': 27098.421875, 'sum_pop_m_25_2020': 32599.54296875, 'sum_pop_m_30_2020': 30635.79296875, 'sum_pop_m_35_2020': 24916.66796875, 'sum_pop_m_40_2020': 20631.275390625, 'sum_pop_m_45_2020': 18518.20703125, 'sum_pop_m_50_2020': 18673.708984375, 'sum_pop_m_55_2020': 19300.26953125, 'sum_pop_m_5_2020': 21773.6953125, 'sum_pop_m_60_2020': 18752.81640625, 'sum_pop_m_65_2020': 14246.966796875, 'sum_pop_m_70_2020': 12312.962890625, 'sum_pop_m_75_2020': 7566.3466796875, 'sum_pop_m_80_2020': 8843.8251953125}" | ||
862a10707ffffff,"{'__index_level_0__': 2436390, 'ogc_fid': 49, 'sum_pop_f_0_2020': 1031.58679199219, 'sum_pop_f_10_2020': 3744.32397460938, 'sum_pop_f_15_2020': 3917.22143554688, 'sum_pop_f_1_2020': 3840.6474609375, 'sum_pop_f_20_2020': 5425.830078125, 'sum_pop_f_25_2020': 8483.66796875, 'sum_pop_f_30_2020': 8123.6982421875, 'sum_pop_f_35_2020': 6064.3916015625, 'sum_pop_f_40_2020': 4933.3759765625, 'sum_pop_f_45_2020': 4362.7236328125, 'sum_pop_f_50_2020': 4338.94580078125, 'sum_pop_f_55_2020': 4514.39990234375, 'sum_pop_f_5_2020': 3701.62890625, 'sum_pop_f_60_2020': 4312.689453125, 'sum_pop_f_65_2020': 3729.978515625, 'sum_pop_f_70_2020': 3232.61254882812, 'sum_pop_f_75_2020': 2183.283203125, 'sum_pop_f_80_2020': 3135.68408203125, 'sum_pop_m_0_2020': 1086.77282714844, 'sum_pop_m_10_2020': 3941.2705078125, 'sum_pop_m_15_2020': 4346.45556640625, 'sum_pop_m_1_2020': 4046.28857421875, 'sum_pop_m_20_2020': 5813.85107421875, 'sum_pop_m_25_2020': 8800.37109375, 'sum_pop_m_30_2020': 8521.666015625, 'sum_pop_m_35_2020': 6522.4853515625, 'sum_pop_m_40_2020': 5050.24072265625, 'sum_pop_m_45_2020': 4325.6865234375, 'sum_pop_m_50_2020': 4091.5224609375, 'sum_pop_m_55_2020': 4049.44775390625, 'sum_pop_m_5_2020': 3862.1474609375, 'sum_pop_m_60_2020': 3625.97485351562, 'sum_pop_m_65_2020': 2958.13623046875, 'sum_pop_m_70_2020': 2380.19506835938, 'sum_pop_m_75_2020': 1527.44067382812, 'sum_pop_m_80_2020': 1581.6474609375}" | ||
862a1070fffffff,"{'__index_level_0__': 2436391, 'ogc_fid': 50, 'sum_pop_f_0_2020': 285.267578125, 'sum_pop_f_10_2020': 1234.03344726562, 'sum_pop_f_15_2020': 1272.86743164062, 'sum_pop_f_1_2020': 1062.06518554688, 'sum_pop_f_20_2020': 1501.95458984375, 'sum_pop_f_25_2020': 1971.18774414062, 'sum_pop_f_30_2020': 1970.33056640625, 'sum_pop_f_35_2020': 1658.791015625, 'sum_pop_f_40_2020': 1498.34887695312, 'sum_pop_f_45_2020': 1380.0244140625, 'sum_pop_f_50_2020': 1413.89428710938, 'sum_pop_f_55_2020': 1476.39672851562, 'sum_pop_f_5_2020': 1194.50146484375, 'sum_pop_f_60_2020': 1449.78100585938, 'sum_pop_f_65_2020': 1220.18408203125, 'sum_pop_f_70_2020': 1001.80200195312, 'sum_pop_f_75_2020': 692.449951171875, 'sum_pop_f_80_2020': 1021.06573486328, 'sum_pop_m_0_2020': 301.492889404297, 'sum_pop_m_10_2020': 1309.16052246094, 'sum_pop_m_15_2020': 1371.35522460938, 'sum_pop_m_1_2020': 1122.52270507812, 'sum_pop_m_20_2020': 1590.68212890625, 'sum_pop_m_25_2020': 2043.75939941406, 'sum_pop_m_30_2020': 1984.005859375, 'sum_pop_m_35_2020': 1687.57556152344, 'sum_pop_m_40_2020': 1456.80505371094, 'sum_pop_m_45_2020': 1328.02758789062, 'sum_pop_m_50_2020': 1317.73278808594, 'sum_pop_m_55_2020': 1337.02490234375, 'sum_pop_m_5_2020': 1246.01293945312, 'sum_pop_m_60_2020': 1250.81298828125, 'sum_pop_m_65_2020': 1011.77893066406, 'sum_pop_m_70_2020': 772.69921875, 'sum_pop_m_75_2020': 493.368988037109, 'sum_pop_m_80_2020': 513.383544921875}" | ||
862a10727ffffff,"{'__index_level_0__': 2436394, 'ogc_fid': 53, 'sum_pop_f_0_2020': 1815.30310058594, 'sum_pop_f_10_2020': 6547.26708984375, 'sum_pop_f_15_2020': 7142.46240234375, 'sum_pop_f_1_2020': 6758.46142578125, 'sum_pop_f_20_2020': 10860.0166015625, 'sum_pop_f_25_2020': 16565.375, 'sum_pop_f_30_2020': 15229.623046875, 'sum_pop_f_35_2020': 11273.6337890625, 'sum_pop_f_40_2020': 9142.7578125, 'sum_pop_f_45_2020': 8026.2275390625, 'sum_pop_f_50_2020': 8047.564453125, 'sum_pop_f_55_2020': 8562.94921875, 'sum_pop_f_5_2020': 6503.41015625, 'sum_pop_f_60_2020': 8420.705078125, 'sum_pop_f_65_2020': 7366.7041015625, 'sum_pop_f_70_2020': 6323.734375, 'sum_pop_f_75_2020': 4252.55859375, 'sum_pop_f_80_2020': 6203.60986328125, 'sum_pop_m_0_2020': 1909.22351074219, 'sum_pop_m_10_2020': 6867.8671875, 'sum_pop_m_15_2020': 7672.0263671875, 'sum_pop_m_1_2020': 7108.4482421875, 'sum_pop_m_20_2020': 10829.1796875, 'sum_pop_m_25_2020': 16439.6875, 'sum_pop_m_30_2020': 15721.458984375, 'sum_pop_m_35_2020': 11972.9853515625, 'sum_pop_m_40_2020': 9380.048828125, 'sum_pop_m_45_2020': 8014.259765625, 'sum_pop_m_50_2020': 7586.525390625, 'sum_pop_m_55_2020': 7627.609375, 'sum_pop_m_5_2020': 6769.341796875, 'sum_pop_m_60_2020': 6963.20947265625, 'sum_pop_m_65_2020': 5769.19580078125, 'sum_pop_m_70_2020': 4668.7470703125, 'sum_pop_m_75_2020': 3006.267578125, 'sum_pop_m_80_2020': 3161.91650390625}" | ||
862a1072fffffff,"{'__index_level_0__': 2436395, 'ogc_fid': 54, 'sum_pop_f_0_2020': 1863.88208007812, 'sum_pop_f_10_2020': 7031.26171875, 'sum_pop_f_15_2020': 8313.15625, 'sum_pop_f_1_2020': 6939.32373046875, 'sum_pop_f_20_2020': 14435.01171875, 'sum_pop_f_25_2020': 20009.3125, 'sum_pop_f_30_2020': 16851.1328125, 'sum_pop_f_35_2020': 12630.2119140625, 'sum_pop_f_40_2020': 10353.865234375, 'sum_pop_f_45_2020': 9068.0224609375, 'sum_pop_f_50_2020': 9348.8232421875, 'sum_pop_f_55_2020': 10468.044921875, 'sum_pop_f_5_2020': 7024.83740234375, 'sum_pop_f_60_2020': 10943.9521484375, 'sum_pop_f_65_2020': 9579.228515625, 'sum_pop_f_70_2020': 8196.7197265625, 'sum_pop_f_75_2020': 5452.208984375, 'sum_pop_f_80_2020': 8342.0283203125, 'sum_pop_m_0_2020': 1944.47668457031, 'sum_pop_m_10_2020': 7258.96826171875, 'sum_pop_m_15_2020': 8105.0302734375, 'sum_pop_m_1_2020': 7239.70361328125, 'sum_pop_m_20_2020': 12277.5703125, 'sum_pop_m_25_2020': 17537.927734375, 'sum_pop_m_30_2020': 16196.8681640625, 'sum_pop_m_35_2020': 12452.357421875, 'sum_pop_m_40_2020': 10254.236328125, 'sum_pop_m_45_2020': 8837.01171875, 'sum_pop_m_50_2020': 8528.3359375, 'sum_pop_m_55_2020': 8954.818359375, 'sum_pop_m_5_2020': 7267.75244140625, 'sum_pop_m_60_2020': 8682.8642578125, 'sum_pop_m_65_2020': 7253.91015625, 'sum_pop_m_70_2020': 6027.0146484375, 'sum_pop_m_75_2020': 3859.2119140625, 'sum_pop_m_80_2020': 4310.0849609375}" | ||
862a10757ffffff,"{'__index_level_0__': 2436399, 'ogc_fid': 58, 'sum_pop_f_0_2020': 834.680480957031, 'sum_pop_f_10_2020': 3883.02319335938, 'sum_pop_f_15_2020': 3978.943359375, 'sum_pop_f_1_2020': 3107.55615234375, 'sum_pop_f_20_2020': 4433.38232421875, 'sum_pop_f_25_2020': 5126.501953125, 'sum_pop_f_30_2020': 5110.86669921875, 'sum_pop_f_35_2020': 4612.77783203125, 'sum_pop_f_40_2020': 4320.1806640625, 'sum_pop_f_45_2020': 4065.07666015625, 'sum_pop_f_50_2020': 4238.0830078125, 'sum_pop_f_55_2020': 4474.5029296875, 'sum_pop_f_5_2020': 3739.5732421875, 'sum_pop_f_60_2020': 4464.12255859375, 'sum_pop_f_65_2020': 3650.60009765625, 'sum_pop_f_70_2020': 3047.7119140625, 'sum_pop_f_75_2020': 2093.39038085938, 'sum_pop_f_80_2020': 3182.42626953125, 'sum_pop_m_0_2020': 878.235229492188, 'sum_pop_m_10_2020': 4092.75439453125, 'sum_pop_m_15_2020': 4156.81689453125, 'sum_pop_m_1_2020': 3269.85815429688, 'sum_pop_m_20_2020': 4496.69580078125, 'sum_pop_m_25_2020': 4984.02490234375, 'sum_pop_m_30_2020': 4775.9384765625, 'sum_pop_m_35_2020': 4325.20068359375, 'sum_pop_m_40_2020': 3944.751953125, 'sum_pop_m_45_2020': 3706.37255859375, 'sum_pop_m_50_2020': 3792.64916992188, 'sum_pop_m_55_2020': 3921.2607421875, 'sum_pop_m_5_2020': 3899.29052734375, 'sum_pop_m_60_2020': 3796.07446289062, 'sum_pop_m_65_2020': 2990.9345703125, 'sum_pop_m_70_2020': 2328.72338867188, 'sum_pop_m_75_2020': 1461.92907714844, 'sum_pop_m_80_2020': 1593.95361328125}" | ||
862a10767ffffff,"{'__index_level_0__': 2436401, 'ogc_fid': 60, 'sum_pop_f_0_2020': 3148.86889648438, 'sum_pop_f_10_2020': 13023.1484375, 'sum_pop_f_15_2020': 13454.26953125, 'sum_pop_f_1_2020': 11723.392578125, 'sum_pop_f_20_2020': 16926.83203125, 'sum_pop_f_25_2020': 21846.34375, 'sum_pop_f_30_2020': 20248.33984375, 'sum_pop_f_35_2020': 16524.6953125, 'sum_pop_f_40_2020': 14018.3984375, 'sum_pop_f_45_2020': 12919.609375, 'sum_pop_f_50_2020': 13344.369140625, 'sum_pop_f_55_2020': 14300.04296875, 'sum_pop_f_5_2020': 12757.994140625, 'sum_pop_f_60_2020': 14169.640625, 'sum_pop_f_65_2020': 11414.6669921875, 'sum_pop_f_70_2020': 10501.298828125, 'sum_pop_f_75_2020': 6928.44677734375, 'sum_pop_f_80_2020': 10685.734375, 'sum_pop_m_0_2020': 3280.53686523438, 'sum_pop_m_10_2020': 13470.740234375, 'sum_pop_m_15_2020': 13894.583984375, 'sum_pop_m_1_2020': 12214.142578125, 'sum_pop_m_20_2020': 16504.916015625, 'sum_pop_m_25_2020': 19838.443359375, 'sum_pop_m_30_2020': 18610.52734375, 'sum_pop_m_35_2020': 15098.4765625, 'sum_pop_m_40_2020': 12466.5986328125, 'sum_pop_m_45_2020': 11185.37109375, 'sum_pop_m_50_2020': 11283.837890625, 'sum_pop_m_55_2020': 11677.3203125, 'sum_pop_m_5_2020': 13300.9453125, 'sum_pop_m_60_2020': 11366.205078125, 'sum_pop_m_65_2020': 8611.40625, 'sum_pop_m_70_2020': 7453.0439453125, 'sum_pop_m_75_2020': 4575.9609375, 'sum_pop_m_80_2020': 5342.28955078125}" | ||
862a10777ffffff,"{'__index_level_0__': 2436403, 'ogc_fid': 62, 'sum_pop_f_0_2020': 3401.08959960938, 'sum_pop_f_10_2020': 14066.287109375, 'sum_pop_f_15_2020': 14531.94140625, 'sum_pop_f_1_2020': 12662.421875, 'sum_pop_f_20_2020': 18282.650390625, 'sum_pop_f_25_2020': 23596.2109375, 'sum_pop_f_30_2020': 21870.20703125, 'sum_pop_f_35_2020': 17848.302734375, 'sum_pop_f_40_2020': 15141.2568359375, 'sum_pop_f_45_2020': 13954.455078125, 'sum_pop_f_50_2020': 14413.2373046875, 'sum_pop_f_55_2020': 15445.458984375, 'sum_pop_f_5_2020': 13779.89453125, 'sum_pop_f_60_2020': 15304.611328125, 'sum_pop_f_65_2020': 12328.96875, 'sum_pop_f_70_2020': 11342.439453125, 'sum_pop_f_75_2020': 7483.4072265625, 'sum_pop_f_80_2020': 11541.6484375, 'sum_pop_m_0_2020': 3543.30419921875, 'sum_pop_m_10_2020': 14549.73046875, 'sum_pop_m_15_2020': 15007.5234375, 'sum_pop_m_1_2020': 13192.482421875, 'sum_pop_m_20_2020': 17826.94140625, 'sum_pop_m_25_2020': 21427.48046875, 'sum_pop_m_30_2020': 20101.208984375, 'sum_pop_m_35_2020': 16307.8466796875, 'sum_pop_m_40_2020': 13465.1591796875, 'sum_pop_m_45_2020': 12081.306640625, 'sum_pop_m_50_2020': 12187.66015625, 'sum_pop_m_55_2020': 12612.66015625, 'sum_pop_m_5_2020': 14366.3359375, 'sum_pop_m_60_2020': 12276.625, 'sum_pop_m_65_2020': 9301.1689453125, 'sum_pop_m_70_2020': 8050.02490234375, 'sum_pop_m_75_2020': 4942.490234375, 'sum_pop_m_80_2020': 5770.201171875}" |
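In the sample CSV above, the `fields` column holds a Python dict literal with single-quoted keys, so it is not valid JSON. One way to read it back, shown here as a sketch, is `ast.literal_eval` from the standard library. The inline sample is an abbreviated copy of the first data row (only three of its keys), and `parse_rows` is a hypothetical helper name.

```python
import ast
import csv
import io

# Abbreviated copy of the first data row above (most keys omitted).
SAMPLE_CSV = (
    "hex_id,fields\n"
    "862a1008fffffff,\"{'ogc_fid': 11, "
    "'sum_pop_f_0_2020': 3355.96850585938, "
    "'sum_pop_m_0_2020': 3499.36499023438}\"\n"
)


def parse_rows(text):
    """Map each hex_id to its parsed `fields` dict."""
    reader = csv.DictReader(io.StringIO(text))
    # literal_eval only accepts Python literals, so it is safer than eval().
    return {row["hex_id"]: ast.literal_eval(row["fields"]) for row in reader}


rows = parse_rows(SAMPLE_CSV)
cell = rows["862a1008fffffff"]
# Combined female and male population in the age-0 band for this hexagon.
total_age0 = cell["sum_pop_f_0_2020"] + cell["sum_pop_m_0_2020"]
print(total_age0)
```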
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
import pandas as pd | ||
|
||
|
||
df = pd.read_parquet('space2stats.parquet') | ||
chunk_size = 100000 # Number of rows per chunk | ||
|
||
for i in range(0, len(df), chunk_size): | ||
chunk = df.iloc[i:i + chunk_size] | ||
chunk.to_parquet(f'parquet_chunks/space2stats_part_{i // chunk_size}.parquet') | ||
|
||
print("Parquet file split into smaller chunks.") |
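The loop above derives each part number from the starting row index (`i // chunk_size`). The sketch below isolates that arithmetic without pandas. `chunk_bounds` is a hypothetical helper, and the row count of 250,000 is an invented figure used only to show a short final chunk.

```python
def chunk_bounds(n_rows, chunk_size):
    """Yield (part_number, start, stop) for each chunk of a table."""
    for start in range(0, n_rows, chunk_size):
        # df.iloc[start:start + chunk_size] clamps at the end of the frame;
        # min() makes the same clamping explicit here.
        yield start // chunk_size, start, min(start + chunk_size, n_rows)


bounds = list(chunk_bounds(250_000, 100_000))
for part, start, stop in bounds:
    print(f"space2stats_part_{part}.parquet: rows {start}..{stop - 1}")
```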
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
#!/bin/bash | ||
|
||
# Load environment variables from wb_aws.env | ||
source wb_aws.env | ||
|
||
# S3 and file configuration | ||
S3_BUCKET="wbg-geography01" | ||
PARQUET_FILE="Space2Stats/parquet/GLOBAL/combined_population.parquet" | ||
LOCAL_PARQUET_FILE="space2stats.parquet" | ||
|
||
# PostgreSQL configuration | ||
DB_HOST="${MY_DOCKER_IP:-127.0.0.1}" | ||
DB_PORT=5439 | ||
DB_NAME="postgis" | ||
DB_USER="username" | ||
DB_PASSWORD="password" | ||
|
||
# Download Parquet file from S3 | ||
echo "Downloading Parquet file from S3..." | ||
aws s3 cp --quiet s3://$S3_BUCKET/$PARQUET_FILE $LOCAL_PARQUET_FILE |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
#!/bin/bash | ||
|
||
# Database connection details | ||
DB_HOST="localhost" | ||
DB_PORT="5439" | ||
DB_NAME="postgis" | ||
DB_USER="username" | ||
DB_PASSWORD="password" | ||
|
||
# Path to the sample Parquet file | ||
PARQUET_FILE="nyc_sample.parquet" | ||
|
||
# Name of the target table | ||
TABLE_NAME="space2stats_nyc_sample" | ||
|
||
# Check if the table exists | ||
TABLE_EXISTS=$(psql -h $DB_HOST -p $DB_PORT -d $DB_NAME -U $DB_USER -tAc "SELECT EXISTS (SELECT FROM information_schema.tables WHERE table_schema='public' AND table_name='$TABLE_NAME');") | ||
|
||
echo "Importing $PARQUET_FILE..." | ||
|
||
if [ "$TABLE_EXISTS" = "t" ]; then | ||
# Table exists, append data | ||
ogr2ogr -f "PostgreSQL" \ | ||
PG:"host=$DB_HOST port=$DB_PORT dbname=$DB_NAME user=$DB_USER password=$DB_PASSWORD" \ | ||
"$PARQUET_FILE" \ | ||
-nln $TABLE_NAME \ | ||
-append | ||
else | ||
# Table does not exist, create table and import data | ||
ogr2ogr -f "PostgreSQL" \ | ||
PG:"host=$DB_HOST port=$DB_PORT dbname=$DB_NAME user=$DB_USER password=$DB_PASSWORD" \ | ||
"$PARQUET_FILE" \ | ||
-nln $TABLE_NAME && TABLE_EXISTS="t" # flag set only on success, so the $? check below still sees an ogr2ogr failure | ||
fi | ||
|
||
if [ $? -ne 0 ]; then | ||
echo "Failed to import $PARQUET_FILE" | ||
exit 1 | ||
fi | ||
|
||
echo "Successfully imported $PARQUET_FILE" | ||
|
||
echo "The Parquet file has been imported." |
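The script above calls `ogr2ogr` with or without `-append` depending on whether the target table already exists. The same branching can be sketched in Python by building the argument list first. `build_ogr2ogr_args` is a hypothetical helper; the flags (`-f`, `-nln`, `-append`) and the connection values are the ones used in the script, and nothing is executed here.

```python
def build_ogr2ogr_args(
    parquet_file,
    table_name,
    append,
    host="localhost",
    port="5439",
    dbname="postgis",
    user="username",
    password="password",
):
    """Return the ogr2ogr argument list used by the import script."""
    conn = (
        f"PG:host={host} port={port} dbname={dbname} "
        f"user={user} password={password}"
    )
    args = ["ogr2ogr", "-f", "PostgreSQL", conn, parquet_file, "-nln", table_name]
    if append:
        # Only append when the table already exists; otherwise ogr2ogr creates it.
        args.append("-append")
    return args


create_args = build_ogr2ogr_args("nyc_sample.parquet", "space2stats_nyc_sample", append=False)
append_args = build_ogr2ogr_args("nyc_sample.parquet", "space2stats_nyc_sample", append=True)
print(" ".join(create_args))
```

If the list were passed to `subprocess.run(args, check=True)`, a non-zero exit status from `ogr2ogr` would raise `subprocess.CalledProcessError`, which ties the failure check directly to the import command.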
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
#!/bin/bash | ||
|
||
# Database connection details | ||
DB_HOST="localhost" | ||
DB_PORT="5439" | ||
DB_NAME="postgis" | ||
DB_USER="username" | ||
DB_PASSWORD="password" | ||
|
||
# Directory containing the Parquet chunks | ||
CHUNKS_DIR="parquet_chunks" | ||
|
||
# Name of the target table | ||
TABLE_NAME="space2stats" | ||
|
||
# Flag to check if the table exists | ||
TABLE_EXISTS=$(psql -h $DB_HOST -p $DB_PORT -d $DB_NAME -U $DB_USER -tAc "SELECT EXISTS (SELECT FROM information_schema.tables WHERE table_schema='public' AND table_name='$TABLE_NAME');") | ||
|
||
# Loop through each Parquet file in the chunks directory | ||
for PARQUET_FILE in "$CHUNKS_DIR"/*.parquet; | ||
do | ||
echo "Importing $PARQUET_FILE..." | ||
|
||
if [ "$TABLE_EXISTS" = "t" ]; then | ||
# Table exists, append data | ||
ogr2ogr -f "PostgreSQL" \ | ||
PG:"host=$DB_HOST port=$DB_PORT dbname=$DB_NAME user=$DB_USER password=$DB_PASSWORD" \ | ||
"$PARQUET_FILE" \ | ||
-nln $TABLE_NAME \ | ||
-append | ||
else | ||
# Table does not exist, create table and import data | ||
ogr2ogr -f "PostgreSQL" \ | ||
PG:"host=$DB_HOST port=$DB_PORT dbname=$DB_NAME user=$DB_USER password=$DB_PASSWORD" \ | ||
"$PARQUET_FILE" \ | ||
-nln $TABLE_NAME && TABLE_EXISTS="t" # flag set only on success, so the $? check below still sees an ogr2ogr failure | ||
fi | ||
|
||
if [ $? -ne 0 ]; then | ||
echo "Failed to import $PARQUET_FILE" | ||
exit 1 | ||
fi | ||
|
||
echo "Successfully imported $PARQUET_FILE" | ||
done | ||
|
||
echo "All Parquet chunks have been imported." |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
import pandas as pd | ||
import h3 | ||
|
||
|
||
# Load the full dataset | ||
df = pd.read_parquet('space2stats.parquet') | ||
|
||
# Define the bounding box for New York City (approximate values) as a GeoJSON polygon | ||
nyc_polygon = { | ||
"type": "Polygon", | ||
"coordinates": [[ | ||
[-74.259090, 40.477399], | ||
[-73.700272, 40.477399], | ||
[-73.700272, 40.917577], | ||
[-74.259090, 40.917577], | ||
[-74.259090, 40.477399] | ||
]] | ||
} | ||
|
||
# Generate H3 indices for the bounding box using polyfill | ||
resolution = 6 | ||
nyc_hexagons = h3.polyfill(nyc_polygon, resolution, geo_json_conformant=True) | ||
|
||
# Filter the dataframe for New York City H3 indices | ||
nyc_df = df[df['hex_id'].isin(nyc_hexagons)] | ||
|
||
nyc_df.to_parquet('nyc_sample.parquet') | ||
|
||
print("Filtered file for New York City.") |
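The script above writes the New York City bounding box out by hand as a closed GeoJSON ring in longitude, latitude order. A small helper can build the same structure from the four corner values. `bbox_to_polygon` is a hypothetical name, and the sketch stops before the H3 call, since `h3.polyfill` as used above belongs to the h3-py 3.x API.

```python
def bbox_to_polygon(min_lon, min_lat, max_lon, max_lat):
    """Build a GeoJSON Polygon for a bounding box, with a closed ring."""
    ring = [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],  # GeoJSON rings repeat the first position last
    ]
    return {"type": "Polygon", "coordinates": [ring]}


# Same approximate NYC extent as in the script above.
nyc_polygon = bbox_to_polygon(-74.259090, 40.477399, -73.700272, 40.917577)
print(nyc_polygon["coordinates"][0])
```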
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
from fastapi import FastAPI | ||
|
||
from .routers import api | ||
|
||
|
||
app = FastAPI() | ||
|
||
app.include_router(api.router) | ||
|
||
|
||
@app.get("/") | ||
def read_root(): | ||
return {"message": "Welcome to Space2Stats!"} |
Empty file.
Empty file.
Empty file.