2024 modelling data update #670

Draft
wants to merge 159 commits into
base: master
159 commits
acb0154
Initial test
Damonamajor Oct 9, 2024
0652b3a
Sort headings
Damonamajor Oct 9, 2024
5191e58
New test
Damonamajor Oct 10, 2024
c4807fb
local
Damonamajor Oct 10, 2024
6e2a088
surface width
Damonamajor Oct 10, 2024
ccc67fa
More tests
Damonamajor Oct 10, 2024
6a9c130
Query improvements
Damonamajor Oct 10, 2024
12b2ebf
Make all distinct_pins
Damonamajor Oct 10, 2024
a22e314
Make traffic_width unique
Damonamajor Oct 11, 2024
d835273
switch to daily_traffic
Damonamajor Oct 11, 2024
7bd08b8
Try master
Damonamajor Oct 11, 2024
ca76942
Fix master?
Damonamajor Oct 11, 2024
8ba251d
Another master test
Damonamajor Oct 11, 2024
1cf60f2
Another master test
Damonamajor Oct 11, 2024
d3ba259
Add config
Damonamajor Oct 11, 2024
e34b8c8
switch to width
Damonamajor Oct 11, 2024
4ae5554
Make year last column
Damonamajor Oct 11, 2024
d2a45fa
Merge into master
Damonamajor Oct 15, 2024
aace136
Try to remove 2014
Damonamajor Oct 15, 2024
3f070de
Remove minor_collector
Damonamajor Oct 15, 2024
b50e813
Remove parcel.year = pin.year
Damonamajor Oct 15, 2024
ebbc52e
Remove minor again
Damonamajor Oct 15, 2024
e8671f7
Remove Freeway
Damonamajor Oct 15, 2024
967114c
Start from begining
Damonamajor Oct 15, 2024
ba0b0a1
Separate freeway
Damonamajor Oct 15, 2024
aa91cf8
Add principal
Damonamajor Oct 15, 2024
639468c
Try with major
Damonamajor Oct 15, 2024
ed4ba5f
Add other
Damonamajor Oct 16, 2024
0c41137
re-add major
Damonamajor Oct 16, 2024
7cc2b22
remove year
Damonamajor Oct 16, 2024
03bdf61
Remove year from join
Damonamajor Oct 16, 2024
09ce316
Remove all pcl.year
Damonamajor Oct 16, 2024
7291c2d
remove year from joins
Damonamajor Oct 16, 2024
f0ca25f
Try a bit of refactor
Damonamajor Oct 16, 2024
205e5c5
rename data_year
Damonamajor Oct 16, 2024
33e1993
Try to add minor collector
Damonamajor Oct 16, 2024
ce34f6f
Add CTE
Damonamajor Oct 16, 2024
6469e29
Add lands and surface type
Damonamajor Oct 24, 2024
0e8dc1f
Switch to speed limit
Damonamajor Oct 24, 2024
b4351bc
Lowercase and
Damonamajor Oct 24, 2024
6bc3d82
Add dailiy_traffic
Damonamajor Oct 24, 2024
373ec21
Test local in wrong query
Damonamajor Oct 24, 2024
3ba546f
Try different join syntax
Damonamajor Oct 24, 2024
6d15e66
Add local
Damonamajor Oct 24, 2024
fdae401
Try to add other
Damonamajor Oct 24, 2024
fe8a1cc
Add minor collector
Damonamajor Oct 24, 2024
601292a
Add year to join for minor_collector
Damonamajor Oct 24, 2024
35bfcf9
Add year to join for minor_collector
Damonamajor Oct 24, 2024
ce87790
Test with only valid daily traffic for local
Damonamajor Oct 25, 2024
40ddfd7
Test macro upload
Damonamajor Oct 25, 2024
a3131c6
Test macro
Damonamajor Oct 25, 2024
364fafb
Revert to traffic not null
Damonamajor Oct 28, 2024
28cbeb2
Test nearest_local new format
Damonamajor Oct 28, 2024
387731c
Revert
Damonamajor Oct 28, 2024
e70ea40
Revert
Damonamajor Oct 28, 2024
43937d3
test
Damonamajor Oct 28, 2024
35dafbf
Remove semicolon
Damonamajor Oct 28, 2024
fe29661
Try null and all for local
Damonamajor Oct 29, 2024
1d118f4
typo
Damonamajor Oct 29, 2024
3f5f221
add road to query
Damonamajor Oct 29, 2024
f6bdb9a
Revert to working query
Damonamajor Oct 29, 2024
7dd95cd
Test pin to local
Damonamajor Nov 8, 2024
97d41bb
Test Local v2
Damonamajor Nov 8, 2024
4e22dcc
Test v3
Damonamajor Nov 8, 2024
4a1b303
Revert to working
Damonamajor Nov 13, 2024
e25112e
Merge branch 'master' into Create-IDOT-features
Damonamajor Nov 13, 2024
a4277e8
Add _road to macro
Damonamajor Nov 13, 2024
56307bb
Merge branch 'Create-IDOT-features' of github.com:ccao-data/data-arch…
Damonamajor Nov 13, 2024
cdf419c
Add _road
Damonamajor Nov 13, 2024
1a3f32e
Remove test macros
Damonamajor Nov 13, 2024
3647eed
Update dbt/models/proximity/proximity.dist_pin_to_traffic_master.sql
Damonamajor Nov 14, 2024
9cd97ae
nearest highway
Damonamajor Nov 14, 2024
ff423a9
Merge branch 'Create-IDOT-features' of github.com:ccao-data/data-arch…
Damonamajor Nov 14, 2024
e415509
Add highways/collector
Damonamajor Nov 14, 2024
d903517
Add arterial, delete master
Damonamajor Nov 14, 2024
f177074
Add spatial.parcel
Damonamajor Nov 14, 2024
c03c320
Add year to join
Damonamajor Nov 14, 2024
48603c0
Fix query
Damonamajor Nov 14, 2024
8544f80
Add to proximity
Damonamajor Nov 14, 2024
f93fefd
Add to fill
Damonamajor Nov 14, 2024
53ef3be
Improve docs
Damonamajor Nov 14, 2024
bddc0c8
Reorder columns
Damonamajor Nov 15, 2024
ed414a9
Reorder columns
Damonamajor Nov 15, 2024
623a801
Fix schema
Damonamajor Nov 15, 2024
cc24ca4
Fix columns
Damonamajor Nov 15, 2024
ca3ab54
Reorder columns
Damonamajor Nov 15, 2024
159df1c
Rename columns
Damonamajor Nov 15, 2024
5d8ee59
Add shared input
Damonamajor Nov 15, 2024
ecb5fe0
Add crosswalk year fill
Damonamajor Nov 15, 2024
15e6c3b
add nearest_
Damonamajor Nov 15, 2024
88d62ea
Add nearest
Damonamajor Nov 15, 2024
b3d3e5f
add to cyf
Damonamajor Nov 15, 2024
fae0635
Add nearest
Damonamajor Nov 15, 2024
dadc100
Add to schema
Damonamajor Nov 15, 2024
72a244a
Add to models schema
Damonamajor Nov 15, 2024
b896304
Rename to roads
Damonamajor Nov 17, 2024
546cfef
Revert to traffic
Damonamajor Nov 17, 2024
7d98745
Rename
Damonamajor Nov 17, 2024
f333fa1
Delete file
Damonamajor Nov 17, 2024
8daaffd
Add to schema
Damonamajor Nov 17, 2024
2dd70d7
Fix schema
Damonamajor Nov 17, 2024
64bbf96
Add to inputs
Damonamajor Nov 17, 2024
40585a7
Rename to connector
Damonamajor Nov 17, 2024
92a459c
Rename to collector
Damonamajor Nov 17, 2024
50c8740
Rename to collector
Damonamajor Nov 17, 2024
42f6815
Add other features
Damonamajor Nov 17, 2024
d24d72a
rename schema
Damonamajor Nov 17, 2024
530e843
Add road
Damonamajor Nov 17, 2024
d8afea6
Add other category
Damonamajor Nov 18, 2024
cb4a74b
Remove Major Arterial
Damonamajor Nov 19, 2024
bb84a02
Rename to roads
Damonamajor Nov 19, 2024
0390507
Rename to roads
Damonamajor Nov 20, 2024
ce7e290
Rename columns
Damonamajor Nov 20, 2024
2b2b6c4
Fix crosswalk
Damonamajor Nov 20, 2024
1e92767
More renames
Damonamajor Nov 20, 2024
bbef26f
More renames
Damonamajor Nov 20, 2024
1875dea
Make singular
Damonamajor Nov 21, 2024
a54494e
Make singular
Damonamajor Nov 21, 2024
a8375d3
Merge branch 'Create-IDOT-features' into 585-add-a-loaded_at-column-t…
wrridgeway Nov 21, 2024
f492114
Add loaded_at to utils
wrridgeway Nov 21, 2024
5b0d5d7
Simplify code
Nov 25, 2024
3e32945
Minor code cleaning
wrridgeway Nov 25, 2024
d761d88
Add commenting
wrridgeway Nov 26, 2024
2b1e903
Commenting
wrridgeway Dec 5, 2024
08b765f
Correct file path
wrridgeway Dec 5, 2024
4c3c256
Merge branch 'master' into 585-add-a-loaded_at-column-to-all-sources
wrridgeway Dec 5, 2024
0739010
Add loaded_at columns
wrridgeway Dec 5, 2024
d4bddfb
Add 2024 fema url
wrridgeway Dec 5, 2024
15f480e
Simplify code
wrridgeway Dec 5, 2024
5922fd1
Maybe
wrridgeway Dec 5, 2024
1cb245b
Add new cps boundaries
wrridgeway Dec 6, 2024
233d3fb
Temp adjustment to lintr
wrridgeway Dec 10, 2024
cf0d082
Merge branch 'master' into 585-add-a-loaded_at-column-to-all-sources
wrridgeway Dec 11, 2024
3bb9e54
Remove accidental change to .lintr
wrridgeway Dec 11, 2024
26a236b
Linting
wrridgeway Dec 11, 2024
8c21816
Linting
wrridgeway Dec 11, 2024
4eeb02d
Merge branch 'master' into 585-add-a-loaded_at-column-to-all-sources
wrridgeway Dec 12, 2024
38f68fe
Test lintr
wrridgeway Dec 12, 2024
5e84a15
Final code updates
wrridgeway Dec 12, 2024
2762560
Add geoparquet_to_s3
wrridgeway Dec 12, 2024
d095bfb
Switch to utils function
wrridgeway Dec 12, 2024
11b33fe
Switch to util function
wrridgeway Dec 12, 2024
253a71d
Swap out all instances
wrridgeway Dec 12, 2024
b54f6f5
Clean ccao script
wrridgeway Dec 12, 2024
0659dcd
Address non-spatial parquets
wrridgeway Dec 16, 2024
ea3b35c
More replacing
wrridgeway Dec 16, 2024
c76ca75
Undo raw changes
wrridgeway Dec 16, 2024
52f5877
Test renv fix
wrridgeway Dec 17, 2024
3528bdf
Linting
wrridgeway Dec 17, 2024
459fc8d
Linting
wrridgeway Dec 17, 2024
166701f
Linting
wrridgeway Dec 17, 2024
67771aa
More loaded_at
wrridgeway Dec 17, 2024
c5e7818
More loaded at
wrridgeway Dec 17, 2024
4e6f2e6
Merge branch '585-add-a-loaded_at-column-to-all-sources' of https://g…
wrridgeway Dec 17, 2024
7784c29
Merge branch 'master' into 585-add-a-loaded_at-column-to-all-sources
wrridgeway Dec 17, 2024
952b4b8
Housing
wrridgeway Dec 17, 2024
ab52d12
More loaded_at
wrridgeway Dec 18, 2024
585a0d5
More loaded_at
wrridgeway Dec 18, 2024
0760d2b
Code cleanup
wrridgeway Dec 18, 2024
4 changes: 4 additions & 0 deletions .github/workflows/lint.yaml
@@ -23,6 +23,10 @@ jobs:
           # list of changed files within `super-linter`
           fetch-depth: 0

+      - name: Disable renv
+        shell: bash
+        run: rm etl/.Rprofile

Comment on lines +26 to +28
Member Author

Allows CI to pass.

+
       - name: Lint
         uses: github/super-linter@v6
         env:
15 changes: 14 additions & 1 deletion etl/renv.lock
@@ -1,6 +1,6 @@
 {
   "R": {
-    "Version": "4.4.2",
+    "Version": "4.4.1",
     "Repositories": [
       {
         "Name": "CRAN",
@@ -2808,6 +2808,19 @@
       ],
       "Hash": "ad57b543f7c3fca05213ba78ff63df9b"
     },
+    "sfarrow": {
+      "Package": "sfarrow",
+      "Version": "0.4.1",
+      "Source": "Repository",
+      "Repository": "CRAN",
+      "Requirements": [
+        "arrow",
+        "dplyr",
+        "jsonlite",
+        "sf"
+      ],
+      "Hash": "b320f164b1d7bb7e4582b841e22d15a0"
+    },
     "sfheaders": {
       "Package": "sfheaders",
       "Version": "0.4.4",
@@ -26,13 +26,13 @@ remote_file <- file.path(output_bucket, paste0("ihs_price_index_data.parquet"))

 # Grab the data, clean it just a bit, and write if it doesn't already exist
 data.frame(t(
-  openxlsx::read.xlsx(most_recent_ihs_data_url, sheet = 2) %>%
-    dplyr::select(-c("X2", "X3", "X4"))
+  read.xlsx(most_recent_ihs_data_url, sheet = 2) %>%
+    select(-c("X2", "X3", "X4"))
 )) %>%
   # Names and columns are kind of a mess after the transpose,
   # shift up first row, shift over column names
-  janitor::row_to_names(1) %>%
-  dplyr::mutate(puma = rownames(.)) %>%
-  dplyr::relocate(puma, .before = "YEARQ") %>%
-  dplyr::rename(name = "YEARQ") %>%
-  arrow::write_parquet(remote_file)
+  row_to_names(1) %>%
+  mutate(puma = rownames(.)) %>%
+  relocate(puma, .before = "YEARQ") %>%
+  rename(name = "YEARQ") %>%
+  write_parquet(remote_file)
7 changes: 2 additions & 5 deletions etl/scripts-ccao-data-raw-us-east-1/sale/sale-foreclosure.R
@@ -16,7 +16,7 @@ output_bucket <- file.path(AWS_S3_RAW_BUCKET, "sale", "foreclosure")
 files <- list.files("O:/CCAODATA/data/foreclosures", recursive = TRUE)

 # Function to retrieve data and write to S3
-read_write <- function(x) {
+walk(files, \(x) {
   output_dest <- file.path(output_bucket, glue(parse_number(x), ".parquet"))

   if (!object_exists(output_dest)) {
@@ -28,7 +28,4 @@ read_write <- function(x) {
     ) %>%
       write_parquet(output_dest)
   }
-}
-
-# Apply function to foreclosure data
-walk(files, read_write)
+})
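
The hunk above swaps a named helper plus a trailing `walk(files, read_write)` call for an anonymous function passed straight to `walk()`. A minimal sketch of why the two forms are interchangeable, assuming purrr is installed and R >= 4.1 (needed for the `\(x)` lambda syntax). The file names and the `written` vector are hypothetical stand-ins for the real file listing and the S3 write; they are not part of this PR.

```r
library(purrr)

# Hypothetical stand-ins for the file listing and the S3 write in the diff
files <- c("foreclosures_2019.csv", "foreclosures_2020.csv")
written <- character(0)

# Before: a named function defined once, then applied with walk()
read_write <- function(x) {
  written <<- c(written, sub("\\.csv$", ".parquet", x))
}
walk(files, read_write)
named_result <- written

# After: the same body passed inline as an anonymous function
written <- character(0)
walk(files, \(x) {
  written <<- c(written, sub("\\.csv$", ".parquet", x))
})
inline_result <- written

# Both forms visit the same files in the same order
identical(named_result, inline_result)
```

The inline form drops one name from the script's namespace, at the cost of the function no longer being callable on a single file for debugging.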
8 changes: 2 additions & 6 deletions etl/scripts-ccao-data-raw-us-east-1/sale/sale-mydec.R
@@ -22,7 +22,7 @@ files <- xml2::read_html(
   str_subset("ptax203")

 # Function to scrape IDOR data and write to S3
-down_up <- function(x) {
+walk(files, \(x) {
   year <- str_extract(x, pattern = "[0-9]{4}")

   if (
@@ -44,8 +44,4 @@ down_up <- function(x) {
     readr::read_delim(list.files(tmp2, full.names = TRUE), delim = "\t") %>%
       write_parquet(file.path(output_bucket, glue("{year}.parquet")))
   }
-}
-
-
-# Apply function to foreclosure data
-walk(files, down_up)
+})
@@ -12,6 +12,8 @@ output_bucket <- file.path(AWS_S3_RAW_BUCKET, "spatial", "access")
 # List APIs from city site
 sources_list <- bind_rows(list(
   # INDUSTRIAL CORRIDORS
+  # See https://data.cityofchicago.org/Community-Economic-Development/IndustrialCorridor_Jan2013/3tu3-iesz/about_data # nolint
+  # for more information
   "ind_2013" = c(
     "source" = "https://data.cityofchicago.org/api/geospatial/",
     "api_url" = "e6xh-nr8w?method=export&format=GeoJSON",
21 changes: 10 additions & 11 deletions etl/scripts-ccao-data-raw-us-east-1/spatial/spatial-ccao.R
@@ -11,19 +11,17 @@ output_bucket <- file.path(AWS_S3_RAW_BUCKET, "spatial", "ccao")

 # Read privileges for the this drive location are limited.
 # Contact Cook County GIS if permissions need to be changed.
-file_path <- "//gisemcv1.ccounty.com/ArchiveServices/"
+file_path <- "//gisemcv1.ccounty.com/ArchiveServices/" # nolint

-sources_list <- bind_rows(list(
+sources_list <- data.frame(
   # NEIGHBORHOOD
-  "neighborhood" = c(
-    "url" = paste0(
-      "https://gitlab.com/ccao-data-science---modeling/packages/ccao",
-      "/-/raw/master/data-raw/nbhd_shp.geojson"
-    ),
-    "boundary" = "neighborhood",
-    "year" = "2021"
-  )
-))
+  "url" = paste0(
+    "https://gitlab.com/ccao-data-science---modeling/packages/ccao",
+    "/-/raw/master/data-raw/nbhd_shp.geojson"
+  ),
+  "boundary" = "neighborhood",
+  "year" = "2021"
+)

 # Function to call referenced API, pull requested data, and write it to S3
 pwalk(sources_list, function(...) {
@@ -45,6 +43,7 @@ gdb_files <- data.frame("path" = list.files(file_path, full.names = TRUE)) %>%
   filter(
     str_detect(path, "Current", negate = TRUE) &
       str_detect(path, "20") &
+      # We detect parcel GDBs, but will extract the township layer
       str_detect(path, "Parcels")
   )

39 changes: 19 additions & 20 deletions etl/scripts-ccao-data-raw-us-east-1/spatial/spatial-political.R
@@ -12,7 +12,7 @@ output_bucket <- file.path(AWS_S3_RAW_BUCKET, "spatial", "political")

 # Read privileges for the this drive location are limited.
 # Contact Cook County GIS if permissions need to be changed.
-file_path <- "//gisemcv1.ccounty.com/ArchiveServices/"
+file_path <- "//gisemcv1.ccounty.com/ArchiveServices/" # nolint

 sources_list <- bind_rows(list(
   # BOARD OF REVIEW
@@ -23,7 +23,7 @@ sources_list <- bind_rows(list(
     "year" = "2012"
   ),
   "bor_2023" = c(
-    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/",
+    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/", # nolint
     "api_url" = "10/query?outFields=*&where=1%3D1&f=geojson",
     "boundary" = "board_of_review_district",
     "year" = "2023"
@@ -37,7 +37,7 @@ sources_list <- bind_rows(list(
     "year" = "2012"
   ),
   "cmd_2023" = c(
-    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/",
+    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/", # nolint
     "api_url" = "9/query?outFields=*&where=1%3D1&f=geojson",
     "boundary" = "commissioner_district",
     "year" = "2023"
@@ -51,7 +51,7 @@ sources_list <- bind_rows(list(
     "year" = "2010"
   ),
   "cnd_2023" = c(
-    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/",
+    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/", # nolint
     "api_url" = "13/query?outFields=*&where=1%3D1&f=geojson",
     "boundary" = "congressional_district",
     "year" = "2023"
@@ -65,7 +65,7 @@ sources_list <- bind_rows(list(
     "year" = "2012"
   ),
   "jsd_2022" = c(
-    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/",
+    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/", # nolint
     "api_url" = "5/query?outFields=*&where=1%3D1&f=geojson",
     "boundary" = "judicial_district",
     "year" = "2022"
@@ -79,7 +79,7 @@ sources_list <- bind_rows(list(
     "year" = "2010"
   ),
   "str_2023" = c(
-    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/",
+    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/", # nolint
     "api_url" = "11/query?outFields=*&where=1%3D1&f=geojson",
     "boundary" = "state_representative_district",
     "year" = "2023"
@@ -93,7 +93,7 @@ sources_list <- bind_rows(list(
     "year" = "2010"
   ),
   "sts_2023" = c(
-    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/",
+    "source" = "https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/", # nolint
     "api_url" = "12/query?outFields=*&where=1%3D1&f=geojson",
     "boundary" = "state_senate_district",
     "year" = "2023"
@@ -150,20 +150,19 @@ pwalk(sources_list, function(...) {
 # MUNICIPALITY

 # Paths for all relevant geodatabases
-gdb_files <- data.frame("path" = list.files(file_path, full.names = TRUE)) %>%
+data.frame("path" = list.files(file_path, full.names = TRUE)) %>%
   filter(
     str_detect(path, "Current", negate = TRUE) &
       str_detect(path, "20") &
       str_detect(path, "Admin")
-  )
-
-# Function to call referenced API, pull requested data, and write it to S3
-pwalk(gdb_files, function(...) {
-  df <- tibble::tibble(...)
-  county_gdb_to_s3(
-    s3_bucket_uri = output_bucket,
-    dir_name = "municipality",
-    file_path = df$path,
-    layer = "MuniTaxDist"
-  )
-})
+  ) %>%
+  # Function to call referenced API, pull requested data, and write it to S3
+  pwalk(function(...) {
+    df <- tibble::tibble(...)
+    county_gdb_to_s3(
+      s3_bucket_uri = output_bucket,
+      dir_name = "municipality",
+      file_path = df$path,
+      layer = "MuniTaxDist"
+    )
+  })
12 changes: 12 additions & 0 deletions etl/scripts-ccao-data-raw-us-east-1/spatial/spatial-school.R
@@ -121,6 +121,12 @@ sources_list <- bind_rows(list(
     "boundary" = "cps_attendance_elementary",
     "year" = "2023-2024"
   ),
+  "attendance_ele_2025" = c(
+    "source" = "https://data.cityofchicago.org/api/geospatial/",
+    "api_url" = "5ihw-cbdn?method=export&format=GeoJSON",
+    "boundary" = "cps_attendance_elementary",
+    "year" = "2024-2025"
+  ),

   # CPS ATTENDANCE - SECONDARY
   "attendance_sec_0607" = c(
@@ -231,6 +237,12 @@ sources_list <- bind_rows(list(
     "boundary" = "cps_attendance_secondary",
     "year" = "2023-2024"
   ),
+  "attendance_sec_2025" = c(
+    "source" = "https://data.cityofchicago.org/api/geospatial/",
+    "api_url" = "4kfz-zr3a?method=export&format=GeoJSON",
+    "boundary" = "cps_attendance_secondary",
+    "year" = "2024-2025"
+  ),

   # LOCATION
   "locations_all_21" = c(
41 changes: 32 additions & 9 deletions etl/scripts-ccao-data-raw-us-east-1/spatial/spatial-transit.R
@@ -20,15 +20,25 @@ options(timeout = max(300, getOption("timeout")))
 cta_feed_dates_list <- c(
   "2015-10-29", "2016-09-30", "2017-10-22", "2018-10-06",
   "2019-10-04", "2020-10-10", "2021-10-09", "2022-10-20",
-  "2023-10-04"
+  "2023-10-04", "2024-10-17"
 )

 # If missing feed on S3, download and remove .htm file (causes errors)
 # then rezip and upload
 get_cta_feed <- function(feed_date) {
-  feed_url <- paste0(
-    "https://transitfeeds.com/p/chicago-transit-authority/165/",
-    str_remove_all(feed_date, "-"), "/download"
+  feed_url <- ifelse(
+    substr(feed_date, 1, 4) <= "2023",
+    paste0(
+      "https://transitfeeds.com/p/chicago-transit-authority/165/",
+      str_remove_all(feed_date, "-"), "/download"
+    ),
+    paste0(
+      "https://files.mobilitydatabase.org/mdb-389/mdb-389-",
+      str_remove_all(feed_date, "-"),
+      "0023/mdb-389-",
+      str_remove_all(feed_date, "-"),
+      "0023.zip"
+    )
   )
   s3_uri <- file.path(output_path, "cta", paste0(feed_date, "-gtfs.zip"))

@@ -55,14 +65,26 @@ walk(cta_feed_dates_list, get_cta_feed)
 metra_feed_dates_list <- c(
   "2015-10-30", "2016-09-30", "2017-10-21", "2018-10-05",
   "2019-10-04", "2020-10-10", "2021-10-08", "2022-10-21",
-  "2023-10-14"
+  "2023-10-14", "2024-04-22"
 )

 get_metra_feed <- function(feed_date) {
-  feed_url <- paste0(
-    "https://transitfeeds.com/p/metra/169/",
-    str_remove_all(feed_date, "-"), "/download"
+
+  feed_url <- ifelse(
+    substr(feed_date, 1, 4) <= "2023",
+    paste0(
+      "https://transitfeeds.com/p/metra/169/",
+      str_remove_all(feed_date, "-"), "/download"
+    ),
+    paste0(
+      "https://files.mobilitydatabase.org/mdb-1187/mdb-1187-",
+      str_remove_all(feed_date, "-"),
+      "0016/mdb-1187-",
+      str_remove_all(feed_date, "-"),
+      "0016.zip"
+    )
   )
+
   s3_uri <- file.path(output_path, "metra", paste0(feed_date, "-gtfs.zip"))

   if (!aws.s3::object_exists(s3_uri)) {
@@ -80,7 +102,8 @@ walk(metra_feed_dates_list, get_metra_feed)
 ##### Pace #####
 pace_feed_dates_list <- c(
   "2015-10-16", "2016-10-15", "2017-10-16", "2018-10-17",
-  "2019-10-22", "2020-09-23", "2021-03-15", "2023-09-24"
+  "2019-10-22", "2020-09-23", "2021-03-15", "2023-09-24",
+  "2024-02-07"
 )

 get_pace_feed <- function(feed_date) {
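
The CTA URL selection in `get_cta_feed` can be read in isolation: feed dates through 2023 resolve to transitfeeds.com, later dates to files.mobilitydatabase.org. A minimal base R sketch of that branch, assuming the same year cutoff and the `0023` suffix shown in the diff. The helper name `build_cta_feed_url` is hypothetical and not part of this PR, and the sketch swaps `ifelse()` and `str_remove_all()` for a scalar `if` and `gsub()` so it needs no packages.

```r
# Hypothetical helper mirroring the CTA branch of the diff; the name
# `build_cta_feed_url` is an assumption for illustration only.
build_cta_feed_url <- function(feed_date) {
  # "2024-10-17" -> "20241017"
  compact_date <- gsub("-", "", feed_date, fixed = TRUE)
  feed_year <- as.integer(substr(feed_date, 1, 4))

  if (feed_year <= 2023) {
    # Feeds dated 2023 or earlier come from transitfeeds.com
    paste0(
      "https://transitfeeds.com/p/chicago-transit-authority/165/",
      compact_date, "/download"
    )
  } else {
    # Later feeds come from the Mobility Database; the "0023" suffix is
    # copied verbatim from the diff
    paste0(
      "https://files.mobilitydatabase.org/mdb-389/mdb-389-",
      compact_date, "0023/mdb-389-", compact_date, "0023.zip"
    )
  }
}

old_url <- build_cta_feed_url("2023-10-04")
new_url <- build_cta_feed_url("2024-10-17")
```

Because the suffix after the date is hard-coded, a future feed published under a different suffix would need the constant updated alongside the new entry in `cta_feed_dates_list`.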