Simplify major roads feature #230
@wagnerlmichael This looks good, but let's wait for https://github.com/ccao-data/data-architecture/actions/runs/6867260306 to finish before merging.
Also, can you quickly check the difference between the roads in each year? I see there's a large difference in the raw Parquet file size between 2014 and 2023. I recommend making a small multiples map showing the roads and posting it in the PR comments.
aws-s3/scripts-ccao-data-raw-us-east-1/spatial-environment-major_road.R
@wagnerlmichael Ugh, this is what I was worried about. Clearly someone retagged a bunch of roads at some point, so the road definitions aren't going to be consistent historically. Let's do this:
Forward filling here will be handled by the CTAS, and backward filling should be handled by the model views, where appropriate (so you don't have to do anything to handle filling).
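For readers unfamiliar with the term, forward filling by year can be sketched as follows. This is only an illustration in plain Python, not the CTAS itself: the function name `forward_fill_years` and the example years are hypothetical, and the real query may resolve years differently.

```python
from bisect import bisect_right


def forward_fill_years(data_years, target_years):
    """Map each target year to the most recent year that has data.

    Years that precede all available data map to None; those are the
    cases a backward fill would have to cover.
    """
    available = sorted(data_years)
    filled = {}
    for year in target_years:
        # Number of available years that are <= the target year.
        idx = bisect_right(available, year)
        filled[year] = available[idx - 1] if idx > 0 else None
    return filled


# Hypothetical example: data exists only for 2014, 2015 and 2023.
mapping = forward_fill_years([2014, 2015, 2023], range(2013, 2026))
```

In this sketch a year with no data of its own reuses the latest earlier year, and a year earlier than all data is left empty rather than guessed.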
We ended up taking the 2023 data from the January 1st historical data. This is consistent with our major roads (
This is the most recent build time. It is significantly faster than the prior run time without updated simplified linestrings.
Is there anything I need to do once this PR is merged to update the CTE for prod? I'm assuming the
Because the set of major roads changes from year to year, we have decided to go with an additive approach. Starting from 2014, each year's data builds on the previous year's: roads newly tagged as major are added to the existing dataset, and roads from earlier years are kept.
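A minimal sketch of the additive approach, assuming each year's extract can be reduced to a set of road identifiers. The function name `build_additive_years` and the IDs below are hypothetical, and the actual script works on geometries in R rather than on bare IDs.

```python
def build_additive_years(roads_by_year):
    """Return, for each year, the union of all roads seen up to that year."""
    cumulative = set()
    result = {}
    for year in sorted(roads_by_year):
        # Add this year's roads to everything carried over from earlier years.
        cumulative |= set(roads_by_year[year])
        result[year] = frozenset(cumulative)
    return result


# Hypothetical road IDs: road "b" is retagged away in 2015 but is kept.
raw = {2014: {"a", "b"}, 2015: {"a", "c"}, 2016: {"d"}}
additive = build_additive_years(raw)
```

Under this scheme a road never drops out once it has appeared, so later retagging cannot remove it from earlier or later years.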
Looks good @wagnerlmichael. Thanks for the iterations on this.
The main task of this PR was originally to use `st_simplify` to reduce the complexity of the linestrings representing major roads. It has since expanded in scope to accomplish the following:
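To illustrate what tolerance-based linestring simplification does, here is a small pure-Python Douglas-Peucker sketch. It is a stand-in for the general technique, not the PR's R code: the function names, the sample coordinates, and the tolerance are all hypothetical, and the options actually passed to `st_simplify` are not shown in this thread.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_linestring(coords, tolerance):
    """Douglas-Peucker: drop vertices closer than `tolerance` to the chord."""
    if len(coords) < 3:
        return list(coords)
    first, last = coords[0], coords[-1]
    max_dist, max_idx = 0.0, 0
    for i in range(1, len(coords) - 1):
        d = _point_segment_distance(coords[i], first, last)
        if d > max_dist:
            max_dist, max_idx = d, i
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify each half recursively.
    left = simplify_linestring(coords[: max_idx + 1], tolerance)
    right = simplify_linestring(coords[max_idx:], tolerance)
    return left[:-1] + right


# Hypothetical road: nearly straight for four units, then a sharp turn.
road = [(0, 0), (1, 0.05), (2, -0.04), (3, 0.02), (4, 0), (4, 3)]
simplified = simplify_linestring(road, tolerance=0.1)
```

Here the small wiggles along the straight stretch are removed while the corner is preserved, which is why simplified linestrings can speed up downstream spatial operations without changing the overall shape much.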
Closes #219