Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HQ: Use reprojected access vector in rasterization task #1637

Merged

Conversation

emilyanndavis
Copy link
Member

@emilyanndavis emilyanndavis commented Oct 4, 2024

Description

Fixes #1615 by passing the reprojected access vector to the rasterize_access task

Checklist

  • Updated HISTORY.rst and link to any relevant issue (if these changes are user-facing)
  • [ ] Updated the user's guide (if needed)
  • Tested the Workbench UI (if relevant)

@emilyanndavis
Copy link
Member Author

@dcdenu4 I think this is the only change needed, but I noticed that create_access_raster_task and rasterize_access_task are being added to access_task_list, which allows them to later be passed to total_degradation_task as dependencies.

Do I need to add reproject_access_task to access_task_list?

Or is it sufficient that reproject_access_task is already a dependency of rasterize_access_task (and therefore implicitly understood as a dependency of total_degradation_task)?

Also wondering if you have any suggestions as to how to go about testing this, or if there might be a scenario in another test suite I can use as an example?

@dcdenu4
Copy link
Member

dcdenu4 commented Oct 7, 2024

Hey @emilyanndavis

Or is it sufficient that reproject_access_task is already a dependency of rasterize_access_task (and therefore implicitly understood as a dependency of total_degradation_task)?

Yes, that's right. Since rasterize_access_task can't execute until reproject_access_task is completed, we can just add rasterize_access_task to access_task_list and rely on that dependency chain. Since taskgraph is just queuing up work (tasks) to be run in any order (if n_workers>1) we want to make sure total_degradation_task doesn't start until all the possible inputs needed have been calculated first.

any suggestions as to how to go about testing this

Great question, this is an interesting one for testing. In the Model_Spec, access_vector_path is included in the args_with_spatial_overlap. This means that our validation step is going to check that the bounding boxes of all the included spatial inputs overlap. If they don't, an error will be returned. So, we have some upfront check that makes sure the access vector doesn't have a coordinate system that is somewhere else in the world compared to the other inputs. If the access_vector did have a different projection from the land use land cover input, there's a good chance the rasterize operation would work just fine, though it could be slightly different than what we'd expect. This would be the minute detail we'd want to test, but one that might be a pain to create and one that might already be tested (more on that in a sec). In tests/test_habitat_quality.py we have a def make_access_shp function, which uses a default projected coordinate system. We might expect that if we used a different coordinate system that still represented the same area and vector features, the pixels that get set during rasterization might be different. So we could test that the rasterized layer is what we expect and that we can confirm a difference using another projection... But, if we did have a different output for that rasterization step, I'd ultimately expect the final outputs of the model to be different too. So in that way we already might be testing against this with our regression tests. All to say, though I think it would be interesting to see if an access vector projected in a different space would lead to a different rasterized output, I don't think we need to worry about explicitly testing this.

Happy to talk about this more as I found it hard to clearly communicate here!

@emilyanndavis
Copy link
Member Author

Makes sense, @dcdenu4; thank you for the detailed explanation!

@emilyanndavis emilyanndavis marked this pull request as ready for review October 7, 2024 21:29
Copy link
Member

@dcdenu4 dcdenu4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @emilyanndavis ! Looks good to me. Ignoring the RTD failure as unrelated.

@dcdenu4 dcdenu4 merged commit 16737e9 into natcap:main Oct 8, 2024
28 of 29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HQ doesn't use reprojected access vector
2 participants