-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add check for existence of requested data to Query init #346
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A minor thing on the error message to fix.
I think the approach is good here. I wouldn't go down the route of hostname checks, a generic check like this is a much better general approach. And yes, can only check so much before a query, at some point the error just has to be triggered at the point of actually using the file.
A future idea could be a function somewhere in VAST Tools that is a list_available_data()
so users could run that to see what is detected on the system they are working on.
vasttools/query.py
Outdated
"Not all requested data is available!" | ||
"Please address and try again." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These strings are added so there won't be a space in there at the moment.
When the space is added the test checks will also need to be updated.
Fix #339.
This code will check that the requested data exists at the initialisation of a
Query
object. It checks that the relevant directories exist and assumes that if those directories exist then the catalogues/fits files it should contain also exist.The goal is a quick first-pass check to see if the data is available (since, e.g., nimbus doesn't have any Stokes Q/U/V data) with a nice exit if it doesn't. Adding a check that all files exist is quite cumbersome and probably unnecessary, as the query will still fail somewhat nicely (i.e. it includes the filepath in the traceback) if they don't.
This solution doesn't directly fix #339, mostly because the jupyterhub instances don't have unique hostnames and therefore a simple hostname check is not feasible. Restricting access to Stokes Q/U/V data to ada is not ideal as some users may have downloaded full stokes data to their own machines. However, this solution is more general and produces a similar outcome while also handling any other instances of missing data (e.g. there is currently no COMBINED data for some epochs on nimbus, there are a number of missing epochs on ada)