-
-
Notifications
You must be signed in to change notification settings - Fork 402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Esa ehst download improvements #2797
Esa ehst download improvements #2797
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. The new approach seems simpler. It might be helpful to note in the documentation that the expected file extension has changed.
I'm not going to mark this as "Approve", though, as I think someone with more direct experience should review if possible.
Many thanks @keflavich . That is what we wanted, more simple and intuitive ways of accessing the data. Also focused on how the users will act to retrieve the data: getting the desired metadata, getting the list of files from that data and then download it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few minor comments, otherwise this is almost ready to go. Thanks!
(ignore the failing devtest job, we're dealing with it separately and it won't hold up this PR)
astroquery/esa/hubble/core.py
Outdated
@@ -107,6 +108,15 @@ def download_product(self, observation_id, *, calibration_level=None, | |||
filename = self._get_product_filename(product_type, filename) | |||
self._tap.load_data(params_dict=params, output_file=filename, verbose=verbose) | |||
|
|||
return self.check_rename_to_gz(filename=filename) | |||
|
|||
def check_rename_to_gz(self, filename): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't have to be a class method, right? Maybe dump it as a "private" function outside of the class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
CHANGES.rst
Outdated
- New methods to download single files and download FITS associated to an observation. [#2797] | ||
- New function to retrieve all the files associated to an observation. [#2797] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it useful to explicitly mention the method/function names. If you do, put them in double backticks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, included.
astroquery/esa/hubble/core.py
Outdated
|
||
def download_file(self, file, *, filename=None, verbose=False): | ||
""" | ||
Download a file from EHST based on its filename. Similar to get_a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ending of this sentence is missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed and updated, the sentence is complete now.
CHANGES.rst
Outdated
@@ -64,6 +66,7 @@ esa.hubble | |||
a lot faster. [#2524] | |||
- Method query_hst_tap has been deprecated and is replaced with query_tap, with the same arguments. [#2597] | |||
- Product types in download_product method have been modified to: PRODUCT, SCIENCE_PRODUCT or POSTCARD. [#2597] | |||
- query_criteria method now allows user to filter by Proposal ID. [#2797] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioning the actual keyword is useful, as well as several other methods got this added.
- query_criteria method now allows user to filter by Proposal ID. [#2797] | |
- Added `proposal` keyword argument to several methods now allows to filter by Proposal ID. [#2797] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I included your suggestion, thanks!
@@ -59,18 +59,24 @@ def test_query_tap_async(self): | |||
def test_download_product(self): | |||
result = esa_hubble.query_tap(query=self.hst_query) | |||
observation_id = np.random.choice((result['observation_id'])) | |||
temp_file = self.temp_folder.name + "/" + observation_id + ".tar" | |||
temp_file = self.temp_folder.name + "/" + observation_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid this won't work with windows (we still have a bit of a cleanup backlog for that, including some in the esa modules #2773). So while we're at it, could you swap all these string manipulations with an os.path.join
instead, here and below, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified in different places so it works on every environment.
docs/esa/hubble/hubble.rst
Outdated
In eHST is it possible to download products based on their observation ID (mandatory) and | ||
a required calibration_level (RAW, CALIBRATED, PRODUCT or AUXILIARY) and/or product type (SCIENCE, PREVIEW, THUMBNAIL or AUXILIARY). | ||
|
||
Deprecation Warning: product types PRODUCT, SCIENCE_PRODUCT or POSTCARD are no longer supported. Please modify your scripts accordingly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a note or some other directive around this to highlight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notes and warnings included.
docs/esa/hubble/hubble.rst
Outdated
>>> from astroquery.esa.hubble import ESAHubble | ||
>>> esahubble = ESAHubble() | ||
>>> esahubble.download_product(observation_id="j6fl25s4q", calibration_level="RAW", | ||
... filename="raw_data_for_j6fl25s4q") # doctest: +IGNORE_OUTPUT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although we're ignoring the outputs (but only do that if frequent changes are expected, e.g. more observations are coming in, the order of the result is not deterministic, etc), please include a few line long of the expected outcome. E.g. in this case, as I see it will be only the string raw_data_for_j6fl25s4q.zip
OTOH, all these download examples are actually downloading the files, so I wonder whether cutting back on the actual code examples would make sense (after all your narrative text does the explanation quite well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the expected output. Thanks!
Thanks for your feedback @bsipocz. I just updated the branch with your comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor namespace fix that I commit directly, otherwise it's good to go. Thanks @jespinosaar.
However, with the new added methods I did a quick check on the current API, and saw that they may benefit from naming them slightly differently to add some consistency. I would suggest to consider these comments, and address it in a follow-up?
params = {"RETRIEVAL_TYPE": "PRODUCT", "ARTIFACTID": artifact_id, "TAPCLIENT": "ASTROQUERY"} | ||
return self.download_file(file=artifact_id, filename=filename, verbose=verbose) | ||
|
||
def get_associated_files(self, observation_id, *, verbose=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had another go through this looking into some consistency with the currently present API.
Practically this has the purpose as get_obs_products
for jwst, right? https://github.com/astropy/astroquery/blob/main/astroquery/esa/jwst/core.py#L947
While the return differs a bit, how do you feel about calling them the same, and in a follow-up PR making them do the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the same, in this case we are only listing the file names, in case users want to filter them before download anything. get_obs_products downloads them.
f"art.observation_id = '{observation_id}'") | ||
return self.query_tap(query=query) | ||
|
||
def download_fits_files(self, observation_id, *, verbose=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same API considerations, maybe this could be called download_files
, and having a kwarg for extensions defaulting to 'fits'
? This is really only a discussion item as download_files
is not really wide-spread, and the currently existing ones in alma and casda are having different signature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We wanted a specific method naming FITS, to make it more visible to users.
Codecov Report
@@ Coverage Diff @@
## main #2797 +/- ##
==========================================
+ Coverage 66.12% 66.17% +0.04%
==========================================
Files 235 235
Lines 18080 18109 +29
==========================================
+ Hits 11955 11983 +28
- Misses 6125 6126 +1
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Thanks @jespinosaar! About the method names I opened a reminder issue as it doesn't need to hold up the merge of this one. |
Many thanks @bsipocz and @keflavich! Yes, we can fix the method names in other PR. |
Dear Astroquery Team,
This PR aims to offer a more intuitive way for users to download products from eHST Science Archive. I have left get_artifact method to ensure backwards compatibility, but I have moved its functionality to download_file method. In addition to this, we have added a mechanism to retrieve the files associated to an observation and download only FITS files. Finally, the criteria methods have been updated so now users can search by proposal.
Let's iterate in this new pull request. Please let me know your thoughts and comments. Thanks in advance for your support!
cc @esdc-esac-esa-int