Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Esa ehst download improvements #2797

Merged

Conversation

jespinosaar
Copy link
Contributor

Dear Astroquery Team,

This PR aims to offer a more intuitive way for users to download products from eHST Science Archive. I have left get_artifact method to ensure backwards compatibility, but I have moved its functionality to download_file method. In addition to this, we have added a mechanism to retrieve the files associated to an observation and download only FITS files. Finally, the criteria methods have been updated so now users can search by proposal.

Let's iterate in this new pull request. Please let me know your thoughts and comments. Thanks in advance for your support!

cc @esdc-esac-esa-int

Copy link
Contributor

@keflavich keflavich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. The new approach seems simpler. It might be helpful to note in the documentation that the expected file extension has changed.

I'm not going to mark this as "Approve", though, as I think someone with more direct experience should review if possible.

@jespinosaar
Copy link
Contributor Author

Many thanks @keflavich . That is what we wanted, more simple and intuitive ways of accessing the data. Also focused on how the users will act to retrieve the data: getting the desired metadata, getting the list of files from that data and then download it.

@bsipocz bsipocz added this to the v0.4.7 milestone Aug 4, 2023
Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few minor comments, otherwise this is almost ready to go. Thanks!

(ignore the failing devtest job, we're dealing with it separately and it won't hold up this PR)

astroquery/esa/hubble/core.py Show resolved Hide resolved
@@ -107,6 +108,15 @@ def download_product(self, observation_id, *, calibration_level=None,
filename = self._get_product_filename(product_type, filename)
self._tap.load_data(params_dict=params, output_file=filename, verbose=verbose)

return self.check_rename_to_gz(filename=filename)

def check_rename_to_gz(self, filename):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't have to be a class method, right? Maybe dump it as a "private" function outside of the class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

CHANGES.rst Outdated
Comment on lines 34 to 35
- New methods to download single files and download FITS associated to an observation. [#2797]
- New function to retrieve all the files associated to an observation. [#2797]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it useful to explicitly mention the method/function names. If you do, put them in double backticks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, included.


def download_file(self, file, *, filename=None, verbose=False):
"""
Download a file from EHST based on its filename. Similar to get_a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ending of this sentence is missing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed and updated, the sentence is complete now.

CHANGES.rst Outdated
@@ -64,6 +66,7 @@ esa.hubble
a lot faster. [#2524]
- Method query_hst_tap has been deprecated and is replaced with query_tap, with the same arguments. [#2597]
- Product types in download_product method have been modified to: PRODUCT, SCIENCE_PRODUCT or POSTCARD. [#2597]
- query_criteria method now allows user to filter by Proposal ID. [#2797]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioning the actual keyword is useful, as well as several other methods got this added.

Suggested change
- query_criteria method now allows user to filter by Proposal ID. [#2797]
- Added `proposal` keyword argument to several methods now allows to filter by Proposal ID. [#2797]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I included your suggestion, thanks!

@@ -59,18 +59,24 @@ def test_query_tap_async(self):
def test_download_product(self):
result = esa_hubble.query_tap(query=self.hst_query)
observation_id = np.random.choice((result['observation_id']))
temp_file = self.temp_folder.name + "/" + observation_id + ".tar"
temp_file = self.temp_folder.name + "/" + observation_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid this won't work with windows (we still have a bit of a cleanup backlog for that, including some in the esa modules #2773). So while we're at it, could you swap all these string manipulations with an os.path.join instead, here and below, too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified in different places so it works on every environment.

In eHST is it possible to download products based on their observation ID (mandatory) and
a required calibration_level (RAW, CALIBRATED, PRODUCT or AUXILIARY) and/or product type (SCIENCE, PREVIEW, THUMBNAIL or AUXILIARY).

Deprecation Warning: product types PRODUCT, SCIENCE_PRODUCT or POSTCARD are no longer supported. Please modify your scripts accordingly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a note or some other directive around this to highlight.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes and warnings included.

>>> from astroquery.esa.hubble import ESAHubble
>>> esahubble = ESAHubble()
>>> esahubble.download_product(observation_id="j6fl25s4q", calibration_level="RAW",
... filename="raw_data_for_j6fl25s4q") # doctest: +IGNORE_OUTPUT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although we're ignoring the outputs (but only do that if frequent changes are expected, e.g. more observations are coming in, the order of the result is not deterministic, etc), please include a few line long of the expected outcome. E.g. in this case, as I see it will be only the string raw_data_for_j6fl25s4q.zip

OTOH, all these download examples are actually downloading the files, so I wonder whether cutting back on the actual code examples would make sense (after all your narrative text does the explanation quite well)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the expected output. Thanks!

@jespinosaar
Copy link
Contributor Author

Thanks for your feedback @bsipocz. I just updated the branch with your comments.

Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor namespace fix that I commit directly, otherwise it's good to go. Thanks @jespinosaar.

However, with the new added methods I did a quick check on the current API, and saw that they may benefit from naming them slightly differently to add some consistency. I would suggest to consider these comments, and address it in a follow-up?

astroquery/esa/hubble/core.py Outdated Show resolved Hide resolved
params = {"RETRIEVAL_TYPE": "PRODUCT", "ARTIFACTID": artifact_id, "TAPCLIENT": "ASTROQUERY"}
return self.download_file(file=artifact_id, filename=filename, verbose=verbose)

def get_associated_files(self, observation_id, *, verbose=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had another go through this looking into some consistency with the currently present API.

Practically this has the purpose as get_obs_products for jwst, right? https://github.com/astropy/astroquery/blob/main/astroquery/esa/jwst/core.py#L947
While the return differs a bit, how do you feel about calling them the same, and in a follow-up PR making them do the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the same, in this case we are only listing the file names, in case users want to filter them before download anything. get_obs_products downloads them.

f"art.observation_id = '{observation_id}'")
return self.query_tap(query=query)

def download_fits_files(self, observation_id, *, verbose=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same API considerations, maybe this could be called download_files, and having a kwarg for extensions defaulting to 'fits'? This is really only a discussion item as download_files is not really wide-spread, and the currently existing ones in alma and casda are having different signature

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wanted a specific method naming FITS, to make it more visible to users.

@codecov
Copy link

codecov bot commented Aug 8, 2023

Codecov Report

Merging #2797 (f1a833d) into main (ba39fb2) will increase coverage by 0.04%.
Report is 17 commits behind head on main.
The diff coverage is 93.87%.

@@            Coverage Diff             @@
##             main    #2797      +/-   ##
==========================================
+ Coverage   66.12%   66.17%   +0.04%     
==========================================
  Files         235      235              
  Lines       18080    18109      +29     
==========================================
+ Hits        11955    11983      +28     
- Misses       6125     6126       +1     
Files Changed Coverage Δ
astroquery/jplhorizons/__init__.py 100.00% <ø> (ø)
astroquery/jplhorizons/core.py 65.49% <ø> (ø)
astroquery/esa/hubble/core.py 84.56% <92.30%> (+0.86%) ⬆️
astroquery/cds/__init__.py 50.00% <100.00%> (+7.14%) ⬆️
astroquery/exoplanet_orbit_database/__init__.py 100.00% <100.00%> (ø)
astroquery/ibe/__init__.py 100.00% <100.00%> (ø)
astroquery/irsa/__init__.py 100.00% <100.00%> (ø)
astroquery/irsa_dust/__init__.py 100.00% <100.00%> (ø)
astroquery/nasa_exoplanet_archive/__init__.py 100.00% <100.00%> (ø)
astroquery/ned/__init__.py 100.00% <100.00%> (ø)
... and 1 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@bsipocz bsipocz merged commit 3f16e62 into astropy:main Aug 8, 2023
@bsipocz
Copy link
Member

bsipocz commented Aug 8, 2023

Thanks @jespinosaar! About the method names I opened a reminder issue as it doesn't need to hold up the merge of this one.

@jespinosaar
Copy link
Contributor Author

Many thanks @bsipocz and @keflavich! Yes, we can fix the method names in other PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants