Esa ehst download improvements #2797

jespinosaar · 2023-08-04T15:52:16Z

Dear Astroquery Team,

This PR aims to offer a more intuitive way for users to download products from eHST Science Archive. I have left get_artifact method to ensure backwards compatibility, but I have moved its functionality to download_file method. In addition to this, we have added a mechanism to retrieve the files associated to an observation and download only FITS files. Finally, the criteria methods have been updated so now users can search by proposal.

Let's iterate in this new pull request. Please let me know your thoughts and comments. Thanks in advance for your support!

cc @esdc-esac-esa-int

keflavich

Looks good to me. The new approach seems simpler. It might be helpful to note in the documentation that the expected file extension has changed.

I'm not going to mark this as "Approve", though, as I think someone with more direct experience should review if possible.

jespinosaar · 2023-08-04T16:10:39Z

Many thanks @keflavich . That is what we wanted, more simple and intuitive ways of accessing the data. Also focused on how the users will act to retrieve the data: getting the desired metadata, getting the list of files from that data and then download it.

bsipocz

I have a few minor comments, otherwise this is almost ready to go. Thanks!

(ignore the failing devtest job, we're dealing with it separately and it won't hold up this PR)

astroquery/esa/hubble/core.py

bsipocz · 2023-08-04T18:52:24Z

astroquery/esa/hubble/core.py

@@ -107,6 +108,15 @@ def download_product(self, observation_id, *, calibration_level=None,
        filename = self._get_product_filename(product_type, filename)
        self._tap.load_data(params_dict=params, output_file=filename, verbose=verbose)

+        return self.check_rename_to_gz(filename=filename)
+
+    def check_rename_to_gz(self, filename):


this doesn't have to be a class method, right? Maybe dump it as a "private" function outside of the class.

bsipocz · 2023-08-04T18:53:16Z

CHANGES.rst

+- New methods to download single files and download FITS associated to an observation. [#2797]
+- New function to retrieve all the files associated to an observation. [#2797]


I find it useful to explicitly mention the method/function names. If you do, put them in double backticks.

Thanks for the suggestion, included.

bsipocz · 2023-08-07T20:54:13Z

astroquery/esa/hubble/core.py

+
+    def download_file(self, file, *, filename=None, verbose=False):
+        """
+        Download a file from EHST based on its filename. Similar to get_a


The ending of this sentence is missing

Reviewed and updated, the sentence is complete now.

bsipocz · 2023-08-07T20:57:38Z

CHANGES.rst

@@ -64,6 +66,7 @@ esa.hubble
  a lot faster. [#2524]
 - Method query_hst_tap has been deprecated and is replaced with query_tap, with the same arguments. [#2597]
 - Product types in download_product method have been modified to: PRODUCT, SCIENCE_PRODUCT or POSTCARD. [#2597]
+- query_criteria method now allows user to filter by Proposal ID. [#2797]


Mentioning the actual keyword is useful, as well as several other methods got this added.

Suggested change

- query_criteria method now allows user to filter by Proposal ID. [#2797]

- Added `proposal` keyword argument to several methods now allows to filter by Proposal ID. [#2797]

I included your suggestion, thanks!

bsipocz · 2023-08-07T21:13:41Z

astroquery/esa/hubble/tests/test_esa_hubble_remote.py

@@ -59,18 +59,24 @@ def test_query_tap_async(self):
    def test_download_product(self):
        result = esa_hubble.query_tap(query=self.hst_query)
        observation_id = np.random.choice((result['observation_id']))
-        temp_file = self.temp_folder.name + "/" + observation_id + ".tar"
+        temp_file = self.temp_folder.name + "/" + observation_id


I'm afraid this won't work with windows (we still have a bit of a cleanup backlog for that, including some in the esa modules #2773). So while we're at it, could you swap all these string manipulations with an os.path.join instead, here and below, too?

Modified in different places so it works on every environment.

bsipocz · 2023-08-07T21:18:16Z

docs/esa/hubble/hubble.rst

+In eHST is it possible to download products based on their observation ID (mandatory) and
+a required calibration_level (RAW, CALIBRATED, PRODUCT or AUXILIARY) and/or product type (SCIENCE, PREVIEW, THUMBNAIL or AUXILIARY).
+
+Deprecation Warning: product types PRODUCT, SCIENCE_PRODUCT or POSTCARD are no longer supported. Please modify your scripts accordingly.


Add a note or some other directive around this to highlight.

Notes and warnings included.

bsipocz · 2023-08-07T21:20:19Z

docs/esa/hubble/hubble.rst

+  >>> from astroquery.esa.hubble import ESAHubble
+  >>> esahubble = ESAHubble()
+  >>> esahubble.download_product(observation_id="j6fl25s4q", calibration_level="RAW",
+  ...                            filename="raw_data_for_j6fl25s4q")  # doctest: +IGNORE_OUTPUT


Although we're ignoring the outputs (but only do that if frequent changes are expected, e.g. more observations are coming in, the order of the result is not deterministic, etc), please include a few line long of the expected outcome. E.g. in this case, as I see it will be only the string raw_data_for_j6fl25s4q.zip

OTOH, all these download examples are actually downloading the files, so I wonder whether cutting back on the actual code examples would make sense (after all your narrative text does the explanation quite well)

I added the expected output. Thanks!

jespinosaar · 2023-08-08T13:43:06Z

Thanks for your feedback @bsipocz. I just updated the branch with your comments.

bsipocz

One minor namespace fix that I commit directly, otherwise it's good to go. Thanks @jespinosaar.

However, with the new added methods I did a quick check on the current API, and saw that they may benefit from naming them slightly differently to add some consistency. I would suggest to consider these comments, and address it in a follow-up?

astroquery/esa/hubble/core.py

bsipocz · 2023-08-08T17:57:41Z

astroquery/esa/hubble/core.py

-        params = {"RETRIEVAL_TYPE": "PRODUCT", "ARTIFACTID": artifact_id, "TAPCLIENT": "ASTROQUERY"}
+        return self.download_file(file=artifact_id, filename=filename, verbose=verbose)
+
+    def get_associated_files(self, observation_id, *, verbose=False):


I had another go through this looking into some consistency with the currently present API.

Practically this has the purpose as get_obs_products for jwst, right? https://github.com/astropy/astroquery/blob/main/astroquery/esa/jwst/core.py#L947
While the return differs a bit, how do you feel about calling them the same, and in a follow-up PR making them do the same?

Not the same, in this case we are only listing the file names, in case users want to filter them before download anything. get_obs_products downloads them.

bsipocz · 2023-08-08T18:01:50Z

astroquery/esa/hubble/core.py

+                 f"art.observation_id = '{observation_id}'")
+        return self.query_tap(query=query)
+
+    def download_fits_files(self, observation_id, *, verbose=False):


Same API considerations, maybe this could be called download_files, and having a kwarg for extensions defaulting to 'fits'? This is really only a discussion item as download_files is not really wide-spread, and the currently existing ones in alma and casda are having different signature

We wanted a specific method naming FITS, to make it more visible to users.

codecov · 2023-08-08T18:15:12Z

Codecov Report

Merging #2797 (f1a833d) into main (ba39fb2) will increase coverage by 0.04%.
Report is 17 commits behind head on main.
The diff coverage is 93.87%.

@@            Coverage Diff             @@
##             main    #2797      +/-   ##
==========================================
+ Coverage   66.12%   66.17%   +0.04%     
==========================================
  Files         235      235              
  Lines       18080    18109      +29     
==========================================
+ Hits        11955    11983      +28     
- Misses       6125     6126       +1

Files Changed	Coverage Δ
astroquery/jplhorizons/__init__.py	`100.00% <ø> (ø)`
astroquery/jplhorizons/core.py	`65.49% <ø> (ø)`
astroquery/esa/hubble/core.py	`84.56% <92.30%> (+0.86%)`	⬆️
astroquery/cds/__init__.py	`50.00% <100.00%> (+7.14%)`	⬆️
astroquery/exoplanet_orbit_database/__init__.py	`100.00% <100.00%> (ø)`
astroquery/ibe/__init__.py	`100.00% <100.00%> (ø)`
astroquery/irsa/__init__.py	`100.00% <100.00%> (ø)`
astroquery/irsa_dust/__init__.py	`100.00% <100.00%> (ø)`
astroquery/nasa_exoplanet_archive/__init__.py	`100.00% <100.00%> (ø)`
astroquery/ned/__init__.py	`100.00% <100.00%> (ø)`
... and 1 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

bsipocz · 2023-08-08T18:28:44Z

Thanks @jespinosaar! About the method names I opened a reminder issue as it doesn't need to hold up the merge of this one.

jespinosaar · 2023-08-10T11:12:42Z

Many thanks @bsipocz and @keflavich! Yes, we can fix the method names in other PR.

jespinosaar added 3 commits August 4, 2023 16:09

EHSTPCR-1113: new methods to download files

eb54a0d

EHSTPCR-1113: search by proposal added, docs updated

4f28542

EHSTPCR-1113: code style fixed

9e482d5

keflavich reviewed Aug 4, 2023

View reviewed changes

jespinosaar added 3 commits August 4, 2023 17:59

EHSTPCR-1113: changes.rst file modified

564e4fd

EHSTPCR-1113: style fixes in test_esa_hubble_remote.py

5a3ac64

EHSTPCR-1113: warning for change of format

bd45846

bsipocz added the esa.esa_hubble label Aug 4, 2023

bsipocz added this to the v0.4.7 milestone Aug 4, 2023

EHSTPCR-1113: gzip test fixed

1bc13cd

bsipocz reviewed Aug 7, 2023

View reviewed changes

jespinosaar added 4 commits August 8, 2023 10:54

EHSTPCR-1113: review checks included

68407bd

EHSTPCR-1113: code style fixes

fb28c32

EHSTPCR-1113: changes.rst issue fixed

888d5ca

EHSTPCR-1113: check_gz private

6387e3e

bsipocz approved these changes Aug 8, 2023

View reviewed changes

MAINT: don't expose private functions in namespace

f1a833d

bsipocz mentioned this pull request Aug 8, 2023

MAINT: consolidate API #2803

Open

bsipocz merged commit 3f16e62 into astropy:main Aug 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Esa ehst download improvements #2797

Esa ehst download improvements #2797

jespinosaar commented Aug 4, 2023

keflavich left a comment

jespinosaar commented Aug 4, 2023

bsipocz left a comment

bsipocz Aug 4, 2023

jespinosaar Aug 8, 2023

bsipocz Aug 4, 2023

jespinosaar Aug 8, 2023

bsipocz Aug 7, 2023

jespinosaar Aug 8, 2023

bsipocz Aug 7, 2023

jespinosaar Aug 8, 2023

bsipocz Aug 7, 2023

jespinosaar Aug 8, 2023

bsipocz Aug 7, 2023

jespinosaar Aug 8, 2023

bsipocz Aug 7, 2023

jespinosaar Aug 8, 2023

jespinosaar commented Aug 8, 2023

bsipocz left a comment

bsipocz Aug 8, 2023

jespinosaar Aug 10, 2023

bsipocz Aug 8, 2023

jespinosaar Aug 10, 2023

codecov bot commented Aug 8, 2023

bsipocz commented Aug 8, 2023

jespinosaar commented Aug 10, 2023

		- New methods to download single files and download FITS associated to an observation. [#2797]
		- New function to retrieve all the files associated to an observation. [#2797]

	- query_criteria method now allows user to filter by Proposal ID. [#2797]
	- Added `proposal` keyword argument to several methods now allows to filter by Proposal ID. [#2797]

Esa ehst download improvements #2797

Esa ehst download improvements #2797

Conversation

jespinosaar commented Aug 4, 2023

keflavich left a comment

Choose a reason for hiding this comment

jespinosaar commented Aug 4, 2023

bsipocz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jespinosaar commented Aug 8, 2023

bsipocz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Aug 8, 2023

Codecov Report

bsipocz commented Aug 8, 2023

jespinosaar commented Aug 10, 2023