-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature/discover-interface #61
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a big PR so this is just a first pass review, once we get through some of the code parts I'll make sure to run everything as part of my final review. For now, there's a few areas that I've commented on that need to be changed to be more robust / follow existing patterns / software best practices.
@@ -23,6 +23,8 @@ | |||
from obs_inv_utils.subprocess_cmd_handler import SubprocessCmd | |||
from obs_inv_utils import nceplibs_cmds as nc_cmds | |||
|
|||
import yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this import should be grouped above with the others, under uuid
|
||
work_dir = os.path.join(self.meta_config.work_dir, temp_uuid) | ||
# Because s3 bucket bufr files require download and new work_dir and discover files do not they need to be handled differently. | ||
if bucket == 'noaa-reanalyses-pds': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the appropriate way to do this. We should not be hard coding in the bucket, especially knowing we need to be able to support the private s3 buckets as well in the future.
If we have to hard code something, it should be the check for discover as that is a much more specific use case.
work_dir = os.path.join(self.meta_config.work_dir, temp_uuid) | ||
# Because s3 bucket bufr files require download and new work_dir and discover files do not they need to be handled differently. | ||
if bucket == 'noaa-reanalyses-pds': | ||
print(f'Running get_bufr_file_meta for NOAA S3 Clean Bucket (aws_s3) {self.s3_bucket}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not be referring to it as Clean Bucket, but NNJA. However, this should be much more generalized for any possible s3 bucket
os.remove( saved_filename ) | ||
shutil.rmtree( work_dir ) | ||
|
||
elif self.s3_bucket == None or self.s3_bucket == '': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be using a different way to check directly for Discover such as platform
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a note for myself to look into: how is HPSS called and would that not be prevented with this elif statement?
@@ -0,0 +1,11 @@ | |||
s3_bucket: noaa-reanalyses-pds | |||
s3_prefix: observations/reanalysis/gps/gpsro/%Y/%m/bufr/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the file name says amsua but it's pointing to gps data? need to make sure the file name of the file and the bufr files is consistent with the s3_prefix
python src/obs_inv_utils/obs_inv_cli.py get-obs-inventory -c src/tests/yaml_configs/test_aws_amsua_1.yaml ; | ||
python src/obs_inv_utils/obs_inv_cli.py get-obs-count-meta-sinv -c src/tests/yaml_configs/test_discover_amsua_2.yaml ; | ||
|
||
echo "test complete." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not actually a test but a chance to run some files, it won't actually check and confirm things are working. Since we are already using pytest for other testing, it would be appropriate to write this into a python file using pytest and assert statements for confirmation that the code is working appropriately
], | ||
) | ||
|
||
HpssTarballContents = namedtuple( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this actually getting used in this file or was it just there for reference making the discover ones? if just reference they can be removed now
This PR adds the changes to multiple scripts which allow the observation-inventory-utils to work on Discover (NASA). AWS and HPSS functionality are preserved despite these edits:
(1) Two new files have been added: obs_inv_utils_discover.csh, discover_interface.py.
(2) Changes were made to 5 scripts: nceplibs_bufr_cmd_handler.py, obs_inv_queries.py, obs_storage_platforms.py, search_engine.py
Files for testing the changes above have also been added:
(3) Four test yaml files have also been added: test_discover.sh, test_aws_amsua_1.yaml, test_aws_amsua_2.yaml, test_discover_amsua_1.yaml, test_discover_amsua_2.yaml.
(4) A .sh file, test_discover.sh, that runs the appropriate obs_inv_cli commands for all the new yamls for faster testing on discover.