switch jenkins runs to use roman-serverless for crds #946

Merged
merged 1 commit into spacetelescope:main from fix_crds_serverless_config
Jan 30, 2024

Conversation

braingram
Collaborator

@braingram braingram commented Oct 19, 2023

Jenkins runs show errors in the logs for the crds list step:
from: https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2Fromancal/detail/romancal/1092/pipeline

++ crds list --contexts roman_0051.pmap --mappings
++ grep pmap
CRDS - ERROR -  (FATAL) CRDS server connection and cache load FAILED.  Cannot continue.
 See https://hst-crds.stsci.edu/docs/cmdline_bestrefs/ or https://jwst-crds.stsci.edu/docs/cmdline_bestrefs/
 for more information on configuring CRDS,  particularly CRDS_PATH and CRDS_SERVER_URL. : [Errno 2] No such file or directory: '/grp/crds/roman/test/config/jwst/server_config'
+ echo 'CRDS_CONTEXT = '
CRDS_CONTEXT =

This appears to be due to crds defaulting to jwst when an observatory is not defined.

This PR updates the CRDS_SERVER_URL environment variable so that crds can infer that the observatory is roman.
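
To illustrate why the lookup fails, here is a minimal sketch of how the missing path in the log could arise. It assumes the cache layout is <cache root>/config/<observatory>/server_config, which is inferred only from the error message above. The helper name server_config_path is hypothetical and is not part of crds.

```python
import posixpath


def server_config_path(cache_root: str, observatory: str) -> str:
    """Hypothetical helper (not a crds function).

    Builds a path following the layout seen in the error message:
    <cache root>/config/<observatory>/server_config
    """
    return posixpath.join(cache_root, "config", observatory, "server_config")


# Assumption: the cache root is the leading part of the path in the log.
CACHE_ROOT = "/grp/crds/roman/test"

# With no observatory defined, crds falls back to jwst, so it looks
# under a jwst directory inside what appears to be a roman cache.
jwst_path = server_config_path(CACHE_ROOT, "jwst")

# Once the observatory is inferred as roman, the lookup targets the
# roman directory instead.
roman_path = server_config_path(CACHE_ROOT, "roman")

print(jwst_path)
print(roman_path)
```

The first path matches the one reported in the "No such file or directory" error.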

Regression tests running at:
https://plwishmaster.stsci.edu:8081/job/RT/job/Roman-Developers-Pull-Requests/432/

Checklist

  • added entry in CHANGES.rst under the corresponding subsection
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)
  • ran regression tests, post a link to the Jenkins job below. How to run regression tests on a PR

@braingram braingram force-pushed the fix_crds_serverless_config branch from 00d0b9e to 85b4f52 Compare October 19, 2023 20:34
@braingram braingram marked this pull request as ready for review October 19, 2023 20:43
@braingram braingram requested a review from a team as a code owner October 19, 2023 20:43
@codecov

codecov bot commented Oct 19, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (c3a7025) 76.74% compared to head (a9949c2) 76.74%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #946   +/-   ##
=======================================
  Coverage   76.74%   76.74%           
=======================================
  Files         105      105           
  Lines        7013     7013           
=======================================
  Hits         5382     5382           
  Misses       1631     1631           
Flag Coverage Δ *Carryforward flag
nightly 63.01% <ø> (ø) Carriedforward from c3a7025

*This pull request uses carry forward flags. Click here to find out more.


@braingram
Collaborator Author

All regression tests passed. The output of the crds list step is as follows:

++ crds list --contexts roman_0051.pmap --mappings
++ grep pmap
+ echo 'CRDS_CONTEXT = roman_0051.pmap'
CRDS_CONTEXT = roman_0051.pmap

showing no error and that the CRDS_CONTEXT remained set.
The run did end up failing (just like all the other recent runs) on Upload Artifacts with:
java.lang.RuntimeException: Failed uploading artifacts by spec

@braingram braingram requested a review from nden October 19, 2023 20:56
@ddavis-stsci
Collaborator

Where is https://roman-serverless defined?

@braingram
Collaborator Author

I'm not sure if the crds docs describe this (beyond what's below). Internally, crds calls get_default_observatory, which tries several sources in order:
https://github.com/spacetelescope/crds/blob/master/crds/client/api.py#L520

    """Based on the environment, cache, and server,  determine the default observatory.

    1. CRDS_OBSERVATORY env var
    2. CRDS_SERVER_URL env var
    3. Observatory(Server default context)
    4. jwst
    """

As CRDS_OBSERVATORY isn't set, crds attempts to parse the observatory from CRDS_SERVER_URL by checking whether any known observatory name appears in the url. Since roman is one of the known observatories, crds identifies the observatory as roman.

This is similar to what is done with jwst when CRDS_SERVER_URL=https://jwst-crds.stsci.edu. The same code used for this PR inspects the url, finds jwst and sets the observatory to jwst.
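
As a rough illustration of the fallback order in the docstring above, the sketch below re-implements the idea in plain Python. It is not the crds source: the function names, the KNOWN_OBSERVATORIES tuple, and the server_default argument (standing in for step 3, the observatory of the server default context) are assumptions made for this example.

```python
from typing import Mapping, Optional

# Assumption: an illustrative list of observatory names to look for.
KNOWN_OBSERVATORIES = ("hst", "jwst", "roman")


def observatory_from_url(url: Optional[str]) -> Optional[str]:
    """Return the first known observatory name found in the url, if any."""
    if not url:
        return None
    lowered = url.lower()
    for name in KNOWN_OBSERVATORIES:
        if name in lowered:
            return name
    return None


def default_observatory(env: Mapping[str, str],
                        server_default: Optional[str] = None) -> str:
    """Simplified stand-in for the four-step order in the docstring."""
    # 1. Explicit CRDS_OBSERVATORY setting wins.
    explicit = env.get("CRDS_OBSERVATORY")
    if explicit:
        return explicit.lower()
    # 2. Otherwise look for an observatory name inside CRDS_SERVER_URL.
    from_url = observatory_from_url(env.get("CRDS_SERVER_URL"))
    if from_url:
        return from_url
    # 3. Otherwise use whatever the server reports (passed in here).
    if server_default:
        return server_default
    # 4. Final fallback.
    return "jwst"


before = default_observatory({"CRDS_SERVER_URL": "https://serverless"})
after = default_observatory({"CRDS_SERVER_URL": "https://roman-serverless"})
print(before, after)
```

Under these assumptions, the old url falls through to the jwst fallback, while the new url resolves to roman at step 2.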

@ddavis-stsci
Collaborator

I'm seeing

export CRDS_SERVER_URL=https://roman-serverless
crds list --contexts $CRDS_CONTEXT --mappings --verbose | /usr/bin/grep pmap
CRDS - DEBUG -  Command: ['list.py', '--contexts', '--mappings', '--verbose']
CRDS - DEBUG -  Using CACHED CRDS reference assignment rules last updated on '2023-10-23 15:48:51.031570'
CRDS - DEBUG -  Using reference file selection rules 'roman-operational' defined by caller.

So I'm not sure where this is being cached and when/who will update these.

I'm beginning to think we should revert this to the earlier version without the "crds list" command for now. I'm not sure that setting the variable above and resetting it with "crds list" really makes sense.

Then we can fix this in GH actions?

@nden nden requested a review from stscieisenhamer October 30, 2023 13:42
Collaborator

@stscieisenhamer stscieisenhamer left a comment

As described, the code change is meaningless.

Reading through the comments and noting that the CI is passing, is there something still up? If so, it is not clear in the commentary what the issues still are.

@@ -53,7 +53,7 @@ def pip_install_args = "--index-url ${pip_index} --progress-bar=off"
 env_vars = [
     "TEST_BIGDATA=https://bytesalad.stsci.edu/artifactory",
     "CRDS_CONTEXT=roman_0051.pmap",
-    "CRDS_SERVER_URL=https://serverless",
+    "CRDS_SERVER_URL=https://roman-serverless",
Collaborator

This change, from the crds perspective, is a NOOP. The only condition for forcing serverless mode is having "serverless" in the url, as it already was pre-change.

Collaborator Author

@braingram braingram Oct 30, 2023

Thanks for taking a look. Yes, this is still an issue. Every run without this change silently fails the crds context lookup step with the following error (which @nden pointed out):

++ crds list --contexts roman_0051.pmap --mappings
++ grep pmap
CRDS - ERROR -  (FATAL) CRDS server connection and cache load FAILED.  Cannot continue.
 See https://hst-crds.stsci.edu/docs/cmdline_bestrefs/ or https://jwst-crds.stsci.edu/docs/cmdline_bestrefs/
 for more information on configuring CRDS,  particularly CRDS_PATH and CRDS_SERVER_URL. : [Errno 2] No such file or directory: '/grp/crds/roman/test/config/jwst/server_config'
+ echo 'CRDS_CONTEXT = '
CRDS_CONTEXT =

As @ddavis-stsci commented, it might be cleaner to just remove crds list entirely (I don't know why this was added in the first place).

I understand that both serverless and roman-serverless are equivalent for forcing serverless mode. However, this change does affect which observatory is inferred from the CRDS_SERVER_URL, as mentioned in the comment above and as seen in the jenkins runs.
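
A small sketch of the distinction, assuming both checks are plain substring tests on the url, as described in this thread. The two function names are hypothetical and do not come from crds.

```python
def forces_serverless(url: str) -> bool:
    """Hypothetical check: serverless mode only needs 'serverless' in the url."""
    return "serverless" in url


def names_roman(url: str) -> bool:
    """Hypothetical check: observatory inference looks for 'roman' in the url."""
    return "roman" in url


# Evaluate both checks for the url before and after this PR.
results = {
    url: (forces_serverless(url), names_roman(url))
    for url in ("https://serverless", "https://roman-serverless")
}

for url, (serverless, roman) in results.items():
    print(f"{url}: serverless={serverless}, roman={roman}")
```

Both urls satisfy the serverless check, but only the new one also carries the observatory name, which is why the change is not a NOOP for observatory inference.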

Collaborator

Nope, my mistake. I had presumed that the "serverless" indicator precluded any other server-based checking, but it does not. If the observatory is unknown, crds still attempts to pull the name from the CRDS_SERVER_URL. So, yes, this change will enforce the use of the roman side of crds.

So, this PR is approved.

However, curiosity question: At this point in the CI, how has crds been installed? Directly or presumed installed with romancal?

For jwst, I believe the crds list command was there just to log what the state of the crds system is during CI. Leaving it in or out is up to the user.

Collaborator Author

crds is installed just prior to the call to crds list. It's installed as a dependency of romancal:

"pip install -e .[test]",

@stscieisenhamer stscieisenhamer self-requested a review October 30, 2023 15:12
@braingram braingram force-pushed the fix_crds_serverless_config branch from 85b4f52 to 773c310 Compare October 30, 2023 15:28
@braingram braingram force-pushed the fix_crds_serverless_config branch from 773c310 to a9949c2 Compare January 30, 2024 19:03
@braingram
Collaborator Author

ddtrace ci failures are unrelated and caused by incompatibility between pytest 8 and ddtrace:
DataDog/dd-trace-py#8220

@nden
Collaborator

nden commented Jan 30, 2024

Jonathan's approval is good enough for me. I think we should merge.

@nden nden merged commit d81a992 into spacetelescope:main Jan 30, 2024
28 of 30 checks passed
@braingram braingram deleted the fix_crds_serverless_config branch January 30, 2024 20:25
ddavis-stsci pushed a commit to ddavis-stsci/romancal that referenced this pull request Jan 30, 2024
4 participants