Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update http (not s) refs to IETF URI to use https #739

Merged
merged 6 commits into from
Aug 23, 2021

Conversation

machawk1
Copy link
Member

There are a few references throughout the codebase/tests to http://tools.ietf.org/html/rfc7089, which results in a 404 when dereferenced. This PR uses the same URI but with https as the scheme, which resolves.

Perhaps trivial but also critical to provide the context where it is used.

@machawk1
Copy link
Member Author

I noticed a strange issue here that is causing the tests to fail. In the GitHub Action, the sample CDXJ file are pulled from the master branch. For these tests to be representative, especially if modified or new ones added, they ought to use the samples in the branch from which the PR is contributing.

This PR (#739) is a good opportunity to fix this issue in testing.

@machawk1
Copy link
Member Author

Based on https://raw.githubusercontent.com/oduwsdl/ipwb/master/.github/workflows/test.yml , can you suggest how we go about using the branch's data as the basis for the tests? Is it even a good idea to do this, @ibnesayeed?

@machawk1
Copy link
Member Author

Something like:

- uses: actions/checkout@v2
  with:
    ref: ${{ github.event.pull_request.head.sha }}

...might work but would only be applicable to PRs. Ideally, changes should be made to master directly anyway, so perhaps this will help to enforce that.

@machawk1
Copy link
Member Author

machawk1 commented Aug 21, 2021

Taking a step back, this is not an issue with the GitHub Action but rather, the absolute URI is embedded in the test code, e.g.,

def test_https():
assert get_web_archive_index(
'https://raw.githubusercontent.com/oduwsdl/ipwb/master/samples/' +
'indexes/salam-home.cdxj'
).startswith('!context ["http://tools.ietf.org/html/rfc7089"]')

...and this is probably fine here, as it is testing the ability to pull in a remote index and its validity in the first line.

machawk1 added a commit that referenced this pull request Aug 21, 2021
in remotely in #739.

Because the test expects the value to already be present in master,
it is invalid. This should help #739 pass.
@ibnesayeed
Copy link
Member

There are a few references throughout the codebase/tests to http://tools.ietf.org/html/rfc7089, which results in a 404 when dereferenced. This PR uses the same URI but with https as the scheme, which resolves.

This was likely an transient issue. I did check earlier that curling HTTP URI returned 404 response while accessing the same from the browser performed redirects. Now, the redirection is working from curl as well.

@machawk1
Copy link
Member Author

@ibnesayeed I encountered it intermittently as well but during this time, the HTTPS URI consistently worked. Any issue w/ merging this?

Comment on lines 29 to 30
with:
ref: ${{ github.event.pull_request.head.sha }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this should be here because this workflow is executed on push as well, for which I do not know what would be the PR head.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed! This was added during the discovery phase before I realized that the absolute reference to the URI using the master branch was embedded in the test case. I will remove this change to test.yml.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the change to test.yml in f50d06f (below)

@ibnesayeed
Copy link
Member

ibnesayeed commented Aug 23, 2021

@ibnesayeed I encountered it intermittently as well but during this time, the HTTPS URI consistently worked. Any issue w/ merging this?

I am okay with URIs changed to HTTPS, but not sure about other changes.

@ibnesayeed
Copy link
Member

Based on https://raw.githubusercontent.com/oduwsdl/ipwb/master/.github/workflows/test.yml , can you suggest how we go about using the branch's data as the basis for the tests? Is it even a good idea to do this, @ibnesayeed?

The documentation suggests that it should be the default behavior:

    # The branch, tag or SHA to checkout. When checking out the repository that
    # triggered a workflow, this defaults to the reference or SHA for that event.
    # Otherwise, uses the default branch.
    ref: ''

@machawk1
Copy link
Member Author

The addition sanity check of echo'ing the branch being used with the removal of explicitly changing branches in f50d06f will validate the default behavior.

Comment on lines 29 to 30
- name: Echo branch being used
run: git branch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this step?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the aforementioned sanity check. Not necessary but useful in the currently running GitHub action to validate the "default behavior". If the default behavior is, in fact, to use the PR commit instead of the main branch, I will remove it.

@ibnesayeed
Copy link
Member

The addition sanity check of echo'ing the branch being used with the removal of explicitly changing branches in f50d06f will validate the default behavior.

I am not sure if we need to verify a well-tested and well-used aspect of external code.

@machawk1
Copy link
Member Author

I have removed the sanity check in afa7405, @ibnesayeed. Any other suggestions before we merge this?

Copy link
Member

@ibnesayeed ibnesayeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@machawk1 machawk1 merged commit fd45f93 into master Aug 23, 2021
@machawk1 machawk1 deleted the use-secure-ietf-url branch August 23, 2021 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants