Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add WebHDFS support for HTTP artifacts. Fixes #7540 #8468

Merged

Conversation

alexdittmann
Copy link
Contributor

@alexdittmann alexdittmann commented Apr 25, 2022

Signed-off-by: Alexander Dittmann alexander.dittmann@sap.com

Fixes #7540

This PR adds support for webHDFS for HTTP input/output artifacts. This is accomplished by:

  • explicitly following temporary redirects (status code 307, see here)
  • adding supporting for two ways of authentication:
    • via certificates: the client authenticates via a tls certificate pair
    • via OAuth2: the client authenticates via an OAuth2 token

An example yaml can be found here: https://github.com/argoproj/argo-workflows/compare/master...alexdittmann:feat-webhdfs-support-for-http-art?expand=1#diff-731816132908abcbbf4286af6ac8c6d9241ef1d7de7f6f00d92a6416e8106e8f

@alexdittmann alexdittmann force-pushed the feat-webhdfs-support-for-http-art branch from e9b3e7d to fcd460d Compare April 25, 2022 17:18
Signed-off-by: Alexander Dittmann <alexander.dittmann@sap.com>
@alexdittmann alexdittmann force-pushed the feat-webhdfs-support-for-http-art branch from fcd460d to 1a7d50f Compare April 26, 2022 11:55
@alexdittmann
Copy link
Contributor Author

What is the procedure with the required checks before making the PR ready for review, is there some action required? Three of them passed, but the others are in the "Expected — Waiting for status to be reported".

And another thing regarding updates of the PR: Do I always have to make sure that there is only one single commit, or is this only required for the PR creation?
Thanks so far!

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor change, I think we probably want to have an E2E test for HTTP, can you check? if not, can you add a http_test.go, you might be able use the minio to test against.

pkg/apis/workflow/v1alpha1/workflow_types.go Outdated Show resolved Hide resolved
pkg/apis/workflow/v1alpha1/workflow_types.go Show resolved Hide resolved
pkg/apis/workflow/v1alpha1/workflow_types.go Outdated Show resolved Hide resolved
pkg/apis/workflow/v1alpha1/workflow_types.go Outdated Show resolved Hide resolved
workflow/artifacts/artifacts.go Outdated Show resolved Hide resolved
workflow/artifacts/artifacts.go Show resolved Hide resolved
workflow/artifacts/http/util.go Outdated Show resolved Hide resolved
workflow/artifacts/http/util_test.go Outdated Show resolved Hide resolved
Signed-off-by: Alexander Dittmann <alexander.dittmann@sap.com>
@alexdittmann
Copy link
Contributor Author

alexdittmann commented Apr 27, 2022

some minor change, I think we probably want to have an E2E test for HTTP, can you check? if not, can you add a http_test.go, you might be able use the minio to test against.

First of all, thanks for your feedback!
I agree, an e2e test would make sense, but I am not sure how to start. I was looking for an example and i only found something for s3 in the e2e/functional_test.go (e.g. here). Should I create something similar for http in a new http_test.go file or am i on the wrong track?

@alexec
Copy link
Contributor

alexec commented Apr 27, 2022

Yes. I would base it on ArtifactsSuite.

@alexec
Copy link
Contributor

alexec commented Apr 27, 2022

To write through tests, you'd need to start (I'm guessing) Hadoop. We do something similar for S3, by starting a single MinIO pod in the test system. MinIO is lightweight, so is Hadoop lightweight?

Signed-off-by: Alexander Dittmann <alexander.dittmann@sap.com>
@alexdittmann
Copy link
Contributor Author

alexdittmann commented Apr 28, 2022

Hi, thanks for the input. I would suggest using httpbin.org, this allows for testing the http artifacts also with the authentication types (only caveat is that this only works for the input artifacts, because there are only auth endpoints for GET).

Signed-off-by: Alexander Dittmann <alexander.dittmann@sap.com>
Signed-off-by: Alexander Dittmann <alexander.dittmann@sap.com>
@alexdittmann
Copy link
Contributor Author

Can we make this ready for review and enable the tests? Or is there still something missing?

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just more minor comments. Because the PR is draft, I've only provided comments so far. One you mark it ready for review I'll check it out and run it myself and give a thorough review.

test/e2e/functional_test.go Outdated Show resolved Hide resolved
test/e2e/functional_test.go Outdated Show resolved Hide resolved
test/e2e/functional_test.go Outdated Show resolved Hide resolved
Signed-off-by: Alexander Dittmann <alexander.dittmann@sap.com>
@alexdittmann alexdittmann marked this pull request as ready for review May 4, 2022 07:49
@alexec
Copy link
Contributor

alexec commented May 4, 2022

CI builds did not run. You also need to sync with master. Syncing should cause the builds to run.

Signed-off-by: Alexander Dittmann <alexander.dittmann@sap.com>
@alexec alexec changed the title feat: Add webHDFS support for HTTP artifacts. Fixes #7540 feat: Add WebHDFS support for HTTP artifacts. Fixes #7540 May 5, 2022
@alexec
Copy link
Contributor

alexec commented May 5, 2022

We should have a doc (in ./docs) to show how to use this. E..g with Hadoop. This could be in this PR, or in another PR/issue.

#8635

@alexec alexec self-assigned this May 13, 2022
@alexec alexec enabled auto-merge (squash) May 13, 2022 17:46
@alexec alexec merged commit 6b9dc26 into argoproj:master May 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support webhdfs artifacts
2 participants