
feat: Add webHDFS support. Fixes #7540 #8443

Closed

Conversation

@alexdittmann alexdittmann commented Apr 21, 2022

Signed-off-by: Alexander Dittmann <alexander.dittmann@sap.com>

Fixes #7540

This PR adds support for webHDFS input/output artifacts, with two supported ways of authentication.

Note:
I initially created a (private) fork of this repo, where I successfully tested these changes against both an Azure and an SAP HANA Data Lake store. After creating this official fork from the latest master, however, submitting new workflows always resulted in a StartError of the workflow pod. Here is the events log:

Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  21s   default-scheduler  Successfully assigned argo/awesome-poochenheimer to k3d-k3s-default-server-0
  Normal   Pulled     22s   kubelet            Container image "quay.io/argoproj/argoexec:latest" already present on machine
  Normal   Created    22s   kubelet            Created container init
  Normal   Started    22s   kubelet            Started container init
  Normal   Pulled     22s   kubelet            Container image "quay.io/argoproj/argoexec:latest" already present on machine
  Normal   Created    21s   kubelet            Created container wait
  Normal   Started    21s   kubelet            Started container wait
  Normal   Pulled     21s   kubelet            Container image "argoproj/argosay:v2" already present on machine
  Normal   Created    21s   kubelet            Created container main
  Warning  Failed     21s   kubelet            Error: failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/var/run/argo/argoexec": stat /var/run/argo/argoexec: no such file or directory: unknown

I'm using k3d, by the way. Can this error somehow be mitigated? (I have already run make argo-exec.)

I'd also appreciate any feedback!

Thanks
Alex

alexec commented Apr 21, 2022

I have a question: how is this different from using an HTTP artifact? Could we extend the HTTP artifact somehow to support TLS client certs + OAuth2? That would give HTTP users many more features, but also work for WebHDFS.

@alexdittmann (Contributor, Author)
> I have a question. How is this different to using a HTTP artifact? Could we extend HTTP artifact somehow to support TLS client cert + oauth2? That would give HTTP users lots more features, but also work for WebHDFS.

When I initially started working on this, the HTTP artifact only supported basic input artifacts. Since webhdfs is a strict protocol (e.g. it clearly defines how data is pushed), I thought it would be easier to handle it separately. However, looking at the recent changes to the HTTP artifact, I see that it would now fit in there quite well. The only thing that would still need to be taken care of is the redirect logic.

alexec commented Apr 22, 2022

I thought that might be the case. HTTP artifacts only changed this week to become amenable to this, so your timing is perfect.

We should probably add an example of WebHDFS to the docs.

@alexdittmann (Contributor, Author)
Closing this PR in favour of a new one: #8468

@alexec alexec mentioned this pull request May 5, 2022