Launch PEX on Kubernetes fails #2919
Comments
That log file is a couple of hours off the event you reported, and there are a few things that look suspicious to me (inc. |
The lapse in the log timestamps is because I was using computers in different time zones (deploying from my local computer into the cluster). I was watching the logs in real time and I can say those logs are related to the issue. Still, it's possible that I am missing other important logs from another system that are needed to debug this; if so, I can replicate the problem.

On the other hand, I'm looking for an alternative way of deploying the PEX. I was trying to use the S3 uploader, but I cannot understand how it works. I could open another issue for this, if you wish, but anyway...

First, I have no idea how to configure the CLI. I have tried multiple combinations of parameters and configuration files, but the only thing I have gotten to work is modifying the

Sorry if this is obvious, but I have seen no example anywhere of using the S3 uploader. And again, I don't understand how to use the Heron CLI with the

What I usually do with Kubernetes is:

heron config heron set service_url http://…/heron-apiserver:9000
heron submit heron "dist/xxxxxxxxxxx.pex" --verbose - […]

And this worked like a charm. But I have no idea which

Thank you very much! |
Ah, I was thrown off by the +0000, which made it look like it was all in UTC†, so I thought you had logs for another instance of the event. I think this is the relevant section of the log, which is littered with
I can't say I know how any of this works at the moment; I am just interested in the issue because I will soon be deploying PEXs to Kubernetes. Does this work with Java topologies? I think it is a config issue too, but I haven't got to that point myself, so I don't know the process or documentation any better. Hopefully someone else will know more and be able to give you more help, or I might encounter the same things and work it out, but that is probably a week or two away. If you do work it out, I'd be interested in hearing what the issue is.
† Out of curiosity, which time zones were the different applications running in? The +XXXX should be meaningful, so I think one or both aren't behaving as they should. |
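For anyone lining up client and server logs in a case like this, here is a minimal sketch of normalizing offset-bearing timestamps to UTC before comparing them. The timestamp layout and the sample values are assumptions for illustration, not taken from the attached log; adjust the format string to whatever your logs actually print.

```python
from datetime import datetime, timezone

# Assumed layout: "YYYY-MM-DD HH:MM:SS +ZZZZ" (hypothetical, adapt to your logs).
LOG_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def to_utc(stamp: str) -> datetime:
    """Parse a timestamp that carries a numeric UTC offset and convert it to UTC."""
    return datetime.strptime(stamp, LOG_FORMAT).astimezone(timezone.utc)


# Hypothetical sample values: a client two hours ahead of UTC and a server in UTC.
client_event = to_utc("2018-06-08 12:30:00 +0200")
server_event = to_utc("2018-06-08 10:30:05 +0000")

# Once both are in UTC, the real gap between the two events is easy to see.
gap_seconds = (server_event - client_event).total_seconds()
print(gap_seconds)
```

If the offset in a log is wrong (for example, local time printed with +0000), this comparison will show a gap of whole hours, which is a quick way to spot a misconfigured logger.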
While making and starting to test a heronpy PEX, I found that the documentation and example repos don't have valid imports. Were you able to execute your PEXs locally without getting import errors? I don't know if that could be a contributing factor. There is PR #2928 to get the documentation a bit more up to date. |
Yeah, it seems that the documentation is pretty outdated. I've just checked your PR, and it does indeed fix the problems with the documented imports that we faced at the beginning. My fear is that the Helm chart I'm using to install Heron in Kubernetes is also outdated. Feel free to ask for anything you think I could help with! |
Thanks! I'll probably have questions at some point. Kubernetes moves pretty fast, so I wouldn't be surprised if that is the case. That said, it looks like the chart was made with Helm 2.7.2 (vs. the latest, which is 2.9.1). I don't have practical experience with it yet, but I imagine I'll be there in 2-3 weeks, so I will have a look over it. It looks like the files are here and the chart is generated here. |
I have edited the issue because I tested again with a small PEX and it works (though I have a new problem after that). With a big PEX it still does not work. Also, I have reproduced the issue locally using Minikube. The problem that appeared after the small PEX was successfully submitted is that the topology does not launch because of an insufficient resources error in the launched pod: |
I made this to reproduce the issue: https://github.com/cristobalcl/heron-issue-2919 Also, I tested more times and I can reproduce the issue with a small PEX too... :( |
I found I had to up the
I also got the two issues:
It feels like the moral of the story is "have a beefy AF setup so you are less likely to encounter issues", time for a Dell Precision? |
I am hitting the exact same issue when using Minikube: it fails to upload. In the bookie server log, there are a lot of errors. There is an error in the worker container as well as the following error
I don't use Helm. I followed these steps: https://apache.github.io/incubator-heron/docs/operators/deployment/schedulers/kubernetes/ |
@Code0x58, @cristobalcl, any progress on this? It seems nobody got this fixed. |
I think it was something to do with the resources requested by Heron being quite large (1 CPU?), but there was a way to get the client to request lower resources. There was some talk about this around that time in the general Heron Slack channel, which may be helpful. I've since fallen off the Heron trail. |
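To make the resource point concrete, here is a rough sketch of the arithmetic behind an "insufficient resources" pod failure: summing per-container CPU requests and comparing them with what a node can allocate. The helper names and all the numbers are hypothetical (the thread only guesses at "1 CPU?"); the only fact relied on is the Kubernetes CPU quantity notation, where `1` means one core and `500m` means half a core.

```python
def cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "1", "0.5" or "500m" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def fits(requests, allocatable: str, already_requested: str = "0") -> bool:
    """Return True if the summed CPU requests fit into the node's allocatable CPU."""
    wanted = sum(cpu_millicores(r) for r in requests)
    return wanted + cpu_millicores(already_requested) <= cpu_millicores(allocatable)


# Hypothetical scenario: a small single-node cluster with 2 allocatable CPUs
# and a topology that needs three containers.
print(fits(["1", "1", "1"], allocatable="2"))           # three full-CPU requests
print(fits(["500m", "500m", "500m"], allocatable="2"))  # the same containers at half a CPU
```

This ignores memory and whatever the system pods already reserve, so treat it only as a first sanity check; `kubectl describe node` shows the real allocatable and requested figures.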
After I saw the timeout warning, I migrated to a more powerful box and the issue is gone. |
Recently, I get this error each time I try to launch a topology on Kubernetes:
This is running locally through a kubectl proxy, but I get the same error when I try from another pod in the same Kubernetes cluster pointing directly at the Heron API.
This is the log from the heron-apiserver instance in the Heron POD:
heron-apiserver.log
I deploy Heron in Kubernetes (AWS) using Helm:
Of course, the topologies work OK in my local Heron.
It fails both with big (50 MB) and small (1.5 MB) PEXs. [Edited: with small PEXs the submit works, but then another error appears.] In the past, I got it to work by retrying 10-20 times (in a Jenkins pipeline in the same cluster), but now it does not work anymore. Edited: I can reproduce the error locally using Minikube.
Edited: scripts and code to reproduce the bug: https://github.com/cristobalcl/heron-issue-2919
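The "retry 10-20 times" workaround mentioned above can be sketched as a small wrapper around the submit command. This is only an illustration of the retry loop, not a fix for the underlying upload failure; the function name, the defaults, and the linear back-off are my own assumptions, and the command you pass in is whatever `heron submit` invocation you already use.

```python
import subprocess
import time
from typing import Callable, Sequence


def submit_with_retry(
    cmd: Sequence[str],
    attempts: int = 20,
    delay: float = 1.0,
    run: Callable[[list], int] = subprocess.call,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run cmd until it exits with 0; return the 1-based attempt that succeeded.

    Raises RuntimeError if every attempt fails. `run` and `sleep` are
    injectable so the loop can be exercised without a real cluster.
    """
    for attempt in range(1, attempts + 1):
        if run(list(cmd)) == 0:
            return attempt
        if attempt < attempts:
            # Linear back-off between attempts (an arbitrary choice).
            sleep(delay * attempt)
    raise RuntimeError(f"command failed after {attempts} attempts: {list(cmd)}")


# Usage (not executed here): pass your own submit command, for example the one
# quoted in the thread, with the remaining arguments that are elided there.
# submit_with_retry(["heron", "submit", "heron", "dist/xxxxxxxxxxx.pex", "--verbose"])
```

Injecting `run` keeps the loop testable and makes it easy to log each failed attempt in a CI job.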