0.8 upgrade to 1.0 failed - no node process (1 out of 20 nodes) #2688
Comments
This mkdir seems to be the last failure. Perhaps use mkdir -p instead? |
No problem with adding -p. |
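For anyone following along, here is a minimal sketch of why -p helps. It is in Python rather than the shell of the upgrade script, and the directory names are hypothetical, not the path that actually failed. A plain mkdir fails both when a parent is missing and when the directory is left over from an earlier run, whereas the -p behaviour (os.makedirs with exist_ok=True) tolerates both cases.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path and any missing parents; do nothing if it already exists (like mkdir -p)."""
    os.makedirs(path, exist_ok=True)
    return path


def demo():
    base = tempfile.mkdtemp()
    # Hypothetical nested directory, standing in for whatever the upgrade creates.
    target = os.path.join(base, "upgrade_demo", "logs")

    # Plain mkdir fails because the parent directory does not exist yet.
    try:
        os.mkdir(target)
        first = "ok"
    except FileNotFoundError:
        first = "missing parent"

    ensure_dir(target)  # creates the parent and the target

    # Plain mkdir fails again, this time because the directory is already there,
    # which is the situation a repeated upgrade would hit.
    try:
        os.mkdir(target)
        second = "ok"
    except FileExistsError:
        second = "already exists"

    ensure_dir(target)  # second call is a no-op, no error
    return first, second, os.path.isdir(target)
```

This only shows that -p makes the step safe to repeat. It does not explain why a single node out of 20 hit the failure.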
According to the log, it happened in the previous upgrade as well. Why didn't we have this problem on all other nodes? I think it requires a bit deeper digging. |
I can take a second look at the logs |
please do so. thanks |
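@tamireran
I see a gap in the agent logs (usr/local/noobaa/log) between:
Nov-20 19:54:12.382 [Agent/2019] [L0] core.agent.agent_cli:: memory usage ...
Feb-7 17:19:18.622 [Agent/1323] [L0] core.rpc.rpc_n2n_agent:: N2N AGENT accept_signal ...
and then
Feb-8 18:10:12.268 [Agent/1323] [L0] core.agent.agent:: agent-147824126724-for- ...
Feb-9 19:00:18.456 [Agent/1323] [L0] core.rpc.ice:: ICE ACCEPT REMOTE INFO ...
Was the agent down during these periods? Also, for some reason the forever service still prints during the installation... continuing to check. |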
It looks like forever was still running the noobaa local service. |
Probably, as we had a problem in the previous version as well. A restart
solved the problem for the previous version. The problem was that the agent
didn't try to reconnect after the upgrade.
On Feb 15, 2017 3:23 AM, "Nimrod Becker" wrote:
@tamireran <https://github.com/tamireran>
I see a gap in the agent logs (usr/local/noobaa/log) between:
Nov-20 19:54:12.382 [Agent/2019] [L0] core.agent.agent_cli:: memory usage ...
Feb-7 17:19:18.622 [Agent/1323] [L0] core.rpc.rpc_n2n_agent:: N2N AGENT accept_signal ...
and then
Feb-8 18:10:12.268 [Agent/1323] [L0] core.agent.agent:: agent-147824126724-for- ...
Feb-9 19:00:18.456 [Agent/1323] [L0] core.rpc.ice:: ICE ACCEPT REMOTE INFO ...
Was the agent down for these periods of time ?
|
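A rough sketch of how gaps like the ones quoted above could be spotted automatically. It assumes the timestamp prefix seen in the quoted lines (Mon-D HH:MM:SS.mmm), English month abbreviations, and chronological order. Since the log lines carry no year, the starting year is a caller-supplied assumption, and the year is bumped whenever the month goes backwards. The function names and the 12 hour threshold are illustrative only.

```python
import re
from datetime import datetime, timedelta

MONTHS = {name: i for i, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Matches the prefix seen in the quoted agent log lines, e.g. "Feb-7 17:19:18.622".
STAMP = re.compile(r"^([A-Z][a-z]{2})-(\d{1,2}) (\d{2}):(\d{2}):(\d{2})\.(\d{3})")


def parse_stamps(lines, start_year):
    """Return datetimes for lines with a recognised prefix, inferring year rollover."""
    year = start_year
    prev_month = None
    stamps = []
    for line in lines:
        m = STAMP.match(line)
        if not m or m.group(1) not in MONTHS:
            continue  # skip lines without a timestamp prefix
        month = MONTHS[m.group(1)]
        if prev_month is not None and month < prev_month:
            year += 1  # e.g. Nov followed by Feb means a new year started
        prev_month = month
        day, hh, mm, ss, ms = (int(g) for g in m.groups()[1:])
        stamps.append(datetime(year, month, day, hh, mm, ss, ms * 1000))
    return stamps


def find_gaps(stamps, threshold):
    """Return (before, after) pairs of consecutive stamps further apart than threshold."""
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > threshold]


# The four lines quoted in this thread; the start year 2016 is an assumption.
SAMPLE = [
    "Nov-20 19:54:12.382 [Agent/2019] [L0] core.agent.agent_cli:: memory usage ...",
    "Feb-7 17:19:18.622 [Agent/1323] [L0] core.rpc.rpc_n2n_agent:: N2N AGENT accept_signal ...",
    "Feb-8 18:10:12.268 [Agent/1323] [L0] core.agent.agent:: agent-147824126724-for- ...",
    "Feb-9 19:00:18.456 [Agent/1323] [L0] core.rpc.ice:: ICE ACCEPT REMOTE INFO ...",
]

gaps = find_gaps(parse_stamps(SAMPLE, 2016), timedelta(hours=12))
```

Run against the full log file rather than these four excerpts, this would show whether the agent was silent for the whole period or only between the lines picked out here.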
@NimrodGeva we need to make sure that the upgrade cleans up this issue if it occurred during previous upgrades. |
We tried upgrading from 0.5.3, 0.5.0 and 0.8, all with agents, straight to 1.1, and the problem did not reproduce. Situations where an agent is unresponsive to upgrades should not happen in the field, but they are always possible and unfortunately can sometimes only be resolved with a hotfix or workaround. |
Environment info
Actual behavior
No node process.
Expected behavior
Steps to reproduce
ssh notadmin@13.92.235.117
You know the password
http://noobaaonline.eastus.cloudapp.azure.com:8080/
Username: cleverball@noobaa.com
Password: XlkJvFXz
Screenshots or Logs or other output that would be helpful
(If large, please upload as attachment)
/var/log/setup.out