This repository has been archived by the owner on Aug 1, 2024. It is now read-only.

Random wakeups happen more than 10 times per month #67

Closed
scottyeager opened this issue Oct 17, 2023 · 5 comments

@scottyeager
Contributor

scottyeager commented Oct 17, 2023

Farmers are reporting that the farmerbot initiates random wakeups for their nodes more than 10 times per month. Examples below.

Node 4494

Colossus01.log.gz

Relevant lines, showing 11 wakeups during September (log file ends at 9-23):

2023-09-04 05:51:30 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-04 06:11:30 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-06 09:41:33 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-06 10:01:33 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-09 23:17:55 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-09 23:37:55 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-14 18:06:48 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-14 18:26:49 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-15 08:06:50 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-15 08:26:50 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-16 08:26:52 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-16 08:46:51 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-16 13:01:52 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-16 13:21:52 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-19 21:36:57 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-19 21:56:57 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-21 15:36:59 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-21 15:56:59 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-21 20:01:59 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-21 20:22:02 [INFO ] [DATAMANAGER] Node 4494 is ON.

2023-09-23 15:02:41 [INFO ] [POWERMANAGER] Random wakeup for node 4494
2023-09-23 15:22:41 [INFO ] [DATAMANAGER] Node 4494 is ON.

I have checked that each of these wakeups was successful.
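A minimal sketch of how the count above could be reproduced from a log file, assuming every random wakeup is logged in exactly the format shown in the excerpt. The function name `count_random_wakeups` and the `sample` list are illustrative and not part of farmerbot.

```python
import re
from collections import Counter

# Matches lines such as:
# 2023-09-04 05:51:30 [INFO ] [POWERMANAGER] Random wakeup for node 4494
WAKEUP_RE = re.compile(
    r"^(\d{4}-\d{2})-\d{2} \d{2}:\d{2}:\d{2} \[INFO \] \[POWERMANAGER\] "
    r"Random wakeup for node (\d+)$"
)


def count_random_wakeups(lines):
    """Return a Counter keyed by (node_id, 'YYYY-MM')."""
    counts = Counter()
    for line in lines:
        match = WAKEUP_RE.match(line.strip())
        if match:
            month, node_id = match.groups()
            counts[(int(node_id), month)] += 1
    return counts


# A few lines copied from the excerpt above (not the full log).
sample = [
    "2023-09-04 05:51:30 [INFO ] [POWERMANAGER] Random wakeup for node 4494",
    "2023-09-04 06:11:30 [INFO ] [DATAMANAGER] Node 4494 is ON.",
    "2023-09-06 09:41:33 [INFO ] [POWERMANAGER] Random wakeup for node 4494",
    "2023-09-06 10:01:33 [INFO ] [DATAMANAGER] Node 4494 is ON.",
    "2023-09-09 23:17:55 [INFO ] [POWERMANAGER] Random wakeup for node 4494",
]

for (node_id, month), total in sorted(count_random_wakeups(sample).items()):
    print(f"node {node_id}: {total} random wakeups in {month}")
```

To run this against an attached log, the lines can be read with `gzip.open(path, "rt")` from the standard library and passed to the same function.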

Nodes 4486 & 4488

Colossus02.log.gz

Each node had twelve random wakeups during the days of September included in the log file.

@MarioBassem
Contributor

I tried to run farmerbot, but it had dependency issues: crystallib has been heavily refactored, and baobab is now merged into crystallib. After fixing some imports, I found that the twinclient module is now archived. I'm guessing it has been deprecated in favor of griddriver, which does not yet have all the functionality of twinclient.
I then figured it would be simpler to use a version of crystallib from before the refactoring. Farmerbot does not state which crystallib version it uses, though; it just uses the latest. So I picked a commit that predates the refactoring work, and I will have farmerbot use that specific version.

@MarioBassem
Contributor

After some investigation with @ashraffouda, we found that this message is logged whenever a random wakeup is attempted for a node. The wakeup may actually fail, and when it does, the farmerbot does not count the attempt. Hence there may be more than 10 log messages per node each month.
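The behavior described here can be modeled with a small sketch. This is a hypothetical Python model of the explanation, not farmerbot's actual code: the class `WakeupBudget`, its fields, and the limit of 10 as a constructor default are assumptions made for illustration.

```python
class WakeupBudget:
    """Hypothetical model: every attempt is logged, only successes are counted."""

    def __init__(self, limit=10):
        self.limit = limit
        self.counted = 0  # attempts that count against the monthly limit
        self.logged = 0   # stands in for "Random wakeup for node ..." log lines

    def attempt(self, wakeup_succeeded):
        """Try one random wakeup; return False if the budget is exhausted."""
        if self.counted >= self.limit:
            return False
        self.logged += 1
        if wakeup_succeeded:
            self.counted += 1
        return True


# Three failed wakeups followed by successful ones until the budget is used up.
budget = WakeupBudget()
outcomes = [False, False, False] + [True] * 12
for outcome in outcomes:
    budget.attempt(outcome)

print(budget.logged, budget.counted)
```

In this model, log lines exceed the limit only by the number of failed attempts. If every attempt succeeds, `logged` and `counted` stay equal and neither passes the limit.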

@scottyeager
Contributor Author

Hi @MarioBassem,

Thanks for having a look. Please check the first example again, node 4494. I have added the log lines for the first subsequent report that the node is up after each random wakeup is initiated. They happen very regularly, about 20 minutes after wakeup is initiated, and I don't see any evidence that failed wakeups are the root cause in this case.
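A minimal sketch of how the delay between each wakeup and the next "is ON" report could be measured, assuming the two log formats shown in the node 4494 excerpt. The function name `wakeup_delays` and the `sample` lines are illustrative only.

```python
import re
from datetime import datetime

# Splits a log line into timestamp, component, and message.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[INFO \] "
    r"\[(POWERMANAGER|DATAMANAGER)\] (.*)$"
)


def wakeup_delays(lines, node_id):
    """Pair each random wakeup with the first later ON report for the node."""
    wakeup_msg = f"Random wakeup for node {node_id}"
    on_msg = f"Node {node_id} is ON."
    pending = None  # timestamp of a wakeup not yet followed by an ON report
    delays = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(3)
        if message == wakeup_msg:
            # A new wakeup replaces any earlier one that never saw an ON report.
            pending = stamp
        elif message == on_msg and pending is not None:
            delays.append(stamp - pending)
            pending = None
    return delays


# Two wakeup/ON pairs copied from the node 4494 excerpt.
sample = [
    "2023-09-04 05:51:30 [INFO ] [POWERMANAGER] Random wakeup for node 4494",
    "2023-09-04 06:11:30 [INFO ] [DATAMANAGER] Node 4494 is ON.",
    "2023-09-21 20:01:59 [INFO ] [POWERMANAGER] Random wakeup for node 4494",
    "2023-09-21 20:22:02 [INFO ] [DATAMANAGER] Node 4494 is ON.",
]

for delay in wakeup_delays(sample, 4494):
    print(delay.total_seconds() / 60)
```

A wakeup with no matching ON report produces no entry in the result, so comparing the number of delays with the number of wakeup lines gives a rough check for failed wakeups.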

@ashraffouda

I tried to run farmerbot, but it had dependency issues: crystallib has been heavily refactored, and baobab is now merged into crystallib. After fixing some imports, I found that the twinclient module is now archived. I'm guessing it has been deprecated in favor of griddriver, which does not yet have all the functionality of twinclient. I then figured it would be simpler to use a version of crystallib from before the refactoring. Farmerbot does not state which crystallib version it uses, though; it just uses the latest. So I picked a commit that predates the refactoring work, and I will have farmerbot use that specific version.

I was checking why farmerbot is now broken. A lot of dependencies are gone, and I cannot figure out why, or where they are now: things like baobab.actionrunner, baobab.client, and twinclient. I cannot go forward with this. If things get moved or refactored, everything that depends on them should be updated as well.
@despiegk and @timurgordon, can you help with this if you know where things have gone?

@scottyeager
Contributor Author

Since crystallib has no recent version tags, we can only infer which version the latest builds of farmerbot were built with by checking the dates:

image

image

image

So to continue development immediately, it looks like we can work against commit 05a5794 of crystallib and the latest commit from the development branch of baobab in its own repo.

I'm able to build farmerbot locally with this approach. Going forward, we need tags to reference on crystallib. I'll request this.

@xmonader xmonader removed this from 3.12.x Nov 28, 2023
@xmonader xmonader closed this as completed Aug 1, 2024
@xmonader xmonader added this to 3.15.x and 3.14.x Aug 1, 2024
@xmonader xmonader moved this to Done in 3.14.x Aug 1, 2024
@xmonader xmonader removed this from 3.15.x Oct 3, 2024