fix: use 0.0.58 freeze admin account for freeze transaction #365
Conversation
…init-account Signed-off-by: Jeffrey Tang <jeffrey@swirldslabs.com>
Signed-off-by: Jeffrey Tang <jeffrey@swirldslabs.com>
Signed-off-by: Jeffrey Tang <jeffrey@swirldslabs.com>
Unit Test Results - Linux: 1 file, 19 suites, 1m 27s ⏱️. Results for commit a5ce03c. ♻️ This comment has been updated with latest results.
Unit Test Results - Windows: 1 file, 19 suites, 1m 32s ⏱️. Results for commit a5ce03c. ♻️ This comment has been updated with latest results.
E2E Relay Tests Coverage Report: 1 file, 1 suite, 3m 1s ⏱️. Results for commit a5ce03c. ♻️ This comment has been updated with latest results.
E2E Tests Coverage Report: 56 tests, 55 ✅, 5m 53s ⏱️. For more details on these failures, see this check. Results for commit a5ce03c. ♻️ This comment has been updated with latest results.
E2E Node PEM Stop Add Tests Coverage Report: 1 file, 1 suite, 9m 41s ⏱️. For more details on these failures, see this check. Results for commit a5ce03c. ♻️ This comment has been updated with latest results.
E2E Mirror Node Tests Coverage Report: 11 tests, 11 ✅, 3m 42s ⏱️. Results for commit a5ce03c. ♻️ This comment has been updated with latest results.
E2E Node PFX Kill Add Tests Coverage Report: 1 file, 1 suite, 10m 34s ⏱️. For more details on these failures, see this check. Results for commit a5ce03c. ♻️ This comment has been updated with latest results.
E2E Node Local Build Tests Coverage Report: 8 tests, 8 ✅, 5m 29s ⏱️. Results for commit ebfa104. ♻️ This comment has been updated with latest results.
Co-authored-by: Jeromy Cannon <jeromy@swirldslabs.com> Signed-off-by: JeffreyDallas <39912573+JeffreyDallas@users.noreply.github.com>
…init-account Signed-off-by: Jeffrey Tang <jeffrey@swirldslabs.com>
```js
// set operator of freeze transaction as freeze admin account
const accountKeys = await this.accountManager.getAccountKeysFromSecret(FREEZE_ADMIN_ACCOUNT, config.namespace)
const freezeAdminPrivateKey = accountKeys.privateKey
```
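For context, a minimal sketch of how the retrieved key would drive the freeze transaction, assuming the @hashgraph/sdk client API; the exact wiring inside solo may differ, and `networkConfig` here is a stand-in for the node address book:

```js
// Sketch only: set 0.0.58 (the freeze admin) as operator, then submit a freeze.
const { Client, FreezeTransaction, FreezeType, Timestamp, PrivateKey } = require('@hashgraph/sdk')

const client = Client.forNetwork(networkConfig) // networkConfig is a placeholder
client.setOperator(FREEZE_ADMIN_ACCOUNT, PrivateKey.fromString(freezeAdminPrivateKey))

const freezeTx = await new FreezeTransaction()
  .setFreezeType(FreezeType.FreezeOnly)                               // freeze without an upgrade
  .setStartTimestamp(Timestamp.fromDate(new Date(Date.now() + 5000))) // start ~5s from now
  .freezeWith(client)

const receipt = await (await freezeTx.execute(client)).getReceipt(client)
```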
The E2E test for adding a node failed here because accountKeys is "undefined"; it seems the 0.0.58 account does not preexist yet.
If I add node init() first, then there are errors during the account update:
failed to update account keys for accountId 0.0.300
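A defensive guard at this call site would make that failure mode clearer. A minimal sketch, assuming a plain Error (solo's own error type may differ):

```js
// Sketch: fail fast with a descriptive error instead of crashing later on undefined
const accountKeys = await this.accountManager.getAccountKeysFromSecret(FREEZE_ADMIN_ACCOUNT, config.namespace)
if (!accountKeys || !accountKeys.privateKey) {
  throw new Error(`keys for freeze admin account ${FREEZE_ADMIN_ACCOUNT} not found in ` +
    `namespace ${config.namespace}; has 'solo account init' been run?`)
}
const freezeAdminPrivateKey = accountKeys.privateKey
```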
In your logs it looks like `solo account init` is failing. Does it ever pass? Originally, I had it running the full set of accounts in an E2E test, but Lenin reduced it to a subset to make it run faster. So, we haven't been running the full set on a regular basis.
If I run the following commands from a terminal, everything works:

```bash
kind delete cluster -n "${SOLO_CLUSTER_NAME}" || true
kind create cluster -n "${SOLO_CLUSTER_NAME}" || return
solo init -d ${FST_CHARTS_DIR} --namespace "${SOLO_NAMESPACE}" -i node0,node1,node2 -t v0.49.0-alpha.2 -s "${SOLO_CLUSTER_SETUP_NAMESPACE}" || return
solo node keys --gossip-keys --tls-keys --key-format pem || return
solo cluster setup || return
solo network deploy || return
solo node setup || return
solo node start || return
solo account init || return
solo node add -i node3 || return
```
but the E2E node add test did not have a `solo account init` step, so I added:

```js
it('should succeed with init command', async () => {
  const status = await accountCmd.init(argv)
  expect(status).toBeTruthy()
}, 180000)
```
Originally, when I wrote this code, I found that the first time the nodes did a write transaction they would create the system accounts and pause for a few hundred milliseconds. This caused an issue where the grpc calls, all being released at once, would fail.

Later, when Lenin was trying to fix the logic, he removed all of that and, instead of running the updates in parallel, made them run sequentially. He then found the bug was actually in @hashgraph/sdk and put in a fix to correct the loggers it was creating and not closing. I asked him to re-enable the parallel runs, which he did. But I don't think he updated the logic to do the initial single account update call before releasing the rest. He also removed the throttling I had. The throttling was less than ideal; however, it helped me get past the issue created by the logger leak (which I did not know existed at the time) and prevented me from overwhelming the consensus nodes.

I'm guessing this might be related to either:
- the consensus nodes pausing to do the genesis system account creation logic and rejecting the flood of grpc calls, or
- the haproxy and/or consensus node being overwhelmed with connections and not keeping up correctly.

In the first case, you can check the timestamps. I believe in the hgcaa.log you can see when it is creating the system accounts, and you can line that up with your grpc call timestamps.

In the second case, you could try figuring out how to fix the consensus node or haproxy to handle the connections without error; or you could throttle the transactions to prevent releasing too many at once (see the sketch below).

It has been a while since I looked at this code, but I think in one version Lenin released a certain number of calls before waiting and releasing more. If that code is still there, you could reduce that number.
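To illustrate the pattern described above (one warm-up update before a throttled parallel release), here is a hedged sketch; `updateAccountKeys` is a hypothetical stand-in for the real per-account update call, not solo's actual implementation:

```js
// Sketch: run a single account update first so genesis system account creation
// completes, then release the remaining updates in bounded batches.
async function updateAccountsThrottled (accountIds, updateAccountKeys, batchSize = 5) {
  const [first, ...rest] = accountIds
  await updateAccountKeys(first) // warm-up call: absorbs the genesis pause alone

  for (let i = 0; i < rest.length; i += batchSize) {
    const batch = rest.slice(i, i + batchSize)
    await Promise.all(batch.map(id => updateAccountKeys(id))) // bounded parallelism
  }
}
```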
Thanks for the clarification.
I just wonder why the same piece of code works fine when `solo account init` is used from the command line, but fails when called via accountCmd.init(argv). The throttling and parallel/sequential mechanism should be the same in both scenarios.
When you run from the command line, it dumps the configs into the log file. I think it also dumps them when you run accountCmd.init(argv). You could take those rows and compare them to see which parameters differ.
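For example, a quick hypothetical helper to diff two parsed config dumps (function and parameter names here are made up):

```js
// Sketch: print only the keys whose values differ between the two dumps
function diffConfigs (cliConfig, e2eConfig) {
  const keys = new Set([...Object.keys(cliConfig), ...Object.keys(e2eConfig)])
  for (const key of keys) {
    const a = JSON.stringify(cliConfig[key])
    const b = JSON.stringify(e2eConfig[key])
    if (a !== b) console.log(`${key}: cli=${a} e2e=${b}`)
  }
}
```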
Not seeing any difference in the dumped argv values.
Signed-off-by: Jeffrey Tang <jeffrey@swirldslabs.com>
To make sure we compare apples to apples, I created a script and also lowered the total number of system accounts so the test finishes quicker.
When following the setup script, each k8.createSecret() takes 200-300 ms; steps to reproduce and hgcaa log:
When following the setup script via a modified E2E test, each k8.createSecret() takes 7000-8000 ms; steps to reproduce and hgcaa log:
Now I need to figure out why the E2E test scenario is so much slower, even with the same pre-existing cluster/network.
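The per-call timings above could be captured with a small wrapper; a sketch, assuming k8.createSecret() is the method referenced in this thread (the wrapper itself is hypothetical):

```js
// Sketch: wrap k8.createSecret() to log per-call latency in both scenarios
async function timedCreateSecret (k8, ...args) {
  const start = Date.now()
  const result = await k8.createSecret(...args)
  console.log(`k8.createSecret took ${Date.now() - start} ms`)
  return result
}
```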
Signed-off-by: Jeffrey Tang <jeffrey@swirldslabs.com>
no longer needed
Description
This pull request changes the following:
Related Issues