Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.0.3 -> main] Test failure: p2p_sync_throttle_test #898

Merged
merged 3 commits into from
Oct 7, 2024
Merged

Conversation

greg7mdp
Copy link
Contributor

@greg7mdp greg7mdp commented Oct 7, 2024

Resolves #882.

While moving some code around (see here), a call to:

void connection::close(bool reconnect, bool shutdown)

became a call to:

int close(int __fd)

This restores the intended behavior (close the actual connection, instead of closing fd number 1)

Copy link
Member

@spoonincode spoonincode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obviously we need to merge this in now, but seems the tests are all failing

@heifner
Copy link
Member

heifner commented Oct 7, 2024

obviously we need to merge this in now, but seems the tests are all failing

Seems it has exposed a bug in sync wait timeout causing syncing to not continue correctly. I'm looking into it.

@greg7mdp
Copy link
Contributor Author

greg7mdp commented Oct 7, 2024

Seems it has exposed a bug in sync wait timeout causing syncing to not continue correctly. I'm looking into it.

I wonder if sync_reassign_fetch:

sync_reassign_fetch( c );
c->close(true);
)

calls request_next_chunk:

void sync_manager::sync_reassign_fetch(const connection_ptr& c) {
fc::unique_lock g( sync_mtx );
if( c == sync_source ) {
peer_ilog(c, "reassign_fetch, our last req is ${cc}, next expected is ${ne}",
("cc", sync_last_requested_num)("ne", sync_next_expected_num));
c->cancel_sync();
auto lib = my_impl->get_chain_lib_num();
sync_last_requested_num = 0;
sync_next_expected_num = std::max(sync_next_expected_num, lib + 1);
sync_source.reset();
request_next_chunk();
}
}

which reassigns the same connection to sync_source, which we close immediately afterwards?

@heifner
Copy link
Member

heifner commented Oct 7, 2024

which reassigns the same connection to sync_source, which we close immediately afterwards?

Yes, that is possible in general and since only one connection for this test then it would be the same one. However, I think the issue here is that the throttling node is not sending any blocks after it hits the throttle state.

@greg7mdp
Copy link
Contributor Author

greg7mdp commented Oct 7, 2024

since only one connection for this test then it would be the same one

How can it not be a problem if we post a request and then immediately after close the connection?

@heifner
Copy link
Member

heifner commented Oct 7, 2024

How can it not be a problem if we post a request and then immediately after close the connection?

The connection has not sent a block in an expected timeframe. The response to that is to disconnect. That is our only connection for this test, but regardless, it is the right thing to do. The real bug here is that the node didn't send a block in the expected time frame, see #899.

If we close the connection then any pending requests are not sent.

@greg7mdp greg7mdp changed the title [1.0 -> main] Test failure: p2p_sync_throttle_test [1.0.3 -> main] Test failure: p2p_sync_throttle_test Oct 7, 2024
@greg7mdp greg7mdp merged commit c1e6754 into main Oct 7, 2024
30 of 36 checks passed
@greg7mdp greg7mdp deleted the gh_882_main branch October 7, 2024 20:16
@ericpassmore ericpassmore added bug The product is not working as was intended. test-instability tag issues for flaky tests, high priority to address labels Oct 8, 2024
@ericpassmore
Copy link
Contributor

Note:start
category: System Stability
component: P2P
summary: Fix p2p sync to close connection.
Note:end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The product is not working as was intended. test-instability tag issues for flaky tests, high priority to address
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test failure: p2p_sync_throttle_test
4 participants