Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guarding and encapsulating properly CStakerStatus inside the wallet. #2131

Closed

Conversation

furszy
Copy link

@furszy furszy commented Jan 11, 2021

Solving an edge case segfault at shutdown caused by the lack of thread access synchronization for the pStakerStatus member inside the wallet. The member is accessed by at least three different threads directly. One from the miner, the wallet shutdown and the GUI status update.

@furszy furszy self-assigned this Jan 11, 2021
Copy link

@random-zebra random-zebra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. But maybe it would be cleaner to have the mutex defined inside the CStakerStatus class (rather than the wallet), and used directly in its setters/getters (rather than on the wallet wrappers).

Comment on lines +552 to +553
Optional<uint256> opLastHash = pwallet->GetStakingStatusLastHash();
if (!opLastHash || g_best_block == *opLastHash)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:
we have already checked that pwallet->HasStakingStatus(), so we know that GetStakingStatusLastHash() returns a non-nullopt. So we can just do

Suggested change
Optional<uint256> opLastHash = pwallet->GetStakingStatusLastHash();
if (!opLastHash || g_best_block == *opLastHash)
if (g_best_block == *pwallet->GetStakingStatusLastHash())

or, alternatively, we can keep this as is, and remove the previous call to pwallet->HasStakingStatus() at line 546.

@furszy
Copy link
Author

furszy commented Jan 12, 2021

The goal behind the mutex in the wallet is to guard the pointer as well, not only the internal object state. The segfault that had was due a bad pointer access. Thus why synchronized the access to it here.
Will do another change to protect the access and be able to encapsulate the mutex inside the staking status.

@random-zebra
Copy link

The goal behind the mutex in the wallet is to guard the pointer as well

Could use cs_wallet to guard the pointer itself (though it shouldn't be needed, as that's only set in CWallet::SetNull).

@furszy
Copy link
Author

furszy commented Jan 12, 2021

we have to go into the opposite direction, prevent using the wallet mutex as much as possible. The wallet mutex is already being obtained in the message processing thread (and many other threads as well), every unnecessary lock that we do there, it will slow down the sync time and the GUI responsiveness.

Plus, this kind of data has nothing to do with the wallet and could actually be decoupled to another file. But well, for that we should encapsulate the miner properly into an object first etc etc.


though it shouldn't be needed, as that's only set in CWallet::SetNull

Before the second commit, we were deleting the pointer manually. Which in most of the scenarios should be fine but.. for some reason the issue popped up here twice and the debugger caught it. It could be a bad wallet pointer at shutdown that tries to access the member as well.
I'm doing some changes now to directly prevent the access at shutdown.

@furszy furszy force-pushed the 2020_wallet_guard_staking_status branch from 426ad40 to a9a4960 Compare January 12, 2021 18:11
@furszy
Copy link
Author

furszy commented Jan 12, 2021

Updated.

@random-zebra random-zebra added this to the 5.1.0 milestone Jan 17, 2021
Copy link

@random-zebra random-zebra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't introduce custom copy-constructors, unless absolutely needed (and in that case we should provide also copy operator and destructor).

Here we are defining one, because of the mutex.
But simply ignoring the mutex in the constructor is unsafe, and, in general, defining a thread-safe copy is tricky, degrades performance (we have to first construct, then copy-assign... can't use the initializer list), and is just ugly (e.g. what if our class has const members).

We have the same issue with some tiertwo classes, which will be fixed in the future refactoring.
In most cases, these issues can be avoided with a better class architecture (thread design should occur mostly at the application level, not on a per-class basis).

In this specific case, CStakerStatus is class with a single object, which is shared (via a unique_ptr) across all threads. We don't need to copy it. Ever.
The only reason why we are adding these constructors, is because of Optional<>, which is only used as return value of GetStakerStatus (which is only used by the getstakingstatus rpc).
We can simply return the raw pointer there, thus avoiding the need for a custom copy-ctor directly.

Comment on lines 4782 to 4790
void CStakerStatus::SetNull()
{
LOCK(cs_stakerStatus);
SetLastCoins(0);
SetLastTries(0);
SetLastTip(nullptr);
SetLastTime(0);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each one of the Set* functions here will lock cs_stakerStatus independently, so we either remove the initial LOCK, or we keep it, but then operate directly on the private memberts (nCoins, nTries, tipBlock and nTime), instead of calling the setters.

I prefer the second one (as it keeps the whole SetNull atomic).

Comment on lines 191 to 192
explicit CStakerStatus() {}
CStakerStatus(const CStakerStatus &status);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove these constructors. They are only needed if we use Optional<CStakerStatus>.
And, in general, we should follow the rule of three (ref: #2089): i.e. if we really need a custom copy ctor, then we also need also a custom copy assignment operator, and a custom destructor.

Comment on lines 4793 to 4799
CStakerStatus::CStakerStatus(const CStakerStatus &status)
{
nCoins = status.nCoins;
nTime = status.nTime;
nTries = status.nTries;
tipBlock = status.tipBlock;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not thread-safe.
Both this and status could be accessed by other threads during the initialization (neither this.cs_stakerStatus nor status.cs_stakerStatus is being held).

As a rule of thumb, we should avoid copying objects of thread-safe classes.

@furszy
Copy link
Author

furszy commented Jan 23, 2021

ok yes, agree on every aspect. Doing the changes.

@furszy furszy force-pushed the 2020_wallet_guard_staking_status branch from a9a4960 to f17d80d Compare January 23, 2021 17:07
@random-zebra random-zebra modified the milestones: 5.1.0, Future Mar 1, 2021
@furszy furszy marked this pull request as draft March 4, 2021 23:39
@furszy
Copy link
Author

furszy commented Mar 4, 2021

Moved to draft. Will back to this one after v5.1.0.

random-zebra added a commit that referenced this pull request Mar 6, 2021
c853471 init: shutdown and return early if ActivateBestChain() fails. (furszy)
1961163 qt: Poll ShutdownTimer after init is done (Marco)
21d56e7 [GUI] Topbar: don't try to poll for data if shutdown was requested. (furszy)

Pull request description:

  First commit was decoupled from #2131, the topbar staking status polling should not try to refresh data if shutdown was requested.
  Second commit is coming from bitcoin#12377.
  Third commit is starting the shutdown process and returning early (without starting the import thread) if the best chain cannot be activated during the node's initialization.

ACKs for top commit:
  random-zebra:
    utACK c853471
  Fuzzbawls:
    utACK c853471

Tree-SHA512: 8d5b289df0ba30d5f654b702e49b01127aa2470550d1313ed63022b0e6c737b3d092a50bd9caf759ea1308f20b6bab65a2a01ff89e7758e50826d389b257e16b
@furszy furszy closed this Nov 29, 2022
@furszy furszy deleted the 2020_wallet_guard_staking_status branch November 29, 2022 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants