After writes to the storage have stopped, scanning the database with get_unordered_scanners can return incomplete data: if a replicaserver is restarted during the scan, some shards are not scanned completely and the corresponding scanners exit early.
From reading the code, the cause appears to be as follows:
A scan consists of two kinds of client requests to the server, handled by on_get_scanner and on_scan. on_get_scanner establishes a scan context between the client and the server; on_scan is then called repeatedly by the client to read data from the server within that context.
When a replicaserver is restarted, some primary shards are transferred to a new replicaserver, so the context that the client and server established through on_get_scanner is lost. The client's retry mechanism then calls on_get_scanner again to establish a new context with the new replicaserver, but this time it passes the last scanned key as start_key. Because start_key is now a non-empty string, on_get_scanner mistakenly treats the request as a scan over a fixed hashkey instead of a full scan.
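To make the misclassification concrete, here is a minimal Python sketch of the decision described above. It is not the Pegasus server code: `GetScannerRequest`, `infer_scan_kind`, and the sample key bytes are hypothetical names and values chosen for illustration, and the only behavior taken from the analysis is that a non-empty start_key is read as a fixed-hashkey scan.

```python
from dataclasses import dataclass


@dataclass
class GetScannerRequest:
    # Hypothetical, simplified request: only the field discussed above.
    start_key: bytes = b""


def infer_scan_kind(req: GetScannerRequest) -> str:
    """Toy stand-in for the check described in the analysis:
    an empty start_key means full scan, anything else means hashkey scan."""
    if req.start_key == b"":
        return "full_scan"
    return "hashkey_scan"


# First request of a full scan: start_key is empty.
first = GetScannerRequest()

# Retry after the context was lost: the client resumes from the last
# scanned key, so start_key is no longer empty (sample bytes are made up).
resumed = GetScannerRequest(start_key=b"hashkey-1:sortkey-7")

print(infer_scan_kind(first))
print(infer_scan_kind(resumed))
```

The two requests belong to the same logical full scan, yet the second one is classified differently purely because it carries a resume position.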
@Smityz @shuo-jia This is not the problem fixed in XiaoMi/pegasus-java-client#156 and XiaoMi/pegasus-go-client#86; it is a new problem.
In this problem, the server side loses its context with the client, and the Java client's logic hits the block shown in the picture above.
In this case, the Java client calls 'on_get_scanner' again and restarts the scan process, but this time the "start_key" field in the request struct is filled with the hashkey, which causes the bug I described above.
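The restart path described above can be reproduced with a small toy model. This is a sketch under stated assumptions, not the real client or server: `ToyPartition`, `client_scan`, `ContextLost`, the tuple keys, and the batch size are all hypothetical, and the server side simply models the reported behavior (a non-empty start_key limits the scan to that key's hashkey).

```python
# Toy data: (hashkey, sortkey) pairs stored in one partition.
ROWS = [("h1", "s1"), ("h1", "s2"), ("h1", "s3"),
        ("h2", "s1"), ("h2", "s2"), ("h3", "s1")]


class ContextLost(Exception):
    """Raised when the server no longer knows the scan context."""


class ToyPartition:
    def __init__(self, rows):
        self.rows = sorted(rows)
        self.contexts = {}
        self.next_id = 0

    def get_scanner(self, start_key=None):
        if start_key is None:
            pending = list(self.rows)  # full scan
        else:
            # Models the reported bug: a non-empty start_key is treated
            # as a scan restricted to start_key's hashkey.
            pending = [r for r in self.rows
                       if r > start_key and r[0] == start_key[0]]
        self.next_id += 1
        self.contexts[self.next_id] = pending
        return self.next_id

    def scan(self, ctx_id, batch=2):
        if ctx_id not in self.contexts:
            raise ContextLost()
        pending = self.contexts[ctx_id]
        self.contexts[ctx_id] = pending[batch:]
        return pending[:batch]

    def restart(self):
        self.contexts.clear()  # all scan contexts are lost


def client_scan(partition, restart_after_batches=None):
    got, batches = [], 0
    ctx = partition.get_scanner()
    while True:
        if restart_after_batches is not None and batches == restart_after_batches:
            partition.restart()
            restart_after_batches = None
        try:
            batch = partition.scan(ctx)
        except ContextLost:
            # Retry: rebuild the context, resuming from the last scanned key.
            ctx = partition.get_scanner(start_key=got[-1] if got else None)
            continue
        if not batch:
            return got  # scanner believes it is finished
        got.extend(batch)
        batches += 1


print(len(client_scan(ToyPartition(ROWS))))
print(client_scan(ToyPartition(ROWS), restart_after_batches=1))
```

Without a restart the toy client reads all six rows. With a restart after the first batch, it resumes inside hashkey `h1`, receives only the rest of `h1`, and exits without ever reaching `h2` or `h3`, which matches the early-exit symptom reported in this issue.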