seqid mismatch during CLOSE #96
Comments
Thanks for the info. I will try to reproduce it. |
I'm not able to reproduce this. Do you run any special test, or do you see errors on the application side? |
Since we have run into this issue many times, I wrote a small piece of PHP code to reproduce it. It's quite straightforward: perform some reads or writes on the same file from different processes at the same time and the issue shows up. The code forks 4 processes, and each process tries to write a line to the same log file 4 times. Here is the test code (it can run either a write test or a read test):
And the results:
A total time of 16 seconds clearly points to a blocking issue. And the content of the log file that was written:
You can see a blocking time of around 5 seconds on some operations. |
Is this possibly related to #120? |
@dkocher Sounds like it. The reproducer on my side was
|
Motivation: As parallel threads modify the open-stateid sequence number, the FileTracker should always return a copy, so that the sequence is not modified before the reply is sent.
Modification: Update FileTracker#addOpen and FileTracker#downgradeOpen to return a copy of the stateid.
Result: Spec-compliant behaviour in a concurrent environment.
Fixes: dCache#96
Acked-by: Albert Rossi
Target: master, 0.24, 0.23
(cherry picked from commit 73d73af)
Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
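The effect described in the commit message can be illustrated with a minimal, single-threaded sketch. It is not the nfs4j code: `Stateid`, `Tracker`, `addOpenShared` and `addOpenCopy` are hypothetical, simplified stand-ins, and the "second thread" is simulated by simply calling the tracker again before the first reply would have been sent.

```java
public class StateidCopyDemo {

    /** Hypothetical, simplified stand-in for an NFSv4 stateid (opaque id plus mutable seqid). */
    static final class Stateid {
        final byte[] other;
        int seqid;

        Stateid(byte[] other, int seqid) {
            this.other = other.clone();
            this.seqid = seqid;
        }

        /** Returns an independent snapshot of this stateid. */
        Stateid copy() {
            return new Stateid(other, seqid);
        }
    }

    /** Hypothetical tracker that keeps one shared open-stateid for a file. */
    static final class Tracker {
        private final Stateid shared = new Stateid(new byte[12], 0);

        /** Unsafe variant: bumps the seqid and leaks the shared, mutable instance. */
        synchronized Stateid addOpenShared() {
            shared.seqid++;
            return shared;
        }

        /** Safe variant: bumps the seqid and returns a snapshot taken under the lock. */
        synchronized Stateid addOpenCopy() {
            shared.seqid++;
            return shared.copy();
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();

        Stateid leaked = tracker.addOpenShared(); // first OPEN, seqid is 1 at this point
        Stateid snapshot = tracker.addOpenCopy(); // second OPEN, snapshot holds seqid 2
        tracker.addOpenCopy();                    // a concurrent OPEN bumps the shared seqid to 3

        // The leaked reference now reports the seqid of a later OPEN.
        System.out.println("leaked seqid   = " + leaked.seqid);   // prints 3
        // The snapshot still reports the seqid that belongs to its own OPEN.
        System.out.println("snapshot seqid = " + snapshot.seqid); // prints 2
    }
}
```

The copy is taken while the lock is still held, so the value placed in the reply cannot change between the state update and the moment the reply is encoded.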
When multiple OPEN and CLOSE operations are interleaved, it looks like the wrong seqid is used for the test:
nfs4j/core/src/main/java/org/dcache/nfs/v4/OperationCLOSE.java
Line 58 in b4af4c3
Before this test, we added a log statement to check the different values:
_log.info("[contextCurrentStateId = " + context.currentStateid() + " , _args.opclose.open_stateid = " + _args.opclose.open_stateid + " , nfsState.stateid() = " + nfsState.stateid() + " , stateid = " + stateid + "]");
The seqid returned by nfsState.stateid() seems to be the one from the last OPEN operation, so the test fails. Here is the capture and the log:
bug-nfs-multiple-open-close.zip
In the log, you'll notice three places where the seqid isn't the expected one. And in the capture, we can see that it's the seqid of the previous OPEN. The NFS client is a Linux VM (minikube).
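To make the failing comparison concrete, here is a rough sketch of a seqid check as the NFSv4 specifications describe it: a non-zero seqid lower than the server's current value is reported as NFS4ERR_OLD_STATEID, a higher one as NFS4ERR_BAD_STATEID. This is not the code at the referenced line of OperationCLOSE.java; `checkSeqid` and `Result` are hypothetical names. The sketch also assumes NFSv4.1 semantics, where a seqid of zero means "use the current value", and it ignores seqid wraparound.

```java
public class SeqidCheckDemo {

    /** Hypothetical result type standing in for the NFS status codes. */
    enum Result { OK, OLD_STATEID, BAD_STATEID }

    /**
     * Compares the seqid sent by the client with the server's current seqid.
     * Assumption: NFSv4.1 rules, no wraparound handling.
     */
    static Result checkSeqid(int current, int fromClient) {
        if (fromClient == 0) {
            return Result.OK;          // zero: client asks for the most recent state
        }
        if (fromClient < current) {
            return Result.OLD_STATEID; // client holds an outdated seqid
        }
        if (fromClient > current) {
            return Result.BAD_STATEID; // client sent a seqid the server never issued
        }
        return Result.OK;
    }

    public static void main(String[] args) {
        int serverSeqid = 0;
        int firstOpen = ++serverSeqid;  // reply to first OPEN carries seqid 1
        int secondOpen = ++serverSeqid; // reply to second OPEN carries seqid 2

        // A CLOSE that still uses the seqid of the first OPEN no longer matches.
        System.out.println(checkSeqid(serverSeqid, firstOpen));  // prints OLD_STATEID
        // A CLOSE that uses the seqid of the latest OPEN passes.
        System.out.println(checkSeqid(serverSeqid, secondOpen)); // prints OK
    }
}
```

Whichever side holds the stale or too-new value, the check only passes when the seqid in the request and the seqid in the server state refer to the same state transition, which is why interleaved OPEN and CLOSE operations are sensitive to it.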