Fix exception when saving and autosave trigger at the same time #6694

Siedlerchr · 2020-07-17T16:49:23Z

Fixes #6684
Fixes #6644
Fixes #6102
Fixes #6000

Summary:
Manual save + autosave produce an exception/race condition on saving, because both want to access the same file for moving the temp file to the target file.

Solution: Throttle multiple saving operations in a kind of queue

Change in CHANGELOG.md described (if applicable)
Tests created for changes (if applicable)
Manually tested changed features in running JabRef (always required)
Screenshots added in PR description (for UI changes)
Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

tobiasdiez · 2020-07-18T10:50:41Z

I wouldn't remove auto-saving. Actually, I feel this should be the standard behavior: why do I need to remember as a user to save the file?

Siedlerchr · 2020-07-18T11:05:35Z

I will try if I can find a way to lock the saving to one thread

tobiasdiez · 2020-07-18T11:09:42Z

That might lead to performance problems: saving might take a few seconds so in the worst case the save requests add up. The same with "Save all" etc.

I think the underlying problem is really that the tmp file is reused between save operations. This is really not a good idea (also for data integrity).

Siedlerchr · 2020-07-18T11:13:31Z

I'm open for ideas: If you have autosave running and manually hit saving you run into problems. Even if you would write to another temp file first that doesn't help. You will get an exception nonetheless as the file is locked by another thread when it is moved.
The idea would be cancel any further attempts of saving of the same database when one is already running.

tobiasdiez · 2020-07-18T12:16:42Z

You will get an exception nonetheless as the file is locked by another thread when it is moved.

Which file is locked?

My idea was:

Write to unique temp file
Copy temp file over target file

The first step should only lock the temp file, and write to it. There is no problem there since every write process has its own temp file, and there is no interaction. The second step is atomic, so there is no locking or inter-thread problem.

In addition to these technicalities, I think it be worthwhile to improve the user experience a bit. I guess it would be good to hard-cancel every running auto-save and normal save (and delete the temporary file), as soon as the user runs a new save for the same file. This reduces overhead.

Siedlerchr · 2020-07-18T15:00:57Z

@tobiasdiez Even though you have atomic moving, the other thread will throw an IllegalAccessException.

Following situation:

Thread Autosave A
Thread Manual Save B

A writes to temp files and now calls the Atomic File Move => Performs Atomic Moves temp file A to target file
B writes to temp file and also wants to call the Atomic File Move (while the other thread is still moving) => Boom! You will get an IllegalAcessException for the target file because the other thread is still in the process of moving and probably has a lock on that file.

To reproduce:
Set a breakpoint in that close method of AtomicFileOutputstream
Type something in a field, wait for autosave to trigger
Manually hit save.

the only way one could solve this would be to cancel the running? save operation

tobiasdiez · 2020-07-18T17:17:10Z

But this exception is coming from this lock, isn't it? (Which comes from the fact that we are writing to the same tmp file)

jabref/src/main/java/org/jabref/logic/exporter/AtomicFileOutputStream.java

Line 85 in ee2c985

temporaryFileLock = ((FileOutputStream) out).getChannel().lock();

Since Files.move(...) is atomic, I cannot set a breakpoint in it - so it's hard to test if an exception is thrown if two request for moving to the same target are received. I would expect that if thread A calls Files.move(...), moving started, thread B calls Files.move(...) while move is in progress, then B simply waits until A is finished (since the move is atomic, the file system shouldn't be able to do anything in between and just queue further moves).

Siedlerchr · 2020-07-18T17:34:05Z

The temporary file lock is never set, its always null because the Files... Returns directly a File Channel. (see the comment in the code in my changes).
It's sufficient to set a breakpoint in the close method, wait for both threads to arrive and then continue running.
Atomic move will only work atomic if both files are on the same file disk (at least for unix)

I will update the code to show the problem

tobiasdiez · 2020-07-18T19:01:46Z

Good! Can you also please add a comment at the place where the exception is thrown (I don't really have time right now to play with this in detail).

Siedlerchr · 2020-07-18T19:43:57Z

I tested the idea to create the file in the System's TEMP Folder, but as I expected an atomic move is not possible if the TEMP folder is on a different disk than the library file. (My Temp folder is on a SSD, while my jabref bib is on a different HDD, "Atomic move between providers is not supported)

I tested you idea with a unique file name in the same folder which results in an exception when multipe threads try to acess the file. I commented in the code where to put the breakpoint.

Locking the section could solve this, but the problem is that when the saving is done manually, it's the FX thread which executes the save operation. So that should also be a background task

1:25:40.172 [pool-6-thread-1] ERROR org.jabref.logic.autosaveandbackup.BackupManager - Error while saving to fileX:\Users\CS\Documents\_JABREFTEMP\newBiblatexTEst.bib.sav
java.nio.file.AccessDeniedException: X:\Users\CS\Documents\_JABREFTEMP\newBiblatexTEst.bib.sav112661705549979460.tmp -> X:\Users\CS\Documents\_JABREFTEMP\newBiblatexTEst.bib.sav
	at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) ~[?:?]
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) ~[?:?]
	at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309) ~[?:?]
	at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292) ~[?:?]
	at java.nio.file.Files.move(Files.java:1426) ~[?:?]
	at org.jabref.logic.exporter.AtomicFileOutputStream.close(AtomicFileOutputStream.java:203) ~[main/:?]
	at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:353) ~[?:?]
	at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:168) ~[?:?]
	at java.io.OutputStreamWriter.close(OutputStreamWriter.java:255) ~[?:?]
	at org.jabref.logic.exporter.BibDatabaseWriter.savePartOfDatabase(BibDatabaseWriter.java:221) ~[main/:?]
	at org.jabref.logic.exporter.BibDatabaseWriter.saveDatabase(BibDatabaseWriter.java:161) ~[main/:?]
	at org.jabref.logic.autosaveandbackup.BackupManager.performBackup(BackupManager.java:139) ~[main/:?]
	at java.util.Optional.ifPresent(Optional.java:176) ~[?:?]
	at org.jabref.logic.autosaveandbackup.BackupManager.lambda$4(BackupManager.java:164) ~[main/:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
21:25:40.184 [JavaFX Application Thread] INFO  org.jabref.gui.JabRefDialogService - Saving library...

tobiasdiez · 2020-07-18T21:17:55Z

Yeah, seems like we need a wrapper that controls which save operations are in progress, and if there is already one for the target file, then stops the old one and only then starts the new one... damn.

Thanks for your time investigating this!

koppor · 2020-07-21T06:46:36Z

@tobiasdiez's one works, because we use temp files.

Worker-Queue of Guava?

An ADR is required ^^.

* upstream/master: Simplify deployment.yml (#6703) Update snap.yml (#6702) Update snap.yml Fix snap (#6701) change snap action from obsoloete to newer (#6700)

try to combine autosave logic and normal save actions TODO: how to handle return values of saves?

Siedlerchr · 2020-07-26T20:29:50Z

Idea: put the logic into save database action

koppor · 2020-07-27T22:03:46Z

In #5669 (comment) - I put synchronized here and there (with thinking where exactly it shoulb be). Maybe, this could help here, too?

koppor · 2020-07-27T22:18:16Z

This somehow relates to #5967, but I don't know, how.

Siedlerchr · 2020-07-28T05:30:56Z

Synchronizing won't help here (always a new instance created) and I already have an idea how to fix this

calixtus · 2020-08-20T20:28:14Z

What's the status here?

Siedlerchr · 2020-08-20T20:53:25Z

Work in progress. Have an idea to cancel the running or queue the new save actions

* upstream/master: (120 commits) Follow up fix for copy paste (#6820) Add CSS Customisation (#6725) Separate signing and notarizing (#6822) Remove checkstyle hack. 8.36 got released (#6816) Feature/enable lucene query parsing (#6799) New release cycle Bump WyriHaximus/github-action-wait-for-status from v1.1.2 to v1.2 (#6814) Bump mockito-core from 3.5.5 to 3.5.7 (#6813) Bump classgraph from 4.8.87 to 4.8.89 (#6812) Bump me.champeau.gradle.jmh from 0.5.0 to 0.5.1 (#6811) Bump checkstyle from 8.35 to 8.36 (#6810) Improve Changelog Refactor edit action (#6808) Fixed typo in BuildInfo (#6807) disable checkstyle for generated fix checkstyle Simplify check-links.yaml (markdown-link-check) (#6720) Rename /gen to /generate (#6800) Disable CSL refresh on push (#6803) New Crowdin updates (#6804) ... # Conflicts: # src/main/java/org/jabref/gui/JabRefFrame.java

github-actions · 2020-08-31T15:22:24Z

The JabRef maintainers will add the following name to the AUTHORS file. In case you want to use a different one, please comment here and adjust your name in your git configuration for future commits.

Nils Streijffert

prepare save action executor TODO: How to handle return value Wat for future?

Siedlerchr · 2021-04-09T19:09:45Z

Hmmm. Then I think my queue is already sufficient. The key problem is however that we can't create a lock on a file and that there is no atomic move on all systems. So this will create problems with Dropbox or whatever?

tobiasdiez · 2021-04-09T21:13:39Z

Yes, I think the queue would already be a huge improvement.

The only problem I can remember with our "atomic write" has been the changed file security meta data. But that was fixed as well. Do we really have reports for issues with Dropbox?

koppor · 2021-04-10T21:07:09Z

The only problem I can remember with our "atomic write" has been the changed file security meta data. But that was fixed as well. Do we really have reports for issues with Dropbox?

I think, AtomicWrite was introduced to catch for errors during file writing - and have the previous version available. Since there are no reports of corrupted data bases (at least I am not aware of), I would use the "usual" file update behavior. Reason: Sharepoint, Google Drive, Dropbox and other file sharing services IMHO rely on that (especially when it comes to proper file verisoning)

calixtus · 2021-04-11T04:31:53Z

Maybe we can turn the action a bit inside out: before updating create a backup copy of the file and then update the current file?
But I did not work yet much on the file writer, so just a stupid idea of mine. 😅

tobiasdiez · 2021-04-11T09:07:41Z

Since there are no reports of corrupted data bases

Preventing corrupted data is exactly the point of the atomic writer, so having no such reports is proofing that everything is working fine.

file sharing services IMHO rely on that (especially when it comes to proper file verisoning)

Can you elaborate how the atomic writer leads to problems? Since we essentially overwrite the content very quickly (delete & copy), I don't think file sharing services confuse these changes for two files - but I might be wrong.

before updating create a backup copy of the file and then update the current file

That may still leave the main library file in a corrupted state. In the best case, this means just more work for the user (copy backup over current file) and in the worse case means the user looses part of her db (when the corruption is noticed to late). But the atomic writer works in a very similar way, just interchanging the roles of "backup" and "current file": write contents to a new file, and when this succeeds replace the "current file" with this new file (which on some os is even an atomic operation).

* upstream/main: Main instead of master Custom DOI base address fix (#7569) Change export to save (#7518) Bump unoloader from 7.1.1 to 7.1.2 (#7609) Bump org.beryx.jlink from 2.23.5 to 2.23.6 (#7610) Bump com.adarshr.test-logger from 2.1.1 to 3.0.0 (#7611) Bump libreoffice from 7.1.1 to 7.1.2 (#7612) Squashed 'buildres/csl/csl-styles/' changes from e1acabe..bfa3b6d (#7603) Rename master to main

Siedlerchr · 2021-04-23T20:35:12Z

Fuck this shit. Now I'm creating a block when I wait for the save to complete. Argh.
Ideally, the save would happen in the background and the user would be informed if it was a sucess or not.
However, currently we return a boolean if it was sucess or not. This is bad.
And the cancellation also does not really work :( fuck this shit!

* upstream/main: (71 commits) [Bot] Update CSL styles (#7735) Fix for issue 6966: open all files of multiple entries (#7709) Add simple unit tests (#7696) Add simple unit tests (#7543) Update check-outdated-dependencies.yml Added preset for new entry keybindings and reintroduced defaults (#7705) Select the entry which has smaller dictonary order when merge (#7708) Update CHANGELOG.md fix: make more fields, fomatters, ids and languages sorted by alphabetical order (#7717) Bump libreoffice from 7.1.2 to 7.1.3 (#7721) Bump unoloader from 7.1.2 to 7.1.3 (#7724) Bump org.beryx.jlink from 2.23.7 to 2.23.8 (#7723) Bump org.openjfx.javafxplugin from 0.0.9 to 0.0.10 (#7725) fix: make fields sorted by lexicographical order (#7711) Fix tests Refactoring existing unit tests (#7687) Refactoring and addition of unit tests (#7581) Refactor simple Unit Tests (#7571) Add simple unit tests (#7544) add and extend unit tests (#7685) ...

claell · 2022-05-12T18:15:44Z

Possibly, this has already been discussed. But git seems to solve parallel write operations to a repository from multiple git processes with a lock file. Before the process starts, it checks, whether the lock file is there. If not, it creates it, performs the actions and removes the lock file. If the lock is there, it shows a warning.

For JabRef, one might not even need such a lock file, but just a variable that gets set when operations are performed (and finished). Do I miss something important?

koppor · 2023-01-06T22:13:45Z

Now, JabRef has issues on the edge case WSL shares: #9547

Siedlerchr marked this pull request as draft July 17, 2020 16:49

Siedlerchr added the status: devcall label Jul 18, 2020

Use custom unique names for temp fiels

5a05db0

Siedlerchr force-pushed the writebackupToTemp branch from 08a7b16 to 5a05db0 Compare July 18, 2020 19:33

koppor removed the status: devcall label Jul 21, 2020

Siedlerchr added 2 commits July 24, 2020 19:57

Merge remote-tracking branch 'upstream/master' into writebackupToTemp

5bcbb31

* upstream/master: Simplify deployment.yml (#6703) Update snap.yml (#6702) Update snap.yml Fix snap (#6701) change snap action from obsoloete to newer (#6700)

very early draft idea

021012f

try to combine autosave logic and normal save actions TODO: how to handle return values of saves?

koppor mentioned this pull request Jul 27, 2020

Fix change on disk detection JabRef/jabref-koppor#442

Draft

Siedlerchr added 2 commits August 31, 2020 18:06

add start and shutdown methdod

aaa62da

restore old saving

c6651e4

prepare save action executor TODO: How to handle return value Wat for future?

Siedlerchr added 3 commits April 11, 2021 20:17

checkstyle

6b17b69

pass GlobalSaveManager as parameter

6611a63

Siedlerchr mentioned this pull request May 8, 2021

JabRef breaks hard links to .bib files #7718

Open

koppor mentioned this pull request Jun 6, 2021

Fix Google Scholar fetcher for downloading a single entry #7075

Closed

5 tasks

Siedlerchr mentioned this pull request Oct 9, 2021

Library should not be flagged as externally modified when changing library properties #8132

Closed

2 tasks

koppor mentioned this pull request Feb 18, 2022

JabRef writes corrupted library files that can no longer be read, leading to dataloss #8484

Closed

2 tasks

koppor added this to the v5.6 milestone Feb 18, 2022

koppor self-assigned this Feb 18, 2022

koppor modified the milestones: v5.6, v5.7 Mar 28, 2022

ThiloteE added bug Confirmed bugs or reports that are very likely to be bugs export / save labels Apr 20, 2022

koppor mentioned this pull request May 2, 2022

More conservative write #8750

Closed

14 tasks

ThiloteE modified the milestones: v5.7, v5.8 Jul 14, 2022

koppor mentioned this pull request Aug 16, 2022

AtomicFileOutputStream does not overwrite file if exception occurred during write #9067

Merged

6 tasks

koppor closed this Aug 16, 2022

Siedlerchr deleted the writebackupToTemp branch December 18, 2022 20:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix exception when saving and autosave trigger at the same time #6694

Fix exception when saving and autosave trigger at the same time #6694

Siedlerchr commented Jul 17, 2020 •

edited

Loading

tobiasdiez commented Jul 18, 2020

Siedlerchr commented Jul 18, 2020

tobiasdiez commented Jul 18, 2020 •

edited

Loading

Siedlerchr commented Jul 18, 2020

tobiasdiez commented Jul 18, 2020

Siedlerchr commented Jul 18, 2020

tobiasdiez commented Jul 18, 2020 •

edited

Loading

Siedlerchr commented Jul 18, 2020 •

edited

Loading

tobiasdiez commented Jul 18, 2020

Siedlerchr commented Jul 18, 2020

tobiasdiez commented Jul 18, 2020

koppor commented Jul 21, 2020

Siedlerchr commented Jul 26, 2020

koppor commented Jul 27, 2020

koppor commented Jul 27, 2020

Siedlerchr commented Jul 28, 2020

calixtus commented Aug 20, 2020

Siedlerchr commented Aug 20, 2020

github-actions bot commented Aug 31, 2020

Siedlerchr commented Apr 9, 2021

tobiasdiez commented Apr 9, 2021

koppor commented Apr 10, 2021

calixtus commented Apr 11, 2021

tobiasdiez commented Apr 11, 2021

Siedlerchr commented Apr 23, 2021 •

edited

Loading

claell commented May 12, 2022

koppor commented Jan 6, 2023

Fix exception when saving and autosave trigger at the same time #6694

Fix exception when saving and autosave trigger at the same time #6694

Conversation

Siedlerchr commented Jul 17, 2020 • edited Loading

tobiasdiez commented Jul 18, 2020

Siedlerchr commented Jul 18, 2020

tobiasdiez commented Jul 18, 2020 • edited Loading

Siedlerchr commented Jul 18, 2020

tobiasdiez commented Jul 18, 2020

Siedlerchr commented Jul 18, 2020

tobiasdiez commented Jul 18, 2020 • edited Loading

Siedlerchr commented Jul 18, 2020 • edited Loading

tobiasdiez commented Jul 18, 2020

Siedlerchr commented Jul 18, 2020

tobiasdiez commented Jul 18, 2020

koppor commented Jul 21, 2020

Siedlerchr commented Jul 26, 2020

koppor commented Jul 27, 2020

koppor commented Jul 27, 2020

Siedlerchr commented Jul 28, 2020

calixtus commented Aug 20, 2020

Siedlerchr commented Aug 20, 2020

github-actions bot commented Aug 31, 2020

Siedlerchr commented Apr 9, 2021

tobiasdiez commented Apr 9, 2021

koppor commented Apr 10, 2021

calixtus commented Apr 11, 2021

tobiasdiez commented Apr 11, 2021

Siedlerchr commented Apr 23, 2021 • edited Loading

claell commented May 12, 2022

koppor commented Jan 6, 2023

Siedlerchr commented Jul 17, 2020 •

edited

Loading

tobiasdiez commented Jul 18, 2020 •

edited

Loading

tobiasdiez commented Jul 18, 2020 •

edited

Loading

Siedlerchr commented Jul 18, 2020 •

edited

Loading

Siedlerchr commented Apr 23, 2021 •

edited

Loading