Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent setItem and getItem can lead into an unahdled does not look like a valid storage file #108

Closed
tndev opened this issue Oct 1, 2018 · 23 comments

Comments

@tndev
Copy link

tndev commented Oct 1, 2018

Concureent setItem and getItem can lead into an unahdled does not look like a valid storage file:

parse error:  {} for:
     Error: [node-persist][readFile] .node-persist/storage/37b51d194a7513e45b56f6524f2d51f2 does not look like a validstorage file!
      at fs.readFile (node_modules/node-persist/src/local-storage.js:278:66)
      at FSReqWrap.readFileAfterClose [as oncomplete] (fs.js:525:3)

The code used to reproduce create that error was:

const localdb = require('node-persist')
const storage = localdb.create({ ttl: true, logging: true })

async function test2 () {
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
}

async function test3 () {
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
}

async function main () {
  await storage.init()

  try {
    await Promise.all([
      test2(),
      test3(),
      test2(),
      test3(),
      test2()
    ])
  } catch (err) {
    console.error(err)
  }
}

main().catch(err => {
  console.error(err)
})

The problem cannot be realiable be reproduces.

@tndev tndev changed the title Concureent setItem and getItem can lead into an unahdled does not look like a valid storage file Concurrent setItem and getItem can lead into an unahdled does not look like a valid storage file Oct 1, 2018
@akhoury
Copy link
Collaborator

akhoury commented Oct 2, 2018

we could use file locking i guess, maybe try https://github.com/moxystudio/node-proper-lockfile

@tndev
Copy link
Author

tndev commented Oct 2, 2018

Yes that might be a solution. I honestly don't use your library so I don't have much knowledge about how it works, and if and what kind of bottelneck that would introduce.

I just stumbled across a question on stackoverflow regarding your library, and figured out that problem while I tried to give an solution to the question.

@akhoury
Copy link
Collaborator

akhoury commented Oct 2, 2018

can you link the question please?

@biswanaths
Copy link

biswanaths commented Oct 4, 2018

I think the better approach would be to serialise read and writes based on the queue. One approach as mentioned in my SO answer(see the gist) above is to chain promises. I think we can enhance it to simply chain promises per key rather than whole universe of read and write through the library.

Please see the basic idea here : https://gist.github.com/biswanaths/b8999c9e4e5a9e979dcde061074528f6

I believe whatever solution to this problem should live in library. The end user does not need to know and work around whether the library is storing kv per file or using a single file.

@akhoury
Copy link
Collaborator

akhoury commented Oct 6, 2018

well, what if it's a different process writing to the same file?

@biswanaths
Copy link

@akhoury , Oh, yes. This obviously does will not work of multiprocess I don't know where node-persist, use case falls, like single-process, multi-process or may be getting used as part of a distributed application.

Also going through the old case, I think Issue 31 more or less deals with the same issue.

@DragonEmbers
Copy link

I've also hit this problem, but in my case, I'm not performing asynchronous get/sets but a series of ~10 sets asynchronously using Promise.all(array.map([function that creates promises])). The library hits on the same file name a number of times and ends up writing n values to the same file, thus ending up with a corrupt file.

Given that asynchronicity is commonplace in nodejs, it would be a lot better if this was handled by the library using file locks.

@Weedoze
Copy link

Weedoze commented Jun 24, 2020

This problem is still happening. What is the status of the development about this topic ?

@robbie-cahill
Copy link

robbie-cahill commented Jul 16, 2020

When storage.init (storage being node-persist) encounters a corrupt file it throws this error.

I tried using try/catch and .catch() callbacks to handle this error gracefully, but neither work because the error is not bubbling up to where I can handle it.

For those looking for a drop-in replacement for this library, I've switched to https://www.npmjs.com/package/node-localstorage and it seems to be working fine. The api is basically similar with getItem() and setItem()

@ryanleesmith
Copy link

ryanleesmith commented Oct 11, 2020

node-localstorage doesn't appear to be async, or offer an async api.

Regarding this plugin's error - I have a .catch around every line of code related to it (init, getItem, setItem, removeItem) and it's still throwing the uncaught exception. Even more confusing, it happens at a periodic interval - which would be explained by the fact that my code periodically polls and uses the plugin for reading/writing, but I actually removed all references in the code to the corrupted file in question. It's unfortunate as this is otherwise a fine plugin.

@ryanleesmith
Copy link

Ok, figured it out - the interval for expiring keys does a scan, which is where the uncaught exception is originating from, and why you can't catch it. Passing in expiredInterval: 0 as an option in the init resolves the issue for me. If I understand correctly, this means keys will not be auto-expired, but getItem will still honor the TTL. Depending on your usage, this may or may not be an issue.

@SamNChiet
Copy link

Wait, wouldn't this kind of invalidate the library for use as a server-side application that's handling requests?
Would it work to wrap getItem and setItem with something that tracks files being modified, or should I just jump to something else?

@SamNChiet
Copy link

SamNChiet commented Mar 27, 2021

I think the better approach would be to serialise read and writes based on the queue. One approach as mentioned in my SO answer(see the gist) above is to chain promises. I think we can enhance it to simply chain promises per key rather than whole universe of read and write through the library.

Please see the basic idea here : https://gist.github.com/biswanaths/b8999c9e4e5a9e979dcde061074528f6

I believe whatever solution to this problem should live in library. The end user does not need to know and work around whether the library is storing kv per file or using a single file.

this worked a treat - it seems like the library author has robustness concerns, but honestly? I'm not using this because it's the most robust database solution out there, accounting for external multiprocess writes and all that - I'm using it because it's a dang nice interface for serializing bits of data for fun projects.

I still think locking files on the OS level would be the best solution, but even this simple queue basically solves my problem. Thanks!

@Andersama
Copy link

Andersama commented Aug 25, 2021

Not that I'm contributing much, but I've also run into this. It's an extremely bizarre error because this error can happen whether or not you're already viewing the file. EG: if you pick up the habit of looking at the files as they're being modified all goes well until you write code that apparently creates this issue, then suddenly the file is no longer being written to, and you're left with the strange error which seems to imply JSON.parse() or something like it failed. My accidental solution was to debounce requests and make sure they execute on one second boundaries. It's effectively an extremely dumb queue, because it can drop data (but in my case the data's not critical anyway).

@wintercounter
Copy link

I've made everything work, tests were running fine, time to prod: same errors. Pretty unexpected that it won't be able to handle multiple requests. I quickly re-implemented the whole feature in Redis, but it caused a 4hr downtime to our service.

@ayy1337
Copy link

ayy1337 commented Jan 23, 2023

Bro how is this still a bug

@akhoury
Copy link
Collaborator

akhoury commented Feb 23, 2023

hey, sorry for the serious delay on this, life happened.

node-persist@3.1.1 should fix this problem within the same process, and I think that's only fair, but handling a cluster or multi "machines" writing on the same disk is not really this library's job. If you find yourself using a multi machines services then this is not the library for you. In order for me to solve this across a multi process/machines environment, I would need some external running background or network service, i.e. redis or the likes of it.

Copying OP's test case, note the writeQueue option comment.

const localdb = require('node-persist');
const storage = localdb.create({ 
    ttl: true, 
    writeQueue: true, // turn this to false to disable write queue and you will see the initial error "does not look like a valid storage file"
    logging: false 
})

async function test2 () {
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
}

async function test3 () {
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
}

async function main () {
  await storage.init()

  try {
    await Promise.all([
      test2(),
      test3(),
      test2(),
      test3(),
      test2()
    ])
  } catch (err) {
    console.error(err)
  }
}

main().catch(err => {
  console.error(err)
})

@mdodge-ecgrow
Copy link

Thanks for your work Aziz! I was wondering if this latest update would fix this issue that we have been getting intermittently: Error: EMFILE: too many open files. Or should I open a seperate issue?
Thanks!

@audiokan
Copy link

@akhoury Hello Aziz! I just wanted to let you know that I found this project and issue, as your last commit fd0d7cf for version 3.1.1 yesterday seems to have a syntax error, so this broke our project when I ran it today:

/var/lib/jenkins/workspace/sonatype-oss-index/node_modules/node-persist/src/local-storage.js:367
			if (!this.q[file]?.length) {
			                  ^

SyntaxError: Unexpected token '.'

@wintercounter
Copy link

That's not a syntax error, it's called optional chaining and you need at least Node 14 to make it work.

@akhoury
Copy link
Collaborator

akhoury commented Feb 25, 2023

@wintercounter is correct, but to be fair package.json node engine says >=10.12.0 so I just removed that optional chaining in 3.1.3

@audiokan
Copy link

Thanks, this indeed fixed it! As I am also not too familiar with node and JavaScript itself (just maintaining the Jenkins server) I did not know this, thanks for the info! And yes, as I did not receive any version dependency conflicts it did not occur that this might be the problem instead of it being a syntax error.

So thank you all for your work, I appreciate it, and have a nice weekend!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests