
memory leak #44

Closed

kylegetson opened this issue May 17, 2013 · 8 comments

Comments

@kylegetson

This may be related to the earlier levelup issue (Level/levelup#140), but I see high memory usage when calling db.get() many times with a different key each time. In this example I use a unique key on every call, each of which exists in the db; after 5,000,000 calls, memory usage goes above 500M within seconds.

https://gist.github.com/kylegetson/5599592

I'm working on a web service that will wrap levelup, and I expect many millions of gets/puts.

It seems like in some cases the memory is cleaned up; it might be related to calling put.
https://gist.github.com/kylegetson/5599722
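
For reference, the get loop is roughly this shape (a simplified sketch, not the exact gist; it assumes the keys '0'..'4999999' were already written to ./testdb):

```js
// Simplified repro sketch: sequential gets, each with a unique key.
var levelup = require('levelup');

var db = levelup('./testdb');
var total = 5000000;
var i = 0;

(function next() {
  if (i >= total) return console.log('done');
  db.get(String(i++), function (err, value) {
    if (err) throw err;
    if (i % 100000 === 0)
      console.log(i, Math.round(process.memoryUsage().rss / 1048576) + 'M');
    next(); // every call uses a key we haven't used before
  });
})();
```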

@rvagg
Member

rvagg commented May 18, 2013

Will be on to this ASAP; I take memory leaks very personally, FWIW!

@rvagg
Member

rvagg commented May 18, 2013

@kylegetson can you test leveldown@0.4.3 for me please? It has some minor changes to the previous memory-leak fix; some poor C++ in that fix probably accounts for the problem persisting. Let me know how it goes and I'll backport to 0.2.x.

@kylegetson
Author

I still seem to have the same issue, even with the new changes.

Looking at it further, memory usage seems to depend on the size of the leveldb database. With a small database of around 500K, memory usage peaks at about 60M. As the database grows, so does the memory usage: I went up to about a 500M database and saw memory usage reach 510M. I'm still using the same tests from my gists, with a variety of databases and key types (md5 keys, short string keys, numeric keys); all the values are short strings and numbers, no JSON.

It also doesn't seem to matter if db.get() finds an item in the database. Repeatedly calling it on keys that don't exist produces the same problem.
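
The variant with missing keys looks roughly like this (a sketch, not the exact test; levelup signals a missing key with an error that has err.notFound set):

```js
// Same loop shape as above, but every key is absent from the store.
var levelup = require('levelup');

var db = levelup('./testdb');
var i = 0;

(function next() {
  if (i >= 5000000) return console.log('done');
  db.get('no-such-key-' + i++, function (err, value) {
    if (err && !err.notFound) throw err; // notFound is expected here
    next(); // rss climbs the same way as with keys that do exist
  });
})();
```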

It does seem much better than before your latest two patches; previously there didn't seem to be any cap on memory usage, and I could easily hit 1-2G with these same tests.

I really appreciate you looking at this. I wish I could help, but my C++ skills are nearly nonexistent.

@rvagg
Member

rvagg commented May 18, 2013

I've done a bunch more work, nothing that should be significant enough to change memory use behaviour though. However, I now have a leak tester app that's similar to yours and it's let me find out a few interesting things about what LevelDB's doing.

If you check out the leveldown repo, build it, and run node test/leak-tester.js, you can watch the memory usage change over time along with some other interesting stats.

What I've noticed is that a lot of LevelDB's memory usage is in direct proportion to the 'maxOpenFiles' property. By default this is 1000, but you'll see in my leak-tester.js that I have it set to 10. At that number, memory usage flattens out for me at around 50-60MB, but if you increase it then memory usage goes up significantly. For example, if you set it to 250 then it'll go up beyond 200MB, but then when it starts using level-3 files (somewhere over 500MB on disk) it can flatten back down to nearly 80MB, which is kind of odd. The default cache size is 8MB, so just leave that alone for this exercise; it shouldn't matter too much. Have a play and look at the different memory usage profiles over time.
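
If you want to pin that down in your own test, opening the store directly through leveldown looks roughly like this (a sketch against the 0.4.x callback API; maxOpenFiles and cacheSize are the actual option names):

```js
// Constrain LevelDB's table cache the way leak-tester.js does.
var leveldown = require('leveldown');

var db = leveldown('./leak-db');
db.open({
  maxOpenFiles: 10,           // default is 1000; the table cache tracks this
  cacheSize: 8 * 1024 * 1024  // block cache; leave at the 8MB default here
}, function (err) {
  if (err) throw err;
  // ... run the get/put loop and sample process.memoryUsage().rss
});
```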

Current LevelDOWN version is 0.4.4 and it's on npm. But the current LevelUP still uses the 0.2.x branch, so don't bother testing there; if you want to use leveldown@0.4.4, check out the 0.9-wip branch of LevelUP and use it there.
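
Wiring them together should look something like this (a sketch, assuming the 0.9-wip branch's `db` option for injecting the backend):

```js
var levelup = require('levelup');     // 0.9-wip branch
var leveldown = require('leveldown'); // 0.4.4

var db = levelup('./testdb', { db: leveldown, maxOpenFiles: 10 });
```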

Let me know if you're still seeing bad behaviour and see if you can modify leak-tester.js to demonstrate it.

@rvagg
Member

rvagg commented May 18, 2013

So I'm running it now with the default 'maxOpenFiles' (1000) and the memory behaviour is strange but stabilises. It goes up to nearly 1G, then slowly drops, and has stabilised for me between 200MB and 300MB, even dipping below 200MB occasionally. Numbers like this for a ~3GB store:

getCount = 1489000 , putCount =  1382780 , rss = 2728% 249M ["9","26","222","1200","0","0","0"]

(i.e. 1,382,780 records, 249MB RAM used, and 1200 level-3 files, 222 level-2 files, etc.) The memory usage pattern hasn't changed much for me since the store got to about 1.5GB, but I'll keep running this and see how it goes.

@rvagg
Member

rvagg commented May 18, 2013

getCount = 13971000 , putCount =  7526165 , rss = 1840% 168M ["0","6","69","753","5087","2073","0"]

That's at 15G and ~8000 files; the memory footprint keeps getting more compact as the store grows.

@kylegetson
Author

Nice! I really like the leak tester! I get similar results; it seems to stabilize at around 220MB for me with your setting of 10 open files, and memory usage increases as I raise that. I'll run some of my real data through tomorrow and see how it goes. Thanks!

@rvagg
Member

rvagg commented May 20, 2013

Make sure you run it for a long time; the number will decrease, and as you accumulate more data, 10 open files is very suboptimal. A 20GB store is stable at 130MB RAM with the default 1000 open files for me, so don't sweat the short-term numbers if your machine can handle them.
Also, in production, don't be afraid to raise the cache size by a large amount; it'll speed up reads a lot.
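
Something like this, say (a sketch only; 64MB is an arbitrary example, the default is 8MB):

```js
var levelup = require('levelup');
var leveldown = require('leveldown');

var db = levelup('./prod-db', {
  db: leveldown,
  cacheSize: 64 * 1024 * 1024 // a bigger block cache speeds up repeated reads
});
```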

rvagg closed this as completed May 20, 2013