size_limit for cache does not take into account the size of directories #140

Closed
crash-g opened this issue Mar 25, 2020 · 2 comments

crash-g commented Mar 25, 2020

I have noticed that the volume method does not return the exact size of the cache; in some scenarios the discrepancy can exceed 100 MB.

From the code, I gather that volume = page_size * page_count + total_file_size, where page_size * page_count is an estimate of the database size and total_file_size is the size of all file-backed entries.
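As a rough illustration of that formula (a sketch under assumptions, not diskcache's actual code), the database term can be read from SQLite's `page_size` and `page_count` pragmas. The function names and the `file_sizes` argument below are hypothetical.

```python
import sqlite3


def database_bytes(connection):
    """Estimate the SQLite database size as page_size * page_count."""
    (page_size,) = connection.execute("PRAGMA page_size").fetchone()
    (page_count,) = connection.execute("PRAGMA page_count").fetchone()
    return page_size * page_count


def estimated_volume(connection, file_sizes):
    """Database estimate plus the sizes of all file-backed entries.

    `file_sizes` is a hypothetical iterable of byte counts, one per
    file-backed entry. Directory sizes are not included, which is the
    gap this issue describes.
    """
    return database_bytes(connection) + sum(file_sizes)
```

Nothing in this sum ever looks at the directories that hold the files, so their on-disk footprint cannot show up in the result.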

The problem is that this does not take into account the space occupied by directories. There are two levels of directories, each named after one byte of the hash, so there can be at most 65536 second-level directories (plus 256 first-level ones). With each directory occupying 4 KB in my case, up to roughly 256 MB of space is unaccounted for.
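The worst-case arithmetic can be checked with a few lines of Python. This assumes the layout described above (256 possible names per level) and the 4 KB per directory observed in this report; other filesystems may report different directory sizes.

```python
FANOUT = 256            # one byte per directory level
DIRECTORY_BYTES = 4096  # per-directory size observed in this report

first_level = FANOUT
second_level = FANOUT * FANOUT

# Overhead of the second-level directories alone.
leaf_overhead = second_level * DIRECTORY_BYTES
# Overhead including their first-level parents.
total_overhead = (first_level + second_level) * DIRECTORY_BYTES

print(second_level)              # 65536
print(leaf_overhead // 2**20)    # 256 (MiB)
print(total_overhead / 2**20)    # 257.0 (MiB)
```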

This is not a problem in my scenario but the behavior was confusing, so I was wondering if I missed the explanation somewhere or if it is implied by the current documentation of volume.

Version: 3.1.1
OS: Ubuntu

@grantjenks
Owner

You didn’t miss anything in the docs. I simply haven’t accounted for the directory sizes because I don’t know a good and easy way to do so.
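For readers who want to measure the gap themselves, one possible approach is to walk the cache directory and sum the size that the filesystem reports for each directory. This is only a sketch: `directory_overhead` is a hypothetical helper, not part of diskcache, and the `st_size` of a directory is filesystem-dependent, so the result is an approximation.

```python
import os


def directory_overhead(root):
    """Sum the st_size reported for every directory under root, root included."""
    total = os.lstat(root).st_size
    # os.walk does not follow symlinked directories by default.
    for parent, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            total += os.lstat(os.path.join(parent, name)).st_size
    return total
```

Walking the whole tree costs one `lstat` call per directory, which may be part of why this is not an obviously cheap addition to a size check.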

The directory structure could change with a different Disk so maybe it’s best to exclude them. I’d lean toward a simple docs update saying it excludes directory sizes.

I do think it’s worth calling out. Others have been concerned by the large number of directories. When I created diskcache, I had an abundance of disk space, so I didn’t worry about it much.

@grantjenks
Owner

Blurb added to Caveats in a8a014c, to be released in v5.

grantjenks added a commit that referenced this issue Feb 27, 2024
* Remove unused "env" setting from pytest section

* Remove nose references

* Fixes for pylint in Python 3.8

* Update 2019 date references to 2020

* Update references to Python 2 support

* Remove Python 2 shims

* Add flake8 to linters and fix issues

* Add locked method to Lock

* Add paragraph to caveats about cache volume for Issue #140

* Bump version to 5.0.0

* Remove Python 2 mapping methods keys/values/items/iter*/view*

* Update tests for Python 3

* Bump version to 5.0.1

* Bump version to 5.0.2

* Add python_requires kwarg to setup

* Bump version to 5.0.3

* Prevent cache shard attribute access when unsafe

* Support transactions in FanoutCache (probably a bad idea)

* Bump version to 5.1.0

* Use no hardcoded /tmp/diskcache/... paths in tests

* replace open mode 'w' to 'x'

* Use disk provided by the user whenever possible

* Remove transaction from Deque.__init__

When initializing a Deque, a transaction was used to extend elements from the
given iterable. The transaction is not used in Index.__init__ or in the
FanoutCache.fromcache API.

Users that want Deque.__init__ to use a transaction as before should use:

    d = Deque()
    with d.transact():
        d.extend(iterable)

The transaction is therefore explicit and consistent with other APIs.

* Use the same Disk in FanoutCache as in Index and Deque subdirs

* Remove travis and appveyor in favor of GitHub Actions

* Rewrite k/v store benchmarking to avoid IPython "magic" syntax

* Update development requirements for editable install

* I blue it

* Make imports consistent with isort

* Increase max attributes to 8

* Tell mypy to ignore django

* Remove useless `dataset` target

* Ignore type errors when setting class attributes

* Add django to deps for sphinx and pylint

* Tell doc8 to ignore docs/_build dir

* Update flake8 configs

* Flake8 fixes (mostly removing useless module imports)

* Update pylint and fix code

* Update Sphinx and re-gen conf.py

* Update copyright year

* Update readme badges and CI notes

* Pin jedi for ipython

* Skip help() examples when running doctest

* Fix configs for pytest

* Add branch coverage and decrease coverage minimum to 96

* Ignore more .coverage files

* Bump version to 5.2.0

* Install libmemcached-dev for release action

* Bump version to 5.2.1

* Add Python 3.9 support trove classifier.

* Run integration on pull requests

* Fix typo

* Fix the URL to Django documentation for cache.

* remove leftovers from Travis and AppVeyor

Both were removed in favor of GitHub actions.

* remove unused imports

* Ignore pylint's consider-using-with in Disk.fetch

* Simplify ENOENT handling around fetch() and remove()

* Add doc about IOError

* Add notes about changes to store() and remove()

* Update remove to cleanup parent dirs

* Remove logic from filename() for creating directories

* Modify store() to create the subdirs when writing the file (1 of 4)

* Refactor file writing logic to retry makedirs

* Add test for Lock.locked()

* Test re-entrancy of "rlock"

* Delete EACCESS error tests

* Test Cache.memoize() with typed kwargs

* Test JSONDisk.get by iterating cache

* Increase coverage to 97%

* Add test for cleaning up dirs

* Add TODO for testing Disk._write

* Add tests for Disk._write

* Add a pragma "no cover" statements and increase threshold to 98

* Blue fixes (mostly docstring triple quotes)

* Pylint fixes

* Disable no-self-use in Disk._write

* Add `ignore` to memoize()

* Fixes for blue

* Fixes #201 added github repo to project_urls

* Fixup formatting for project urls

* Stop using ENOVAL in args_to_key()

* Add caveat about inconsistent pickles

* Bug Fix: Use "ignore" keyword argument with Index.memoize()

* Drop old Ubuntu from integration testing

* docs: fix typo

* Disable consider-using-f-string

* Support for Python 3.10 in testing (#238)

* Add support for Python 3.10
* Update copyright to 2022
* Bump version to 5.3.0
* Add Python 3.10 to the README

* Update tests for Django 3.2

* Fix DjangoCache.delete to return True/False

* Bump Django testing to 3.2

* Remove unused imports

* Run isort

* Bump version to 5.4.0

* Put commands above deps for doc8 testenv

* Update rsync command for uploading docs

* Remove unused import

* Update Cache(...) params when allocating

* Add docs about the eviction policy to recipes

* Test on Django 4.2 LTS

* Update year to 2023

* Bump python testing to 3.11

* i blue it

* Update requirements

* Update pylint

* Drop Python 3.7 from testing

* Update tests for Django 4.2

* Bump version to v5.5.0

* Drop 3.7 from CI

* Install dev requirements for wheel package

* Bump version to 5.5.1

* Close the cache explicitly before deleting the reference

* Oops, close the cache, not the deque

* Shutup pylint

* Bump version to 5.5.2

* Bump versions of checkout and setup-python

* Add maxlen parameter to diskcache.Deque (#191)

* Add maxlen parameter to diskcache.Deque

* Bump version to 5.6.0

* Fix docs re: JSONDisk

* Support pathlib.Path as directory argument

* Bump version to 5.6.1

* Bug fix: Fix peek when value is so large that a file is used (#288)

Error caused by copy/paste from pull().

* Bump version to 5.6.2

* Update release.yml to use pypa/gh-action-pypi-publish

* Bump version to 5.6.3

* Fix a few things after merging.

Signed-off-by: Andrea Odetti <mariofutire@gmail.com>

* Fix blue check.

Signed-off-by: Andrea Odetti <mariofutire@gmail.com>

* Fix pylint errors R0904 and R0915.

Signed-off-by: Andrea Odetti <mariofutire@gmail.com>

* Add read-only tests.

Signed-off-by: Andrea Odetti <mariofutire@gmail.com>

---------

Signed-off-by: Andrea Odetti <mariofutire@gmail.com>
Co-authored-by: Grant Jenks <grant.jenks@gmail.com>
Co-authored-by: ume <bungoume@gmail.com>
Co-authored-by: Cologler <skyoflw@gmail.com>
Co-authored-by: C2D <50617709+i404788@users.noreply.github.com>
Co-authored-by: Omer Katz <omer.drow@gmail.com>
Co-authored-by: Joakim Nordling <joakim.nordling@gmail.com>
Co-authored-by: Abhilash Raj <maxking@users.noreply.github.com>
Co-authored-by: Jürgen Gmach <juergen.gmach@googlemail.com>
Co-authored-by: Abhinav Omprakash <55880260+AbhinavOmprakash@users.noreply.github.com>
Co-authored-by: artiom <artiom.lunev@gmail.com>