general: add benchmarks for sqlite/file backends; update readme
karlicoss committed Sep 17, 2023
1 parent 339c19f commit 9e89fd1
Showing 2 changed files with 32 additions and 13 deletions.
27 changes: 14 additions & 13 deletions README.md
@@ -128,7 +128,7 @@ Cachew gives the best of two worlds and makes it both **easy and efficient**. Th
- first your objects get [converted](src/cachew/marshall/cachew.py#L34) into a simpler JSON-like representation
- after that, they are mapped into byte blobs via [`orjson`](https://github.com/ijl/orjson).

When the function is called, cachew [computes the hash of your function's arguments ](src/cachew/__init__.py:#L504)
When the function is called, cachew [computes the hash of your function's arguments ](src/cachew/__init__.py:#L466)
and compares it against the previously stored hash value.

- If they match, it would deserialize and yield whatever is stored in the cache database
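
The hash check described above can be sketched with the standard library alone. This is not cachew's implementation: `HashCheckedCache`, its in-memory list (standing in for the cache database), and the `calls` counter are hypothetical names for illustration, and the mismatch branch (recompute, then store the new data with the new hash) is an assumption about the part of the list this hunk does not show.

```python
import hashlib
from typing import Any, Callable, Iterator, List, Optional


class HashCheckedCache:
    """Hypothetical stand-in: one stored hash plus one list of cached items."""

    def __init__(self, func: Callable[..., Iterator[Any]]) -> None:
        self.func = func
        self.stored_hash: Optional[str] = None
        self.stored_items: List[Any] = []
        self.calls = 0  # how many times the wrapped function actually ran

    @staticmethod
    def _hash(args: tuple, kwargs: dict) -> str:
        # Hash the string representation of the arguments.
        payload = repr((args, sorted(kwargs.items())))
        return hashlib.sha256(payload.encode("utf8")).hexdigest()

    def __call__(self, *args: Any, **kwargs: Any) -> Iterator[Any]:
        new_hash = self._hash(args, kwargs)
        if new_hash == self.stored_hash:
            # Hashes match: yield whatever is stored.
            yield from self.stored_items
            return
        # Hashes differ: run the original function and remember its output.
        self.calls += 1
        fresh: List[Any] = []
        for item in self.func(*args, **kwargs):
            fresh.append(item)
            yield item
        self.stored_hash = new_hash
        self.stored_items = fresh


def squares(n: int) -> Iterator[int]:
    for i in range(n):
        yield i * i


cached_squares = HashCheckedCache(squares)
first = list(cached_squares(3))   # computed
second = list(cached_squares(3))  # served from the stored items
third = list(cached_squares(4))   # arguments changed, so recomputed
```

The new hash is only stored after the generator is fully consumed, so a partially iterated call in this sketch never leaves a truncated cache behind.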
@@ -140,18 +140,18 @@ and compares it against the previously stored hash value.



* automatic schema inference: [1](src/cachew/tests/test_cachew.py#L350), [2](src/cachew/tests/test_cachew.py#L364)
* automatic schema inference: [1](src/cachew/tests/test_cachew.py#L371), [2](src/cachew/tests/test_cachew.py#L385)
* supported types:

* primitive: `str`, `int`, `float`, `bool`, `datetime`, `date`, `Exception`

See [tests.test_types](src/cachew/tests/test_cachew.py#L676), [tests.test_primitive](src/cachew/tests/test_cachew.py#L710), [tests.test_dates](src/cachew/tests/test_cachew.py#L630), [tests.test_exceptions](src/cachew/tests/test_cachew.py#L1037)
* [@dataclass and NamedTuple](src/cachew/tests/test_cachew.py#L592)
* [Optional](src/cachew/tests/test_cachew.py#L494) types
* [Union](src/cachew/tests/test_cachew.py#L788) types
* [nested datatypes](src/cachew/tests/test_cachew.py#L410)
See [tests.test_types](src/cachew/tests/test_cachew.py#L697), [tests.test_primitive](src/cachew/tests/test_cachew.py#L731), [tests.test_dates](src/cachew/tests/test_cachew.py#L651), [tests.test_exceptions](src/cachew/tests/test_cachew.py#L1073)
* [@dataclass and NamedTuple](src/cachew/tests/test_cachew.py#L613)
* [Optional](src/cachew/tests/test_cachew.py#L515) types
* [Union](src/cachew/tests/test_cachew.py#L809) types
* [nested datatypes](src/cachew/tests/test_cachew.py#L431)

* detects [datatype schema changes](src/cachew/tests/test_cachew.py#L440) and discards old data automatically
* detects [datatype schema changes](src/cachew/tests/test_cachew.py#L461) and discards old data automatically


# Performance
@@ -165,20 +165,20 @@ You can find some of my performance tests in [benchmarks/](benchmarks) dir, and


# Using
See [docstring](src/cachew/__init__.py#L329) for up-to-date documentation on parameters and return types.
See [docstring](src/cachew/__init__.py#L281) for up-to-date documentation on parameters and return types.
You can also use [extensive unit tests](src/cachew/tests/test_cachew.py) as a reference.

Some useful (but optional) arguments of `@cachew` decorator:

* `cache_path` can be a directory, or a callable that [returns a path](src/cachew/tests/test_cachew.py#L387) and depends on function's arguments.
* `cache_path` can be a directory, or a callable that [returns a path](src/cachew/tests/test_cachew.py#L408) and depends on function's arguments.

By default, `settings.DEFAULT_CACHEW_DIR` is used.

* `depends_on` is a function which determines whether your inputs have changed, and the cache needs to be invalidated.

By default it just uses string representation of the arguments, you can also specify a custom callable.

For instance, it can be used to [discard cache](src/cachew/tests/test_cachew.py#L89) if the input file was modified.
For instance, it can be used to [discard cache](src/cachew/tests/test_cachew.py#L103) if the input file was modified.

* `cls` is the type that would be serialized.
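
As a rough illustration of the `depends_on` bullet above, here is a callable that changes its result whenever the input file is modified. `file_fingerprint` is a hypothetical helper, not part of cachew; the decorator usage in the comment is an assumption based on the parameter description above.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def file_fingerprint(path: Path) -> Tuple[str, int, int]:
    """Return a value that changes when the file at `path` is modified."""
    st = path.stat()
    return (str(path), st.st_mtime_ns, st.st_size)


# Assumed usage (requires cachew to be installed, so it is left as a comment):
#
#   @cachew(depends_on=file_fingerprint)
#   def parse(path: Path) -> Iterator[str]: ...


def demo() -> Tuple[bool, bool]:
    """Show that the fingerprint is stable until the file content changes."""
    with tempfile.TemporaryDirectory() as td:
        path = Path(td) / "input.txt"
        path.write_text("a")
        before = file_fingerprint(path)
        unchanged = file_fingerprint(path) == before
        path.write_text("abc")  # different size, so the fingerprint differs
        changed = file_fingerprint(path) != before
    return unchanged, changed
```

Including the file size alongside the modification time guards against filesystems with coarse timestamp resolution.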

@@ -251,6 +251,7 @@ def mcachew(*args, **kwargs):
        import cachew
    except ModuleNotFoundError:
        import warnings

        warnings.warn('cachew library not found. You might want to install it to speed things up. See https://github.com/karlicoss/cachew')
        return lambda orig_func: orig_func
    else:
@@ -264,9 +265,9 @@ Now you can use `@mcachew` in place of `@cachew`, and be certain things don't br
## Settings


[cachew.settings](src/cachew/__init__.py#L68) exposes some parameters that allow you to control `cachew` behaviour:
[cachew.settings](src/cachew/__init__.py#L66) exposes some parameters that allow you to control `cachew` behaviour:
- `ENABLE`: set to `False` if you want to disable caching without removing the decorators (useful for testing and debugging).
You can also use [cachew.extra.disabled_cachew](src/cachew/extra.py#L18) context manager to do it temporarily.
You can also use [cachew.extra.disabled_cachew](src/cachew/extra.py#L21) context manager to do it temporarily.
- `DEFAULT_CACHEW_DIR`: override to set a different base directory. The default is the "user cache directory" (see [appdirs docs](https://github.com/ActiveState/appdirs#some-example-output)).
- `THROW_ON_ERROR`: by default, cachew is defensive and simply attempts to call the original function on caching issues.
Set to `True` to catch errors earlier.
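
The defensive behaviour that `THROW_ON_ERROR` toggles can be sketched roughly as follows. This is a simplified illustration, not cachew's code: the local `settings` class only mirrors the setting's name, and `with_fallback`, `broken_cache`, and `double` are hypothetical. Real caching errors may also surface while iterating over results, which this sketch ignores.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


class settings:
    # Hypothetical local flag mirroring the name of the cachew setting.
    THROW_ON_ERROR = False


def with_fallback(cached_call: Callable[..., Any], func: Callable[..., Any]) -> Callable[..., Any]:
    """Try the cached path first; on failure fall back to the original function."""

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return cached_call(*args, **kwargs)
        except Exception:
            if settings.THROW_ON_ERROR:
                raise  # surface caching problems early
            logger.warning("caching failed, calling the original function", exc_info=True)
            return func(*args, **kwargs)

    return wrapper


def broken_cache(*args: Any, **kwargs: Any) -> Any:
    # Simulates a caching layer that cannot reach its storage.
    raise OSError("cache database is unavailable")


def double(x: int) -> int:
    return 2 * x


safe_double = with_fallback(broken_cache, double)
```

With the flag left at `False`, `safe_double(21)` logs a warning and returns the result of `double`; with it set to `True`, the `OSError` propagates to the caller.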
18 changes: 18 additions & 0 deletions benchmarks/20230917.org
@@ -0,0 +1,18 @@
Running on @karlicoss desktop PC, =python3.10=

Just a comparison of =sqlite= and =file= backends.

#+begin_example
$ pytest --pyargs -k 'test_many and gc_off and 3000000' -s
src/cachew/tests/test_cachew.py::test_many[sqlite-gc_off-3000000] [INFO 2023-09-17 02:02:09,946 cachew __init__.py:657 ] cachew.tests.test_cachew:test_many.<locals>.iter_data: wrote 3000000 objects to cachew (sqlite:/tmp/pytest-of-karlicos/pytest-129/test_many_sqlite_gc_off_3000000/test_many)
test_many: initial write to cache took 13.6s
test_many: cache size is 229.220352Mb
[INFO 2023-09-17 02:02:10,780 cachew __init__.py:662 ] cachew.tests.test_cachew:test_many.<locals>.iter_data: loading 3000000 objects from cachew (sqlite:/tmp/pytest-of-karlicos/pytest-129/test_many_sqlite_gc_off_3000000/test_many)
test_many: reading from cache took 7.0s
PASSED
src/cachew/tests/test_cachew.py::test_many[file-gc_off-3000000] [INFO 2023-09-17 02:02:23,944 cachew __init__.py:657 ] cachew.tests.test_cachew:test_many.<locals>.iter_data: wrote 3000000 objects to cachew (file:/tmp/pytest-of-karlicos/pytest-129/test_many_file_gc_off_3000000_0/test_many)
test_many: initial write to cache took 6.1s
test_many: cache size is 202.555667Mb
[INFO 2023-09-17 02:02:23,945 cachew __init__.py:662 ] cachew.tests.test_cachew:test_many.<locals>.iter_data: loading objects from cachew (file:/tmp/pytest-of-karlicos/pytest-129/test_many_file_gc_off_3000000_0/test_many)
test_many: reading from cache took 5.4s
#+end_example
