Update skc_obj_alloc for spl kmem caches that are backed by Linux #9474

Merged: 1 commit merged into openzfs:master on Oct 18, 2019

Conversation

sdimitro
Contributor

Currently, for certain sizes and classes of allocations we use
SPL caches that are backed by caches in the Linux slab allocator
to reduce fragmentation and increase memory utilization. As
currently implemented, however, we keep no statistics for the
allocations made from these caches.

This patch enables tracking of allocated objects in those
SPL caches, accepting the trade-off of grabbing the cache lock
on every object allocation and free to update the respective
counter.
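
To illustrate the bookkeeping, here is a minimal user-space sketch; it is not the actual SPL code. The toy_cache struct, tc_obj_alloc field, and toy_cache_alloc/toy_cache_free functions are hypothetical stand-ins for the SPL cache, skc_obj_alloc, and the cache alloc/free paths, and a pthread mutex stands in for the cache lock:
```
/*
 * Minimal user-space sketch of the bookkeeping described above.
 * All names here are hypothetical; in the SPL the counter is
 * skc_obj_alloc and the lock is the per-cache lock.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_cache {
	pthread_mutex_t	tc_lock;	/* stand-in for the cache lock */
	uint64_t	tc_obj_alloc;	/* stand-in for skc_obj_alloc */
	size_t		tc_obj_size;
};

static void *
toy_cache_alloc(struct toy_cache *tc)
{
	/* The real code allocates from the backing Linux slab cache. */
	void *obj = malloc(tc->tc_obj_size);

	if (obj != NULL) {
		/* The trade-off: take the lock on every allocation... */
		pthread_mutex_lock(&tc->tc_lock);
		tc->tc_obj_alloc++;	/* ...just to bump the counter. */
		pthread_mutex_unlock(&tc->tc_lock);
	}
	return (obj);
}

static void
toy_cache_free(struct toy_cache *tc, void *obj)
{
	/* ...and again on every free, to decrement it. */
	pthread_mutex_lock(&tc->tc_lock);
	tc->tc_obj_alloc--;
	pthread_mutex_unlock(&tc->tc_lock);
	free(obj);
}

int
main(void)
{
	struct toy_cache tc = {
		.tc_lock = PTHREAD_MUTEX_INITIALIZER,
		.tc_obj_alloc = 0,
		.tc_obj_size = 512,
	};
	void *obj = toy_cache_alloc(&tc);

	toy_cache_free(&tc, obj);
	return (0);
}
```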

Additionally, this patch makes those caches visible in the
/proc/spl/kmem/slab special file.

As a side note, enabling this counter for those caches
allows SDB to provide a more user-friendly interface than
/proc/spl/kmem/slab, one that can also cross-reference data
from slabinfo. For example, here is the SDB output for one of
those caches, showing the name of the underlying Linux cache,
the memory consumed by SPL objects allocated from that cache,
and the percentage of those objects relative to all the
objects in it:

```
> spl_kmem_caches | filter obj.skc_name == "zio_buf_512" | pp
name        entry_size active_objs active_memory            source total_memory util
----------- ---------- ----------- ------------- ----------------- ------------ ----
zio_buf_512        512        2974         1.5MB kmalloc-512[SLUB]       16.9MB    8
```
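
As a quick sanity check, the util column here appears consistent with the ratio of SPL-allocated memory to the total memory of the backing Linux cache: 2974 objects × 512 bytes ≈ 1.5MB, and 1.5MB out of 16.9MB is just under 9%, in line with the reported util of 8.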

Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>

How Has This Been Tested?

Besides running the test suite to ensure no regressions, I did some manual testing by double-checking the values in the /proc filesystem and with sdb. I'm open to feedback on how to test this further.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@sdimitro sdimitro force-pushed the zol-spl-kmem-counters branch 2 times, most recently from fbc475c to 3a7308f on October 16, 2019 20:50
@sdimitro sdimitro requested review from behlendorf and ahrens October 16, 2019 20:59
Review comment on module/os/linux/spl/spl-kmem-cache.c (outdated, resolved)
@codecov

codecov bot commented Oct 17, 2019

Codecov Report

Merging #9474 into master will decrease coverage by 0.28%.
The diff coverage is 33.33%.


```
@@            Coverage Diff             @@
##           master    #9474      +/-   ##
==========================================
- Coverage   79.28%      79%   -0.29%
==========================================
  Files         415      415
  Lines      123632   123640       +8
==========================================
- Hits        98019    97678     -341
- Misses      25613    25962     +349
```

| Flag | Coverage Δ |
| --- | --- |
| #kernel | 79.7% <33.33%> (-0.09%) ⬇️ |
| #user | 66.49% <ø> (-0.82%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4313a5b...9a170fc.

@sdimitro sdimitro force-pushed the zol-spl-kmem-counters branch from 3a7308f to dd9049a on October 17, 2019 16:42
@sdimitro sdimitro force-pushed the zol-spl-kmem-counters branch from dd9049a to 9a170fc on October 17, 2019 17:01
@behlendorf behlendorf added the Status: Accepted (Ready to integrate; reviewed, tested) label and removed the Status: Code Review Needed (Ready for review and testing) label Oct 18, 2019
@behlendorf behlendorf merged commit 851eda3 into openzfs:master Oct 18, 2019
sdimitro added a commit to delphix/sdb that referenced this pull request Oct 21, 2019
= Motivation

`/proc/spl/kmem/slab` is a useful tool for debugging
memory-related issues in ZoL. Unfortunately, it is not
applicable to crash dumps and not very user-friendly overall
(e.g. you get almost no information for SPL caches that are
backed by Linux Slab caches).

Beyond that, the specific use-case within Delphix is a
replacement for the `::kmastat` command we had in MDB in the
illumos version of the product. The `spl_kmem_caches` command
introduced in this commit complements the `slabs` command and
covers the remaining `::kmastat` functionality that we were
missing.

= Patch

This commit introduces the `spl` directory in our code base
and implements `spl_kmem_caches`. By default the command
iterates over all the SPL caches and prints the following
statistics in human-readable form, sorted by active_memory:
```
name                     entry_size active_objs active_memory                         source total_memory util
------------------------ ---------- ----------- ------------- ------------------------------ ------------ ----
zio_data_buf_131072          135680         760        98.3MB      zio_data_buf_131072[SPL ]      119.0MB   82
dnode_t                        1168       25533        28.4MB                  dnode_t[SLUB]       28.6MB   99
zfs_znode_cache                1104       23039        24.3MB          zfs_znode_cache[SLUB]       24.4MB   99
...
```

All the available properties are listed in the help message,
and the user may select the properties to display with the
`-o` option (and sort by a specific field using `-s`):
```
> spl_kmem_caches -h
usage: spl_kmem_caches [-h] [-H] [-o FIELDS] [-p] [-r] [-s FIELD] [-v]

optional arguments:
  -h, --help       show this help message and exit
  -H               do not display headers and separate fields by a single tab
                   (scripted mode)
  -o FIELDS        comma-separated list of fields to display
  -p               display numbers in parseable (exact) values
  -r, --recursive  recurse down children caches and print statistics
  -s FIELD         sort rows by FIELD
  -v               Print all statistics

FIELDS := address, name, flags, object_size, entry_size, slab_size,
objects_per_slab, entries_per_slab, slabs, active_slabs, active_memory,
total_memory, objs, active_objs, inactive_objs, source, util

If -o is not specified the default fields used are name, entry_size,
active_objs, active_memory, source, total_memory, util.

If the -s option is not specified and the command's output is not piped anywhere
then we sort by the following fields in order: active_memory, address, name. If
none of those exists in the field-set we sort by the first field specified in
the set.

> spl_kmem_caches -o "name,flags,slabs" -s slabs
name                                       flags slabs
------------------------ ----------------------- -----
zio_data_buf_131072         KMC_NODEBUG|KMC_VMEM   115
zio_buf_131072              KMC_NODEBUG|KMC_VMEM    25
zio_data_buf_40960          KMC_NODEBUG|KMC_VMEM    19
...
```

Besides being a PrettyPrinter, the command is also a Locator,
which means that the user can do things like this:
```
> spl_kmem_caches -s total_memory | head 1 | pp
name                entry_size active_objs active_memory                    source total_memory util
------------------- ---------- ----------- ------------- ------------------------- ------------ ----
zio_data_buf_131072     135680         760        98.3MB zio_data_buf_131072[SPL ]      119.0MB   82
```

Two more options, mimicking illumos conventions, are `-p`
and `-H`, which output numbers in raw form and skip printing
the headers, respectively. This is generally useful for
scripts or output that will be processed later:
```
> spl_kmem_caches -H -p
zio_data_buf_131072	135680	760	103116800	zio_data_buf_131072[SPL ]	124825600	82
dnode_t	1168	25573	29869264	dnode_t[SLUB]	30005920	99
zfs_znode_cache	1104	23039	25435056	zfs_znode_cache[SLUB]	25548768	99
```

Note that for this command to fully work for SPL caches
that are backed by Linux Slab caches, we need the ZoL
change that enables us to track allocations in those
cases (see PR: openzfs/zfs#9474).
If the commit from that PR is not in the running system,
all of these caches will incorrectly be reported as empty.

Side-change:
As the `spl_kmem_caches` command is very close, in both
implementation and functionality, to the `slabs` command,
this patch also updates some of the columns and calculations
in the output of `slabs` to make the two commands consistent
with each other.

= Testing/Verification

I verified the values of all the fields in the following
ways:
[1] For the fields that are also visible in /proc/slabinfo,
    I ensured that the values matched.
[2] For everything else I did the math and cross-referenced
    the values, ensuring that they are consistent with each
    other (a worked example follows below).
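
For instance, the dnode_t row in the example output above is internally consistent: 25533 active objects × 1168 bytes per entry ≈ 28.4MB of active memory, and 28.4MB out of the 28.6MB of total memory is roughly 99%, matching the util column.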

Besides the manual testing shown in the output above, I also
exercised the following error cases:

* ask for invalid field name
```
> spl_kmem_caches -o bogus
sdb: spl_kmem_caches: 'bogus' is not a valid field
```

* sort by invalid field name
```
sdb: spl_kmem_caches: invalid input: 'bogus' is not in field set (name, entry_size, active_objs, active_memory,
source, total_memory, util)
```

* attempt to sort by a field that is not in the requested
field set
```
> spl_kmem_caches -o "name,address" -s objs
sdb: spl_kmem_caches: invalid input: 'objs' is not in field set (name, address)
```

= Other Notes

The command only works with SLUB, and until we have proper
lexing we still pass fields quoted like this:
`> spl_kmem_caches -o "name,address"`
sdimitro added a commit to delphix/sdb that referenced this pull request Oct 25, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Dec 26, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Dec 27, 2019
tonyhutter pushed a commit that referenced this pull request Jan 23, 2020
sdimitro added a commit to sdimitro/zfs that referenced this pull request Mar 13, 2020
Labels
Status: Accepted (Ready to integrate; reviewed, tested)
3 participants