
NUMA-aware cache modes #400

Open
krizhanovsky opened this issue Jan 24, 2016 · 2 comments
Labels: cache, enhancement, performance, TDB (Tempesta DB module and related issues)
Milestone: 1.x - TBD
Comments

@krizhanovsky
Contributor

krizhanovsky commented Jan 24, 2016

Introduction

Currently the cache supports either sharded or replicated mode. However, it makes no sense to replicate very large entries (e.g. multi-gigabyte ones like DVD images) to each NUMA node; instead, such entries should be stored on one particular node (note that such entries are also sent in many TDB lookups, in the sense of #391). In particular, the web cache is a bad candidate for replication, since it has an eviction process and a large data set. On the other hand, small data written once at configuration time (#910 and #1350) is a good candidate for replication.

Also see Challenges of Memory Management on Modern NUMA Systems: Optimizing NUMA systems applications with Carrefour for modern NUMA problems and optimization methods, page replication in particular.

Modes of operation

There must be three modes, fully resolved on the TDB side (see the sketch after this list):

  1. UNIFORM - just pretend that we work on uniform memory (the default NUMA interleaving mode).
  2. REPLICATED - inserts and deletes happen on all nodes, lookups on the local node only. At the moment we only see configuration-time databases here, so there is no need for lock-free operation.
  3. SHARD - a forest of HTries to cope with the 128GB database limit (we must be able to keep several shards on each node). It seems we also need to adjust the allocators, including the early kernel huge-pages one (also relates to #1515, "Huge pages allocation issue and the crash on cache sizes >=2GB"), to make them place pages on specific nodes. Probably this way we can also cope with the first problem of #1515, the single too-large contiguous memory allocation.
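
As a rough illustration of how these three modes could be dispatched on the TDB side, here is a minimal sketch; the enum, struct and function names below are assumptions for this example, not the actual Tempesta DB API:

```c
/*
 * Illustrative sketch only: tdb_numa_mode_t, struct tdb_sketch,
 * tdb_htrie_lookup() and the shards[] array are assumptions for this
 * example, not the actual Tempesta DB interface.
 */
#include <linux/topology.h>	/* numa_node_id() */
#include <linux/numa.h>		/* MAX_NUMNODES */

typedef enum {
	TDB_NUMA_UNIFORM,	/* pretend memory is uniform (interleaving) */
	TDB_NUMA_REPLICATED,	/* full replica on every node, local lookups */
	TDB_NUMA_SHARD,		/* forest of HTries partitioned by key */
} tdb_numa_mode_t;

struct tdb_sketch {
	tdb_numa_mode_t	numa_mode;
	unsigned int	nr_shards;
	void		*shards[MAX_NUMNODES];	/* per-node HTrie roots */
};

/* Assumed helper: look up @key in a single HTrie. */
void *tdb_htrie_lookup(void *htrie_root, unsigned long key);

static void *
tdb_lookup_numa(struct tdb_sketch *db, unsigned long key)
{
	switch (db->numa_mode) {
	case TDB_NUMA_UNIFORM:
		/* One shared HTrie over interleaved memory. */
		return tdb_htrie_lookup(db->shards[0], key);
	case TDB_NUMA_REPLICATED:
		/* Look up the local replica only; inserts and deletes
		 * (not shown) must go to every node. */
		return tdb_htrie_lookup(db->shards[numa_node_id()], key);
	case TDB_NUMA_SHARD:
		/* The shard owning @key may live on a remote node. */
		return tdb_htrie_lookup(db->shards[key % db->nr_shards], key);
	}
	return NULL;
}
```

The key property is that only the SHARD case may have to reach a remote node on lookup, while REPLICATED trades memory and write amplification for purely local reads.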

Testing

(This is from tempesta-tech/tempesta-test#61; reopen that issue if necessary.)

When a web resource is saved into the cache, it can be saved either on the current NUMA node only or on all NUMA nodes. All cache-related tests must validate the correctness of the cache behaviour:

  • In sharding mode, a resource mustn't be copied across all the NUMA nodes;
  • In replicated mode, a resource must be copied to all NUMA nodes.

Currently CI runs the tests on a single NUMA node, so CI must also be updated to use different numbers of NUMA nodes. There are probably more tests that must be run on different NUMA topologies.

krizhanovsky added this to the 0.6 OS milestone Jan 24, 2016
krizhanovsky self-assigned this May 23, 2016
krizhanovsky removed their assignment Oct 15, 2018
krizhanovsky added the TDB (Tempesta DB module and related issues) label Apr 27, 2020
krizhanovsky changed the title Hybrid cache mode → NUMA-aware cache modes Sep 8, 2022
krizhanovsky modified the milestones: 1.1: TBD → 1.x - TBD Nov 7, 2023
@const-t
Contributor

const-t commented Dec 10, 2024

We must keep in mind that a NUMA node does not always consist of both memory and CPUs: sometimes a node has only memory or only CPUs. For now we ignore such cases for the cache's REPLICA mode, but prohibit using the cache's SHARD mode on setups where some node doesn't have at least one CPU. Also, in this case we reserve memory at boot time for each online node, even if the node has no CPUs. Doing this we waste some space; however, the naive solution of using for_each_node_with_cpus is not applicable here, because for_each_node_with_cpus is not yet available at this boot stage.
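
A minimal sketch of the boot-stage reservation described above, assuming memblock is used for the early per-node allocations; tdb_node_mem[] and TDB_NODE_MEM_SZ are hypothetical names, not the actual Tempesta code:

```c
/*
 * Sketch of the early boot reservation described above. tdb_node_mem[]
 * and TDB_NODE_MEM_SZ are hypothetical; memblock_alloc_node() and
 * for_each_online_node() are the real kernel primitives.
 */
#include <linux/memblock.h>
#include <linux/nodemask.h>

#define TDB_NODE_MEM_SZ	(256UL << 20)	/* assumed per-node area size */

static void *tdb_node_mem[MAX_NUMNODES];

static void __init
tdb_reserve_node_mem(void)
{
	int nid;

	/*
	 * for_each_node_with_cpus() can't be used this early (the CPU
	 * topology isn't set up yet), so reserve on every online node
	 * and accept the wasted space on memory-only nodes.
	 */
	for_each_online_node(nid)
		tdb_node_mem[nid] = memblock_alloc_node(TDB_NODE_MEM_SZ,
							PAGE_SIZE, nid);
}
```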

@const-t
Contributor

const-t commented Dec 13, 2024

Here we do __cache_add_node(); however, a more proper way is to call only tfw_cache_copy_resp() and tdb_entry_alloc_unique() for each node, instead of calling the whole __cache_add_node().
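
A rough sketch of that per-node loop; the prototypes of tdb_entry_alloc_unique() and tfw_cache_copy_resp() and the node_db() helper are assumptions for illustration, not the actual Tempesta FW code:

```c
/*
 * Sketch only: replicate a cached response to every node by calling
 * just the two functions named above, instead of the whole
 * __cache_add_node() path. The prototypes and the node_db() helper
 * are assumptions for illustration.
 */
static int
tfw_cache_replicate_resp(TfwHttpResp *resp, unsigned long key, size_t len)
{
	int nid;

	for_each_online_node(nid) {
		/* Allocate a fresh, non-shared entry in the node-local
		 * TDB shard (assumed signature)... */
		TdbRec *rec = tdb_entry_alloc_unique(node_db(nid), key, &len);

		if (!rec)
			return -ENOMEM;
		/* ...and copy the response into it, skipping the rest
		 * of __cache_add_node(). */
		tfw_cache_copy_resp(rec, resp);
	}
	return 0;
}
```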
