perf: enhance the loading process of replicas particularly when a significant number of replicas are spread across multiple disks #2078

empiredan · 2024-07-18T17:03:27Z

No description provided.

acelyc111 · 2024-07-30T11:39:00Z

src/replica/replica_stub.h

+
+    // Get the dir name for a replica from a potentially longer path (both absolute and
+    // relative paths are possible).
+    static std::string get_replica_dir_name(const std::string &dir);


Could you add some tests for these utility functions?

get_replica_dir_name

parse_replica_dir_name
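As a rough idea of what such tests could cover, here is a minimal, self-contained sketch. It does not use the real functions: `dir_name_sketch`, `parse_dir_name_sketch` and `gpid_sketch` are hypothetical stand-ins, and the `<app_id>.<partition_index>.<app_type>` directory-name layout is an assumption that this thread does not confirm.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for get_replica_dir_name(): returns the last path
// component and ignores trailing '/' characters.
std::string dir_name_sketch(const std::string &dir)
{
    const auto end = dir.find_last_not_of('/');
    if (end == std::string::npos) {
        return ""; // Empty input, or nothing but slashes.
    }
    const auto slash = dir.find_last_of('/', end);
    const auto begin = (slash == std::string::npos) ? 0 : slash + 1;
    return dir.substr(begin, end - begin + 1);
}

// Hypothetical stand-in for gpid.
struct gpid_sketch
{
    int32_t app_id;
    int32_t partition_index;
};

// Parses a non-negative decimal number of at most 9 digits (always fits in int32_t).
bool parse_decimal(const std::string &s, int32_t &out)
{
    if (s.empty() || s.size() > 9) {
        return false;
    }
    int32_t value = 0;
    for (const char c : s) {
        if (c < '0' || c > '9') {
            return false;
        }
        value = value * 10 + (c - '0');
    }
    out = value;
    return true;
}

// Hypothetical stand-in for parse_replica_dir_name(), assuming the name looks
// like "<app_id>.<partition_index>.<app_type>". Outputs are only written on success.
bool parse_dir_name_sketch(const std::string &name, gpid_sketch &pid, std::string &app_type)
{
    const auto dot1 = name.find('.');
    if (dot1 == std::string::npos) {
        return false;
    }
    const auto dot2 = name.find('.', dot1 + 1);
    if (dot2 == std::string::npos || dot2 + 1 >= name.size()) {
        return false;
    }
    gpid_sketch parsed{};
    if (!parse_decimal(name.substr(0, dot1), parsed.app_id) ||
        !parse_decimal(name.substr(dot1 + 1, dot2 - dot1 - 1), parsed.partition_index)) {
        return false;
    }
    pid = parsed;
    app_type = name.substr(dot2 + 1);
    return true;
}

// The kind of cases a unit test might cover: absolute and relative paths,
// a trailing slash, and malformed names.
void check_dir_name_sketches()
{
    assert(dir_name_sketch("/data/replica/1.2.pegasus") == "1.2.pegasus");
    assert(dir_name_sketch("./reps/1.2.pegasus/") == "1.2.pegasus");
    assert(dir_name_sketch("1.2.pegasus") == "1.2.pegasus");

    gpid_sketch pid{};
    std::string app_type;
    assert(parse_dir_name_sketch("1.2.pegasus", pid, app_type));
    assert(pid.app_id == 1 && pid.partition_index == 2 && app_type == "pegasus");
    assert(!parse_dir_name_sketch("1.2", pid, app_type));
    assert(!parse_dir_name_sketch("a.2.pegasus", pid, app_type));
}
```

In the real test suite these cases would presumably be table-driven tests against the actual static functions of replica_stub.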

acelyc111 · 2024-07-30T12:15:59Z

src/replica/replica_stub.h

+    parse_replica_dir_name(const std::string &dir_name, gpid &pid, std::string &app_type);
+
+    // Load an existing replica which is located in `dn` with `dir`. Usually each different
+    // `dn` represents a unique disk. `dir` is the absolute path of the directory for a


Suggested change

// `dn` represents a unique disk. `dir` is the absolute path of the directory for a

// `dn` represents an unique disk. `dir` is the absolute path of the directory for a

I think "a" should be put before "unique", which begins with the consonant sound /j/.

acelyc111 · 2024-07-30T12:16:29Z

src/replica/replica_stub.h

+    // Load an existing replica which is located in `dn` with `dir`. Usually each different
+    // `dn` represents a unique disk. `dir` is the absolute path of the directory for a
+    // replica.
+    virtual replica_ptr load_replica(dir_node *dn, const char *dir);


Suggested change

-    virtual replica_ptr load_replica(dir_node *dn, const char *dir);
+    virtual replica_ptr load_replica(dir_node *dn, const char *replica_dir);

Clarify it's the replica's dir, not the dir_node's.

acelyc111 · 2024-07-30T12:23:45Z

src/replica/replica_stub.h

+    // Load all replicas synchronously from all disks to `reps`. This function would ensure
+    // that data on each disk is loaded more evenly, rather than that a disk would begin to
+    // be loaded only after another has been finished, in case that there are too many replicas
+    // on a disk and other disks cannot start loading until this disk is finished.


Could it just be simplified to this?

Suggested change

-    // Load all replicas synchronously from all disks to `reps`. This function would ensure
-    // that data on each disk is loaded more evenly, rather than that a disk would begin to
-    // be loaded only after another has been finished, in case that there are too many replicas
-    // on a disk and other disks cannot start loading until this disk is finished.
+    // Load all replicas simultaneously from all disks to `reps`.
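For context on what "loaded more evenly" means in the longer comment, here is a minimal sketch of visiting replica dirs in round-robin order across disks, so that no disk has to wait for another one to finish. `interleave_dirs` and its parameter are hypothetical names for illustration only; the patch itself schedules tasks rather than building a list.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Returns (disk index, dir) pairs in round-robin order: one dir from each disk
// per pass, skipping disks that have already run out of dirs.
std::vector<std::pair<size_t, std::string>>
interleave_dirs(const std::vector<std::vector<std::string>> &dirs_by_disk)
{
    std::vector<std::pair<size_t, std::string>> order;
    // next[i] is the index of the next dir to take from disk i.
    std::vector<size_t> next(dirs_by_disk.size(), 0);

    bool progressed = true;
    while (progressed) {
        progressed = false;
        for (size_t disk = 0; disk < dirs_by_disk.size(); ++disk) {
            if (next[disk] >= dirs_by_disk[disk].size()) {
                continue; // This disk has no dirs left.
            }
            order.emplace_back(disk, dirs_by_disk[disk][next[disk]++]);
            progressed = true;
        }
    }
    return order;
}
```

With three dirs on the first disk and one on the second, the resulting order alternates between the disks until the second one is exhausted, instead of finishing the first disk before touching the second.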

acelyc111 · 2024-07-30T13:00:30Z

src/replica/replica_stub.cpp

+    const auto *const worker = task::get_current_worker2();
+    if (worker != nullptr) {
+        CHECK(!(worker->pool()->spec().partitioned),
+              "The thread pool for loading replicas must not be partitioned since load balancing "


It would be better to mention which thread pool this is, so that administrators know how to adjust the config.

acelyc111 · 2024-07-30T13:04:42Z

src/replica/replica_stub.cpp

+    CHECK(reps.find(rep->get_gpid()) == reps.end(),
+          "conflict replica dir: {} <--> {}",
+          rep->dir(),
+          reps[rep->get_gpid()]->dir());


Better to reuse the found iterator to avoid finding it again.
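A minimal sketch of the idea, using standard containers only. `replica_id_sketch` stands in for gpid and the map value stands in for the replica's dir; both are assumptions for illustration, and an exception replaces the project's CHECK macro.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for gpid: (app_id, partition_index).
using replica_id_sketch = std::pair<int, int>;

// Registers `dir` under `id`, reporting a conflict if `id` is already present.
void add_replica_dir(std::map<replica_id_sketch, std::string> &reps,
                     const replica_id_sketch &id,
                     const std::string &dir)
{
    // Look the key up once and keep the iterator.
    const auto iter = reps.find(id);
    if (iter != reps.end()) {
        // Reuse the iterator for the message instead of calling reps[id] again.
        throw std::runtime_error("conflict replica dir: " + dir + " <--> " + iter->second);
    }
    reps.emplace(id, dir);
}
```

Besides saving a lookup, this avoids `operator[]`, which would insert a default-constructed value if the key were ever missing.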

acelyc111 · 2024-07-30T13:10:14Z

src/replica/replica_stub.cpp

+            //
+            // For the docs of clang 16 please see:
+            //
+            // https://releases.llvm.org/16.0.0/tools/clang/docs/ReleaseNotes.html#c-20-feature-support:


Suggested change

-            // https://releases.llvm.org/16.0.0/tools/clang/docs/ReleaseNotes.html#c-20-feature-support:
+            // https://releases.llvm.org/16.0.0/tools/clang/docs/ReleaseNotes.html#c-20-feature-support.

acelyc111 · 2024-07-30T13:12:13Z

src/replica/replica_stub.cpp

+    std::vector<size_t> dir_indexes(disks.size(), 0);
+    std::vector<std::queue<std::pair<std::string, task_ptr>>> load_disk_queues(disks.size());


Better to add comments describing what they are used for.

acelyc111 · 2024-07-30T13:20:05Z

src/replica/replica_stub.cpp

+                                FLAGS_load_replica_max_wait_time_ms,
+                                load_disk_queue.front().first,
+                                load_disk_queue.size(),
+                                disk_index,


The disk_index is just an internal variable and may be confusing. Is it necessary to log it?

acelyc111 · 2024-07-30T13:20:47Z

src/replica/replica_stub.cpp

+            if (!load_disk_queue.empty() &&
+                load_disk_queue.size() >= FLAGS_max_replicas_on_load_for_each_disk) {


Suggested change

-            if (!load_disk_queue.empty() &&
-                load_disk_queue.size() >= FLAGS_max_replicas_on_load_for_each_disk) {
+            if (load_disk_queue.size() >= FLAGS_max_replicas_on_load_for_each_disk) {
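One caveat worth spelling out: the two conditions agree only when the limit is at least 1. The sketch below compares them on a plain `std::queue<int>`; the function names are made up for illustration, and whether the flag is validated to be positive is an assumption not confirmed in this thread.

```cpp
#include <cstddef>
#include <queue>

// The original condition: never true for an empty queue.
bool throttled_verbose(const std::queue<int> &q, size_t limit)
{
    return !q.empty() && q.size() >= limit;
}

// The simplified condition: true for an empty queue when limit == 0.
bool throttled_simple(const std::queue<int> &q, size_t limit)
{
    return q.size() >= limit;
}
```

For any limit of 1 or more, `q.size() >= limit` already implies a non-empty queue, so the extra check is redundant. With a limit of 0 the simplified form would be true for an empty queue, and the following `front()` call would then be undefined behavior, so the simplification is safe only if the flag cannot be 0.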

acelyc111 · 2024-07-30T14:21:45Z

src/replica/replica_stub.cpp

+                    continue;
+                }
+
+                // Continue to load a replica since we are within the limit now.


Skip loading?

acelyc111 · 2024-07-30T14:31:16Z

src/replica/replica_stub.cpp

+                }
+            }
+
+            LOG_DEBUG("ready to load dir(index={}, path={}) for disk(index={}, tag={}, path={})",


How about moving it below the next continue in line 566?

acelyc111 · 2024-07-30T14:46:44Z

src/replica/replica_stub.cpp

@@ -2015,7 +2164,6 @@ replica *replica_stub::load_replica(dir_node *dn, const char *dir)
    const auto err = rep->initialize_on_load();
    if (err != ERR_OK) {
        LOG_ERROR("{}: load replica failed, err = {}", rep->name(), err);
-        rep->close();


Why remove this?

In that it would be called immediately by delete rep;.
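To illustrate the reasoning, here is a minimal sketch of a class whose destructor calls an idempotent close(), which makes an explicit close() right before delete redundant. `replica_sketch` is hypothetical; the real replica destructor is not shown in this thread.

```cpp
// Hypothetical stand-in for replica: close() is idempotent and is also
// invoked by the destructor.
class replica_sketch
{
public:
    explicit replica_sketch(int *close_count) : _close_count(close_count) {}
    replica_sketch(const replica_sketch &) = delete;
    replica_sketch &operator=(const replica_sketch &) = delete;

    ~replica_sketch() { close(); }

    void close()
    {
        if (_closed) {
            return; // Already closed: nothing to do.
        }
        _closed = true;
        ++*_close_count; // Count how many times resources are actually released.
    }

private:
    int *_close_count;
    bool _closed{false};
};
```

Under that assumption, `delete rep;` alone releases the resources exactly once, whether or not close() was called first.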

acelyc111 · 2024-07-30T14:48:16Z

src/replica/replica_stub.cpp

-        tsk->wait();
-    }
-    uint64_t finish_time = dsn_now_ms();
+    utils::chronograph chrono;


The macros in src/utils/timer.h may help.

acelyc111 · 2024-07-30T15:00:57Z

src/replica/replica_stub.cpp

+            if (!load_disk_queue.empty() &&
+                load_disk_queue.size() >= FLAGS_max_replicas_on_load_for_each_disk) {
+                // Loading replicas should be throttled in case that disk IO is saturated.
+                if (load_disk_queue.front().second->wait(FLAGS_load_replica_max_wait_time_ms)) {


It seems this patch implements a threadpool-with-max-threads, doesn't it? The benefit compared to the former implementation is that now we can limit max_replicas_on_load_for_each_disk.

Could the rocksdb::ThreadPool work well here? https://github.com/apache/incubator-pegasus/blob/master/src/shell/commands/local_partition_split.cpp#L392
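For readers comparing the two approaches, here is a minimal sketch of the throttling pattern discussed here: a per-disk queue of in-flight loads that is capped at a limit, with a bounded wait on the oldest load before moving on to another disk. It uses `std::async` and `std::future` as stand-ins for the project's task API (it is neither that API nor rocksdb::ThreadPool), and `load_all_throttled` and all of its parameters are hypothetical names.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Starts `load(dir)` for every dir, keeping at most `max_in_flight` loads
// pending per disk. Returns the number of loads that were started.
size_t load_all_throttled(const std::vector<std::vector<std::string>> &dirs_by_disk,
                          size_t max_in_flight,
                          std::chrono::milliseconds max_wait,
                          const std::function<void(const std::string &)> &load)
{
    using pending_load = std::pair<std::string, std::future<void>>;
    // Per disk: loads that have been started but not yet confirmed finished.
    std::vector<std::queue<pending_load>> queues(dirs_by_disk.size());
    // Per disk: index of the next dir to start loading.
    std::vector<size_t> next(dirs_by_disk.size(), 0);
    // Guard against a zero limit, which would make front() below invalid.
    const size_t limit = std::max<size_t>(1, max_in_flight);

    size_t total = 0;
    for (const auto &dirs : dirs_by_disk) {
        total += dirs.size();
    }

    size_t started = 0;
    while (started < total) {
        for (size_t disk = 0; disk < dirs_by_disk.size(); ++disk) {
            if (next[disk] >= dirs_by_disk[disk].size()) {
                continue; // Nothing left to start on this disk.
            }
            auto &queue = queues[disk];
            if (queue.size() >= limit) {
                // Throttle: wait a bounded time for the oldest load on this disk.
                if (queue.front().second.wait_for(max_wait) != std::future_status::ready) {
                    continue; // Still busy: give other disks a turn.
                }
                queue.front().second.get();
                queue.pop();
            }
            const std::string &dir = dirs_by_disk[disk][next[disk]++];
            queue.emplace(dir, std::async(std::launch::async, load, dir));
            ++started;
        }
    }

    // Wait for whatever is still in flight.
    for (auto &queue : queues) {
        while (!queue.empty()) {
            queue.front().second.get();
            queue.pop();
        }
    }
    return started;
}
```

Note that `load` runs concurrently on several threads, so it has to be thread-safe. A fixed-size pool per disk would bound concurrency in a similar way; the queue-based form additionally lets the scheduler switch to another disk while one disk is saturated.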


github-actions bot added the cpp label Jul 18, 2024

empiredan marked this pull request as ready for review July 30, 2024 09:13

acelyc111 reviewed Jul 30, 2024


empiredan force-pushed the optimize-load-replica branch from 80f2a64 to 029319f Compare August 23, 2024 09:32

empiredan force-pushed the optimize-load-replica branch from aaa9c43 to 0b047f9 Compare September 18, 2024 07:20

github-actions bot added the scripts label Sep 20, 2024

empiredan force-pushed the optimize-load-replica branch 2 times, most recently from 9bffedc to 9a8e525 Compare September 23, 2024 10:49

github-actions bot added the github label Sep 27, 2024

empiredan added 19 commits September 29, 2024 17:51

be578e2 perf: improve the loading of a great number of replicas from multiple disks

44b80e8 perf: improve the loading of a great number of replicas from multiple disks

b61ad76 fix compilation

251ae4f fix tests

afb0233 format

3b5e6c1 fix IWYU

7015006 refactor

5db3f76 refactor

5e38585 add test for loading replicas

89d39b9 add tests

6b6b630 add tests

2fb4a53 fix tests

e9c6db6 add tests

05bfb8d add comments

db6b056 refactor

fadcb34 refactor

2adc2d3 refactor

89ce78c fix clang tidy

ef3d4de fix clang tidy

empiredan added 11 commits September 29, 2024 17:51

e4e20e5 fix centos 7 compilation and IWYU

845f42e fix IWYU

f52a98c fix clang tidy

d5ac1fb rename parameter

284af4f add GetReplicaDirNameTest

39bb57d add ParseReplicaDirNameTest, fix clang-tidy and IWYU

40483e7 fix ParseReplicaDirNameTest and fix IWYU

5ecadf9 fix clang tidy

c456dc2 fix load replicas

783112f fix IWYU

018d3b4 fix upload-artifact

empiredan force-pushed the optimize-load-replica branch from 6e246bb to 018d3b4 Compare September 29, 2024 09:52



