[DNM] Raft integration and TMT storage review #32
Conversation
dbms/src/Raft/RaftService.cpp
Outdated
@@ -25,6 +25,7 @@ RaftService::RaftService(const std::string & address_, DB::Context & db_context_

RaftService::~RaftService()
{
    // REVIEW: can we remove this mutex?
Will there be any upcoming commits to fix this? Or are these questions just going to be left in the codebase?
I ask because I think messages like this should be put somewhere else, like Jira, instead of here.
All these messages will be addressed (PRs/issues will be created) after the online video discussion; nothing needs to be done before that.
@@ -42,6 +42,8 @@ RegionMeta RegionMeta::deserialize(ReadBuffer & buf)
    return RegionMeta(peer, region, apply_state, applied_term, pending_remove);
}

// REVIEW: lock? Be careful, this can easily deadlock.
// Or use the member `const RegionID region_id`.
RegionID RegionMeta::regionId() const { return region.id(); }
It should return region_id. I added the member region_id but forgot to change this part.
fixed
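For clarity, the one-line fix described above would look like this (a sketch based on the reply; region_id is the member mentioned there):

    // Return the cached member instead of reading the protobuf `region`
    // field, which would otherwise need the lock.
    RegionID RegionMeta::regionId() const { return region_id; }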
@@ -44,6 +48,8 @@ void applySnapshot(KVStorePtr kvstore, RequestReader read, Context * context)
    auto cf_name = data.cf();
    auto key = TiKVKey();
    auto value = TiKVValue();
    // REVIEW: the batch inserting logic is OK, but the calling stack is weird.
    // Maybe we should just lock on each node insert; that's also fast when lock contention is low.
For now, we don't need a specific function for batch inserting. I implemented this function only because there was a TODO for it.
@@ -21,6 +23,7 @@ void RegionPersister::persist(const RegionPtr & region, enginepb::CommandRespons
    // Support only one persisting thread.
    std::lock_guard<std::mutex> lock(mutex);

    // REVIEW: can we just call region->resetPersistParm()?
I store the value first because it may be changed during doPersist.
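A toy illustration of why the value has to be captured first (nothing here is the project's API; the counter is a stand-in for the region's persist parameter):

    #include <atomic>
    #include <cstdio>

    std::atomic<unsigned> pending_writes{0};  // bumped by concurrent appliers

    void doPersist(unsigned count) { std::printf("persisting %u writes\n", count); }

    void persist()
    {
        // Snapshot first: pending_writes may grow while doPersist runs.
        unsigned captured = pending_writes.load();
        doPersist(captured);
        pending_writes.fetch_sub(captured);  // clear only what we captured
    }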
@@ -110,7 +112,9 @@ class RegionMeta

    // When we create a region peer, we should initialize its log term/index > 0,
    // so that we can force the follower peer to sync the snapshot first.
    // REVIEW: why initialize term = 5?
Ask menglong; it's a magic number.
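For what it's worth, the value appears to mirror TiKV's raftstore constants (stated from memory here, worth verifying against the TiKV source):

    // TiKV initializes a peer created by conf change with these magic values;
    // any peer whose log term/index is below them must catch up via snapshot.
    static constexpr uint64_t RAFT_INIT_LOG_TERM = 5;
    static constexpr uint64_t RAFT_INIT_LOG_INDEX = 5;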
@@ -176,6 +181,7 @@ void RegionMeta::doSetPendingRemove() { pending_remove = true; }
void RegionMeta::waitIndex(UInt64 index)
{
    std::unique_lock<std::mutex> lock(mutex);
    // REVIEW: should we lock inside the closure function?
std::condition_variable itself handles the locking: the predicate closure is always evaluated while the mutex is held, so no extra lock is needed inside it.
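A self-contained sketch of the pattern (simplified member names, not the project's exact code):

    #include <condition_variable>
    #include <cstdint>
    #include <mutex>

    std::mutex mutex;
    std::condition_variable cv;
    uint64_t applied_index = 0;  // guarded by `mutex`

    void waitIndex(uint64_t index)
    {
        std::unique_lock<std::mutex> lock(mutex);
        // wait() releases `lock` while sleeping and reacquires it before each
        // predicate check, so reading applied_index here is already protected.
        cv.wait(lock, [&] { return applied_index >= index; });
    }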
@@ -127,9 +130,11 @@ enginepb::CommandResponse RegionMeta::toCommandResponse() const

RegionMeta::RegionMeta(RegionMeta && rhs) : region_id(rhs.regionId())
{
    // REVIEW: lock rhs
Maybe there's no need; I just added a lock to be safe.
    peer = std::move(rhs.peer);
    region = std::move(rhs.region);
    apply_state = std::move(rhs.apply_state);
    // REVIEW: set rhs.* to init state
Even if a region is in its initial state, its data is not empty. Since we've decided to move it, it should not be used any more.
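The locking being discussed would look roughly like this (a sketch; the real member list is longer):

    // Lock rhs while moving its members out, in case another thread still
    // holds a reference to it. rhs is deliberately left unreset: a moved-from
    // RegionMeta must not be used again.
    RegionMeta::RegionMeta(RegionMeta && rhs) : region_id(rhs.regionId())
    {
        std::lock_guard<std::mutex> lock(rhs.mutex);
        peer = std::move(rhs.peer);
        region = std::move(rhs.region);
        apply_state = std::move(rhs.apply_state);
    }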
@@ -15,12 +15,16 @@ void applySnapshot(KVStorePtr kvstore, RequestReader read, Context * context)

    enginepb::SnapshotRequest request;
    auto ok = read(&request);
    // REVIEW: use two 'throw's here, for 'not ok' and 'no state'
TODO
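Something like the following is presumably what the review asks for (std::runtime_error stands in for the project's exception type, and has_state() assumes the usual protobuf accessor on enginepb::SnapshotRequest):

    enginepb::SnapshotRequest request;
    if (!read(&request))
        throw std::runtime_error("applySnapshot: failed to read snapshot request");
    if (!request.has_state())
        throw std::runtime_error("applySnapshot: first snapshot request has no state");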
@@ -40,6 +43,7 @@ void RegionPersister::doPersist(const RegionPtr & region, enginepb::CommandRespo
    }

    MemoryWriteBuffer buffer;
    // REVIEW: add some warning log here if region is too large
TODO
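The suggested warning might look like this (the threshold and the dataSize() accessor are assumptions for illustration, not project constants):

    // Hypothetical sketch: warn before persisting an unusually large region.
    static constexpr size_t REGION_WARN_BYTES = 100 * 1024 * 1024;  // arbitrary 100 MiB
    if (region->dataSize() > REGION_WARN_BYTES)
        LOG_WARNING(log, "Persisting large region " << region->id() << ", size: " << region->dataSize() << " bytes");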
@@ -11,6 +11,8 @@ extern const int LOGICAL_ERROR;

void RegionPersister::drop(RegionID region_id)
{
    // REVIEW: need mutex?
@@ -70,6 +70,7 @@ class RegionMeta
    bool isPendingRemove() const;
    void setPendingRemove();

    // REVIEW: `assign` is a better name
It can be considered.
@@ -94,6 +95,7 @@ class RegionMeta
    void doSetApplied(UInt64 index, UInt64 term);

private:
    // REVIEW: we should make sure all these members are deep copies (e.g. region.peers())
Yes, they are.
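For reference, protobuf-generated C++ messages copy deeply on assignment, which is why this holds (a small illustration, assuming metapb::Region from the kvproto headers):

    // Copying a protobuf message copies all nested fields, including
    // repeated ones such as peers(); the copy shares no state with the source.
    metapb::Region copy = region;   // deep copy
    copy.mutable_peers()->Clear();  // does not affect region.peers()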
@@ -225,6 +232,7 @@ void RegionMeta::execChangePeer(

    switch (change_peer_request.change_type())
    {
        // REVIEW: throw when we meet `AddNode`?
Any raft command is sent to all peers; we should discard the ones we don't need.
@@ -58,6 +58,7 @@ class KVStore final : private boost::noncopyable

    std::atomic<Timepoint> last_try_persist_time = Clock::now();

    // REVIEW: task_mutex looks like it should be one mutex per region rather than one big lock in KVStore
Because of operations like split and merge, we cannot treat regions as independent.
@@ -272,6 +282,9 @@ void KVStore::removeRegion(RegionID region_id, Context * context)
    }

    region_persister.drop(region_id);

    // REVIEW: if the process crashed here, then when it starts again region_table will not find this region. Is that OK?
In region_table, we don't store the table ranges on disk. At any time, we must ensure that the range in region_table contains the one in region_persister, to avoid losing data. After a restart, we can recompute the table ranges.
    }
    else
    {
        // REVIEW: is the persisting order OK?
updateRegion can only add range info; it can't change it.
@@ -160,6 +163,7 @@ void KVStore::onServiceCommand(const enginepb::CommandRequestBatch & cmds, RaftC
    auto [it, ok] = regions.emplace(new_region->id(), new_region);
    if (!ok)
    {
        // REVIEW: do we need to compare the old one and the new one?
No need, because the index of the newly split region is definitely in its initial state.
@@ -82,11 +83,13 @@ void KVStore::onSnapshot(RegionPtr new_region, Context * context)

    if (new_region->isPendingRemove())
    {
        // REVIEW: here we remove the region in the persister, then below the region is added to the persister again
It's a kind of ABA problem. I have avoided the wrong selection behavior in applySnapshot and MergeTreeDataSelectExecutor::read.
However, one region may appear in two stores, because rngine may crash while sending a raft log that contains a ChangePeer cmd. That would bring disaster to our learner read process.
@@ -66,6 +66,7 @@ void KVStore::onSnapshot(RegionPtr new_region, Context * context)
    {
        LOG_DEBUG(log, "KVStore::onSnapshot: previous " << old_region->toString(true) << " ; new " << new_region->toString(true));

        // REVIEW: in what case can this happen? rngine crashed?
Not sure; it's just for idempotence.
/rebuild
dbms/src/Storages/Page/PageFile.cpp
Outdated
    fd = ::open(path.c_str(), flags, 0666);
    if (-1 == fd)
    {
        ProfileEvents::increment(ProfileEvents::FileOpenFailed);
        // REVIEW: this seems not right
It is a bug. It should be `if constexpr (!must_exist)`.
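A minimal, self-contained sketch of the corrected logic (names are illustrative, not the project's exact code; the point is that the file-may-be-absent branch must be guarded by `!must_exist`):

    #include <cerrno>
    #include <fcntl.h>
    #include <stdexcept>
    #include <string>

    template <bool must_exist>
    int openPageFile(const std::string & path, int flags)
    {
        int fd = ::open(path.c_str(), flags, 0666);
        if (-1 == fd)
        {
            if constexpr (!must_exist)
            {
                if (errno == ENOENT)
                    return -1;  // absent file is acceptable; caller handles the sentinel
            }
            throw std::runtime_error("cannot open file: " + path);
        }
        return fd;
    }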