Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(storagenode): replica in the learning state does not report to a metadata repository #159

Merged
merged 1 commit into from
Sep 22, 2022

Conversation

ijsong
Copy link
Member

@ijsong ijsong commented Sep 21, 2022

What this PR does

Previously, a log stream replica in the learning state sent reports to the metadata repository, and
the log stream replica had to use commit contexts sent from the source replica.
However, the source cannot do that in a future patch because replicas in a log stream will store the
latest commit context rather than all commit contexts.
This patch slightly changes the report's behavior in a log stream replica, not to send it while in a
learning state. When the log stream replica copies the data from the source replica, it will not
send the report.

Which issue(s) this PR resolves

Updates #125

Anything else

RFC: https://github.com/kakao/varlog/blob/main/docs/RFCs/20220915_commit_context.md


This change is Reviewable

@ijsong ijsong requested a review from hungryjang as a code owner September 21, 2022 03:40
@ijsong ijsong self-assigned this Sep 21, 2022
@codecov-commenter
Copy link

codecov-commenter commented Sep 21, 2022

Codecov Report

Base: 63.03% // Head: 62.95% // Decreases project coverage by -0.07% ⚠️

Coverage data is based on head (ca4c184) compared to base (b796fd7).
Patch coverage: 18.18% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #159      +/-   ##
==========================================
- Coverage   63.03%   62.95%   -0.08%     
==========================================
  Files         126      126              
  Lines       17052    17061       +9     
==========================================
- Hits        10748    10741       -7     
- Misses       5754     5764      +10     
- Partials      550      556       +6     
Impacted Files Coverage Δ
internal/storagenode/logstream/testing.go 65.95% <0.00%> (-11.55%) ⬇️
internal/storagenode/logstream/executor.go 80.12% <50.00%> (+0.29%) ⬆️
pkg/mrc/mrconnector/mr_connector.go 73.28% <0.00%> (-7.26%) ⬇️
internal/storagenode/replication_server.go 76.00% <0.00%> (-7.00%) ⬇️
internal/metarepos/storage.go 80.59% <0.00%> (+0.93%) ⬆️
internal/storagenode/config.go 45.71% <0.00%> (+2.85%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@@ -348,8 +348,11 @@ func (lse *Executor) Report(_ context.Context) (snpb.LogStreamUncommitReport, er
atomic.AddInt64(&lse.inflight, 1)
defer atomic.AddInt64(&lse.inflight, -1)

if lse.esm.load() == executorStateClosed {
switch lse.esm.load() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any chance that lse.esm.load and lse.lsc.reportCommitBase will be in learning state between executions so incorrect information will be read?

Copy link
Member Author

@ijsong ijsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @hungryjang and @ijsong)


internal/storagenode/logstream/executor.go line 351 at r1 (raw file):

Previously, hungryjang (Hyungkeun Jang) wrote…

Is there any chance that lse.esm.load and lse.lsc.reportCommitBase will be in learning state between executions so incorrect information will be read?

Good point. Currently, the only way to solve the concurrency issue is to use mutex. There is a good candidate who needs to change it as shared mutex:

// muAdmin makes Seal, Unseal, SyncInit and SyncReplicate, Trim methods run mutually exclusively.
muAdmin sync.Mutex

However, it can affect the response time of the report during handling seal, unseal, or sync-related operations. Does the inaccurate information in the report affect the performance of the metadata repository?

… metadata repository

Previously, a log stream replica in the learning state sent reports to the metadata repository, and
the log stream replica had to use commit contexts sent from the source replica.
However, the source cannot do that in a future patch because replicas in a log stream will store the
latest commit context rather than all commit contexts.
This patch slightly changes the report's behavior in a log stream replica, not to send it while in a
learning state. When the log stream replica copies the data from the source replica, it will not
send the report.

Updates kakao#125
@hungryjang
Copy link
Member

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @hungryjang and @ijsong)

internal/storagenode/logstream/executor.go line 351 at r1 (raw file):

Previously, hungryjang (Hyungkeun Jang) wrote…
Good point. Currently, the only way to solve the concurrency issue is to use mutex. There is a good candidate who needs to change it as shared mutex:

// muAdmin makes Seal, Unseal, SyncInit and SyncReplicate, Trim methods run mutually exclusively.
muAdmin sync.Mutex

However, it can affect the response time of the report during handling seal, unseal, or sync-related operations. Does the inaccurate information in the report affect the performance of the metadata repository?

The performance of metadata repository is not affected. However, it can lead to misjudgment. Let's define a report that can be ignored and use it.

@ijsong ijsong merged commit 0a14ab5 into kakao:main Sep 22, 2022
ijsong added a commit to ijsong/varlog that referenced this pull request Sep 23, 2022
…port

When a log stream replica handles SyncInit, it can be prone to mutate the replica's state too early.
It should change the state after checking a valid sync range. It also updates the version, high
watermark, and uncommittedLLSNBegin with zeros to denote that the replica sends an invalid report
while the learning phase.
The replica checks whether it is in the learning phase at the end of method
`internal/storagenode/logstream.(*Executor).Report` to minimize the probability of concurrency
issues mentioned in kakao#159.

Refs
- kakao#159 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants