Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(storagenode): ignore context error while checking to interleave of Append RPC errors #504

Merged
merged 1 commit into from
Jul 31, 2023

Conversation

ijsong
Copy link
Member

@ijsong ijsong commented Jul 4, 2023

What this PR does

Append batch requests can succeed partially. In case of partial failure, failed log entries must be sequential. That is, suffixes of the batch can be failed to append, whose length can be from zero to the full size.

Our codebase panics when the batch's success and failure are interleaved. This PR ignores context errors caused by clients since the batch can succeed in the storage node, although clients canceled it.

@ijsong
Copy link
Member Author

ijsong commented Jul 4, 2023

@codecov-commenter
Copy link

codecov-commenter commented Jul 4, 2023

Codecov Report

Patch coverage has no change and project coverage change: -0.16% ⚠️

Comparison is base (267cccc) 61.81% compared to head (b54b913) 61.65%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@                     Coverage Diff                      @@
##           restore_uncommitted_logs     #504      +/-   ##
============================================================
- Coverage                     61.81%   61.65%   -0.16%     
============================================================
  Files                           137      137              
  Lines                         18751    18759       +8     
============================================================
- Hits                          11590    11566      -24     
- Misses                         6587     6609      +22     
- Partials                        574      584      +10     
Files Changed Coverage Δ
internal/storagenode/logstream/append.go 45.49% <0.00%> (-1.62%) ⬇️

... and 7 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ijsong ijsong force-pushed the restore_uncommitted_logs branch from 9e8209f to aee1b52 Compare July 4, 2023 06:45
@ijsong ijsong force-pushed the sn_append_partial_fialure_context_error branch from b4f9531 to 6efb942 Compare July 4, 2023 06:46
@ijsong ijsong force-pushed the restore_uncommitted_logs branch from aee1b52 to 6f87f44 Compare July 17, 2023 00:59
@ijsong ijsong force-pushed the sn_append_partial_fialure_context_error branch from 6efb942 to 38d3942 Compare July 17, 2023 00:59
@ijsong ijsong requested a review from hungryjang July 27, 2023 09:14
@ijsong ijsong force-pushed the restore_uncommitted_logs branch from 6f87f44 to 267cccc Compare July 28, 2023 08:40
@ijsong ijsong force-pushed the sn_append_partial_fialure_context_error branch from 38d3942 to b54b913 Compare July 28, 2023 08:40
Base automatically changed from restore_uncommitted_logs to main July 28, 2023 09:54
@ijsong ijsong force-pushed the sn_append_partial_fialure_context_error branch from b54b913 to 3bec88e Compare July 28, 2023 12:33
…f Append RPC errors

Append batch requests can succeed partially. In case of partial failure, failed log entries must be
sequential. That is, suffixes of the batch can be failed to append, whose length can be from zero to
the full size.

Our codebase panics when the batch's success and failure are interleaved. This PR ignores context
errors caused by clients since the batch can succeed in the storage node, although clients canceled
it.
@ijsong ijsong force-pushed the sn_append_partial_fialure_context_error branch from 3bec88e to 04d1052 Compare July 31, 2023 06:09
@ijsong ijsong merged commit 5a7a3b0 into main Jul 31, 2023
@ijsong ijsong deleted the sn_append_partial_fialure_context_error branch July 31, 2023 06:22
ijsong added a commit that referenced this pull request Aug 7, 2023
🤖 I have created a release *beep* *boop*
---


## [0.15.0](v0.14.1...v0.15.0) (2023-07-31)


### Features

* **admin:** add otelgrpc metric interceptor ([d9ca9aa](d9ca9aa))
* **admin:** add otelgrpc metric interceptor ([#509](#509)) ([db7a1a2](db7a1a2))
* **admin:** speed up fetching cluster metadata ([3e46f62](3e46f62))
* **admin:** speed up fetching cluster metadata ([#480](#480)) ([53a8f19](53a8f19))
* **all:** add common flags for telemetry ([fcacd1a](fcacd1a))
* **all:** add common flags for telemetry ([#494](#494)) ([63355e9](63355e9))
* **benchmark:** share a connection between appenders in a target ([7dc53e9](7dc53e9))
* **benchmark:** share a connection between appenders in a target ([#524](#524)) ([2cd9196](2cd9196))
* **client:** add Clear to the log stream appender manager ([9a89065](9a89065))
* **client:** add Clear to the log stream appender manager ([#514](#514)) ([e5b6a2e](e5b6a2e))
* **storagenode:** add --storage-trim-delay to set a delay before the deletion of log entries ([db39713](db39713))
* **storagenode:** add --storage-trim-delay to set a delay before the deletion of log entries ([#529](#529)) ([015bfa4](015bfa4))
* **storagenode:** add --storage-trim-rate to set throttling rate of Trim ([83b7496](83b7496))
* **storagenode:** add --storage-trim-rate to set throttling rate of Trim ([#530](#530)) ([6e69306](6e69306))
* **telemetry:** customize bucket size of process.runtime.go.gc.pause_ns ([b181132](b181132))
* **telemetry:** customize bucket size of process.runtime.go.gc.pause_ns ([#510](#510)) ([9d99520](9d99520))
* **telemetry:** customize bucket size of rpc.server.duration ([a0e5973](a0e5973))
* **telemetry:** customize bucket size of rpc.server.duration ([#511](#511)) ([e41fe1c](e41fe1c))


### Bug Fixes

* **benchmark:** make append duration's precision high ([e3a091d](e3a091d))
* **benchmark:** make append duration's precision high ([#522](#522)) ([815af53](815af53))
* **benchmark:** support graceful stop ([8616d55](8616d55))
* **benchmark:** support graceful stop ([#527](#527)) ([fc4ed81](fc4ed81))
* **metarepos:** add TestMRIgnoreDirtyReport ([fe2a550](fe2a550))
* **metarepos:** allow set commitTick ([bdca20a](bdca20a))
* **metarepos:** ignore invalid report ([e8620de](e8620de))
* **storagenode:** ignore context error while checking to interleave of Append RPC errors ([04d1052](04d1052))
* **storagenode:** ignore context error while checking to interleave of Append RPC errors ([#504](#504)) ([5a7a3b0](5a7a3b0))
* **storagenode:** restore uncommitted logs ([267cccc](267cccc)), closes [#490](#490)
* **storagenode:** restore uncommitted logs ([#492](#492)) ([a9832ee](a9832ee)), closes [#490](#490)


### Performance Improvements

* **admin:** use singleflight to handle Admin's RPCs ([c231888](c231888))
* **admin:** use singleflight to handle Admin's RPCs ([#482](#482)) ([1a6a96d](1a6a96d))
* **metarepos:** add a pool for []*mrpb.Report ([fa8c89d](fa8c89d))
* **metarepos:** add a pool for []*mrpb.Report ([#534](#534)) ([16b2181](16b2181))
* **metarepos:** add a pool for *mrpb.RaftEntry ([be9f121](be9f121))
* **metarepos:** add a pool for *mrpb.RaftEntry ([#536](#536)) ([96ab5e2](96ab5e2))
* **metarepos:** add a pool for mrpb.Reports ([59a6a5a](59a6a5a))
* **metarepos:** add a pool for mrpb.Reports ([#533](#533)) ([b227c75](b227c75))
* **metarepos:** avoid copy overhead by removing unnecessary converting from byte slice to string ([a775628](a775628))
* **metarepos:** avoid copy overhead by removing unnecessary converting from byte slice to string ([#532](#532)) ([1702769](1702769))
* **metarepos:** reuse mrpb.StorageNodeUncommitReport while changed ([57d8039](57d8039))
* **metarepos:** reuse mrpb.StorageNodeUncommitReport while changed ([#537](#537)) ([8f6e097](8f6e097))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants