-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(storagenode): ignore context error while checking to interleave of Append RPC errors #504
Conversation
Current dependencies on/for this PR:
This comment was auto-generated by Graphite. |
Codecov ReportPatch coverage has no change and project coverage change:
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more. Additional details and impacted files@@ Coverage Diff @@
## restore_uncommitted_logs #504 +/- ##
============================================================
- Coverage 61.81% 61.65% -0.16%
============================================================
Files 137 137
Lines 18751 18759 +8
============================================================
- Hits 11590 11566 -24
- Misses 6587 6609 +22
- Partials 574 584 +10
☔ View full report in Codecov by Sentry. |
9e8209f
to
aee1b52
Compare
b4f9531
to
6efb942
Compare
aee1b52
to
6f87f44
Compare
6efb942
to
38d3942
Compare
6f87f44
to
267cccc
Compare
38d3942
to
b54b913
Compare
b54b913
to
3bec88e
Compare
…f Append RPC errors Append batch requests can succeed partially. In case of partial failure, failed log entries must be sequential. That is, suffixes of the batch can be failed to append, whose length can be from zero to the full size. Our codebase panics when the batch's success and failure are interleaved. This PR ignores context errors caused by clients since the batch can succeed in the storage node, although clients canceled it.
3bec88e
to
04d1052
Compare
🤖 I have created a release *beep* *boop* --- ## [0.15.0](v0.14.1...v0.15.0) (2023-07-31) ### Features * **admin:** add otelgrpc metric interceptor ([d9ca9aa](d9ca9aa)) * **admin:** add otelgrpc metric interceptor ([#509](#509)) ([db7a1a2](db7a1a2)) * **admin:** speed up fetching cluster metadata ([3e46f62](3e46f62)) * **admin:** speed up fetching cluster metadata ([#480](#480)) ([53a8f19](53a8f19)) * **all:** add common flags for telemetry ([fcacd1a](fcacd1a)) * **all:** add common flags for telemetry ([#494](#494)) ([63355e9](63355e9)) * **benchmark:** share a connection between appenders in a target ([7dc53e9](7dc53e9)) * **benchmark:** share a connection between appenders in a target ([#524](#524)) ([2cd9196](2cd9196)) * **client:** add Clear to the log stream appender manager ([9a89065](9a89065)) * **client:** add Clear to the log stream appender manager ([#514](#514)) ([e5b6a2e](e5b6a2e)) * **storagenode:** add --storage-trim-delay to set a delay before the deletion of log entries ([db39713](db39713)) * **storagenode:** add --storage-trim-delay to set a delay before the deletion of log entries ([#529](#529)) ([015bfa4](015bfa4)) * **storagenode:** add --storage-trim-rate to set throttling rate of Trim ([83b7496](83b7496)) * **storagenode:** add --storage-trim-rate to set throttling rate of Trim ([#530](#530)) ([6e69306](6e69306)) * **telemetry:** customize bucket size of process.runtime.go.gc.pause_ns ([b181132](b181132)) * **telemetry:** customize bucket size of process.runtime.go.gc.pause_ns ([#510](#510)) ([9d99520](9d99520)) * **telemetry:** customize bucket size of rpc.server.duration ([a0e5973](a0e5973)) * **telemetry:** customize bucket size of rpc.server.duration ([#511](#511)) ([e41fe1c](e41fe1c)) ### Bug Fixes * **benchmark:** make append duration's precision high ([e3a091d](e3a091d)) * **benchmark:** make append duration's precision high ([#522](#522)) ([815af53](815af53)) * **benchmark:** support graceful stop ([8616d55](8616d55)) * **benchmark:** support graceful stop ([#527](#527)) ([fc4ed81](fc4ed81)) * **metarepos:** add TestMRIgnoreDirtyReport ([fe2a550](fe2a550)) * **metarepos:** allow set commitTick ([bdca20a](bdca20a)) * **metarepos:** ignore invalid report ([e8620de](e8620de)) * **storagenode:** ignore context error while checking to interleave of Append RPC errors ([04d1052](04d1052)) * **storagenode:** ignore context error while checking to interleave of Append RPC errors ([#504](#504)) ([5a7a3b0](5a7a3b0)) * **storagenode:** restore uncommitted logs ([267cccc](267cccc)), closes [#490](#490) * **storagenode:** restore uncommitted logs ([#492](#492)) ([a9832ee](a9832ee)), closes [#490](#490) ### Performance Improvements * **admin:** use singleflight to handle Admin's RPCs ([c231888](c231888)) * **admin:** use singleflight to handle Admin's RPCs ([#482](#482)) ([1a6a96d](1a6a96d)) * **metarepos:** add a pool for []*mrpb.Report ([fa8c89d](fa8c89d)) * **metarepos:** add a pool for []*mrpb.Report ([#534](#534)) ([16b2181](16b2181)) * **metarepos:** add a pool for *mrpb.RaftEntry ([be9f121](be9f121)) * **metarepos:** add a pool for *mrpb.RaftEntry ([#536](#536)) ([96ab5e2](96ab5e2)) * **metarepos:** add a pool for mrpb.Reports ([59a6a5a](59a6a5a)) * **metarepos:** add a pool for mrpb.Reports ([#533](#533)) ([b227c75](b227c75)) * **metarepos:** avoid copy overhead by removing unnecessary converting from byte slice to string ([a775628](a775628)) * **metarepos:** avoid copy overhead by removing unnecessary converting from byte slice to string ([#532](#532)) ([1702769](1702769)) * **metarepos:** reuse mrpb.StorageNodeUncommitReport while changed ([57d8039](57d8039)) * **metarepos:** reuse mrpb.StorageNodeUncommitReport while changed ([#537](#537)) ([8f6e097](8f6e097)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
What this PR does
Append batch requests can succeed partially. In case of partial failure, failed log entries must be sequential. That is, suffixes of the batch can be failed to append, whose length can be from zero to the full size.
Our codebase panics when the batch's success and failure are interleaved. This PR ignores context errors caused by clients since the batch can succeed in the storage node, although clients canceled it.