Skip to content
This repository has been archived by the owner on Apr 15, 2024. It is now read-only.

feat: orchestrator not failing and requeing failed nonces #631

Merged
merged 5 commits into from
Dec 4, 2023

Conversation

rach-id
Copy link
Member

@rach-id rach-id commented Nov 30, 2023

Overview

Checklist

  • New and updated code has appropriate documentation
  • New and updated code has new and/or updated testing
  • Required CI checks are passing
  • Visual proof for any user facing features like CLI or documentation updates
  • Linked issues closed with keywords

Summary by CodeRabbit

  • Refactor

    • Improved retry logic with adjusted parameters for more responsive error handling.
    • Enhanced logging by setting default log level to debug for better visibility during troubleshooting.
  • New Features

    • Introduced a new constant to optimize requeuing operations.
    • Added a method to efficiently requeue items for processing under specific conditions.
  • Bug Fixes

    • Streamlined context management across various methods to ensure consistent behavior and error handling.

@rach-id rach-id added enhancement New feature or request orchestrator orchestrator related labels Nov 30, 2023
@rach-id rach-id self-assigned this Nov 30, 2023
@rach-id rach-id requested a review from evan-forbes as a code owner November 30, 2023 23:11
Copy link
Contributor

coderabbitai bot commented Nov 30, 2023

Walkthrough

The recent updates focus on refining the retry logic and enhancing debuggability. The retrier's parameters have been adjusted for more frequent attempts, and logging levels have been shifted to debug for less verbosity in standard operation. The orchestrator's methods now require a context passed in, promoting better control over goroutine lifecycles. Additionally, a new requeue mechanism has been introduced, allowing for more resilient error handling and potential reprocessing of tasks.

Changes

File Path Change Summary
cmd/.../cmd.go Retrier parameters changed from 6 retries with 1-minute intervals to 5 retries with 30-second intervals.
e2e/scripts/start_orchestrator_after_validator_created.sh Added --log.level debug to set log level to debug.
helpers/retrier.go Changed logging level from "Info" to "Debug" for retry-related messages.
orchestrator/orchestrator.go Added RequeueWindow constant, modified method signatures to accept context, replaced cancel calls with returns, and added MaybeRequeue method for error handling.

🐇✨
In the code's burrow, deep and wide,
A rabbit tweaked the flow of time.
Retries quick, and contexts passed,
Bugs hop away, the fixes last. 🌟🐾


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
  • Note: Review comments are made on code diffs or files, not on the PR overview.
  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 2496504 and e9a53f9.
Files selected for processing (4)
  • cmd/blobstream/orchestrator/cmd.go (1 hunks)
  • e2e/scripts/start_orchestrator_after_validator_created.sh (2 hunks)
  • helpers/retrier.go (1 hunks)
  • orchestrator/orchestrator.go (6 hunks)
Additional comments: 12
cmd/blobstream/orchestrator/cmd.go (1)
  • 113-113: The change in the retrier instantiation parameters from 6 attempts with a 1-minute interval to 5 attempts with a 30-second interval is correctly implemented.
e2e/scripts/start_orchestrator_after_validator_created.sh (2)
  • 54-58: The addition of --log.level debug will increase the verbosity of the logs, which is useful for debugging. Ensure that this level of detail is appropriate for the environment where the script will be run, as it may result in large log files.

  • 67-71: The same change has been applied to the else branch of the if-else block, adding --log.level debug. As with the previous hunk, verify that the increased log verbosity is suitable for the environment.

helpers/retrier.go (4)
  • 40-40: The change in logging level from "Info" to "Debug" for retry attempts is consistent with the summary and will reduce log verbosity as intended.

  • 43-43: The log message for a successful retry remains at the "Info" level, which is appropriate for signaling important events like recovery from errors.

  • 46-46: The log message for a failed retry attempt is at the "Debug" level, aligning with the changes to make the logs less verbose.

  • 37-49: The exponential backoff strategy in NextTick could lead to very long delays for high retry counts. Ensure this aligns with the desired behavior, especially given the changes to the retry mechanism in other parts of the system.

orchestrator/orchestrator.go (5)
  • 29-31: The addition of RequeueWindow constant is consistent with the summary and seems to be used correctly in the MaybeRequeue method.

  • 73-87: The Start method correctly accepts a context as an argument, which is a change from the previous version where it created its own context. This is consistent with the summary.

  • 73-88: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [73-111]

The StartNewEventsListener, ProcessNonces, and EnqueueMissingEvents methods correctly accept a context as an argument, which is a change from the previous version where they used a context with cancel. This is consistent with the summary.

  • 254-282: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [240-263]

The ProcessNonces method has been updated to call MaybeRequeue instead of closing signalChan and returning an error, which aligns with the summary and the goal of requeuing failed nonces.

  • 265-278: The MaybeRequeue method has been added and correctly implements the logic to determine whether a failed nonce should be requeued, which aligns with the summary.

cmd/blobstream/orchestrator/cmd.go Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Nov 30, 2023

Codecov Report

Attention: 64 lines in your changes are missing coverage. Please review.

Comparison is base (2496504) 25.91% compared to head (0e19f98) 25.49%.
Report is 2 commits behind head on main.

Files Patch % Lines
relayer/relayer.go 0.00% 44 Missing ⚠️
orchestrator/orchestrator.go 0.00% 20 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #631      +/-   ##
==========================================
- Coverage   25.91%   25.49%   -0.43%     
==========================================
  Files          29       29              
  Lines        3044     3095      +51     
==========================================
  Hits          789      789              
- Misses       2160     2211      +51     
  Partials       95       95              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rach-id rach-id enabled auto-merge (squash) December 4, 2023 13:09
@rach-id rach-id merged commit 4e416dc into celestiaorg:main Dec 4, 2023
17 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request orchestrator orchestrator related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants