Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors block #3584

Merged
merged 56 commits into from
Dec 3, 2024
Merged

Errors block #3584

merged 56 commits into from
Dec 3, 2024

Conversation

denis256
Copy link
Member

@denis256 denis256 commented Nov 20, 2024

Description

Implementation of errors block:

  • add parsing of errors block from HCL
  • handling of errors based on configuration
  • added integration tests
  • removed unused context fields for excludes
  • added error documentation
errors {
    # Retry block for transient errors
    retry "retry_network" {
        retryable_errors = [".*Error: network timeout.*"]
        max_attempts = 3
        sleep_interval_sec = 5
    }

    # Ignore block for non-critical errors
    ignore "ignore_warnings" {
        ignorable_errors = [
            ".*Warning: non-critical issue.*"
        ]
        message = "Ignoring non-critical warnings"
    }
}

RFC: #3134

TODOs

Read the Gruntwork contribution guidelines.

  • Update the docs.
  • Run the relevant tests successfully, including pre-commit checks.
  • Ensure any 3rd party code adheres with our license policy or delete this line if its not applicable.
  • Include release notes. If this PR is backward incompatible, include a migration guide.

Release Notes (draft)

Added / Removed / Updated [X].

Migration Guide

@denis256 denis256 changed the title WIP: Errors block Errors block Dec 1, 2024
@denis256 denis256 marked this pull request as ready for review December 1, 2024 20:19
Ignore map[string]*IgnoreConfig
}

// RetryConfig represents the configuration for retrying specific errors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// RetryConfig represents the configuration for retrying specific errors
// RetryConfig represents the configuration for retrying specific errors.

strict-list complains about a comment without a dot at the end.

)

// Sleep before retry
time.Sleep(time.Duration(action.RetrySleepSecs) * time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
time.Sleep(time.Duration(action.RetrySleepSecs) * time.Second)
select {
case <-time.After(time.Duration(action.RetrySleepSecs) * time.Second):
// try again
case <-ctx.Done():
return errors.New(ctx.Err())
}

Let's imagine that if an user sets RetrySleepSecs to say 60 seconds and RetryAttempts to 4 times and then wants to terminate the application by pressing ctrl-c, nothing will happen for 240 seconds. To prevent this, we need to listen to the context in case it was cancelled.

@denis256 denis256 requested a review from levkohimins December 3, 2024 17:22
levkohimins
levkohimins previously approved these changes Dec 3, 2024
- `message` (Optional): A warning message displayed when an error is ignored.
- Example: `"Ignoring safe-to-ignore errors"`.
- `signals` (Optional): Key-value pairs used to emit signals to external systems.
- Example: `safe_to_revert = true` indicates it is safe to revert the operation if it fails.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add a note here indicating what this does.

e.g.

Populating values into the `signals` attribute results in a JSON file named `error-signals.json` being emitted on failure. This file can be useful to inspect in CI/CD systems to determine what the recommended course of action is to address the failure.

In this example, an error has occurred, the author of the unit has signaled that it is safe to revert adjustments to this unit. A CI/CD system could have a standard process of finding all units with files named `error-signals.json`, then checkout the previous commit and apply those units in that state, reverting their updates.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated docs

options/options.go Outdated Show resolved Hide resolved
options/options.go Outdated Show resolved Hide resolved
options/options.go Outdated Show resolved Hide resolved
@denis256 denis256 requested a review from yhakbar December 3, 2024 20:14
@denis256 denis256 merged commit ba1e4b5 into main Dec 3, 2024
5 of 6 checks passed
@denis256 denis256 deleted the tg-262-error-blocks branch December 3, 2024 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants