Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit: capture emitted errors from nodes #23582

Merged
merged 3 commits into from
Oct 11, 2023

Conversation

peteski22
Copy link

During processing of audit pipelines for logging requests/responses, if we return errors they are send over a channel and gathered as Warnings on the Status struct.

In the case where the broker.Send returns an error, we should include these errors (warnings) in the message that is logged to the Vault server. This should make debugging issues easier for the Vault operator.

@peteski22 peteski22 added core/audit hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed labels Oct 10, 2023
@peteski22 peteski22 added this to the 1.15.1 milestone Oct 10, 2023
@peteski22 peteski22 marked this pull request as ready for review October 10, 2023 15:24
@github-actions
Copy link

Build Results:
All builds succeeded! ✅

@github-actions
Copy link

CI Results:
All Go tests succeeded! ✅

@peteski22 peteski22 requested a review from a team October 10, 2023 16:56
Copy link
Contributor

@ccapurso ccapurso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not certain how the nested multierror.Append calls will look so you might want to functionally test that out. Looks good to me though!

Copy link
Contributor

@raskchanky raskchanky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like @ccapurso said, I'm not sure how the formatting of the nested multierror appends will look, so probably worth checking that. But the idea makes sense.

@peteski22
Copy link
Author

For the output of the multierror append, append: https://go.dev/play/p/RlrnaU8l9kx

@peteski22 peteski22 merged commit bc64648 into main Oct 11, 2023
104 checks passed
@peteski22 peteski22 deleted the peteski22/audit-failure-capture-node-errors branch October 11, 2023 15:48
marcboudreau pushed a commit that referenced this pull request Oct 11, 2023
…guration (#23547)

* CI: Pre-emptively delete logs dir after cache restore in test-collect-reports (#23600)

* Fix OktaNumberChallenge (#23565)

* remove arg

* changelog

* exclude changelog in verifying doc/ui PRs (#23601)

* Audit: eventlogger sink node reopen on SIGHUP (#23598)

* ensure nodes are asked to reload audit files on SIGHUP

* added changelog

* Capture errors emitted from all nodes during proccessing of audit pipelines (#23582)

* Update security-scan.yml

* Listeners: Redaction only for TCP (#23592)

* redaction should only work for TCP listeners, also fix bug that allowed custom response headers for unix listeners

* fix failing test

* updates from PR feedback

* fix panic when unlocking unlocked user (#23611)

* VAULT-18307: update rotation period for aws static roles on update (#23528)

* add disable_replication_status_endpoints tcp listener config parameter

* add wrapping handler for disabled replication status endpoints setting

* adapt disable_replication_status_endpoints configuration parsing code to refactored parsing code

* refactor configuration parsing code to facilitate testing

* fix a panic when parsing configuration

* update refactored configuration parsing code

* fix merge corruption

* add changelog file

* document new TCP listener configuration parameter

* make sure disable_replication_status_endpoints only has effect on TCP listeners

* use active voice for explanation of disable_replication_status_endpoints

* fix minor merge issue

---------

Co-authored-by: Kuba Wieczorek <kuba.wieczorek@hashicorp.com>
Co-authored-by: Angel Garbarino <Monkeychip@users.noreply.github.com>
Co-authored-by: Hamid Ghaf <83242695+hghaf099@users.noreply.github.com>
Co-authored-by: Peter Wilson <peter.wilson@hashicorp.com>
Co-authored-by: Mark Collao <106274486+mcollao-hc@users.noreply.github.com>
Co-authored-by: davidadeleon <56207066+davidadeleon@users.noreply.github.com>
Co-authored-by: kpcraig <3031348+kpcraig@users.noreply.github.com>
@marcboudreau
Copy link
Contributor

Since I've come back to this PR twice looking for an example of the new error message, I've decided to setup an environment to generate such an error and post it here:

In Vault 1.15.0:

2023-11-02T15:22:47.457Z [ERROR] core: failed to audit response: request_path=auth/token/lookup-self
  error=
  | 1 error occurred:
  | \t* event not processed by enough 'sink' nodes
  |

After this PR:

2023-11-02T15:27:33.732Z [ERROR] core: failed to audit response: request_path=auth/token/lookup-self
  error=
  | 2 errors occurred:
  | \t* event not processed by enough 'sink' nodes
  | \t* event.(FileSink).Process: error writing file for sink: event.(FileSink).log: unable to re-write to file for sink: write /vault/audit/audit.log: no space left on device
  |

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core/audit hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed pr/no-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants