vault: log once per interval if batching revocation #8597

Merged (1 commit) on Aug 6, 2020

Commits on Aug 5, 2020

  1. vault: log once per interval if batching revocation

    This log line should be rare since:
    
    1. Most tokens should be revoked synchronously, not via this async
       batched method. Async revocation only takes place when Vault
       connectivity is lost and after leader election so no revocations are
       missed.
    2. There should rarely be more than one batch (1,000 tokens) to revoke
       since the above conditions should be brief and infrequent.
    3. Interval is 5 minutes, so this log line will be emitted at *most*
       once every 5 minutes.
    
    What makes this log line rare is also what makes it interesting: due to
    a bug prior to Nomad 0.11.2 some tokens may never get revoked. Therefore
    Nomad tries to re-revoke them on every leader election. This caused a
    massive buildup of old tokens that would never be properly revoked and
    purged. Nomad 0.11.3 mostly fixed this but still had a bug in purging
    revoked tokens via Raft (fixed in #8553).
    
    The nomad.vault.distributed_tokens_revoked metric is only ticked upon
    successful revocation and purging, making any bugs or slowness in the
    process difficult to detect.
    
    Logging before a potentially slow revocation+purge operation is
    performed will give users much better indications of what activity is
    going on should the process fail to make it to the metric.
    schmichael committed Aug 5, 2020
    e992785