fix race condition in string generator helper #19875

Merged
merged 2 commits into from
Mar 31, 2023

fairclothjm
Contributor

A data race in vault/helper/random.(*StringGenerator).validateConfig() was exposed by recently introduced test coverage.

WARNING: DATA RACE
Read at 0x00c005d77000 by goroutine 10158:
  github.com/hashicorp/vault/helper/random.(*StringGenerator).validateConfig()
      /home/circleci/go/src/github.com/hashicorp/vault/helper/random/string_generator.go:243 +0x3e6

and

Previous write at 0x000006f28ae0 by goroutine 10155:
  github.com/hashicorp/vault/helper/random.(*StringGenerator).validateConfig()
      /home/circleci/go/src/github.com/hashicorp/vault/helper/random/string_generator.go:238 +0x293

Failure: https://app.circleci.com/pipelines/github/hashicorp/vault/51983/workflows/d654323c-baa8-4985-9838-a4131f8900f5/jobs/607405/steps

@fairclothjm fairclothjm requested review from a team March 30, 2023 21:17
Comment on lines +124 to +126
g.charsetLock.RLock()
charset := g.charset
g.charsetLock.RUnlock()
Member

Can you explain why locking is needed for setting a variable like this?

Contributor

The lock isn't needed for the write to the local charset variable; it's needed for the read of g.charset.

In validateConfig:
if len(g.charset) == 0 {
    g.charset = getChars(g.Rules)
}

so if validateConfig and generate run in parallel, the read of g.charset races with this write. The assignment isn't an atomic write: a []rune slice is a multi-word header (data pointer, length, capacity), so a concurrent reader could observe a partially updated header (for example, the new pointer paired with the old length), end up pointing at garbage memory, and fail.

In particular, I think that when you first initialize the group and set it up for password rotation with a lot of requests in flight, several of them will try to write this charset while validating the config, while others will already be past that point and reading it in generate.

Or maybe it is parallel modification + use of the policy...

Contributor Author

Thanks @cipherboy! That is a great summary.

@robmonte I am happy to talk more about this if you have more questions. Let me know!

Member

@austingebauer austingebauer left a comment

👍

Contributor

@cipherboy cipherboy left a comment

Makes sense to me!

4 participants