Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: encryption at rest configurations aren't respected when creating clusters #79265

Closed
nicktrav opened this issue Apr 1, 2022 · 5 comments · Fixed by #81483
Closed
Assignees
Labels
A-testing Testing tools and infrastructure C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-testeng TestEng Team

Comments

@nicktrav
Copy link
Collaborator

nicktrav commented Apr 1, 2022

Describe the problem

A recent change in #76669 altered the way that tests were opted into using encryption at rest (EAR).

Consider a test that should always run with EAR (like this one). After the refactor, the test is required to explicitly opt into random EAR by setting EncryptAtRandom on the TestSpec (a bit of an API wrinkle, as this test always wants EAR, not randomly).

However, even if that flag were true, EAR is only on 50% of the time (see here), which makes the test flaky.

Expected behavior

A roachtest that always wants EAR should be have its settings respected.

Epic: CRDB-10428
Jira issue: CRDB-14664

@nicktrav nicktrav added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-testing Testing tools and infrastructure T-testeng TestEng Team labels Apr 1, 2022
@nicktrav
Copy link
Collaborator Author

nicktrav commented Apr 1, 2022

Related to #78125.

@srosenberg
Copy link
Member

The current logic is still too convoluted,

  • encrypt is a global and defaults to random (for nightlies)
  • we use it to set encAtRest [1] per cluster
  • then, we use encAtRest to override startOpts.RoachprodOpts.EncryptedStores for the same cluster

It bears simplification. Why not lift encAtRest into TestSpec where EncryptAtRandom already lives?

[1] https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/test_runner.go#L602
[2] https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/cluster.go#L1796

@tbg
Copy link
Member

tbg commented May 2, 2022

encrypt is not only a global variable, it is a flag. This is so that you can run roachtest run --encrypt ....

It bears simplification. Why not lift encAtRest into TestSpec where EncryptAtRandom already lives?

because then you can't run roachtest run --encrypt=X any more, since now the encryption is hard-coded per test. Maybe that is fine - we don't seem to be using the --encrypt flag anywhere - but at least when running the test it is highly desirable to stick to one. It would be annoying to have to repro a test that specifies random encryption but only fails in one of the two cases. A local code modification would be necessary and that would create hoops to jump through in order to let CI stress the test for you.

I think there generally is merit to letting tests decide (and we do already do that until I guess I broke it with this issue), as long as it's deterministic. Maybe what we ought to do is to let the test decide between

  • allow both
  • allow only on (tests that verify something about encryption at rest would chose this, even if they could run without encryption, for example we would not want to run tpcc/enc=true/... without encryption since the name says it's encrypted)
  • allow only off

where the actual determination of randomness (if supported) is made on a test-by-test basis behind roachtest run --encrypt=random, which CI would have to pass automatically. (with "on", tests that don't support encryption would be skipped; with off vice versa).

nicktrav added a commit to nicktrav/cockroach that referenced this issue May 5, 2022
The encryption roachtests are consistently failing, due to the issues
mentioned in cockroachdb#79265.

Mark as skipped until that issue is resolved.

Release note: None.
craig bot pushed a commit that referenced this issue May 5, 2022
81054: roachtest: temporarily skip encryption test r=jbowens a=nicktrav

The encryption roachtests are consistently failing, due to the issues
mentioned in #79265.

Mark as skipped until that issue is resolved.

Release note: None.

Co-authored-by: Nick Travers <travers@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this issue May 5, 2022
The encryption roachtests are consistently failing, due to the issues
mentioned in #79265.

Mark as skipped until that issue is resolved.

Release note: None.
blathers-crl bot pushed a commit that referenced this issue May 5, 2022
The encryption roachtests are consistently failing, due to the issues
mentioned in #79265.

Mark as skipped until that issue is resolved.

Release note: None.
@renatolabs
Copy link
Contributor

@tbg's suggestion makes sense to me. If anyone has another suggestion they'd like to discuss, let me know.

@renatolabs
Copy link
Contributor

@tbg and I talked about this issue earlier today, and I'm documenting some of the things we talked about, and what our conclusion was.

TL;DR: --encrypt will no longer support true|false as it currently does. Instead, the only two possible values will be --encrypt=auto and --encrypt=random. The latter will be used in CI (as is the case today), and the former will be mostly used by developers (and is the default).

Behavior-wise, --auto means that if a test supports encryption, encryption will be enabled (and the reverse for tests that don't support encryption). random means we'll randomly select whether encryption is enabled for tests that indicate they support both.

If a dev wants to force encryption on/off on a test that allows both, they can checkout the relevant branch, change the TestSpec accordingly, and run roachtest with --encrypt=auto.


Longer description:

Previously (before #76669), --encrypt=random would try to enable encryption on the cluster randomly; tests that required encryption to be always enabled or disabled would then set RoachprodOpts.EncryptedStores (or RoachtestOpts.DontEncrypt) manually at some point during the test (as was the case in the test @nicktrav mentioned).

In #76669, Tobi then tried to reconcile the --encrypt flag with what tests declare by making sure the --encrypt flag has the final say on whether encryption is enabled on the cluster or not. That broke Nick's test because it relied on encryption always being enabled.

Some other approaches we considered to deal with this were:

  • --encrypt=true and --encrypt=random would automatically skip tests that can only run with encryption disabled (and vice-versa for --encrypt=false). This was the original plan, but the downside is that we would then need to invoke roachtest twice in CI scripts, which we don't want.

  • --encrypt=true means that roachtest will fail if the filter passed includes tests that don't support encryption (and vice-versa for --encrypt=false). The main reason we don't support this is that we don't foresee a lot of use for this feature, so we might not want to maintain it. For cases where the we do want to force encryption on tests that support both, the branch can be checked out and modified locally. We could add support for this if at some point doing this type of manual change becomes common.

renatolabs added a commit to renatolabs/cockroach that referenced this issue May 18, 2022
This commit introduces the `EncryptionSupport` field to `TestSpec`. By
setting this field, tests can declare whether they support being run
in a cluster with encryption enabled. Three options are available:

- EncryptionAllowed: test can run with or without encryption.
- EncryptionRequired: test can only run if encryption is enabled.
- EncryptionDisabled: test can only run if encryption is disabled (default).

This replaces the `EncryptAtRandom` field added earlier, which only
allowed tests to opt-in to random encryption.

In addition to this change, the values accepted by the `--encrypt`
flag in roachtest also changes. The only two valid values are:

- `--encrypt=auto` (default). Tests that support encryption
(EncryptionAllowed) will run with encryption enabled.
- `--encrypt=random` (CI). Tests that support encryption
(EncryptionAllowed) will run on a cluster which may or may
not have encryption enabled.

The `EncryptionSupport` field should be the only way tests indicate
their relationship with encryption. The `DontEncrypt` field was
removed (it was not being used anymore at the time of this commit),
and all tests that set `EncryptedStores` directly were updated to use
`EncryptionSupport` accordingly.

Resolves: cockroachdb#79265.

Release note: None
renatolabs added a commit that referenced this issue May 19, 2022
This commit introduces the `EncryptionSupport` field to `TestSpec`. By
setting this field, tests can declare whether they support being run
in a cluster with encryption enabled. Three options are available:

- EncryptionAlwaysDisabled: test can only run if encryption is
  disabled (default).
- EncryptionAlwaysEnabled: test can only run if encryption is enabled.
- EncryptionMetamorphic: tests can opt-in to metamorphic encryption;
  the probability of these tests running with encryption enabled is
  controlled by a command flag.

This replaces the `EncryptAtRandom` field added earlier, which only
allowed tests to opt-in to random encryption.

In addition to this change, the `--encrypt` flag is superseded by a
new `--metamorphic-encryption-probability` flag. As the name implies,
this flag controls the probability of enabling encryption for tests
that opted-in to metamorphic encryption. By default, this value is 1,
meaning that tests with EncryptionMetamorphic set will always run with
encryption enabled. To run these tests locally with encryption
disabled, developers can call roachtest passing
--metamorphic-encryption-probability=0.

The `EncryptionSupport` field should be the only way tests indicate
their relationship with encryption. The `DontEncrypt` field was
removed (it was not being used anymore at the time of this commit),
and all tests that set `EncryptedStores` directly were updated to use
`EncryptionSupport` accordingly.

Resolves: #79265.

Release note: None
craig bot pushed a commit that referenced this issue May 20, 2022
81104: opt: disable normalization rules when building lookup expressions r=mgartner a=mgartner

Previously, normalization rules applied during construction of lookup
expression could result in non-canonical lookup expressions. The
execution is unable to generate lookup spans for non-canonical lookup
expressions, so the query resulted in an internal error "unable to
vectorize execution plan: unhandled expression type".

For example, when building the canonical equality `col = false`, the
normalization rule `FoldEqFalse` transforms it into `NOT col`, which is
non-canonical.

This commit fixes the issue by disabling normalization rules when
building lookup expressions.

Fixes #80525

Release note (bug fix): A bug has been fixed that caused errors with the
message "unable to vectorize execution plan: unhandled expression type"
in rare cases. This bug has been present since version 21.2.0.

81483: roachtest: declare encryption support as part of TestSpec r=nicktrav,srosenberg a=renatolabs

This commit introduces the `EncryptionSupport` field to `TestSpec`. By
setting this field, tests can declare whether they support being run
in a cluster with encryption enabled. Three options are available:

- EncryptionAllowed: test can run with or without encryption.
- EncryptionRequired: test can only run if encryption is enabled.
- EncryptionDisabled: test can only run if encryption is disabled (default).

This replaces the `EncryptAtRandom` field added earlier, which only
allowed tests to opt-in to random encryption.

In addition to this change, the values accepted by the `--encrypt`
flag in roachtest also changes. The only two valid values are:

- `--encrypt=auto` (default). Tests that support encryption
(EncryptionAllowed) will run with encryption enabled.
- `--encrypt=random` (CI). Tests that support encryption
(EncryptionAllowed) will run on a cluster which may or may
not have encryption enabled.

The `EncryptionSupport` field should be the only way tests indicate
their relationship with encryption. The `DontEncrypt` field was
removed (it was not being used anymore at the time of this commit),
and all tests that set `EncryptedStores` directly were updated to use
`EncryptionSupport` accordingly.

Resolves: #79265.

Release note: None

****

This commit should be sufficient for us to re-enable `@nicktrav's` tests that originated this issue (#79265). Should I do it in this commit/PR or is that generally done separately once the fix is merged?

81562: colexec: add more redundancy to releasing disk resources r=yuzefovich a=yuzefovich

As a couple of recently-found issues showed, making sure that all disk
resources are released can be tricky since disk-backed operators can
form large graphs with multiple external operators supporting a single
operation. This commit makes the release of disk resources more
bullet-proof by auditing all users of the vectorized disk queues to make
sure they are added to `OpWithMetaInfo.ToClose` which are closed on the
flow cleanup. Since `Close` can be safely called multiple times, it adds
some redundancy, leaning on the side of caution.

In particular, the following changes are made:
- external distinct and external hash aggregators are explicitly added
to `ToClose` slice. They should already be now closed by the
`diskSpillerBase`, but it doesn't hurt closing them explicitly.
- window aggregator operator has been refactored so that it doesn't
throw an error in its `Close` method - with the previous version it was
possible to panic during the `Close` execution and possibly leak some
resources.
- signatures of the constructor methods have been adjusted to return
`ClosableOperator` to make the need for closing be more explicit.
- each router output is now a `Closer` and the consumer of each output
is now resposible for closing it. Again, I'm pretty sure that each
output will have been closed by that time the consumer explicitly tries
to close the output, yet there is no harm in closing it twice.

An additional minor cleanup is the removal of the usage of an embedded
context in a couple `Close` implementations given that the function
takes it as an argument.

Release note: None

Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Renato Costa <renato@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@craig craig bot closed this as completed in #81483 May 20, 2022
renatolabs added a commit to renatolabs/cockroach that referenced this issue May 20, 2022
This commit introduces the `EncryptionSupport` field to `TestSpec`. By
setting this field, tests can declare whether they support being run
in a cluster with encryption enabled. Three options are available:

- EncryptionAlwaysDisabled: test can only run if encryption is
  disabled (default).
- EncryptionAlwaysEnabled: test can only run if encryption is enabled.
- EncryptionMetamorphic: tests can opt-in to metamorphic encryption;
  the probability of these tests running with encryption enabled is
  controlled by a command flag.

This replaces the `EncryptAtRandom` field added earlier, which only
allowed tests to opt-in to random encryption.

In addition to this change, the `--encrypt` flag is superseded by a
new `--metamorphic-encryption-probability` flag. As the name implies,
this flag controls the probability of enabling encryption for tests
that opted-in to metamorphic encryption. By default, this value is 1,
meaning that tests with EncryptionMetamorphic set will always run with
encryption enabled. To run these tests locally with encryption
disabled, developers can call roachtest passing
--metamorphic-encryption-probability=0.

The `EncryptionSupport` field should be the only way tests indicate
their relationship with encryption. The `DontEncrypt` field was
removed (it was not being used anymore at the time of this commit),
and all tests that set `EncryptedStores` directly were updated to use
`EncryptionSupport` accordingly.

Resolves: cockroachdb#79265.

Release note: None
andrewbaptist pushed a commit to andrewbaptist/cockroach that referenced this issue May 25, 2022
This commit introduces the `EncryptionSupport` field to `TestSpec`. By
setting this field, tests can declare whether they support being run
in a cluster with encryption enabled. Three options are available:

- EncryptionAlwaysDisabled: test can only run if encryption is
  disabled (default).
- EncryptionAlwaysEnabled: test can only run if encryption is enabled.
- EncryptionMetamorphic: tests can opt-in to metamorphic encryption;
  the probability of these tests running with encryption enabled is
  controlled by a command flag.

This replaces the `EncryptAtRandom` field added earlier, which only
allowed tests to opt-in to random encryption.

In addition to this change, the `--encrypt` flag is superseded by a
new `--metamorphic-encryption-probability` flag. As the name implies,
this flag controls the probability of enabling encryption for tests
that opted-in to metamorphic encryption. By default, this value is 1,
meaning that tests with EncryptionMetamorphic set will always run with
encryption enabled. To run these tests locally with encryption
disabled, developers can call roachtest passing
--metamorphic-encryption-probability=0.

The `EncryptionSupport` field should be the only way tests indicate
their relationship with encryption. The `DontEncrypt` field was
removed (it was not being used anymore at the time of this commit),
and all tests that set `EncryptedStores` directly were updated to use
`EncryptionSupport` accordingly.

Resolves: cockroachdb#79265.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testing Testing tools and infrastructure C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-testeng TestEng Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants