perf: Implement compiler sharding #202

Merged 75 commits on Mar 15, 2022

willbeason
Member

@willbeason willbeason commented Mar 2, 2022

This is a massive PR - you may need more than one sitting to consume it.

You may find it helpful to skim https://docs.google.com/document/d/1ibCxaI-7HyWyDjQNL4iDRMHnrauufIMJ1D6LwIKdlsI/edit?usp=sharing again - the high-level design presented in the doc is largely unchanged.

Notable design changes:

  • Queries are run per-Review per-Template. Previously, a single query was run to review an object or to run audit from cache.
  • Audit from cache is removed. It isn't efficient to unmarshal everything from data.inventory to run Go matching logic, and in-rego matching logic costs too much time to run per-Template queries. Consumers of frameworks (e.g. Gatekeeper) will need to implement their own audit-from-cache. There are options for maintaining audit-from-cache in frameworks while keeping these performance gains, but they aren't pretty.
  • Most built-in Rego code is removed - much of this orchestration has been moved to Go for performance reasons. The small amount that remains is required so that users don't have to add it as boilerplate to their ConstraintTemplates.
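
To make the first bullet concrete, here is a minimal sketch of running one query per Template for a single review. The names `templateQuery` and `reviewAll` are hypothetical and are not the framework's API; the real Driver evaluates compiled Rego, which is replaced here by plain Go functions.

```go
package main

import (
	"fmt"
	"sort"
)

// templateQuery is a hypothetical stand-in for a compiled per-Template
// query. The real Driver evaluates Rego; here it is a plain Go function.
type templateQuery func(review map[string]string) []string

// reviewAll runs one query per Template against a single review object and
// collects the violations, instead of issuing one query that covers all
// Templates at once.
func reviewAll(queries map[string]templateQuery, review map[string]string) []string {
	// Sort Template names so the result order is deterministic.
	names := make([]string, 0, len(queries))
	for name := range queries {
		names = append(names, name)
	}
	sort.Strings(names)

	var violations []string
	for _, name := range names {
		for _, msg := range queries[name](review) {
			violations = append(violations, name+": "+msg)
		}
	}
	return violations
}

func main() {
	queries := map[string]templateQuery{
		"RequiredLabels": func(r map[string]string) []string {
			if r["owner"] == "" {
				return []string{"missing label owner"}
			}
			return nil
		},
		"AllowedKinds": func(r map[string]string) []string { return nil },
	}
	fmt.Println(reviewAll(queries, map[string]string{"kind": "Pod"}))
}
```

Because each Template's query is independent in this shape, a Template whose Constraints don't match the object can be skipped without touching the others.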

Refactoring changes (non-behavioral)

  • AllowedDataFields is moved to Driver as Client doesn't do anything with this data.
  • Operations on Client are handled per-Template, per-Constraint, per-Target. In the future we could leverage this to allow for more highly-parallel mutations of the set of Constraints and Templates at once, but for now we're able to make updates to Client so quickly that this isn't an issue (discussion incoming with benchmarks). Mechanically, this is why we now have "client.go", "template_client.go", and "constraint_client.go" - each has concerns applicable to that level, and this allows separating out logic and reduces mixing layers of abstraction.
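
A rough sketch of that layering, assuming a simplified model in which each level only owns state for its own level. The type and method names below (`client`, `templateClient`, `constraintClient`, `addTemplate`, `addConstraint`) are illustrative and do not reproduce the actual code in this PR.

```go
package main

import "fmt"

// constraintClient holds state for a single Constraint (hypothetical).
type constraintClient struct {
	name string
}

// templateClient holds state for one Template and the Constraints
// created from it (hypothetical).
type templateClient struct {
	kind        string
	constraints map[string]*constraintClient
}

// client only routes each operation to the templateClient it belongs to.
type client struct {
	templates map[string]*templateClient
}

func newClient() *client {
	return &client{templates: make(map[string]*templateClient)}
}

// addTemplate registers a Template kind if it is not already present.
func (c *client) addTemplate(kind string) {
	if _, ok := c.templates[kind]; !ok {
		c.templates[kind] = &templateClient{
			kind:        kind,
			constraints: make(map[string]*constraintClient),
		}
	}
}

// addConstraint touches only the state of the Template it belongs to.
func (c *client) addConstraint(kind, name string) error {
	t, ok := c.templates[kind]
	if !ok {
		return fmt.Errorf("no template for kind %q", kind)
	}
	t.constraints[name] = &constraintClient{name: name}
	return nil
}

func main() {
	c := newClient()
	c.addTemplate("RequiredLabels")
	fmt.Println(c.addConstraint("RequiredLabels", "must-have-owner"))
	fmt.Println(c.addConstraint("Unknown", "orphan"))
}
```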

I've refrained from allowing much more parallelization in AddConstraint/AddTemplate operations (as in - adding Constraints and Templates simultaneously, or adding Constraints for different Templates simultaneously). While AddConstraint can be made more parallel, it comes at the cost of a lot of complexity and the absolute gains are not meaningful for known use cases (~5,000 Constraints added per second vs. ~20,000 Constraints added per second).

Update: Benchmark data is available here: https://docs.google.com/spreadsheets/d/1WYLMDdr9w9QwF9NGYvF1bzrYrKwF52Efj4bl7jaGMqA/edit?usp=sharing

AddTemplate

Ignore multithreaded stats (crossed out). We aren't going to do this in this PR as it would hold this up. In the future, we can get an additional ~5x speedup in template compilation on top of what is shown here (e.g. 20x to 100x faster).

Compiling individual ConstraintTemplates is 2x-3x faster than before, from 4-5ms to 1.5-2ms.

Compiling 200 ConstraintTemplates is 10x-20x faster than before, from 3.4s-11s to 0.3s-0.5s.

There is improvement for all tested cases. Tests were run with both simple and complex Rego code in Templates, so it is reasonable to assume all cases are improved.

AddConstraint

AddConstraint is ~2x faster than before, so it is theoretically possible to add ~30,000 Constraints to a Gatekeeper installation per second. This is unlikely to be a bottleneck in any use case, so further improvements aren't necessary.

Review

Review is complex.

TLDR: For most use cases, there are significant query improvements. The CPU cost of filtering out objects (and so not running Constraints) is dropped by 20x or more, so users should expect improvements in nearly all cases.

Note that for the data in the review benchmarks, I ran queries where every Constraint returned the same result. So "10 Templates & 100 Constraints / Success" means that all 100 Constraints are run against every object, and every Constraint resulted in no violation. These individual cases do not match reality, where most objects are not matched by most Constraints. However, by testing these extreme cases individually, you can construct the expected improvement for various distributions of successes/failures/filters/autorejections.

I've taken data when running queries serially (in 1 thread) and in parallel (in 12 threads). The runtime data on each page of the linked benchmark data measures throughput, not CPU time spent. In single-threaded mode these are equivalent, but this is not the case for multithreaded runs. For example, autorejecting a query for 1 Template/10 Constraints takes 18us in one thread, but only 4.3us with 12 threads. Effectively, with 12 threads queries can be rejected at a rate of (1/4.3us) = ~233,000/second, whereas in a single thread the throughput is (1/18us) = ~56,000/second.

Note that the throughput improvements are universally greater than the CPU improvements. This is largely due to less read locking within Driver - all state is stored in the Rego storage (which has its own locking), so no synchronization is required directly by Driver.

For Constraints which are filtered out against a Review, or which autoreject a Review, speed is universally increased 10x-100x. Since this logic entirely happens in Go instead of Rego, we see a huge performance increase.
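
As a sketch of why this is cheap: if matching is decided in Go first, only the Constraints that survive the filter need a Rego query at all. The example below assumes a heavily simplified Constraint that matches on kind only; the `constraint` type and `matching` function are hypothetical, and autorejection is not modelled.

```go
package main

import "fmt"

// constraint is a hypothetical, simplified Constraint that matches
// objects by kind only.
type constraint struct {
	name  string
	kinds map[string]bool
}

// matching returns the Constraints whose match criteria accept the given
// kind. Only these would go on to a (much more expensive) Rego query.
func matching(cs []constraint, kind string) []constraint {
	var out []constraint
	for _, c := range cs {
		if c.kinds[kind] {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	cs := []constraint{
		{name: "pods-need-owner", kinds: map[string]bool{"Pod": true}},
		{name: "services-need-team", kinds: map[string]bool{"Service": true}},
	}
	for _, c := range matching(cs, "Pod") {
		fmt.Println("would query:", c.name)
	}
}
```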

Otherwise, there is generally a 2x-3x speedup in Reviews where all Constraints are run for every object. The exception is the case where:

  • there are many Templates
  • there is exactly one Constraint per Template
  • every Constraint matches every object on the cluster

This case is pathological - there shouldn't need to be so many Templates with a single Constraint, each matching every object in a Cluster. At these scales query time is still very reasonable - for example 18ms for 100 Constraints/100 Templates. We wouldn't expect there to be noticeable problems/degradation until a user had ~1,000 Constraints/1,000 Templates, when queries start to take longer than 200ms.

Will Beason added 26 commits March 2, 2022 14:01
Signed-off-by: Will Beason <willbeason@google.com>
Externs should be an argument passed to Driver on initialization.

Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Hopefully two reminders make it so that if I forget, it'll be obvious to
code reviewers that I made a mistake.

Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Also one query per object/constraint for Audit.

Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Now Driver only maintains state for Templates and cached objects for
referential constraints.

Signed-off-by: Will Beason <willbeason@google.com>
Client and Driver are fully usable after calling New()

Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Just pass path elements instead - this eliminates a lot of complexity
around both writing paths from slices of strings and back to slices of
strings.

Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
This is much more efficient than constantly marshalling/unmarshalling
the objects.

Also remove "Resource" since it is redundant

Signed-off-by: Will Beason <willbeason@google.com>
Will Beason added 4 commits March 9, 2022 14:40
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
It can't exist since it's a new storage.

Signed-off-by: Will Beason <willbeason@google.com>
Prevent more inconsistent states

Signed-off-by: Will Beason <willbeason@google.com>
Member

@sozercan sozercan left a comment

Thank you!! LGTM

sozercan commented Mar 11, 2022

@willbeason looks like unit tests were timing out, re-ran it (edit: timed out again). do we need to increase the timeout value or is it a problem with code/tests?

here's the full output

go test ./pkg/... -coverprofile cover.out
?       github.com/open-policy-agent/frameworks/constraint/pkg/apis     [no test files]
?       github.com/open-policy-agent/frameworks/constraint/pkg/apis/constraints [no test files]
ok      github.com/open-policy-agent/frameworks/constraint/pkg/apis/externaldata/v1alpha1       6.551s  coverage: 2.8% of statements
?       github.com/open-policy-agent/frameworks/constraint/pkg/apis/templates   [no test files]
ok      github.com/open-policy-agent/frameworks/constraint/pkg/apis/templates/v1        6.532s  coverage: 45.9% of statements
ok      github.com/open-policy-agent/frameworks/constraint/pkg/apis/templates/v1alpha1  6.438s  coverage: 45.9% of statements
ok      github.com/open-policy-agent/frameworks/constraint/pkg/apis/templates/v1beta1   6.498s  coverage: 45.9% of statements
coverage: 33.2% of statements
panic: test timed out after 10m0s

goroutine 107 [running]:
testing.(*M).startAlarm.func1()
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1788 +0x8e
created by time.goFunc
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/time/sleep.go:180 +0x31

goroutine 1 [chan receive, 10 minutes]:
testing.(*T).Run(0xc0005824e0, {0x14ada09, 0x46f053}, 0x152bf18)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1307 +0x375
testing.runTests.func1(0xc0005824e0)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1598 +0x6e
testing.tRunner(0xc0005824e0, 0xc00073fce0)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1259 +0x102
testing.runTests(0xc000050200, {0x219f220, 0x18, 0x18}, {0x48daed, 0x14a2da8, 0x21b7ec0})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1596 +0x43f
testing.(*M).Run(0xc000050200)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1504 +0x51d
main.main()
        _testmain.go:161 +0x1f5

goroutine 19 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x0)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/vendor/k8s.io/klog/v2/klog.go:1283 +0x6a
created by k8s.io/klog/v2.init.0
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/vendor/k8s.io/klog/v2/klog.go:420 +0xfb

goroutine 12 [chan receive]:
github.com/golang/glog.(*loggingT).flushDaemon(0xc00018d740)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/vendor/github.com/golang/glog/glog.go:882 +0x6a
created by github.com/golang/glog.init.0
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/vendor/github.com/golang/glog/glog.go:410 +0x1c5

goroutine 103 [chan receive, 10 minutes]:
testing.(*T).Run(0xc000603380, {0x14a0a6c, 0x149ef92}, 0xc0002cac80)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1307 +0x375
github.com/open-policy-agent/frameworks/constraint/pkg/client_test.TestClient_AddTemplate(0xc000603380)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/client_test.go:376 +0x189c
testing.tRunner(0xc000603380, 0x152bf18)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1259 +0x102
created by testing.(*T).Run
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1306 +0x35a

goroutine 104 [semacquire, 10 minutes]:
sync.runtime_SemacquireMutex(0xc000296400, 0x60, 0xc0001c1660)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/runtime/sema.go:71 +0x25
sync.(*RWMutex).RLock(...)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/sync/rwmutex.go:63
github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local.(*Driver).getStorage(0xc000225200, {0x16a5060, 0xc0001aa000}, {0x149ef92, 0xb})
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local/driver.go:526 +0x97
github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local.(*Driver).removeData(0x12ddba0, {0x16a5060, 0xc0001aa000}, {0x149ef92, 0xc0001c1680}, {0xc000358360, 0x2, 0x2})
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local/driver.go:494 +0x5b
github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local.(*Driver).RemoveTemplate(0xc000225200, {0x16a5060, 0xc0001aa000}, 0x0)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local/driver.go:105 +0x1d7
github.com/open-policy-agent/frameworks/constraint/pkg/client.(*Client).RemoveTemplate(0xc00013fb00, {0x16a5060, 0xc0001aa000}, 0xc000156180)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/client.go:184 +0x155
github.com/open-policy-agent/frameworks/constraint/pkg/client_test.TestClient_AddTemplate.func1(0xc000603520)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/client_test.go:438 +0x737
testing.tRunner(0xc000603520, 0xc0002cac80)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1259 +0x102
created by testing.(*T).Run
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1306 +0x35a
FAIL    github.com/open-policy-agent/frameworks/constraint/pkg/client   600.016s
?       github.com/open-policy-agent/frameworks/constraint/pkg/client/clienttest        [no test files]
?       github.com/open-policy-agent/frameworks/constraint/pkg/client/clienttest/cts    [no test files]
ok      github.com/open-policy-agent/frameworks/constraint/pkg/client/crds      0.020s  coverage: 89.6% of statements
?       github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers   [no test files]
coverage: 49.1% of statements
panic: test timed out after 10m0s

goroutine 83 [running]:
testing.(*M).startAlarm.func1()
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1788 +0x8e
created by time.goFunc
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/time/sleep.go:180 +0x31

goroutine 1 [chan receive, 9 minutes]:
testing.(*T).Run(0xc0000c4680, {0x130c5e3, 0x46f053}, 0x1380fb8)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1307 +0x375
testing.runTests.func1(0xc0000c4680)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1598 +0x6e
testing.tRunner(0xc0000c4680, 0xc00073fce0)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1259 +0x102
testing.runTests(0xc000416080, {0x1e80000, 0x9, 0x9}, {0x48daed, 0x12fdb33, 0x1e99860})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1596 +0x43f
testing.(*M).Run(0xc000416080)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1504 +0x51d
main.main()
        _testmain.go:117 +0x1f5

goroutine 19 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x0)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/vendor/k8s.io/klog/v2/klog.go:1283 +0x6a
created by k8s.io/klog/v2.init.0
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/vendor/k8s.io/klog/v2/klog.go:420 +0xfb

goroutine 20 [chan receive]:
github.com/golang/glog.(*loggingT).flushDaemon(0x0)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/vendor/github.com/golang/glog/glog.go:882 +0x6a
created by github.com/golang/glog.init.0
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/vendor/github.com/golang/glog/glog.go:410 +0x1c5

goroutine 16 [chan receive, 9 minutes]:
testing.(*T).Run(0xc0006031e0, {0x12fdc4b, 0xc0001404a0}, 0xc0002a5680)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1307 +0x375
github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local.TestDriver_RemoveTemplates(0x0)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local/driver_unit_test.go:158 +0x1dd
testing.tRunner(0xc0006031e0, 0x1380fb8)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1259 +0x102
created by testing.(*T).Run
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1306 +0x35a

goroutine 82 [semacquire, 9 minutes]:
sync.runtime_SemacquireMutex(0x12, 0x0, 0x8)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/runtime/sema.go:71 +0x25
sync.(*RWMutex).RLock(...)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/sync/rwmutex.go:63
github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local.(*Driver).getStorage(0xc0001b6cf0, {0x14cd8e0, 0xc000140000}, {0x12f0a22, 0x3})
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local/driver.go:526 +0xa5
github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local.(*Driver).removeData(0x1162260, {0x14cd8e0, 0xc000140000}, {0x12f0a22, 0xc0001b6cf0}, {0xc00052db20, 0x2, 0x2})
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local/driver.go:494 +0x65
github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local.(*Driver).RemoveTemplate(0xc0001b6cf0, {0x14cd8e0, 0xc000140000}, 0x0)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local/driver.go:105 +0x1ea
github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local.TestDriver_RemoveTemplates.func1(0xc000603380)
        /home/sozercan/go/src/github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local/driver_unit_test.go:176 +0x378
testing.tRunner(0xc000603380, 0xc0002a5680)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1259 +0x102
created by testing.(*T).Run
        /home/linuxbrew/.linuxbrew/Cellar/go/1.17.8/libexec/src/testing/testing.go:1306 +0x35a
FAIL    github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/local     600.019s
ok      github.com/open-policy-agent/frameworks/constraint/pkg/client/drivers/remote    0.010s  coverage: 19.3% of statements
?       github.com/open-policy-agent/frameworks/constraint/pkg/client/errors    [no test files]
ok      github.com/open-policy-agent/frameworks/constraint/pkg/core/constraints 0.044s  coverage: 100.0% of statements
?       github.com/open-policy-agent/frameworks/constraint/pkg/core/templates   [no test files]
ok      github.com/open-policy-agent/frameworks/constraint/pkg/externaldata     0.025s  coverage: 61.4% of statements
?       github.com/open-policy-agent/frameworks/constraint/pkg/handler  [no test files]
?       github.com/open-policy-agent/frameworks/constraint/pkg/handler/handlertest      [no test files]
ok      github.com/open-policy-agent/frameworks/constraint/pkg/regorewriter     0.055s  coverage: 71.8% of statements
ok      github.com/open-policy-agent/frameworks/constraint/pkg/schema   0.024s  coverage: 50.0% of statements
ok      github.com/open-policy-agent/frameworks/constraint/pkg/types    0.024s  coverage: 13.6% of statements
FAIL
Makefile:15: recipe for target 'native-test' failed
make: *** [native-test] Error 1

Will Beason added 3 commits March 14, 2022 08:31
This prevents cases where we need mutexes to guard different sets of
information.

Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
Signed-off-by: Will Beason <willbeason@google.com>
@willbeason
Member Author

It was a problem with the code - I didn't lock properly. It's fixed, and I've moved that part of the code to its own class so future devs will be unlikely to make the same mistake I did. Otherwise the logic is unchanged - just mutexes moved around.
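
For readers following along, a sketch of the general idea. The goroutines in the trace are parked in RLock inside getStorage, reached from RemoveTemplate. One way that can happen (an assumption; the thread doesn't say which lock was held) is a method that holds the write lock calling a helper that takes the read lock on the same sync.RWMutex. Putting the data and its mutex behind one small type, where each method locks exactly once, makes that mistake hard to repeat. The `storages` type below is hypothetical and is not the code from this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// storages is a hypothetical type that owns both the data and the mutex
// guarding it, so callers never touch the lock directly.
type storages struct {
	mtx  sync.RWMutex
	data map[string]map[string]string // target -> key -> value
}

func newStorages() *storages {
	return &storages{data: make(map[string]map[string]string)}
}

// put takes the write lock exactly once and calls no other locking method.
func (s *storages) put(target, key, value string) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	if s.data[target] == nil {
		s.data[target] = make(map[string]string)
	}
	s.data[target][key] = value
}

// lookup takes only the read lock.
func (s *storages) lookup(target, key string) (string, bool) {
	s.mtx.RLock()
	defer s.mtx.RUnlock()
	v, ok := s.data[target][key]
	return v, ok
}

// remove takes the write lock once. It edits the map directly instead of
// calling lookup, because sync.RWMutex is not reentrant: taking RLock while
// this goroutine already holds Lock would block forever.
func (s *storages) remove(target, key string) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	delete(s.data[target], key)
}

func main() {
	s := newStorages()
	s.put("admission.k8s", "templates/foo", "compiled")
	s.remove("admission.k8s", "templates/foo")
	_, ok := s.lookup("admission.k8s", "templates/foo")
	fmt.Println("still present:", ok)
}
```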

Signed-off-by: Will Beason <willbeason@google.com>
@willbeason willbeason requested a review from ritazh March 14, 2022 15:56
Member

@ritazh ritazh left a comment

LGTM
