Putting it all together: Enables us to run rego rules on git contents #825

JAORMX · 2023-09-01T12:07:13Z

This adds the pieces together that we need in order to run rego rules on git content.

It proposes a new example rule that verifies that a repository has CodeQL set up
for static analysis.

To get this working, we had to create two custom rego functions (`file.read` and
`file.exists`) that operate on the memoryfs that we get from the git ingester.

Two bugs were fixed along the way:

* The `clone_url` was reset when registering the webhook.
* We were casting to the wrong type in the git ingester.

jhrozek

Just a bunch of questions because I'm a rego n00b

jhrozek · 2023-09-01T12:27:11Z

database/query/repositories.sql

@@ -60,7 +60,8 @@ webhook_id = $8,
 webhook_url = $9,
 deploy_url = $10, 
 provider = $11,
-clone_url = $12,
+-- set clone_url if the value is not an empty string
+clone_url = CASE WHEN sqlc.arg(clone_url)::text = '' THEN clone_url ELSE sqlc.arg(clone_url)::text END,


Sorry, I'm confused by this CASE. Is this using the current value of `clone_url` if the `clone_url` argument is not empty, and otherwise updating `clone_url` with the argument?
We have existing tests for repositories; could we have a test case for this?

When registering the webhook, the webhook event returns a bunch of repository data which we update the row with. Unfortunately, the clone_url is missing, which caused the row to be updated with an empty clone_url. This ensures we never update it to an empty string.

I'll be honest, I'm not quite sure where to test this.

I was just thinking about a unit test like the ones we have here: https://github.com/stacklok/mediator/blob/main/pkg/db/repositories_test.go#L184

Feel free to split this into a separate ticket; writing a test might be a good first issue for a newcomer.
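
For readers following the `clone_url` thread above, the behaviour of the CASE expression can be sketched in plain Go. This is only an illustration of the semantics described in the reply (an empty incoming value never overwrites the stored one). The function name `resolveCloneURL` and the example URLs are hypothetical and are not part of the mediator code base or its generated sqlc code.

```go
package main

import "fmt"

// resolveCloneURL mirrors the SQL CASE expression from the diff:
// an empty incoming value keeps whatever is already stored, and any
// non-empty value replaces it.
// Hypothetical helper for illustration only; the real logic lives in SQL.
func resolveCloneURL(stored, incoming string) string {
	if incoming == "" {
		return stored
	}
	return incoming
}

func main() {
	stored := "https://example.com/org/repo.git" // made-up URL

	// Webhook registration delivers no clone_url: the stored value survives.
	fmt.Println(resolveCloneURL(stored, ""))

	// A real update with a non-empty value still goes through.
	fmt.Println(resolveCloneURL(stored, "https://example.com/org/renamed.git"))
}
```

A unit test for the query would presumably cover the same two cases: an update with an empty `clone_url` and an update with a non-empty one.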

jhrozek · 2023-09-01T12:29:39Z

examples/github/rule-types/codeql_enabled.yaml

+            some i
+            workflowstr := file.read("./.github/workflows/codeql.yml")
+            workflow := yaml.unmarshal(workflowstr)
+            steps := workflow.jobs.analyze.steps[i]


I'm not familiar with rego. Does this mean "make sure that there exists some entry in `workflow.jobs.analyze.steps` that contains `github/codeql-action/analyze@`"?

That's right.
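
As a rough Go analogy for what `some i` expresses here: the rule succeeds if at least one index `i` satisfies every condition in the body. The quoted hunk does not show how a step is matched, so comparing against a `uses` field is an assumption based on the question above. The `step` type and `hasCodeQLAnalyze` are hypothetical names used only for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// step is a minimal stand-in for one entry of workflow.jobs.analyze.steps.
// Only the `uses` field is modelled; that the rule inspects this field is
// an assumption, since the quoted diff does not show the comparison.
type step struct {
	Uses string
}

// hasCodeQLAnalyze reports whether at least one step references the CodeQL
// analyze action. This is the existential check that `some i` expresses:
// one matching index is enough for the rule to hold.
func hasCodeQLAnalyze(steps []step) bool {
	for _, s := range steps {
		if strings.Contains(s.Uses, "github/codeql-action/analyze@") {
			return true
		}
	}
	return false
}

func main() {
	steps := []step{
		{Uses: "actions/checkout@v3"},
		{Uses: "github/codeql-action/init@v2"},
		{Uses: "github/codeql-action/analyze@v2"},
	}
	fmt.Println("CodeQL analyze step present:", hasCodeQLAnalyze(steps))
}
```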

jhrozek · 2023-09-01T12:34:26Z

examples/github/rule-types/codeql_enabled.yaml

+
+        allow {
+            some i
+            workflowstr := file.read("./.github/workflows/codeql.yml")


Is this calling the function from our library?

What I mean is - are we going to have to provide a bunch of utility functions in the go code to make it possible to use rego evaluations?

Yes. `file.read` does not exist in rego's standard library; I added it and made sure it reads from the in-memory filesystem where we have the git repo cloned.

We'd mostly need to write functions that depend on the constraints and assumptions of our evaluation environment. Most of the rest we could use out of the box: https://www.openpolicyagent.org/docs/latest/policy-reference/#built-in-functions
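
To make the idea concrete, here is a minimal sketch of what helpers behind `file.read` and `file.exists` might do, written against Go's standard `io/fs` interface with `testing/fstest.MapFS` standing in for the in-memory clone. This is not the code from `lib.go`: the real functions are registered with the rego engine and read from the ingester's memoryfs, and the names `readFile`, `fileExists`, and `sampleRepo` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"testing/fstest"
)

// readFile returns the contents of path as a string, roughly what a
// `file.read` builtin would hand back to a policy.
func readFile(fsys fs.FS, path string) (string, error) {
	data, err := fs.ReadFile(fsys, path)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// fileExists reports whether path is present in the filesystem, roughly
// what a `file.exists` builtin would return. A missing file is not an
// error; any other failure is passed on to the caller.
func fileExists(fsys fs.FS, path string) (bool, error) {
	_, err := fs.Stat(fsys, path)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	return false, err
}

// sampleRepo builds a tiny in-memory tree standing in for a cloned repo.
// Note: io/fs paths are unrooted and must not start with "./".
func sampleRepo() fs.FS {
	return fstest.MapFS{
		".github/workflows/codeql.yml": &fstest.MapFile{Data: []byte("name: CodeQL\n")},
	}
}

func main() {
	repo := sampleRepo()

	content, err := readFile(repo, ".github/workflows/codeql.yml")
	fmt.Printf("content=%q err=%v\n", content, err)

	ok, err := fileExists(repo, "README.md")
	fmt.Printf("README.md exists=%v err=%v\n", ok, err)
}
```

The point of the sketch is that these helpers only need a filesystem handle, which is why they have to be supplied by the evaluation environment rather than by rego's built-in library.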

jhrozek · 2023-09-01T12:34:43Z

internal/engine/eval/rego/lib.go

+// in the filesystem being evaluated (which comes from the ingester).
+// It takes one argument, the path to the file to check.
+// It's exposed as `file.exists`.
+func FileExists(res *engif.Result) func(*rego.Rego) {


This one is unused as of now, right?

Nope, it is registered in `mediatorRegoLib` and exposed to policies as `file.exists`.

Oh, if you mean it's unused in a policy, that's right. It's not currently used. At least I didn't provide an example for it yet.

lukehinds

Just echoing @jhrozek: what I know about rego could be written on the back of a matchbox. I love the idea though, and can see how this will allow us to quickly spin up policy.

One question I have, although not enough to block this: would we be able to achieve this with other policy languages (jq, or perhaps Cue)? I am guessing not, which is why rego was selected.

JAORMX · 2023-09-01T12:57:39Z

@lukehinds right, rego gives us a lot more flexibility with what we can do. We could potentially use jq to process a file in a directory (we'd need some modifications to the jq evaluator), but it would be quite limited in that we would need to point to a specific file. With rego we get the flexibility to expand and traverse all files in a directory.

Cue would need to be explored, but it's more of a data representation language than a policy language. We could potentially build a policy representation on top of Cue, but we'd be in the same position we are in now: we'd need to come up with a driver interface and write the drivers.

Potentially we could replace our YAML data structure representations with Cue; that could be explored, but it still wouldn't replace rego in this case.

JAORMX added 2 commits September 1, 2023 15:05

rego: Add unit tests for new library functions

f84e3bc

JAORMX requested a review from jhrozek September 1, 2023 12:17

jhrozek reviewed Sep 1, 2023

lukehinds approved these changes Sep 1, 2023

jhrozek approved these changes Sep 1, 2023

JAORMX merged commit d8e4171 into main Sep 1, 2023
13 checks passed

JAORMX deleted the git-rego-rule branch September 1, 2023 17:47
