Replace analyze with run-queries and interpret-results #548

edoardopirovano · 2021-06-04T16:48:18Z

Currently, we call database analyze on the database several times, which creates a number of SARIF files that we then have to merge. This PR instead calls database run-queries and then makes a single call to database interpret-results to create one SARIF file (per language).
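As a rough sketch of the new flow (not the code in this PR), the snippet below builds the argument lists for one run-queries call per query suite followed by a single interpret-results call. The helper names, the paths, and the choice to run one suite per run-queries call are illustrative assumptions. Only the `--format`, `--output`, and `--sarif-category` flags are used.

```typescript
// Hypothetical helper: arguments for `codeql database run-queries`.
function runQueriesArgs(databasePath: string, querySuitePath: string): string[] {
  return ["database", "run-queries", databasePath, querySuitePath];
}

// Hypothetical helper: arguments for `codeql database interpret-results`.
function interpretResultsArgs(
  databasePath: string,
  querySuitePaths: string[],
  sarifFile: string,
  automationDetailsId?: string
): string[] {
  const args = [
    "database",
    "interpret-results",
    "--format=sarif-latest",
    `--output=${sarifFile}`,
  ];
  if (automationDetailsId !== undefined) {
    args.push("--sarif-category", automationDetailsId);
  }
  // The database comes first, then every suite whose results should be interpreted.
  args.push(databasePath, ...querySuitePaths);
  return args;
}

// One run-queries invocation per suite, then exactly one interpret-results
// invocation, so only a single SARIF file is produced for the database.
function planCommands(
  databasePath: string,
  querySuitePaths: string[],
  sarifFile: string,
  automationDetailsId?: string
): string[][] {
  const commands = querySuitePaths.map((suite) => runQueriesArgs(databasePath, suite));
  commands.push(
    interpretResultsArgs(databasePath, querySuitePaths, sarifFile, automationDetailsId)
  );
  return commands;
}
```

With two suites this plans three CLI invocations, and only the last one writes SARIF, so there is nothing left to merge.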

Merge / deployment checklist

Confirm this change is backwards compatible with existing workflows.
Confirm the readme has been updated if necessary.
Confirm the changelog has been updated if necessary.

adityasharad · 2021-06-04T18:39:55Z

src/analyze.ts

        );
-        analysisSummaryBuiltIn = stdout;
-        await injectLinesOfCode(sarifFile, language, locPromise);
-
        statusReport[`analyze_builtin_queries_${language}_duration_ms`] =


Note this means the status report no longer includes result interpretation time. I think that's ok.
@robertbrignull should/could we have a separate telemetry category for interpretation, now that it is a separate step?

I've added it in since it seems potentially useful to have.

I don't know if this will break the telemetry consumer. Check with Robert and team.

Sorry I missed this mention of me. What's done here should essentially be fine since the status report endpoint now ignores unknown fields, so it's ok to update the codeql-action first.

One thing though, could you please update the type definition at

codeql-action/src/analyze.ts

Line 26 in 8e36bc2

export interface QueriesStatusReport {

so that it includes the new fields. As you can see in this case that type is really only there for documentation purposes, but if it's there it would be nice if it's correct.
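A minimal sketch of what the requested type update could look like. The `interpret_results_…` field names and the two-language `Language` union are assumptions made for illustration; the real interface in src/analyze.ts lists its own fields.

```typescript
// Hypothetical subset of languages, just for this example.
type Language = "cpp" | "javascript";

// Assumed naming scheme for the new timing fields.
type InterpretDurationKey = `interpret_results_${Language}_duration_ms`;

interface QueriesStatusReport {
  // Existing per-language query timing fields (names taken from the diff above).
  analyze_builtin_queries_cpp_duration_ms?: number;
  analyze_builtin_queries_javascript_duration_ms?: number;
  // New per-language interpretation timing fields (assumed names).
  interpret_results_cpp_duration_ms?: number;
  interpret_results_javascript_duration_ms?: number;
}

// Record how long result interpretation took for one language.
function recordInterpretDuration(
  report: QueriesStatusReport,
  language: Language,
  startedAtMs: number,
  finishedAtMs: number
): void {
  const key = `interpret_results_${language}_duration_ms` as InterpretDurationKey;
  report[key] = finishedAtMs - startedAtMs;
}
```

Keeping the key type next to the interface means a misspelled field name is caught at compile time, even if the endpoint would silently ignore it.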

adityasharad · 2021-06-04T18:42:55Z

src/codeql.ts

+        ...getExtraOptionsFromEnv(["database", "interpret-results"]),
+      ];
+      if (automationDetailsId !== undefined) {
+        args.push("--sarif-category", automationDetailsId);


@aeisenberg does the automationDetailsId passed in here ever end with a run-id component that we want to remove before treating it as the category?

A / is always appended to the --sarif-category if it doesn't already end in one. It's handled in exactly the same way for database analyze as it is for interpret results. So, I think it's fine using the same logic for both.

I'm worried about the opposite direction: is the Action passing an automationDetailsId of the form category/id where we need to strip off the id before passing it to the CLI? If not, there's nothing to see here.

I'm not entirely clear on what automationDetailsId is actually doing, but it is coming from user input:

codeql-action/src/analyze-action.ts

Line 88 in 242fd82

actionsUtil.getOptionalInput("category"),

so I'm not sure we have any guarantees about what's contained in it.

I'm worried about the opposite direction: is the Action passing an automationDetailsId of the form category/id where we need to strip off the id before passing it to the CLI? If not, there's nothing to see here.

There's nothing we need to do. The category is passed to the call to analyze if it is defined. And that category will have a / appended to it if one isn't on it already.

If no category is passed to analyze, then in the upload-results phase, the action will inject a calculated category. This category will always have a / on it. See:

codeql-action/src/upload-lib.ts

Line 54 in 3708898

const automationID = getAutomationID(category, analysis_key, environment);
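For readers unfamiliar with the trailing-slash rule discussed in this thread, here is a small sketch of the normalization as described above. It is an illustration of the described behaviour, not the CLI's or the Action's implementation, and `withTrailingSlash` is a hypothetical name.

```typescript
// Append "/" to a category unless it already ends with one.
function withTrailingSlash(category: string): string {
  return category.endsWith("/") ? category : `${category}/`;
}
```

The function is idempotent, so it does not matter whether the Action or the CLI applies it first.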

adityasharad · 2021-06-04T18:45:36Z

src/analyze.ts

-
-      // Print the LoC baseline and the summary results from database analyze for the standard
-      // query suite and (if appropriate) each custom query suite.
-      logger.startGroup(`Analysis summary for ${language}`);


Looks like you have lost this log group and summary output. You still need to capture the stdout from interpret-results and log it here. The difference from the earlier code is that we only need to log one summary, rather than having separate summaries for builtin and custom queries.

printLinesOfCodeSummary only includes the baseline produced by the Action, not the summary produced by the CLI.

Is it still needed? It looks like the stdout from the interpret-results call is showing up just fine in the Actions logs: https://github.com/github/codeql-action/runs/2748181217?check_suite_focus=true#step:5:2435

My understanding of the previous code was that it was there because otherwise the summary might have been further back in the logs (in particular if any custom queries were run), and we wanted to duplicate it at the end for convenience. Since interpret-results is now always the last thing we do, I think it isn't needed any more.

@aeisenberg Am I right in my understanding or is the logger output also used for something other than the Actions logs that needs to see that summary?

Hmmm...I wasn't aware of the duplication, but that makes sense. Though, now there are a lot of lines that are ungrouped (not exactly sure how it was before). Could you enclose the call to interpret-results in a logger group?

In fact, it would be really nice if we could put run queries in a group as well.

Now that I think about it, though, if you put interpret-results in a group, then the summary table would be hidden, and I think that should be prominently displayed.

What do you think of both grouping interpret-results and duplicating the table outside of the group? I think that would make the logs more readable.

Seems like a good idea! Just pushed a commit with that, let's see what the logs in the PR checks look like but I agree it should be neater 🙂

I think that @AlonaHlobina has blog posts going out that include screenshots of the existing log groupings, so I was trying not to change it. This does make the summary more visible though, which is arguably even better than pointing users to the right group.

It is ok. I can include a new screenshot of it. The changelog is not published yet. Where can I see the new logs?

I've got another PR open that will further change the logs slightly, so if you need a screenshot now I would look at the run logs from that PR's checks, e.g. https://github.com/github/codeql-action/runs/2782645280?check_suite_focus=true#step:7:309

When that's merged to main (probably later today) the CodeQL Analysis workflows on semmle-code will also start producing the latest logs here: https://github.com/github/semmle-code/actions/workflows/codeql-analysis.yml
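The layout agreed on in this thread (interpret-results inside a log group, with the summary repeated outside it) can be sketched roughly as follows. The `Logger` interface, the group title, and the `interpret` callback are assumptions made for this example; the callback stands in for the CLI call and returns its captured stdout.

```typescript
// Minimal logger shape assumed for this sketch.
interface Logger {
  info(message: string): void;
  startGroup(name: string): void;
  endGroup(): void;
}

// Run interpretation inside a collapsible group, then repeat the summary
// outside the group so it stays visible when the group is collapsed.
function interpretWithGroupedLogs(
  logger: Logger,
  language: string,
  interpret: () => string
): string {
  let summary = "";
  logger.startGroup(`Interpreting results for ${language}`);
  try {
    summary = interpret();
  } finally {
    // Always close the group, even if interpretation throws.
    logger.endGroup();
  }
  logger.info(summary);
  return summary;
}
```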

adityasharad · 2021-06-04T18:47:23Z

src/codeql.ts

+      let output = "";
+      await new toolrunner.ToolRunner(cmd, args, {
+        listeners: {
+          stdout: (data: Buffer) => {
+            output += data.toString("utf8");
+          },
+        },
+      }).exec();
+      return output;


We're not using stdout from run-queries, so no need to capture or return it.

Good point, done.

aeisenberg · 2021-06-04T18:53:51Z

src/analyze.test.ts

          searchPath: string | undefined
+        ) => {
+          searchPathsUsed.push(searchPath!);


If you change searchPathsUsed to have type (string | undefined)[] I think you could avoid using the ! type assertion.

Thanks, done.
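A small illustration of that suggestion, assuming the test only needs to record whatever search path it was given. `recordSearchPath` is a hypothetical stand-in for the stubbed callback in the test.

```typescript
// Allow undefined entries so no non-null assertion (!) is needed when pushing.
const searchPathsUsed: (string | undefined)[] = [];

function recordSearchPath(searchPath: string | undefined): void {
  searchPathsUsed.push(searchPath);
}
```

This also lets the test assert directly on the cases where no search path was passed.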

aeisenberg · 2021-06-04T18:56:59Z

src/analyze.ts

@@ -253,13 +225,29 @@ export async function runQueries(

  return statusReport;

+  async function runInterpretResults(
+    language: Language,
+    querySuites: string[],


nit: these are paths, right? Could you change the name to reflect that?

Suggested change

querySuites: string[],

querySuitePaths: string[],

Yep, changed here and elsewhere.

aeisenberg · 2021-06-04T19:01:28Z

src/codeql.ts

+        ...getExtraOptionsFromEnv(["database", "interpret-results"]),
+      ];
+      if (automationDetailsId !== undefined) {
+        args.push("--sarif-category", automationDetailsId);


A / is always appended to the --sarif-category if it doesn't already end in one. It's handled in exactly the same way for database analyze as it is for interpret results. So, I think it's fine using the same logic for both.

aeisenberg · 2021-06-04T19:02:57Z

src/analyze.test.ts

@@ -36,10 +36,17 @@ test("status report fields and search path setting", async (t) => {

    for (const language of Object.values(Language)) {
      setCodeQL({
-        databaseAnalyze: async (


Can we completely remove databaseAnalyze now? The only place it was used before has been removed.

Yep, removed.

aeisenberg

This looks great! I think it would be slightly better to flip the output (as in the suggestion). This is how it was before.

Also, I think the baseline comment is a bit awkward. We should probably change the text, but that could happen in another PR.

src/analyze.ts

edoardopirovano requested a review from aeisenberg June 4, 2021 16:48

edoardopirovano force-pushed the use-interpret-results branch from e614bdd to f004bee Compare June 4, 2021 17:19

adityasharad reviewed Jun 4, 2021

aeisenberg reviewed Jun 4, 2021

edoardopirovano force-pushed the use-interpret-results branch 2 times, most recently from 45ef1cb to aac2736 Compare June 7, 2021 07:13

aeisenberg approved these changes Jun 7, 2021

Replace analyze with run-queries and interpret-results

8e36bc2

edoardopirovano force-pushed the use-interpret-results branch from 88655c7 to 8e36bc2 Compare June 8, 2021 08:14

edoardopirovano enabled auto-merge (rebase) June 8, 2021 08:15

edoardopirovano merged commit 2cc885d into github:main Jun 8, 2021

edoardopirovano deleted the use-interpret-results branch June 8, 2021 08:26

edoardopirovano mentioned this pull request Jun 9, 2021

Add intepret-results timings to status reports #556

Merged

3 tasks

This was referenced Jun 17, 2021

Merge main into v1 #568

Closed

Merge main into v1 #570

Merged

henrymercer mentioned this pull request Aug 2, 2021

Re-enable diagnostics summaries in the output logs of the analyze action #672

Merged

3 tasks

chrisjcox79 mentioned this pull request Jun 20, 2023

[Snyk] Security upgrade semver from 7.3.2 to 7.5.2 chrisjcox79/codeql-action#7

Open

This was referenced Jun 21, 2023

[Snyk] Security upgrade semver from 5.7.1 to 7.5.2 aliscco/codeql-action#526

Open

[Snyk] Fix for 1 vulnerabilities aliscco/codeql-action#547

Open

edoardopirovano commented Jun 4, 2021

adityasharad Jun 4, 2021

edoardopirovano Jun 7, 2021

adityasharad Jun 8, 2021

robertbrignull Jun 9, 2021

adityasharad Jun 4, 2021

aeisenberg Jun 4, 2021

adityasharad Jun 4, 2021

edoardopirovano Jun 7, 2021

aeisenberg Jun 7, 2021

adityasharad Jun 4, 2021

edoardopirovano Jun 7, 2021

aeisenberg Jun 7, 2021

aeisenberg Jun 7, 2021

edoardopirovano Jun 7, 2021

adityasharad Jun 8, 2021

AlonaHlobina Jun 9, 2021

edoardopirovano Jun 9, 2021

adityasharad Jun 4, 2021

edoardopirovano Jun 7, 2021

aeisenberg Jun 4, 2021

edoardopirovano Jun 7, 2021

aeisenberg Jun 4, 2021

edoardopirovano Jun 7, 2021

aeisenberg Jun 4, 2021

aeisenberg Jun 4, 2021

edoardopirovano Jun 7, 2021

aeisenberg left a comment
