Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix quoting of string columns in csv #1379

Merged
merged 5 commits into from
Jun 20, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions extensions/ql-vscode/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@

- Prints end-of-query evaluator log summaries to the Query Log. [#1349](https://github.com/github/vscode-codeql/pull/1349)
- Be consistent about casing in Query History menu. [#1369](https://github.com/github/vscode-codeql/pull/1369)
- Fix quoting string columns in exported CSV results. [#1379](https://github.com/github/vscode-codeql/pull/1379)

## 1.6.6 - 17 May 2022

Expand Down
7 changes: 7 additions & 0 deletions extensions/ql-vscode/src/pure/bqrs-cli-types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,14 @@ export function transformBqrsResultSet(
};
}

type BqrsKind = 'String' | 'Float' | 'Integer' | 'String' | 'Boolean' | 'Date' | 'Entity';

interface BqrsColumn {
name: string;
kind: BqrsKind;
}
export interface DecodedBqrsChunk {
tuples: CellValue[][];
next?: number;
columns: BqrsColumn[];
}
4 changes: 2 additions & 2 deletions extensions/ql-vscode/src/query-history.ts
Original file line number Diff line number Diff line change
Expand Up @@ -956,11 +956,11 @@ export class QueryHistoryManager extends DisposableObject {
void this.tryOpenExternalFile(query.csvPath);
return;
}
await query.exportCsvResults(this.qs, query.csvPath, () => {
if (await query.exportCsvResults(this.qs, query.csvPath)) {
void this.tryOpenExternalFile(
query.csvPath
);
});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be displaying an error in the else case here? I see we reject() where we were previously using showAndLogErrorMessage() but as far as I can tell we are then swallowing that error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aeisenberg I don't think you've addressed this comment. It's less blocking the other others, so I'm happy to approve without any changes here if you don't think any changes are needed, but just checking it wasn't missed accidentally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh...sorry. I forgot to comment here. The error is not being swallowed. Take a look at commandRunnerWithProgress, you can see that there is an error barrier there that catches and logs all thrown and rejected errors. So, you don't ever need to explicitly catch and log an error when you're inside a command as long as that command is wrapped in commandRunnerWithProgress.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, indeed, when I try this out, forcing an error inside of exportCsvResults, the error is logged appropriately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, thanks for the explanation 🙂

}

async handleViewCsvAlerts(
Expand Down
62 changes: 49 additions & 13 deletions extensions/ql-vscode/src/run-queries.ts
Original file line number Diff line number Diff line change
Expand Up @@ -341,31 +341,67 @@ export class QueryEvaluationInfo {
/**
* Creates the CSV file containing the results of this query. This will only be called if the query
* does not have interpreted results and the CSV file does not already exist.
*
* @return Promise<true> if the operation creates the file. Promise<false> if the operation does
* not create the file.
*
* @throws Error if the operation fails.
*/
async exportCsvResults(qs: qsClient.QueryServerClient, csvPath: string, onFinish: () => void): Promise<void> {
async exportCsvResults(qs: qsClient.QueryServerClient, csvPath: string): Promise<boolean> {
const resultSet = await this.chooseResultSet(qs);
if (!resultSet) {
void showAndLogWarningMessage('Query has no result set.');
return false;
}
let stopDecoding = false;
const out = fs.createWriteStream(csvPath);
out.on('finish', onFinish);
out.on('error', () => {
if (!stopDecoding) {
stopDecoding = true;
void showAndLogErrorMessage(`Failed to write CSV results to ${csvPath}`);
}

const promise: Promise<boolean> = new Promise((resolve, reject) => {
out.on('finish', () => resolve(true));
out.on('error', () => {
if (!stopDecoding) {
stopDecoding = true;
reject(new Error(`Failed to write CSV results to ${csvPath}`));
}
});
});

let nextOffset: number | undefined = 0;
while (nextOffset !== undefined && !stopDecoding) {
const chunk: DecodedBqrsChunk = await qs.cliServer.bqrsDecode(this.resultsPaths.resultsPath, SELECT_QUERY_NAME, {
do {
const chunk: DecodedBqrsChunk = await qs.cliServer.bqrsDecode(this.resultsPaths.resultsPath, resultSet, {
pageSize: 100,
offset: nextOffset,
});
for (const tuple of chunk.tuples) {
out.write(tuple.join(',') + '\n');
}
chunk.tuples.forEach((tuple) => {
out.write(tuple.map((v, i) =>
chunk.columns[i].kind === 'String'
? `"${typeof v === 'string' ? v.replaceAll('"', '""') : v}"`
: v
).join(',') + '\n');
});
nextOffset = chunk.next;
}
} while (nextOffset && !stopDecoding);
out.end();

return promise;
}

/**
* Choose the name of the result set to run. If the `#select` set exists, use that. Otherwise,
* arbitrarily choose the first set. Most of the time, this will be correct.
*
* If the query has no result sets, then return undefined.
*/
async chooseResultSet(qs: qsClient.QueryServerClient) {
const resultSets = (await qs.cliServer.bqrsInfo(this.resultsPaths.resultsPath, 0))['result-sets'];
if (!resultSets.length) {
return undefined;
}
if (resultSets.find(r => r.name === SELECT_QUERY_NAME)) {
return SELECT_QUERY_NAME;
}
return resultSets[0].name;
}
/**
* Returns the path to the CSV alerts interpretation of this query results. If CSV results have
* not yet been produced, this will return first create the CSV results and then return the path.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,11 +1,16 @@
import { expect } from 'chai';
import * as path from 'path';
import * as fs from 'fs-extra';
import * as sinon from 'sinon';
import { Uri } from 'vscode';

import { QueryEvaluationInfo } from '../../run-queries';
import { Severity, compileQuery } from '../../pure/messages';
import * as config from '../../config';
import { tmpDir } from '../../helpers';
import { QueryServerClient } from '../../queryserver-client';
import { CodeQLCliServer } from '../../cli';
import { SELECT_QUERY_NAME } from '../../contextual/locationFinder';

describe('run-queries', () => {
let sandbox: sinon.SinonSandbox;
Expand Down Expand Up @@ -53,6 +58,85 @@ describe('run-queries', () => {
expect(info.canHaveInterpretedResults()).to.eq(true);
});

[SELECT_QUERY_NAME, 'other'].forEach(resultSetName => {
it(`should export csv results for result set ${resultSetName}`, async () => {
const csvLocation = path.join(tmpDir.name, 'test.csv');
const qs = createMockQueryServerClient(
createMockCliServer({
bqrsInfo: [{ 'result-sets': [{ name: resultSetName }, { name: 'hucairz' }] }],
bqrsDecode: [{
columns: [{ kind: 'NotString' }, { kind: 'String' }],
tuples: [['a', 'b'], ['c', 'd']],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's expand the tuples here to include some quote characters and make sure those get escaped correctly.

next: 1
}, {
// just for fun, give a different set of columns here
// this won't happen with the real CLI, but it's a good test
columns: [{ kind: 'String' }, { kind: 'NotString' }, { kind: 'StillNotString' }],
tuples: [['a', 'b', 'c']]
}]
})
);
const info = createMockQueryInfo();
const promise = info.exportCsvResults(qs, csvLocation);

const result = await promise;
expect(result).to.eq(true);

const csv = fs.readFileSync(csvLocation, 'utf8');
expect(csv).to.eq('a,"b"\nc,"d"\n"a",b,c\n');

// now verify that we are using the expected result set
expect((qs.cliServer.bqrsDecode as sinon.SinonStub).callCount).to.eq(2);
expect((qs.cliServer.bqrsDecode as sinon.SinonStub).getCall(0).args[1]).to.eq(resultSetName);
});
});

it('should export csv results with characters that need to be escaped', async () => {
const csvLocation = path.join(tmpDir.name, 'test.csv');
const qs = createMockQueryServerClient(
createMockCliServer({
bqrsInfo: [{ 'result-sets': [{ name: SELECT_QUERY_NAME }, { name: 'hucairz' }] }],
bqrsDecode: [{
columns: [{ kind: 'NotString' }, { kind: 'String' }],
// We only escape string columns. In practice, we will only see quotes in strings, but
// it is a good test anyway.
tuples: [
['"a"', '"b"'],
['c,xxx', 'd,yyy'],
['aaa " bbb', 'ccc " ddd'],
[true, false],
[123, 456],
[123.98, 456.99],
],
}]
})
);
const info = createMockQueryInfo();
const promise = info.exportCsvResults(qs, csvLocation);

const result = await promise;
expect(result).to.eq(true);

const csv = fs.readFileSync(csvLocation, 'utf8');
expect(csv).to.eq('"a","""b"""\nc,xxx,"d,yyy"\naaa " bbb,"ccc "" ddd"\ntrue,"false"\n123,"456"\n123.98,"456.99"\n');

// now verify that we are using the expected result set
expect((qs.cliServer.bqrsDecode as sinon.SinonStub).callCount).to.eq(1);
expect((qs.cliServer.bqrsDecode as sinon.SinonStub).getCall(0).args[1]).to.eq(SELECT_QUERY_NAME);
});

it('should handle csv exports for a query with no result sets', async () => {
const csvLocation = path.join(tmpDir.name, 'test.csv');
const qs = createMockQueryServerClient(
createMockCliServer({
bqrsInfo: [{ 'result-sets': [] }]
})
);
const info = createMockQueryInfo();
const result = await info.exportCsvResults(qs, csvLocation);
expect(result).to.eq(false);
});

describe('compile', () => {
it('should compile', async () => {
const info = createMockQueryInfo();
Expand Down Expand Up @@ -116,7 +200,7 @@ describe('run-queries', () => {
);
}

function createMockQueryServerClient() {
function createMockQueryServerClient(cliServer?: CodeQLCliServer): QueryServerClient {
return {
config: {
timeoutSecs: 5
Expand All @@ -131,7 +215,20 @@ describe('run-queries', () => {
})),
logger: {
log: sandbox.spy()
}
};
},
cliServer
} as unknown as QueryServerClient;
}

function createMockCliServer(mockOperations: Record<string, any[]>): CodeQLCliServer {
const mockServer: Record<string, any> = {};
for (const [operation, returns] of Object.entries(mockOperations)) {
mockServer[operation] = sandbox.stub();
returns.forEach((returnValue, i) => {
mockServer[operation].onCall(i).resolves(returnValue);
});
}

return mockServer as unknown as CodeQLCliServer;
}
});