Data Explorer: Create preliminary positron-duckdb extension using duckdb-wasm to provide "headless" data explorer backend (#4964)

For epic #2187, addresses #4963. 

This provides a new built-in positron-duckdb extension that loads
duckdb-wasm in a web worker and provides an RPC endpoint using VSCode's
command service for fulfilling Data Explorer requests. Only fetching
schemas, data values, and null-count summary statistics is supported
right now, so follow-on work includes:

- Numeric formatting and string truncation (respecting the passed
FormatOptions)
- Row filtering
- Sorting
- Detailed summary statistics
- Histograms and frequency tables for sparklines
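
As a rough illustration of the command-based RPC endpoint described above, here is a minimal sketch. It is not the extension's actual code: the `CommandRegistry` class is a stand-in for VS Code's command service (where `vscode.commands.registerCommand` and `vscode.commands.executeCommand` would be used), and the command id `example.dataExplorerRpc`, the `RpcRequest`/`RpcResponse` shapes, and the `get_schema` handler are all hypothetical names chosen for the example.

```typescript
// Hypothetical request/response envelopes (assumptions for this sketch; the
// real protocol types come from comms/generate-comms.ts and are not shown here).
interface RpcRequest {
	method: string;
	uri: string;
	params: Record<string, unknown>;
}

interface RpcResponse {
	result?: unknown;
	error_message?: string;
}

type CommandHandler = (...args: unknown[]) => Promise<unknown>;

// Stand-in for VS Code's command service so the sketch runs outside the
// extension host.
class CommandRegistry {
	private readonly handlers = new Map<string, CommandHandler>();

	registerCommand(id: string, handler: CommandHandler): void {
		this.handlers.set(id, handler);
	}

	async executeCommand(id: string, ...args: unknown[]): Promise<unknown> {
		const handler = this.handlers.get(id);
		if (!handler) {
			throw new Error(`command '${id}' not found`);
		}
		return handler(...args);
	}
}

// One handler per supported RPC method; the body here is a placeholder
// rather than a real DuckDB query.
const methodHandlers = new Map<string, (req: RpcRequest) => Promise<unknown>>([
	['get_schema', async (req) => ({ uri: req.uri, columns: [] })],
]);

// Dispatch a request by method name and report failures in the response
// instead of throwing across the command boundary.
async function handleRpc(req: RpcRequest): Promise<RpcResponse> {
	const handler = methodHandlers.get(req.method);
	if (!handler) {
		return { error_message: `unsupported method: ${req.method}` };
	}
	try {
		return { result: await handler(req) };
	} catch (e) {
		return { error_message: String(e) };
	}
}

const commands = new CommandRegistry();
commands.registerCommand('example.dataExplorerRpc', (req) => handleRpc(req as RpcRequest));

// Usage: the caller only needs the command id and a request object.
commands
	.executeCommand('example.dataExplorerRpc', { method: 'get_schema', uri: 'file:///data/example.csv', params: {} })
	.then((response) => console.log(JSON.stringify(response)));
```

Returning an error message for unsupported methods, rather than throwing, is one way a backend that only implements part of the protocol could let the caller degrade gracefully.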

There are some rough edges. For example, if you click on a file before
the extension has fully loaded at application startup, opening it will
fail; I will need to consult others on how to fix that.

Lastly, I have checked in some small (~10K total) data files to use in
the extension tests (`yarn test-extension -l positron-duckdb`) and added
exclusions to hygiene.js so that pre-commit checks do not complain about
them. I'm not sure if there is a better way to handle this.

Other notes:

- Added code to comms/generate-comms.ts to generate interfaces
containing all the parameters for each RPC, as is already done for
Rust and Python. This was needed to provide a fully formed command
protocol for communicating with the extension. We can potentially look
at further improving the TypeScript code generation.
- I copied the interface stubs needed into an interfaces.ts file in the
extension. Maybe it's possible to cross-import from the main codebase
into the extension but I do not know the right incantation of
tsconfig.json/package.json configurations to do this.
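
To make the first note concrete, here is a minimal sketch of generating one parameters interface per RPC. It is not the code in comms/generate-comms.ts: the `MethodSpec`/`ParamSpec` input shapes, the `generateParamsInterface` helper, the `<Method>Params` naming scheme, and the `column_indices` parameter are assumptions made for illustration.

```typescript
// Hypothetical, simplified description of one RPC (an assumption; the real
// contract format read by comms/generate-comms.ts is not reproduced here).
interface ParamSpec {
	name: string;
	type: string; // a TypeScript type expression, e.g. 'number'
	description: string;
}

interface MethodSpec {
	name: string; // snake_case RPC method name
	params: ParamSpec[];
}

// Convert a snake_case method name to PascalCase.
function pascalCase(snake: string): string {
	return snake
		.split('_')
		.filter((part) => part.length > 0)
		.map((part) => part.charAt(0).toUpperCase() + part.slice(1))
		.join('');
}

// Emit the source text of an interface holding every parameter of the RPC.
function generateParamsInterface(method: MethodSpec): string {
	const lines: string[] = [];
	lines.push(`export interface ${pascalCase(method.name)}Params {`);
	for (const param of method.params) {
		lines.push(`\t/** ${param.description} */`);
		lines.push(`\t${param.name}: ${param.type};`);
	}
	lines.push('}');
	return lines.join('\n');
}

const generated = generateParamsInterface({
	name: 'get_schema',
	params: [
		{ name: 'column_indices', type: 'Array<number>', description: 'Columns to describe' },
	],
});
console.log(generated);
```

With one named interface per RPC, both sides of the command protocol can share a typed request payload instead of passing positional arguments.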

In action

https://github.com/user-attachments/assets/70dabb96-6330-49e4-8db1-10293c331051

### QA Notes

You can click on .parquet, .csv, or .tsv files in the file explorer
after Positron has loaded to open the data explorer.
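
For reference, the three file types map naturally onto DuckDB's `read_parquet` and `read_csv` table functions. The sketch below shows one way such a mapping could look; it is not taken from the extension. The `scanQueryFor` helper is hypothetical, and it assumes the file is already reachable by DuckDB under the given name.

```typescript
// Tab character used as the delimiter for .tsv files.
const TAB = '\t';

// Build a DuckDB query that scans a data file, chosen by file extension.
// Returns undefined for file types this sketch does not handle.
function scanQueryFor(filePath: string): string | undefined {
	// Quote the path as a SQL string literal, doubling embedded single quotes.
	const quoted = `'${filePath.replace(/'/g, "''")}'`;
	const lower = filePath.toLowerCase();
	if (lower.endsWith('.parquet')) {
		return `SELECT * FROM read_parquet(${quoted})`;
	}
	if (lower.endsWith('.csv')) {
		return `SELECT * FROM read_csv(${quoted})`;
	}
	if (lower.endsWith('.tsv')) {
		// Same reader as CSV, with an explicit tab delimiter.
		return `SELECT * FROM read_csv(${quoted}, delim='${TAB}')`;
	}
	return undefined;
}

console.log(scanQueryFor('flights.parquet'));
```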

---------

Co-authored-by: Jonathan McPherson <jonathan@rstudio.com>
wesm and jmcphers authored Oct 17, 2024
1 parent e8be5d2 commit 8b5306d
Showing 35 changed files with 5,980 additions and 153 deletions.
5 changes: 5 additions & 0 deletions .vscode-test.js
@@ -62,6 +62,11 @@ const extensions = [
		workspaceFolder: path.join(os.tmpdir(), `positron-connections-${Math.floor(Math.random() * 100000)}`),
		mocha: { timeout: 60_000 }
	},
	{
		label: 'positron-duckdb',
		workspaceFolder: path.join(os.tmpdir(), `positron-duckdb-${Math.floor(Math.random() * 100000)}`),
		mocha: { timeout: 60_000 }
	},
	{
		label: 'positron-run-app',
		workspaceFolder: 'extensions/positron-run-app/test-workspace',
1 change: 1 addition & 0 deletions build/gulpfile.extensions.js
@@ -33,6 +33,7 @@ const compilations = [
	'extensions/open-remote-ssh/tsconfig.json',
	'extensions/positron-code-cells/tsconfig.json',
	'extensions/positron-connections/tsconfig.json',
	'extensions/positron-duckdb/tsconfig.json',
	'extensions/positron-ipywidgets/renderer/tsconfig.json',
	'extensions/positron-javascript/tsconfig.json',
	'extensions/positron-notebook-controllers/tsconfig.json',
6 changes: 6 additions & 0 deletions build/hygiene.js
@@ -164,6 +164,9 @@ function hygiene(some, linting = true, secrets = true) {

		this.emit('data', file);
	});

	const testDataFiles = filter(['**/*', '!**/*.csv', '!**/*.parquet'], { restore: true });

	// --- End Positron ---

	const formatting = es.map(function (file, cb) {
@@ -208,6 +211,9 @@ function hygiene(some, linting = true, secrets = true) {
	const result = input
		.pipe(filter((f) => !f.stat.isDirectory()))
		.pipe(snapshotFilter)
		// --- Start Positron ---
		.pipe(testDataFiles)
		// --- End Positron ---
		.pipe(productJsonFilter)
		.pipe(process.env['BUILD_SOURCEVERSION'] ? es.through() : productJson)
		.pipe(productJsonFilter.restore)
1 change: 1 addition & 0 deletions build/npm/dirs.js
@@ -14,6 +14,7 @@ const dirs = [
	'extensions/open-remote-ssh',
	'extensions/positron-code-cells',
	'extensions/positron-connections',
	'extensions/positron-duckdb',
	'extensions/positron-ipywidgets',
	'extensions/positron-javascript',
	'extensions/positron-notebook-controllers',
26 changes: 26 additions & 0 deletions extensions/positron-duckdb/extension.webpack.config.js
@@ -0,0 +1,26 @@
/*---------------------------------------------------------------------------------------------
 *  Copyright (C) 2024 Posit Software, PBC. All rights reserved.
 *  Licensed under the Elastic License 2.0. See LICENSE.txt for license information.
 *--------------------------------------------------------------------------------------------*/

//@ts-check

'use strict';

const { IgnorePlugin } = require('webpack');
const withDefaults = require('../shared.webpack.config');

module.exports = withDefaults({
	context: __dirname,
	entry: {
		extension: './src/extension.ts',
	},
	node: {
		__dirname: false
	},
	externals: {
		// eslint-disable-next-line @typescript-eslint/naming-convention
		'@duckdb/duckdb-wasm': 'commonjs @duckdb/duckdb-wasm',
		'web-worker': 'commonjs web-worker',
	}
});
43 changes: 43 additions & 0 deletions extensions/positron-duckdb/package.json
@@ -0,0 +1,43 @@
{
  "name": "positron-duckdb",
  "displayName": "%displayName%",
  "description": "%description%",
  "version": "0.0.1",
  "publisher": "vscode",
  "engines": {
    "vscode": "^1.65.0"
  },
  "activationEvents": [
    "onStartupFinished"
  ],
  "main": "./out/extension.js",
  "scripts": {
    "vscode:prepublish": "yarn run compile",
    "pretest": "yarn run compile && yarn run lint",
    "lint": "eslint src --ext ts"
  },
  "contributes": {},
  "devDependencies": {
    "@types/glob": "^7.2.0",
    "@types/mocha": "^9.1.0",
    "@types/node": "14.x",
    "@typescript-eslint/eslint-plugin": "^5.12.1",
    "@typescript-eslint/parser": "^5.12.1",
    "@vscode/test-electron": "^2.1.2",
    "eslint": "^8.9.0",
    "glob": "^7.2.0",
    "mocha": "^9.2.1",
    "ts-node": "^10.9.1",
    "typescript": "^4.5.5",
    "vsce": "^2.11.0"
  },
  "dependencies": {
    "@duckdb/duckdb-wasm": "1.29.0",
    "apache-arrow": "^16.0.0",
    "web-worker": "^1.3.0"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/posit-dev/positron"
  }
}
4 changes: 4 additions & 0 deletions extensions/positron-duckdb/package.nls.json
@@ -0,0 +1,4 @@
{
  "displayName": "Positron DuckDB Wasm Support",
  "description": "Provides DuckDB support for headless data explorers for previewing data files."
}