The duckdb-based data explorer backend doesn't work on windows #5084

Open
jonvanausdeln opened this issue Oct 21, 2024 · 10 comments
Labels
area: data explorer Issues related to Data Explorer category. bug Something isn't working os-windows Windows issue

Comments

@jonvanausdeln
Contributor

System details:

Positron and OS details:

Windows 11
2024.11.0-69

Describe the issue:

The new feature from #4963 doesn't work on Windows

Steps to reproduce the issue:

  1. Launch Positron on Windows
  2. Double-click a .parquet file to open it

Expected or desired behavior:

File opens in Data Explorer

Were there any error messages in the UI, Output panel, or Developer Tools console?

@jonvanausdeln jonvanausdeln added the area: data explorer Issues related to Data Explorer category. label Oct 21, 2024
@wesm wesm changed the title The "headless" data explorer feature doesn't work on windows The duckdb-based data explorer backend doesn't work on windows Oct 21, 2024
@wesm wesm added the bug Something isn't working label Oct 21, 2024
@wesm
Contributor

wesm commented Oct 21, 2024

Can you please provide any information from the developer console?

@jonvanausdeln
Contributor Author

Here is what I see in the console when opening the file:

workbench.desktop.main.js:130697   ERR command 'positron-duckdb.dataExplorerRpc' not found: Error: command 'positron-duckdb.dataExplorerRpc' not found
    at $DUb.n (vscode-file://vscode-app/c:/Program%20Files/Positron/resources/app/out/vs/workbench/workbench.desktop.main.js:346253:39)
    at $DUb.executeCommand (vscode-file://vscode-app/c:/Program%20Files/Positron/resources/app/out/vs/workbench/workbench.desktop.main.js:346248:25)
    at async $kpc.m (vscode-file://vscode-app/c:/Program%20Files/Positron/resources/app/out/vs/workbench/workbench.desktop.main.js:428354:30)
    at async $kpc.openDataset (vscode-file://vscode-app/c:/Program%20Files/Positron/resources/app/out/vs/workbench/workbench.desktop.main.js:428366:28)
workbench.desktop.main.js:130667 DEBUG Comments: URIs of continue on comments to add to storage .
workbench.desktop.main.js:130667 DEBUG Comments: URIs of continue on comments to add to storage .

@wesm
Contributor

wesm commented Oct 21, 2024

That error usually happens when the file is clicked before the application has fully initialized (I mentioned this in the PR and still haven't sorted out a fix), but something else on Windows might be preventing the extension from loading.

@wesm wesm added the os-windows Windows issue label Oct 21, 2024
@wesm
Contributor

wesm commented Oct 21, 2024

I loaded this on Windows and see this error "workbench.desktop.main.js:504637 Activating extension 'vscode.positron-duckdb' failed: The URL must be of scheme file."

Image

@sharon-wang
Member

It looks like the issue is that the web-worker package expects all paths to use the file scheme, since it calls URL.fileURLToPath: https://github.com/developit/web-worker/blob/b89a392aa178c70701ee89abef4a5d30f8c59527/node.js#L101. On Windows, however, a path that starts with a drive letter such as C: is parsed as a URL whose scheme is that drive letter (c:) rather than file.
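
To make the failure mode concrete, here is a minimal sketch of the resolve step described above. It is not the web-worker source: `tryResolveWorkerPath` and the example paths are hypothetical names used only for illustration, and the base URL is an arbitrary assumption.

```typescript
import { fileURLToPath, URL } from 'url';

interface ResolveResult {
	protocol: string;
	path?: string;
	error?: string;
}

// Hypothetical helper: resolve a worker specifier against a base URL, then
// try to convert it to a file path, roughly as the linked line is described.
function tryResolveWorkerPath(spec: string, base: URL): ResolveResult {
	const resolved = new URL(spec, base);
	try {
		return { protocol: resolved.protocol, path: fileURLToPath(resolved) };
	} catch (err) {
		// fileURLToPath rejects any URL whose scheme is not "file:"
		return { protocol: resolved.protocol, error: String(err) };
	}
}

const base = new URL('file:///tmp/'); // assumed base, for illustration only

// A raw Windows path: the drive letter is parsed as the URL scheme ("c:"),
// so the conversion fails.
console.log(tryResolveWorkerPath('c:\\Users\\sharon\\duckdb-node-eh.worker.cjs', base));

// The same location expressed as a file URL keeps the "file:" scheme.
console.log(tryResolveWorkerPath('file:///c:/Users/sharon/duckdb-node-eh.worker.cjs', base));
```

The first call reports protocol `c:` together with an error, while the second reports `file:` and a path, which is why converting the bundle paths to file URLs is a plausible way past the activation error.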

I made the following changes to extensions\positron-duckdb\src\extension.ts:

/* other code */
import { pathToFileURL } from 'url';

class DuckDBInstance {
	constructor(readonly db: duckdb.AsyncDuckDB, readonly con: duckdb.AsyncDuckDBConnection) { }

	static async create(ctx: vscode.ExtensionContext): Promise<DuckDBInstance> {
		// Create the path to the DuckDB WASM bundle. Note that only the EH
		// bundle for Node is used for now as we don't support Positron
		// extensions running in a browser context yet.
		const distPath = join(ctx.extensionPath, 'node_modules', '@duckdb', 'duckdb-wasm', 'dist');
		const bundle = {
			mainModule: join(distPath, 'duckdb-eh.wasm'),
			mainWorker: join(distPath, 'duckdb-node-eh.worker.cjs')
		};
		// On Windows, convert mainModule and mainWorker with pathToFileURL so that
		// web-worker receives file:// URLs instead of drive-letter paths.
		if (process.platform === 'win32') {
			bundle.mainModule = pathToFileURL(bundle.mainModule).toString();
			bundle.mainWorker = pathToFileURL(bundle.mainWorker).toString();
		}

		// console.log output
		// mainModule: "file:///c:/Users/sharon/positron/extensions/positron-duckdb/node_modules/@duckdb/duckdb-wasm/dist/duckdb-eh.wasm"
		// mainWorker: "file:///c:/Users/sharon/positron/extensions/positron-duckdb/node_modules/@duckdb/duckdb-wasm/dist/duckdb-node-eh.worker.cjs"

/* other code */

which gets us past the "The URL must be of scheme file" error, but then we hit another issue with file path resolution:

Image

@sharon-wang
Member

I'm stepping away for a bit but I have some WIP fixes on https://github.com/posit-dev/positron/tree/duckdb-windows-path.

One more path issue to fix up occurs when opening the parquet file:

  ERR [Extension Host] Error: IO Error: No files found that match the pattern "/c:/Users/sharon/qa-example-content/data-files/flights/flights.parquet"
	at f.onMessage (c:\Users\sharon\positron\extensions\positron-duckdb\node_modules\@duckdb\duckdb-wasm\dist\duckdb-node.cjs:1:14750)
	at c:\Users\sharon\positron\extensions\positron-duckdb\node_modules\web-worker\cjs\node.js:40:17
	at Array.forEach (<anonymous>)
	at Worker.dispatchEvent (c:\Users\sharon\positron\extensions\positron-duckdb\node_modules\web-worker\cjs\node.js:38:10)
	at Worker.<anonymous> (c:\Users\sharon\positron\extensions\positron-duckdb\node_modules\web-worker\cjs\node.js:109:14)
	at Worker.emit (node:events:519:28)
	at MessagePort.<anonymous> (node:internal/worker:262:53)
	at [nodejs.internal.kHybridDispatch] (node:internal/event_target:820:20)
	at MessagePort.<anonymous> (node:internal/per_context/messageport:23:28)
	at MessagePort.callbackTrampoline (node:internal/async_hooks:130:17)
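
The pattern in that error starts with a slash in front of the drive letter ("/c:/Users/..."). As a rough sketch of one way such a path could be normalized before it is embedded in a SQL string, and not necessarily what the branch or the eventual fix does, the hypothetical helpers below strip that leading slash and escape single quotes. `stripLeadingSlashBeforeDrive` and `parquetScanSql` are made-up names; `read_parquet` is DuckDB's table function for reading Parquet files.

```typescript
// Hypothetical helper: turn "/c:/Users/..." into "c:/Users/...".
// Paths without a drive letter after the leading slash are returned unchanged.
function stripLeadingSlashBeforeDrive(p: string): string {
	return /^\/[a-zA-Z]:[\\/]/.test(p) ? p.slice(1) : p;
}

// Hypothetical helper: build a query that scans one Parquet file.
// Single quotes are doubled so the path is a valid SQL string literal.
function parquetScanSql(filePath: string): string {
	const normalized = stripLeadingSlashBeforeDrive(filePath);
	const escaped = normalized.replace(/'/g, "''");
	return `SELECT * FROM read_parquet('${escaped}')`;
}

console.log(parquetScanSql('/c:/Users/sharon/qa-example-content/data-files/flights/flights.parquet'));
console.log(parquetScanSql('/home/sharon/flights.parquet'));
```

The drive-letter check is deliberately narrow so that POSIX absolute paths pass through untouched.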

@wesm
Contributor

wesm commented Oct 21, 2024

Thanks @sharon-wang! I am also taking a look at this on your branch and have my Windows environment set up.

@jmcphers jmcphers added this to the 2024.11.0 Pre-Release milestone Oct 21, 2024
@sharon-wang
Member

@wesm I just pushed some more changes to duckdb-windows-path that look like they might be working!

Image

There's still an error re: get_column_profiles timing out -- not sure if that's related or a separate issue.

If the changes on the branch look roughly okay, I can open up a PR for review!

@wesm
Contributor

wesm commented Oct 21, 2024

Sure, please go ahead and open a PR and we can refine as needed there! Thank you

@wesm
Contributor

wesm commented Oct 21, 2024

I’ll fix the other error in the PR since I know what is wrong

wesm added a commit that referenced this issue Oct 22, 2024
Addresses #5084. The extension needed some massaging of file paths on
Windows to get the web worker loading and the SQL strings in a format
where DuckDB can read the files.

I verified that opening files from the command palette works on Windows,
so the bug from #5076 appears to only affect macOS for now.

---------

Co-authored-by: sharon wang <25834218+sharon-wang@users.noreply.github.com>
4 participants