Measure data fetched when duckdb-wasm queries .duckdb files in S3
Fetch 1 row from each of several s3://duckdb-wasm-test/*.duckdb files in S3:
(see fetch-1/)
(see select-1/)
(see count-star/)
npm install
next dev
Open the resulting server (likely at http://localhost:3000/), or visit runsascoded.com/duckdb-wasm-test:
- Enter query
- Clear "Network" tab
- Filter: ".duckdb"
- "Disable cache" ✅
- "Run all"
- Download .har file
Move the .har file to this directory, and give it a name, e.g. the examples in this repo were generated from .hars named:
- fetch-1.har (see fetch-1/): select * from crashes limit 1
- select-1.har (see select-1/): select * from crashes where id=50000
- count-star.har (see count-star/): select count(*) from crashes

(the .hars themselves are .gitignored, as they're pretty large)
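For a quick sanity check on a downloaded .har before running the notebook, the sketch below totals the response body bytes of the .duckdb requests. It is not the code in analyze-reqs.ipynb. It assumes the standard HAR layout (log.entries, request.url, response.bodySize), and the function names are hypothetical.

```python
import json
from pathlib import Path


def load_har(path: str) -> dict:
    """Read a .har file (it is plain JSON) into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def duckdb_entries(har: dict, needle: str = ".duckdb") -> list[dict]:
    """Keep only entries whose request URL contains `needle`
    (mirrors the ".duckdb" filter used in the Network tab)."""
    return [e for e in har["log"]["entries"] if needle in e["request"]["url"]]


def total_body_bytes(entries: list[dict]) -> int:
    """Sum response body sizes. HAR records -1 when a size is unknown,
    so negative values are counted as 0 here (an assumption of this sketch)."""
    return sum(max(e["response"].get("bodySize", -1), 0) for e in entries)


# Example usage, assuming fetch-1.har is in the current directory:
#   entries = duckdb_entries(load_har("fetch-1.har"))
#   print(len(entries), "requests,", total_body_bytes(entries), "body bytes")
```

Browsers differ in how they fill in bodySize (for example for cached responses), so treat the total as approximate.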
Then run analyze-reqs.ipynb on it:
name=fetch-1 # use your .har file's stem
query=… # query you entered in step 1. above
pip install -r requirements.txt
mkdir -p "$name"
echo "$query" > "$name/query.sql"
nb=analyze-reqs.ipynb
papermill -p name "$name" $nb "$name/$nb"
The $name/ directory will contain a fetched.png like the plots above.
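To see which parts of a file were requested, one option is to read the Range header of each request in the .har and merge the byte ranges. The sketch below is only an illustration and is not taken from analyze-reqs.ipynb. It assumes the requests carry simple "bytes=start-end" Range headers, and the helper names are hypothetical.

```python
import re

# Matches only the simple single-range form "bytes=START-END".
RANGE_RE = re.compile(r"bytes=(\d+)-(\d+)$")


def requested_range(entry: dict):
    """Return the inclusive (start, end) byte range a HAR entry asked for,
    or None if it has no Range header in the simple form above."""
    for header in entry["request"].get("headers", []):
        if header["name"].lower() == "range":
            match = RANGE_RE.match(header["value"].strip())
            if match:
                return int(match.group(1)), int(match.group(2))
    return None


def merge_ranges(ranges):
    """Merge overlapping or adjacent inclusive byte ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def unique_bytes(ranges) -> int:
    """Count distinct bytes covered, so re-fetched bytes are counted once."""
    return sum(end - start + 1 for start, end in merge_ranges(ranges))
```

Comparing unique_bytes against the plain sum of range lengths shows how much data was requested more than once.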