Technique example: data loader, Python to parquet #1422
Can we link to https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html explicitly, mentioning that there are many options (and recommending compression)? Maybe also show compression in action?
Co-authored-by: Philippe Rivière <fil@rezo.net>
@Fil I added the compression codec explicitly in the loader (compression="snappy"), and include a sentence pointing to the write_table docs and different compression algorithms. Look okay?
Mind copying the new virtual environment pattern from #1468?
<div class="note">
To run this data loader, you’ll need python3 and the geopandas, matplotlib, io, and sys modules installed and available on your `$PATH`.
</div>
<div class="tip">
We recommend using a [Python virtual environment](https://observablehq.com/framework/loaders#venv), such as with venv or uv, and managing required packages via `requirements.txt` rather than installing them globally.
</div>
Quick question - would dbt work here?
Plot.barX(dams,
Plot.groupY({x: "count"}, {y: "Primary Purpose", fill: "Hazard Potential Classification", sort: {y: "x", reverse: true}
})
)
Please prettier this. 🙏 Also, you can use `sort: {y: "-x"}` to shorten.
Yep will do!
@@ -53,7 +53,7 @@ const dams = FileAttachment("data/us-dams.parquet").parquet();
 We can display the table using `Inputs.table`.

 ```js echo
-Inputs.table(dams)
+Inputs.table(dams);
This prettier edit will prevent the table from displaying.
I think this was correctly updated in a subsequent commit. I turned prettier off after formatting, to keep the semicolon after FileAttachment but remove it from the Inputs.table and Plot code:
https://observablehq.observablehq.cloud/framework-example-loader-python-to-parquet/
Right you are, thanks!
Deployed: https://observablehq.observablehq.cloud/framework-example-loader-python-to-parquet/