docs: rework why Ibis article to explain what Ibis is and other updates #8490
This is really great, @lostmygithubaccount -- that tabset code sample at the top is especially clean.
Some small nits and some suggestions for (all of us for) the future, but I like the categorization and presentation here.
docs/why.qmd
Outdated

Out of the box, Ibis offers a great local experience for working with many file
formats.

DuckDB is the default backend, with Polars and DataFusion as two other great
local options. Many of the backends can run locally but require more setup than
a pip installation.

### Scaling up and out
Ibis already works with other Python dataframes like:

After prototyping on a local backend, directly scale in the cloud.
- [pandas](https://github.com/pandas-dev/pandas),
- [Dask](https://github.com/dask/dask)
- [Polars](https://github.com/pola-rs/polars)

You can prototype on DuckDB and deploy with MotherDuck. You can scale from any
Python client with Ibis installed to whatever your backend supports.
Ibis already works well with visualization libraries like:

## Use cases
- [matplotlib](https://github.com/matplotlib/matplotlib)
- [seaborn](https://github.com/mwaskom/seaborn)
- [plotly](https://github.com/plotly/plotly.py)
- [altair](https://github.com/altair-viz/altair)
- [plotnine](https://github.com/has2k1/plotnine)

You can use Ibis at any stage of your data workflow.
Ibis already works well with dashboarding libraries like:

Use the same framework for local exploration on a few files or production
workloads on the most advanced data platforms.
- [Streamlit](https://github.com/streamlit/streamlit)
- [dash](https://github.com/plotly/dash)
- [Quarto dashboards](https://github.com/quarto-dev/quarto-cli)

Ibis helps with:
Ibis already works well with machine learning libraries like:

- data catalog exploration
- exploratory data analysis
- transforming data
- visualizing data
- data science and machine learning
- [scikit-learn](https://github.com/scikit-learn/scikit-learn)
- [XGBoost](https://github.com/dmlc/xgboost)
- [LightGBM](https://github.com/microsoft/lightgbm)
- [PyTorch](https://github.com/pytorch/pytorch)
Broader documentation comment -- for everything like this that we advertise, I think we should aim to have short how-to docs on using X with Ibis.
Great write-up, left a few comments.
- prototype with the same API that will be used in production
- preprocess and feature engineer data before training a machine learning model

### Ibis for data platforms
I personally think data platforms/engines are integration partners, not users. For that reason, I almost feel like we should have a separate blog post focused on the whys for partners, to keep this one focused and avoid any potential confusion for users.
I'll leave this as a follow-up blog post, "Ibis for data platforms". I think this still makes sense here.
Out of the box, Ibis offers a great local experience for working with many file
formats.
As of [Ibis 8.0](./posts/ibis-version-8.0.0-release/index.qmd), the first stream
processing backends have been added. Since these systems tend to support SQL, we
We can describe why unification is possible with a single Python dataframe API. The concept is stream-table duality. https://www.confluent.io/blog/kafka-streams-tables-part-1-event-streaming/
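For anyone picking this up, here is a rough sketch of the duality in plain Python. It is only an illustration and uses no Ibis or Kafka APIs; the names `stream_to_table` and `table_to_stream` are hypothetical, and the event format (a key plus a value, with `None` meaning delete) is an assumption made for the example.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Assumed event shape: (key, value) change record; a value of None means "delete".
Event = Tuple[str, Optional[int]]


def stream_to_table(events: Iterable[Event]) -> Dict[str, int]:
    """Fold a changelog stream into a table holding the latest value per key."""
    table: Dict[str, int] = {}
    for key, value in events:
        if value is None:
            table.pop(key, None)  # a delete event removes the row if present
        else:
            table[key] = value  # an upsert event overwrites the row
    return table


def table_to_stream(old: Dict[str, int], new: Dict[str, int]) -> Iterator[Event]:
    """Emit the change events that turn the `old` table into the `new` one."""
    for key in old.keys() - new.keys():
        yield (key, None)  # rows that disappeared become deletes
    for key, value in new.items():
        if old.get(key) != value:
            yield (key, value)  # new or changed rows become upserts


events = [("a", 1), ("b", 2), ("a", 3), ("b", None)]
table = stream_to_table(events)                 # stream -> table
changes = list(table_to_stream({}, table))      # table -> stream
```

Replaying `changes` through `stream_to_table` rebuilds the same table, which is the sense in which one table-shaped API can describe both batch and streaming data.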
I'll leave this as a follow-up for someone more confident in batch/streaming concepts than me!
There's one more commit (the freeze) that seems lost in the GH incident. I guess it's good? I'm a little confused.
Description of changes
Issues closed
closes #8251
closes #8488