Improve comments on target user and unify intro summaries #12418
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,24 +17,28 @@ | |
#![warn(missing_docs, clippy::needless_borrow)] | ||
|
||
//! [DataFusion] is an extensible query engine written in Rust that | ||
I clarified this text and made it consistent with the other intros. |
||
//! uses [Apache Arrow] as its in-memory format. DataFusion help developers | ||
//! build fast and feature rich database and analytic systems, customized to | ||
//! particular workloads. See [use cases] for examples | ||
//! uses [Apache Arrow] as its in-memory format. DataFusion's target users are | ||
//! developers building fast and feature rich database and analytic systems, | ||
//! customized to particular workloads. See [use cases] for examples. | ||
//! | ||
//! "Out of the box," DataFusion quickly runs complex [SQL] and | ||
//! [`DataFrame`] queries using a full-featured query planner, a columnar, | ||
//! streaming, multi-threaded, vectorized execution engine, and partitioned data | ||
//! sources (Parquet, CSV, JSON, and Avro). | ||
//! "Out of the box," DataFusion offers [SQL] and [`Dataframe`] APIs, | ||
//! excellent [performance], built-in support for CSV, Parquet, JSON, and Avro, | ||
//! extensive customization, and a great community. | ||
//! [Python Bindings] are also available. | ||
//! | ||
//! DataFusion is designed for easy customization such as | ||
//! additional data sources, query languages, functions, custom | ||
//! operators and more. See the [Architecture] section for more details. | ||
//! DataFusion features a full query planner, a columnar, streaming, multi-threaded, | ||
//! vectorized execution engine, and partitioned data sources. You can | ||
//! customize DataFusion at almost all points including additional data sources, | ||
//! query languages, functions, custom operators and more. | ||
//! See the [Architecture] section below for more details. | ||
//! | ||
//! [DataFusion]: https://datafusion.apache.org/ | ||
//! [Apache Arrow]: https://arrow.apache.org | ||
//! [use cases]: https://datafusion.apache.org/user-guide/introduction.html#use-cases | ||
//! [SQL]: https://datafusion.apache.org/user-guide/sql/index.html | ||
//! [`DataFrame`]: dataframe::DataFrame | ||
//! [performance]: https://benchmark.clickhouse.com/ | ||
//! [Python Bindings]: https://github.com/apache/datafusion-python | ||
//! [Architecture]: #architecture | ||
//! | ||
//! # Examples | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,14 +32,23 @@ Apache DataFusion | |
<a class="github-button" href="https://github.com/apache/datafusion/fork" data-size="large" data-show-count="true" aria-label="Fork apache/datafusion on GitHub">Fork</a> | ||
</p> | ||
|
||
DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in | ||
`Rust <http://rustlang.org>`_, using the `Apache Arrow <https://arrow.apache.org>`_ | ||
in-memory format. | ||
|
||
DataFusion offers SQL and Dataframe APIs, excellent | ||
`performance <https://benchmark.clickhouse.com>`_, built-in support for | ||
CSV, Parquet, JSON, and Avro, extensive customization, and a great | ||
community. | ||
|
||
DataFusion is an extensible query engine written in `Rust <http://rustlang.org>`_ that | ||
I think it could be argued that we should move this content out of the GitHub README.md and leave a link to the main website https://datafusion.apache.org/ 🤔 |
||
uses `Apache Arrow <https://arrow.apache.org>`_ as its in-memory format. DataFusion's target users are | ||
developers building fast and feature rich database and analytic systems, | ||
customized to particular workloads. See `use cases <https://datafusion.apache.org/user-guide/introduction.html#use-cases>`_ for examples. | ||
|
||
"Out of the box," DataFusion offers `SQL <https://datafusion.apache.org/user-guide/sql/index.html>`_ | ||
and `Dataframe <https://docs.rs/datafusion/latest/datafusion/dataframe/struct.DataFrame.html>`_ APIs, | ||
excellent `performance <https://benchmark.clickhouse.com/>`_, built-in support for CSV, Parquet, JSON, and Avro, | ||
extensive customization, and a great community. | ||
`Python Bindings <https://github.com/apache/datafusion-python>`_ are also available. | ||
|
||
DataFusion features a full query planner, a columnar, streaming, multi-threaded, | ||
vectorized execution engine, and partitioned data sources. You can | ||
customize DataFusion at almost all points including additional data sources, | ||
query languages, functions, custom operators and more. | ||
See the `Architecture <https://datafusion.apache.org/contributor-guide/architecture.html>`_ section for more details. | ||
|
||
To get started, see | ||
|
||
|
This is now the same as in lib.rs
While it is redundant to have the same content in three places, I think it is worthwhile, as these are the three most common "landing" pages for people with DataFusion: