
Commit

docs: Add references to the extras users need to install and quickstarts (feast-dev#3249)

docs: Add references to the extras users need to install and quickstart templates

Signed-off-by: Danny Chiao <danny@tecton.ai>

adchia authored and felixwang9817 committed Sep 26, 2022
1 parent 50d5737 commit 7216ee9
Showing 17 changed files with 311 additions and 251 deletions.
3 changes: 3 additions & 0 deletions docs/reference/offline-stores/bigquery.md
@@ -7,6 +7,9 @@ The BigQuery offline store provides support for reading [BigQuerySources](../dat
* All joins happen within BigQuery.
* Entity dataframes can be provided as a SQL query or as a Pandas dataframe. A Pandas dataframe will be uploaded to BigQuery as a table (marked for expiration) in order to complete join operations.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[gcp]'`. You can then get started by running `feast init -t gcp`.
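
For orientation, here is a minimal sketch of a BigQuery-backed `feature_store.yaml`; the project name, registry path, and dataset are placeholder values, and the full set of options appears in the Example section below.

```yaml
project: my_feature_repo              # placeholder project name
registry: gs://my-bucket/registry.db  # placeholder registry path
provider: gcp
offline_store:
  type: bigquery
  dataset: my_bq_dataset              # placeholder BigQuery dataset used for staging and joins
```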

## Example

{% code title="feature_store.yaml" %}
39 changes: 21 additions & 18 deletions docs/reference/offline-stores/mssql.md
@@ -6,6 +6,9 @@ The MsSQL offline store provides support for reading [MsSQL Sources](../data-sou

* Entity dataframes can be provided as a SQL query or as a Pandas dataframe.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[azure]'`. You can then get started by following this [tutorial](https://github.com/feast-dev/feast/blob/master/docs/tutorials/azure/README.md).
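
As a rough sketch, the configuration looks something like the block below; the `type: mssql` key and the pyodbc-style connection string are assumptions based on the contrib MsSQL plugin, and all credentials are placeholders.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
  type: mssql   # assumed contrib offline store type key
  # placeholder pyodbc-style connection string
  connection_string: mssql+pyodbc://USER:PASSWORD@HOST:1433/DATABASE?driver=ODBC+Driver+17+for+SQL+Server
```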

## Disclaimer

The MsSQL offline store does not achieve full test coverage.
@@ -34,26 +37,26 @@ offline_store:
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the MsSQL offline store.
| | MsSql |
| :----------------------------------------------------------------- | :---- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | no |
| `write_logged_features` (persist logged features to offline store) | no |

Below is a matrix indicating which functionality is supported by `MsSqlServerRetrievalJob`.

| | MsSql |
| ----------------------------------------------------- | ----- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | no |
| export to data lake (S3, GCS, etc.) | no |
| export to data warehouse | no |
| local execution of Python-based on-demand transforms | no |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
45 changes: 24 additions & 21 deletions docs/reference/offline-stores/postgres.md
@@ -10,6 +10,9 @@ The PostgreSQL offline store provides support for reading [PostgreSQLSources](..
The PostgreSQL offline store does not achieve full test coverage.
Please do not assume complete stability.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[postgres]'`. You can then get started by running `feast init -t postgres`.
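
A minimal sketch of the corresponding `feature_store.yaml` follows; host, credentials, and schema are placeholders, and the full option list is in the Example section below.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
  type: postgres
  host: DB_HOST          # placeholder
  port: 5432
  database: DB_NAME      # placeholder
  db_schema: public
  user: DB_USERNAME      # placeholder
  password: DB_PASSWORD  # placeholder
```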

## Example

{% code title="feature_store.yaml" %}
@@ -42,29 +45,29 @@ The full set of configuration options is available in [PostgreSQLOfflineStoreCon
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the PostgreSQL offline store.

| | Postgres |
| :----------------------------------------------------------------- | :------- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | no |
| `write_logged_features` (persist logged features to offline store) | no |

Below is a matrix indicating which functionality is supported by `PostgreSQLRetrievalJob`.

| | Postgres |
| ----------------------------------------------------- | -------- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | yes |
| export to data lake (S3, GCS, etc.) | yes |
| export to data warehouse | yes |
| export as Spark dataframe | no |
| local execution of Python-based on-demand transforms | yes |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | yes |
| read partitioned data | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
45 changes: 24 additions & 21 deletions docs/reference/offline-stores/redshift.md
@@ -7,6 +7,9 @@ The Redshift offline store provides support for reading [RedshiftSources](../dat
* All joins happen within Redshift.
* Entity dataframes can be provided as a SQL query or as a Pandas dataframe. A Pandas dataframe will be uploaded to Redshift temporarily in order to complete join operations.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[aws]'`. You can then get started by running `feast init -t aws`.
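
A minimal sketch of the corresponding `feature_store.yaml` follows; the region, cluster, database, S3 staging location, and IAM role are all placeholders.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: aws
offline_store:
  type: redshift
  region: us-west-2                          # placeholder region
  cluster_id: my_redshift_cluster            # placeholder cluster
  database: dev                              # placeholder database
  user: admin                                # placeholder user
  s3_staging_location: s3://my-bucket/feast  # placeholder staging location
  iam_role: arn:aws:iam::123456789012:role/redshift_s3_access  # placeholder role
```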

## Example

{% code title="feature_store.yaml" %}
@@ -32,30 +35,30 @@ The full set of configuration options is available in [RedshiftOfflineStoreConfi
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the Redshift offline store.
| | Redshift |
| :----------------------------------------------------------------- | :------- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | yes |
| `write_logged_features` (persist logged features to offline store) | yes |

Below is a matrix indicating which functionality is supported by `RedshiftRetrievalJob`.

| | Redshift |
| ----------------------------------------------------- | -------- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | yes |
| export to SQL | yes |
| export to data lake (S3, GCS, etc.) | no |
| export to data warehouse | yes |
| export as Spark dataframe | no |
| local execution of Python-based on-demand transforms | yes |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | yes |
| read partitioned data | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).

49 changes: 28 additions & 21 deletions docs/reference/offline-stores/snowflake.md
@@ -6,6 +6,13 @@ The [Snowflake](https://trial.snowflake.com) offline store provides support for
* All joins happen within Snowflake.
* Entity dataframes can be provided as a SQL query or as a Pandas dataframe. A Pandas dataframe will be uploaded to Snowflake as a temporary table in order to complete join operations.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[snowflake]'`.

If you're using a file-based registry, you'll also need to install the relevant cloud extra (`pip install 'feast[snowflake, CLOUD]'`, where `CLOUD` is one of `aws`, `gcp`, or `azure`).

You can then get started by running `feast init -t snowflake`.
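
A minimal sketch of the corresponding `feature_store.yaml` follows; all account and credential values are placeholders.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
  type: snowflake.offline
  account: SNOWFLAKE_ACCOUNT   # placeholder account locator
  user: USERNAME               # placeholder
  password: PASSWORD           # placeholder
  role: ROLE_NAME              # placeholder
  warehouse: WAREHOUSE_NAME    # placeholder
  database: DATABASE_NAME      # placeholder
```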

## Example

{% code title="feature_store.yaml" %}
@@ -31,29 +38,29 @@ The full set of configuration options is available in [SnowflakeOfflineStoreConf
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the Snowflake offline store.
| | Snowflake |
| :----------------------------------------------------------------- | :-------- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | yes |
| `write_logged_features` (persist logged features to offline store) | yes |

Below is a matrix indicating which functionality is supported by `SnowflakeRetrievalJob`.

| | Snowflake |
| ----------------------------------------------------- | --------- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | yes |
| export to data lake (S3, GCS, etc.) | yes |
| export to data warehouse | yes |
| export as Spark dataframe | no |
| local execution of Python-based on-demand transforms | yes |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | yes |
| read partitioned data | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
45 changes: 24 additions & 21 deletions docs/reference/offline-stores/spark.md
@@ -11,6 +11,9 @@ The Spark offline store provides support for reading [SparkSources](../data-sour
The Spark offline store does not achieve full test coverage.
Please do not assume complete stability.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[spark]'`. You can then get started by running `feast init -t spark`.
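
A minimal sketch of the corresponding `feature_store.yaml` follows; the `spark_conf` map is passed through to the Spark session, and the entries shown are illustrative placeholders.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
  type: spark
  spark_conf:                       # passed through to the SparkSession
    spark.master: "local[*]"        # placeholder: run Spark locally
    spark.sql.catalogImplementation: "hive"
    spark.sql.session.timeZone: "UTC"
```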

## Example

{% code title="feature_store.yaml" %}
@@ -39,29 +42,29 @@ The full set of configuration options is available in [SparkOfflineStoreConfig](
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the Spark offline store.
| | Spark |
| :----------------------------------------------------------------- | :---- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | no |
| `write_logged_features` (persist logged features to offline store) | no |

Below is a matrix indicating which functionality is supported by `SparkRetrievalJob`.

| | Spark |
| ----------------------------------------------------- | ----- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | no |
| export to data lake (S3, GCS, etc.) | no |
| export to data warehouse | no |
| export as Spark dataframe | yes |
| local execution of Python-based on-demand transforms | no |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | yes |
| read partitioned data | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
