
Commit

docs: Add references to the extras users need to install and quickstarts (feast-dev#3249)

docs: Add references to the extras users need to install and quickstart templates

Signed-off-by: Danny Chiao <danny@tecton.ai>

adchia authored and felixwang9817 committed Sep 26, 2022
1 parent 50d5737 commit 7216ee9
Showing 17 changed files with 311 additions and 251 deletions.
3 changes: 3 additions & 0 deletions docs/reference/offline-stores/bigquery.md
@@ -7,6 +7,9 @@ The BigQuery offline store provides support for reading [BigQuerySources](../dat
* All joins happen within BigQuery.
* Entity dataframes can be provided as a SQL query or as a Pandas dataframe. A Pandas dataframe will be uploaded to BigQuery as a table (marked for expiration) in order to complete join operations.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[gcp]'`. You can then get started by running `feast init -t gcp`.
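
For orientation, here is a minimal sketch of a BigQuery-backed `feature_store.yaml`; the project name, registry path, and dataset are placeholder values, and the full set of options appears in the Example section below.

```yaml
project: my_feature_repo              # placeholder project name
registry: gs://my-bucket/registry.db  # placeholder registry path
provider: gcp
offline_store:
  type: bigquery
  dataset: my_bq_dataset              # placeholder BigQuery dataset used for staging and joins
```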

## Example

{% code title="feature_store.yaml" %}
39 changes: 21 additions & 18 deletions docs/reference/offline-stores/mssql.md
@@ -6,6 +6,9 @@ The MsSQL offline store provides support for reading [MsSQL Sources](../data-sou

* Entity dataframes can be provided as a SQL query or as a Pandas dataframe.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[azure]'`. You can then get started by following this [tutorial](https://github.com/feast-dev/feast/blob/master/docs/tutorials/azure/README.md).
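
As a rough sketch, the configuration looks something like the block below; the `type: mssql` key and the pyodbc-style connection string are assumptions based on the contrib MsSQL plugin, and all credentials are placeholders.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
  type: mssql   # assumed contrib offline store type key
  # placeholder pyodbc-style connection string
  connection_string: mssql+pyodbc://USER:PASSWORD@HOST:1433/DATABASE?driver=ODBC+Driver+17+for+SQL+Server
```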

## Disclaimer

The MsSQL offline store does not achieve full test coverage.
@@ -34,26 +37,26 @@ offline_store:
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the MsSQL offline store.
| | MsSql |
| :----------------------------------------------------------------- | :---- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | no |
| `write_logged_features` (persist logged features to offline store) | no |

Below is a matrix indicating which functionality is supported by `MsSqlServerRetrievalJob`.

| | MsSql |
| ----------------------------------------------------- | ----- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | no |
| export to data lake (S3, GCS, etc.) | no |
| export to data warehouse | no |
| local execution of Python-based on-demand transforms | no |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
45 changes: 24 additions & 21 deletions docs/reference/offline-stores/postgres.md
@@ -10,6 +10,9 @@ The PostgreSQL offline store provides support for reading [PostgreSQLSources](..
The PostgreSQL offline store does not achieve full test coverage.
Please do not assume complete stability.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[postgres]'`. You can then get started by running `feast init -t postgres`.
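
A minimal sketch of the corresponding `feature_store.yaml` follows; host, credentials, and schema are placeholders, and the full option list is in the Example section below.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
  type: postgres
  host: DB_HOST          # placeholder
  port: 5432
  database: DB_NAME      # placeholder
  db_schema: public
  user: DB_USERNAME      # placeholder
  password: DB_PASSWORD  # placeholder
```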

## Example

{% code title="feature_store.yaml" %}
@@ -42,29 +45,29 @@ The full set of configuration options is available in [PostgreSQLOfflineStoreCon
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the PostgreSQL offline store.

| | Postgres |
| :----------------------------------------------------------------- | :------- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | no |
| `write_logged_features` (persist logged features to offline store) | no |

Below is a matrix indicating which functionality is supported by `PostgreSQLRetrievalJob`.

| | Postgres |
| ----------------------------------------------------- | -------- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | yes |
| export to data lake (S3, GCS, etc.) | yes |
| export to data warehouse | yes |
| export as Spark dataframe | no |
| local execution of Python-based on-demand transforms | yes |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | yes |
| read partitioned data | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
45 changes: 24 additions & 21 deletions docs/reference/offline-stores/redshift.md
@@ -7,6 +7,9 @@ The Redshift offline store provides support for reading [RedshiftSources](../dat
* All joins happen within Redshift.
* Entity dataframes can be provided as a SQL query or as a Pandas dataframe. A Pandas dataframe will be uploaded to Redshift temporarily in order to complete join operations.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[aws]'`. You can then get started by running `feast init -t aws`.
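
A minimal sketch of the corresponding `feature_store.yaml` follows; the region, cluster, database, S3 staging location, and IAM role are all placeholders.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: aws
offline_store:
  type: redshift
  region: us-west-2                          # placeholder region
  cluster_id: my_redshift_cluster            # placeholder cluster
  database: dev                              # placeholder database
  user: admin                                # placeholder user
  s3_staging_location: s3://my-bucket/feast  # placeholder staging location
  iam_role: arn:aws:iam::123456789012:role/redshift_s3_access  # placeholder role
```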

## Example

{% code title="feature_store.yaml" %}
@@ -32,30 +35,30 @@ The full set of configuration options is available in [RedshiftOfflineStoreConfi
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the Redshift offline store.
| | Redshift |
| :----------------------------------------------------------------- | :------- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | yes |
| `write_logged_features` (persist logged features to offline store) | yes |

Below is a matrix indicating which functionality is supported by `RedshiftRetrievalJob`.

| | Redshift |
| ----------------------------------------------------- | -------- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | yes |
| export to SQL | yes |
| export to data lake (S3, GCS, etc.) | no |
| export to data warehouse | yes |
| export as Spark dataframe | no |
| local execution of Python-based on-demand transforms | yes |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | yes |
| read partitioned data | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).

49 changes: 28 additions & 21 deletions docs/reference/offline-stores/snowflake.md
@@ -6,6 +6,13 @@ The [Snowflake](https://trial.snowflake.com) offline store provides support for
* All joins happen within Snowflake.
* Entity dataframes can be provided as a SQL query or as a Pandas dataframe. A Pandas dataframe will be uploaded to Snowflake as a temporary table in order to complete join operations.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[snowflake]'`.

If you're using a file-based registry, you'll also need to install the relevant cloud extra (`pip install 'feast[snowflake, CLOUD]'`, where `CLOUD` is one of `aws`, `gcp`, or `azure`).

You can then get started by running `feast init -t snowflake`.
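
A minimal sketch of the corresponding `feature_store.yaml` follows; all account and credential values are placeholders.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
  type: snowflake.offline
  account: SNOWFLAKE_ACCOUNT   # placeholder account locator
  user: USERNAME               # placeholder
  password: PASSWORD           # placeholder
  role: ROLE_NAME              # placeholder
  warehouse: WAREHOUSE_NAME    # placeholder
  database: DATABASE_NAME      # placeholder
```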

## Example

{% code title="feature_store.yaml" %}
@@ -31,29 +38,29 @@ The full set of configuration options is available in [SnowflakeOfflineStoreConf
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the Snowflake offline store.
| | Snowflake |
| :----------------------------------------------------------------- | :-------- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | yes |
| `write_logged_features` (persist logged features to offline store) | yes |

Below is a matrix indicating which functionality is supported by `SnowflakeRetrievalJob`.

| | Snowflake |
| ----------------------------------------------------- | --------- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | yes |
| export to data lake (S3, GCS, etc.) | yes |
| export to data warehouse | yes |
| export as Spark dataframe | no |
| local execution of Python-based on-demand transforms | yes |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | yes |
| read partitioned data | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
45 changes: 24 additions & 21 deletions docs/reference/offline-stores/spark.md
@@ -11,6 +11,9 @@ The Spark offline store provides support for reading [SparkSources](../data-sour
The Spark offline store does not achieve full test coverage.
Please do not assume complete stability.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[spark]'`. You can then get started by running `feast init -t spark`.
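
A minimal sketch of the corresponding `feature_store.yaml` follows; the `spark_conf` map is passed through to the Spark session, and the entries shown are illustrative placeholders.

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
  type: spark
  spark_conf:                       # passed through to the SparkSession
    spark.master: "local[*]"        # placeholder: run Spark locally
    spark.sql.catalogImplementation: "hive"
    spark.sql.session.timeZone: "UTC"
```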

## Example

{% code title="feature_store.yaml" %}
@@ -39,29 +42,29 @@ The full set of configuration options is available in [SparkOfflineStoreConfig](
The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the Spark offline store.
| | Spark |
| :----------------------------------------------------------------- | :---- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | no |
| `write_logged_features` (persist logged features to offline store) | no |

Below is a matrix indicating which functionality is supported by `SparkRetrievalJob`.

| | Spark |
| ----------------------------------------------------- | ----- |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | no |
| export to data lake (S3, GCS, etc.) | no |
| export to data warehouse | no |
| export as Spark dataframe | yes |
| local execution of Python-based on-demand transforms | no |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | yes |
| read partitioned data | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
