
Commit

GitBook: [#334] Fix typo in stream ingestion docs and update other references to streaming
adchia authored and gitbook-bot committed Nov 8, 2021
1 parent d8bd8cf commit 821d284
Showing 4 changed files with 23 additions and 25 deletions.
15 changes: 7 additions & 8 deletions docs/getting-started/architecture-and-components/overview.md
Original file line number Diff line number Diff line change
@@ -1,32 +1,31 @@
# Overview

![Feast Architecture Diagram](../../.gitbook/assets/image%20%284%29.png)
![Feast Architecture Diagram](<../../.gitbook/assets/image (4).png>)

## Functionality

* **Create Batch Features:** ELT/ETL systems like Spark and SQL are used to transform data in the batch store.
* **Feast Apply:** The user \(or CI\) publishes versioned controlled feature definitions using `feast apply`. This CLI command updates infrastructure and persists definitions in the object store registry.
* **Feast Materialize:** The user \(or scheduler\) executes `feast materialize` which loads features from the offline store into the online store.
* **Feast Apply:** The user (or CI) publishes version-controlled feature definitions using `feast apply`. This CLI command updates infrastructure and persists definitions in the object store registry.
* **Feast Materialize:** The user (or scheduler) executes `feast materialize` which loads features from the offline store into the online store.
* **Model Training:** A model training pipeline is launched. It uses the Feast Python SDK to retrieve a training dataset and trains a model.
* **Get Historical Features:** Feast exports a point-in-time correct training dataset based on the list of features and entity dataframe provided by the model training pipeline.
* **Deploy Model:** The trained model binary \(and list of features\) are deployed into a model serving system. This step is not executed by Feast.
* **Deploy Model:** The trained model binary (and list of features) are deployed into a model serving system. This step is not executed by Feast.
* **Prediction:** A backend system makes a request for a prediction from the model serving service.
* **Get Online Features:** The model serving service makes a request to the Feast Online Serving service for online features using a Feast SDK.
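The **Get Historical Features** step hinges on point-in-time correctness: for each entity row, Feast returns the latest feature value at or before that row's event timestamp, never a future one. As a rough illustration of the idea only (plain pandas, not Feast's actual implementation; the column names are made up):

```python
import pandas as pd

# Entity dataframe: the rows we want training features for.
entity_df = pd.DataFrame({
    "driver_id": [1001],
    "event_timestamp": pd.to_datetime(["2021-11-08 12:00:00"]),
})

# Historical feature values, as they might sit in the offline store.
feature_df = pd.DataFrame({
    "driver_id": [1001, 1001, 1001],
    "event_timestamp": pd.to_datetime([
        "2021-11-08 10:00:00", "2021-11-08 11:00:00", "2021-11-08 13:00:00",
    ]),
    "conv_rate": [0.1, 0.2, 0.9],
})

# Point-in-time join: pick the most recent feature value at or before
# each entity row's timestamp, avoiding label leakage from the future.
training_df = pd.merge_asof(
    entity_df.sort_values("event_timestamp"),
    feature_df.sort_values("event_timestamp"),
    on="event_timestamp",
    by="driver_id",
    direction="backward",
)
# The 12:00 row gets conv_rate 0.2 (the 11:00 value), not the future 0.9.
```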

## Components

A complete Feast deployment contains the following components:

* **Feast Registry**: An object store \(GCS, S3\) based registry used to persist feature definitions that are registered with the feature store. Systems can discover feature data by interacting with the registry through the Feast SDK.
* **Feast Registry**: An object store (GCS, S3) based registry used to persist feature definitions that are registered with the feature store. Systems can discover feature data by interacting with the registry through the Feast SDK.
* **Feast Python SDK/CLI:** The primary user-facing SDK. Used to:
* Manage version controlled feature definitions.
* Materialize \(load\) feature values into the online store.
* Materialize (load) feature values into the online store.
* Build and retrieve training datasets from the offline store.
* Retrieve online features.
* **Online Store:** The online store is a database that stores only the latest feature values for each entity. The online store is populated by materialization jobs.
* **Online Store:** The online store is a database that stores only the latest feature values for each entity. The online store is populated by materialization jobs and from [stream ingestion](../../reference/alpha-stream-ingestion.md).
* **Offline Store:** The offline store persists batch data that has been ingested into Feast. This data is used for producing training datasets. Feast does not manage the offline store directly, but runs queries against it.

{% hint style="info" %}
Java and Go Clients are also available for online feature retrieval.
{% endhint %}

6 changes: 3 additions & 3 deletions docs/getting-started/concepts/feature-view.md
@@ -25,7 +25,7 @@ driver_stats_fv = FeatureView(
Feature views are used during

* The generation of training datasets by querying the data source of feature views in order to find historical feature values. A single training dataset may consist of features from multiple feature views.
* Loading of feature values into an online store. Feature views determine the storage schema in the online store.
* Loading of feature values into an online store. Feature views determine the storage schema in the online store. Feature values can be loaded from batch sources or from [stream sources](../../reference/alpha-stream-ingestion.md).
* Retrieval of features from the online store. Feature views provide the schema definition to Feast in order to look up features from the online store.

{% hint style="info" %}
@@ -57,7 +57,7 @@ global_stats_fv = FeatureView(

"Entity aliases" can be specified to join `entity_dataframe` columns that do not match the column names in the source table of a FeatureView.

This could be used if a user has no control over these column names or if there are multiple entities are a subclass of a more general entity. For example, "spammer" and "reporter" could be aliases of a "user" entity, and "origin" and "destination" could be aliases of a "location" entity as shown below.
This could be used if a user has no control over these column names, or if multiple entities are subclasses of a more general entity. For example, "spammer" and "reporter" could be aliases of a "user" entity, and "origin" and "destination" could be aliases of a "location" entity as shown below.

It is suggested that you dynamically specify the new FeatureView name using `.with_name` and `join_key_map` override using `.with_join_key_map` instead of needing to register each new copy.

Expand All @@ -78,6 +78,7 @@ location_stats_fv= FeatureView(
)
```
{% endtab %}

{% tab title="temperatures_feature_service.py" %}
```python
from location_stats_feature_view import location_stats_fv
@@ -150,4 +151,3 @@ def transformed_conv_rate(features_df: pd.DataFrame) -> pd.DataFrame:
df['conv_rate_plus_val2'] = (features_df['conv_rate'] + features_df['val_to_add_2'])
return df
```

25 changes: 12 additions & 13 deletions docs/getting-started/faq.md
@@ -20,11 +20,11 @@ No, there are [feature views without entities](concepts/feature-view.md#feature-

### Does Feast provide security or access control?

Feast currently does not support any access control other than the access control required for the Provider's environment \(for example, GCP and AWS permissions\).
Feast currently does not support any access control other than the access control required for the Provider's environment (for example, GCP and AWS permissions).

### Does Feast support streaming sources?

Feast is actively working on this right now. Please reach out to the Feast team if you're interested in giving feedback!
Yes. In earlier versions of Feast, we used Feast Spark to manage ingestion from stream sources. In the current version of Feast, we support [push-based ingestion](../reference/alpha-stream-ingestion.md).

### Does Feast support composite keys?

Expand All @@ -42,12 +42,12 @@ Feast is designed to work at scale and support low latency online serving. Bench

Yes. Specifically:

* Simple lists / dense embeddings:
* BigQuery supports list types natively
* Redshift does not support list types, so you'll need to serialize these features into strings \(e.g. json or protocol buffers\)
* Feast's implementation of online stores serializes features into Feast protocol buffers and supports list types \(see [reference](https://github.com/feast-dev/feast/blob/master/docs/specs/online_store_format.md#appendix-a-value-proto-format)\)
* Sparse embeddings \(e.g. one hot encodings\)
* One way to do this efficiently is to have a protobuf or string representation of [https://www.tensorflow.org/guide/sparse\_tensor](https://www.tensorflow.org/guide/sparse_tensor)
* Simple lists / dense embeddings:
* BigQuery supports list types natively
* Redshift does not support list types, so you'll need to serialize these features into strings (e.g. json or protocol buffers)
* Feast's implementation of online stores serializes features into Feast protocol buffers and supports list types (see [reference](https://github.com/feast-dev/feast/blob/master/docs/specs/online\_store\_format.md#appendix-a-value-proto-format))
* Sparse embeddings (e.g. one hot encodings)
* One way to do this efficiently is to have a protobuf or string representation of [https://www.tensorflow.org/guide/sparse\_tensor](https://www.tensorflow.org/guide/sparse\_tensor)
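As a toy sketch of that second option (stdlib only; the encoding format here is made up for illustration and is not a Feast or TensorFlow API), a one-hot or otherwise sparse vector can be serialized to a compact string of indices and values:

```python
import json

def encode_sparse(dense: list) -> str:
    # Keep only the non-zero entries, plus the total vector size.
    indices = [i for i, v in enumerate(dense) if v != 0.0]
    values = [dense[i] for i in indices]
    return json.dumps({"size": len(dense), "indices": indices, "values": values})

def decode_sparse(blob: str) -> list:
    # Rebuild the dense vector from the stored non-zero entries.
    obj = json.loads(blob)
    dense = [0.0] * obj["size"]
    for i, v in zip(obj["indices"], obj["values"]):
        dense[i] = v
    return dense

one_hot = [0.0, 0.0, 1.0, 0.0]
blob = encode_sparse(one_hot)  # e.g. '{"size": 4, "indices": [2], "values": [1.0]}'
assert decode_sparse(blob) == one_hot
```

A protobuf message with the same three fields would work the same way and be more compact than JSON.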

### Does Feast support X storage engine?

Expand All @@ -61,7 +61,7 @@ Please follow the instructions [here](../how-to-guides/adding-support-for-a-new-

Yes. There are two ways to use S3 in Feast:

* Using Redshift as a data source via Spectrum \([AWS tutorial](https://docs.aws.amazon.com/redshift/latest/dg/tutorial-nested-data-create-table.html)\), and then continuing with the [Running Feast with GCP/AWS](../how-to-guides/feast-gcp-aws/) guide. See a [presentation](https://youtu.be/pMFbRJ7AnBk?t=9463) we did on this at our apply\(\) meetup.
* Using Redshift as a data source via Spectrum ([AWS tutorial](https://docs.aws.amazon.com/redshift/latest/dg/tutorial-nested-data-create-table.html)), and then continuing with the [Running Feast with GCP/AWS](../how-to-guides/feast-gcp-aws/) guide. See a [presentation](https://youtu.be/pMFbRJ7AnBk?t=9463) we did on this at our apply() meetup.
* Using the `s3_endpoint_override` in a `FileSource` data source. This endpoint is more suitable for quick proof of concepts that won't necessarily scale for production use cases.

### How can I use Spark with Feast?
Expand All @@ -76,11 +76,11 @@ Please see the [roadmap](../roadmap.md).

### What is the difference between Feast 0.9 and Feast 0.10+?

Feast 0.10+ is much lighter weight and more extensible than Feast 0.9. It is designed to be simple to install and use. Please see this [document](https://docs.google.com/document/d/1AOsr_baczuARjCpmZgVd8mCqTF4AZ49OEyU4Cn-uTT0) for more details.
Feast 0.10+ is much lighter weight and more extensible than Feast 0.9. It is designed to be simple to install and use. Please see this [document](https://docs.google.com/document/d/1AOsr\_baczuARjCpmZgVd8mCqTF4AZ49OEyU4Cn-uTT0) for more details.

### How do I migrate from Feast 0.9 to Feast 0.10+?

Please see this [document](https://docs.google.com/document/d/1AOsr_baczuARjCpmZgVd8mCqTF4AZ49OEyU4Cn-uTT0). If you have any questions or suggestions, feel free to leave a comment on the document!
Please see this [document](https://docs.google.com/document/d/1AOsr\_baczuARjCpmZgVd8mCqTF4AZ49OEyU4Cn-uTT0). If you have any questions or suggestions, feel free to leave a comment on the document!

### How do I contribute to Feast?

Expand All @@ -93,6 +93,5 @@ Feast Core and Feast Serving were both part of Feast Java. We plan to support Fe
{% hint style="info" %}
**Don't see your question?**

We encourage you to ask questions on [Slack](https://slack.feast.dev/) or [Github](https://github.com/feast-dev/feast). Even better, once you get an answer, add the answer to this FAQ via a [pull request](../project/development-guide.md)!
We encourage you to ask questions on [Slack](https://slack.feast.dev) or [GitHub](https://github.com/feast-dev/feast). Even better, once you get an answer, add the answer to this FAQ via a [pull request](../project/development-guide.md)!
{% endhint %}

2 changes: 1 addition & 1 deletion docs/reference/alpha-stream-ingestion.md
@@ -20,7 +20,7 @@ Feast now allows users to push features previously registered in a feature view

## Example

See [https://github.com/feast-dev/feast-demo](https://github.com/feast-dev/on-demand-feature-views-demo) for an example on how to use on demand feature views.
See [https://github.com/feast-dev/feast-demo](https://github.com/feast-dev/on-demand-feature-views-demo) for an example on how to ingest stream data into Feast.

We register a feature view as normal, and during stream processing (e.g. Kafka consumers), now we push a dataframe matching the feature view schema:
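A minimal sketch of such a push (column names, feature view name, and the `store.write_to_online_store` call below are illustrative assumptions, not the demo repo's exact code; see the stream ingestion reference for the real API):

```python
import pandas as pd
from datetime import datetime, timezone

def build_stream_row(driver_id: int, conv_rate: float) -> pd.DataFrame:
    # The dataframe must match the registered feature view's schema:
    # entity key, event timestamp, and feature columns (names illustrative).
    return pd.DataFrame({
        "driver_id": [driver_id],
        "event_timestamp": [datetime.now(timezone.utc)],
        "conv_rate": [conv_rate],
    })

# Inside a Kafka consumer loop, each decoded message would then be pushed
# to the online store via the FeatureStore object, e.g. (hypothetical call):
#   store.write_to_online_store("driver_hourly_stats", build_stream_row(1001, 0.85))
```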

