danbaron63/online-features

Online Features Example

This repo contains sample code that implements streaming feature engineering on data read from a local Kafka instance deployed with Docker Compose.

Structure

This project contains three components:

  1. Data producer
  2. Kafka
  3. Spark consumer

Data Producer

The data producer is a simple Python application deployed in a Docker container. It randomly samples records from data/ and sends them to Kafka.
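The sample-and-send loop described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual producer: the function names are hypothetical, and `send` is any callable that accepts bytes, so the sketch runs without a Kafka client installed. With kafka-python, for example, `send` could wrap `KafkaProducer.send`; the topic name and broker address in the comment are assumptions.

```python
import json
import random
from typing import Callable, List


def sample_records(records: List[dict], n: int, rng: random.Random) -> List[dict]:
    """Draw n records at random, with replacement."""
    return [rng.choice(records) for _ in range(n)]


def produce(records: List[dict], send: Callable[[bytes], object], n: int, seed=None) -> int:
    """Serialize n randomly sampled records as JSON and pass each one to `send`.

    Returns the number of messages handed to `send`.
    """
    rng = random.Random(seed)
    count = 0
    for record in sample_records(records, n, rng):
        send(json.dumps(record).encode("utf-8"))
        count += 1
    return count


# Hypothetical wiring with kafka-python (topic and address are assumptions):
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="localhost:9092")
#   produce(records, lambda payload: producer.send("transactions", payload), n=100)
```

Taking `send` as a parameter keeps the sampling logic testable without a running broker.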

Kafka

Kafka is deployed in a Docker container with the necessary ports exposed.

Spark consumer

This is a PySpark application that consumes data from Kafka and performs some transformations. It can run in either streaming or batch mode.
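As a rough sketch of how one code path can serve both modes, the helper below picks `spark.readStream` or `spark.read` and configures Spark's Kafka source on it. The function names, topic, and broker address are hypothetical and not taken from src.consumer_spark; the Kafka source also needs the Spark Kafka connector on the classpath.

```python
from typing import Any, Dict


def kafka_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Options understood by Spark's Kafka source in both batch and streaming mode."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "earliest",
    }


def read_kafka(spark: Any, options: Dict[str, str], streaming: bool = True) -> Any:
    """Return a DataFrame backed by Kafka.

    `spark` is expected to be a pyspark.sql.SparkSession. The streaming flag
    selects spark.readStream (unbounded) or spark.read (bounded batch).
    """
    reader = spark.readStream if streaming else spark.read
    return reader.format("kafka").options(**options).load()


# Hypothetical usage (address and topic are assumptions):
#   df = read_kafka(spark, kafka_options("localhost:9092", "transactions"))
#   values = df.selectExpr("CAST(value AS STRING) AS json")
```

Kafka delivers `key` and `value` as binary columns, so a cast or parse step like the one in the usage comment usually comes before any transformations.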

Installation

To install the necessary Python dependencies and build the jars required by Spark, run:

make install

To activate your virtual environment, run:

source .venv/bin/activate

Run

Run Kafka

To run Kafka along with the random data producer, run:

docker compose up --build -d

This will build and run the necessary containers.

Run Spark

To run the Spark streaming job, run:

python -m src.consumer_spark

Feature Store

The feature store is available in demo/feature_repo. Features are defined in transaction_repo.py. A test workflow exercises the feature store by requesting online features through both the feature server and the Feast SDK.

Note: the feature store needs data to work with. Before continuing, complete the earlier steps and confirm that the Docker containers are up and the Spark streaming job is running successfully. See the Run section for details.

To set up the feature store, run:

cd demo/feature_repo
feast plan
feast apply

To populate the online store, materialize the features from the offline store:

CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
feast materialize-incremental $CURRENT_TIME
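The same step can also be driven from Python. The sketch below first reproduces the timestamp format used by the `date -u` command above, then calls the Feast SDK's `FeatureStore.materialize_incremental`. The helper names are hypothetical, and the repo path default assumes the code runs from demo/feature_repo.

```python
from datetime import datetime, timezone


def utc_timestamp(now=None) -> str:
    """Format a time in UTC like `date -u +"%Y-%m-%dT%H:%M:%S"`."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


def materialize_to_now(repo_path: str = ".") -> None:
    """Materialize all feature views up to the current time."""
    # Imported lazily so utc_timestamp works without Feast installed.
    from feast import FeatureStore

    store = FeatureStore(repo_path=repo_path)
    store.materialize_incremental(end_date=datetime.now(timezone.utc))
```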

To start the Feast feature server, run:

feast serve

You can then run the test workflow:

python test_workflow.py
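For orientation, a request to the feature server can look roughly like the sketch below. It is not the contents of test_workflow.py. It assumes `feast serve` is listening on its default port 6566 and uses the server's `/get-online-features` endpoint. The feature reference and entity key in the usage comment are hypothetical placeholders, so substitute the names defined in transaction_repo.py.

```python
import json
import urllib.request
from typing import Dict, List

# Assumed default address of `feast serve`.
FEATURE_SERVER_URL = "http://localhost:6566/get-online-features"


def build_request(features: List[str], entities: Dict[str, list],
                  url: str = FEATURE_SERVER_URL) -> urllib.request.Request:
    """Build the POST request the feature server expects."""
    body = json.dumps({"features": features, "entities": entities}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_online_features(features: List[str], entities: Dict[str, list]) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(features, entities)) as response:
        return json.load(response)


# Hypothetical usage (feature view, feature, and entity names are placeholders):
#   get_online_features(["transaction_stats:amount_sum"], {"account_id": [1, 2]})
```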
