feat: Initial Bytewax materialization engine #2974
{
    "command": ["sh", "-c", "sh ./entrypoint.sh"],
    "env": job_env,
    "image": "bytewax/bytewax-feast:latest",
I'm guessing that this is https://hub.docker.com/r/bytewax/bytewax-feast - can we add some docs on what dependencies this container comes with and how it's built?
Additionally, I think this should be configurable (so that Feast users can potentially supply a custom container with a custom online store implementation), with this image being the default.
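One way to read the suggestion above, as a minimal sketch only: the engine config carries an `image` field that defaults to `bytewax/bytewax-feast:latest`, and the container spec is built from it. The names `BytewaxEngineConfig` and `build_container_spec` are hypothetical and are not taken from this PR; only the command, the `env` key, and the default image string come from the diff excerpt.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Default image string, copied from the diff excerpt above.
DEFAULT_IMAGE = "bytewax/bytewax-feast:latest"


@dataclass
class BytewaxEngineConfig:
    """Hypothetical config object; not the class used in this PR."""

    # Users could override this to point at their own image.
    image: str = DEFAULT_IMAGE


def build_container_spec(
    config: BytewaxEngineConfig, job_env: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Build the container entry, taking the image from config (hypothetical helper)."""
    return {
        "command": ["sh", "-c", "sh ./entrypoint.sh"],
        "env": job_env,
        "image": config.image,
    }
```

With this shape, `build_container_spec(BytewaxEngineConfig(), job_env)` keeps today's behavior, while a user with a custom online store could pass `BytewaxEngineConfig(image=...)` pointing at their own build.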
I think making it configurable is a good idea.
My thought was to create a repo that would be used to build the bytewax/bytewax-feast image and push it to Docker Hub. People who want to make changes can fork that repo and build whichever images they would like to use. What do you think?
I haven't taken that step quite yet, as I was building the Feast SDK from the source files that included the changes in this PR.
I think having a repo that builds the image makes sense. Having it as part of the main repo may be good too, since that keeps everything in the same place. Doing it in a subsequent PR is fine by me.
That sounds good. I'll add it in a subsequent PR, since it will depend on being able to install a release version of Feast that includes this code.
/ok-to-test
Codecov Report
@@ Coverage Diff @@
## master #2974 +/- ##
==========================================
+ Coverage 67.44% 75.94% +8.50%
==========================================
Files 169 203 +34
Lines 14936 16939 +2003
==========================================
+ Hits 10074 12865 +2791
+ Misses 4862 4074 -788
@whoahbot looking at this PR now, but do you mind signing your commits? Also you may need a rebase!
Signed-off-by: Dan Herrera <whoahbot@bytewax.io>
- Add integration test, by factoring out shared consistency test.
- Make the number of Pods dynamic, based on the number of .parquet file paths.
- Add instructions for creating a bytewax test cluster for integration testing.

Signed-off-by: Dan Herrera <whoahbot@bytewax.io>
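The second item in the commit message above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: it assumes the work runs as a Kubernetes Job with one Pod per `.parquet` path, and the function name `build_job_spec` is hypothetical. The keys `completions`, `parallelism`, and `completionMode` are standard Kubernetes Job spec fields.

```python
from typing import Any, Dict, List


def build_job_spec(parquet_paths: List[str]) -> Dict[str, Any]:
    """Size a Job so there is one Pod per .parquet file (hypothetical helper)."""
    pods = len(parquet_paths)
    if pods == 0:
        # Nothing to materialize; a Job with zero Pods would be meaningless.
        raise ValueError("expected at least one .parquet path")
    return {
        # Total number of Pods that must finish successfully.
        "completions": pods,
        # Run all of them at once, one per file.
        "parallelism": pods,
        # Indexed mode gives each Pod a stable index it can map to a path.
        "completionMode": "Indexed",
    }
```

Deriving the count from the path list means the Pod count tracks the input instead of being hard-coded.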
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: achals, whoahbot
# [0.24.0](v0.23.0...v0.24.0) (2022-08-25)

### Bug Fixes

* Check if on_demand_feature_views is an empty list rather than None for snowflake provider ([#3046](#3046)) ([9b05e65](9b05e65))
* FeatureStore.apply applies BatchFeatureView correctly ([#3098](#3098)) ([41be511](41be511))
* Fix Feast Java inconsistency with int64 serialization vs python ([#3031](#3031)) ([4bba787](4bba787))
* Fix feature service inference logic ([#3089](#3089)) ([4310ed7](4310ed7))
* Fix field mapping logic during feature inference ([#3067](#3067)) ([cdfa761](cdfa761))
* Fix incorrect on demand feature view diffing and improve Java tests ([#3074](#3074)) ([0702310](0702310))
* Fix Java helm charts to work with refactored logic. Fix FTS image ([#3105](#3105)) ([2b493e0](2b493e0))
* Fix on demand feature view output in feast plan + Web UI crash ([#3057](#3057)) ([bfae6ac](bfae6ac))
* Fix release workflow to release 0.24.0 ([#3138](#3138)) ([a69aaae](a69aaae))
* Fix Spark offline store type conversion to arrow ([#3071](#3071)) ([b26566d](b26566d))
* Fixing Web UI, which fails for the SQL registry ([#3028](#3028)) ([64603b6](64603b6))
* Force Snowflake Session to Timezone UTC ([#3083](#3083)) ([9f221e6](9f221e6))
* Make infer dummy entity join key idempotent ([#3115](#3115)) ([1f5b1e0](1f5b1e0))
* More explicit error messages ([#2708](#2708)) ([e4d7afd](e4d7afd))
* Parse inline data sources ([#3036](#3036)) ([c7ba370](c7ba370))
* Prevent overwriting existing file during `persist` ([#3088](#3088)) ([69af21f](69af21f))
* Register BatchFeatureView in feature repos correctly ([#3092](#3092)) ([b8e39ea](b8e39ea))
* Return an empty infra object from sql registry when it doesn't exist ([#3022](#3022)) ([8ba87d1](8ba87d1))
* Teardown tables for Snowflake Materialization testing ([#3106](#3106)) ([0a0c974](0a0c974))
* UI error when saved dataset is present in registry. ([#3124](#3124)) ([83cf753](83cf753))
* Update sql.py ([#3096](#3096)) ([2646a86](2646a86))
* Updated snowflake template ([#3130](#3130)) ([f0594e1](f0594e1))

### Features

* Add authentication option for snowflake connector ([#3039](#3039)) ([74c75f1](74c75f1))
* Add Cassandra/AstraDB online store contribution ([#2873](#2873)) ([feb6cb8](feb6cb8))
* Add Snowflake materialization engine ([#2948](#2948)) ([f3b522b](f3b522b))
* Adding saved dataset capabilities for Postgres ([#3070](#3070)) ([d3253c3](d3253c3))
* Allow passing repo config path via flag ([#3077](#3077)) ([0d2d951](0d2d951))
* Contrib azure provider with synapse/mssql offline store and Azure registry store ([#3072](#3072)) ([9f7e557](9f7e557))
* Custom Docker image for Bytewax batch materialization ([#3099](#3099)) ([cdd1b07](cdd1b07))
* Feast AWS Athena offline store (again) ([#3044](#3044)) ([989ce08](989ce08))
* Implement spark offline store `offline_write_batch` method ([#3076](#3076)) ([5b0cc87](5b0cc87))
* Initial Bytewax materialization engine ([#2974](#2974)) ([55c61f9](55c61f9))
* Refactor feature server helm charts to allow passing feature_store.yaml in environment variables ([#3113](#3113)) ([85ee789](85ee789))
What this PR does / why we need it:
This PR adds a Bytewax batch materialization engine.
Which issue(s) this PR fixes:
Fixes #