-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Adding support for dictionary writes to online store #4156
feat: Adding support for dictionary writes to online store #4156
Conversation
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
sdk/python/feast/feature_store.py
Outdated
@@ -1406,7 +1407,8 @@ def push( | |||
def write_to_online_store( | |||
self, | |||
feature_view_name: str, | |||
df: pd.DataFrame, | |||
df: Optional[pd.DataFrame] = None, | |||
input_dict: Optional[Dict] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this make more sense as a single param with a Union type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could do a type check I just thought I'd be explicit about it first...because if I make it a single parameter then it would make sense to change the name of the input argument, which means then I'd have to make a breaking change or do a staged rollout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do a staggered release for this and get feedback from folks but I wouldn't want to make any breaking changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, df
name is really awkward. maybe call the new one something other than input_dict
so that it can become the universal one after we deprecate and remove df
? just inputs
for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I like that. I'll update.
hey quick question, how will the change optimize feature writes? 🤔 |
I think this is the same question for me :) |
@HaoXuAI @sudohainguyen I added some notes to the PR to be more explicit about the plan here. |
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
@franciscojavierarceo btw, what is |
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
I haven't spent any time using pyarrow but definitely open to it. |
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
@@ -1406,7 +1407,8 @@ def push( | |||
def write_to_online_store( | |||
self, | |||
feature_view_name: str, | |||
df: pd.DataFrame, | |||
df: Optional[pd.DataFrame] = None, | |||
inputs: Optional[Union[Dict[str, List[Any]], pd.DataFrame]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tokoko updated
…#4156) * feat: Adding support for dictionary writes to online store Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * Simple approach Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * lint Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * adding error if both are missing Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * rename dict to input_dict Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * updated input arg to test Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * Renaming function argument Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * updated docstring Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * updated type signature Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * updated type to be more explicit Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
…#4156) * feat: Adding support for dictionary writes to online store Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * Simple approach Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * lint Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * adding error if both are missing Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * rename dict to input_dict Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * updated input arg to test Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * Renaming function argument Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * updated docstring Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * updated type signature Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> * updated type to be more explicit Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> Signed-off-by: Theodor Mihalache <tmihalac@redhat.com>
# [0.38.0](v0.37.0...v0.38.0) (2024-05-24) ### Bug Fixes * Add vector database doc ([#4165](#4165)) ([37f36b6](37f36b6)) * Change checkout action back to v3 from v5 which isn't released yet ([#4147](#4147)) ([9523fff](9523fff)) * Change numpy version <1.25 dependency to <2 in setup.py ([#4085](#4085)) ([2ba71ff](2ba71ff)), closes [#4084](#4084) * Changed the code the way mysql container is initialized. ([#4140](#4140)) ([8b5698f](8b5698f)), closes [#4126](#4126) * Correct nightly install command, move all installs to uv ([#4164](#4164)) ([c86d594](c86d594)) * Default value is not set in Redis connection string using environment variable ([#4136](#4136)) ([95acfb4](95acfb4)), closes [#3669](#3669) * Get container host addresses from testcontainers (java) ([#4125](#4125)) ([9184dde](9184dde)) * Get rid of empty string `name_alias` during feature view projection deserialization ([#4116](#4116)) ([65056ce](65056ce)) * Helm chart `feast-feature-server`, improve Service template name ([#4161](#4161)) ([dedc164](dedc164)) * Improve the code related to on-demand-featureview. ([#4203](#4203)) ([d91d7e0](d91d7e0)) * Integration tests for async sdk method ([#4201](#4201)) ([08c44ae](08c44ae)) * Make sure schema is used when calling `get_table_query_string` method for Snowflake datasource ([#4131](#4131)) ([c1579c7](c1579c7)) * Make sure schema is used when generating `from_expression` for Snowflake ([#4177](#4177)) ([5051da7](5051da7)) * Pass native input values to `get_online_features` from feature server ([#4117](#4117)) ([60756cb](60756cb)) * Pass region to S3 client only if set (Java) ([#4151](#4151)) ([b8087f7](b8087f7)) * Pgvector patch ([#4108](#4108)) ([ad45bb4](ad45bb4)) * Update doc ([#4153](#4153)) ([e873636](e873636)) * Update master-only benchmark bucket name due to credential update ([#4183](#4183)) ([e88f1e3](e88f1e3)) * Updating the instructions for quickstart guide. ([#4120](#4120)) ([0c30e96](0c30e96)) * Upgrading the test container so that local tests works with updated d… ([#4155](#4155)) ([93ddb11](93ddb11)) ### Features * Add a Kubernetes Operator for the Feast Feature Server ([#4145](#4145)) ([4a696dc](4a696dc)) * Add delta format to `FileSource`, add support for it in ibis/duckdb ([#4123](#4123)) ([2b6f1d0](2b6f1d0)) * Add materialization support to ibis/duckdb ([#4173](#4173)) ([369ca98](369ca98)) * Add optional private key params to Snowflake config ([#4205](#4205)) ([20f5419](20f5419)) * Add s3 remote storage export for duckdb ([#4195](#4195)) ([6a04c48](6a04c48)) * Adding DatastoreOnlineStore 'database' argument. ([#4180](#4180)) ([e739745](e739745)) * Adding get_online_features_async to feature store sdk ([#4172](#4172)) ([311efc5](311efc5)) * Adding support for dictionary writes to online store ([#4156](#4156)) ([abfac01](abfac01)) * Elasticsearch vector database ([#4188](#4188)) ([bf99640](bf99640)) * Enable other distance metrics for Vector DB and Update docs ([#4170](#4170)) ([ba9f4ef](ba9f4ef)) * Feast/IKV datetime edgecase errors ([#4211](#4211)) ([bdae562](bdae562)) * Feast/IKV documenation language changes ([#4149](#4149)) ([690a621](690a621)) * Feast/IKV online store contrib plugin integration ([#4068](#4068)) ([f2b4eb9](f2b4eb9)) * Feast/IKV online store documentation ([#4146](#4146)) ([73601e4](73601e4)) * Feast/IKV upgrade client version ([#4200](#4200)) ([0e42150](0e42150)) * Incorporate substrait ODFVs into ibis-based offline store queries ([#4102](#4102)) ([c3a102f](c3a102f)) * Isolate input-dependent calculations in `get_online_features` ([#4041](#4041)) ([2a6edea](2a6edea)) * Make arrow primary interchange for online ODFV execution ([#4143](#4143)) ([3fdb716](3fdb716)) * Move data source validation entrypoint to offline store ([#4197](#4197)) ([a17725d](a17725d)) * Upgrading python version to 3.11, adding support for 3.11 as well. ([#4159](#4159)) ([4b1634f](4b1634f)), closes [#4152](#4152) [#4114](#4114) ### Reverts * Reverts "fix: Using version args to install the correct feast version" ([#4112](#4112)) ([b66baa4](b66baa4)), closes [#3953](#3953)
# [0.38.0](v0.37.0...v0.38.0) (2024-05-24) ### Bug Fixes * Add vector database doc ([#4165](#4165)) ([37f36b6](37f36b6)) * Change checkout action back to v3 from v5 which isn't released yet ([#4147](#4147)) ([9523fff](9523fff)) * Change numpy version <1.25 dependency to <2 in setup.py ([#4085](#4085)) ([2ba71ff](2ba71ff)), closes [#4084](#4084) * Changed the code the way mysql container is initialized. ([#4140](#4140)) ([8b5698f](8b5698f)), closes [#4126](#4126) * Correct nightly install command, move all installs to uv ([#4164](#4164)) ([c86d594](c86d594)) * Default value is not set in Redis connection string using environment variable ([#4136](#4136)) ([95acfb4](95acfb4)), closes [#3669](#3669) * Get container host addresses from testcontainers (java) ([#4125](#4125)) ([9184dde](9184dde)) * Get rid of empty string `name_alias` during feature view projection deserialization ([#4116](#4116)) ([65056ce](65056ce)) * Helm chart `feast-feature-server`, improve Service template name ([#4161](#4161)) ([dedc164](dedc164)) * Improve the code related to on-demand-featureview. ([#4203](#4203)) ([d91d7e0](d91d7e0)) * Integration tests for async sdk method ([#4201](#4201)) ([08c44ae](08c44ae)) * Make sure schema is used when calling `get_table_query_string` method for Snowflake datasource ([#4131](#4131)) ([c1579c7](c1579c7)) * Make sure schema is used when generating `from_expression` for Snowflake ([#4177](#4177)) ([5051da7](5051da7)) * Pass native input values to `get_online_features` from feature server ([#4117](#4117)) ([60756cb](60756cb)) * Pass region to S3 client only if set (Java) ([#4151](#4151)) ([b8087f7](b8087f7)) * Pgvector patch ([#4108](#4108)) ([ad45bb4](ad45bb4)) * Update doc ([#4153](#4153)) ([e873636](e873636)) * Update master-only benchmark bucket name due to credential update ([#4183](#4183)) ([e88f1e3](e88f1e3)) * Updating the instructions for quickstart guide. ([#4120](#4120)) ([0c30e96](0c30e96)) * Upgrading the test container so that local tests works with updated d… ([#4155](#4155)) ([93ddb11](93ddb11)) ### Features * Add a Kubernetes Operator for the Feast Feature Server ([#4145](#4145)) ([4a696dc](4a696dc)) * Add delta format to `FileSource`, add support for it in ibis/duckdb ([#4123](#4123)) ([2b6f1d0](2b6f1d0)) * Add materialization support to ibis/duckdb ([#4173](#4173)) ([369ca98](369ca98)) * Add optional private key params to Snowflake config ([#4205](#4205)) ([20f5419](20f5419)) * Add s3 remote storage export for duckdb ([#4195](#4195)) ([6a04c48](6a04c48)) * Adding DatastoreOnlineStore 'database' argument. ([#4180](#4180)) ([e739745](e739745)) * Adding get_online_features_async to feature store sdk ([#4172](#4172)) ([311efc5](311efc5)) * Adding support for dictionary writes to online store ([#4156](#4156)) ([abfac01](abfac01)) * Elasticsearch vector database ([#4188](#4188)) ([bf99640](bf99640)) * Enable other distance metrics for Vector DB and Update docs ([#4170](#4170)) ([ba9f4ef](ba9f4ef)) * Feast/IKV datetime edgecase errors ([#4211](#4211)) ([bdae562](bdae562)) * Feast/IKV documenation language changes ([#4149](#4149)) ([690a621](690a621)) * Feast/IKV online store contrib plugin integration ([#4068](#4068)) ([f2b4eb9](f2b4eb9)) * Feast/IKV online store documentation ([#4146](#4146)) ([73601e4](73601e4)) * Feast/IKV upgrade client version ([#4200](#4200)) ([0e42150](0e42150)) * Incorporate substrait ODFVs into ibis-based offline store queries ([#4102](#4102)) ([c3a102f](c3a102f)) * Isolate input-dependent calculations in `get_online_features` ([#4041](#4041)) ([2a6edea](2a6edea)) * Make arrow primary interchange for online ODFV execution ([#4143](#4143)) ([3fdb716](3fdb716)) * Move data source validation entrypoint to offline store ([#4197](#4197)) ([a17725d](a17725d)) * Upgrading python version to 3.11, adding support for 3.11 as well. ([#4159](#4159)) ([4b1634f](4b1634f)), closes [#4152](#4152) [#4114](#4114) ### Reverts * Reverts "fix: Using version args to install the correct feast version" ([#4112](#4112)) ([b66baa4](b66baa4)), closes [#3953](#3953)
What this PR does / why we need it:
This PR enables support for passing a Python
Dict
object to write to the online store. This is a first step along the path to optimizing feature writes to the online store.Note, this won't speed up anything and, in fact, introduces minor latency to serialize the dict into a DataFrame. In a subsequent PR I will directly use the dictionary object as it propagates through the Feast backend layer to write rather than any of the existing pandas code. When I do this next step, I will measure the performance benefits to highlight the latency gains.
Which issue(s) this PR fixes:
N/A
Fixes
N/A