-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Online Serving unable to retrieve feature data after Feature Set update. #908
Fix Online Serving unable to retrieve feature data after Feature Set update. #908
Conversation
...nnectors/redis/src/main/java/feast/storage/connectors/redis/retriever/FeatureRowDecoder.java
Outdated
Show resolved
Hide resolved
...nnectors/redis/src/main/java/feast/storage/connectors/redis/retriever/FeatureRowDecoder.java
Outdated
Show resolved
Hide resolved
...nnectors/redis/src/main/java/feast/storage/connectors/redis/retriever/FeatureRowDecoder.java
Outdated
Show resolved
Hide resolved
.map( | ||
name -> { | ||
String nameHash = | ||
Hashing.murmur3_32().hashString(name, StandardCharsets.UTF_8).toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the length of the hash string? Just want to make sure its as small as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8 characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming that the fields stored in the feature row are float values (32 bit, 4 bytes), this would mean a ~3x increase in space consumption.
@woop @pyalex @khorshuheng
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not all fields that are stored are float, there're a lot strings as well and int64. So everything is not so bad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I guess that we can safely cut hash string to 4-5 chars
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mrzzy, pyalex The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This reverts commit aecc9a6e9a70be3fd84d04f81442b518be01a4c6.
/test test-end-to-end-batch-dataflow |
/lgtm |
…update. (#908) * Update RedisCustomIO to write FeatureRows with field's name set to hash of field. * Update FeatureRowDecoder to decode by name hash instead of order * Bump pytest order numbers by 2 to make space for new tests * Revert "Bump pytest order numbers by 2 to make space for new tests" This reverts commit aecc9a6e9a70be3fd84d04f81442b518be01a4c6. * Added e2e to check that feature rows with missing or extra fields can be retrieved * Clarify docs about Feature Row v1 encoding and Feature Row v2 encoding * Fix python lint * Update FeatureRowDecoder's isEncodedV2 check to use anyMatch() * Make missing field/extra field e2e tests independent of other tests. * Update FeatureRowDecoder if/else statement into 2 ifs * Fix python and java lint * Fix java unit test failures * Fix ImportJobTest java unit test * Sync github workflows with master * Sync .github folder with master for fix * Replace v1/v2 encoding with v1/v2 decoder in docs
Problem
Feast Online Serving will throw an when the user attempts Feature Retrieval in the following scenario:
What this PR does / why we need it:
Update Ingestion's
RedisCustomIO
to encode feature rows by setting field name to a hash of the actual name.Update Online Serving's
FeatureRowDecoder
to support decoding Feature Rows by name hash.Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: