Add entity join key and fix entity references #1429
# create FeatureView
fv = FeatureView(
    name="test_bq_table_correctness",
    entities=["driver_id"],
    entities=["Driver ID"],
Wait, we support arbitrary text for the name? I don't think that's a good idea. We should probably only support lowercase alphanumeric characters and underscores.
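The naming rule suggested in this comment could be enforced with a small check along the following lines. This is only a sketch of the idea, not Feast's actual validation: the function name `validate_entity_name` and the exact pattern are assumptions made for illustration.

```python
import re

# Hypothetical rule taken from the review comment:
# lowercase letters, digits and underscores only.
_NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def validate_entity_name(name: str) -> str:
    """Return the name unchanged if it is valid, otherwise raise ValueError."""
    if not _NAME_PATTERN.fullmatch(name):
        raise ValueError(
            f"Invalid entity name {name!r}: use lowercase alphanumeric "
            "characters and underscores only"
        )
    return name
```

Under this rule `validate_entity_name("driver_id")` passes, while `validate_entity_name("Driver ID")` raises a `ValueError`.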
8b5150d to c2d1187
/kind housekeeping
sdk/python/feast/infra/gcp.py
Outdated
    feature_name_columns,
    event_timestamp_column,
    created_timestamp_column,
) = feature_view.run_reverse_field_mapping(join_keys)
I'm not sure it's good practice to query the registry here and pass in the join keys. It's almost a half step. We should be pulling the feature views from the registry here, based on their names. Then we should either
- Pull the entities as you've done it here and have a helper function like run_reverse_field_mapping(feature_view, entities), which is a reversal of this method approach, or
- When we pull the FeatureViews from the registry, they get a reference to the FeatureStore stored inside of them. Then we can add a method like FeatureView.get_entities() which returns its Entity objects. We may need to rename the existing attribute to FeatureView.entity_names so that it isn't confusing. Anyway, then when you run FeatureView.run_reverse_field_mapping() it will find its own entities using the reference it has to the FeatureStore class, as opposed to them being passed in, or
- We remove this List[str] as entities and replace it with List[Entity], so that when we do a get_feature_view() we actually get the entity details back in the first place. It's also much cleaner for end users when registering FeatureViews, since they don't have to deal with string references.
Given how much time is left, I am leaning towards the first option, but it's definitely technical debt.
What do you think @jklegar ?
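To make the first option concrete, here is a rough sketch of what a standalone helper might look like. Everything in it is an assumption for illustration: the `Entity` and `FeatureView` classes are simplified stand-ins rather than the real Feast classes, the `field_mapping` direction (source column to logical name) is assumed, and the shape of the return value is a guess, since the diff above only shows part of the unpacked tuple.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Entity:  # hypothetical stand-in, not feast.Entity
    name: str
    join_key: str


@dataclass(frozen=True)
class FeatureView:  # hypothetical stand-in, not feast.FeatureView
    name: str
    entities: List[str]  # entity names, as string references
    features: List[str]
    event_timestamp_column: str
    created_timestamp_column: Optional[str] = None
    # Assumed direction: source column name -> logical name
    field_mapping: Dict[str, str] = field(default_factory=dict)


def run_reverse_field_mapping(
    feature_view: FeatureView, entities: List[Entity]
) -> Tuple[List[str], List[str], str, Optional[str]]:
    """Map logical names back to source column names, with no registry access."""
    reverse = {logical: source for source, logical in feature_view.field_mapping.items()}
    by_name = {e.name: e for e in entities}

    # Resolve entity names to join keys first, then to source columns.
    join_keys = [by_name[n].join_key for n in feature_view.entities]
    join_key_columns = [reverse.get(k, k) for k in join_keys]
    feature_name_columns = [reverse.get(f, f) for f in feature_view.features]

    event_ts = feature_view.event_timestamp_column
    created_ts = feature_view.created_timestamp_column
    return (
        join_key_columns,
        feature_name_columns,
        reverse.get(event_ts, event_ts),
        reverse.get(created_ts, created_ts) if created_ts is not None else None,
    )
```

Because the helper takes both the feature view and the already-fetched entities as arguments, the caller decides when the registry is queried, and the helper itself can be tested without one.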
Yeah, there's definitely tech debt here. IMO the main problem is that getting the feature view from the registry doesn't also get the entities from the registry, and we should do something like your third option (in a separate diff). Happy to do the first option here, though I'm not sure how much it actually gets us, since we need join_keys in this method further down anyway.
I guess my concern is just that we are making the code base more complicated (using a method) but we're not gaining anything from that since the method acts like a function.
> need join_keys in this method further down anyway
Wouldn't we be able to use this list of entities to get the join keys? I think what I am trying to avoid as much as possible is having methods with I/O. I want us to try and follow a more immutable/functional approach as much as we can (which I realize isn't 100% possible). So get_entities/get_feature_views etc. should ideally be run only once near the start of one of our methods, and then the state should just propagate down. It'll be way easier to reason about and test long term.
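The "fetch once, then propagate state" approach described here might look roughly like the sketch below. All names are hypothetical: `InMemoryRegistry` is a toy stand-in for the real registry that counts reads so the I/O is visible, and `join_keys_for` / `join_keys_per_view` are illustrative helpers, not existing Feast functions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entity:  # hypothetical stand-in, not feast.Entity
    name: str
    join_key: str


class InMemoryRegistry:
    """Toy registry (an assumption for this sketch) that counts reads."""

    def __init__(self, entities: List[Entity]) -> None:
        self._entities = list(entities)
        self.reads = 0

    def list_entities(self) -> List[Entity]:
        self.reads += 1  # stands in for a real I/O call
        return list(self._entities)


def join_keys_for(entity_names: List[str], entities_by_name: Dict[str, Entity]) -> List[str]:
    """Pure helper: derives join keys from state that was already fetched."""
    return [entities_by_name[name].join_key for name in entity_names]


def join_keys_per_view(
    registry: InMemoryRegistry, views: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    # The only registry access happens here, once, at the top of the method.
    entities_by_name = {e.name: e for e in registry.list_entities()}
    # Everything below works on plain data passed down as arguments.
    return {
        view_name: join_keys_for(entity_names, entities_by_name)
        for view_name, entity_names in views.items()
    }
```

With this structure, `join_keys_for` can be unit tested with a plain dictionary, and only the top-level function needs a registry (or a fake one) at all.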
989369c to aedc552
Signed-off-by: Jacob Klegar <jacob@tecton.ai>
ea644bb to bd85028
Signed-off-by: Willem Pienaar <git@willem.co>
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jklegar, woop
/lgtm
What this PR does / why we need it: Add entity join key and fix FeatureStore methods to use join keys instead of input entity names
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: