Add offline_store config #1552
Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>
My main concern here is that the local provider works with both local and BQ offline stores, as does the GCP provider, so having the offline_store type option might be confusing to users especially since it doesn't affect interaction with the offline store. Not sure if that's already been discussed, and not sure what the better alternative would be though. Thoughts?
Ultimately it seems like we have to support decoupling storage (offline, online) from the provider if we want to have reuse, even if the provider still has to opt-in (or can override) the storage implementation. So at first glance, it seems natural to have the
What is the "it" here, the "offline_store" object, or the "provider" option?
+1 to Willem's question, I am not sure what the "it" here is. This seems like a fine change to our API, but I might be missing something here:
@tsotnet, is there a default offline store type for each provider? I just want to check, but I don't think users should have to specify the
"it" here is the offline_store type: users can set the type to be bigquery and then use file sources or both file and bigquery sources, or vice versa.
@jay there is a default offline store type. As you'd expect, it is FileOfflineStore for the local provider and BigqueryOfflineStore for the gcp provider.
@jklegar There are 2 directions we can go from here:
I think in the long term we should support multiple offline stores (and even multiple online stores), but it's not that simple. Currently we only partially support this: a project can have different data sources, but when retrieving historical data, all requested features need to come from the same offline store. I recommend a middle ground: allow defining multiple offline store configs, but keep the behavior of historical retrieval the same for now. Thoughts?
Codecov Report
@@            Coverage Diff             @@
##           master    #1552      +/-   ##
==========================================
- Coverage   83.71%   83.59%   -0.13%
==========================================
  Files          65       65
  Lines        5632     5680      +48
==========================================
+ Hits         4715     4748      +33
- Misses        917      932      +15
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: tsotnet, woop
* Add offline_store config
Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>
* Enforce offline_store during feast apply, rename entity_dataset_name to dataset
Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>
* Remove ugly getattr since it's unnecessary anymore
Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>
* Rename Bigquery to BigQuery
Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>
What this PR does / why we need it: This PR adds an offline_store config to feature_store.yaml. Specifically, it makes it possible to override the name of the BigQuery dataset used for the temporary entity dataframe.
Here is an example yaml file:
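The example from the original description did not survive in this copy of the page. The following is a minimal sketch of what such a feature_store.yaml might look like, based only on what the thread states: an offline_store block with a type (bigquery is mentioned in the discussion) and a dataset field (renamed from entity_dataset_name in this PR). The project name, registry path, and dataset name are placeholder values, not taken from the PR.

```yaml
# Sketch only: all values below are illustrative placeholders.
project: my_project            # hypothetical project name
registry: data/registry.db     # hypothetical registry path
provider: gcp                  # per the thread, gcp defaults to the BigQuery offline store
offline_store:
  type: bigquery               # offline store type discussed in the thread
  dataset: my_temp_dataset     # hypothetical dataset for the temporary entity dataframe
```

If the offline_store block is omitted, the thread indicates that each provider falls back to its default offline store (file-based for local, BigQuery for gcp).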
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: