feat: Support on demand feature views in go feature server #2494
Codecov Report
@@ Coverage Diff @@
## master #2494 +/- ##
===========================================
- Coverage 84.49% 59.56% -24.93%
===========================================
Files 131 132 +1
Lines 11080 11043 -37
===========================================
- Hits 9362 6578 -2784
- Misses 1718 4465 +2747
just some initial comments; will review again later
sdk/python/tests/integration/feature_repos/repo_configuration.py
sdk/python/feast/feature_store.py
@@ -1939,6 +1941,9 @@ def serve_transformations(self, port: int) -> None:

         transformation_server.start_server(self, port)

+    def _refresh_go_server(self):
this name is pretty misleading - I think your idea is that setting `self._go_server = None` will force the FS to reinitialize the go server on a new `get_online_features`; if so, maybe add a comment to describe exactly what you're doing
Renamed to `_teardown..`. Yeah, the idea is to kill the current instance (if it exists), and it will be recreated on the next request. Added a comment about that.
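The teardown-then-recreate idea described in this reply can be sketched in a few lines. The method under review is Python; the sketch below transposes the pattern to Go purely for illustration, and every name in it (`featureStore`, `goServer`, `teardownServer`, `getServer`) is hypothetical rather than taken from the Feast codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// goServer is a hypothetical stand-in for the embedded server handle.
type goServer struct{ id int }

// featureStore is a hypothetical wrapper that owns at most one server.
type featureStore struct {
	mu      sync.Mutex
	server  *goServer
	started int // how many servers have been created so far
}

// teardownServer drops the current instance, if any. Nothing is
// recreated here; that happens lazily on the next getServer call.
func (fs *featureStore) teardownServer() {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	fs.server = nil
}

// getServer returns the current instance, creating one if needed.
func (fs *featureStore) getServer() *goServer {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	if fs.server == nil {
		fs.started++
		fs.server = &goServer{id: fs.started}
	}
	return fs.server
}

func main() {
	fs := &featureStore{}
	first := fs.getServer()
	fs.teardownServer()
	second := fs.getServer()
	fmt.Println(first.id, second.id) // prints "1 2"
}
```

Keeping the teardown separate from the creation means callers that invalidate the server do not pay the startup cost unless another request actually arrives.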
Signed-off-by: pyalex <moskalenko.alexey@gmail.com>
looks good
@@ -32,6 +32,7 @@ github.com/Azure/go-autorest/logger v0.2.1/go.mod h1:T9E3cAhj2VqvPOtCYAvby9aBXkZ
 github.com/Azure/go-autorest/tracing v0.6.0/go.mod h1:+vhtPC754Xsa23ID7GlGsrdKBpUA79WCAKPPZVC2DeU=
 github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
 github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo=
+github.com/JohnCGriffin/overflow v0.0.0-20211019200055-46fa312c352c/go.mod h1:X0CRv0ky0k6m906ixxpzmDRLvX58TFUKS2eePweuyxk=
Such a weird dep to pull in. This project was last updated years ago.
Seems like it's a `github.com/apache/arrow/go/v7` dependency
github.com/antihax/optional v1.0.0/go.mod h1:uupD/76wgC+ih3iEmQUL+0Ugr19nfwCT1kdvxnR2qWY=
github.com/apache/arrow/go/arrow v0.0.0-20200730104253-651201b0f516 h1:byKBBF2CKWBjjA4J1ZL2JXttJULvWSl50LegTyRZ728=
How come this is present in addition to the `v7` version a few lines below?
This comes from these dependencies, which were added in #2446:
github.com/xitongsys/parquet-go v1.6.2
github.com/xitongsys/parquet-go-source v0.0.0-20220315005136-aec0fe3e777c
I will fix this in the following refactoring PR
go/cmd/server/main.go
@@ -17,6 +17,10 @@ const (
 	feastServerVersion = "0.18.0"
 )

+func dummyTransformCallback(ODFVName string, inputArrPtr, inputSchemaPtr, outArrPtr, outSchemaPtr uintptr, fullFeatureNames bool) int {
Docs for this function? At least to explain what the `return 0` means, or to use a more meaningful return type and name.
Basically, w/o Python as a wrapper, on demand transformations are not supported. I replaced this dummy implementation with passing `nil` to the `FeatureStore` constructor.
I also added a doc about the `TransformationCallback` signature (see `transformation.go`).
entities, _ := s.fs.ListEntities(true)
entitiesByName := make(map[string]*feast.Entity)
for _, entity := range entities {
	entitiesByName[entity.Name] = entity
}
I wonder if we should pull this into the feature store struct, since it seems like it may be a commonly used method.
Good point. This will be addressed in the following refactoring PR
}
for entityName, _ := range view.Entities {
	entity := entitiesByName[entityName]
	joinKeyTypes[entity.JoinKey] = int32(entity.ValueType.Number())
maybe return `map[string]ValueType` instead? More readable.
`GetEntityTypesMapByFeatureService` is a function called from Python. Gopy doesn't understand that ValueType is an alias for int32, so packing and unpacking it would be an additional headache; I'm not sure it would make it more readable.
I'm just saying that using native types is easier in some cases.
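The two positions in this exchange can coexist: typed maps inside Go, native `int32` only at the boundary exposed to Python. The sketch below is an illustration under assumptions; `ValueType` here is a hypothetical named type over `int32` standing in for the generated enum, and the constant values are placeholders rather than the real enum numbers.

```go
package main

import "fmt"

// ValueType is a hypothetical stand-in for the generated enum type,
// modeled as a named type whose underlying type is int32.
type ValueType int32

// Placeholder values for the demo; not taken from the Feast protos.
const (
	exampleString ValueType = 2
	exampleInt64  ValueType = 4
)

// toNative converts a typed map into the plain map[string]int32 form
// that a binding generator can handle without knowing about ValueType.
func toNative(typed map[string]ValueType) map[string]int32 {
	out := make(map[string]int32, len(typed))
	for key, vt := range typed {
		out[key] = int32(vt)
	}
	return out
}

func main() {
	joinKeyTypes := map[string]ValueType{
		"driver_id":   exampleInt64,
		"customer_id": exampleString,
	}
	native := toNative(joinKeyTypes)
	fmt.Println(native["driver_id"], native["customer_id"]) // prints "4 2"
}
```

The conversion is a single loop, so the cost of keeping the typed map internally is small; whether it is worth the extra function is the judgment call the thread is debating.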
go/cmd/server/main.go
@@ -41,7 +45,7 @@ func main() {
 	}

 	log.Println("Initializing feature store...")
-	fs, err := feast.NewFeatureStore(repoConfig)
+	fs, err := feast.NewFeatureStore(repoConfig, dummyTransformCallback)
If we wanted to start the standalone go feature server, how can we instantiate the transformation callback in python and pass it in here?
Good question. With the current implementation, Go code must always be wrapped in Python; it's an easy way to have a Python interpreter in the same process. The Go feature server can run in standalone mode, but w/o support for transformations and Python-based store connectors.
Signed-off-by: pyalex <moskalenko.alexey@gmail.com>
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: achals, pyalex
# [0.20.0](v0.19.0...v0.20.0) (2022-04-14)

### Bug Fixes

* Add inlined data sources to the top level registry ([#2456](#2456)) ([356788a](356788a))
* Add new value types to types.ts for web ui ([#2463](#2463)) ([ad5694e](ad5694e))
* Add PushSource proto and Python class ([#2428](#2428)) ([9a4bd63](9a4bd63))
* Add spark to lambda dockerfile ([#2480](#2480)) ([514666f](514666f))
* Added private_key auth for Snowflake ([#2508](#2508)) ([c42c9b0](c42c9b0))
* Added Redshift and Spark typecheck to data_source event_timestamp_col inference ([#2389](#2389)) ([04dea73](04dea73))
* Building of go extension fails ([#2448](#2448)) ([7d1efd5](7d1efd5))
* Bump the number of versions bumps expected to 27 ([#2549](#2549)) ([ecc9938](ecc9938))
* Create __init__ files for the proto-generated python dirs ([#2410](#2410)) ([e17028d](e17028d))
* Don't prevent apply from running given duplicate empty names in data sources. Also fix repeated apply of Spark data source. ([#2415](#2415)) ([b95f441](b95f441))
* Dynamodb deduplicate batch write request by partition keys ([#2515](#2515)) ([70d4a13](70d4a13))
* Ensure that __init__ files exist in proto dirs ([#2433](#2433)) ([9b94f7b](9b94f7b))
* Fix DataSource constructor to unbreak custom data sources ([#2492](#2492)) ([712653e](712653e))
* Fix default feast apply path without any extras ([#2373](#2373)) ([6ba7fc7](6ba7fc7))
* Fix definitions.py with new definition ([#2541](#2541)) ([eefc34a](eefc34a))
* Fix entity row to use join key instead of name ([#2521](#2521)) ([c22fa2c](c22fa2c))
* Fix Java Master ([#2499](#2499)) ([e083458](e083458))
* Fix registry proto ([#2435](#2435)) ([ea6a9b2](ea6a9b2))
* Fix some inconsistencies in the docs and comments in the code ([#2444](#2444)) ([ad008bf](ad008bf))
* Fix spark docs ([#2382](#2382)) ([d4a606a](d4a606a))
* Fix Spark template to work correctly on feast init -t spark ([#2393](#2393)) ([ae133fd](ae133fd))
* Fix the feature repo fixture used by java tests ([#2469](#2469)) ([32e925e](32e925e))
* Fix unhashable Snowflake and Redshift sources ([cd8f1c9](cd8f1c9))
* Fixed bug in passing config file params to snowflake python connector ([#2503](#2503)) ([34f2b59](34f2b59))
* Fixing Spark template to include source name ([#2381](#2381)) ([a985f1d](a985f1d))
* Make name a keyword arg for the Entity class ([#2467](#2467)) ([43847de](43847de))
* Making a name for data sources not a breaking change ([#2379](#2379)) ([71d7ae2](71d7ae2))
* Minor link fix in `CONTRIBUTING.md` ([#2481](#2481)) ([2917e27](2917e27))
* Preserve ordering of features in _get_column_names ([#2457](#2457)) ([495b435](495b435))
* Relax click python requirement to >=7 ([#2450](#2450)) ([f202f92](f202f92))
* Remove date partition column field from datasources that don't s… ([#2478](#2478)) ([ce35835](ce35835))
* Remove docker step from unit test workflow ([#2535](#2535)) ([6f22f22](6f22f22))
* Remove spark from the AWS Lambda dockerfile ([#2498](#2498)) ([6abae16](6abae16))
* Request data api update ([#2488](#2488)) ([0c9e5b7](0c9e5b7))
* Schema update ([#2509](#2509)) ([cf7bbc2](cf7bbc2))
* Simplify DataSource.from_proto logic ([#2424](#2424)) ([6bda4d2](6bda4d2))
* Snowflake api update ([#2487](#2487)) ([1181a9e](1181a9e))
* Support passing batch source to streaming sources for backfills ([#2523](#2523)) ([90db1d1](90db1d1))
* Timestamp update ([#2486](#2486)) ([bf23111](bf23111))
* Typos in Feast UI error message ([#2432](#2432)) ([e14369d](e14369d))
* Update feature view APIs to prefer keyword args ([#2472](#2472)) ([7c19cf7](7c19cf7))
* Update file api ([#2470](#2470)) ([83a11c6](83a11c6))
* Update Makefile to cd into python dir before running commands ([#2437](#2437)) ([ca32155](ca32155))
* Update redshift api ([#2479](#2479)) ([4fa73a9](4fa73a9))
* Update some fields optional in UI parser ([#2380](#2380)) ([cff7ac3](cff7ac3))
* Use a single version of jackson libraries and upgrade to 2.12.6.1 ([#2473](#2473)) ([5be1cc6](5be1cc6))
* Use dateutil parser to parse materialization times ([#2464](#2464)) ([6c55e49](6c55e49))
* Use the correct dockerhub image tag when building feature servers ([#2372](#2372)) ([0d62c1d](0d62c1d))

### Features

* Add `/write-to-online-store` method to the python feature server ([#2423](#2423)) ([d2fb048](d2fb048))
* Add description, tags, owner fields to all feature view classes ([#2440](#2440)) ([ed5e928](ed5e928))
* Add DQM Logging on GRPC Server with FileLogStorage for Testing ([#2403](#2403)) ([57a97d8](57a97d8))
* Add Feast types in preparation for changing type system ([#2475](#2475)) ([4864252](4864252))
* Add Field class ([#2500](#2500)) ([1279612](1279612))
* Add support for DynamoDB online_read in batches ([#2371](#2371)) ([702ec49](702ec49))
* Add Support for DynamodbOnlineStoreConfig endpoint_url parameter ([#2485](#2485)) ([7b863d1](7b863d1))
* Add templating for dynamodb table name ([#2394](#2394)) ([f591088](f591088))
* Allow local feature server to use Go feature server if enabled ([#2538](#2538)) ([a2ef375](a2ef375))
* Allow using entity's join_key in get_online_features ([#2420](#2420)) ([068c765](068c765))
* Data Source Api Update ([#2468](#2468)) ([6b96b21](6b96b21))
* Go server ([#2339](#2339)) ([d12e7ef](d12e7ef)), closes [#2354](#2354) [#2361](#2361) [#2332](#2332) [#2356](#2356) [#2363](#2363) [#2349](#2349) [#2355](#2355) [#2336](#2336) [#2361](#2361) [#2363](#2363) [#2344](#2344) [#2354](#2354) [#2347](#2347) [#2350](#2350) [#2356](#2356) [#2355](#2355) [#2349](#2349) [#2352](#2352) [#2341](#2341) [#2336](#2336) [#2373](#2373) [#2315](#2315) [#2372](#2372) [#2332](#2332) [#2349](#2349) [#2336](#2336) [#2361](#2361) [#2363](#2363) [#2344](#2344) [#2354](#2354) [#2347](#2347) [#2350](#2350) [#2356](#2356) [#2355](#2355) [#2349](#2349) [#2352](#2352) [#2341](#2341) [#2336](#2336) [#2373](#2373) [#2379](#2379) [#2380](#2380) [#2382](#2382) [#2364](#2364) [#2366](#2366) [#2386](#2386)
* Graduate write_to_online_store out of experimental status ([#2426](#2426)) ([e7dd4b7](e7dd4b7))
* Make feast PEP 561 compliant ([#2405](#2405)) ([3c41f94](3c41f94)), closes [#2420](#2420) [#2418](#2418) [#2425](#2425) [#2426](#2426) [#2427](#2427) [#2431](#2431) [#2433](#2433) [#2420](#2420) [#2418](#2418) [#2425](#2425) [#2426](#2426) [#2427](#2427) [#2431](#2431) [#2433](#2433)
* Makefile for contrib for Issue [#2364](#2364) ([#2366](#2366)) ([a02325b](a02325b))
* Support on demand feature views in go feature server ([#2494](#2494)) ([6edd274](6edd274))
* Switch from Feature to Field ([#2514](#2514)) ([6a03bed](6a03bed))
* Use a daemon thread to monitor the go feature server exclusively ([#2391](#2391)) ([0bb5e8c](0bb5e8c))
What this PR does / why we need it:
This PR enables on demand feature transformations in the Go feature server. The Go feature server now has the same functionality as the Python implementation and passes all existing e2e tests.
Which issue(s) this PR fixes:
Fixes #