Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Remote offline Store #4262

Merged
merged 41 commits into from
Jun 13, 2024
Merged

Conversation

redhatHameed
Copy link
Contributor

What this PR does / why we need it:

This PR added feature for remote offline store contain below changes.

Added an Offline Server using Arrow Flight:

  • Implement the remote OfflineServer similar to the one developed for the Remote Registry or Feature Server but using Arrow Flight Server instead of pure gRPC.
  • Added the command into cli for starting the offline server using the feast serve_offline command.
  • Implemented the the OfflineStore interface methods as arrow flight server methods.
  • Added unit test and integration test functions for the OfflineServer class.

Added an Offline Remote Client:

  • Implement the offline remote client using arrow flight remote client approach, which supports the type: remote configuration in the offline store configuration for example as seen below.
    type: remote 
    host: mystore.feast.svc.cluster.local
     port: 8815
  • Develop the RemoteOfflineStore class, which implements the existing OfflineStore interface in Feast. Utilize the get_historical_features method and pass parameters such as entity definition, feature views, offline store config to interact with the Arrow server functions using the Arrow Flight client.
  • Client implemented the other methods like pull_latest_from_table_or_query, write_logged_features , offline_write_batch
  • added unit test function to test the remote client.

Updated Helm Chart for Deploying as a service on Kubernetes:

  • updated chart deployment to install offline server by using feast_mode property based on the deployment choice user can set the property. The online mode is the default and maintains backward compatibility with previous Feast Feature Server implementations.

Helm install examples:

helm install feast-feature-server feast-charts/feast-feature-server --set feature_store_yaml_base64=$(base64 > feature_store.yaml)
helm install feast-offline-server feast-charts/feast-feature-server --set feast_mode=offline  --set feature_store_yaml_base64=$(base64 > feature_store.yaml)
helm install feast-ui-server feast-charts/feast-feature-server --set feast_mode=ui  --set feature_store_yaml_base64=$(base64 > feature_store.yaml)
helm install feast-registry-server feast-charts/feast-feature-server --set feast_mode=registry  --set feature_store_yaml_base64=$(base64 > feature_store.yaml)

Added documentation for remote offline support

Added example for remote offline

Which issue(s) this PR fixes:

Fixes #4032

Fixes

@redhatHameed redhatHameed force-pushed the remote_offline branch 2 times, most recently from ddbc10a to 7f64708 Compare June 10, 2024 18:52
redhatHameed and others added 25 commits June 10, 2024 14:57
…ight server and client

Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
…ed _make_flight_info

Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
… an environment variable

Signed-off-by: Theodor Mihalache <tmihalac@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
use feature_view_names to transfer feature views and remove dummies

Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
…ight server and client

Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
…), offline, ui and registry

Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
@redhatHameed redhatHameed force-pushed the remote_offline branch 4 times, most recently from 2eb38d1 to 2818626 Compare June 11, 2024 23:58
@tokoko
Copy link
Collaborator

tokoko commented Jun 12, 2024

@redhatHameed just curious, are the tests failing only in gh workflows? Do they pass locally?

@dmartinol
Copy link
Contributor

dmartinol commented Jun 12, 2024

Just a note, if we decide in the future to use this server for remote materialization engine as well, the name might no longer be a good fit. Just pointing it out.. I don't have any better ideas.

Maybe it's too early to kickoff a thread, but in that case, shouldn't we have a separate materialization server with its own codebase? (if you have an issue we can move the discussion there)

@tokoko
Copy link
Collaborator

tokoko commented Jun 12, 2024

Maybe it's too early to kickoff a thread, but in that case, shouldn't we have a separate materialization server with its own codebase? (if you have an issue we can move the discussion there)

maybe, that would certainly be a cleaner approach, but imho wouldn't really serve any other practical purpose. I'll open a ticket and we probably should pick this up later once remote offline is a bit more fleshed out.

@dmartinol
Copy link
Contributor

@redhatHameed just curious, are the tests failing only in gh workflows? Do they pass locally?

@tokoko Yes, apart some of them that fail locally. Do you have any idea why from the GH action execution we don't even see the log of the Test local integration tests step? Do you have any chance to re-run it and enable the debugging option?

@tokoko
Copy link
Collaborator

tokoko commented Jun 12, 2024

I don't have any privileges there. @franciscojavierarceo @jeremyary any ideas?

@redhatHameed
Copy link
Contributor Author

@redhatHameed just curious, are the tests failing only in gh workflows? Do they pass locally?

Yes it does - apart from the one test case -> test_spark_materialization_consistency with error - py4j.protocol.Py4JJavaError: An error occurred while calling o114.count. but if I skip it all are passing locally see the here results

@redhatHameed redhatHameed force-pushed the remote_offline branch 2 times, most recently from 2e1879a to 08c2616 Compare June 12, 2024 20:07
del self.flights[key]
return fl.RecordBatchStream(table)

def offline_write_batch(self, command, key):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not a mandatory to give the types, but it makes it more readable to have types unless it is expected to have any data type.

Copy link
Contributor

@tmihalac tmihalac Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lokeshrangineni
You mean in the methods parameters or the return value or both?
I guess you mean for all the methods and not only this specific one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was specifically about method parameters. but if there is no change in functionality better to add types to return variable also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lokeshrangineni the change implemented under commit thanks @tmihalac addressing it

Signed-off-by: Abdul Hameed <ahameed@redhat.com>
docs/SUMMARY.md Outdated Show resolved Hide resolved
tmihalac and others added 3 commits June 13, 2024 09:05
Signed-off-by: Theodor Mihalache <tmihalac@redhat.com>
Signed-off-by: Abdul Hameed <ahameed@redhat.com>
update the document change and fix teardown function
@redhatHameed
Copy link
Contributor Author

redhatHameed commented Jun 13, 2024

@jeremyary any thoughts I am getting incorrect credentials error one of integration test it was running fine previously ERROR sdk/python/tests/integration/offline_store/test_push_features_to_offline_store.py::test_push_features_and_read[AWS:Redshift:redis:python_fs:False] - feast.errors.RedshiftCredentialsError: Redshift API failed due to incorrect credentials ->https://github.com/feast-dev/feast/actions/runs/9504367262/job/26196850053?pr=4262

Signed-off-by: Theodor Mihalache <tmihalac@redhat.com>
@tokoko
Copy link
Collaborator

tokoko commented Jun 13, 2024

@redhatHameed It's a rate limit error, says An error occurred (ValidationException) when calling the DescribeTable operation: Rate exceeded if you scroll up. Redshift is being a little dramatic in the final error. A rerun should fix it.

@redhatHameed
Copy link
Contributor Author

@tokoko can we get your approval on this if no more comments. Also who will approve this for merging ?

Copy link
Collaborator

@tokoko tokoko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, lgtm. great work, guys. @jeremyary or @franciscojavierarceo can approve and merge

@jeremyary jeremyary merged commit 28a3d24 into feast-dev:master Jun 13, 2024
16 checks passed
franciscojavierarceo pushed a commit that referenced this pull request Jun 18, 2024
# [0.39.0](v0.38.0...v0.39.0) (2024-06-18)

### Bug Fixes

* Feast UI importlib change ([#4248](#4248)) ([5d486b8](5d486b8))
* Feature server no_feature_log argument error ([#4255](#4255)) ([15524ce](15524ce))
* Feature UI Server image won't start in an OpenShift cluster ([#4250](#4250)) ([4891f76](4891f76))
* Handles null values in data during GO Feature retrieval ([#4274](#4274)) ([c491e57](c491e57))
* Make Java gRPC client use timeouts as expected ([#4237](#4237)) ([f5a37c1](f5a37c1))
* Remove self assignment code line. ([#4238](#4238)) ([e514f66](e514f66))
* Set default values for feature_store.serve() function ([#4225](#4225)) ([fa74438](fa74438))

### Features

* Add online_read_async for dynamodb ([#4244](#4244)) ([b5ef384](b5ef384))
* Add the ability to list objects by `tags` ([#4246](#4246)) ([fbf92da](fbf92da))
* Added deadline to gRPC Java client ([#4217](#4217)) ([ff429c9](ff429c9))
* Adding vector search for sqlite ([#4176](#4176)) ([2478831](2478831))
* Change get_online_features signature, move online retrieval functions to utils ([#4278](#4278)) ([7287662](7287662))
* Feature/adding remote online store ([#4226](#4226)) ([9454d7c](9454d7c))
* List all feature views ([#4256](#4256)) ([36a574d](36a574d))
* Make RegistryServer writable ([#4231](#4231)) ([79e1143](79e1143))
* Remote offline Store  ([#4262](#4262)) ([28a3d24](28a3d24))
* Set optional full-scan for deletion ([#4189](#4189)) ([b9cadd5](b9cadd5))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remote offline feature server deployment
6 participants