feat: Elasticsearch vector database #4188

Merged: 18 commits, May 13, 2024

@HaoXuAI HaoXuAI commented May 9, 2024

  • Add Elasticsearch to the online store as another vector database.
  • Adjust the `retrieve_online_documents` API.


Signed-off-by: cmuhao <sduxuhao@gmail.com>
@HaoXuAI HaoXuAI changed the title [feat] WIP Elasticsearch vector database feat: WIP Elasticsearch vector database May 9, 2024
from feast.repo_config import FeastConfigBaseModel


class ElasticsearchOnlineStoreConfig(FeastConfigBaseModel):

nice

@HaoXuAI HaoXuAI requested review from DvirDukhan and a team as code owners May 12, 2024 00:37
@HaoXuAI HaoXuAI requested review from mavysavydav and removed request for a team May 12, 2024 00:37
@HaoXuAI HaoXuAI changed the title feat: WIP Elasticsearch vector database feat: Elasticsearch vector database May 12, 2024
@@ -1886,7 +1886,7 @@ def retrieve_online_documents(
         feature: str,
         query: Union[str, List[float]],
         top_k: int,
-        distance_metric: str,
+        distance_metric: Optional[str] = None,
Had to make this optional because Elasticsearch doesn't allow specifying the distance metric in the online (query) API. Instead, the metric has to be changed by updating the index.
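To illustrate the point above, here is a minimal sketch using plain dictionaries (no cluster needed). In Elasticsearch the similarity function is declared in the `dense_vector` field mapping when the index is created, while a basic kNN search body names only the field, the query vector, and the result counts. The helper names, the `embedding` field name, and the `num_candidates` heuristic are assumptions for illustration, not code from this PR.

```python
from typing import Any, Dict, List


def build_index_mapping(dims: int, similarity: str = "cosine") -> Dict[str, Any]:
    """Mapping body used at index creation; the metric is fixed here."""
    return {
        "properties": {
            "embedding": {  # hypothetical field name
                "type": "dense_vector",
                "dims": dims,
                "index": True,
                # e.g. "cosine", "l2_norm", "dot_product"
                "similarity": similarity,
            }
        }
    }


def build_knn_query(query_vector: List[float], top_k: int) -> Dict[str, Any]:
    """kNN search body; note there is no place to choose the metric."""
    return {
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": top_k,
            # assumed heuristic: search more candidates than we return
            "num_candidates": max(top_k * 10, 100),
        }
    }
```

Under this sketch, switching from cosine to L2 distance means building a new mapping (and reindexing), not passing a different argument at query time.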

Makes sense. Can we add some explicit tests for this to highlight the behavior? Also, this is awesome.
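One way such tests might look, as a rough sketch only: `resolve_distance_metric` below is a hypothetical helper, not a Feast function, and both the `"L2"` default and the choice to raise (rather than silently ignore) are assumptions made for illustration. It models the two cases discussed: stores that accept a metric per query, and stores like Elasticsearch where the metric lives on the index.

```python
from typing import Optional

DEFAULT_METRIC = "L2"  # assumed default for stores with per-query metrics


def resolve_distance_metric(
    distance_metric: Optional[str], per_query_metric_supported: bool
) -> Optional[str]:
    """Hypothetical helper: decide which metric, if any, goes into the query."""
    if per_query_metric_supported:
        return distance_metric or DEFAULT_METRIC
    if distance_metric is not None:
        # The metric is configured on the index, so a per-query value is an error here.
        raise ValueError("distance_metric must be configured on the index")
    return None


def test_metric_omitted_for_index_configured_store() -> None:
    assert resolve_distance_metric(None, per_query_metric_supported=False) is None


def test_metric_rejected_for_index_configured_store() -> None:
    try:
        resolve_distance_metric("cosine", per_query_metric_supported=False)
    except ValueError:
        return
    raise AssertionError("expected ValueError")


def test_default_used_when_metric_omitted() -> None:
    assert resolve_distance_metric(None, per_query_metric_supported=True) == DEFAULT_METRIC
```

The test functions use plain asserts, so they can be collected by pytest or called directly.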


@franciscojavierarceo franciscojavierarceo left a comment

Nice 🚀🚀🚀

@HaoXuAI HaoXuAI merged commit bf99640 into master May 13, 2024
26 checks passed
franciscojavierarceo pushed a commit that referenced this pull request May 24, 2024
# [0.38.0](v0.37.0...v0.38.0) (2024-05-24)

### Bug Fixes

* Add vector database doc ([#4165](#4165)) ([37f36b6](37f36b6))
* Change checkout action back to v3 from v5 which isn't released yet ([#4147](#4147)) ([9523fff](9523fff))
* Change numpy version <1.25 dependency to <2 in setup.py ([#4085](#4085)) ([2ba71ff](2ba71ff)), closes [#4084](#4084)
* Changed the code the way mysql container is initialized.  ([#4140](#4140)) ([8b5698f](8b5698f)), closes [#4126](#4126)
* Correct nightly install command, move all installs to uv ([#4164](#4164)) ([c86d594](c86d594))
* Default value is not set in Redis connection string using environment variable ([#4136](#4136)) ([95acfb4](95acfb4)), closes [#3669](#3669)
* Get container host addresses from testcontainers (java) ([#4125](#4125)) ([9184dde](9184dde))
* Get rid of empty string `name_alias` during feature view projection deserialization  ([#4116](#4116)) ([65056ce](65056ce))
* Helm chart `feast-feature-server`, improve Service template name ([#4161](#4161)) ([dedc164](dedc164))
* Improve the code related to on-demand-featureview. ([#4203](#4203)) ([d91d7e0](d91d7e0))
* Integration tests for async sdk method ([#4201](#4201)) ([08c44ae](08c44ae))
* Make sure schema is used when calling `get_table_query_string` method for Snowflake datasource ([#4131](#4131)) ([c1579c7](c1579c7))
* Make sure schema is used when generating `from_expression` for Snowflake ([#4177](#4177)) ([5051da7](5051da7))
* Pass native input values to `get_online_features` from feature server ([#4117](#4117)) ([60756cb](60756cb))
* Pass region to S3 client only if set (Java) ([#4151](#4151)) ([b8087f7](b8087f7))
* Pgvector patch ([#4108](#4108)) ([ad45bb4](ad45bb4))
* Update doc ([#4153](#4153)) ([e873636](e873636))
* Update master-only benchmark bucket name due to credential update ([#4183](#4183)) ([e88f1e3](e88f1e3))
* Updating the instructions for quickstart guide. ([#4120](#4120)) ([0c30e96](0c30e96))
* Upgrading the test container so that local tests works with updated d… ([#4155](#4155)) ([93ddb11](93ddb11))

### Features

* Add a Kubernetes Operator for the Feast Feature Server ([#4145](#4145)) ([4a696dc](4a696dc))
* Add delta format to `FileSource`, add support for it in ibis/duckdb ([#4123](#4123)) ([2b6f1d0](2b6f1d0))
* Add materialization support to ibis/duckdb ([#4173](#4173)) ([369ca98](369ca98))
* Add optional private key params to Snowflake config ([#4205](#4205)) ([20f5419](20f5419))
* Add s3 remote storage export for duckdb ([#4195](#4195)) ([6a04c48](6a04c48))
* Adding DatastoreOnlineStore 'database' argument. ([#4180](#4180)) ([e739745](e739745))
* Adding get_online_features_async to feature store sdk ([#4172](#4172)) ([311efc5](311efc5))
* Adding support for dictionary writes to online store  ([#4156](#4156)) ([abfac01](abfac01))
* Elasticsearch vector database ([#4188](#4188)) ([bf99640](bf99640))
* Enable other distance metrics for Vector DB and Update docs ([#4170](#4170)) ([ba9f4ef](ba9f4ef))
* Feast/IKV datetime edgecase errors ([#4211](#4211)) ([bdae562](bdae562))
* Feast/IKV documenation language changes ([#4149](#4149)) ([690a621](690a621))
* Feast/IKV online store contrib plugin integration ([#4068](#4068)) ([f2b4eb9](f2b4eb9))
* Feast/IKV online store documentation ([#4146](#4146)) ([73601e4](73601e4))
* Feast/IKV upgrade client version ([#4200](#4200)) ([0e42150](0e42150))
* Incorporate substrait ODFVs into ibis-based offline store queries ([#4102](#4102)) ([c3a102f](c3a102f))
* Isolate input-dependent calculations in `get_online_features` ([#4041](#4041)) ([2a6edea](2a6edea))
* Make arrow primary interchange for online ODFV execution ([#4143](#4143)) ([3fdb716](3fdb716))
* Move data source validation entrypoint to offline store ([#4197](#4197)) ([a17725d](a17725d))
* Upgrading python version to 3.11, adding support for 3.11 as well. ([#4159](#4159)) ([4b1634f](4b1634f)), closes [#4152](#4152) [#4114](#4114)

### Reverts

* Reverts "fix: Using version args to install the correct feast version" ([#4112](#4112)) ([b66baa4](b66baa4)), closes [#3953](#3953)
franciscojavierarceo pushed a commit that referenced this pull request May 27, 2024
@tokoko tokoko deleted the hao-xu-elasticsearch-vector branch July 16, 2024 12:04