Draft changes to add remote online store to feast.
Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

Adding the integration test and remote online creator class so that it will fit into existing integration testing framework.

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

Fix after rebase

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

Removing the RemoteOnlineStoreCreator and adding custom integration test case. Incorporating the code review comments.

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

reformatting the code, removing unnecessary braces.

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

Trying to fix the errors reported in make lint-python

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

Ran the command make format-python and trying to see if it fixes the lint errors.

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

increasing the server start timeout to see if it fixes the integration test cases.

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

checking changes after make format-python

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

trying to see if this fixes the PR integration test failure.
Signed-off-by: Lokesh Rangineni <lokeshemail@email.com>

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

checking in the changes for make format-python

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

Upgrading python version to 3.11, adding support for 3.11 as well.

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

chore: Bump macOS runners to macos-13 (feast-dev#4152)

bump macos runner to 13

Signed-off-by: tokoko <togurg14@freeuni.edu.ge>

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

chore: Use pixi to lock python dependencies in a single command (feast-dev#4114)

use pixi to lock python dependencies in a single command

Signed-off-by: tokoko <togurg14@freeuni.edu.ge>

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

feat: List all feature views (feast-dev#4256)

* feature: Adding type to base feature view

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* fixed linter

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* fixed type and meta

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* adding new listing

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* cleaning up changes

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* reverting FV proto

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* doing simple way

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* added a test

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated to add warnings

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

feat: Adding vector search for sqlite (feast-dev#4176)

* feat: Adding vector search for sqlite

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* adding the sqlite_vss dependency

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* linter

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* latest progress

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* uploading latest progress

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated function

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* adding configuration

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* adding current progress

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updating requirements files

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* moving to sqlite-vec

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updating sqlite.py

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* checking in progress

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated test type

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* got the initialization working, nice

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* checking in progress from last night

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* removing unnecessary stuff

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* fixing merge conflicts

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* removing files changed accidentally

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* uploading current progress...things run but need to update the virtual table insertion

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* linted

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* adding working notes

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* found a bug, original feature_store.py was only grabbing first feature view, adjusted

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* cant use a string have to verify it is a proper FeatureView object

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated got it working, need to fix some other stuff still

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* working

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* linter

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* fixing some type issues

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* fixed typing and lint issues

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated dependencies

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* fix for pixi and updating requirements

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* fixed type

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* linter

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* testing sqlite_vec import

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* adding minimal example test

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* lint

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* testing raw sqlite

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* Printing package version

* printing version

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated requirements

* rebuilding requirements

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* only going to run this on 3.10 for now

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated docs for sqlite caveats

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* adding reason

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* skipping

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated tests

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* removing print

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* added method call

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* added prubt

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* added print

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* removing print

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* adding check in sqlite

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* missed an =

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* still running on 3.11

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* typo

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* fix

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* fix

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* updated setup and docs

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* renamed things

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

squashing the last 15 commits to one.

Merge branch 'master' into feature/adding-remote-onlinestore-rebase

Adding documentation and incorporating code review comment.

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

Adding documentation and incorporating code review comment.

Signed-off-by: Lokesh Rangineni <lokeshforjava@gmail.com>

Merge remote-tracking branch 'fork/feature/adding-remote-onlinestore-rebase' into feature/adding-remote-onlinestore-rebase
lokeshrangineni committed Jun 12, 2024
1 parent 0d162e9 commit 5231bac
Showing 26 changed files with 1,465 additions and 421 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/pr_integration_tests.yml
@@ -86,7 +86,7 @@ jobs:
    strategy:
      fail-fast: false
      matrix:
        python-version: [ "3.9", "3.10", "3.11" ]
        python-version: [ "3.11" ]
        os: [ ubuntu-latest ]
    env:
      OS: ${{ matrix.os }}
6 changes: 0 additions & 6 deletions Makefile
@@ -70,12 +70,6 @@ lock-python-dependencies-all:
	pixi run --environment py311 --manifest-path infra/scripts/pixi/pixi.toml "uv pip compile --system --no-strip-extras setup.py --output-file sdk/python/requirements/py3.11-requirements.txt"
	pixi run --environment py311 --manifest-path infra/scripts/pixi/pixi.toml "uv pip compile --system --no-strip-extras setup.py --extra ci --output-file sdk/python/requirements/py3.11-ci-requirements.txt"

lock-python-dependencies-all:
	pixi run --environment py39 --manifest-path infra/scripts/pixi/pixi.toml "python -m piptools compile -U --output-file sdk/python/requirements/py3.9-requirements.txt"
	pixi run --environment py39 --manifest-path infra/scripts/pixi/pixi.toml "python -m piptools compile -U --extra ci --output-file sdk/python/requirements/py3.9-ci-requirements.txt"
	pixi run --environment py310 --manifest-path infra/scripts/pixi/pixi.toml "python -m piptools compile -U --output-file sdk/python/requirements/py3.10-requirements.txt"
	pixi run --environment py310 --manifest-path infra/scripts/pixi/pixi.toml "python -m piptools compile -U --extra ci --output-file sdk/python/requirements/py3.10-ci-requirements.txt"

benchmark-python:
	IS_TEST=True python -m pytest --integration --benchmark --benchmark-autosave --benchmark-save-data sdk/python/tests
18 changes: 18 additions & 0 deletions docs/reference/alpha-vector-database.md
@@ -13,7 +13,9 @@ Below are supported vector databases and implemented features:
| Elasticsearch | [x] | [x] |
| Milvus | [ ] | [ ] |
| Faiss | [ ] | [ ] |
| SQLite | [x] | [ ] |

Note: SQLite support is in limited access and currently works only on Python 3.10. It will be updated as [sqlite_vec](https://github.com/asg017/sqlite-vec/) progresses.

## Example

@@ -108,4 +110,20 @@ def print_online_features(features):
        print(key, " : ", value)

print_online_features(features)
```

### Configuration
We offer two Online Store options for Vector Databases: PGVector and SQLite.

#### Installation with SQLite
If you are using `pyenv` to manage your Python versions, you can install a Python build that supports loadable SQLite extensions with the following command (the paths assume a Homebrew-installed SQLite on macOS):
```bash
PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" \
LDFLAGS="-L/opt/homebrew/opt/sqlite/lib" \
CPPFLAGS="-I/opt/homebrew/opt/sqlite/include" \
pyenv install 3.10.14
```
You can then install the Feast package via:
```bash
pip install feast[sqlite_vec]
```
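
Before installing, it may help to confirm that the active interpreter was built with loadable-extension support. The check below is a minimal sketch and not part of Feast: `supports_loadable_extensions` is a hypothetical helper name, and it relies on the standard-library behaviour that `sqlite3.Connection.enable_load_extension` is only defined when Python was compiled with that option.

```python
import sqlite3


def supports_loadable_extensions() -> bool:
    """Return True if this Python build can load SQLite extensions.

    Hypothetical helper for illustration; not a Feast API.
    """
    # CPython only defines this method when it was configured with
    # --enable-loadable-sqlite-extensions.
    return hasattr(sqlite3.Connection, "enable_load_extension")


if __name__ == "__main__":
    print("loadable SQLite extensions supported:", supports_loadable_extensions())
```

If this prints `False`, rebuilding Python with the `pyenv` command above is the likely fix.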
4 changes: 4 additions & 0 deletions docs/reference/online-stores/README.md
@@ -61,3 +61,7 @@ Please see [Online Store](../../getting-started/architecture-and-components/onli
{% content-ref url="scylladb.md" %}
[scylladb.md](scylladb.md)
{% endcontent-ref %}

{% content-ref url="remote.md" %}
[remote.md](remote.md)
{% endcontent-ref %}
21 changes: 21 additions & 0 deletions docs/reference/online-stores/remote.md
@@ -0,0 +1,21 @@
# Remote online store

## Description

The remote online store lets you interact with a remote feature server. At the moment it supports only the read operation: you can use this online store to retrieve online features with `store.get_online_features` from the remote feature server.

## Examples

The registry points to the registry of the remote feature store. If that registry is not directly accessible, configure the client to use a remote registry instead.

{% code title="feature_store.yaml" %}
```yaml
project: my-local-project
registry: /remote/data/registry.db
provider: local
online_store:
    path: http://localhost:6566
    type: remote
entity_key_serialization_version: 2
```
{% endcode %}
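
With this configuration, reads are forwarded to the feature server over HTTP. The sketch below mirrors the request that `remote.py` in this commit builds (a POST to `<path>/get-online-features` with `features` and `entities` in the JSON body). The helper name and the feature view, feature, and entity names are hypothetical, and no request is actually sent.

```python
import json
from typing import Any, List, Tuple


def build_get_online_features_request(
    base_url: str,
    feature_view: str,
    features: List[str],
    entity_key: str,
    entity_values: List[Any],
) -> Tuple[str, str]:
    """Build the URL and JSON body for a feature server read.

    Hypothetical helper for illustration; it only assembles the request.
    """
    url = f"{base_url}/get-online-features"
    body = json.dumps(
        {
            # Feature references use the "<feature view>:<feature>" form.
            "features": [f"{feature_view}:{name}" for name in features],
            # A single join key mapped to one value per requested row.
            "entities": {entity_key: list(entity_values)},
        }
    )
    return url, body


if __name__ == "__main__":
    # Illustrative names only; substitute those from your own feature repo.
    url, body = build_get_online_features_request(
        "http://localhost:6566",
        "driver_hourly_stats",
        ["conv_rate"],
        "driver_id",
        [1001, 1002],
    )
    print(url)
    print(body)
```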
660 changes: 419 additions & 241 deletions infra/scripts/pixi/pixi.lock

Large diffs are not rendered by default.

10 changes: 7 additions & 3 deletions infra/scripts/pixi/pixi.toml
@@ -1,19 +1,23 @@
[project]
name = "pixi-feast"
channels = ["conda-forge"]
platforms = ["linux-64"]
platforms = ["linux-64", "osx-arm64"]

[tasks]

[dependencies]
pip-tools = ">=7.4.1,<7.5"
uv = ">=0.1.39,<0.2"

[feature.py39.dependencies]
python = "~=3.9.0"

[feature.py310.dependencies]
python = "~=3.10.0"

[feature.py311.dependencies]
python = "~=3.11.0"

[environments]
py39 = ["py39"]
py310 = ["py310"]
py310 = ["py310"]
py311 = ["py311"]
67 changes: 65 additions & 2 deletions sdk/python/feast/feature_store.py
@@ -13,6 +13,7 @@
# limitations under the License.
import copy
import itertools
import logging
import os
import warnings
from collections import Counter, defaultdict
@@ -247,6 +248,20 @@ def list_feature_services(self) -> List[FeatureService]:
        """
        return self._registry.list_feature_services(self.project)

    def list_all_feature_views(
        self, allow_cache: bool = False
    ) -> List[Union[FeatureView, StreamFeatureView, OnDemandFeatureView]]:
        """
        Retrieves the list of feature views from the registry.
        Args:
            allow_cache: Whether to allow returning entities from a cached registry.
        Returns:
            A list of feature views.
        """
        return self._list_all_feature_views(allow_cache)

    def list_feature_views(self, allow_cache: bool = False) -> List[FeatureView]:
        """
        Retrieves the list of feature views from the registry.
@@ -257,12 +272,50 @@ def list_feature_views(self, allow_cache: bool = False) -> List[FeatureView]:
        Returns:
            A list of feature views.
        """
        logging.warning(
            "list_feature_views will make breaking changes. Please use list_batch_feature_views instead. "
            "list_feature_views will behave like list_all_feature_views in the future."
        )
        return self._list_feature_views(allow_cache)

    def _list_all_feature_views(
        self,
        allow_cache: bool = False,
    ) -> List[Union[FeatureView, StreamFeatureView, OnDemandFeatureView]]:
        all_feature_views = (
            self._list_feature_views(allow_cache)
            + self._list_stream_feature_views(allow_cache)
            + self.list_on_demand_feature_views(allow_cache)
        )
        return all_feature_views

    def _list_feature_views(
        self,
        allow_cache: bool = False,
        hide_dummy_entity: bool = True,
    ) -> List[FeatureView]:
        logging.warning(
            "_list_feature_views will make breaking changes. Please use _list_batch_feature_views instead. "
            "_list_feature_views will behave like _list_all_feature_views in the future."
        )
        feature_views = []
        for fv in self._registry.list_feature_views(
            self.project, allow_cache=allow_cache
        ):
            if (
                hide_dummy_entity
                and fv.entities
                and fv.entities[0] == DUMMY_ENTITY_NAME
            ):
                fv.entities = []
                fv.entity_columns = []
            feature_views.append(fv)
        return feature_views

    def _list_batch_feature_views(
        self,
        allow_cache: bool = False,
        hide_dummy_entity: bool = True,
    ) -> List[FeatureView]:
        feature_views = []
        for fv in self._registry.list_feature_views(
@@ -1881,18 +1934,28 @@ def _retrieve_online_documents(
                "Using embedding functionality is not supported for document retrieval. Please embed the query before calling retrieve_online_documents."
            )
        (
            requested_feature_views,
            available_feature_views,
            _,
        ) = self._get_feature_views_to_use(
            features=[feature], allow_cache=True, hide_dummy_entity=False
        )
        requested_feature_view_name = (
            feature.split(":")[0] if isinstance(feature, str) else feature
        )
        for feature_view in available_feature_views:
            if feature_view.name == requested_feature_view_name:
                requested_feature_view = feature_view
        if not requested_feature_view:
            raise ValueError(
                f"Feature view {requested_feature_view} not found in the registry."
            )
        requested_feature = (
            feature.split(":")[1] if isinstance(feature, str) else feature
        )
        provider = self._get_provider()
        document_features = self._retrieve_from_online_store(
            provider,
            requested_feature_views[0],
            requested_feature_view,
            requested_feature,
            query,
            top_k,
174 changes: 174 additions & 0 deletions sdk/python/feast/infra/online_stores/remote.py
@@ -0,0 +1,174 @@
# Copyright 2021 The Feast Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import logging
from datetime import datetime
from typing import Any, Callable, Dict, List, Literal, Optional, Sequence, Tuple

import requests
from pydantic import StrictStr

from feast import Entity, FeatureView, RepoConfig
from feast.infra.online_stores.online_store import OnlineStore
from feast.protos.feast.types.EntityKey_pb2 import EntityKey as EntityKeyProto
from feast.protos.feast.types.Value_pb2 import Value as ValueProto
from feast.repo_config import FeastConfigBaseModel
from feast.type_map import python_values_to_proto_values
from feast.value_type import ValueType

logger = logging.getLogger(__name__)


class RemoteOnlineStoreConfig(FeastConfigBaseModel):
    """Remote Online store config for remote online store"""

    type: Literal["remote"] = "remote"
    """Online store type selector"""

    path: StrictStr = "http://localhost:6566"
    """ str: Path to metadata store.
    If type is 'remote', then this is a URL for registry server """


class RemoteOnlineStore(OnlineStore):
    """
    remote online store implementation wrapper to communicate with feast online server.
    """

    def online_write_batch(
        self,
        config: RepoConfig,
        table: FeatureView,
        data: List[
            Tuple[EntityKeyProto, Dict[str, ValueProto], datetime, Optional[datetime]]
        ],
        progress: Optional[Callable[[int], Any]],
    ) -> None:
        pass

    def online_read(
        self,
        config: RepoConfig,
        table: FeatureView,
        entity_keys: List[EntityKeyProto],
        requested_features: Optional[List[str]] = None,
    ) -> List[Tuple[Optional[datetime], Optional[Dict[str, ValueProto]]]]:
        assert isinstance(config.online_store, RemoteOnlineStoreConfig)
        config.online_store.__class__ = RemoteOnlineStoreConfig

        req_body = self._construct_online_read_api_json_request(
            entity_keys, table, requested_features
        )
        response = requests.post(
            f"{config.online_store.path}/get-online-features", data=req_body
        )
        if response.status_code == 200:
            logger.debug("Able to retrieve the online features from feature server.")
            response_json = json.loads(response.text)
            event_ts = self._get_event_ts(response_json)
            # Iterating over results and converting the API results in column format to row format.
            result_tuples: List[
                Tuple[Optional[datetime], Optional[Dict[str, ValueProto]]]
            ] = []
            for feature_value_index in range(len(entity_keys)):
                feature_values_dict: Dict[str, ValueProto] = dict()
                for index, feature_name in enumerate(
                    response_json["metadata"]["feature_names"]
                ):
                    if (
                        requested_features is not None
                        and feature_name in requested_features
                    ):
                        if (
                            response_json["results"][index]["statuses"][
                                feature_value_index
                            ]
                            == "PRESENT"
                        ):
                            message = python_values_to_proto_values(
                                [
                                    response_json["results"][index]["values"][
                                        feature_value_index
                                    ]
                                ],
                                ValueType.UNKNOWN,
                            )
                            feature_values_dict[feature_name] = message[0]
                        else:
                            feature_values_dict[feature_name] = ValueProto()

                result_tuples.append((event_ts, feature_values_dict))
            return result_tuples
        else:
            error_msg = f"Unable to retrieve the online store data using feature server API. Error_code={response.status_code}, error_message={response.reason}"
            logger.error(error_msg)
            raise RuntimeError(error_msg)

    def _construct_online_read_api_json_request(
        self,
        entity_keys: List[EntityKeyProto],
        table: FeatureView,
        requested_features: Optional[List[str]] = None,
    ):
        api_requested_features = []
        if requested_features is not None:
            for requested_feature in requested_features:
                api_requested_features.append(f"{table.name}:{requested_feature}")

        entity_values = []
        entity_key = ""
        for row in entity_keys:
            entity_key = row.join_keys[0]
            entity_values.append(
                getattr(row.entity_values[0], row.entity_values[0].WhichOneof("val"))
            )

        req_body = json.dumps(
            {
                "features": api_requested_features,
                "entities": {entity_key: entity_values},
            }
        )
        return req_body

    def _check_if_feature_requested(self, feature_name, requested_features):
        for requested_feature in requested_features:
            if feature_name in requested_feature:
                return True
        return False

    def _get_event_ts(self, response_json) -> datetime:
        event_ts = ""
        if len(response_json["results"]) > 1:
            event_ts = response_json["results"][1]["event_timestamps"][0]
        return datetime.fromisoformat(event_ts.replace("Z", "+00:00"))

    def update(
        self,
        config: RepoConfig,
        tables_to_delete: Sequence[FeatureView],
        tables_to_keep: Sequence[FeatureView],
        entities_to_delete: Sequence[Entity],
        entities_to_keep: Sequence[Entity],
        partial: bool,
    ):
        pass

    def teardown(
        self,
        config: RepoConfig,
        tables: Sequence[FeatureView],
        entities: Sequence[Entity],
    ):
        pass
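
The column-to-row conversion in `online_read` can be hard to follow in diff form. The standalone sketch below reproduces that step on plain Python values, assuming the response shape the code above indexes into (`metadata.feature_names` plus one `results` entry per column with `values` and `statuses`). The function name and the sample payload are hypothetical and use no Feast types.

```python
from typing import Any, Dict, List, Optional

# Illustrative payload only; the names and values are made up.
SAMPLE_RESPONSE: Dict[str, Any] = {
    "metadata": {"feature_names": ["driver_id", "conv_rate"]},
    "results": [
        {"values": [1001, 1002], "statuses": ["PRESENT", "PRESENT"]},
        {"values": [0.5, None], "statuses": ["PRESENT", "NOT_FOUND"]},
    ],
}


def columns_to_rows(
    response_json: Dict[str, Any],
    num_entities: int,
    requested_features: List[str],
) -> List[Dict[str, Optional[Any]]]:
    """Turn the column-oriented response into one dict per entity row."""
    feature_names = response_json["metadata"]["feature_names"]
    rows: List[Dict[str, Optional[Any]]] = []
    for row_index in range(num_entities):
        row: Dict[str, Optional[Any]] = {}
        for column_index, feature_name in enumerate(feature_names):
            # Skip columns that were not requested, such as the entity key.
            if feature_name not in requested_features:
                continue
            column = response_json["results"][column_index]
            if column["statuses"][row_index] == "PRESENT":
                row[feature_name] = column["values"][row_index]
            else:
                # The commit stores an empty ValueProto here; None stands in for it.
                row[feature_name] = None
        rows.append(row)
    return rows


if __name__ == "__main__":
    print(columns_to_rows(SAMPLE_RESPONSE, 2, ["conv_rate"]))
```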