-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Msudhir/add vector update functionality #14
Conversation
Signed-off-by: Danny C <d.chiao@gmail.com>
update Catalog argument in athena connector Signed-off-by: Gyumin Lee <t1100394@T1100394PM01.local> Co-authored-by: Gyumin Lee <t1100394@T1100394PM01.local>
Signed-off-by: Danny C <d.chiao@gmail.com>
ensure correct precedence with the two operators Signed-off-by: Ben Fletcher <ben.fletcher@ft.com>
Signed-off-by: Jiwon Park <bakjeeone@hotmail.com>
…east-dev#3677) Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…east-dev#3630) * sql.py data_sources.data_source_name String(255) Extend the limit of the data_source_name field from 50 to 255. Signed-off-by: Ross Donnachie <code@radonn.co.za>
…east-dev#3680) feat: Optimize bytes processed when retrieving entity df schema to 0 Signed-off-by: Hai Nguyen <quanghai.ng1512@gmail.com>
…tore.plan() on python (feast-dev#3640) * fix! KeyError: __dummy on entityless fv Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> * fix! join_keys typing. Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> --------- Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> Co-authored-by: williamfoschiera <william.foschiera@buser.com.br>
Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.1 to 7.2.4. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.1...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…east-dev#3675) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.2 to 7.2.4. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.2...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…-dev#3679) Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
) Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.47.0 to 1.53.0. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](grpc/grpc-go@v1.47.0...v1.53.0) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# [0.32.0](feast-dev/feast@v0.31.0...v0.32.0) (2023-07-17) ### Bug Fixes * Added generic Feature store Creation for CLI ([feast-dev#3618](feast-dev#3618)) ([bf740d2](feast-dev@bf740d2)) * Broken non-root path with projects-list.json ([feast-dev#3665](feast-dev#3665)) ([4861af0](feast-dev@4861af0)) * Clean up snowflake to_spark_df() ([feast-dev#3607](feast-dev#3607)) ([e8e643e](feast-dev@e8e643e)) * Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python ([feast-dev#3640](feast-dev#3640)) ([ef4ef32](feast-dev@ef4ef32)) * Fix scan datasize to 0 for inference schema ([feast-dev#3628](feast-dev#3628)) ([c3dd74e](feast-dev@c3dd74e)) * Fix timestamp consistency in push api ([feast-dev#3614](feast-dev#3614)) ([9b227d7](feast-dev@9b227d7)) * For SQL registry, increase max data_source_name length to 255 ([feast-dev#3630](feast-dev#3630)) ([478caec](feast-dev@478caec)) * Implements connection pool for postgres online store ([feast-dev#3633](feast-dev#3633)) ([059509a](feast-dev@059509a)) * Manage redis pipe's context ([feast-dev#3655](feast-dev#3655)) ([48e0971](feast-dev@48e0971)) * Missing Catalog argument in athena connector ([feast-dev#3661](feast-dev#3661)) ([f6d3caf](feast-dev@f6d3caf)) * Optimize bytes processed when retrieving entity df schema to 0 ([feast-dev#3680](feast-dev#3680)) ([1c01035](feast-dev@1c01035)) ### Features * Add gunicorn for serve with multiprocess ([feast-dev#3636](feast-dev#3636)) ([4de7faf](feast-dev@4de7faf)) * Use string as a substitute for unregistered types during schema inference ([feast-dev#3646](feast-dev#3646)) ([c474ccd](feast-dev@c474ccd))
* Add fully-qualified-table-name Redshift prop Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * pre-commit Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Docstring Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Test fully_qualified_table_name Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Simplify logic Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * pre-commit Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * pre-commit Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Test offline_write_batch Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Bump to trigger CI Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * another bump for ci Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> --------- Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com>
…SA role (feast-dev#3696) Add aws-sts dependency in java sdk Signed-off-by: harmeet-singh-discovery <harmeet_singh@discovery.com>
collection_available = utility.has_collection(table_to_keep.name) | ||
try: | ||
if collection_available: | ||
logger.error(f"Collection {table_to_keep.name} already exists.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make this log statement on "info" level. If a collection exists it is not an error and is expected.
|
||
|
||
@dataclass | ||
class MockFeatureView: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it really necessary to use a mock? Can you use a VectorFeatureView instead?
|
||
# Assert that logger.info was called with expected messages | ||
expected_log_calls = [ | ||
mocker.call("Closing the connection to Milvus"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verification through logs is not a good practice in unit tests. Look into methods of "connections" like get_connection_addr()
https://github.com/milvus-io/pymilvus/blob/master/pymilvus/orm/connections.py#L396 to use for the assert statements
assert len(utility.list_collections()) == 1 | ||
assert utility.has_collection(self.collection_to_write) is True | ||
assert ( | ||
f"Collection {self.collection_to_write} has been created successfully." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to assert a specific log statement here
partial=None, | ||
) | ||
|
||
assert f"Collection {self.collection_to_write} already exists." in caplog.text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert with list_collections() == 1
instead
|
||
online_store_creator.teardown() | ||
MilvusOnlineStore().update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using the milvus online store here makes this test dependent on update()
working as expected. Instead creating a collection via the pymilvus sdk directly would be better here to simulate that a collection already exists.
if collection_available: | ||
logger.error(f"Collection {table_to_keep.name} already exists.") | ||
else: | ||
Collection(name=table_to_keep.name, schema=table_to_keep.schema) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just catching this. FeatureView schemas are not the same as the schema that Milvus expects. We will need to write logic to translate that.
] | ||
) | ||
|
||
MilvusOnlineStore().update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here I would also rather create the collection first through the pymilvus sdk so that we do not introduce tests that are interdependent.
assert utility.has_collection(self.collection_to_delete) is False | ||
assert len(utility.list_collections()) == 0 | ||
assert ( | ||
f"Collection {self.collection_to_delete} has been deleted successfully." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not assert log-statements. If the wording changes one has to go and update the tests.
) | ||
|
||
assert ( | ||
f"Collection {self.collection_to_delete} does not exist or is already deleted." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check with assert connections.has_connection(self.collection_to_delete)
instead
…vus readable schema
f"Collection {table_to_delete.name} has been deleted successfully." | ||
) | ||
else: | ||
return logger.error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove return statement
field_name = field.name | ||
description = field.tags.get("description") | ||
is_primary = boolean_mapping_from_string.get(field.tags.get("is_primary")) | ||
dimension = field.tags.get("dimension") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dimensions have to be retrieved from VectorFeatureView
. Also, the name of the column holding the vector is defined separately in VectorFeatureView
("vector_field")
is_primary = boolean_mapping_from_string.get(field.tags.get("is_primary")) | ||
dimension = field.tags.get("dimension") | ||
|
||
if dimension is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
@@ -50,6 +50,7 @@ class VectorFeatureView(BaseFeatureView): | |||
|
|||
# inheriting from FeatureView wouldn't work due to issue with conflicting proto classes | |||
# therefore using composition instead | |||
name: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feature_view already has an attribute name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It did not reflect and threw an error when I tried to add name
|
||
def setup_method(self, milvus_online_setup): | ||
# Ensuring that the collections created are dropped before the tests are run | ||
with PymilvusConnectionContext(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's ok to re-use MilvusConnectionManager
in order to not duplicate code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to keep the test not dependent on any functionality in Milvus online store. This context manager is a very simple open and close connection, purely for testing purposes. Will not add in username, password, etc.
tags={ | ||
"is_primary": "False", | ||
"description": "float32", | ||
"dimension": "128", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking about it this is actually a great suggestion to use a tag to pass along the number of dimensions. I think we can do likewise with index algorithm and eliminate the need for a VectorFeatureView
altogether. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thats a good idea
) | ||
|
||
# Here we want to open and add a collection using pymilvus directly and close the connection, we need to check if the collection count remains 1 and exists. | ||
with PymilvusConnectionContext(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does the embedded Milvus actually persist data after it is stopped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it does
partial=None, | ||
) | ||
|
||
assert "Collection abc does not exist or is already deleted." in caplog.text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: I'd rather assert against Milvus not having any collection. The update() method should really have a return value, but that is an issue in Feast itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work!
What this PR does / why we need it:
Added updated functionality to MilvusOnlineStore. The update functionality is able to delete and create a collection.
Relevant tests added as well
Which issue(s) this PR fixes:
Fixes #