feat(vertexai): add google_vertex_ai_index for Vertex AI Matching Engine #6728
Merged
20 commits
7333ffa feat: add google_vertex_ai_index for Vertex AI Matching Engine (shotarok)
c4fbed0 fix: increase timeouts to 60 min because 20 wasn't enough for creation (shotarok)
efd9549 fix: change coe to make name computed instead of an input (shotarok)
b9ba3f6 fix: use costom flatten code to ignore_read a nested property's field (shotarok)
36ac11a fix: add skip_import_test: true to the auto-gen test (shotarok)
b2d7df2 feat: add a test with a manually updated ImportStateVerifyIgnore (shotarok)
5ece43a Apply suggestions from code review [ci skip] (shotarok)
dae899c refactor: use ignore_read_extra instead of a manual test (shotarok)
b7cb7c2 fix: use an empty object for bruteForceConfig (shotarok)
60f02bb feat: define additional fields to api.yaml (shotarok)
f1fbffe feat: add an example to increase test coverage (shotarok)
d55968c feat: deal with contentsDeltaUri as an updatable field (shotarok)
38ddf5e fix: fix the error because the cosine distance type only supports uni… (shotarok)
d764090 feat: remove approximate_neighbors_count from an example with brute_f… (shotarok)
64f9a14 test: add a handwritten test for patch (shotarok)
6435466 fix: add update_mask: true to use the mask as a url param (shotarok)
92a91d1 refactor: put 'input: true' on the fields patch couldn't update (shotarok)
cda37c4 feat: use custom pre update code for a nested object (shotarok)
15001b3 fix: update the handwritten test accordingly (shotarok)
4661d7c feat: add custom flatten code for is_complete_overwrite (shotarok)
18 changes: 18 additions & 0 deletions
mmv1/templates/terraform/custom_flatten/vertex_ai_index_ignore_contents_delta_uri.go.erb
@@ -0,0 +1,18 @@
<%# The license inside this block applies to this file.
# Copyright 2022 Google Inc.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-%>
func flatten<%= prefix -%><%= titlelize_property(property) -%>(v interface{}, d *schema.ResourceData, config *Config) interface{} {
	// We want to ignore read on this field, but cannot because it is nested
	return d.Get("metadata.0.contents_delta_uri")
}
18 changes: 18 additions & 0 deletions
mmv1/templates/terraform/custom_flatten/vertex_ai_index_ignore_is_complete_overwrite.go.erb
@@ -0,0 +1,18 @@
<%# The license inside this block applies to this file.
# Copyright 2022 Google Inc.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-%>
func flatten<%= prefix -%><%= titlelize_property(property) -%>(v interface{}, d *schema.ResourceData, config *Config) interface{} {
	// We want to ignore read on this field, but cannot because it is nested
	return d.Get("metadata.0.is_complete_overwrite")
}
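Both flatteners above rely on the same trick: rather than using the value returned by the API, they echo back whatever is already recorded for the field, so a read does not produce a diff. The following is a minimal sketch of that behaviour outside the provider. `fakeResourceData` and `flattenIgnoringRead` are hypothetical names introduced only for illustration; the real templates receive a `*schema.ResourceData`.

```go
package main

import "fmt"

// fakeResourceData is a hypothetical stand-in for *schema.ResourceData.
// It only models Get on a flat key/value map.
type fakeResourceData struct {
	values map[string]interface{}
}

// Get returns the value recorded under key, or nil if the key is absent.
func (d *fakeResourceData) Get(key string) interface{} {
	return d.values[key]
}

// flattenIgnoringRead mimics the custom flatteners: it discards the value
// from the API response and returns what is already recorded for key.
func flattenIgnoringRead(apiValue interface{}, d *fakeResourceData, key string) interface{} {
	_ = apiValue // intentionally unused: the read result is ignored
	return d.Get(key)
}

func main() {
	d := &fakeResourceData{values: map[string]interface{}{
		"metadata.0.contents_delta_uri": "gs://example-bucket/contents",
	}}
	// The API value is nil here, yet the recorded URI is what gets returned.
	fmt.Println(flattenIgnoringRead(nil, d, "metadata.0.contents_delta_uri"))
}
```

The design choice is that the nested field keeps its configured value across refreshes, at the cost of never detecting drift on that field.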
@@ -0,0 +1,40 @@
resource "google_storage_bucket" "bucket" {
  name = "<%= ctx[:test_env_vars]['project'] %>-<%= ctx[:vars]['bucket_name'] %>" # Every bucket name must be globally unique
  location = "us-central1"
  uniform_bucket_level_access = true
}

# The sample data comes from the following link:
# https://cloud.google.com/vertex-ai/docs/matching-engine/filtering#specify-namespaces-tokens
resource "google_storage_bucket_object" "data" {
  name = "contents/data.json"
  bucket = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "index" {
  labels = {
    foo = "bar"
  }
  region = "us-central1"
  display_name = "<%= ctx[:vars]['display_name'] %>"
  description = "index for test"
  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions = 2
      approximate_neighbors_count = 150
      distance_measure_type = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"
}
37 changes: 37 additions & 0 deletions
mmv1/templates/terraform/examples/vertex_ai_index_streaming.tf.erb
@@ -0,0 +1,37 @@
resource "google_storage_bucket" "bucket" {
  name = "<%= ctx[:test_env_vars]['project'] %>-<%= ctx[:vars]['bucket_name'] %>" # Every bucket name must be globally unique
  location = "us-central1"
  uniform_bucket_level_access = true
}

# The sample data comes from the following link:
# https://cloud.google.com/vertex-ai/docs/matching-engine/filtering#specify-namespaces-tokens
resource "google_storage_bucket_object" "data" {
  name = "contents/data.json"
  bucket = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "index" {
  labels = {
    foo = "bar"
  }
  region = "us-central1"
  display_name = "<%= ctx[:vars]['display_name'] %>"
  description = "index for test"
  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions = 2
      distance_measure_type = "COSINE_DISTANCE"
      feature_norm_type = "UNIT_L2_NORM"
      algorithm_config {
        brute_force_config {}
      }
    }
  }
  index_update_method = "STREAM_UPDATE"
}
22 changes: 22 additions & 0 deletions
mmv1/templates/terraform/pre_update/vertex_ai_index.go.erb
@@ -0,0 +1,22 @@
newUpdateMask := []string{}

if d.HasChange("metadata.0.contents_delta_uri") {
	// Use the current value of isCompleteOverwrite when updating contentsDeltaUri
	newUpdateMask = append(newUpdateMask, "metadata.contentsDeltaUri")
	newUpdateMask = append(newUpdateMask, "metadata.isCompleteOverwrite")
}

for _, mask := range updateMask {
	// Use granular update masks instead of 'metadata' to avoid the following error:
	// 'If `contents_delta_gcs_uri` is set as part of `index.metadata`, then no other Index fields can be also updated as part of the same update call.'
	if mask == "metadata" {
		continue
	}
	newUpdateMask = append(newUpdateMask, mask)
}

// Refreshing updateMask after adding extra schema entries
url, err = addQueryParams(url, map[string]string{"updateMask": strings.Join(newUpdateMask, ",")})
if err != nil {
	return err
}
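The mask rewriting in this pre-update snippet can be exercised on its own. The sketch below restates the same logic as a pure function so it can be run without the provider. `granularUpdateMask` is a hypothetical name, and its `contentsChanged` parameter stands in for `d.HasChange("metadata.0.contents_delta_uri")`.

```go
package main

import (
	"fmt"
	"strings"
)

// granularUpdateMask mirrors the pre-update logic as a pure function.
// updateMask is the mask built before this step; contentsChanged stands in
// for d.HasChange("metadata.0.contents_delta_uri").
func granularUpdateMask(updateMask []string, contentsChanged bool) string {
	newUpdateMask := []string{}
	if contentsChanged {
		// Send isCompleteOverwrite together with contentsDeltaUri.
		newUpdateMask = append(newUpdateMask,
			"metadata.contentsDeltaUri",
			"metadata.isCompleteOverwrite",
		)
	}
	for _, mask := range updateMask {
		// Drop the coarse "metadata" entry in favour of the granular ones.
		if mask == "metadata" {
			continue
		}
		newUpdateMask = append(newUpdateMask, mask)
	}
	return strings.Join(newUpdateMask, ",")
}

func main() {
	fmt.Println(granularUpdateMask([]string{"description", "metadata"}, true))
	// prints: metadata.contentsDeltaUri,metadata.isCompleteOverwrite,description
}
```

Note that when only `metadata` is in the incoming mask and the contents URI has not changed, the resulting mask is empty.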
If you set `update_mask: true` here, that will make it so that only updated fields are included in the PATCH request. It sounds like the PATCH request has to exclude contentsDeltaUri if it wants to update other fields, so this should resolve the issue.

You might also need to set `update_mask_fields` for all nested fields in order for them to be updated properly; it varies from API to API. I'd recommend testing without `update_mask_fields` first.
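As a rough illustration of the idea behind an update mask (only the fields that changed are named in the request), the sketch below derives a mask from two flat field maps. It is not the code Magic Modules generates; `changedFieldMask` and the sample field values are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// changedFieldMask returns a sorted, comma-separated list of keys whose
// values differ between prior and planned, including added or removed keys.
func changedFieldMask(prior, planned map[string]string) string {
	changed := []string{}
	for k, v := range planned {
		if old, ok := prior[k]; !ok || old != v {
			changed = append(changed, k)
		}
	}
	for k := range prior {
		if _, ok := planned[k]; !ok {
			changed = append(changed, k)
		}
	}
	sort.Strings(changed)
	return strings.Join(changed, ",")
}

func main() {
	prior := map[string]string{"description": "index for test", "displayName": "my-index"}
	planned := map[string]string{"description": "updated description", "displayName": "my-index"}
	fmt.Println(changedFieldMask(prior, planned)) // prints: description
}
```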
Thank you so much for your clear advice! I'll give it a try with a non-nested field first
Yes, `update_mask: true` worked for the case of updating the description locally. I'll add the other `update_mask_fields` later.
I tried to update `update_mask_fields` for multi-nested objects. Only `metadata.{isCompleteOverwrite,contentsDeltaUri}` are applied. Therefore, I'll put all fields in `update_mask_fields` of `metadata`.
I tested updating `metadata.config.dimensions` with the above strategy. However, `metadata.config.dimensions` couldn't be updated. I wonder whether, if we put `metadata.contentsDeltaUri` in the updateMask query parameter, we can't update the other metadata fields at the same time, based on this quote: [details]

It seems the current `update_mask_fields` can deal with only top-level nested objects. I should use custom code with `pre_update`.
I could use granular update masks with the `pre_update` custom code. However, it turns out the patch can't update any of the fields in `metadata.config`, nor `index_update_method`.

It seems the patch can update only `contentsDelta` and `isCompleteOverwrite` under the metadata in-place. Because even those fields don't appear in the response of a get, we need `ignore_read_extra` for them, though...

Anyway, I'll push code to use granular update masks. And I'll put `input: true` on the unpatchable fields.
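To make the resulting split concrete, here is a small sketch that separates changed fields into those the thread found patchable in place and those that are not. The patchable set is taken from the comments above (the two metadata fields and `description`). Treating every other field as forcing recreation is an assumption about what `input: true` implies, and `fieldsForcingRecreate` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// patchableInPlace lists the fields the discussion reports as updatable
// via PATCH. Anything else is assumed (not confirmed) to need recreation.
var patchableInPlace = map[string]bool{
	"metadata.contents_delta_uri":    true,
	"metadata.is_complete_overwrite": true,
	"description":                    true,
}

// fieldsForcingRecreate returns, in sorted order, the changed fields that
// are not in the patchable set.
func fieldsForcingRecreate(changed []string) []string {
	blocked := []string{}
	for _, f := range changed {
		if !patchableInPlace[f] {
			blocked = append(blocked, f)
		}
	}
	sort.Strings(blocked)
	return blocked
}

func main() {
	changed := []string{"description", "metadata.config.dimensions", "index_update_method"}
	fmt.Println(fieldsForcingRecreate(changed))
}
```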