[FEATURE] wrapper for train API for creating knn index models #291

Closed
armenabnousi opened this issue Feb 14, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@armenabnousi

Is your feature request related to a problem?

I'm not able to use the Python client to train k-NN index models.

What solution would you like?

The PQ and IVF indexing methods that were added to OpenSearch require training before the entire set of documents is indexed. This is done through the Train API in OpenSearch. The opensearch-py client does not seem to provide a wrapper for this API. Adding a feature that supports it would be very useful.
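
For illustration, here is a minimal sketch of what a Train API request body for an IVF model with a PQ encoder might look like. The helper name `build_train_body` and the default values are made up for this example. The top-level field names are the ones the Train API takes; the `faiss` engine, the `l2` space type, and the `nlist`, `m`, and `code_size` parameter names reflect my understanding of the k-NN method definitions, so check them against the docs for your OpenSearch version.

```python
def build_train_body(training_index, training_field, dimension,
                     nlist=128, m=8, code_size=8, description=None):
    """Build a Train API body for an IVF model with a PQ encoder (hypothetical helper)."""
    # Product quantization splits each vector into m equal sub-vectors,
    # so the dimension has to be divisible by m.
    if dimension % m != 0:
        raise ValueError(f"dimension {dimension} is not divisible by m={m}")

    body = {
        "training_index": training_index,   # index that holds the training vectors
        "training_field": training_field,   # knn_vector field to read them from
        "dimension": dimension,
        "method": {
            "name": "ivf",
            "engine": "faiss",               # assumption: IVF and PQ are Faiss methods
            "space_type": "l2",              # assumption: placement may differ by version
            "parameters": {
                "nlist": nlist,              # number of IVF buckets
                "encoder": {
                    "name": "pq",
                    "parameters": {"m": m, "code_size": code_size},
                },
            },
        },
    }
    if description is not None:
        body["description"] = description
    return body
```

The resulting dict can be sent as the JSON body of `POST /_plugins/_knn/models/{model_id}/_train`.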

What alternatives have you considered?

Calling the API directly (without the Python client). It works, but it's not ideal.
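
As a rough sketch of that workaround, the request can be built with nothing but the standard library. The function name `build_train_request` and the example base URL are placeholders; authentication and TLS settings are left out and would need to be added for a real cluster.

```python
import json
import urllib.request


def build_train_request(base_url, model_id, body):
    """Build (but do not send) a POST request to the k-NN Train API."""
    url = f"{base_url.rstrip('/')}/_plugins/_knn/models/{model_id}/_train"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Sending it would look like this (needs a reachable cluster):
# with urllib.request.urlopen(build_train_request("http://localhost:9200", "my-model", body)) as resp:
#     print(json.load(resp))
```

Keeping request construction separate from sending makes the URL and payload easy to inspect before anything hits the cluster.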

Do you have any additional context?

Add any other context or screenshots about the feature request here.

@armenabnousi armenabnousi added enhancement New feature or request untriaged Need triage labels Feb 14, 2023
@armenabnousi armenabnousi changed the title [FEATURE] [FEATURE] wrapper for train API for creating knn index models Feb 14, 2023
@wbeckler wbeckler removed the untriaged Need triage label Feb 15, 2023
@wbeckler
Contributor

You are right, the plugins don't generally have methods in the clients. Feel free to raise a PR to add them if you're up for it.

@ReinGrad

### Support for PQ and IVF indexing methods via the Train API in opensearch-py

The opensearch-py client currently has no built-in support for training models for the IVF and PQ indexing methods via the Train API. The feature is available in OpenSearch and can be used via direct API calls or other clients such as Elasticsearch-py or Requests-OpenSearch. However, for users who prefer opensearch-py, it would be helpful to have it built into the client.

Alternative Solutions:

1. Developing and adding a feature to the opensearch-py client that supports model training for the IVF and PQ indexing methods.

2. Using another client for the OpenSearch API that already supports model training for the IVF and PQ indexing methods, such as Elasticsearch-py or Requests-OpenSearch.

3. Adding a function to the opensearch-py client that supports the PQ and IVF indexing methods via the Train API.

Expected Outcome:

Having this feature in the OpenSearch-py client will make it more convenient for users who want to use the PQ and IVF indexing methods for model training, and will provide a more complete experience for users of the OpenSearch-py client.

A possible opensearch-py subclass that supports the PQ and IVF indexing methods via the Train API:

from opensearchpy import OpenSearch
# _make_path is a private helper defined in opensearchpy.client.utils, not opensearchpy.helpers.
from opensearchpy.client.utils import _make_path


class KNNClient(OpenSearch):
    """OpenSearch client with thin wrappers around the k-NN plugin's Train API."""

    def _train(self, path, preference, **body):
        # As far as I can tell, "preference" is the ID of the node that should run
        # training; it is only sent when the caller sets it.
        params = {"preference": preference} if preference else {}
        return self.transport.perform_request("POST", path, params=params, body=body)

    def train_knn_model(self, model_id, training_index, training_field, dimension, max_training_vector_count, search_size, description, method, preference=None):
        # POST /_plugins/_knn/models/{model_id}/_train
        path = _make_path("_plugins", "_knn", "models", model_id, "_train")
        return self._train(
            path,
            preference,
            training_index=training_index,
            training_field=training_field,
            dimension=dimension,
            max_training_vector_count=max_training_vector_count,
            search_size=search_size,
            description=description,
            method=method,
        )

    def train_knn_model_auto_id(self, training_index, training_field, dimension, max_training_vector_count, search_size, description, method, preference=None):
        # POST /_plugins/_knn/models/_train (the cluster generates the model ID)
        path = _make_path("_plugins", "_knn", "models", "_train")
        return self._train(
            path,
            preference,
            training_index=training_index,
            training_field=training_field,
            dimension=dimension,
            max_training_vector_count=max_training_vector_count,
            search_size=search_size,
            description=description,
            method=method,
        )
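
Since training runs in the background after the Train API call returns, a caller would probably also want to poll the model until it is ready. Below is a minimal sketch of that, assuming the Get Model endpoint (`GET /_plugins/_knn/models/{model_id}`) reports a `state` of `training`, `created`, or `failed`, and that `perform_request` returns the parsed JSON body. The helper name `wait_for_model` and its parameters are hypothetical.

```python
import time


def wait_for_model(client, model_id, timeout=300.0, interval=2.0, sleep=time.sleep):
    """Poll the k-NN Get Model API until training finishes (hypothetical helper)."""
    path = f"/_plugins/_knn/models/{model_id}"
    deadline = time.monotonic() + timeout
    while True:
        info = client.transport.perform_request("GET", path)
        state = info.get("state")
        if state == "created":
            return info  # model is trained and ready to be used by an index
        if state == "failed":
            raise RuntimeError(f"training of {model_id} failed: {info.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"model {model_id} still in state {state!r}")
        sleep(interval)  # injectable so tests do not have to wait
```

It works with any object that exposes `transport.perform_request`, so it could be used with a plain `OpenSearch` instance or the `KNNClient` subclass above.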

@wbeckler
Contributor

I think it's a matter of time before a contributor adds this feature to the client, and you could be the first. Feel free to PR this feature.

@wbeckler
Contributor

wbeckler commented Jul 8, 2023

Have you looked at the opensearch-py-ml client?

@dblock
Member

dblock commented Nov 10, 2023

Closing as a dup of #300.

@dblock dblock closed this as completed Nov 10, 2023