Set up nightly runs for vector search workloads #4483

Closed
2 tasks
VijayanB opened this issue Feb 28, 2024 · 5 comments
Labels
enhancement New Enhancement

Comments

@VijayanB
Member

VijayanB commented Feb 28, 2024

Is your feature request related to a problem? Please describe

Similar to https://opensearch.org/benchmarks/, we would like to set up nightly runs that execute vector search workloads, publish the performance metrics, and make them publicly available.

Describe the solution you'd like

We would like to see the following nightly runs:

  • Cohere 1M

    • Engine types: nmslib, faiss
    • space type: innerproduct
    • Cluster configuration:
      3 nodes ( 3 shards, 1 replica )
      4 CPU cores in each node
      32 GB memory in each node
    • Algo params
      hnsw_ef_construction : 100
      hnsw_ef_search : 100
      hnsw_m : 16
  • Cohere 10M

    • Engine types: nmslib, faiss
    • space type: innerproduct
    • Cluster configuration:
      3 nodes ( 3 shards, 1 replica )
      16 CPU cores in each node
      128 GB memory in each node
    • Algo params
      hnsw_ef_construction : 100
      hnsw_ef_search : 256
      hnsw_m : 16
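
To make the requested matrix concrete, here is a minimal Python sketch that enumerates the four runs above and builds a k-NN index body for each. It is an illustration under stated assumptions, not the actual workload configuration: the names `NightlyRun`, `nightly_runs`, `index_body`, the dataset labels, and the field name `target_field` are hypothetical, and the dimension of 768 is taken from the dashboard names later in this thread. It also assumes that `ef_search` is passed as an index setting for nmslib and as a method parameter for faiss; please verify this against the k-NN plugin documentation for the OpenSearch version in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NightlyRun:
    """One requested nightly configuration (values taken from the issue)."""
    dataset: str                 # hypothetical label, e.g. "cohere-1m"
    engine: str                  # "nmslib" or "faiss"
    cpu_cores_per_node: int
    memory_gb_per_node: int
    ef_search: int
    space_type: str = "innerproduct"
    ef_construction: int = 100
    m: int = 16
    nodes: int = 3
    shards: int = 3
    replicas: int = 1
    dimension: int = 768         # assumption: from the "Cohere 768D" dashboard names


def nightly_runs():
    """Enumerate dataset x engine combinations in a stable order."""
    # (dataset label, CPU cores per node, memory GB per node, hnsw_ef_search)
    sizes = [("cohere-1m", 4, 32, 100), ("cohere-10m", 16, 128, 256)]
    return [
        NightlyRun(dataset, engine, cpu, mem, ef)
        for dataset, cpu, mem, ef in sizes
        for engine in ("nmslib", "faiss")
    ]


def index_body(run, field="target_field"):
    """Build a create-index request body for one run (field name is hypothetical)."""
    params = {"ef_construction": run.ef_construction, "m": run.m}
    settings = {
        "knn": True,
        "number_of_shards": run.shards,
        "number_of_replicas": run.replicas,
    }
    if run.engine == "nmslib":
        # Assumption: nmslib reads ef_search from an index-level setting.
        settings["knn.algo_param.ef_search"] = run.ef_search
    else:
        # Assumption: faiss accepts ef_search as an HNSW method parameter.
        params["ef_search"] = run.ef_search
    return {
        "settings": {"index": settings},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": run.dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": run.engine,
                        "space_type": run.space_type,
                        "parameters": params,
                    },
                }
            }
        },
    }
```

Calling `index_body(run)` for each entry of `nightly_runs()` yields one request body per engine and dataset size, which could then be sent to the cluster with any OpenSearch client.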

Describe alternatives you've considered

No response

Additional context

Support for running these configurations should be available in opensearch-benchmarks-workloads, so that users can reproduce the runs easily without any additional steps.

VijayanB added the enhancement and untriaged labels Feb 28, 2024
jordarlu removed the untriaged label Mar 12, 2024
@jordarlu
Contributor

jordarlu commented Mar 12, 2024

Hi @rishabh6788, @IanHoang, could you please comment on this issue? Thanks!

@rishabh6788
Collaborator

Nightly benchmarks for the vector search workload have been scheduled. @VijayanB is currently reviewing the metrics with the team and will confirm whether he needs any further updates.

@bbarani
Member

bbarani commented Apr 2, 2024

@rishabh6788 @VijayanB Can you please provide an update? Is the dashboard ready for public use?

@VijayanB
Member Author

VijayanB commented Apr 2, 2024

Created 6 dashboards for Vector Search. We are reviewing the metrics and identifying any new dashboards that can be added. The dashboards are already available here (please check metrics starting March 28):

Nmslib 1M Cohere 768D
Nmslib 10M Cohere 768D
Faiss 1M Cohere 768D
Faiss 10M Cohere 768D
Lucene 1M Cohere 768D
Lucene 10M Cohere 768D

@VijayanB
Member Author

This is now available on the website. Can this issue be closed?
