add nb test cases for vector search jsonapi #512

Merged: 4 commits, Sep 7, 2023

@Yuqi-Du Yuqi-Du commented Aug 21, 2023

What this PR does:
Adds a new benchmark file for vector search through the JSON API.

Checklist

  • Changes manually tested
  • Automated Tests added/updated
  • Documentation added/updated
  • CLA Signed: DataStax CLA

@Yuqi-Du Yuqi-Du requested a review from a team as a code owner August 21, 2023 15:29
This workflow is perfect for testing Stargate performance using your own JSON dataset or any other realistic dataset.

In contrast to other workflows, this one is not split into ramp-up and main phases.
Instead, there is only the main phase with 4 different load types (write, read, update and delete).
Contributor

Is there a specific reason why no ramp-up is defined?

- `dataset_file` - the file to read the JSON documents from (note that if number of documents in a file is smaller than the `docscount` parameter, the documents will be reused)
- `connections` - number of HTTP2 connections to be shared between the threads (default: `20`)
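
The reuse behaviour described for `dataset_file` can be pictured with a small Python sketch. This is only an illustration of the wrap-around idea, not how the benchmark tool implements it. The function name `documents_for_run` and the one-JSON-document-per-line layout are assumptions made for the example.

```python
import json
from itertools import cycle, islice


def documents_for_run(dataset_lines, docscount):
    """Return exactly `docscount` documents, reusing earlier ones if the dataset is shorter.

    Assumes (hypothetically) one JSON document per non-empty line.
    """
    parsed = [json.loads(line) for line in dataset_lines if line.strip()]
    if not parsed:
        raise ValueError("dataset contains no documents")
    # cycle() restarts from the first document once the dataset is exhausted.
    return list(islice(cycle(parsed), docscount))


sample = ['{"n": 1}', '{"n": 2}', '{"n": 3}']
docs = documents_for_run(sample, 5)
print([d["n"] for d in docs])  # [1, 2, 3, 1, 2]
```

With three documents and a requested count of five, the first two documents are served a second time.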

note that too many docscount requires larger storage for cassandra backend
Contributor

What does "too many" here mean? Could perhaps be reworded to indicate that if a large number is used, Cassandra backend resources need to be adjusted accordingly (to allow storing all specific documents).

Contributor Author

Thanks, I will change this to a proper statement.

Contributor

@tatu-at-datastax tatu-at-datastax left a comment

LGTM, but I think someone with more experience (@kathirsvn or @maheshrajamani) should probably give the actual approval.

@Yuqi-Du Yuqi-Du merged commit d392ec2 into main Sep 7, 2023
@Yuqi-Du Yuqi-Du deleted the vector_nosqlbench branch September 7, 2023 23:41
3 participants