
Set random sampling as the default sampling method for ingestion #446

Draft
wants to merge 1 commit into
base: main
from

Conversation

NikolaosPapailiou
Collaborator

@NikolaosPapailiou NikolaosPapailiou commented Jul 15, 2024

Defaulting to FIRST_N sampling can cause quality and ingestion partition-balancing problems for users who are not aware of it and are experimenting with TileDB vector search.

FIRST_N can give some ingestion performance boost, but it is better for users to opt into it explicitly than to run into hidden quality issues.
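
To illustrate the concern, here is a minimal standalone sketch of why taking the first N items can give an unrepresentative training sample when the stored data happens to be ordered. It does not use the TileDB vector search API. The helper names (`first_n_sample`, `random_sample`, `coverage`) and the sorted toy dataset are assumptions made for illustration only.

```python
import random


def first_n_sample(data, n):
    """Take the first n items in stored order (FIRST_N-style sampling)."""
    return data[:n]


def random_sample(data, n, seed=0):
    """Take n items uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(data, n)


def coverage(sample, lo, hi):
    """Fraction of the full value range [lo, hi] spanned by the sample."""
    return (max(sample) - min(sample)) / (hi - lo)


# Hypothetical dataset: 10,000 scalar values stored in sorted order,
# standing in for vectors that arrive grouped by some property.
data = [i / 10_000 for i in range(10_000)]
n = 100

first_cov = coverage(first_n_sample(data, n), data[0], data[-1])
rand_cov = coverage(random_sample(data, n), data[0], data[-1])

print(f"first-n coverage: {first_cov:.3f}")
print(f"random coverage:  {rand_cov:.3f}")
```

In this toy setup the first-N sample only sees the lowest slice of the value range, while the random sample spans most of it. Centroids trained on the first-N sample would therefore be concentrated in one region, which is the kind of partition imbalance described above.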

sc-50492
