Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update several minor in conjugate graph #147

Merged
merged 1 commit into from
Dec 2, 2024

Conversation

ShawnShawnYou
Copy link
Collaborator

This PR contains several minor changes in conjugate graph:

  • support int8 in conjugate graph
  • support using conjugate graph in fresh_hnsw
  • add multi-thread test
  • modify search logic with conjugate graph
  • add maximum degree limit in conjugate graph

@ShawnShawnYou ShawnShawnYou requested a review from inabao November 17, 2024 15:49
@ShawnShawnYou ShawnShawnYou force-pushed the update-serveral-minor-in-conjugate-graph branch from 768d717 to 8a50245 Compare November 17, 2024 15:53
}

// return result
while (results.size() > k) {
results.pop();
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comparing with original search process, here only move partial pop operations from hnswalg to hnsw. So it wouldn't affect the search speed.

std::priority_queue<std::pair<float, size_t>> results;
double time_cost;
try {
Timer t(time_cost);
if (use_conjugate_graph_ and params.use_conjugate_graph_search) {
k = LOOK_AT_K;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here adjust k to provide enough information for conjugate graph to enhance result

if (this->is_empty()) {
return 0;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here allow use conjugate graph as default setting without influencing search process

@@ -27,6 +27,9 @@ ConjugateGraph::AddNeighbor(int64_t from_tag_id, int64_t to_tag_id) {
return false;
}
auto& neighbor_set = conjugate_graph_[from_tag_id];
if (neighbor_set.size() >= MAXIMUM_DEGREE) {
return false;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here limit out degree to limit graph size and enhance latency

std::priority_queue<std::pair<float, size_t>> results;
double time_cost;
try {
Timer t(time_cost);
if (use_conjugate_graph_ and params.use_conjugate_graph_search) {
k = LOOK_AT_K;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider the situation when k > LOOK_AT_K

@@ -701,10 +709,15 @@ HNSW::brute_force(const DatasetPtr& query, int64_t k) {
float* dists = (float*)allocator_->Allocate(sizeof(float) * k);
result->Distances(dists);

auto vector = query->GetFloat32Vectors();
const void* vector;
if (type_ == DataTypes::DATA_TYPE_INT8) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use get_vectors here to replace the judgement.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

auto base = Dataset::Make();
auto generated_query = Dataset::Make();
if (type_ == DataTypes::DATA_TYPE_INT8) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto


for (const int64_t& base_tag_id : base_tag_ids) {
try {
base->Float32Vectors(this->alg_hnsw_->getDataByLabel(base_tag_id));
if (type_ == DataTypes::DATA_TYPE_INT8) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto


for (int d = 0; d < dim_; d++) {
generated_data.get()[d] = vsag::GENERATE_OMEGA * base->GetFloat32Vectors()[d] +
(1 - vsag::GENERATE_OMEGA) * topk_data[d];
if (type_ == DataTypes::DATA_TYPE_INT8) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@ShawnShawnYou ShawnShawnYou force-pushed the update-serveral-minor-in-conjugate-graph branch 2 times, most recently from 0c3eeb7 to 1068b44 Compare November 19, 2024 08:25
@ShawnShawnYou ShawnShawnYou changed the title Update several minor in conjugate graph update several minor in conjugate graph Nov 19, 2024
@ShawnShawnYou ShawnShawnYou force-pushed the update-serveral-minor-in-conjugate-graph branch from 1068b44 to 2ea5ac9 Compare November 19, 2024 08:40
@ShawnShawnYou ShawnShawnYou requested a review from LHT129 November 19, 2024 12:35
->Owner(false)
->NumElements(num_element);
} else if (type_ == DataTypes::DATA_TYPE_INT8) {
base->Int8Vectors((int8_t*)vectors_ptr)->Dim(dim_)->Owner(false)->NumElements(num_element);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent

} else if (type_ == DataTypes::DATA_TYPE_INT8) {
base->Int8Vectors((int8_t*)vectors_ptr)->Dim(dim_)->Owner(false)->NumElements(num_element);
} else {
throw std::invalid_argument(fmt::format("no support for this metric: {}", (int)type_));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no support for this data type: ?

Copy link
Collaborator

@jiaweizone jiaweizone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ShawnShawnYou ShawnShawnYou force-pushed the update-serveral-minor-in-conjugate-graph branch from 26f2be5 to 64e3f4d Compare November 25, 2024 08:07
Copy link
Collaborator

@wxyucs wxyucs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@wxyucs wxyucs added the kind/feature New feature or request label Dec 2, 2024
@ShawnShawnYou ShawnShawnYou force-pushed the update-serveral-minor-in-conjugate-graph branch from ec373ec to dd48c34 Compare December 2, 2024 08:26
@pull-request-size pull-request-size bot added size/XL and removed size/L labels Dec 2, 2024
Signed-off-by: zhongxiaoyao.zxy <zhongxiaoyao.zxy@antgroup.com>
@ShawnShawnYou ShawnShawnYou force-pushed the update-serveral-minor-in-conjugate-graph branch from dd48c34 to 642866a Compare December 2, 2024 08:29
@pull-request-size pull-request-size bot added size/L and removed size/XL labels Dec 2, 2024
@ShawnShawnYou ShawnShawnYou merged commit 977b2c6 into main Dec 2, 2024
2 checks passed
@ShawnShawnYou ShawnShawnYou deleted the update-serveral-minor-in-conjugate-graph branch December 2, 2024 11:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature New feature or request size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants