Add new format short key index #1572

Closed

imay commented Aug 1, 2019

This patch introduces a new format for the short key index. In the original
code, the index is stored in a RowCursor-like format, which is inefficient to
compare. Now we encode multiple columns into a single binary string, and we
guarantee that this binary sorts in the same order as the key columns.
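
As a rough illustration of what "sorts in the same order as the key columns" can mean, here is a minimal sketch of an order-preserving encoding for a two-column key (an int32 followed by a string). The helper names `encode_int32` and `encode_key` are hypothetical and are not taken from this patch; the actual encoding in `key_coder.h` may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the patch): append a signed 32-bit integer
// so that bytewise order of the output matches numeric order of the input.
inline void encode_int32(int32_t v, std::string* out) {
    // Flipping the sign bit maps INT32_MIN..INT32_MAX onto 0..UINT32_MAX.
    uint32_t u = static_cast<uint32_t>(v) ^ 0x80000000u;
    // Big-endian output puts the most significant byte first, which is
    // what a bytewise comparison needs.
    for (int shift = 24; shift >= 0; shift -= 8) {
        out->push_back(static_cast<char>((u >> shift) & 0xFF));
    }
}

// Hypothetical helper: encode a two-column key (int32, string).
// The string is the last column here, so its raw bytes are appended as-is.
// A variable-length column in a non-final position would need a terminator
// or escaping scheme, which this sketch leaves out.
inline std::string encode_key(int32_t a, const std::string& b) {
    std::string out;
    encode_int32(a, &out);
    out.append(b);
    return out;
}
```

With an encoding like this, comparing two keys reduces to a single bytewise comparison of the encoded strings instead of a column-by-column comparison.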

// How many rows in this segment
optional uint32 num_segment_rows = 6;
// Total bytes for this segment
optional uint32 segment_bytes = 7;

What are segment_id and segment_bytes used for?


What is the difference between num_items and num_segment_rows?


Can we also add a version field so that we can easily evolve the format in the future?

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the usage of segment_id, and segment_bytes ?

segment_id is the segment's ID within this rowset.
segment_bytes stores the number of bytes in this segment.

I put these fields here because in the old version the storage layer gets this information from the index. They live here for now and may move elsewhere later.

num_items is the number of index items; num_segment_rows is the total number of rows in the segment.

As for a version field, we can add one.
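
To make the distinction between the two counters concrete, here is a small sketch of how they could relate. It assumes the builder emits one index item per block of `num_rows_per_block` rows, including a final partial block; the thread does not confirm that rule, and `expected_num_items` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Assumption (not confirmed by the PR): one index item is written for each
// block of num_rows_per_block rows, including a trailing partial block.
inline uint32_t expected_num_items(uint32_t num_segment_rows,
                                   uint32_t num_rows_per_block) {
    assert(num_rows_per_block > 0);
    if (num_segment_rows == 0) return 0;
    // Ceiling division written to avoid overflow near UINT32_MAX.
    return (num_segment_rows - 1) / num_rows_per_block + 1;
}
```

Under that assumption, a segment with 4097 rows and 1024 rows per block would have 5 index items but 4097 segment rows.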

ASSERT_EQ(val, check_val);
}
}


Better to wrap the following block in a loop that tests, say, 10000 random pairs.
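
A randomized loop along these lines might look like the sketch below. The test code under review is only partially visible here, so the encoder and decoder are hypothetical stand-ins (`encode_int32`, `decode_int32`), and `count_failures` is an illustrative name rather than anything from the patch. Each iteration checks both the round trip and that encoded order matches numeric order.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>
#include <string>

// Hypothetical stand-in for the encoder under test: sign-bit flip plus
// big-endian bytes, so bytewise order matches numeric order.
inline std::string encode_int32(int32_t v) {
    uint32_t u = static_cast<uint32_t>(v) ^ 0x80000000u;
    std::string out;
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<char>((u >> shift) & 0xFF));
    }
    return out;
}

// Hypothetical stand-in for the matching decoder.
inline int32_t decode_int32(const std::string& buf) {
    uint32_t u = 0;
    for (int i = 0; i < 4; ++i) {
        u = (u << 8) | static_cast<unsigned char>(buf[i]);
    }
    return static_cast<int32_t>(u ^ 0x80000000u);
}

inline int sign(int x) { return (x > 0) - (x < 0); }

// Draw num_pairs random pairs and count those where either the round trip
// or the ordering property fails.
inline int count_failures(int num_pairs, uint32_t seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int32_t> dist(
        std::numeric_limits<int32_t>::min(),
        std::numeric_limits<int32_t>::max());
    int failures = 0;
    for (int i = 0; i < num_pairs; ++i) {
        int32_t a = dist(rng);
        int32_t b = dist(rng);
        std::string ea = encode_int32(a);
        std::string eb = encode_int32(b);
        bool round_trip_ok = decode_int32(ea) == a && decode_int32(eb) == b;
        bool order_ok = sign(ea.compare(eb)) == ((a > b) - (a < b));
        if (!round_trip_ok || !order_ok) ++failures;
    }
    return failures;
}
```

Using a fixed seed keeps a failing run reproducible; in a gtest case the final check could be `ASSERT_EQ(0, count_failures(10000, seed))`.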

// equal with or larger than given key.
// NOTE: This function holds that without common prefix key, the one
// who has more length it the bigger one. Two key is the same only
// when their length are equal

This comment should be revised; it is confusing.
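
One possible reading of the quoted comment is: keys are compared bytewise, and when one key is a prefix of the other, the longer key is the larger one, so two keys are equal only if their lengths match. The sketch below shows that rule together with a lower-bound lookup over sorted encoded keys. The names `compare_keys` and `lower_bound_index` are hypothetical, and this interpretation is an assumption rather than the patch's confirmed behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Compare two encoded keys bytewise. If one key is a prefix of the other,
// the longer key is considered larger; keys are equal only when both the
// bytes and the lengths match.
inline int compare_keys(const std::string& a, const std::string& b) {
    std::size_t n = std::min(a.size(), b.size());
    int r = std::memcmp(a.data(), b.data(), n);
    if (r != 0) return r;
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;
}

// Return the position of the first entry that is equal to or larger than
// key, or sorted_keys.size() if every entry is smaller.
inline std::size_t lower_bound_index(const std::vector<std::string>& sorted_keys,
                                     const std::string& key) {
    auto it = std::lower_bound(
        sorted_keys.begin(), sorted_keys.end(), key,
        [](const std::string& entry, const std::string& k) {
            return compare_keys(entry, k) < 0;
        });
    return static_cast<std::size_t>(it - sorted_keys.begin());
}
```

For example, with index entries "ab", "abc", "b", a lookup for "aba" would land on "abc", because "ab" is a strict prefix of "aba" and therefore smaller.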

class ShortKeyIndexBuilder {
public:
ShortKeyIndexBuilder(uint32_t segment_id,
uint32_t num_rows_per_block) {

what are these two arguments for?


imay commented Aug 5, 2019

Closing this PR because this patch is included in #1577.

imay closed this Aug 5, 2019
luwei16 pushed a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
…#17851)" (apache#1572)

move load big lateral view from p1 to p2, this case takes a long time to execute

Co-authored-by: Pxl <pxl290@qq.com>
SWJTU-ZhangLei pushed a commit to SWJTU-ZhangLei/incubator-doris that referenced this pull request Jul 25, 2023
…#17851)" (apache#1572)

move load big lateral view from p1 to p2, this case takes a long time to execute

Co-authored-by: Pxl <pxl290@qq.com>
3 participants