[DocDB] Pack full row updates #13291

Closed
kmuthukk opened this issue Jul 13, 2022 · 0 comments
Assignees
Labels
area/docdb (YugabyteDB core features), kind/enhancement (This is an enhancement of an existing feature), priority/medium (Medium priority issue)

Comments

Collaborator

kmuthukk commented Jul 13, 2022

Jira Link: DB-2926

Description

With the packed columns feature, we now pack all the columns in the initial insert of a row as one entry in DocDB/RocksDB rather than as separate entries.

But subsequent updates to the row are kept in the older one-entry-per-column format. These partial updates aren't packed because there could be a large number of them, and a read for a specific column would, in the worst case, have to look through all of them to find the column's latest value. For instance,

// Say you insert a row with 100 non-primary-key columns for primary key k.
INSERT INTO T(k, c1, c2, ..., c100) VALUES (....)

With the packed columns feature, this will get stored as one entry k -> { c1: ?, c2: ?, .... c100: ? } in DocDB.

// Now say you run 20 updates, each touching only columns c1..c10.
UPDATE T SET c1 = ?, ..., c10 = ? where k = 'k';
.. 20 times
UPDATE T SET c1 = ?, ..., c10 = ? where k = 'k';

And now you want to read column c25 of this row k.

  • If you store UPDATEs in exploded (one entry per column) format, you only have to look for the most recent unpacked entry k.c25 and, if it is not present, look at the most recent packed entry k to see whether it contains c25.

  • On the other hand, if you also store these partial UPDATEs in packed format, you have to look at each of those 20 "partially packed entries" to see whether it contains c25. Each one will be a miss (because they only contain c1 through c10), and only then will you reach the fully packed initial insert for k and find c25.
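
The difference between the two read paths can be illustrated with a small toy model. This is only a sketch: the dictionaries, the function names `read_exploded` and `read_packed_partials`, and the "entries examined" counter are assumptions made for illustration, and none of it reflects DocDB's actual storage format or iterator logic.

```python
# Toy model of the scenario above; NOT DocDB's real storage format or read path.

def read_exploded(packed_row, column_entries, column):
    """UPDATEs stored one entry per column.

    Returns (value, entries_examined)."""
    examined = 1  # probe for the newest unpacked entry of this column (e.g. k.c25)
    if column in column_entries:
        return column_entries[column], examined
    examined += 1  # miss: fall back to the newest fully packed row for k
    return packed_row.get(column), examined


def read_packed_partials(history, column):
    """Partial UPDATEs also stored as packed entries.

    `history` is ordered newest first. Returns (value, entries_examined)."""
    examined = 0
    for entry in history:
        examined += 1
        if column in entry:
            return entry[column], examined
    return None, examined


# Initial INSERT with 100 non-key columns, then 20 UPDATEs of c1..c10.
initial_row = {f"c{i}": 0 for i in range(1, 101)}
updates = [{f"c{i}": n for i in range(1, 11)} for n in range(1, 21)]

# Exploded format: only the newest value per updated column matters for a read.
latest_per_column = {}
for update in updates:
    latest_per_column.update(update)

# Packed partial format: every UPDATE is its own entry, newest first.
history = list(reversed(updates)) + [initial_row]

exploded_c25 = read_exploded(initial_row, latest_per_column, "c25")
packed_c25 = read_packed_partials(history, "c25")
print("exploded:", exploded_c25)  # (0, 2): one probe for k.c25, one for packed k
print("packed partials:", packed_c25)  # (0, 21): 20 misses, then the initial insert
```

In this model, reading c25 examines 2 entries when updates are exploded and 21 entries when partial updates are packed, which is the asymmetry described above.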

Proposal

  • It would be worthwhile and reasonably simple to recognize the special case where the UPDATE touches all columns of the row, and to optimize this case by storing such "full" UPDATEs in the "packed" format.

  • While the discussion above uses tables as an example, the same optimization applies to indices, since at the storage layer indices are similar to tables. If all INCLUDED columns of an index are being updated, we should also record the updated INDEX entry in packed format.
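
The check proposed above could look roughly like the following sketch. The function name `choose_update_format` and the string results are hypothetical, chosen only to illustrate the decision; the real write path is not shown in this issue.

```python
# Hypothetical sketch of the proposed decision; not the actual DocDB write path.

def choose_update_format(value_columns, updated_columns):
    """Pick the storage format for an UPDATE.

    value_columns: all non-key columns of the table, or all INCLUDED
                   columns of an index.
    updated_columns: the columns assigned by the UPDATE.
    """
    missing = set(value_columns) - set(updated_columns)
    # Only a "full" update, one that sets every value column, is packed.
    return "packed" if not missing else "per_column"


table_columns = [f"c{i}" for i in range(1, 101)]

full_update = choose_update_format(table_columns, table_columns)
partial_update = choose_update_format(table_columns, [f"c{i}" for i in range(1, 11)])
index_update = choose_update_format(["c3", "c7"], ["c3", "c7", "c9"])

print(full_update, partial_update, index_update)
```

A partial update such as the c1..c10 example keeps the per-column format, so the read path for untouched columns stays at two lookups.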

@kmuthukk kmuthukk added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels Jul 13, 2022
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jul 13, 2022
@yugabyte-ci yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Jul 19, 2022
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug labels Jul 29, 2022
Huqicheng added a commit that referenced this issue Jul 29, 2022
Summary:
The packed columns feature packs all columns of an INSERT into one entry instead of writing a separate entry for each column.

With this diff, we support storing the updated row in packed format when doing a full row update (updating all non-key columns).

Test Plan:
./yb_build.sh --cxx-test pg_packed_row-test --gtest_filter PgPackedRowTest.Update
./yb_build.sh --cxx-test pg_packed_row-test --gtest_filter PgPackedRowTest.UpdateReturning

Reviewers: bogdan, mihnea, tnayak, dmitry, sergei

Reviewed By: sergei

Subscribers: ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D18450
@bmatican bmatican closed this as completed Aug 5, 2022

4 participants