[#23330] docdb: fixed static columns handling for CQL operations
Summary: For CQL tables whose primary key has both hash and range components, `INSERT INTO ... RETURNS STATUS AS ROW` might perform useless extra reads. When there are many deleted (not yet compacted) rows with the same hash key column value, the INSERT might read all of those rows while it is processed. This in turn inflates the `docdb_obsolete_keys_found` metric. When some of those deleted keys are already behind the history cutoff limit (15 minutes by default), the `docdb_obsolete_keys_found_past_cutoff` metric also grows, which can trigger automatic full compactions on the affected tablets and add system load.

This behaviour started with code change f68853b inside the `CreateProjections` function for CQL operations: https://github.com/yugabyte/yugabyte-db/blame/889f44fb8153b9535663542d5bf4b4824c9da983/src/yb/docdb/cql_operation.cc#L135. It affects the `QLWriteOperation::ReadColumns` logic as follows. After the change, `static_projection` always contains the hash key columns and is therefore non-empty, even when processing an `INSERT INTO ... RETURNS STATUS AS ROW` that does not modify static columns (and even when the table has no static columns at all). This was required by the updated read path, which now reads values only for the required columns rather than for all of them. As a side effect, `QLWriteOperation::ReadColumns` calls `QLWriteOperation::InitializeKeys` with `hashed_key = !static_projection->columns.empty() == true`, which makes `QLWriteOperation::InitializeKeys` set `hashed_doc_key_` to a non-empty value. Consequently, `QLWriteOperation::ReadColumns` searches DocDB for existing rows by the hashed column value alone, even when the primary key has range columns: https://github.com/yugabyte/yugabyte-db/blame/889f44fb8153b9535663542d5bf4b4824c9da983/src/yb/docdb/cql_operation.cc#L499.
Due to TTL, there can be many obsolete rows with the same hashed column value, and `DocRowwiseIterator` skips them during iteration while trying to find a live row. This drives rapid growth of both `docdb_obsolete_keys_found` and `docdb_obsolete_keys_found_past_cutoff` (when a large share of the obsolete keys is already behind the history cutoff).

Added a unit test that catches this bug and implemented a fix in the `QLWriteOperation::ReadColumns` function.

Jira: DB-12255

Test Plan: `ybd --cxx-test integration-tests_cql-test --gtest_filter CqlTest.InsertHashAndRangePkWithReturnsStatusAsRow -n 50 -- -p 1` for asan/tsan/debug/release.

Reviewers: sergei, rthallam, arybochkin

Reviewed By: sergei, arybochkin

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D36921