
MyRocks 8.0.28 has poor performance of primary key range query #1315

Open
yeqing12 opened this issue Jun 5, 2023 · 1 comment

Comments


yeqing12 commented Jun 5, 2023

The problem:
[Screenshot 2023-06-05 15:24:27: observed query performance]

Reproduction method:

$ sysbench --version
sysbench 1.0.17

Use sysbench to create one table and insert 1,000,000 rows:

sysbench /usr/share/sysbench/oltp_insert.lua --mysql-host=xx --mysql-port=xx --db-driver=mysql --mysql-user=xx --mysql-password=xx --tables=1 --table-size=1000000 --threads=8 --time=600 --report-interval=10 --mysql_storage_engine=rocksdb prepare

SQL:
select id from sbtest1 where id >= 300000 order by id asc limit 1;      // poor performance
select id, pad from sbtest1 where id >= 300000 order by id asc limit 1; // performs fine

InnoDB does not have this problem.

The relevant code is:

For RocksDB:

// for rocksdb, the cost of range scan in ha_rocksdb.cc
double ha_rocksdb::read_time(uint index, uint ranges, ha_rows rows) {
  DBUG_ENTER_FUNC();

  if (index != table->s->primary_key) {
    /* Non covering index range scan */
    DBUG_RETURN(handler::read_time(index, ranges, rows));
  }

  DBUG_RETURN((rows / 20.0) + 1);
}

For the SQL optimizer, in sql/range_optimizer/range_optimizer.cc:

// Calculate the cost of a range scan and compare it with the full index scan cost
int test_quick_select()
{
   ...
     /* Calculate cost of full index read for the shortest covering index */
  if (!table->covering_keys.is_clear_all() &&
      /*
        If optimizer_force_index_for_range is on and force index is used,
        then skip calculating index scan cost.
      */
      !(thd->variables.optimizer_force_index_for_range && table->force_index)) {
    int key_for_use = find_shortest_key(table, &table->covering_keys);
    // find_shortest_key() should return a valid key:
    assert(key_for_use != MAX_KEY);

    Cost_estimate key_read_time = param.table->file->index_scan_cost(
        key_for_use, 1, static_cast<double>(records));
    key_read_time.add_cpu(
        cost_model->row_evaluate_cost(static_cast<double>(records)));

    bool chosen = false;
    if (key_read_time < cost_est) {
      cost_est = key_read_time;
      chosen = true;
    }

    Opt_trace_object trace_cov(trace, "best_covering_index_scan",
                               Opt_trace_context::RANGE_OPTIMIZER);
    trace_cov.add_utf8("index", table->key_info[key_for_use].name)
        .add("cost", key_read_time)
        .add("chosen", chosen);
    if (!chosen) trace_cov.add_alnum("cause", "cost");
  }

  AccessPath *best_path = nullptr;
  double best_cost = cost_est.total_cost();
  ...
}

I have tried two different approaches to solve this problem, but I'm not sure whether they are suitable or whether they will introduce other problems. I hope the maintainers can come up with a solution or help evaluate these two approaches.

The first way: do not let the full covering index read eliminate the range scan.

--- a/sql/range_optimizer/range_optimizer.cc
+++ b/sql/range_optimizer/range_optimizer.cc
@@ -616,7 +616,6 @@ int test_quick_select(THD *thd, MEM_ROOT *return_mem_root,
   const bool index_merge_intersect_allowed =
       index_merge_allowed &&
       thd->optimizer_switch_flag(OPTIMIZER_SWITCH_INDEX_MERGE_INTERSECT);

   /* Calculate cost of full index read for the shortest covering index */
   if (!table->covering_keys.is_clear_all() &&
       /*
@@ -624,7 +623,7 @@ int test_quick_select(THD *thd, MEM_ROOT *return_mem_root,
         then skip calculating index scan cost.
       */
       !(thd->variables.optimizer_force_index_for_range && table->force_index)) {
-    int key_for_use = find_shortest_key(table, &table->covering_keys);
+/*    int key_for_use = find_shortest_key(table, &table->covering_keys);
     // find_shortest_key() should return a valid key:
     assert(key_for_use != MAX_KEY);

@@ -645,8 +644,8 @@ int test_quick_select(THD *thd, MEM_ROOT *return_mem_root,
         .add("cost", key_read_time)
         .add("chosen", chosen);
     if (!chosen) trace_cov.add_alnum("cause", "cost");
+*/
   }

The second way: lower the cost estimate of the range scan.

--- a/storage/rocksdb/ha_rocksdb.cc
+++ b/storage/rocksdb/ha_rocksdb.cc
@@ -18245,7 +18245,7 @@ double ha_rocksdb::read_time(uint index, uint ranges, ha_rows rows) {
     DBUG_RETURN(handler::read_time(index, ranges, rows));
   }

-  DBUG_RETURN((rows / 20.0) + 1);
+  DBUG_RETURN((rows / 100.0) + 1);
 }


The patches above are only minimal sketches of the two solutions; in my testing, both of them fix the problem.
[Screenshot 2023-06-05 16:04:51: query performance after applying either patch]

luqun (Contributor) commented Jun 7, 2023

Thanks for reporting. We will investigate the issue and give an update later.
