
MyRocks 8.0.28 has poor performance of primary key range query #1315

Open
yeqing12 opened this issue Jun 5, 2023 · 1 comment

Comments


yeqing12 commented Jun 5, 2023

The problem:
[Screenshot 2023-06-05 15:24:27: observed query performance]

Reproduction method:

$ sysbench --version
sysbench 1.0.17

Use sysbench to create one table and insert 1,000,000 rows:

sysbench /usr/share/sysbench/oltp_insert.lua --mysql-host=xx --mysql-port=xx --db-driver=mysql --mysql-user=xx --mysql-password=xx --tables=1 --table-size=1000000 --threads=8 --time=600 --report-interval=10 --mysql_storage_engine=rocksdb prepare

SQL:
select id from sbtest1 where id >= 300000 order by id asc limit 1;      // poor performance
select id, pad from sbtest1 where id >= 300000 order by id asc limit 1; // performs fine

InnoDB does not have this problem.

The relevant code is:

For RocksDB:

// for rocksdb, the cost of range scan in ha_rocksdb.cc
double ha_rocksdb::read_time(uint index, uint ranges, ha_rows rows) {
  DBUG_ENTER_FUNC();

  if (index != table->s->primary_key) {
    /* Non covering index range scan */
    DBUG_RETURN(handler::read_time(index, ranges, rows));
  }

  DBUG_RETURN((rows / 20.0) + 1);
}

For the SQL optimizer, in sql/range_optimizer/range_optimizer.cc:

// Calculate the cost of a range scan and compare it with the full index scan cost
int test_quick_select()
{
   ...
     /* Calculate cost of full index read for the shortest covering index */
  if (!table->covering_keys.is_clear_all() &&
      /*
        If optimizer_force_index_for_range is on and force index is used,
        then skip calculating index scan cost.
      */
      !(thd->variables.optimizer_force_index_for_range && table->force_index)) {
    int key_for_use = find_shortest_key(table, &table->covering_keys);
    // find_shortest_key() should return a valid key:
    assert(key_for_use != MAX_KEY);

    Cost_estimate key_read_time = param.table->file->index_scan_cost(
        key_for_use, 1, static_cast<double>(records));
    key_read_time.add_cpu(
        cost_model->row_evaluate_cost(static_cast<double>(records)));

    bool chosen = false;
    if (key_read_time < cost_est) {
      cost_est = key_read_time;
      chosen = true;
    }

    Opt_trace_object trace_cov(trace, "best_covering_index_scan",
                               Opt_trace_context::RANGE_OPTIMIZER);
    trace_cov.add_utf8("index", table->key_info[key_for_use].name)
        .add("cost", key_read_time)
        .add("chosen", chosen);
    if (!chosen) trace_cov.add_alnum("cause", "cost");
  }

  AccessPath *best_path = nullptr;
  double best_cost = cost_est.total_cost();
  ...
}

I have tried two different approaches to solve this problem, but I'm not sure whether they are suitable or whether they will introduce other problems. I hope the maintainers can come up with a solution or help evaluate these two approaches.

The first way: do not let the full covering index read eliminate the range scan.

--- a/sql/range_optimizer/range_optimizer.cc
+++ b/sql/range_optimizer/range_optimizer.cc
@@ -616,7 +616,6 @@ int test_quick_select(THD *thd, MEM_ROOT *return_mem_root,
   const bool index_merge_intersect_allowed =
       index_merge_allowed &&
       thd->optimizer_switch_flag(OPTIMIZER_SWITCH_INDEX_MERGE_INTERSECT);

   /* Calculate cost of full index read for the shortest covering index */
   if (!table->covering_keys.is_clear_all() &&
       /*
@@ -624,7 +623,7 @@ int test_quick_select(THD *thd, MEM_ROOT *return_mem_root,
         then skip calculating index scan cost.
       */
       !(thd->variables.optimizer_force_index_for_range && table->force_index)) {
-    int key_for_use = find_shortest_key(table, &table->covering_keys);
+/*    int key_for_use = find_shortest_key(table, &table->covering_keys);
     // find_shortest_key() should return a valid key:
     assert(key_for_use != MAX_KEY);

@@ -645,8 +644,8 @@ int test_quick_select(THD *thd, MEM_ROOT *return_mem_root,
         .add("cost", key_read_time)
         .add("chosen", chosen);
     if (!chosen) trace_cov.add_alnum("cause", "cost");
+*/
   }

The second way: lower the cost estimate of the range scan.

--- a/storage/rocksdb/ha_rocksdb.cc
+++ b/storage/rocksdb/ha_rocksdb.cc
@@ -18245,7 +18245,7 @@ double ha_rocksdb::read_time(uint index, uint ranges, ha_rows rows) {
     DBUG_RETURN(handler::read_time(index, ranges, rows));
   }

-  DBUG_RETURN((rows / 20.0) + 1);
+  DBUG_RETURN((rows / 100.0) + 1);
 }


The patches above are only minimal sketches of the two solutions; in my testing, both of them fix the problem.
[Screenshot 2023-06-05 16:04:51: query performance after applying either patch]

luqun (Contributor) commented Jun 7, 2023

Thanks for reporting. We will investigate the issue and give an update later.
