Background

Where does the show accounts cmd come from?
1. internal
  
  the CN MetricStorageUsage cron task gathers any new accounts every minute and all accounts every 15 minutes, and then writes them in metrics. but this cron task only exists in exactly one CN.
2. users
  
  users can launch show accounts from the SQL client at any time.
Who has the access right?
1. system account:
  
  at every minute, the sys account executes show accounts like xxx to collect storage usage of newly added accounts.
  
  at every 15 minutes, the sys account executes show accounts to collect storage usage of all accounts
2. normal account:
  
  any normal accounts can launch show accounts SQL, the effect equals to the sys account executes show accounts like xxx.
Storage Usage Definition

we only consider these data that are ready to flush to remote or already persistent in remote.

The Current Implementation

using show accounts to get all storage usage info for all accounts.

mysql> show accounts;
+--------------+------------+---------------------+--------+----------------+----------+-------------+-----------+----------+----------------+
| account_name | admin_name | created             | status | suspended_time | db_count | table_count | row_count | size     | comment        |
+--------------+------------+---------------------+--------+----------------+----------+-------------+-----------+----------+----------------+
| tenant_test  | admin_1    | 2023-10-06 02:43:34 | open   | NULL           |        5 |          45 |      1157 |    0.066 |                |
| tenant_2     | admin_2    | 2023-10-06 02:44:21 | open   | NULL           |        5 |          45 |      1157 |    0.066 |                |
| tenant_3     | admin_3    | 2023-10-06 02:44:33 | open   | NULL           |        5 |          45 |      1157 |    0.066 |                |
| sys          | root       | 2023-09-28 09:44:41 | open   | NULL           |        7 |          82 |  32351824 | 3555.283 | system account |
+--------------+------------+---------------------+--------+----------------+----------+-------------+-----------+----------+----------------+
4 rows in set (0.03 sec)

How does show accounts work?

after the frontend receives show accounts, it starts a transaction.

the first step is to get related account info:

SELECT
	account_id AS `account_id`,
 	account_name AS `account_name`,
 	created_time AS `created`,
 	status AS `status`,
 	suspended_time AS `suspended_time`,
 	comments AS `comment`
FROM
 	mo_catalog.mo_account %s;

for every account, it uses the built-in functions mo_table_rows and mo_table_size to get account storage usage info, and then calculates the results with in-memory writes to get the final result.

SELECT
    mu2.user_name AS `admin_name`,
    COUNT(DISTINCT mt.reldatabase) AS `db_count`,
    COUNT(DISTINCT mt.relname) AS `table_count`,
    SUM(mo_table_rows(mt.reldatabase, mt.relname)) AS `row_count`,
    CAST(SUM(mo_table_size(mt.reldatabase, mt.relname)) / 1048576 AS DECIMAL(29, 3)) AS `size`
FROM
    mo_catalog.mo_tables AS mt
JOIN
    (
        SELECT
            MIN(user_id) AS `min_user_id`
        FROM
            mo_catalog.mo_user
    ) AS mu1 ON mu2.user_id = mu1.min_user_id
JOIN
    mo_catalog.mo_user AS mu2 ON 
WHERE
    mt.relkind != '%s' AND mt.account_id = %d;

How much does the show accounts cost?

the cost depends on the number of accounts and how many tables an account has:

the join operation could be time-consuming
massive subscriptions could degrade CN performance

mo_table_size
mo_table_rows
 |
 |
 |-- tb_1  --
 |-- tb_2     \
 |-- tb_3       ====> UpdateBlockInfos ===> SubscribeTable
 |            /
 |-- tb_n  --

The Optimization Plan

in summary, using the checkpoint process, we can save CN's performance.

Checkpoint Overview

catalog
|
---------------------------------------------------------
        |               |               |       ...
     db_entry        db_entry        db_entry
                       |
                       |
                      ----------------------------------------
                                |        |         |     ...
                            tb_entry  tb_entry  tb_entr
                                |
                                |
                               --------------------------------
                                        |         |        ...
                                    seg_entry  seg_entry
                                        |
                                        |
                 ------------------------------------------------------------------------------
                        |       |         |       |        |      |       |         |      |
                       blk     blk       blk     blk      blk    blk     blk       blk    blk
                   |---------------|    |-----------|    |--------------------|
                   s1    ckp1     e2    s2   ckp2   e2   s3      ckp3        e3
           |----------------------------------------|
           -∞             global ckp                e2

Whenever a block is created or deleted, its metadata is recorded in the catalog. The checkpoint runner periodically collects this metadata and logs it in the checkpoint.

Implementation Detail

so we also could record the table size and row counts changing increasingly, we only need to add an extra batch to ckp, like:

   type AccountInfoBatch struct {
       // { accout_id, database_id, table_id, table_total_flushed_size }
       Attrs   []string 
       Vecs    []Vector
       //...
   }

when the user wants to know its storage usage, it requests CN --> TN, and TN returns a global ckp and a list of incremental checkpoints. the CN needs to decode these batches stored in checkpoints and combine them with the blocks that are ready to flush. the codes like:

for ckp in [global_ckp, incremental_ckps] {
    for bat in ckp {
        size[bat.acc_id] += bat.size
    }
}
    
for blk in blk_not_checkpoint_yet {
    size[data.acc_id] += data.size
}

Interface Compaitable:

result interface

// res batch
type Batch struct {
    // account_name     
    // admin_name	
    // created          
    // status           
    // suspendedTime    
    // comment
    // db_count	
    // table_account	// join mo_user, mo_tables, mo_account

    // size (MB)	// comes from ckp
    Attrs []string

    Vecs  []*vector.Vector
}

Handle Pipeline

we can use the log tail pull pipeline
```
OpCode_OpGetLogTail
```
we can use the debug pipeline (dedicated for mo_ctl)
```
TxnMethod_DEBUG
```

type SyncLogTailResp struct {
   CkpLocation []Location
   Commands    []BlockEntry
}

Others Considerations 1. timeliness

   by default, will schedule increment ckp each 5s and global ckp each 100 increment ckps have done.

   and ckp is also constrained by the minimal updates count restriction.

   but we can scan the catalog to collect block metadata that has not been checkpoint.

   we are trying to update storage usage info every 5 minutes.

3. cost
    
    * time: 1) decode serval ckps; 2) join 3 system tables.
    * space: need extra 8B + 8B + 8B + 8B = 32B for each blk update record in ckp

4. accurate
   
we only consider blocks that are already flushed to the remote or ready to flush.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

storage_usage.md

storage_usage.md

Files

storage_usage.md

Latest commit

History

storage_usage.md

File metadata and controls