feat: support real-time retrieval of profiles from admin API (part 1) #15958

Merged
merged 4 commits into from
Jul 3, 2024

dqhl76
Collaborator

@dqhl76 dqhl76 commented Jul 3, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Support retrieving the PlanProfile from the admin API while the query is still running: /v1/queries/:query_id/profiling

This PR is part one. It only supports running queries. If a query finishes very quickly, its profile cannot be retrieved.

The next part will maintain a cache queue holding the profiles of the latest 50 queries (the size could be made configurable).
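
For anyone who wants to try the endpoint from a script, here is a minimal Python sketch. The helper names `profiling_url` and `fetch_running_profile` are made up for illustration and are not part of this PR. The PR does not state what the endpoint returns for a query that has already finished, so the sketch assumes an HTTP error status and maps any such error to `None`.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def profiling_url(base_url: str, query_id: str) -> str:
    """Build the admin API profiling URL for one query."""
    return f"{base_url.rstrip('/')}/v1/queries/{query_id}/profiling"


def fetch_running_profile(
    base_url: str, query_id: str, timeout: float = 5.0
) -> Optional[dict]:
    """Return the decoded profile JSON, or None on an HTTP error.

    Assumption: a finished (or unknown) query answers with an HTTP
    error status. The PR does not say which one, so every HTTPError
    is treated the same way here.
    """
    url = profiling_url(base_url, query_id)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError:
        return None
```

Since part one only covers running queries, a caller would likely need to issue the request while the query is still executing, for example from a second thread or process.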

Example: GET http://localhost:8080/v1/queries/46f89e47-9e27-4ca9-b87c-868d80678ec0/profiling

RESPONSE:

{
"query_id": "46f89e47-9e27-4ca9-b87c-868d80678ec0",
"profiles": [
{
  "id": 6,
  "name": "AggregatePartial",
  "parent_id": 4,
  "title": "max(number), sum(number)",
  "labels": [
    {
      "name": "Aggregate Functions",
      "value": [
        "max(number)",
        "sum(number)"
      ]
    },
    {
      "name": "Grouping keys",
      "value": [
        "numbers_mt.number (#0) % 3",
        "numbers_mt.number (#0) % 4",
        "numbers_mt.number (#0) % 5"
      ]
    }
  ],
  "statistics": [
    111359276793,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    23170576
  ],
  "errors": []
},
{
  "id": 7,
  "name": "EvalScalar",
  "parent_id": 6,
  "title": "numbers_mt.number (#0) % 3, numbers_mt.number (#0) % 4, numbers_mt.number (#0) % 5",
  "labels": [
    {
      "name": "List of Expressions",
      "value": [
        "numbers_mt.number (#0) % 3",
        "numbers_mt.number (#0) % 4",
        "numbers_mt.number (#0) % 5"
      ]
    }
  ],
  "statistics": [
    17133330215,
    0,
    0,
    0,
    68157440,
    749731840,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    1007472
  ],
  "errors": [
    
  ]
},
{
  "id": 1,
  "name": "Limit",
  "parent_id": null,
  "title": "LIMIT 5 OFFSET 0",
  "labels": [
    {
      "name": "Number of rows",
      "value": [
        "5"
      ]
    },
    {
      "name": "Offset",
      "value": [
        "0"
      ]
    }
  ],
  "statistics": [
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    32
  ],
  "errors": [
    
  ]
},
{
  "id": 4,
  "name": "AggregateFinal",
  "parent_id": 3,
  "title": "max(number), sum(number)",
  "labels": [
    {
      "name": "Aggregate Functions",
      "value": [
        "max(number)",
        "sum(number)"
      ]
    },
    {
      "name": "Grouping keys",
      "value": [
        "numbers_mt.number (#0) % 3",
        "numbers_mt.number (#0) % 4",
        "numbers_mt.number (#0) % 5"
      ]
    }
  ],
  "statistics": [
    63958,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    2492
  ],
  "errors": [
    
  ]
},
{
  "id": 3,
  "name": "Limit",
  "parent_id": 1,
  "title": "LIMIT 5 OFFSET 0",
  "labels": [
    {
      "name": "Number of rows",
      "value": [
        "5"
      ]
    },
    {
      "name": "Offset",
      "value": [
        "0"
      ]
    }
  ],
  "statistics": [
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    160
  ],
  "errors": [
    
  ]
},
{
  "id": 8,
  "name": "TableScan",
  "parent_id": 7,
  "title": "default.''.'numbers_mt'",
  "labels": [
    {
      "name": "Full table name",
      "value": [
        "default.''.'numbers_mt'"
      ]
    },
    {
      "name": "Total partitions",
      "value": [
        "15259"
      ]
    },
    {
      "name": "Columns (1 / 1)",
      "value": [
        "number"
      ]
    }
  ],
  "statistics": [
    2985702728,
    0,
    0,
    0,
    68157440,
    545259520,
    545259520,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    523968
  ],
  "errors": []
}

],
"statistics_desc": {
"CpuTime": {
"desc": "The time spent to process in nanoseconds",
"display_name": "cpu time",
"index": 0,
"unit": "NanoSeconds",
"plain_statistics": false
},
"WaitTime": {
"desc": "The time spent to wait in nanoseconds, usually used to measure the time spent on waiting for I/O",
"display_name": "wait time",
"index": 1,
"unit": "NanoSeconds",
"plain_statistics": false
},
"ExchangeRows": {
"desc": "The number of data rows exchange between nodes in cluster mode",
"display_name": "exchange rows",
"index": 2,
"unit": "Rows",
"plain_statistics": true
},
"ExchangeBytes": {
"desc": "The number of data bytes exchange between nodes in cluster mode",
"display_name": "exchange bytes",
"index": 3,
"unit": "Bytes",
"plain_statistics": true
},
"OutputRows": {
"desc": "The number of rows from the physical plan output to the next physical plan",
"display_name": "output rows",
"index": 4,
"unit": "Rows",
"plain_statistics": true
},
"OutputBytes": {
"desc": "The number of bytes from the physical plan output to the next physical plan",
"display_name": "output bytes",
"index": 5,
"unit": "Bytes",
"plain_statistics": true
},
"ScanBytes": {
"desc": "The bytes scanned of query",
"display_name": "bytes scanned",
"index": 6,
"unit": "Bytes",
"plain_statistics": true
},
"ScanCacheBytes": {
"desc": "The bytes scanned from cache of query",
"display_name": "bytes scanned from cache",
"index": 7,
"unit": "Bytes",
"plain_statistics": true
},
"ScanPartitions": {
"desc": "The partitions scanned of query",
"display_name": "partitions scanned",
"index": 8,
"unit": "Count",
"plain_statistics": true
},
"SpillWriteCount": {
"desc": "The number of spilled by write",
"display_name": "numbers spilled by write",
"index": 9,
"unit": "Count",
"plain_statistics": true
},
"SpillWriteBytes": {
"desc": "The bytes spilled by write",
"display_name": "bytes spilled by write",
"index": 10,
"unit": "Bytes",
"plain_statistics": true
},
"SpillWriteTime": {
"desc": "The time spent to write spill in millisecond",
"display_name": "spilled time by write",
"index": 11,
"unit": "MillisSeconds",
"plain_statistics": false
},
"SpillReadCount": {
"desc": "The number of spilled by read",
"display_name": "numbers spilled by read",
"index": 12,
"unit": "Count",
"plain_statistics": true
},
"SpillReadBytes": {
"desc": "The bytes spilled by read",
"display_name": "bytes spilled by read",
"index": 13,
"unit": "Bytes",
"plain_statistics": true
},
"SpillReadTime": {
"desc": "The time spent to read spill in millisecond",
"display_name": "spilled time by read",
"index": 14,
"unit": "MillisSeconds",
"plain_statistics": false
},
"RuntimeFilterPruneParts": {
"desc": "The partitions pruned by runtime filter",
"display_name": "parts pruned by runtime filter",
"index": 15,
"unit": "Count",
"plain_statistics": true
},
"MemoryUsage": {
"desc": "The real time memory usage",
"display_name": "memory usage",
"index": 16,
"unit": "Bytes",
"plain_statistics": false
}
}
}
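
Reading the response: each entry in a profile's `statistics` array appears to line up with the `index` field of an entry in `statistics_desc`, and `parent_id` links the plan nodes into a tree. The Python sketch below illustrates that reading using a trimmed copy of the response above. The helper names `name_statistics` and `children_by_parent` are hypothetical and not part of the API.

```python
from typing import Any, Dict, List, Optional


def name_statistics(
    profile: Dict[str, Any], statistics_desc: Dict[str, Dict[str, Any]]
) -> Dict[str, int]:
    """Label each positional statistic with its name from statistics_desc."""
    stats = profile["statistics"]
    return {
        name: stats[desc["index"]]
        for name, desc in statistics_desc.items()
        if desc["index"] < len(stats)  # skip descriptors with no value
    }


def children_by_parent(
    profiles: List[Dict[str, Any]]
) -> Dict[Optional[int], List[int]]:
    """Group plan node ids under their parent_id (None marks the root)."""
    children: Dict[Optional[int], List[int]] = {}
    for profile in profiles:
        children.setdefault(profile["parent_id"], []).append(profile["id"])
    return children


# Trimmed copy of the TableScan profile and a few descriptors from above.
table_scan = {
    "id": 8,
    "parent_id": 7,
    "statistics": [2985702728, 0, 0, 0, 68157440, 545259520, 545259520,
                   0, 0, 0, 0, 0, 0, 0, 0, 0, 523968],
}
statistics_desc = {
    "CpuTime": {"index": 0, "unit": "NanoSeconds"},
    "OutputRows": {"index": 4, "unit": "Rows"},
    "OutputBytes": {"index": 5, "unit": "Bytes"},
    "ScanBytes": {"index": 6, "unit": "Bytes"},
    "MemoryUsage": {"index": 16, "unit": "Bytes"},
}
named = name_statistics(table_scan, statistics_desc)

# The id / parent_id pairs of the six profiles, in response order.
nodes = [
    {"id": 6, "parent_id": 4},
    {"id": 7, "parent_id": 6},
    {"id": 1, "parent_id": None},
    {"id": 4, "parent_id": 3},
    {"id": 3, "parent_id": 1},
    {"id": 8, "parent_id": 7},
]
tree = children_by_parent(nodes)
```

With this sample, `named["OutputRows"]` is 68157440 and following `tree` from the root gives the chain 1, 3, 4, 6, 7, 8 (Limit down to TableScan).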

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Will introduce in next PR

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

Signed-off-by: Liuqing Yue <dqhl76@gmail.com>
@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Jul 3, 2024
Signed-off-by: Liuqing Yue <dqhl76@gmail.com>
@dqhl76 dqhl76 marked this pull request as ready for review July 3, 2024 08:41
dqhl76 added 2 commits July 3, 2024 17:19
Signed-off-by: Liuqing Yue <dqhl76@gmail.com>
Signed-off-by: Liuqing Yue <dqhl76@gmail.com>
@dqhl76 dqhl76 changed the title feat: support get profile from admin API(part 1) feat: support get profile from admin API (part 1) Jul 3, 2024
@dqhl76 dqhl76 changed the title feat: support get profile from admin API (part 1) feat: support real-time retrieval of profiles from admin API (part 1) Jul 3, 2024
@dqhl76 dqhl76 added this pull request to the merge queue Jul 3, 2024
Merged via the queue into databendlabs:main with commit 657b827 Jul 3, 2024
84 checks passed
@dqhl76 dqhl76 deleted the realtime-profile branch July 3, 2024 11:27
Labels
pr-feature this PR introduces a new feature to the codebase
2 participants