feat: Implemented a new sql explain analyze graphical #16543

Merged: 11 commits merged into databendlabs:main on Oct 20, 2024

Contributor

@Maricaya Maricaya commented Sep 29, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary:

This PR introduces a new EXPLAIN ANALYZE GRAPHICAL command, which returns SQL performance analysis in a form suited to graphical visualization. It helps users understand query execution and optimize performance by exposing query statistics, table scans, and partitioning details.

Example:

You can test this feature with the following curl request:

curl -X POST http://localhost:8000/v1/query \
    -H "Content-Type: application/json" \
    -H "Authorization: Basic cm9vdDo=" \
    -d '{
        "sql": "explain analyze graphical select * from price",
        "pagination": {
            "page": 1,
            "page_size": 10
        }
    }'
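For readers who prefer to script the call, here is a minimal Python sketch of the same request. It assumes a local Databend query node at the URL used in the curl example and a `root` user with an empty password (the `Authorization` value `cm9vdDo=` is the Base64 encoding of `root:`). The helper names `build_request` and `run_query` are hypothetical and are not part of this PR.

```python
import base64
import json
import urllib.request

# Assumption: same local endpoint as in the curl example.
QUERY_URL = "http://localhost:8000/v1/query"


def build_request(sql: str, user: str = "root", password: str = "") -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    # HTTP Basic auth: Base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    body = json.dumps({"sql": sql, "pagination": {"page": 1, "page_size": 10}})
    return urllib.request.Request(
        QUERY_URL,
        data=body.encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def run_query(sql: str) -> dict:
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(build_request(sql)) as resp:
        return json.load(resp)
```

Calling `run_query("explain analyze graphical select * from price")` against a running server should return the parsed response; the exact envelope around the profiling JSON is not shown in this PR description, so inspect the result before relying on a particular shape.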

Expected output: JSON intended for graphical rendering, containing detailed statistics about query execution, table scans, partitions, and other metrics, such as:

{
  "query_id": "1db1833b-a17d-4b98-a88f-07d95b056a65",
  "profiles": [
    {
      "id": 0,
      "name": "TableScan",
      "title": "default.'default'.'price'",
      "labels": [
        {
          "name": "Full table name",
          "value": ["default.'default'.'price'"]
        },
        {
          "name": "Total partitions",
          "value": ["1"]
        },
        {
          "name": "Columns (5 / 5)",
          "value": ["currency", "id", "price", "price_date", "product_name"]
        }
      ],
      "statistics": [21974328, 0, 0, 0, 5, 484, 484, 0, 0, 0, 0],
      "metrics": {
        "iOgkqzvCtFhHcifLXPXS52": [
          {
            "name": "opendal_requests_total",
            "value": {"Counter": 1.0}
          },
          {
            "name": "opendal_request_duration_seconds_sum",
            "value": {"Untyped": 0.000264774}
          }
        ]
      }
    }
//...
  ],
  "statistics_desc": {
    "CpuTime": {
      "desc": "The time spent to process in nanoseconds",
      "display_name": "cpu time",
      "index": 0,
      "plain_statistics": false,
      "unit": "NanoSeconds"
    },
    "ExchangeBytes": {
      "desc": "The number of data bytes exchange between nodes in cluster mode",
      "display_name": "exchange bytes",
      "index": 3,
      "plain_statistics": true,
      "unit": "Bytes"
    },
    "ExchangeRows": {
      "desc": "The number of data rows exchange between nodes in cluster mode",
      "display_name": "exchange rows",
      "index": 2,
      "plain_statistics": true,
      "unit": "Rows"
    },
    "MemoryUsage": {
      "desc": "The real time memory usage",
      "display_name": "memory usage",
      "index": 16,
      "plain_statistics": false,
      "unit": "Bytes"
    },
    "OutputBytes": {
      "desc": "The number of bytes from the physical plan output to the next physical plan",
      "display_name": "output bytes",
      "index": 5,
      "plain_statistics": true,
      "unit": "Bytes"
    },
    "OutputRows": {
      "desc": "The number of rows from the physical plan output to the next physical plan",
      "display_name": "output rows",
      "index": 4,
      "plain_statistics": true,
      "unit": "Rows"
    },
    "RuntimeFilterPruneParts": {
      "desc": "The partitions pruned by runtime filter",
      "display_name": "parts pruned by runtime filter",
      "index": 15,
      "plain_statistics": true,
      "unit": "Count"
    },
    "ScanBytes": {
      "desc": "The bytes scanned of query",
      "display_name": "bytes scanned",
      "index": 6,
      "plain_statistics": true,
      "unit": "Bytes"
    },
    "ScanCacheBytes": {
      "desc": "The bytes scanned from cache of query",
      "display_name": "bytes scanned from cache",
      "index": 7,
      "plain_statistics": true,
      "unit": "Bytes"
    },
    "ScanPartitions": {
      "desc": "The partitions scanned of query",
      "display_name": "partitions scanned",
      "index": 8,
      "plain_statistics": true,
      "unit": "Count"
    },
    "SpillReadBytes": {
      "desc": "The bytes spilled by read",
      "display_name": "bytes spilled by read",
      "index": 13,
      "plain_statistics": true,
      "unit": "Bytes"
    },
    "SpillReadCount": {
      "desc": "The number of spilled by read",
      "display_name": "numbers spilled by read",
      "index": 12,
      "plain_statistics": true,
      "unit": "Count"
    },
    "SpillReadTime": {
      "desc": "The time spent to read spill in millisecond",
      "display_name": "spilled time by read",
      "index": 14,
      "plain_statistics": false,
      "unit": "MillisSeconds"
    },
    "SpillWriteBytes": {
      "desc": "The bytes spilled by write",
      "display_name": "bytes spilled by write",
      "index": 10,
      "plain_statistics": true,
      "unit": "Bytes"
    },
    "SpillWriteCount": {
      "desc": "The number of spilled by write",
      "display_name": "numbers spilled by write",
      "index": 9,
      "plain_statistics": true,
      "unit": "Count"
    },
    "SpillWriteTime": {
      "desc": "The time spent to write spill in millisecond",
      "display_name": "spilled time by write",
      "index": 11,
      "plain_statistics": false,
      "unit": "MillisSeconds"
    },
    "WaitTime": {
      "desc": "The time spent to wait in nanoseconds, usually used to measure the time spent on waiting for I/O",
      "display_name": "wait time",
      "index": 1,
      "plain_statistics": false,
      "unit": "NanoSeconds"
    }
  }
}
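The per-node `statistics` array is positional, and the `index` field of each `statistics_desc` entry appears to give the slot for that statistic. Under that assumption, a small Python sketch can attach names to the raw numbers. The helper `name_statistics` is hypothetical; the sample values are copied from the output above, with only a few descriptors included. Note that the sample array has 11 entries while descriptor indices go up to 16, so the sketch maps out-of-range indices to `None` rather than failing.

```python
from typing import Dict, List, Optional


def name_statistics(
    statistics: List[int], statistics_desc: Dict[str, dict]
) -> Dict[str, Optional[int]]:
    """Map each statistic name to its value using the descriptor's "index".

    Names whose index lies beyond the end of the array map to None.
    """
    named: Dict[str, Optional[int]] = {}
    # Iterate in index order so the result follows the array layout.
    for name, desc in sorted(statistics_desc.items(), key=lambda kv: kv[1]["index"]):
        idx = desc["index"]
        named[name] = statistics[idx] if idx < len(statistics) else None
    return named


# Values copied from the example output (subset of descriptors).
sample_statistics = [21974328, 0, 0, 0, 5, 484, 484, 0, 0, 0, 0]
sample_desc = {
    "CpuTime": {"index": 0, "unit": "NanoSeconds"},
    "OutputRows": {"index": 4, "unit": "Rows"},
    "OutputBytes": {"index": 5, "unit": "Bytes"},
    "ScanBytes": {"index": 6, "unit": "Bytes"},
    "MemoryUsage": {"index": 16, "unit": "Bytes"},
}

named = name_statistics(sample_statistics, sample_desc)
```

With the sample above, the TableScan node reads as roughly 22 ms of CPU time (21974328 ns), 5 output rows, and 484 bytes both output and scanned.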

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


@github-actions github-actions bot added the pr-feature label (this PR introduces a new feature to the codebase) Sep 29, 2024
@Maricaya Maricaya marked this pull request as ready for review September 29, 2024 06:30
@Maricaya
Contributor Author

@zhang2014 The contributor is from the Open Source Promotion Plan 2024.
I'm so sorry to have to close the previous PR due to my missteps. I'd be really grateful if you could take a look at this one instead.

@dqhl76
Collaborator

dqhl76 commented Sep 29, 2024

Hi @Maricaya, it seems the test failed because we need to run cargo +nightly fmt to ensure the code is well-formatted and clean. When you have a moment, you might want to run make lint to apply all the linting checks. Thank you!

@Maricaya Maricaya requested a review from zhang2014 September 29, 2024 16:10
@Maricaya Maricaya force-pushed the graphical branch 5 times, most recently from dc6a71f to 6296db8 October 15, 2024 17:51
@sundy-li sundy-li added this pull request to the merge queue Oct 20, 2024
Merged via the queue into databendlabs:main with commit b1538fa Oct 20, 2024
72 checks passed
4 participants