Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Annotations: Change indices and rewrites annotation find query to improve database query performance #21915

Merged
merged 8 commits into from
Feb 5, 2020

Conversation

papagian
Copy link
Contributor

@papagian papagian commented Feb 4, 2020

What this PR does / why we need it:
Drop indices and create new ones and rewrites annotation find query
to address performance issues when querying annotation table and
there is a large amount of rows.

The query is used in the following cases:

Which issue(s) this PR fixes:
Fixes #21902

Special notes for your reviewer:

Co-Authored-By: Marcus Efraimsson <marcus.efraimsson@gmail.com>
@papagian papagian changed the title Modify indices on the annotation table SQLStore: Modify indices on the annotation table Feb 4, 2020
@papagian
Copy link
Contributor Author

papagian commented Feb 4, 2020

The generated query for the above two cases is:

SELECT \
    annotation.id, \
    annotation.epoch as time, \
    annotation.epoch_end as time_end, \
    annotation.dashboard_id, \
    annotation.panel_id, \
    annotation.new_state, \
    annotation.prev_state, \
    annotation.alert_id, \
    annotation.text, \
    annotation.tags, \
    annotation.data, \
    annotation.created, \
    annotation.updated, \
    usr.email, \
    usr.login, \
    alert.name as alert_name \
FROM annotation \
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id \
LEFT OUTER JOIN alert on alert.id = annotation.alert_id \
WHERE annotation.org_id = <org_id> \
AND annotation.dashboard_id = <dashboard_id> \
AND annotation.panel_id = <panel_id> \
AND annotation.alert_id > 0 \
ORDER BY epoch DESC LIMIT <limit>;

@papagian
Copy link
Contributor Author

papagian commented Feb 4, 2020

The generated queries for this one can be:

SELECT \
    annotation.id, \
    annotation.epoch as time, \
    annotation.epoch_end as time_end, \
    annotation.dashboard_id, \
    annotation.panel_id, \
    annotation.new_state, \
    annotation.prev_state, \
    annotation.alert_id, \
    annotation.text, \
    annotation.tags, \
    annotation.data, \
    annotation.created, \
    annotation.updated, \
    usr.email, \
    usr.login, \
    alert.name as alert_name \
FROM annotation \
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id \
LEFT OUTER JOIN alert on alert.id = annotation.alert_id \
WHERE annotation.org_id = 1 \
AND annotation.epoch <= <timestamp> AND annotation.epoch_end >= <timestamp> \
AND annotation.dashboard_id = <dashboard_id> \
ORDER BY epoch DESC LIMIT <limit>;

or

SELECT \
    annotation.id, \
    annotation.epoch as time, \
    annotation.epoch_end as time_end, \
    annotation.dashboard_id, \
    annotation.panel_id, \
    annotation.new_state, \
    annotation.prev_state, \
    annotation.alert_id, \
    annotation.text, \
    annotation.tags, \
    annotation.data, \
    annotation.created, \
    annotation.updated, \
    usr.email, \
    usr.login, \
    alert.name as alert_name \
FROM annotation \
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id \
LEFT OUTER JOIN alert on alert.id = annotation.alert_id \
WHERE annotation.org_id = 1 \
AND annotation.epoch <= <timestamp> AND annotation.epoch_end >= <timestamp> \
AND (
    SELECT SUM(1) FROM annotation_tag at
        INNER JOIN tag on tag.id = at.tag_id
        WHERE at.annotation_id = annotation.id
        AND (
            tag.key = "<tag_key>" OR tag.value = "<tag_value>"
        )
) > 0 ORDER BY epoch DESC LIMIT <limit>;

or

SELECT \
    annotation.id, \
    annotation.epoch as time, \
    annotation.epoch_end as time_end, \
    annotation.dashboard_id, \
    annotation.panel_id, \
    annotation.new_state, \
    annotation.prev_state, \
    annotation.alert_id, \
    annotation.text, \
    annotation.tags, \
    annotation.data, \
    annotation.created, \
    annotation.updated, \
    usr.email, \
    usr.login, \
    alert.name as alert_name \
FROM annotation \
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id \
LEFT OUTER JOIN alert on alert.id = annotation.alert_id \
WHERE annotation.org_id = 1 \
AND annotation.epoch <= <timestamp> AND annotation.epoch_end >= <timestamp> \
AND (
    SELECT SUM(1) FROM annotation_tag at
        INNER JOIN tag on tag.id = at.tag_id
        WHERE at.annotation_id = annotation.id
        AND (
            tag.key = "<tag_key>" OR tag.value = "<tag_value>"
        )
) = <tag_length> ORDER BY epoch DESC LIMIT <limit>;

@papagian papagian closed this Feb 4, 2020
@papagian papagian reopened this Feb 4, 2020
@papagian
Copy link
Contributor Author

papagian commented Feb 4, 2020

The generated queries for this one can be:

SELECT \
    annotation.id, \
    annotation.epoch as time, \
    annotation.epoch_end as time_end, \
    annotation.dashboard_id, \
    annotation.panel_id, \
    annotation.new_state, \
    annotation.prev_state, \
    annotation.alert_id, \
    annotation.text, \
    annotation.tags, \
    annotation.data, \
    annotation.created, \
    annotation.updated, \
    usr.email, \
    usr.login, \
    alert.name as alert_name \
FROM annotation \
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id \
LEFT OUTER JOIN alert on alert.id = annotation.alert_id \
WHERE annotation.org_id = 1 \
AND annotation.epoch <= <timestamp> AND annotation.epoch_end >= <timestamp> \
AND annotation.alert_id > 0 \
ORDER BY epoch DESC LIMIT <limit>;

or

SELECT \
    annotation.id, \
    annotation.epoch as time, \
    annotation.epoch_end as time_end, \
    annotation.dashboard_id, \
    annotation.panel_id, \
    annotation.new_state, \
    annotation.prev_state, \
    annotation.alert_id, \
    annotation.text, \
    annotation.tags, \
    annotation.data, \
    annotation.created, \
    annotation.updated, \
    usr.email, \
    usr.login, \
    alert.name as alert_name \
FROM annotation \
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id \
LEFT OUTER JOIN alert on alert.id = annotation.alert_id \
WHERE annotation.org_id = 1 \
AND annotation.epoch <= <timestamp> AND annotation.epoch_end >= <timestamp> \
AND annotation.alert_id = 0 \
ORDER BY epoch DESC LIMIT <limit>;

or

SELECT \
    annotation.id, \
    annotation.epoch as time, \
    annotation.epoch_end as time_end, \
    annotation.dashboard_id, \
    annotation.panel_id, \
    annotation.new_state, \
    annotation.prev_state, \
    annotation.alert_id, \
    annotation.text, \
    annotation.tags, \
    annotation.data, \
    annotation.created, \
    annotation.updated, \
    usr.email, \
    usr.login, \
    alert.name as alert_name \
FROM annotation \
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id \
LEFT OUTER JOIN alert on alert.id = annotation.alert_id \
WHERE annotation.org_id = 1 \
AND annotation.epoch <= <timestamp> AND annotation.epoch_end >= <timestamp> \
AND annotation.alert_id > 0 \
AND annotation.dashboard_id = <dashboard_id> \
ORDER BY epoch DESC LIMIT <limit>;

or

SELECT \
    annotation.id, \
    annotation.epoch as time, \
    annotation.epoch_end as time_end, \
    annotation.dashboard_id, \
    annotation.panel_id, \
    annotation.new_state, \
    annotation.prev_state, \
    annotation.alert_id, \
    annotation.text, \
    annotation.tags, \
    annotation.data, \
    annotation.created, \
    annotation.updated, \
    usr.email, \
    usr.login, \
    alert.name as alert_name \
FROM annotation \
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id \
LEFT OUTER JOIN alert on alert.id = annotation.alert_id \
WHERE annotation.org_id = 1 \
AND annotation.epoch <= <timestamp> AND annotation.epoch_end >= <timestamp> \
AND annotation.alert_id = 0 \
AND annotation.dashboard_id = <dashboard_id> \
ORDER BY epoch DESC LIMIT <limit>;

@marefr
Copy link
Contributor

marefr commented Feb 4, 2020

The generated query (using updated query) for the above two cases is:

SELECT
  annotation.id,
  annotation.epoch as time,
  annotation.epoch_end as time_end,
  annotation.dashboard_id,
  annotation.panel_id,
  annotation.new_state,
  annotation.prev_state,
  annotation.alert_id,
  annotation.text,
  annotation.tags,
  annotation.data,
  annotation.created,
  annotation.updated,
  usr.email,
  usr.login,
  alert.name as alert_name
FROM annotation
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id
LEFT OUTER JOIN alert on alert.id = annotation.alert_id
INNER JOIN (
  SELECT a.id from annotation a
    WHERE a.org_id = <org_id>
    AND a.dashboard_id = <dashboard_id>
    AND a.panel_id = <panel_id>
    AND a.alert_id > 0
    ORDER BY a.org_id, a.epoch_end DESC, a.epoch DESC LIMIT <limit>
) dt 
ON dt.id = annotation.id

The generated queries (using updated query) for this one can be:

SELECT
  annotation.id,
  annotation.epoch as time,
  annotation.epoch_end as time_end,
  annotation.dashboard_id,
  annotation.panel_id,
  annotation.new_state,
  annotation.prev_state,
  annotation.alert_id,
  annotation.text,
  annotation.tags,
  annotation.data,
  annotation.created,
  annotation.updated,
  usr.email,
  usr.login,
  alert.name as alert_name
FROM annotation
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id
LEFT OUTER JOIN alert on alert.id = annotation.alert_id
INNER JOIN (
  SELECT a.id from annotation a
    WHERE a.org_id = <org_id>
    AND a.epoch <= <timestamp> AND a.epoch_end >= <timestamp>
    AND a.dashboard_id = <dashboard_id>
  ORDER BY a.org_id, a.epoch_end DESC, a.epoch DESC LIMIT <limit>
) dt 
ON dt.id = annotation.id

or

SELECT
    annotation.id,
    annotation.epoch as time,
    annotation.epoch_end as time_end,
    annotation.dashboard_id,
    annotation.panel_id,
    annotation.new_state,
    annotation.prev_state,
    annotation.alert_id,
    annotation.text,
    annotation.tags,
    annotation.data,
    annotation.created,
    annotation.updated,
    usr.email,
    usr.login,
    alert.name as alert_name
FROM annotation
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id
LEFT OUTER JOIN alert on alert.id = annotation.alert_id
INNER JOIN (
  SELECT a.id from annotation a
  WHERE a.org_id = <org_id>
  AND a.epoch <= <timestamp> AND a.epoch_end >= <timestamp>
  AND (
    SELECT SUM(1) FROM annotation_tag at
      INNER JOIN tag on tag.id = at.tag_id
    WHERE at.annotation_id = a.id
    AND (
      tag.key = "<tag_key>" OR tag.value = "<tag_value>"
    )
  ) > 0 
  ORDER BY a.org_id, a.epoch_end DESC, a.epoch DESC LIMIT <limit>
) dt 
ON dt.id = annotation.id

or

SELECT
  annotation.id,
  annotation.epoch as time,
  annotation.epoch_end as time_end,
  annotation.dashboard_id,
  annotation.panel_id,
  annotation.new_state,
  annotation.prev_state,
  annotation.alert_id,
  annotation.text,
  annotation.tags,
  annotation.data,
  annotation.created,
  annotation.updated,
  usr.email,
  usr.login,
  alert.name as alert_name
FROM annotation
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id
LEFT OUTER JOIN alert on alert.id = annotation.alert_id
INNER JOIN (
  SELECT a.id from annotation a
  WHERE a.org_id = <org_id>
  AND a.epoch <= <timestamp> AND a.epoch_end >= <timestamp>
  AND (
    SELECT SUM(1) FROM annotation_tag at
      INNER JOIN tag on tag.id = at.tag_id
    WHERE at.annotation_id = annotation.id
      AND (
        tag.key = "<tag_key>" OR tag.value = "<tag_value>"
      )
  ) = <tag_length> 
  ORDER BY a.org_id, a.epoch_end DESC, a.epoch DESC LIMIT <limit>
) dt 
ON dt.id = annotation.id

The generated queries (using update query) for this one can be:

SELECT
  annotation.id,
  annotation.epoch as time,
  annotation.epoch_end as time_end,
  annotation.dashboard_id,
  annotation.panel_id,
  annotation.new_state,
  annotation.prev_state,
  annotation.alert_id,
  annotation.text,
  annotation.tags,
  annotation.data,
  annotation.created,
  annotation.updated,
  usr.email,
  usr.login,
  alert.name as alert_name
FROM annotation
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id
LEFT OUTER JOIN alert on alert.id = annotation.alert_id
INNER JOIN (
  SELECT a.id from annotation a
  WHERE a.org_id = <org_id>
  AND a.epoch <= <timestamp> AND a.epoch_end >= <timestamp>
  AND a.alert_id > 0
  ORDER BY a.org_id, a.epoch_end DESC, a.epoch DESC LIMIT <limit>
) dt 
ON dt.id = annotation.id

or

SELECT
  annotation.id,
  annotation.epoch as time,
  annotation.epoch_end as time_end,
  annotation.dashboard_id,
  annotation.panel_id,
  annotation.new_state,
  annotation.prev_state,
  annotation.alert_id,
  annotation.text,
  annotation.tags,
  annotation.data,
  annotation.created,
  annotation.updated,
  usr.email,
  usr.login,
  alert.name as alert_name
FROM annotation
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id
LEFT OUTER JOIN alert on alert.id = annotation.alert_id
INNER JOIN (
  SELECT a.id from annotation a
  WHERE a.org_id = <org_id>
  AND a.epoch <= <timestamp> AND a.epoch_end >= <timestamp>
  AND a.alert_id = 0
  ORDER BY a.org_id, a.epoch_end DESC, a.epoch DESC LIMIT <limit>
) dt 
ON dt.id = annotation.id

or

SELECT
  annotation.id,
  annotation.epoch as time,
  annotation.epoch_end as time_end,
  annotation.dashboard_id,
  annotation.panel_id,
  annotation.new_state,
  annotation.prev_state,
  annotation.alert_id,
  annotation.text,
  annotation.tags,
  annotation.data,
  annotation.created,
  annotation.updated,
  usr.email,
  usr.login,
  alert.name as alert_name
FROM annotation
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id
LEFT OUTER JOIN alert on alert.id = annotation.alert_id
INNER JOIN (
  SELECT a.id from annotation a
  WHERE a.org_id = <org_id>
  AND a.epoch <= <timestamp> AND a.epoch_end >= <timestamp>
  AND a.alert_id > 0
  AND a.dashboard_id = <dashboard_id>
  ORDER BY a.org_id, a.epoch_end DESC, a.epoch DESC LIMIT <limit>
) dt 
ON dt.id = annotation.id

or

SELECT
  annotation.id,
  annotation.epoch as time,
  annotation.epoch_end as time_end,
  annotation.dashboard_id,
  annotation.panel_id,
  annotation.new_state,
  annotation.prev_state,
  annotation.alert_id,
  annotation.text,
  annotation.tags,
  annotation.data,
  annotation.created,
  annotation.updated,
  usr.email,
  usr.login,
  alert.name as alert_name
FROM annotation
LEFT OUTER JOIN user as usr on usr.id = annotation.user_id
LEFT OUTER JOIN alert on alert.id = annotation.alert_id
INNER JOIN (
  SELECT a.id from annotation a
  WHERE a.org_id = <org_id>
  AND a.epoch <= <timestamp> AND a.epoch_end >= <timestamp>
  AND a.alert_id = 0
  AND a.dashboard_id = <dashboard_id>
  ORDER BY a.org_id, a.epoch_end DESC, a.epoch DESC LIMIT <limit>
) dt 
ON dt.id = annotation.id

@marefr
Copy link
Contributor

marefr commented Feb 5, 2020

Load test results

Setup

  • Used the ha_test
  • Used and added the number generator from https://stackoverflow.com/a/9751493.
  • Provisioned 100 alert rules/dashboards using ./alerts.sh provision -a 100 -c 20 which creates 200 annotations, 100 pending, 100 alerting
  • Changed alert condition ./alerts.sh provision -a 100 -c 100 to make alerts transition to OK which creates 100 annotations
  • Based on these 300 annotations I generated 1,5 million annotations with 10 minutes difference each step going backward:
    • v6.6.0 and this branch: insert into annotation (org_id, alert_id, user_id, dashboard_id, panel_id, category_id, type, title, text, metric, prev_state, new_state, data, epoch, region_id, tags, created, updated, epoch_end) select at.org_id, at.alert_id, at.user_id, at.dashboard_id, at.panel_id, at.category_id, at.type, at.title, at.text, at.metric, at.prev_state, at.new_state, at.data, at.epoch -(n*600000), at.region_id, at.tags, at.created, at.updated, at.epoch_end - (n*600000) from annotation at cross join generator_64k WHERE n > 0 AND n <= 5000\g
    • v6.3.7: insert into annotation (org_id, alert_id, user_id, dashboard_id, panel_id, category_id, type, title, text, metric, prev_state, new_state, data, epoch, region_id, tags, created, updated) select at.org_id, at.alert_id, at.user_id, at.dashboard_id, at.panel_id, at.category_id, at.type, at.title, at.text, at.metric, at.prev_state, at.new_state, at.data, at.epoch -(n*600000), at.region_id, at.tags, at.created, at.updated from annotation at cross join generator_64k WHERE n > 0 AND n <= 5000;
  • Then running annotation_by_tag_test load test for 5 minutes and using 200 concurrent users which each iteration does one annotation by tag request (no dashboard filter and time range) together with 20 tsdb queries using test datasource

v6.3.7

Details
> ./run.sh -d 5m -v 200 -c annotations_by_tag_test -u http://grafana.loc

          /\      |‾‾|  /‾‾/  /‾/
     /\  /  \     |  |_/  /  / /
    /  \/    \    |      |  /  ‾‾\
   /          \   |  |\  \ | (_) |
  / __________ \  |__|  \__\ \___/ .io

  execution: local
     output: -
     script: src/annotations_by_tag_test.js

    duration: 5m0s, iterations: -
         vus: 200,  max: 200

    done [==========================================================] 5m0s / 5m0s

    █ annotation by tag test

      █ user authenticates thru ui with username and password

        ✓ response status is 200
        ✓ response has cookie 'grafana_session' with 32 characters

      █ batch tsdb requests with annotations by tag

        ✗ response status is 200
         ↳  99% — ✓ 39999 / ✗ 1

    checks.....................: 99.99% ✓ 40399 ✗ 1
    data_received..............: 546 MB 1.8 MB/s
    data_sent..................: 14 MB  48 kB/s
    group_duration.............: avg=21.78s   min=134.87ms med=23.35s   max=46.48s   p(90)=29.45s  p(95)=31.4s
    http_req_blocked...........: avg=569.08µs min=1.34µs   med=4.49µs   max=222.78ms p(90)=10.17µs p(95)=32.75µs
    http_req_connecting........: avg=541.68µs min=0s       med=0s       max=222.68ms p(90)=0s      p(95)=0s
    http_req_duration..........: avg=2.05s    min=2.83ms   med=629.55ms max=46.29s   p(90)=3.32s   p(95)=7.59s
    http_req_receiving.........: avg=4.38ms   min=24.6µs   med=904.97µs max=251.85ms p(90)=8.55ms  p(95)=16.71ms
    http_req_sending...........: avg=187.92µs min=7.3µs    med=26.12µs  max=350.09ms p(90)=50.85µs p(95)=100.47µs
    http_req_tls_handshaking...: avg=0s       min=0s       med=0s       max=0s       p(90)=0s      p(95)=0s
    http_req_waiting...........: avg=2.05s    min=2.46ms   med=623.86ms max=46.29s   p(90)=3.31s   p(95)=7.58s
    http_reqs..................: 44344  147.8133/s
    iteration_duration.........: avg=26.7s    min=5.18µs   med=28.28s   max=51.48s   p(90)=34.54s  p(95)=36.53s
    iterations.................: 2135   7.116665/s
    vus........................: 200    min=200 max=200
    vus_max....................: 200    min=200 max=200

image
image
image
image

v6.6.0

Details
> ./run.sh -d 5m -v 200 -c annotations_by_tag_test -u http://grafana.loc

          /\      |‾‾|  /‾‾/  /‾/
     /\  /  \     |  |_/  /  / /
    /  \/    \    |      |  /  ‾‾\
   /          \   |  |\  \ | (_) |
  / __________ \  |__|  \__\ \___/ .io

  execution: local
     output: -
     script: src/annotations_by_tag_test.js

    duration: 15m0s, iterations: -
         vus: 200,   max: 200

    done [==========================================================] 5m0.3s / 15m0s

    █ annotation by tag test

      █ user authenticates thru ui with username and password

        ✓ response has cookie 'grafana_session' with 32 characters
        ✓ response status is 200

      █ batch tsdb requests with annotations by tag

        ✗ response status is 200
         ↳  86% — ✓ 3608 / ✗ 552

    checks.....................: 87.89% ✓ 4008  ✗ 552
    data_received..............: 49 MB  165 kB/s
    data_sent..................: 2.6 MB 8.6 kB/s
    group_duration.............: avg=47.51s   min=622.61ms med=1m0s     max=3m48s   p(90)=1m51s    p(95)=1m59s
    http_req_blocked...........: avg=26.89ms  min=1.43µs   med=4.37µs   max=2.55s   p(90)=2.89ms   p(95)=16.96ms
    http_req_connecting........: avg=26.36ms  min=0s       med=0s       max=2.55s   p(90)=2.45ms   p(95)=15.89ms
    http_req_duration..........: avg=35.07s   min=13.79ms  med=38.14s   max=1m2s    p(90)=1m0s     p(95)=1m0s
    http_req_receiving.........: avg=801.75µs min=18.01µs  med=149.15µs max=69.79ms p(90)=1.45ms   p(95)=3.14ms
    http_req_sending...........: avg=17.06ms  min=6.78µs   med=24.79µs  max=2.51s   p(90)=138.96µs p(95)=606.14µs
    http_req_tls_handshaking...: avg=0s       min=0s       med=0s       max=0s      p(90)=0s       p(95)=0s
    http_req_waiting...........: avg=35.05s   min=13.59ms  med=38.13s   max=1m2s    p(90)=1m0s     p(95)=1m0s
    http_reqs..................: 7786   25.924376/s
    iteration_duration.........: avg=50.77s   min=2.62s    med=1m5s     max=3m47s   p(90)=1m55s    p(95)=2m3s
    iterations.................: 404    1.345164/s
    vus........................: 200    min=200 max=200
    vus_max....................: 200    min=200 max=200

image
image
image

Changes in this branch (index changes)

Details
> ./run.sh -d 5m -v 200 -c annotations_by_tag_test -u http://grafana.loc

          /\      |‾‾|  /‾‾/  /‾/   
     /\  /  \     |  |_/  /  / /    
    /  \/    \    |      |  /  ‾‾\  
   /          \   |  |\  \ | (_) | 
  / __________ \  |__|  \__\ \___/ .io

  execution: local
     output: -
     script: src/annotations_by_tag_test.js

    duration: 5m0s, iterations: -
         vus: 200,  max: 200

    done [==========================================================] 5m0s / 5m0s

    █ annotation by tag test

      █ user authenticates thru ui with username and password

        ✓ response status is 200
        ✓ response has cookie 'grafana_session' with 32 characters

      █ batch tsdb requests with annotations by tag

        ✗ response status is 200
         ↳  97% — ✓ 60610 / ✗ 1290

    checks.....................: 97.92% ✓ 60950 ✗ 1290
    data_received..............: 968 MB 3.2 MB/s
    data_sent..................: 22 MB  75 kB/s
    group_duration.............: avg=12.72s   min=12.21ms  med=12.58s   max=35.68s   p(90)=21.2s   p(95)=23.27s
    http_req_blocked...........: avg=439.23µs min=1.43µs   med=4.35µs   max=177.6ms  p(90)=8.29µs  p(95)=18.91µs
    http_req_connecting........: avg=418.28µs min=0s       med=0s       max=177.52ms p(90)=0s      p(95)=0s
    http_req_duration..........: avg=1.97s    min=644.54µs med=972.39ms max=35.67s   p(90)=4.32s   p(95)=7.82s
    http_req_receiving.........: avg=21.1ms   min=19.47µs  med=1.68ms   max=1.23s    p(90)=46.98ms p(95)=104.85ms
    http_req_sending...........: avg=138.52µs min=8.53µs   med=24.31µs  max=286.01ms p(90)=44.96µs p(95)=71.02µs
    http_req_tls_handshaking...: avg=0s       min=0s       med=0s       max=0s       p(90)=0s      p(95)=0s
    http_req_waiting...........: avg=1.95s    min=560.73µs med=941.8ms  max=35.66s   p(90)=4.3s    p(95)=7.8s
    http_reqs..................: 68409  228.029945/s
    iteration_duration.........: avg=17.71s   min=1.96µs   med=17.58s   max=40.68s   p(90)=26.24s  p(95)=28.31s
    iterations.................: 3258   10.859997/s
    vus........................: 200    min=200 max=200
    vus_max....................: 200    min=200 max=200

image
image
image
image

Changes in this branch (index changes and query rewrite)

Details
> ./run.sh -d 5m -v 200 -c annotations_by_tag_test -u http://grafana.loc

          /\      |‾‾|  /‾‾/  /‾/
     /\  /  \     |  |_/  /  / /
    /  \/    \    |      |  /  ‾‾\
   /          \   |  |\  \ | (_) |
  / __________ \  |__|  \__\ \___/ .io

  execution: local
     output: -
     script: src/annotations_by_tag_test.js

    duration: 5m0s, iterations: -
         vus: 200,  max: 200

    done [==========================================================] 5m0s / 5m0s

    █ annotation by tag test

      █ user authenticates thru ui with username and password

        ✓ response status is 200
        ✓ response has cookie 'grafana_session' with 32 characters

      █ batch tsdb requests with annotations by tag

        ✗ response status is 200
         ↳  96% — ✓ 73768 / ✗ 3012

    checks.....................: 96.09% ✓ 74168 ✗ 3012
    data_received..............: 1.1 GB 3.8 MB/s
    data_sent..................: 27 MB  90 kB/s
    group_duration.............: avg=9.81s    min=10.13ms  med=10.42s  max=19.57s   p(90)=16.42s   p(95)=17.06s
    http_req_blocked...........: avg=129.94µs min=1.2µs    med=4.43µs  max=206.99ms p(90)=11.81µs  p(95)=21.55µs
    http_req_connecting........: avg=84.54µs  min=0s       med=0s      max=58.06ms  p(90)=0s       p(95)=0s
    http_req_duration..........: avg=2.56s    min=617.49µs med=1.92s   max=18.28s   p(90)=5.69s    p(95)=7.41s
    http_req_receiving.........: avg=140.72ms min=16.47µs  med=4.79ms  max=2.71s    p(90)=427.76ms p(95)=799.37ms
    http_req_sending...........: avg=132.09µs min=6.31µs   med=26.01µs max=323.71ms p(90)=65.82µs  p(95)=127.22µs
    http_req_tls_handshaking...: avg=0s       min=0s       med=0s      max=0s       p(90)=0s       p(95)=0s
    http_req_waiting...........: avg=2.42s    min=567.58µs med=1.8s    max=18.28s   p(90)=5.44s    p(95)=7.05s
    http_reqs..................: 80925  269.749846/s
    iteration_duration.........: avg=14.89s   min=2.11µs   med=15.5s   max=24.58s   p(90)=21.48s   p(95)=22.12s
    iterations.................: 3893   12.976659/s
    vus........................: 200    min=200 max=200
    vus_max....................: 200    min=200 max=200

image
image
image

Summary

Version HTTP requests total HTTP requests/s
v6.3.7 44344 ~148
v6.6.0 7786 ~26
This branch (index) 68409 ~228
This branch (index + query rewrite) 80925 ~270

It's worth mentioning that

  • the Grafana instance using version v6.6.0 was extremely unresponsive during the load test and basically timed out and didn't serve requests.
  • the v6.3.7 did respond quickly and behaved really snappy during the load test.
  • the current branch (both index changes and with/without query rewrite) did respond okay even though with latency serving requests. Given that this load test served much more requests than v6.3.7 and did put on twice the load my conclusion is that the additional amount of requests did put too much load on my machine/mysql.

@kylebrandt
Copy link
Contributor

kylebrandt commented Feb 5, 2020

Using my dataset (830k) annotations, generated via api calls against 6.3.7 (which doubles the count in that version before migration). Lines up with Marcus' results:

https://gist.github.com/kylebrandt/7ec4409aa008d791494368f1fd8c8c7a

Summary

Version HTTP requests total HTTP requests/s
v6.3.7 9480 ~31
v6.6.0 19465 (note: includes errors) ~64
This branch 92374 ~307

@marefr marefr marked this pull request as ready for review February 5, 2020 12:32
@marefr marefr requested a review from DanCech February 5, 2020 12:47
@marefr marefr changed the title SQLStore: Modify indices on the annotation table Annotations: Change indices and rewrites annotation find query to improve database query performance Feb 5, 2020
@marefr marefr merged commit 5ae9519 into master Feb 5, 2020
@marefr marefr deleted the 21902_fix branch February 5, 2020 15:52
alexanderzobnin pushed a commit that referenced this pull request Feb 6, 2020
…rove database query performance (#21915)

Drop indices and create new ones and rewrites annotation find query
to address performance issues when querying annotation table and
there is a large amount of rows.

Fixes #21902

Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
Co-authored-by: Kyle Brandt <kyle@kbrandt.com>
(cherry picked from commit 5ae9519)
alexanderzobnin pushed a commit that referenced this pull request Feb 6, 2020
…rove database query performance (#21915)

Drop indices and create new ones and rewrites annotation find query
to address performance issues when querying annotation table and
there is a large amount of rows.

Fixes #21902

Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
Co-authored-by: Kyle Brandt <kyle@kbrandt.com>
(cherry picked from commit 5ae9519)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Queries against Large Numbers of Annotations can overload/lock MySQL
5 participants