Increase report filter performance #1052

Merged: 1 commit merged on Oct 27, 2017

@csordasmarton csordasmarton commented Oct 18, 2017

This is part of issue #1038.

@csordasmarton csordasmarton added the discussion 💡, enhancement 🌟, CLI 💻 (related to the command-line interface, such as the cmd, store, etc. commands), and WARN ⚠️: Backward compatibility breaker! (MIND THE GAP! Merging this patch will mess up compatibility with the previous releases!) labels Oct 18, 2017
@csordasmarton csordasmarton added this to the release 6.2 milestone Oct 18, 2017
@csordasmarton csordasmarton requested a review from gyorb October 18, 2017 12:11

def downgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_index('report_filter_idx', table_name='reports')

Do we support downgrades?

@csordasmarton csordasmarton force-pushed the report_filter_performance branch from feedfcc to 30a2e61 Compare October 18, 2017 14:32

@gyorb gyorb left a comment

The Alembic migration script is missing.

@gyorb gyorb added the WIP 💣 Work In Progress label Oct 19, 2017
@csordasmarton csordasmarton force-pushed the report_filter_performance branch 2 times, most recently from f15b1b6 to 5008ec6 Compare October 24, 2017 14:21
@csordasmarton csordasmarton removed the WARN ⚠️: Backward compatibility breaker! and WIP 💣 (Work In Progress) labels Oct 24, 2017
@csordasmarton csordasmarton requested a review from gyorb October 24, 2017 14:28
@csordasmarton csordasmarton force-pushed the report_filter_performance branch from 5008ec6 to 8cf54fc Compare October 24, 2017 16:17

@whisperity whisperity left a comment

Could you add a short explanation of what was slow and what the general idea behind the modifications is?

@csordasmarton

  • Remove the unnecessary IN clause when all run IDs (1..850) are given:
-- Before: 10851ms
SELECT reports.severity AS reports_severity, count(DISTINCT reports.bug_id) AS count_1 
FROM reports
LEFT OUTER JOIN files ON reports.file_id = files.id
LEFT OUTER JOIN review_statuses ON review_statuses.bug_hash = reports.bug_id
WHERE reports.run_id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 
417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 
817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850)
GROUP BY reports.severity

-- After: 10008ms
SELECT reports.severity AS reports_severity, count(DISTINCT reports.bug_id) AS count_1 
FROM reports
LEFT OUTER JOIN files ON reports.file_id = files.id
LEFT OUTER JOIN review_statuses ON review_statuses.bug_hash = reports.bug_id
GROUP BY reports.severity
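
For illustration only, here is a minimal Python sketch of how a query builder might skip the run filter when the selection already covers every run. The function name `run_filter_clause` and its parameters are hypothetical and are not taken from this patch. Treating an empty selection as "no filter" is also an assumption.

```python
def run_filter_clause(selected_run_ids, all_run_ids):
    """Return (sql_fragment, params) restricting reports to the selected runs.

    Hypothetical helper: returns an empty fragment when nothing is selected
    (assumed to mean "no filter") or when the selection covers every known
    run, because the IN clause would then not exclude any row.
    """
    selected = set(selected_run_ids)
    if not selected or selected >= set(all_run_ids):
        return "", []
    ordered = sorted(selected)
    placeholders = ", ".join("?" for _ in ordered)
    return f"WHERE reports.run_id IN ({placeholders})", ordered


# All 850 runs selected: no IN clause is generated.
all_selected = run_filter_clause(range(1, 851), range(1, 851))
# Only two runs selected: the filter is kept, with bound parameters.
some_selected = run_filter_clause([3, 1], range(1, 851))
```

Dropping the clause is only safe if every row in `reports` belongs to one of the known runs; otherwise the unfiltered query could return extra rows.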
  • Use subqueries and group by bug hash first:
-- Before: 10008ms
SELECT reports.severity AS reports_severity, count(DISTINCT reports.bug_id) AS count_1 
FROM reports
LEFT OUTER JOIN files ON reports.file_id = files.id
LEFT OUTER JOIN review_statuses ON review_statuses.bug_hash = reports.bug_id
GROUP BY reports.severity

-- After: 174ms
SELECT anon_1.severity AS anon_1_severity, count(anon_1.bug_id) AS count_1 
FROM (
  SELECT max(reports.severity) AS severity, reports.bug_id AS bug_id 
  FROM reports
  LEFT OUTER JOIN files ON reports.file_id = files.id
  LEFT OUTER JOIN review_statuses ON review_statuses.bug_hash = reports.bug_id
  GROUP BY reports.bug_id
) AS anon_1
GROUP BY anon_1.severity
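
As a rough check of the rewrite above, the following sketch runs both query shapes against a tiny in-memory SQLite database. The table layout and the sample rows are assumptions made for the example, and the joins are left out to keep it short. It compares results only and says nothing about timing on the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema: only the columns the two queries touch.
conn.execute(
    "CREATE TABLE reports ("
    "id INTEGER PRIMARY KEY, run_id INTEGER, bug_id TEXT, severity INTEGER)"
)
# The same bug hash is reported in several runs, always with one severity.
rows = [
    (1, "hash_a", 30), (2, "hash_a", 30), (3, "hash_a", 30),
    (1, "hash_b", 30), (2, "hash_b", 30),
    (1, "hash_c", 10),
]
conn.executemany(
    "INSERT INTO reports (run_id, bug_id, severity) VALUES (?, ?, ?)", rows
)

# Original shape: count distinct bug hashes per severity.
distinct_counts = conn.execute(
    "SELECT severity, count(DISTINCT bug_id) FROM reports "
    "GROUP BY severity ORDER BY severity"
).fetchall()

# Rewritten shape: collapse to one row per bug hash first, then count.
grouped_counts = conn.execute(
    "SELECT anon_1.severity, count(anon_1.bug_id) FROM ("
    "  SELECT max(severity) AS severity, bug_id FROM reports GROUP BY bug_id"
    ") AS anon_1 GROUP BY anon_1.severity ORDER BY anon_1.severity"
).fetchall()

print(distinct_counts, grouped_counts)
```

The two shapes agree here because each bug hash has a single severity. If the same hash appeared with different severities, `count(DISTINCT ...)` would count it under each severity, while the subquery version counts it once under its maximum severity.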
  • Grouping by multiple columns is very slow. In this case we should use an SQL index to get the results:
-- Before: 38946ms
SELECT DISTINCT reports.bug_id AS reports_bug_id, reports.checker_id AS reports_checker_id, reports.checker_message AS reports_checker_message, reports.severity AS reports_severity, files.filename AS files_filename, review_statuses.bug_hash AS review_statuses_bug_hash, review_statuses.status AS review_statuses_status, review_statuses.author AS review_statuses_author, review_statuses.message AS review_statuses_message, review_statuses.date AS review_statuses_date 
FROM reports
LEFT OUTER JOIN files ON reports.file_id = files.id
LEFT OUTER JOIN review_statuses ON review_statuses.bug_hash = reports.bug_id 
ORDER BY reports.severity DESC, reports.bug_id, reports.checker_id, reports.checker_message, reports.severity, files.filename, review_statuses.status
LIMIT 500 OFFSET 0

-- After: 645ms
CREATE INDEX reports_filter_column_idx ON reports (
  bug_id , checker_id, checker_message, severity, file_id);

SELECT anon_1.bug_id AS anon_1_bug_id, anon_1.checker_id AS anon_1_checker_id, anon_1.checker_message AS anon_1_checker_message,
  anon_1.severity AS anon_1_severity, anon_1.file_id AS anon_1_file_id, anon_1.status AS anon_1_status,
  anon_1.message AS anon_1_message, anon_1.author AS anon_1_author, anon_1.date AS anon_1_date, files.filename AS filename 
FROM (
  SELECT anon_2.bug_id AS bug_id, anon_2.checker_id AS checker_id, anon_2.checker_message AS checker_message,
    anon_2.severity AS severity, anon_2.file_id AS file_id, anon_2.status AS status,
    review_statuses.message AS message, review_statuses.author AS author, review_statuses.date AS date 
  FROM (
    SELECT reports.bug_id AS bug_id,
      reports.checker_id AS checker_id,
      reports.checker_message AS checker_message,
      reports.severity AS severity,
      reports.file_id AS file_id,
      max(review_statuses.status) AS status 
    FROM reports
    LEFT OUTER JOIN files ON reports.file_id = files.id
    LEFT OUTER JOIN review_statuses ON review_statuses.bug_hash = reports.bug_id
    GROUP BY reports.bug_id, reports.checker_id, reports.checker_message, reports.severity, reports.file_id
    LIMIT 500 OFFSET 0
  ) AS anon_2
  LEFT OUTER JOIN review_statuses ON review_statuses.bug_hash = anon_2.bug_id
) AS anon_1
LEFT OUTER JOIN files ON anon_1.file_id = files.id
ORDER BY anon_1.severity DESC
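
A minimal sketch of creating and removing the composite index as an upgrade/downgrade pair follows. It uses Python's built-in `sqlite3` with plain SQL rather than Alembic, and the `reports` table definition is an assumption made for the example. Only the index name and column list are taken from the statement above.

```python
import sqlite3

INDEX_NAME = "reports_filter_column_idx"
INDEX_COLUMNS = ["bug_id", "checker_id", "checker_message", "severity", "file_id"]


def upgrade(conn):
    """Create the composite index covering the grouped columns."""
    columns = ", ".join(INDEX_COLUMNS)
    conn.execute(f"CREATE INDEX {INDEX_NAME} ON reports ({columns})")


def downgrade(conn):
    """Drop the index again, restoring the previous schema."""
    conn.execute(f"DROP INDEX {INDEX_NAME}")


def index_columns(conn, name):
    """Return the indexed column names in index order, or [] if absent."""
    rows = conn.execute(f"PRAGMA index_info({name})").fetchall()
    # Each row is (seqno, cid, column_name); sort by position in the index.
    return [row[2] for row in sorted(rows)]


conn = sqlite3.connect(":memory:")
# Assumed, simplified table definition for the example.
conn.execute(
    "CREATE TABLE reports ("
    "id INTEGER PRIMARY KEY, run_id INTEGER, file_id INTEGER, bug_id TEXT, "
    "checker_id TEXT, checker_message TEXT, severity INTEGER)"
)

upgrade(conn)
created = index_columns(conn, INDEX_NAME)
downgrade(conn)
dropped = index_columns(conn, INDEX_NAME)
```

The column order in the index matches the `GROUP BY` order of the innermost subquery, which is what lets a database consider the index for that grouping. Whether it is actually used depends on the database engine and its planner.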

@csordasmarton csordasmarton force-pushed the report_filter_performance branch from 8cf54fc to b8a9e85 Compare October 27, 2017 09:10
@gyorb gyorb modified the milestones: release 6.2, 6.1.1 Oct 27, 2017
@csordasmarton csordasmarton force-pushed the report_filter_performance branch from b8a9e85 to d25bfa4 Compare October 27, 2017 13:07
@gyorb gyorb merged commit 64e128e into Ericsson:master Oct 27, 2017
@csordasmarton csordasmarton deleted the report_filter_performance branch October 31, 2017 14:17