Implement getitem backward #2883

Merged
merged 204 commits into from
Jul 24, 2024
seungmanhan (Collaborator) commented Apr 8, 2024

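Added the backward kernel for the getitem operation, together with its solver.
Added a driver test and a gtest for getitem backward.
Compared with ROCm PyTorch, MIOpen HIP is roughly 2x to 6x faster in most of the cases below. The Improvement column is the ROCm PyTorch op time divided by the MIOpen HIP op time.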
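
For readers unfamiliar with the operation, the following is a minimal reference sketch of what a getitem backward pass computes, assuming it matches PyTorch-style indexing with a single integer index along dim 0 (the only dim benchmarked here). The function name `getitem_backward_dim0` and its parameters are hypothetical and are not taken from the MIOpen code; the actual HIP kernel and solver in this PR may be organized quite differently.

```python
def getitem_backward_dim0(dy, index, dx_rows):
    """Reference sketch: accumulate rows of dy into a zero-initialised dx.

    dy      -- list of rows (each a list of floats), one row per index entry
    index   -- list of row positions in dx; negative values wrap around
    dx_rows -- number of rows in dx (first entry of dx_size)
    """
    if len(dy) != len(index):
        raise ValueError("dy must have one row per index entry")
    row_len = len(dy[0]) if dy else 0
    dx = [[0.0] * row_len for _ in range(dx_rows)]
    for row, i in zip(dy, index):
        if not -dx_rows <= i < dx_rows:
            raise IndexError(f"index {i} out of range for {dx_rows} rows")
        target = dx[i % dx_rows]
        # Repeated indices accumulate, so gradients are summed, not overwritten.
        for col, value in enumerate(row):
            target[col] += value
    return dx


dy = [[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]]
dx = getitem_backward_dim0(dy, index=[2, 0, 2], dx_rows=4)
print(dx)  # [[10.0, 20.0], [0.0, 0.0], [101.0, 202.0], [0.0, 0.0]]
```

Row 2 of `dx` receives two contributions because index 2 appears twice. This is why several rows in the tables below have an index_size larger than the first dimension of dx_size.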

getitem float16
op_name dtype dx_size index_size dy_size dim model dir ROCm PyTorch (op time) MIOpen HIP (op time) Improvement (x)
getitem float16 [128 128] [128] [128 128] 0 llama2 bwd 321654 53474 6.02
getitem float16 [3234 4] [16] [16 4] 0 ssdlite bwd 249380 60070 4.15
getitem float16 [3234 4] [12] [12 4] 0 ssdlite bwd 211604 57741 3.66
getitem float16 [3234 4] [11] [11 4] 0 ssdlite bwd 254389 54470 4.67
getitem float16 [3234 4] [13] [13 4] 0 ssdlite bwd 240549 58079 4.14
getitem float16 [3234 4] [14] [14 4] 0 ssdlite bwd 220932 59892 3.69
getitem float16 [3234 4] [15] [15 4] 0 ssdlite bwd 243876 58523 4.17
getitem float16 [3234 4] [10] [10 4] 0 ssdlite bwd 234308 60301 3.89
getitem float16 [3234 4] [18] [18 4] 0 ssdlite bwd 214548 59323 3.62
getitem float16 [3234 4] [20] [20 4] 0 ssdlite bwd 221380 61030 3.63
getitem float16 [3234 4] [23] [23 4] 0 ssdlite bwd 243877 59946 4.07
getitem float16 [3234 4] [8] [8 4] 0 ssdlite bwd 210948 59039 3.57
getitem float16 [3234 4] [22] [22 4] 0 ssdlite bwd 223524 59021 3.79
getitem float16 [3234 4] [25] [25 4] 0 ssdlite bwd 246389 59394 4.15
getitem float16 [3234 4] [7] [7 4] 0 ssdlite bwd 206372 57030 3.62
getitem float16 [3234 4] [28] [28 4] 0 ssdlite bwd 226452 61297 3.69
getitem float16 [3234 4] [30] [30 4] 0 ssdlite bwd 206788 60265 3.43
getitem float16 [3234 4] [31] [31 4] 0 ssdlite bwd 223364 62025 3.60
getitem float16 [3234 4] [32] [32 4] 0 ssdlite bwd 215620 60354 3.57
getitem float16 [3234 4] [33] [33 4] 0 ssdlite bwd 203548 55057 3.70
getitem float16 [3234 4] [34] [34 4] 0 ssdlite bwd 223060 61474 3.63
getitem float16 [3234] [16] [16] 0 ssdlite bwd 224244 60408 3.71
getitem float16 [149 128] [1490] [1490 128] 0 llama2_7b bwd 287158 65972 4.35
getitem float16 [150 128] [10] [10 128] 0 llama2_7b bwd 235060 63465 3.70
getitem float16 [174 128] [10] [10 128] 0 llama2_7b bwd 266485 62950 4.23
getitem float16 [205 128] [10] [10 128] 0 llama2_7b bwd 247861 63604 3.90
getitem float16 [232 128] [10] [10 128] 0 llama2_7b bwd 257685 61379 4.20
getitem float16 [3234] [15] [15] 0 ssdlite bwd 242532 58479 4.15
getitem float16 [3234 4] [2] [2 4] 0 ssdlite bwd 219716 60026 3.66
getitem float16 [3234 4] [37] [37 4] 0 ssdlite bwd 267941 62142 4.31
getitem float16 [3234] [10] [10] 0 ssdlite bwd 263285 62337 4.22
getitem float16 [3234 4] [38] [38 4] 0 ssdlite bwd 219188 63012 3.48
getitem float16 [3234 4] [41] [41 4] 0 ssdlite bwd 181827 60665 3.00
getitem float16 [3234 4] [6] [6 4] 0 ssdlite bwd 232612 64968 3.58
getitem float16 [3234 4] [44] [44 4] 0 ssdlite bwd 299382 55455 5.40
getitem float16 [3234] [22] [22] 0 ssdlite bwd 265733 55438 4.79
getitem float16 [3234] [25] [25] 0 ssdlite bwd 193556 60469 3.20
getitem float16 [3234 4] [43] [43 4] 0 ssdlite bwd 222116 59971 3.70
getitem float16 [3234] [21] [21] 0 ssdlite bwd 204452 59811 3.42
getitem float16 [3234] [27] [27] 0 ssdlite bwd 187636 60967 3.08
getitem float16 [3234] [24] [24] 0 ssdlite bwd 213284 56219 3.79
getitem float16 [3234 4] [42] [42 4] 0 ssdlite bwd 209716 61002 3.44
getitem float16 [3234 4] [3] [3 4] 0 ssdlite bwd 184708 58247 3.17
getitem float16 [3234 4] [45] [45 4] 0 ssdlite bwd 220964 61909 3.57
getitem float16 [3234] [40] [40] 0 ssdlite bwd 248037 54475 4.55
getitem float16 [1 4] [8] [8 4] 0 ssdlite bwd 214212 59774 3.58
getitem float16 [1] [13] [13] 0 ssdlite bwd 638684 61516 10.38
getitem float16 [3234] [45] [45] 0 ssdlite bwd 214692 66920 3.21
getitem float16 [1 4] [7] [7 4] 0 ssdlite bwd 219844 60449 3.64
getitem float16 [1] [7] [7] 0 ssdlite bwd 199988 60822 3.29
getitem float16 [3234] [4] [4] 0 ssdlite bwd 220484 58760 3.75
getitem float16 [3234 4] [59] [59 4] 0 ssdlite bwd 185812 61853 3.00
getitem float16 [3234] [5] [5] 0 ssdlite bwd 223844 59720 3.75
getitem float16 [2 4] [15] [15 4] 0 ssdlite bwd 229764 58048 3.96
getitem float16 [3234 4] [62] [62 4] 0 ssdlite bwd 228916 61035 3.75
getitem float16 [3234] [57] [57] 0 ssdlite bwd 175347 61887 2.83
getitem float16 [3234 4] [77] [77 4] 0 ssdlite bwd 185524 63025 2.94
getitem float16 [3234 4] [79] [79 4] 0 ssdlite bwd 185380 62545 2.96
getitem float16 [3234 4] [75] [75 4] 0 ssdlite bwd 166003 64340 2.58
getitem float16 [3234] [59] [59] 0 ssdlite bwd 160531 60767 2.64
getitem float16 [8741 4] [2000] [2000 4] 0 maskrcnn bwd 228196 60553 3.77
getitem float16 [8741] [2000] [2000] 0 maskrcnn bwd 198628 61833 3.21
getitem float16 [3234 4] [89] [89 4] 0 ssdlite bwd 138706 64713 2.14
getitem float16 [3234 4] [84] [84 4] 0 ssdlite bwd 135922 64038 2.12
getitem float16 [8741 4] [8741] [8741 4] 0 maskrcnn bwd 211428 58420 3.62
getitem float16 [8741] [8741] [8741] 0 maskrcnn bwd 202004 53620 3.77
getitem float16 [3234] [58] [58] 0 ssdlite bwd 130242 61140 2.13
getitem float16 [3 4] [20] [20 4] 0 ssdlite bwd 137362 62900 2.18
getitem float16 [3] [20] [20] 0 ssdlite bwd 135986 60251 2.26
getitem float16 [3234] [64] [64] 0 ssdlite bwd 138371 63077 2.19
getitem float16 [3234 4] [83] [83 4] 0 ssdlite bwd 132819 64002 2.08
getitem float16 [3 4] [17] [17 4] 0 ssdlite bwd 133779 62331 2.15
getitem float16 [3] [17] [17] 0 ssdlite bwd 131394 62953 2.09
getitem float16 [3234 4] [86] [86 4] 0 ssdlite bwd 144003 63131 2.28
getitem float16 [3234 4] [88] [88 4] 0 ssdlite bwd 135123 63682 2.12
getitem float16 [3234] [65] [65] 0 ssdlite bwd 129378 62065 2.08
getitem float16 [3234] [73] [73] 0 ssdlite bwd 138162 64962 2.13
getitem float16 [3234] [61] [61] 0 ssdlite bwd 133827 59540 2.25
getitem float16 [3 4] [23] [23 4] 0 ssdlite bwd 131506 62598 2.10
getitem float16 [3] [23] [23] 0 ssdlite bwd 136962 61229 2.24
getitem float16 [3 4] [21] [21 4] 0 ssdlite bwd 139315 58740 2.37
getitem float16 [3] [21] [21] 0 ssdlite bwd 133459 61975 2.15
getitem float16 [2 4] [8] [8 4] 0 ssdlite bwd 136290 58775 2.32
getitem float16 [2] [8] [8] 0 ssdlite bwd 135587 61975 2.19
getitem float16 [3 4] [28] [28 4] 0 ssdlite bwd 135827 58473 2.32
getitem float16 [3] [28] [28] 0 ssdlite bwd 137091 67487 2.03
getitem float16 [2 4] [27] [27 4] 0 ssdlite bwd 144339 60873 2.37
getitem float16 [2] [27] [27] 0 ssdlite bwd 141299 64411 2.19
getitem float16 [3 4] [22] [22 4] 0 ssdlite bwd 134851 60535 2.23
getitem float16 [3] [22] [22] 0 ssdlite bwd 139475 60517 2.30
getitem float16 [3234 4] [78] [78 4] 0 ssdlite bwd 135538 62011 2.19
getitem float16 [2 4] [28] [28 4] 0 ssdlite bwd 136659 61744 2.21
getitem float16 [2] [28] [28] 0 ssdlite bwd 144307 60198 2.40
getitem float16 [3234] [67] [67] 0 ssdlite bwd 136146 62367 2.18
getitem float16 [3234] [74] [74] 0 ssdlite bwd 132786 61584 2.16
getitem float16 [3234] [70] [70] 0 ssdlite bwd 147762 59860 2.47
getitem float16 [3 4] [26] [26 4] 0 ssdlite bwd 159059 61015 2.61
getitem float16 [3] [26] [26] 0 ssdlite bwd 152835 67380 2.27
getitem float16 [3 4] [29] [29 4] 0 ssdlite bwd 157571 65264 2.41
getitem float16 [3] [29] [29] 0 ssdlite bwd 161635 63255 2.56
getitem float16 [3 4] [18] [18 4] 0 ssdlite bwd 151955 60553 2.51
getitem float16 [3] [18] [18] 0 ssdlite bwd 155939 61246 2.55
getitem float16 [3234] [68] [68] 0 ssdlite bwd 161587 62722 2.58
getitem float16 [3234] [66] [66] 0 ssdlite bwd 160131 62224 2.57
getitem float16 [3 4] [19] [19 4] 0 ssdlite bwd 154099 59024 2.61
getitem float16 [3] [19] [19] 0 ssdlite bwd 148547 56250 2.64
getitem float16 [3234 4] [90] [90 4] 0 ssdlite bwd 158995 63060 2.52
getitem float16 [3] [25] [25] 0 ssdlite bwd 150099 60108 2.50
getitem float16 [3234 4] [87] [87 4] 0 ssdlite bwd 159315 63077 2.53
getitem float16 [3 4] [10] [10 4] 0 ssdlite bwd 158275 59895 2.64
getitem float16 [3] [10] [10] 0 ssdlite bwd 148851 58188 2.56
getitem float16 [3 4] [27] [27 4] 0 ssdlite bwd 157251 60784 2.59
getitem float16 [3] [27] [27] 0 ssdlite bwd 158787 64838 2.45
getitem float16 [3 4] [30] [30 4] 0 ssdlite bwd 154851 61975 2.50
getitem float16 [3] [30] [30] 0 ssdlite bwd 159859 54526 2.93
getitem float16 [3 4] [16] [16 4] 0 ssdlite bwd 159331 61122 2.61
getitem float16 [3] [16] [16] 0 ssdlite bwd 151491 64322 2.36
getitem float16 [3 4] [33] [33 4] 0 ssdlite bwd 158995 63539 2.50
getitem float16 [3] [33] [33] 0 ssdlite bwd 163427 58775 2.78
getitem float16 [3234 4] [81] [81 4] 0 ssdlite bwd 155011 63522 2.44
getitem float16 [3 4] [34] [34 4] 0 ssdlite bwd 154659 63415 2.44
getitem float16 [3] [34] [34] 0 ssdlite bwd 161091 62028 2.60
getitem float16 [4300 4] [4261] [4261 4] 0 fasterrcnn bwd 208404 60410 3.45
getitem float16 [4300] [4261] [4261] 0 fasterrcnn bwd 208388 63860 3.26
getitem float16 [4261 4] [1000] [1000 4] 0 fasterrcnn bwd 160339 61815 2.59
getitem float16 [4261] [1000] [1000] 0 fasterrcnn bwd 165416 61317 2.70
getitem float16 [3 4] [31] [31 4] 0 ssdlite bwd 168131 61975 2.71
getitem float16 [3] [31] [31] 0 ssdlite bwd 148131 63255 2.34
getitem float16 [2 4] [29] [29 4] 0 ssdlite bwd 157119 62206 2.53
getitem float16 [2] [29] [29] 0 ssdlite bwd 161795 60873 2.66
getitem float16 [3234] [72] [72] 0 ssdlite bwd 152995 61139 2.50
getitem float16 [3234] [69] [69] 0 ssdlite bwd 152755 62971 2.43
getitem float16 [3234 4] [94] [94 4] 0 ssdlite bwd 157939 65477 2.41
getitem float16 [3 4] [11] [11 4] 0 ssdlite bwd 151907 54561 2.78
getitem float16 [3] [11] [11] 0 ssdlite bwd 159091 60357 2.64
getitem float16 [3 4] [13] [13 4] 0 ssdlite bwd 149555 59486 2.51
getitem float16 [3] [13] [13] 0 ssdlite bwd 153779 57797 2.66
getitem float16 [3 4] [12] [12 4] 0 ssdlite bwd 157283 59504 2.64
getitem float16 [3] [12] [12] 0 ssdlite bwd 152963 60073 2.55
getitem float16 [3 4] [14] [14 4] 0 ssdlite bwd 151123 67291 2.25
getitem float16 [3] [14] [14] 0 ssdlite bwd 156915 63611 2.47
getitem float16 [3234] [71] [71] 0 ssdlite bwd 155427 61851 2.51
getitem float16 [3 4] [24] [24 4] 0 ssdlite bwd 155475 59770 2.60
getitem float16 [3] [24] [24] 0 ssdlite bwd 160387 62028 2.59
getitem float16 [3234 4] [101] [101 4] 0 ssdlite bwd 154627 63415 2.44
getitem float16 [1 4] [4] [4 4] 0 ssdlite bwd 148931 61566 2.42
getitem float16 [1] [4] [4] 0 ssdlite bwd 156211 61477 2.54
getitem float16 [1 4] [6] [6 4] 0 ssdlite bwd 154675 54366 2.85
getitem float16 [1] [6] [6] 0 ssdlite bwd 150579 59433 2.53
getitem float16 [3234 4] [85] [85 4] 0 ssdlite bwd 158851 62490 2.54
getitem float16 [3234 4] [92] [92 4] 0 ssdlite bwd 156035 64873 2.41
getitem float16 [4 4] [4] [4 4] 0 ssdlite bwd 154755 58668 2.64
getitem float16 [4] [4] [4] 0 ssdlite bwd 150051 62330 2.41
getitem float16 [3234] [76] [76] 0 ssdlite bwd 159507 56588 2.82
getitem float16 [3 4] [15] [15 4] 0 ssdlite bwd 153107 59646 2.57
getitem float16 [3] [15] [15] 0 ssdlite bwd 144019 60001 2.40
getitem float16 [3234] [80] [80] 0 ssdlite bwd 146691 61122 2.40
getitem float16 [3 4] [32] [32 4] 0 ssdlite bwd 145331 61353 2.37
getitem float16 [3] [32] [32] 0 ssdlite bwd 142178 57050 2.49
getitem float16 [3234 4] [91] [91 4] 0 ssdlite bwd 137411 62881 2.19
getitem float16 [4300 4] [4262] [4262 4] 0 fasterrcnn bwd 202772 58241 3.48
getitem float16 [4300] [4262] [4262] 0 fasterrcnn bwd 188083 57939 3.25
getitem float16 [4262 4] [1000] [1000 4] 0 fasterrcnn bwd 152211 60819 2.50
getitem float16 [4262] [1000] [1000] 0 fasterrcnn bwd 143955 61548 2.34
getitem float16 [1] [5] [5] 0 maskrcnn_ssdlite bwd 133970 61139 2.19
getitem float16 [4 4] [27] [27 4] 0 ssdlite bwd 139363 62117 2.24
getitem float16 [4] [27] [27] 0 ssdlite bwd 135491 62028 2.18
getitem float16 [1 4] [5] [5 4] 0 ssdlite bwd 130291 58935 2.21
getitem float16 [4 4] [28] [28 4] 0 ssdlite bwd 138797 59895 2.32
getitem float16 [4] [28] [28] 0 ssdlite bwd 137699 60730 2.27
getitem float16 [4300 4] [4194] [4194 4] 0 fasterrcnn bwd 194276 59468 3.27
getitem float16 [4300] [4194] [4194] 0 fasterrcnn bwd 190578 61886 3.08
getitem float16 [4194 4] [1000] [1000 4] 0 fasterrcnn bwd 171235 60410 2.83
getitem float16 [2 4] [30] [30 4] 0 ssdlite bwd 158163 60837 2.60
getitem float16 [2] [30] [30] 0 ssdlite bwd 151331 62153 2.43
getitem float16 [4 4] [33] [33 4] 0 ssdlite bwd 157619 61459 2.56
getitem float16 [4] [33] [33] 0 ssdlite bwd 156979 61157 2.57
getitem float16 [3234] [77] [77] 0 ssdlite bwd 152355 61993 2.46
getitem float16 [3234 4] [93] [93 4] 0 ssdlite bwd 149283 62935 2.37
getitem float16 [3234 4] [105] [105 4] 0 ssdlite bwd 158163 63344 2.50
getitem float16 [3234 4] [113] [113 4] 0 ssdlite bwd 150099 64464 2.33
getitem float16 [4194] [1000] [1000] 0 fasterrcnn bwd 160915 70206 2.29
getitem float16 [3 4] [37] [37 4] 0 ssdlite bwd 145811 59148 2.47
getitem float16 [3] [37] [37] 0 ssdlite bwd 146574 57595 2.54
getitem float16 [4 4] [37] [37 4] 0 ssdlite bwd 146515 64866 2.26
getitem float16 [4] [37] [37] 0 ssdlite bwd 148307 62217 2.38
getitem float16 [2 4] [7] [7 4] 0 ssdlite bwd 134354 64244 2.09
getitem float16 [2] [7] [7] 0 ssdlite bwd 150131 59159 2.54
getitem float16 [4 4] [18] [18 4] 0 ssdlite bwd 140434 55355 2.54
getitem float16 [4] [18] [18] 0 ssdlite bwd 134099 59800 2.24
getitem float16 [3 4] [9] [9 4] 0 ssdlite bwd 135986 59462 2.29
getitem float16 [3] [9] [9] 0 ssdlite bwd 154323 67159 2.30
getitem float16 [3234] [79] [79] 0 ssdlite bwd 140083 63515 2.21
getitem float16 [4 4] [30] [30 4] 0 ssdlite bwd 133794 59604 2.24
getitem float16 [4] [30] [30] 0 ssdlite bwd 149619 61115 2.45
getitem float16 [3234] [75] [75] 0 ssdlite bwd 139619 60600 2.30
getitem float16 [3 4] [35] [35 4] 0 ssdlite bwd 136691 62360 2.19
getitem float16 [4 4] [39] [39 4] 0 ssdlite bwd 216420 68975 3.14
getitem float16 [8] [2008] [2008] 0 maskrcnn bwd 263685 128797 2.05
getitem float16 [5] [14] [14] 0 maskrcnn_ssdlite bwd 196612 61884 3.18
getitem float16 [8 4] [8] [8 4] 0 ssdlite bwd 168851 60035 2.81
getitem float16 [4197 4] [1000] [1000 4] 0 fasterrcnn bwd 166435 64302 2.59
getitem float16 [4197] [1000] [1000] 0 fasterrcnn bwd 171219 59858 2.86
getitem float16 [16] [2016] [2016] 0 maskrcnn bwd 238692 95396 2.50
getitem float16 [7 4] [33] [33 4] 0 ssdlite bwd 155811 59449 2.62
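
The Improvement values can be rechecked from the two timing columns. The snippet below is a small sketch that assumes each row is whitespace-separated and that its last three fields are the ROCm PyTorch op time, the MIOpen HIP op time, and the reported ratio. The helper name `improvement` is hypothetical.

```python
def improvement(row):
    """Return (pytorch_time, miopen_time, ratio) parsed from one table row."""
    fields = row.split()
    # The last three fields are the two timings and the reported ratio.
    pytorch_time, miopen_time = int(fields[-3]), int(fields[-2])
    return pytorch_time, miopen_time, round(pytorch_time / miopen_time, 2)


row = "getitem float16 [128 128] [128] [128 128] 0 llama2 bwd 321654 53474 6.02"
print(improvement(row))  # (321654, 53474, 6.02)
```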
getitem float32
op_name dtype dx_size index_size dy_size dim model dir ROCm PyTorch (op time) MIOpen HIP (op time) Improvement (x)
getitem float32 [128 128] [128] [128 128] 0 llama2 bwd 259861 61190 4.25
getitem float32 [3234 4] [16] [16 4] 0 ssdlite bwd 239348 61048 3.92
getitem float32 [3234 4] [12] [12 4] 0 ssdlite bwd 249861 59945 4.17
getitem float32 [3234 4] [11] [11 4] 0 ssdlite bwd 233364 59288 3.94
getitem float32 [3234 4] [13] [13 4] 0 ssdlite bwd 230340 58506 3.94
getitem float32 [3234 4] [14] [14 4] 0 ssdlite bwd 253285 60248 4.20
getitem float32 [3234 4] [15] [15 4] 0 ssdlite bwd 227812 58808 3.87
getitem float32 [3234 4] [10] [10 4] 0 ssdlite bwd 219668 60248 3.65
getitem float32 [3234 4] [18] [18 4] 0 ssdlite bwd 236724 63963 3.70
getitem float32 [3234 4] [20] [20 4] 0 ssdlite bwd 234116 55679 4.20
getitem float32 [3234 4] [23] [23 4] 0 ssdlite bwd 197892 60443 3.27
getitem float32 [3234 4] [8] [8 4] 0 ssdlite bwd 228756 54541 4.19
getitem float32 [3234 4] [22] [22 4] 0 ssdlite bwd 233748 60870 3.84
getitem float32 [3234 4] [25] [25 4] 0 ssdlite bwd 200356 62221 3.22
getitem float32 [3234 4] [7] [7 4] 0 ssdlite bwd 213124 58541 3.64
getitem float32 [3234 4] [28] [28 4] 0 ssdlite bwd 214148 60159 3.56
getitem float32 [3234 4] [30] [30 4] 0 ssdlite bwd 224036 62506 3.58
getitem float32 [3234 4] [31] [31 4] 0 ssdlite bwd 211092 59110 3.57
getitem float32 [3234 4] [32] [32 4] 0 ssdlite bwd 213524 61634 3.46
getitem float32 [3234 4] [33] [33 4] 0 ssdlite bwd 223892 60195 3.72
getitem float32 [3234 4] [34] [34 4] 0 ssdlite bwd 210660 59999 3.51
getitem float32 [3234] [16] [16] 0 ssdlite bwd 214116 60426 3.54
getitem float32 [149 128] [1490] [1490 128] 0 llama2_7b bwd 254165 83092 3.06
getitem float32 [150 128] [10] [10 128] 0 llama2_7b bwd 267893 61794 4.34
getitem float32 [174 128] [10] [10 128] 0 llama2_7b bwd 272789 61705 4.42
getitem float32 [205 128] [10] [10 128] 0 llama2_7b bwd 298054 62004 4.81
getitem float32 [232 128] [10] [10 128] 0 llama2_7b bwd 256661 62285 4.12
getitem float32 [248 128] [10] [10 128] 0 llama2_7b bwd 237956 63600 3.74
getitem float32 [3234] [15] [15] 0 ssdlite bwd 247045 57679 4.28
getitem float32 [3234 4] [2] [2 4] 0 ssdlite bwd 263477 63973 4.12
getitem float32 [3234 4] [37] [37 4] 0 ssdlite bwd 238756 62942 3.79
getitem float32 [3234] [10] [10] 0 ssdlite bwd 258229 57803 4.47
getitem float32 [3234 4] [38] [38 4] 0 ssdlite bwd 218180 61021 3.58
getitem float32 [3234 4] [41] [41 4] 0 ssdlite bwd 277029 60487 4.58
getitem float32 [3234 4] [6] [6 4] 0 ssdlite bwd 269109 58425 4.61
getitem float32 [3234 4] [44] [44 4] 0 ssdlite bwd 156131 61536 2.54
getitem float32 [3234] [22] [22] 0 ssdlite bwd 230196 59029 3.90
getitem float32 [3234] [25] [25] 0 ssdlite bwd 210116 60540 3.47
getitem float32 [3234 4] [43] [43 4] 0 ssdlite bwd 208052 58158 3.58
getitem float32 [3234] [21] [21] 0 ssdlite bwd 206996 60007 3.45
getitem float32 [3234] [27] [27] 0 ssdlite bwd 213444 61749 3.46
getitem float32 [3234] [24] [24] 0 ssdlite bwd 199588 59385 3.36
getitem float32 [3234 4] [42] [42 4] 0 ssdlite bwd 199860 55419 3.61
getitem float32 [3234 4] [3] [3 4] 0 ssdlite bwd 212100 58958 3.60
getitem float32 [3234 4] [45] [45 4] 0 ssdlite bwd 196100 56628 3.46
getitem float32 [3234] [40] [40] 0 ssdlite bwd 240628 60752 3.96
getitem float32 [1 4] [8] [8 4] 0 ssdlite bwd 177076 58885 3.01
getitem float32 [1] [13] [13] 0 ssdlite bwd 192084 60449 3.18
getitem float32 [3234] [45] [45] 0 ssdlite bwd 222484 60840 3.66
getitem float32 [1 4] [7] [7 4] 0 ssdlite bwd 189156 62013 3.05
getitem float32 [1] [7] [7] 0 ssdlite bwd 215748 61160 3.53
getitem float32 [3234] [4] [4] 0 ssdlite bwd 196884 59951 3.28
getitem float32 [3234 4] [59] [59 4] 0 ssdlite bwd 229012 61515 3.72
getitem float32 [3234] [5] [5] 0 ssdlite bwd 213780 59239 3.61
getitem float32 [2 4] [15] [15 4] 0 ssdlite bwd 188436 64929 2.90
getitem float32 [3234 4] [62] [62 4] 0 ssdlite bwd 183491 64395 2.85
getitem float32 [3234] [57] [57] 0 ssdlite bwd 182788 60642 3.01
getitem float32 [3234 4] [77] [77 4] 0 ssdlite bwd 184499 65460 2.82
getitem float32 [3234 4] [79] [79 4] 0 ssdlite bwd 163174 63558 2.57
getitem float32 [3234 4] [75] [75 4] 0 ssdlite bwd 156243 65496 2.39
getitem float32 [3234] [59] [59] 0 ssdlite bwd 148819 63007 2.36
getitem float32 [8741 4] [2000] [2000 4] 0 maskrcnn bwd 199124 62598 3.18
getitem float32 [8741] [2000] [2000] 0 maskrcnn bwd 179811 65158 2.76
getitem float32 [3234 4] [89] [89 4] 0 ssdlite bwd 139987 66776 2.10
getitem float32 [3234 4] [84] [84 4] 0 ssdlite bwd 138851 63202 2.20
getitem float32 [8741 4] [8741] [8741 4] 0 maskrcnn bwd 220772 60304 3.66
getitem float32 [8741] [8741] [8741] 0 maskrcnn bwd 183843 60731 3.03
getitem float32 [3234] [58] [58] 0 ssdlite bwd 133907 56109 2.39
getitem float32 [3 4] [20] [20 4] 0 ssdlite bwd 136386 61425 2.22
getitem float32 [3] [20] [20] 0 ssdlite bwd 131311 63095 2.08
getitem float32 [3234] [64] [64] 0 ssdlite bwd 134962 61976 2.18
getitem float32 [3234 4] [83] [83 4] 0 ssdlite bwd 138754 64713 2.14
getitem float32 [3 4] [17] [17 4] 0 ssdlite bwd 134258 63380 2.12
getitem float32 [3] [17] [17] 0 ssdlite bwd 135427 64411 2.10
getitem float32 [3234 4] [86] [86 4] 0 ssdlite bwd 132386 66793 1.98
getitem float32 [3234 4] [88] [88 4] 0 ssdlite bwd 133282 64891 2.05
getitem float32 [3234] [65] [65] 0 ssdlite bwd 137762 63113 2.18
getitem float32 [3234] [73] [73] 0 ssdlite bwd 135731 60037 2.26
getitem float32 [3234] [61] [61] 0 ssdlite bwd 128850 62882 2.05
getitem float32 [3 4] [23] [23 4] 0 ssdlite bwd 139475 64944 2.15
getitem float32 [3] [23] [23] 0 ssdlite bwd 139315 64856 2.15
getitem float32 [3 4] [21] [21 4] 0 ssdlite bwd 135043 64784 2.08
getitem float32 [3] [21] [21] 0 ssdlite bwd 135714 65869 2.06
getitem float32 [2 4] [8] [8 4] 0 ssdlite bwd 134066 61922 2.17
getitem float32 [2] [8] [8] 0 ssdlite bwd 130682 61602 2.12
getitem float32 [3 4] [28] [28 4] 0 ssdlite bwd 144947 63469 2.28
getitem float32 [3] [28] [28] 0 ssdlite bwd 139859 64429 2.17
getitem float32 [2 4] [27] [27 4] 0 ssdlite bwd 137219 70011 1.96
getitem float32 [2] [27] [27] 0 ssdlite bwd 143363 69389 2.07
getitem float32 [3 4] [22] [22 4] 0 ssdlite bwd 138963 60553 2.29
getitem float32 [3] [22] [22] 0 ssdlite bwd 132994 61691 2.16
getitem float32 [3234 4] [78] [78 4] 0 ssdlite bwd 138707 64749 2.14
getitem float32 [2 4] [28] [28 4] 0 ssdlite bwd 143283 68927 2.08
getitem float32 [2] [28] [28] 0 ssdlite bwd 140947 66758 2.11
getitem float32 [3234] [67] [67] 0 ssdlite bwd 136942 60998 2.25
getitem float32 [3234] [74] [74] 0 ssdlite bwd 140643 62331 2.26
getitem float32 [3234] [70] [70] 0 ssdlite bwd 153427 59931 2.56
getitem float32 [3 4] [26] [26 4] 0 ssdlite bwd 156683 63984 2.45
getitem float32 [3] [26] [26] 0 ssdlite bwd 161795 72767 2.22
getitem float32 [3 4] [29] [29 4] 0 ssdlite bwd 155251 61531 2.52
getitem float32 [3] [29] [29] 0 ssdlite bwd 149539 65193 2.29
getitem float32 [3 4] [18] [18 4] 0 ssdlite bwd 157059 62669 2.51
getitem float32 [3] [18] [18] 0 ssdlite bwd 153619 57139 2.69
getitem float32 [3234] [68] [68] 0 ssdlite bwd 152035 65318 2.33
getitem float32 [3234] [66] [66] 0 ssdlite bwd 150563 60997 2.47
getitem float32 [3 4] [19] [19 4] 0 ssdlite bwd 154019 63380 2.43
getitem float32 [3] [19] [19] 0 ssdlite bwd 160147 63949 2.50
getitem float32 [3234 4] [90] [90 4] 0 ssdlite bwd 152963 65335 2.34
getitem float32 [3] [25] [25] 0 ssdlite bwd 158419 58224 2.72
getitem float32 [3234 4] [87] [87 4] 0 ssdlite bwd 151971 63931 2.38
getitem float32 [3 4] [10] [10 4] 0 ssdlite bwd 147987 60162 2.46
getitem float32 [3] [10] [10] 0 ssdlite bwd 157043 57993 2.71
getitem float32 [3 4] [27] [27 4] 0 ssdlite bwd 159027 65228 2.44
getitem float32 [3] [27] [27] 0 ssdlite bwd 151155 61246 2.47
getitem float32 [3 4] [30] [30 4] 0 ssdlite bwd 158163 64891 2.44
getitem float32 [3] [30] [30] 0 ssdlite bwd 158115 64055 2.47
getitem float32 [3 4] [16] [16 4] 0 ssdlite bwd 149187 61993 2.41
getitem float32 [3] [16] [16] 0 ssdlite bwd 158019 64109 2.46
getitem float32 [3 4] [33] [33 4] 0 ssdlite bwd 162515 62242 2.61
getitem float32 [3] [33] [33] 0 ssdlite bwd 152515 64624 2.36
getitem float32 [3234 4] [81] [81 4] 0 ssdlite bwd 157987 65264 2.42
getitem float32 [3 4] [34] [34 4] 0 ssdlite bwd 159235 66651 2.39
getitem float32 [3] [34] [34] 0 ssdlite bwd 151539 65104 2.33
getitem float32 [4300 4] [4261] [4261 4] 0 fasterrcnn bwd 208548 65353 3.19
getitem float32 [4300] [4261] [4261] 0 fasterrcnn bwd 201396 65691 3.07
getitem float32 [4261 4] [1000] [1000 4] 0 fasterrcnn bwd 171427 60748 2.82
getitem float32 [4261] [1000] [1000] 0 fasterrcnn bwd 164723 62242 2.65
getitem float32 [3 4] [31] [31 4] 0 ssdlite bwd 152595 69086 2.21
getitem float32 [3] [31] [31] 0 ssdlite bwd 156531 65726 2.38
getitem float32 [2 4] [29] [29 4] 0 ssdlite bwd 159283 69442 2.29
getitem float32 [2] [29] [29] 0 ssdlite bwd 150067 66828 2.25
getitem float32 [3234] [72] [72] 0 ssdlite bwd 158467 59771 2.65
getitem float32 [3234] [69] [69] 0 ssdlite bwd 161059 61335 2.63
getitem float32 [3234 4] [94] [94 4] 0 ssdlite bwd 146515 64144 2.28
getitem float32 [3 4] [11] [11 4] 0 ssdlite bwd 155843 59166 2.63
getitem float32 [3] [11] [11] 0 ssdlite bwd 147763 60428 2.45
getitem float32 [3 4] [13] [13 4] 0 ssdlite bwd 154563 65334 2.37
getitem float32 [3] [13] [13] 0 ssdlite bwd 154659 59148 2.61
getitem float32 [3 4] [12] [12 4] 0 ssdlite bwd 149299 74989 1.99
getitem float32 [3] [12] [12] 0 ssdlite bwd 156019 59504 2.62
getitem float32 [3 4] [14] [14 4] 0 ssdlite bwd 155619 62259 2.50
getitem float32 [3] [14] [14] 0 ssdlite bwd 146131 61691 2.37
getitem float32 [3234] [71] [71] 0 ssdlite bwd 157123 59788 2.63
getitem float32 [3 4] [24] [24 4] 0 ssdlite bwd 162227 65104 2.49
getitem float32 [3] [24] [24] 0 ssdlite bwd 151939 65264 2.33
getitem float32 [3234 4] [101] [101 4] 0 ssdlite bwd 153763 65637 2.34
getitem float32 [1 4] [4] [4 4] 0 ssdlite bwd 156499 59788 2.62
getitem float32 [1] [4] [4] 0 ssdlite bwd 150691 65228 2.31
getitem float32 [1 4] [6] [6 4] 0 ssdlite bwd 151811 59717 2.54
getitem float32 [1] [6] [6] 0 ssdlite bwd 154915 61050 2.54
getitem float32 [3234 4] [85] [85 4] 0 ssdlite bwd 152451 62704 2.43
getitem float32 [3234 4] [92] [92 4] 0 ssdlite bwd 152867 61762 2.48
getitem float32 [4 4] [4] [4 4] 0 ssdlite bwd 151299 61139 2.47
getitem float32 [4] [4] [4] 0 ssdlite bwd 152755 59593 2.56
getitem float32 [3234] [76] [76] 0 ssdlite bwd 152019 60268 2.52
getitem float32 [3 4] [15] [15 4] 0 ssdlite bwd 141939 63894 2.22
getitem float32 [3] [15] [15] 0 ssdlite bwd 146403 61655 2.37
getitem float32 [3234] [80] [80] 0 ssdlite bwd 148051 60446 2.45
getitem float32 [3 4] [32] [32 4] 0 ssdlite bwd 139203 66419 2.10
getitem float32 [3] [32] [32] 0 ssdlite bwd 136946 67166 2.04
getitem float32 [3234 4] [91] [91 4] 0 ssdlite bwd 142242 66508 2.14
getitem float32 [4300 4] [4262] [4262 4] 0 fasterrcnn bwd 189828 63521 2.99
getitem float32 [4300] [4262] [4262] 0 fasterrcnn bwd 191028 62793 3.04
getitem float32 [4262 4] [1000] [1000 4] 0 fasterrcnn bwd 144659 62153 2.33
getitem float32 [4262] [1000] [1000] 0 fasterrcnn bwd 149171 62313 2.39
getitem float32 [1] [5] [5] 0 maskrcnn_ssdlite bwd 135635 61388 2.21
getitem float32 [4 4] [27] [27 4] 0 ssdlite bwd 135778 65353 2.08
getitem float32 [4] [27] [27] 0 ssdlite bwd 136578 64268 2.13
getitem float32 [1 4] [5] [5 4] 0 ssdlite bwd 136754 59699 2.29
getitem float32 [4 4] [28] [28 4] 0 ssdlite bwd 133634 64890 2.06
getitem float32 [4] [28] [28] 0 ssdlite bwd 136627 65299 2.09
getitem float32 [4300 4] [4194] [4194 4] 0 fasterrcnn bwd 192116 63984 3.00
getitem float32 [4300] [4194] [4194] 0 fasterrcnn bwd 190356 63344 3.01
getitem float32 [4194 4] [1000] [1000 4] 0 fasterrcnn bwd 159907 63450 2.52
getitem float32 [2 4] [30] [30 4] 0 ssdlite bwd 152867 73762 2.07
getitem float32 [2] [30] [30] 0 ssdlite bwd 159491 68517 2.33
getitem float32 [4 4] [33] [33 4] 0 ssdlite bwd 159107 65477 2.43
getitem float32 [4] [33] [33] 0 ssdlite bwd 153363 64997 2.36
getitem float32 [3234] [77] [77] 0 ssdlite bwd 155235 60232 2.58
getitem float32 [3234 4] [93] [93 4] 0 ssdlite bwd 156259 60179 2.60
getitem float32 [3234 4] [105] [105 4] 0 ssdlite bwd 146979 65922 2.23
getitem float32 [3234 4] [113] [113 4] 0 ssdlite bwd 155555 64197 2.42
getitem float32 [4194] [1000] [1000] 0 fasterrcnn bwd 165587 61833 2.68
getitem float32 [3 4] [37] [37 4] 0 ssdlite bwd 145331 71859 2.02
getitem float32 [3] [37] [37] 0 ssdlite bwd 151763 61808 2.46
getitem float32 [4 4] [37] [37 4] 0 ssdlite bwd 148531 63763 2.33
getitem float32 [4] [37] [37] 0 ssdlite bwd 141762 63835 2.22
getitem float32 [2 4] [7] [7 4] 0 ssdlite bwd 139202 61222 2.27
getitem float32 [2] [7] [7] 0 ssdlite bwd 138098 60439 2.28
getitem float32 [4 4] [18] [18 4] 0 ssdlite bwd 133170 62715 2.12
getitem float32 [4] [18] [18] 0 ssdlite bwd 135507 68119 1.99
getitem float32 [3 4] [9] [9 4] 0 ssdlite bwd 142835 62040 2.30
getitem float32 [3] [9] [9] 0 ssdlite bwd 139107 55018 2.53
getitem float32 [3234] [79] [79] 0 ssdlite bwd 134882 59729 2.26
getitem float32 [4 4] [30] [30 4] 0 ssdlite bwd 139651 62466 2.24
getitem float32 [4] [30] [30] 0 ssdlite bwd 135970 64244 2.12
getitem float32 [3234] [75] [75] 0 ssdlite bwd 135858 62840 2.16
getitem float32 [3 4] [35] [35 4] 0 ssdlite bwd 142610 69612 2.05
getitem float32 [4 4] [39] [39 4] 0 ssdlite bwd 190996 57118 3.34
getitem float32 [8] [2008] [2008] 0 maskrcnn bwd 260261 131362 1.98
getitem float32 [5] [14] [14] 0 maskrcnn_ssdlite bwd 169587 60871 2.79
getitem float32 [8 4] [8] [8 4] 0 ssdlite bwd 174323 63111 2.76
getitem float32 [4197 4] [1000] [1000 4] 0 fasterrcnn bwd 178787 62987 2.84
getitem float32 [4197] [1000] [1000] 0 fasterrcnn bwd 169379 63964 2.65
getitem float32 [16] [2016] [2016] 0 maskrcnn bwd 234628 140481 1.67
getitem float32 [7 4] [33] [33 4] 0 ssdlite bwd 160179 64782 2.47
getitem bfloat16
op_name dtype dx_size index_size dy_size dim model dir ROCm PyTorch (op time) MIOpen HIP (op time) Improvement (x)
getitem bfloat16 [128 128] [128] [128 128] 0 llama2 bwd 253429 65687 3.86
getitem bfloat16 [3234 4] [16] [16 4] 0 ssdlite bwd 242037 61545 3.93
getitem bfloat16 [3234 4] [12] [12 4] 0 ssdlite bwd 240565 61848 3.89
getitem bfloat16 [3234 4] [11] [11 4] 0 ssdlite bwd 231941 58150 3.99
getitem bfloat16 [3234 4] [13] [13 4] 0 ssdlite bwd 253237 61297 4.13
getitem bfloat16 [3234 4] [14] [14 4] 0 ssdlite bwd 226148 59874 3.78
getitem bfloat16 [3234 4] [15] [15 4] 0 ssdlite bwd 226548 58327 3.88
getitem bfloat16 [3234 4] [10] [10 4] 0 ssdlite bwd 248805 58737 4.24
getitem bfloat16 [3234 4] [18] [18 4] 0 ssdlite bwd 242149 61439 3.94
getitem bfloat16 [3234 4] [20] [20 4] 0 ssdlite bwd 213460 59003 3.62
getitem bfloat16 [3234 4] [23] [23 4] 0 ssdlite bwd 229716 55039 4.17
getitem bfloat16 [3234 4] [8] [8 4] 0 ssdlite bwd 225156 61190 3.68
getitem bfloat16 [3234 4] [22] [22 4] 0 ssdlite bwd 209716 62719 3.34
getitem bfloat16 [3234 4] [25] [25 4] 0 ssdlite bwd 219940 59003 3.73
getitem bfloat16 [3234 4] [7] [7 4] 0 ssdlite bwd 221348 59572 3.72
getitem bfloat16 [3234 4] [28] [28 4] 0 ssdlite bwd 217460 61937 3.51
getitem bfloat16 [3234 4] [30] [30 4] 0 ssdlite bwd 197748 55857 3.54
getitem bfloat16 [3234 4] [31] [31 4] 0 ssdlite bwd 222724 61101 3.65
getitem bfloat16 [3234 4] [32] [32 4] 0 ssdlite bwd 221268 61457 3.60
getitem bfloat16 [3234 4] [33] [33 4] 0 ssdlite bwd 209668 59768 3.51
getitem bfloat16 [3234 4] [34] [34 4] 0 ssdlite bwd 214500 64959 3.30
getitem bfloat16 [3234] [16] [16] 0 ssdlite bwd 220367 60425 3.65
getitem bfloat16 [149 128] [1490] [1490 128] 0 llama2_7b bwd 293670 77563 3.79
getitem bfloat16 [150 128] [10] [10 128] 0 llama2_7b bwd 223060 62665 3.56
getitem bfloat16 [174 128] [10] [10 128] 0 llama2_7b bwd 261013 62523 4.17
getitem bfloat16 [205 128] [10] [10 128] 0 llama2_7b bwd 257141 57576 4.47
getitem bfloat16 [232 128] [10] [10 128] 0 llama2_7b bwd 254901 63086 4.04
getitem bfloat16 [248 128] [10] [10 128] 0 llama2_7b bwd 265013 61751 4.29
getitem bfloat16 [3234] [15] [15] 0 ssdlite bwd 263381 59564 4.42
getitem bfloat16 [3234 4] [2] [2 4] 0 ssdlite bwd 219956 60399 3.64
getitem bfloat16 [3234 4] [37] [37 4] 0 ssdlite bwd 242933 61128 3.97
getitem bfloat16 [3234] [10] [10] 0 ssdlite bwd 272741 57821 4.72
getitem bfloat16 [3234 4] [38] [38 4] 0 ssdlite bwd 192020 59545 3.22
getitem bfloat16 [3234 4] [41] [41 4] 0 ssdlite bwd 226452 61607 3.68
getitem bfloat16 [3234 4] [6] [6 4] 0 ssdlite bwd 178259 58283 3.06
getitem bfloat16 [3234 4] [44] [44 4] 0 ssdlite bwd 247812 56842 4.36
getitem bfloat16 [3234] [22] [22] 0 ssdlite bwd 254565 60949 4.18
getitem bfloat16 [3234] [25] [25] 0 ssdlite bwd 188339 58140 3.24
getitem bfloat16 [3234 4] [43] [43 4] 0 ssdlite bwd 223300 62603 3.57
getitem bfloat16 [3234] [21] [21] 0 ssdlite bwd 222628 61553 3.62
getitem bfloat16 [3234] [27] [27] 0 ssdlite bwd 188820 59296 3.18
getitem bfloat16 [3234] [24] [24] 0 ssdlite bwd 209604 60096 3.49
getitem bfloat16 [3234 4] [42] [42 4] 0 ssdlite bwd 212964 62051 3.43
getitem bfloat16 [3234 4] [3] [3 4] 0 ssdlite bwd 181732 58051 3.13
getitem bfloat16 [3234 4] [45] [45 4] 0 ssdlite bwd 206852 62442 3.31
getitem bfloat16 [3234] [40] [40] 0 ssdlite bwd 183395 58120 3.16
getitem bfloat16 [1 4] [8] [8 4] 0 ssdlite bwd 211396 65321 3.24
getitem bfloat16 [1] [13] [13] 0 ssdlite bwd 218180 61498 3.55
getitem bfloat16 [3234] [45] [45] 0 ssdlite bwd 192628 62778 3.07
getitem bfloat16 [1 4] [7] [7 4] 0 ssdlite bwd 223796 58297 3.84
getitem bfloat16 [1] [7] [7] 0 ssdlite bwd 220084 58315 3.77
getitem bfloat16 [3234] [4] [4] 0 ssdlite bwd 221716 60378 3.67
getitem bfloat16 [3234 4] [59] [59 4] 0 ssdlite bwd 194932 60751 3.21
getitem bfloat16 [3234] [5] [5] 0 ssdlite bwd 200388 59293 3.38
getitem bfloat16 [2 4] [15] [15 4] 0 ssdlite bwd 216948 61231 3.54
getitem bfloat16 [3234 4] [62] [62 4] 0 ssdlite bwd 216724 62369 3.47
getitem bfloat16 [3234] [57] [57] 0 ssdlite bwd 185412 61531 3.01
getitem bfloat16 [3234 4] [77] [77 4] 0 ssdlite bwd 178740 65425 2.73
getitem bfloat16 [3234 4] [79] [79 4] 0 ssdlite bwd 152915 66118 2.31
getitem bfloat16 [3234 4] [75] [75 4] 0 ssdlite bwd 151603 64020 2.37
getitem bfloat16 [3234] [59] [59] 0 ssdlite bwd 157059 59842 2.62
getitem bfloat16 [8741 4] [2000] [2000 4] 0 maskrcnn bwd 216740 58811 3.69
getitem bfloat16 [8741] [2000] [2000] 0 maskrcnn bwd 186244 64305 2.90
getitem bfloat16 [3234 4] [89] [89 4] 0 ssdlite bwd 142355 65140 2.19
getitem bfloat16 [3234 4] [84] [84 4] 0 ssdlite bwd 130803 64144 2.04
getitem bfloat16 [8741 4] [8741] [8741 4] 0 maskrcnn bwd 234148 58473 4.00
getitem bfloat16 [8741] [8741] [8741] 0 maskrcnn bwd 191907 59646 3.22
getitem bfloat16 [3234] [58] [58] 0 ssdlite bwd 133955 61389 2.18
getitem bfloat16 [3 4] [20] [20 4] 0 ssdlite bwd 134866 56962 2.37
getitem bfloat16 [3] [20] [20] 0 ssdlite bwd 136547 69478 1.97
getitem bfloat16 [3234] [64] [64] 0 ssdlite bwd 134850 58082 2.32
getitem bfloat16 [3234 4] [83] [83 4] 0 ssdlite bwd 142067 58580 2.43
getitem bfloat16 [3 4] [17] [17 4] 0 ssdlite bwd 136002 65371 2.08
getitem bfloat16 [3] [17] [17] 0 ssdlite bwd 133619 61015 2.19
getitem bfloat16 [3234 4] [86] [86 4] 0 ssdlite bwd 135385 64376 2.10
getitem bfloat16 [3234 4] [88] [88 4] 0 ssdlite bwd 136114 65656 2.07
getitem bfloat16 [3234] [65] [65] 0 ssdlite bwd 134338 60838 2.21
getitem bfloat16 [3234] [73] [73] 0 ssdlite bwd 136867 55646 2.46
getitem bfloat16 [3234] [61] [61] 0 ssdlite bwd 137171 60642 2.26
getitem bfloat16 [3 4] [23] [23 4] 0 ssdlite bwd 133922 65051 2.06
getitem bfloat16 [3] [23] [23] 0 ssdlite bwd 131427 68642 1.91
getitem bfloat16 [3 4] [21] [21 4] 0 ssdlite bwd 137427 59860 2.30
getitem bfloat16 [3] [21] [21] 0 ssdlite bwd 135763 63362 2.14
getitem bfloat16 [2 4] [8] [8 4] 0 ssdlite bwd 128722 63077 2.04
getitem bfloat16 [2] [8] [8] 0 ssdlite bwd 137171 61442 2.23
getitem bfloat16 [3 4] [28] [28 4] 0 ssdlite bwd 137922 60855 2.27
getitem bfloat16 [3] [28] [28] 0 ssdlite bwd 134530 63736 2.11
getitem bfloat16 [2 4] [27] [27 4] 0 ssdlite bwd 142290 69122 2.06
getitem bfloat16 [2] [27] [27] 0 ssdlite bwd 140739 66473 2.12
getitem bfloat16 [3 4] [22] [22 4] 0 ssdlite bwd 132035 62989 2.10
getitem bfloat16 [3] [22] [22] 0 ssdlite bwd 138658 61851 2.24
getitem bfloat16 [3234 4] [78] [78 4] 0 ssdlite bwd 134595 65264 2.06
getitem bfloat16 [2 4] [28] [28 4] 0 ssdlite bwd 137602 71238 1.93
getitem bfloat16 [2] [28] [28] 0 ssdlite bwd 144419 66811 2.16
getitem bfloat16 [3234] [67] [67] 0 ssdlite bwd 136675 56180 2.43
getitem bfloat16 [3234] [74] [74] 0 ssdlite bwd 133058 61068 2.18
getitem bfloat16 [3234] [70] [70] 0 ssdlite bwd 156723 60891 2.57
getitem bfloat16 [3 4] [26] [26 4] 0 ssdlite bwd 159459 68518 2.33
getitem bfloat16 [3] [26] [26] 0 ssdlite bwd 150451 65175 2.31
getitem bfloat16 [3 4] [29] [29 4] 0 ssdlite bwd 155683 65567 2.37
getitem bfloat16 [3] [29] [29] 0 ssdlite bwd 156915 64464 2.43
getitem bfloat16 [3 4] [18] [18 4] 0 ssdlite bwd 150467 64304 2.34
getitem bfloat16 [3] [18] [18] 0 ssdlite bwd 150723 61620 2.45
getitem bfloat16 [3234] [68] [68] 0 ssdlite bwd 154467 62722 2.46
getitem bfloat16 [3234] [66] [66] 0 ssdlite bwd 155315 59059 2.63
getitem bfloat16 [3 4] [19] [19 4] 0 ssdlite bwd 156019 62295 2.50
getitem bfloat16 [3] [19] [19] 0 ssdlite bwd 159667 63237 2.52
getitem bfloat16 [3234 4] [90] [90 4] 0 ssdlite bwd 160435 64589 2.48
getitem bfloat16 [3] [25] [25] 0 ssdlite bwd 149315 58011 2.57
getitem bfloat16 [3234 4] [87] [87 4] 0 ssdlite bwd 157955 62882 2.51
getitem bfloat16 [3 4] [10] [10 4] 0 ssdlite bwd 154723 59468 2.60
getitem bfloat16 [3] [10] [10] 0 ssdlite bwd 151491 59593 2.54
getitem bfloat16 [3 4] [27] [27 4] 0 ssdlite bwd 148339 65778 2.26
getitem bfloat16 [3] [27] [27] 0 ssdlite bwd 159731 63664 2.51
getitem bfloat16 [3 4] [30] [30 4] 0 ssdlite bwd 152787 65140 2.35
getitem bfloat16 [3] [30] [30] 0 ssdlite bwd 150835 64837 2.33
getitem bfloat16 [3 4] [16] [16 4] 0 ssdlite bwd 158339 62882 2.52
getitem bfloat16 [3] [16] [16] 0 ssdlite bwd 157603 63451 2.48
getitem bfloat16 [3 4] [33] [33 4] 0 ssdlite bwd 151475 66651 2.27
getitem bfloat16 [3] [33] [33] 0 ssdlite bwd 161203 64962 2.48
getitem bfloat16 [3234 4] [81] [81 4] 0 ssdlite bwd 152003 64820 2.35
getitem bfloat16 [3 4] [34] [34 4] 0 ssdlite bwd 148995 68002 2.19
getitem bfloat16 [3] [34] [34] 0 ssdlite bwd 162515 62988 2.58
getitem bfloat16 [4300 4] [4261] [4261 4] 0 fasterrcnn bwd 202516 61833 3.28
getitem bfloat16 [4300] [4261] [4261] 0 fasterrcnn bwd 209684 63148 3.32
getitem bfloat16 [4261 4] [1000] [1000 4] 0 fasterrcnn bwd 162931 62597 2.60
getitem bfloat16 [4261] [1000] [1000] 0 fasterrcnn bwd 182643 61957 2.95
getitem bfloat16 [3 4] [31] [31 4] 0 ssdlite bwd 159603 65833 2.42
getitem bfloat16 [3] [31] [31] 0 ssdlite bwd 154259 64197 2.40
getitem bfloat16 [2 4] [29] [29 4] 0 ssdlite bwd 152947 67006 2.28
getitem bfloat16 [2] [29] [29] 0 ssdlite bwd 159267 68553 2.32
getitem bfloat16 [3234] [72] [72] 0 ssdlite bwd 157891 57921 2.73
getitem bfloat16 [3234] [69] [69] 0 ssdlite bwd 149843 62028 2.42
getitem bfloat16 [3234 4] [94] [94 4] 0 ssdlite bwd 157171 63664 2.47
getitem bfloat16 [3 4] [11] [11 4] 0 ssdlite bwd 151475 63486 2.39
getitem bfloat16 [3] [11] [11] 0 ssdlite bwd 158163 56179 2.82
getitem bfloat16 [3 4] [13] [13 4] 0 ssdlite bwd 152067 63557 2.39
getitem bfloat16 [3] [13] [13] 0 ssdlite bwd 148723 60499 2.46
getitem bfloat16 [3 4] [12] [12 4] 0 ssdlite bwd 159027 58775 2.71
getitem bfloat16 [3] [12] [12] 0 ssdlite bwd 152643 60002 2.54
getitem bfloat16 [3 4] [14] [14 4] 0 ssdlite bwd 149699 63219 2.37
getitem bfloat16 [3] [14] [14] 0 ssdlite bwd 155843 61033 2.55
getitem bfloat16 [3234] [71] [71] 0 ssdlite bwd 159891 61744 2.59
getitem bfloat16 [3 4] [24] [24 4] 0 ssdlite bwd 152115 66793 2.28
getitem bfloat16 [3] [24] [24] 0 ssdlite bwd 158819 64766 2.45
getitem bfloat16 [3234 4] [101] [101 4] 0 ssdlite bwd 163683 63220 2.59
getitem bfloat16 [1 4] [4] [4 4] 0 ssdlite bwd 147381 73549 2.00
getitem bfloat16 [1] [4] [4] 0 ssdlite bwd 155971 61602 2.53
getitem bfloat16 [1 4] [6] [6 4] 0 ssdlite bwd 154931 60766 2.55
getitem bfloat16 [1] [6] [6] 0 ssdlite bwd 150259 56659 2.65
getitem bfloat16 [3234 4] [85] [85 4] 0 ssdlite bwd 157619 65602 2.40
getitem bfloat16 [3234 4] [92] [92 4] 0 ssdlite bwd 156451 63166 2.48
getitem bfloat16 [4 4] [4] [4 4] 0 ssdlite bwd 158499 62632 2.53
getitem bfloat16 [4] [4] [4] 0 ssdlite bwd 155651 60588 2.57
getitem bfloat16 [3234] [76] [76] 0 ssdlite bwd 140515 62064 2.26
getitem bfloat16 [3 4] [15] [15 4] 0 ssdlite bwd 146547 62970 2.33
getitem bfloat16 [3] [15] [15] 0 ssdlite bwd 144627 62028 2.33
getitem bfloat16 [3234] [80] [80] 0 ssdlite bwd 139490 61406 2.27
getitem bfloat16 [3 4] [32] [32 4] 0 ssdlite bwd 143187 66135 2.17
getitem bfloat16 [3] [32] [32] 0 ssdlite bwd 139154 75397 1.85
getitem bfloat16 [3234 4] [91] [91 4] 0 ssdlite bwd 130467 64624 2.02
getitem bfloat16 [4300 4] [4262] [4262 4] 0 fasterrcnn bwd 191188 59930 3.19
getitem bfloat16 [4300] [4262] [4262] 0 fasterrcnn bwd 189482 58099 3.26
getitem bfloat16 [4262 4] [1000] [1000 4] 0 fasterrcnn bwd 149603 64677 2.31
getitem bfloat16 [4262] [1000] [1000] 0 fasterrcnn bwd 144979 61282 2.37
getitem bfloat16 [1] [5] [5] 0 maskrcnn_ssdlite bwd 130802 61833 2.12
getitem bfloat16 [4 4] [27] [27 4] 0 ssdlite bwd 136306 61086 2.23
getitem bfloat16 [4] [27] [27] 0 ssdlite bwd 136626 61175 2.23
getitem bfloat16 [1 4] [5] [5 4] 0 ssdlite bwd 130178 60037 2.17
getitem bfloat16 [4 4] [28] [28 4] 0 ssdlite bwd 137586 64624 2.13
getitem bfloat16 [4] [28] [28] 0 ssdlite bwd 138771 64446 2.15
getitem bfloat16 [4300 4] [4194] [4194 4] 0 fasterrcnn bwd 188211 60730 3.10
getitem bfloat16 [4300] [4194] [4194] 0 fasterrcnn bwd 203940 60642 3.36
getitem bfloat16 [4194 4] [1000] [1000 4] 0 fasterrcnn bwd 164147 60712 2.70
getitem bfloat16 [2 4] [30] [30 4] 0 ssdlite bwd 160111 67682 2.37
getitem bfloat16 [2] [30] [30] 0 ssdlite bwd 156755 77442 2.02
getitem bfloat16 [4 4] [33] [33 4] 0 ssdlite bwd 150579 63593 2.37
getitem bfloat16 [4] [33] [33] 0 ssdlite bwd 157987 57797 2.73
getitem bfloat16 [3234] [77] [77] 0 ssdlite bwd 154387 59539 2.59
getitem bfloat16 [3234 4] [93] [93 4] 0 ssdlite bwd 145795 65068 2.24
getitem bfloat16 [3234 4] [105] [105 4] 0 ssdlite bwd 157283 65033 2.42
getitem bfloat16 [3234 4] [113] [113 4] 0 ssdlite bwd 154243 65442 2.36
getitem bfloat16 [4194] [1000] [1000] 0 fasterrcnn bwd 156547 62828 2.49
getitem bfloat16 [3 4] [37] [37 4] 0 ssdlite bwd 143443 69866 2.05
getitem bfloat16 [3] [37] [37] 0 ssdlite bwd 152179 66252 2.30
getitem bfloat16 [4 4] [37] [37 4] 0 ssdlite bwd 139618 63674 2.19
getitem bfloat16 [4] [37] [37] 0 ssdlite bwd 149107 69843 2.13
getitem bfloat16 [2 4] [7] [7 4] 0 ssdlite bwd 139939 61844 2.26
getitem bfloat16 [2] [7] [7] 0 ssdlite bwd 129730 60244 2.15
getitem bfloat16 [4 4] [18] [18 4] 0 ssdlite bwd 137235 62271 2.20
getitem bfloat16 [4] [18] [18] 0 ssdlite bwd 138933 62377 2.23
getitem bfloat16 [3 4] [9] [9 4] 0 ssdlite bwd 137107 60457 2.27
getitem bfloat16 [3] [9] [9] 0 ssdlite bwd 136803 59426 2.30
getitem bfloat16 [3234] [79] [79] 0 ssdlite bwd 142323 61986 2.30
getitem bfloat16 [4 4] [30] [30 4] 0 ssdlite bwd 131635 65079 2.02
getitem bfloat16 [4] [30] [30] 0 ssdlite bwd 135666 64155 2.11
getitem bfloat16 [3234] [75] [75] 0 ssdlite bwd 137907 63124 2.18
getitem bfloat16 [3 4] [35] [35 4] 0 ssdlite bwd 135746 68813 1.97
getitem bfloat16 [4 4] [39] [39 4] 0 ssdlite bwd 182899 69011 2.65
getitem bfloat16 [8] [2008] [2008] 0 maskrcnn bwd 264645 178237 1.48
getitem bfloat16 [5] [14] [14] 0 maskrcnn_ssdlite bwd 158099 61244 2.58
getitem bfloat16 [8 4] [8] [8 4] 0 ssdlite bwd 187412 60266 3.11
getitem bfloat16 [4197 4] [1000] [1000 4] 0 fasterrcnn bwd 161043 56605 2.85
getitem bfloat16 [4197] [1000] [1000] 0 fasterrcnn bwd 176067 62755 2.81
getitem bfloat16 [16] [2016] [2016] 0 maskrcnn bwd 238645 125743 1.90
getitem bfloat16 [7 4] [33] [33 4] 0 ssdlite bwd 164963 63858 2.58
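
For readers unfamiliar with the operation being benchmarked: the backward pass of y = x[index] scatters each row of dy back into dx at the row selected by the index, accumulating whenever an index repeats (cases such as dx [3 4] with index [20] necessarily repeat rows). The sketch below is a CPU reference for the single-index, dim 0 case only. The function name and the row-major layout are assumptions for illustration; this is not the MIOpen HIP kernel from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference sketch (hypothetical helper, not MIOpen code):
// gradient of y = x[index] for one index tensor applied along dim 0.
// dx has shape [rows, cols]; dy has shape [index.size(), cols]; both row-major.
// Assumption: indices are non-negative and already within [0, rows).
std::vector<float> getitem_backward_dim0(const std::vector<float>& dy,
                                         const std::vector<int>& index,
                                         std::size_t rows,
                                         std::size_t cols)
{
    assert(dy.size() == index.size() * cols);
    std::vector<float> dx(rows * cols, 0.0f);
    for(std::size_t i = 0; i < index.size(); ++i)
    {
        assert(index[i] >= 0 && static_cast<std::size_t>(index[i]) < rows);
        const std::size_t r = static_cast<std::size_t>(index[i]);
        for(std::size_t c = 0; c < cols; ++c)
            dx[r * cols + c] += dy[i * cols + c]; // accumulate on repeated rows
    }
    return dx;
}
```

Rows of dx that no index selects stay zero, and rows selected more than once receive the sum of the matching dy rows. A GPU version would need atomics or a similar scheme to handle those repeated rows safely.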
  • Average over all cases
Op Type geomean
getitem_backward float16 2.80
getitem_backward float32 2.68
getitem_backward bfloat16 2.68
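
As a rough illustration of how a summary like this can be derived: the Improvement column in the rows above matches the ROCm pytorch time divided by the MIOpen HIP time, and a geomean is the exponential of the mean log speedup. The helper names below are hypothetical, and this is only a sketch of the arithmetic, not the script used to produce the numbers in this PR.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Speedup in the sense of the "Improvement" column: baseline time / new time.
double speedup(double baseline_time, double new_time)
{
    assert(new_time > 0.0);
    return baseline_time / new_time;
}

// Geometric mean of per-case speedups, computed in log space so that
// a long list of ratios does not overflow or underflow when multiplied.
double geomean(const std::vector<double>& speedups)
{
    assert(!speedups.empty());
    double log_sum = 0.0;
    for(double s : speedups)
    {
        assert(s > 0.0);
        log_sum += std::log(s);
    }
    return std::exp(log_sum / static_cast<double>(speedups.size()));
}
```

For example, speedup(220367, 60425) gives about 3.65, which agrees with the first bfloat16 ssdlite row shown in this part of the table. The geometric mean is the usual choice for averaging ratios because a 2x gain and a 2x loss cancel out, which an arithmetic mean would not reflect.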

@seungmanhan
Collaborator Author

The driver is being modified while resolving merge conflicts.

@seungmanhan seungmanhan requested a review from a team as a code owner April 15, 2024 05:56
@seungmanhan
Collaborator Author

@junliume Even after a clean build and make check, I cannot determine the cause of the failure in Jenkins - HIP Package. Could you please check the reason for the failure?

@seungmanhan
Collaborator Author

@junliume Can you confirm what is being tested in 'Jenkins - Fp32 Hip All gfx90a' or send me the script?

@junliume
Collaborator

@seungmanhan @apwojcik
The Windows build is failing while the Linux side is passing:

lld-link: error: undefined symbol: enum miopenStatus_t __cdecl miopen::GetitemBackward(struct miopen::Handle &, void *, unsigned __int64, struct miopen::TensorDescriptor const &, void const *, unsigned int, struct miopen::TensorDescriptor const *const *, void const **, struct miopen::TensorDescriptor const &, void *, struct miopen::TensorDescriptor const &, void *, unsigned int, int const *, unsigned int, int const *, unsigned int)

Can we either (1) provide this feature on Linux only and guard it accordingly, or (2) support it on Windows? @apwojcik, is there any way to do the latter?

@apwojcik
Collaborator

> @seungmanhan @apwojcik The Windows build is failing while the Linux side is passing:
>
> lld-link: error: undefined symbol: enum miopenStatus_t __cdecl miopen::GetitemBackward(struct miopen::Handle &, void *, unsigned __int64, struct miopen::TensorDescriptor const &, void const *, unsigned int, struct miopen::TensorDescriptor const *const *, void const **, struct miopen::TensorDescriptor const &, void *, struct miopen::TensorDescriptor const &, void *, unsigned int, int const *, unsigned int, int const *, unsigned int)
>
> Can we either (1) provide this feature on Linux only and guard it accordingly, or (2) support it on Windows? @apwojcik, is there any way to do the latter?

Please export the missing (undefined) symbols with MIOPEN_INTERNALS_EXPORT.
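
For context on why this can fail only on Windows: symbols in a DLL are not exported unless they are marked for export (or listed in a .def file), whereas a Linux shared library built with default visibility exposes them automatically. A test or driver that calls an internal function directly therefore links on Linux but hits an undefined symbol with lld-link. The sketch below shows the general export-macro pattern. DEMO_EXPORT, DEMO_BUILDING_LIBRARY, and BackwardEntryPoint are hypothetical names for illustration and are not MIOpen's actual definitions.

```cpp
// Normally set by the build system only while compiling the library itself.
#define DEMO_BUILDING_LIBRARY 1

#if defined(_WIN32)
#if defined(DEMO_BUILDING_LIBRARY)
#define DEMO_EXPORT __declspec(dllexport) // put the symbol in the DLL export table
#else
#define DEMO_EXPORT __declspec(dllimport) // consumer side: import from the DLL
#endif
#else
#define DEMO_EXPORT __attribute__((visibility("default"))) // GCC/Clang
#endif

namespace demo {

// Declaration as it would appear in a header shared with tests or a driver.
DEMO_EXPORT int BackwardEntryPoint(int value);

// Definition inside the library; it inherits the export attribute
// from the declaration above.
int BackwardEntryPoint(int value) { return value + 1; }

} // namespace demo
```

With a macro of this kind on the declaration, the function becomes part of the library's exported interface on Windows while remaining visible on Linux even when the library is built with hidden default visibility.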

@junliume junliume merged commit 57f4b95 into develop Jul 24, 2024
141 checks passed
@junliume junliume deleted the impl_getitem_bwd branch July 24, 2024 17:21