This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
Tell gcc this sizeof division is intended (-Wsizeof-array-div) #418
Merged
LGTM, I'll start testing soon.
alliepiper added the "type: bug: functional" (Does not work as intended.) and "P1: should have" (Necessary, but not critical.) labels on Jan 13, 2022
gpuCI: NVIDIA/thrust#1590
alliepiper added the "testing: internal ci in progress" (Currently testing on internal NVIDIA CI (DVS).) label on Jan 17, 2022
The warning is still emitted on gcc 11. I also tried just adding parens around the
I have another workaround incoming.
The compiler suggests putting parentheses around the division to suppress the warning, but this isn't working on gcc 11. Refactoring this out a bit works though.
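As an illustration of the kind of refactoring described above, here is a minimal sketch. It is not the code changed in this PR: the `Storage` type, the constant names, and `halfword_slots()` are hypothetical. It assumes the GCC 11 diagnostic is triggered by a division written directly between two `sizeof` expressions where the first operand is an array and the divisor is not the size of the array's element type, so naming each size first and dividing the named constants should avoid it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical storage type for illustration only (not taken from CUB).
struct Storage {
    std::uint32_t words[8];
};

// Writing the division directly, for example
//     sizeof(Storage::words) / sizeof(std::uint16_t)
// is the pattern -Wsizeof-array-div is documented to flag, because the
// divisor is not the size of the array's element type (std::uint32_t).

// Assumed workaround: give each size a name, then divide the names.
constexpr std::size_t storage_bytes  = sizeof(Storage::words);
constexpr std::size_t halfword_bytes = sizeof(std::uint16_t);

// Number of 16-bit slots that fit in the 32-bit word array.
constexpr std::size_t halfword_slots() {
    return storage_bytes / halfword_bytes;
}

// Eight 32-bit words hold sixteen 16-bit values.
static_assert(halfword_slots() == 16, "unexpected slot count");
```

The division here is intentional: the array is being reinterpreted as smaller items, so the result is a capacity, not an element count. Whether a particular compiler version still warns on a given form is best confirmed by compiling with `-Wall` on that version.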
mfep pushed a commit to ROCm/hipCUB that referenced this pull request on Mar 16, 2022
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this pull request on Apr 1, 2022
This PR updates the version of Thrust from 1.15 to 1.16 ([changelog](https://github.com/NVIDIA/thrust/blob/main/CHANGELOG.md#thrust-1160)). This update is needed to fix compilation with GCC 11, because of some warnings-as-errors present in Thrust 1.15 with GCC 11 (such as this one from Thrust's copy of cub: https://github.com/NVIDIA/cub/pull/418).

Notably, Thrust reduced the number of internal header inclusions:

> [#1572](https://github.com/NVIDIA/thrust/pull/1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

This change illuminated many missing includes in libcudf, so I added `#include <thrust/...>` for all thrust features used in each file (with help from a Python script).

I included raw benchmarks that I recorded below.

<details> <summary>Benchmarks:</summary> ``` Benchmark Time CPU Time Old Time New CPU Old CPU New -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- CopyIfElse/int16_no_nulls/4096/manual_time +0.0581 +0.0307 0 0 0 0 CopyIfElse/uint32_no_nulls/4096/manual_time +0.1308 +0.0463 0 0 0 0 CopyIfElse/uint32_no_nulls/32768/manual_time +0.1043 +0.0485 0 0 0 0 CopyIfElse/float64_no_nulls/4096/manual_time +0.0894 +0.0422 0 0 0 0 StringDateTime/from_days/32768/manual_time +0.0529 +0.0491 93 98 112 118 StringDateTime/to_days/1024/manual_time +0.0596 +0.0493 35 37 54 57 StringDateTime/to_days/32768/manual_time +0.0547 +0.0460 37 39 55 58 StringToDurations/to_durations_ms/1024/manual_time +0.0516 +0.0426 30 31 49 51 StringToDurations/to_durations_ms/32768/manual_time +0.0542 +0.0506 32 34 52 55 StringToDurations/to_durations_us/32768/manual_time +0.0520 +0.0440 32 34 52 55 StringsFromFixedPoint/strings_from_decimal64/16384/manual_time +0.0530 +0.0508 94 99 113 119 StringsToNumeric/strings_to_float32/1024/manual_time 
+0.0521 +0.0451 31 32 50 52 StringsToNumeric/strings_to_float64/16384/manual_time +0.0517 +0.0437 32 34 51 53 StringsToNumeric/strings_to_float64/65536/manual_time +0.0505 +0.0496 35 36 53 56 StringsToNumeric/strings_to_uint8/4096/manual_time +0.0559 +0.0466 24 25 43 45 StringsToNumeric/strings_to_uint8/65536/manual_time +0.0563 +0.0458 26 27 44 46 StringCopy/gather/4096/32/manual_time +0.0652 +0.0574 0 0 0 0 StringCopy/gather/4096/128/manual_time +0.0706 +0.0615 0 0 0 0 StringCopy/gather/4096/512/manual_time +0.0547 +0.0476 0 0 0 0 StringCopy/gather/32768/32/manual_time +0.0538 +0.0492 0 0 0 0 StringCopy/gather/32768/128/manual_time +0.0540 +0.0477 0 0 0 0 StringCopy/scatter/4096/32/manual_time +0.0571 +0.0526 0 0 0 0 StringCopy/scatter/32768/32/manual_time +0.0541 +0.0509 0 0 0 0 StringFindScalar/find_multi/4096/32/manual_time +0.0525 +0.0460 0 0 0 0 StringFindScalar/find_multi/32768/32/manual_time +0.0538 +0.0489 0 0 0 0 StringFindScalar/contains/4096/32/manual_time +0.0502 +0.0471 0 0 0 0 StringFindScalar/starts_with/4096/32/manual_time +0.0528 +0.0476 0 0 0 0 StringFindScalar/starts_with/4096/2048/manual_time +0.0575 +0.0475 0 0 0 0 StringFindScalar/starts_with/4096/8192/manual_time +0.0606 +0.0515 0 0 0 0 StringFindScalar/starts_with/32768/32/manual_time +0.0690 +0.0592 0 0 0 0 StringFindScalar/starts_with/32768/128/manual_time +0.0589 +0.0499 0 0 0 0 StringFindScalar/starts_with/32768/512/manual_time +0.0567 +0.0521 0 0 0 0 StringFindScalar/starts_with/32768/2048/manual_time +0.0517 +0.0501 0 0 0 0 StringFindScalar/starts_with/262144/32/manual_time +0.0555 +0.0525 0 0 0 0 StringFindScalar/ends_with/4096/2048/manual_time +0.0526 +0.0446 0 0 0 0 StringFindScalar/ends_with/4096/8192/manual_time +0.0568 +0.0485 0 0 0 0 StringFindScalar/ends_with/32768/32/manual_time +0.0654 +0.0567 0 0 0 0 StringFindScalar/ends_with/32768/512/manual_time +0.0546 +0.0502 0 0 0 0 StringFindScalar/ends_with/262144/32/manual_time +0.0523 +0.0517 0 0 0 0 
RepeatStrings/scalar_times/256/16/manual_time +0.0555 +0.0501 0 0 0 0 RepeatStrings/scalar_times/1024/16/manual_time +0.0562 +0.0519 0 0 0 0 RepeatStrings/column_times/256/16/manual_time +0.0645 +0.0579 0 0 0 0 RepeatStrings/column_times/256/64/manual_time +0.0506 +0.0472 0 0 0 0 RepeatStrings/column_times/1024/16/manual_time +0.0643 +0.0578 0 0 0 0 RepeatStrings/column_times/4096/16/manual_time +0.0537 +0.0502 0 0 0 0 RepeatStrings/column_times/16384/16/manual_time +0.0565 +0.0514 0 0 0 0 RepeatStrings/compute_output_strings_sizes/256/16/manual_time +0.0626 +0.0490 0 0 0 0 RepeatStrings/compute_output_strings_sizes/256/64/manual_time +0.0539 +0.0434 0 0 0 0 RepeatStrings/compute_output_strings_sizes/256/256/manual_time +0.0694 +0.0525 0 0 0 0 RepeatStrings/compute_output_strings_sizes/1024/16/manual_time +0.0526 +0.0422 0 0 0 0 RepeatStrings/compute_output_strings_sizes/1024/64/manual_time +0.0630 +0.0493 0 0 0 0 RepeatStrings/compute_output_strings_sizes/1024/256/manual_time +0.0533 +0.0460 0 0 0 0 RepeatStrings/precomputed_sizes/256/16/manual_time +0.0674 +0.0602 0 0 0 0 RepeatStrings/precomputed_sizes/1024/16/manual_time +0.0544 +0.0488 0 0 0 0 RepeatStrings/precomputed_sizes/4096/16/manual_time +0.0531 +0.0492 0 0 0 0 RepeatStrings/precomputed_sizes/16384/16/manual_time +0.0522 +0.0470 0 0 0 0 StringReplace/slice/4096/32/manual_time +0.0559 +0.0534 0 0 0 0 StringReplace/slice/32768/32/manual_time +0.0509 +0.0472 0 0 0 0 StringSplit/split_ws/4096/32/manual_time +0.0507 +0.0493 0 0 0 0 StringSubstring/multi_position/4096/32/manual_time +0.0560 +0.0515 0 0 0 0 StringSubstring/delimiter/4096/32/manual_time +0.0532 +0.0504 0 0 0 0 StringSubstring/delimiter/32768/128/manual_time +0.0531 +0.0535 0 0 0 0 StringSubstring/multi_delimiter/4096/32/manual_time +0.0544 +0.0522 0 0 0 0 CsvWrite/string_file_output/23/0/manual_time -0.3111 -0.0110 1421 979 842 833 Shift/shift_ten_percent_nullable_out/32768/manual_time -0.0786 -0.0650 0 0 0 0 
Shift/shift_full_nullable_out/1073741824/manual_time +0.0511 +0.0510 11 11 11 11 TypeDispatcher/fp64_bandwidth_host/8/1024/1/manual_time +0.1281 +0.0638 18970 21400 37938 40357 TypeDispatcher/fp64_bandwidth_host/4/2048/1/manual_time +0.0928 +0.0345 11556 12629 30463 31513 TypeDispatcher/fp64_bandwidth_host/2/4096/1/manual_time +0.0768 +0.0270 7421 7991 26234 26943 TypeDispatcher/fp64_bandwidth_host/1/8192/1/manual_time +0.0729 +0.0209 5029 5396 24111 24615 TypeDispatcher/fp64_bandwidth_device/8/1024/1/manual_time +0.1176 +0.0632 16518 18460 35703 37961 TypeDispatcher/fp64_bandwidth_device/4/2048/1/manual_time +0.0787 +0.0457 14424 15559 33546 35079 TypeDispatcher/fp64_bandwidth_device/2/4096/1/manual_time +0.0500 +0.0327 13594 14274 32740 33811 TypeDispatcher/fp64_bandwidth_no/2/1024/1/manual_time +0.0590 +0.0131 5065 5364 23966 24281 TypeDispatcher/fp64_bandwidth_no/8/1024/1/manual_time +0.2305 +0.0699 6912 8506 25803 27607 TypeDispatcher/fp64_bandwidth_no/1/2048/1/manual_time +0.0574 +0.0120 4854 5133 23782 24067 TypeDispatcher/fp64_bandwidth_no/4/2048/1/manual_time +0.1602 +0.0461 6010 6973 24906 26054 TypeDispatcher/fp64_bandwidth_no/2/4096/1/manual_time +0.0949 +0.0330 5583 6113 24469 25275 TypeDispatcher/fp64_bandwidth_no/4/4096/1/manual_time +0.0623 +0.0175 6991 7427 26088 26545 TypeDispatcher/fp64_bandwidth_no/8/4096/1/manual_time +0.0521 +0.0173 8953 9419 28000 28484 TypeDispatcher/fp64_bandwidth_no/1/8192/1/manual_time +0.0607 +0.0257 5225 5542 24107 24727 TypeDispatcher/fp64_bandwidth_no/2/8192/1/manual_time +0.0588 +0.0115 5964 6315 25052 25341 TypeDispatcher/fp64_bandwidth_no/1/16384/1/manual_time +0.0541 +0.0119 5443 5737 24515 24806 TextTokenize/ngrams/2097152/128/manual_time +0.0624 +0.0623 10 10 10 10 MultibyteSplitBenchmark/multibyte_split_simple/1/1/1/32768/manual_time +0.4019 +0.4024 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/2/1/1/32768/manual_time +0.4099 +0.4073 8 12 8 12 
MultibyteSplitBenchmark/multibyte_split_simple/1/4/1/32768/manual_time +0.3999 +0.3961 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/2/4/1/32768/manual_time +0.3969 +0.3980 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/1/7/1/32768/manual_time +0.4107 +0.3971 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/2/7/1/32768/manual_time +0.3833 +0.3948 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/1/1/25/32768/manual_time +0.3807 +0.3772 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/2/1/25/32768/manual_time +0.3834 +0.3702 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/1/4/25/32768/manual_time +0.3646 +0.3661 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/2/4/25/32768/manual_time +0.3722 +0.3743 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/1/7/25/32768/manual_time +0.3575 +0.3664 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/2/7/25/32768/manual_time +0.3761 +0.3744 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/1/4/1/1073741824/manual_time -0.1017 -0.1040 1681 1510 1681 1506 MultibyteSplitBenchmark/multibyte_split_simple/2/4/1/1073741824/manual_time -0.1817 -0.1817 4102 3357 4101 3356 MultibyteSplitBenchmark/multibyte_split_simple/0/7/25/1073741824/manual_time -0.0704 -0.0704 345 320 345 320 OVERALL_GEOMEAN +0.0974 +0.0970 0 0 0 0 Groupby/BasicSumScan/100000000/manual_time +0.2947 +0.2947 135 175 135 175 CsvRead/decimal_file_input/35/0/manual_time +0.0508 +0.0511 151 159 151 159 ReductionScan/double_nulls/100000/manual_time +0.0721 +0.0609 22874 24524 40726 43206 OrcWrite/integral_file_output/30/0/32/1/0/manual_time -0.1923 -0.0371 913 738 763 735 OrcWrite/integral_file_output/30/0/1/0/0/manual_time +0.2668 -0.0297 754 955 722 701 OrcWrite/integral_file_output/30/1000/1/0/0/manual_time -0.1090 -0.0510 986 878 725 688 OrcWrite/integral_file_output/30/0/32/0/0/manual_time +0.0594 -0.0575 981 1039 738 696 OrcWrite/integral_buffer_output/30/1000/32/1/1/manual_time +0.0882 
+0.0885 85 92 85 92 OrcWrite/integral_buffer_output/30/1000/32/0/1/manual_time -0.0966 -0.0955 98 89 98 89 OrcWrite/floats_file_output/31/0/1/1/0/manual_time +0.0600 -0.0538 737 781 737 697 OrcWrite/floats_file_output/31/0/32/1/0/manual_time +0.0670 +0.0021 1203 1284 715 717 OrcWrite/floats_file_output/31/0/1/0/0/manual_time -0.2406 -0.0605 865 657 698 656 OrcWrite/floats_file_output/31/1000/1/0/0/manual_time -0.2006 -0.0642 1122 897 706 660 OrcWrite/floats_file_output/31/0/32/0/0/manual_time -0.1759 -0.0563 1131 932 708 668 OrcWrite/floats_file_output/31/1000/32/0/0/manual_time -0.1600 -0.0640 1095 919 702 657 OrcWrite/decimal_file_output/35/1000/1/0/0/manual_time +0.1622 -0.0865 1110 1290 588 537 OrcWrite/timestamps_file_output/33/0/1/0/0/manual_time +0.1884 -0.0494 552 657 552 524 OrcWrite/timestamps_file_output/33/1000/1/0/0/manual_time +0.1409 +0.0064 650 742 541 544 OrcWrite/list_file_output/24/0/1/0/0/manual_time -0.0723 -0.0788 713 661 711 655 OrcWrite/list_file_output/24/1000/1/0/0/manual_time +0.0935 -0.0468 696 761 689 657 Concatenate/BM_concatenate_nullable_false/4096/2/manual_time +0.1055 +0.0672 0 0 0 0 Concatenate/BM_concatenate_nullable_false/512/8/manual_time +0.0548 +0.0379 0 0 0 0 Concatenate/BM_concatenate_nullable_true/32768/8/manual_time +0.0501 +0.0415 0 0 0 0 Concatenate/BM_concatenate_nullable_true/64/64/manual_time +0.0570 +0.0400 0 0 0 0 Concatenate/BM_concatenate_nullable_true/512/64/manual_time +0.0894 +0.0606 0 0 0 0 Concatenate/BM_concatenate_tables_nullable_false/4096/2/2/manual_time +0.1086 +0.0771 0 0 0 0 Concatenate/BM_concatenate_tables_nullable_false/512/8/2/manual_time +0.0920 +0.0828 0 0 0 0 Concatenate/BM_concatenate_tables_nullable_false/4096/8/2/manual_time +0.0549 +0.0502 0 0 0 0 Concatenate/BM_concatenate_tables_nullable_false/256/32/2/manual_time +0.1036 +0.1009 1 1 1 1 Concatenate/BM_concatenate_tables_nullable_false/512/32/2/manual_time +0.0827 +0.0813 1 1 1 1 
Concatenate/BM_concatenate_tables_nullable_false/4096/32/2/manual_time +0.0788 +0.0768 1 1 1 1 Concatenate/BM_concatenate_tables_nullable_false/256/8/64/manual_time +0.0525 +0.0490 0 0 0 0 ParquetRead/integral_buffer_input/29/1000/1/0/1/manual_time +0.0929 +0.0928 46 50 46 50 ParquetRead/timestamps_file_input/33/0/32/0/0/manual_time -0.0896 -0.0897 127 116 128 116 OrcRead/integral_buffer_input/30/1000/1/0/1/manual_time +0.1087 +0.1087 88 97 88 97 OrcRead/floats_file_input/31/0/1/1/0/manual_time +0.1528 +0.1526 134 155 134 155 OrcRead/floats_buffer_input/31/1000/1/0/1/manual_time +0.1349 +0.1350 75 85 75 85 OrcRead/decimal_buffer_input/35/0/1/0/1/manual_time -0.1137 -0.1137 264 234 264 234 OrcRead/string_file_input/23/0/1/0/0/manual_time -0.0750 -0.0750 162 150 162 150 OrcRead/string_file_input/23/0/32/0/0/manual_time -0.0963 -0.0963 163 147 163 147 OrcRead/string_buffer_input/23/0/32/0/1/manual_time -0.1586 -0.0139 114 96 97 96 OrcRead/list_file_input/24/1000/1/0/0/manual_time +0.0515 +0.0517 176 185 176 185 OrcRead/list_file_input/24/0/32/0/0/manual_time +0.0925 +0.0922 173 189 173 189 OrcRead/list_buffer_input/24/0/1/1/1/manual_time -0.1288 -0.1291 139 121 139 121 BINARYOP<int32_t, TreeType::IMBALANCED_LEFT, true>/binaryop_int32_imbalanced_reuse/100000/2/manual_time +0.0533 +0.0381 0 0 0 0 COMPILED_BINARYOP/NULL_MAX_decimal32_decimal32_decimal32/100000/manual_time +0.0509 +0.0320 13 14 32 33 COMPILED_BINARYOP/NULL_MIN_timestamp_D_timestamp_s_timestamp_s/10000/manual_time +0.0509 +0.0374 11 12 30 31 ParquetWrite/integral_file_output/29/0/1/1/0/manual_time +0.3011 +0.0605 726 945 726 770 ParquetWrite/integral_file_output/29/1000/1/1/0/manual_time +0.0812 +0.0804 311 336 310 335 ParquetWrite/integral_file_output/29/0/32/1/0/manual_time +0.3497 +0.0714 948 1279 734 786 ParquetWrite/integral_file_output/29/1000/32/1/0/manual_time +0.0559 +0.0558 62 65 62 65 ParquetWrite/integral_file_output/29/0/1/0/0/manual_time +0.1829 +0.0679 702 830 700 748 
ParquetWrite/integral_file_output/29/1000/1/0/0/manual_time +0.0829 +0.0852 284 307 283 307 ParquetWrite/integral_file_output/29/0/32/0/0/manual_time -0.3273 +0.0451 1063 715 683 714 ParquetWrite/integral_file_output/29/1000/32/0/0/manual_time +0.0835 +0.0834 58 63 58 63 ParquetWrite/integral_buffer_output/29/0/1/1/1/manual_time +0.0608 +0.0609 874 927 874 927 ParquetWrite/floats_file_output/31/0/1/1/0/manual_time +0.1916 +0.0634 694 827 693 737 ParquetWrite/floats_file_output/31/1000/1/1/0/manual_time +0.0560 +0.0553 217 229 217 229 ParquetWrite/floats_file_output/31/0/32/1/0/manual_time +0.0517 +0.0546 1020 1073 721 760 ParquetWrite/floats_file_output/31/1000/32/1/0/manual_time +0.1149 +0.0631 45 50 39 42 ParquetWrite/floats_file_output/31/0/1/0/0/manual_time +0.1165 +0.0471 880 983 664 695 ParquetWrite/floats_file_output/31/1000/1/0/0/manual_time +0.3996 +0.0038 237 331 219 219 ParquetWrite/floats_file_output/31/0/32/0/0/manual_time +0.3109 +0.0673 666 873 666 710 ParquetWrite/floats_file_output/31/1000/32/0/0/manual_time +0.0798 +0.0790 38 41 38 41 ParquetWrite/floats_buffer_output/31/1000/1/1/1/manual_time +0.0710 +0.0709 208 223 208 223 ParquetWrite/floats_buffer_output/31/0/32/1/1/manual_time +0.0677 +0.0673 732 782 732 782 ParquetWrite/floats_buffer_output/31/0/1/0/1/manual_time +0.0663 +0.0659 682 728 682 727 ParquetWrite/floats_buffer_output/31/1000/1/0/1/manual_time +0.0785 +0.0780 188 203 188 203 ParquetWrite/decimal_file_output/35/0/1/1/0/manual_time +0.0655 +0.0636 277 296 277 295 ParquetWrite/decimal_file_output/35/1000/1/1/0/manual_time +0.0657 +0.0634 242 258 242 257 ParquetWrite/decimal_file_output/35/0/32/1/0/manual_time +0.1194 +0.0577 291 325 290 307 ParquetWrite/decimal_file_output/35/1000/32/1/0/manual_time +0.0852 +0.0836 170 185 170 184 ParquetWrite/decimal_file_output/35/0/1/0/0/manual_time +0.3802 +0.0372 346 477 325 337 ParquetWrite/decimal_file_output/35/1000/1/0/0/manual_time +0.8101 +0.1543 374 677 373 431 
ParquetWrite/decimal_file_output/35/0/32/0/0/manual_time +1.4742 +0.0541 328 812 327 344 ParquetWrite/decimal_file_output/35/1000/32/0/0/manual_time +0.5398 +0.0463 391 603 390 409 ParquetWrite/decimal_buffer_output/35/0/1/1/1/manual_time +0.0571 +0.0570 301 318 301 318 ParquetWrite/decimal_buffer_output/35/1000/1/1/1/manual_time +0.1955 +0.1953 253 302 253 302 ParquetWrite/decimal_buffer_output/35/0/32/1/1/manual_time +0.0655 +0.0641 306 326 306 325 ParquetWrite/decimal_buffer_output/35/0/1/0/1/manual_time +0.0595 +0.0591 381 404 381 404 ParquetWrite/decimal_buffer_output/35/1000/1/0/1/manual_time +0.0650 +0.0643 515 548 515 548 ParquetWrite/decimal_buffer_output/35/0/32/0/1/manual_time +0.0595 +0.0591 386 409 386 409 ParquetWrite/decimal_buffer_output/35/1000/32/0/1/manual_time +0.0595 +0.0590 517 547 516 547 ParquetWrite/timestamps_file_output/33/0/1/1/0/manual_time +0.0566 +0.0580 724 765 721 762 ParquetWrite/timestamps_file_output/33/1000/1/1/0/manual_time -0.6229 -0.0258 526 198 203 198 ParquetWrite/timestamps_file_output/33/0/32/1/0/manual_time -0.0955 +0.0444 928 840 733 766 ParquetWrite/timestamps_file_output/33/1000/32/1/0/manual_time +0.0794 +0.0725 36 39 36 39 ParquetWrite/timestamps_file_output/33/0/1/0/0/manual_time +0.2140 +0.0788 626 760 626 676 ParquetWrite/timestamps_file_output/33/1000/1/0/0/manual_time +0.0778 +0.0760 174 188 174 187 ParquetWrite/timestamps_file_output/33/0/32/0/0/manual_time +0.4682 +0.0758 636 934 636 684 ParquetWrite/timestamps_file_output/33/1000/32/0/0/manual_time +0.0938 +0.0929 34 38 34 38 ParquetWrite/timestamps_buffer_output/33/0/1/1/1/manual_time +0.0559 +0.0559 837 884 837 884 ParquetWrite/timestamps_buffer_output/33/0/1/0/1/manual_time +0.0612 +0.0612 714 758 714 758 ParquetWrite/timestamps_buffer_output/33/1000/1/0/1/manual_time -0.2022 -0.2021 229 183 229 183 ParquetWrite/timestamps_buffer_output/33/0/32/0/1/manual_time +0.0609 +0.0596 721 765 721 764 ParquetWrite/string_file_output/23/0/1/1/0/manual_time +0.1674 
+0.1004 1231 1437 869 956 ParquetWrite/string_file_output/23/1000/1/1/0/manual_time +0.0748 +0.0675 124 133 107 114 ParquetWrite/string_file_output/23/0/32/1/0/manual_time +0.0497 +0.0541 1197 1256 893 942 ParquetWrite/string_file_output/23/1000/32/1/0/manual_time +0.0822 +0.0551 38 41 34 35 ParquetWrite/string_file_output/23/0/1/0/0/manual_time +0.3477 +0.0668 892 1202 828 883 ParquetWrite/string_file_output/23/1000/1/0/0/manual_time +0.1446 +0.1474 98 113 98 113 ParquetWrite/string_file_output/23/1000/32/0/0/manual_time +0.0596 +0.0590 33 35 33 35 ParquetWrite/string_buffer_output/23/1000/1/0/1/manual_time +0.0598 +0.0594 104 110 104 110 ParquetWrite/string_void_output/23/1000/32/0/2/manual_time -0.3901 +0.0015 34 21 21 21 ParquetWrite/list_file_output/24/0/1/0/0/manual_time -0.1313 +0.0831 1033 897 828 897 ParquetWrite/list_file_output/24/1000/1/0/0/manual_time +0.0559 +0.0537 521 550 521 549 ParquetWrite/list_file_output/24/0/32/0/0/manual_time -0.1942 -0.0129 1183 954 888 877 ContiguousSplit/1Gb512ColsValidity/1073741824/512/256/1/iterations:8/manual_time +0.0660 +0.0659 30 32 30 32 AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/1000000/1/manual_time +0.0540 +0.0453 0 0 0 0 AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/10000000/1/manual_time +0.0657 +0.0642 1 1 1 1 AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/100000000/1/manual_time +0.0704 +0.0702 8 9 8 9 AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/1000000/1/manual_time +0.0549 +0.0473 0 0 0 0 AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/10000000/1/manual_time +0.0745 +0.0723 1 1 1 1 AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/100000000/1/manual_time +0.0758 +0.0755 7 8 7 8 AST<double, TreeType::IMBALANCED_LEFT, false, 
true>/ast_double_imbalanced_unique_nulls/10000000/1/manual_time +0.0534 +0.0522 1 1 1 1 AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/10000000/10/manual_time +0.0610 +0.0606 3 3 3 3 AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/100000000/1/manual_time +0.0538 +0.0537 9 10 9 10 AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/100000000/10/manual_time +0.0579 +0.0579 26 27 26 27 Rank/nulls/1024/manual_time +0.7608 +0.6280 0 0 0 0 Rank/nulls/4096/manual_time +0.2739 +0.2437 0 0 0 0 Rank/nulls/32768/manual_time +0.1599 +0.1469 0 0 0 0 Rank/nulls/262144/manual_time +0.0813 +0.0793 0 0 0 0 Rank/nulls/2097152/manual_time -0.4178 -0.4162 5 3 5 3 Rank/nulls/16777216/manual_time -0.3688 -0.3686 45 28 45 28 Rank/nulls/67108864/manual_time -0.3576 -0.3576 181 117 181 117 Sort<false>/unstable_no_nulls/1024/8/manual_time +0.2655 +0.2554 1 1 1 1 Sort<false>/unstable_no_nulls/4096/8/manual_time +0.3212 +0.3081 0 1 1 1 Sort<false>/unstable_no_nulls/32768/8/manual_time +0.1430 +0.1395 1 1 1 1 Sort<false>/unstable_no_nulls/262144/8/manual_time +0.1080 +0.1064 1 1 1 2 Sort<false>/unstable_no_nulls/2097152/8/manual_time -0.0740 -0.0740 15 14 15 14 Sort<false>/unstable_no_nulls/16777216/8/manual_time -0.0882 -0.0882 215 196 215 196 Sort<false>/unstable_no_nulls/67108864/8/manual_time -0.0848 -0.0848 1170 1071 1170 1071 Sort<true>/stable_no_nulls/1024/8/manual_time +0.2656 +0.2553 1 1 1 1 Sort<true>/stable_no_nulls/4096/8/manual_time +0.3215 +0.3081 0 1 1 1 Sort<true>/stable_no_nulls/32768/8/manual_time +0.1427 +0.1392 1 1 1 1 Sort<true>/stable_no_nulls/262144/8/manual_time +0.1082 +0.1066 1 1 1 2 Sort<true>/stable_no_nulls/2097152/8/manual_time -0.0737 -0.0735 15 14 15 14 Sort<true>/stable_no_nulls/16777216/8/manual_time -0.0889 -0.0887 215 196 215 196 Sort<true>/stable_no_nulls/67108864/8/manual_time -0.0848 -0.0846 1170 1071 1170 1071 
Sort<false>/unstable/1024/1/manual_time +0.8698 +0.7017 0 0 0 0 Sort<false>/unstable/4096/1/manual_time +0.2846 +0.2506 0 0 0 0 Sort<false>/unstable/32768/1/manual_time +0.1640 +0.1492 0 0 0 0 Sort<false>/unstable/262144/1/manual_time +0.0818 +0.0794 0 0 0 0 Sort<false>/unstable/2097152/1/manual_time -0.4431 -0.4414 5 3 5 3 Sort<false>/unstable/16777216/1/manual_time -0.4282 -0.4280 38 22 38 22 Sort<false>/unstable/67108864/1/manual_time -0.4168 -0.4168 155 90 155 90 Sort<false>/unstable/1024/8/manual_time +0.2213 +0.2142 1 1 1 1 Sort<false>/unstable/4096/8/manual_time +0.2784 +0.2687 1 1 1 1 Sort<false>/unstable/32768/8/manual_time +0.1115 +0.1094 1 1 1 1 Sort<false>/unstable/262144/8/manual_time +0.1030 +0.1016 2 2 2 2 Sort<true>/stable/1024/1/manual_time +0.8684 +0.7016 0 0 0 0 Sort<true>/stable/4096/1/manual_time +0.2860 +0.2517 0 0 0 0 Sort<true>/stable/32768/1/manual_time +0.1638 +0.1497 0 0 0 0 Sort<true>/stable/262144/1/manual_time +0.0817 +0.0798 0 0 0 0 Sort<true>/stable/2097152/1/manual_time -0.4431 -0.4415 5 3 5 3 Sort<true>/stable/16777216/1/manual_time -0.4279 -0.4277 38 22 38 22 Sort<true>/stable/67108864/1/manual_time -0.4176 -0.4176 155 90 155 90 Sort<true>/stable/1024/8/manual_time +0.2211 +0.2138 1 1 1 1 Sort<true>/stable/4096/8/manual_time +0.2808 +0.2706 1 1 1 1 Sort<true>/stable/32768/8/manual_time +0.1117 +0.1096 1 1 1 1 Sort<true>/stable/262144/8/manual_time +0.1029 +0.1013 2 2 2 2 Sort/strings/262144/manual_time -0.0781 -0.0777 4 4 4 4 Scatter/double_coalesce_x/2048/2/manual_time +0.0614 +0.0472 27988 29705 46846 49057 Scatter/double_coalesce_x/32768/2/manual_time +0.0637 +0.0522 30209 32133 47991 50496 Scatter/double_coalesce_x/131072/2/manual_time +0.0558 +0.0444 37821 39932 54883 57321 Scatter/double_coalesce_x/1024/4/manual_time +0.0811 +0.0663 53699 58053 72617 77434 Scatter/double_coalesce_x/2048/4/manual_time +0.0535 +0.0468 56040 59038 74848 78348 Scatter/double_coalesce_x/4096/4/manual_time +0.0514 +0.0449 56187 59073 74930 78291 
Scatter/double_coalesce_x/8192/4/manual_time +0.0516 +0.0452 56747 59674 75140 78533 Scatter/double_coalesce_x/16384/4/manual_time +0.0520 +0.0479 57412 60400 75292 78895 Scatter/double_coalesce_x/32768/4/manual_time +0.0610 +0.0544 58151 61699 75398 79499 Scatter/double_coalesce_x/1024/8/manual_time +0.0526 +0.0486 110089 115882 129032 135301 Scatter/double_coalesce_x/2048/8/manual_time +0.0546 +0.0506 110864 116921 129784 136352 Scatter/double_coalesce_x/4096/8/manual_time +0.0612 +0.0554 110733 117506 129306 136465 Scatter/double_coalesce_x/8192/8/manual_time +0.0635 +0.0579 111614 118703 129727 137233 Scatter/double_coalesce_x/16384/8/manual_time +0.0665 +0.0604 111918 119366 129458 137275 Scatter/double_coalesce_x/32768/8/manual_time +0.0545 +0.0543 114993 121260 131951 139113 Scatter/double_coalesce_x/65536/8/manual_time +0.0619 +0.0560 119167 126540 136092 143717 Scatter/double_coalesce_o/2048/2/manual_time +0.0542 +0.0418 29300 30889 48197 50211 Scatter/double_coalesce_o/32768/2/manual_time +0.0556 +0.0464 32069 33851 49914 52229 Scatter/double_coalesce_o/1024/4/manual_time +0.0684 +0.0569 56480 60346 75468 79761 Scatter/double_coalesce_o/8192/4/manual_time +0.0572 +0.0497 59554 62960 77958 81834 Scatter/double_coalesce_o/16384/4/manual_time +0.0572 +0.0525 59839 63260 77704 81781 Scatter/double_coalesce_o/32768/4/manual_time +0.0564 +0.0514 62493 66015 79779 83883 Scatter/double_coalesce_o/1024/8/manual_time +0.0566 +0.0515 112968 119360 131925 138723 Scatter/double_coalesce_o/2048/8/manual_time +0.0565 +0.0518 113151 119548 132028 138870 Scatter/double_coalesce_o/4096/8/manual_time +0.0594 +0.0545 114566 121374 133078 140333 Scatter/double_coalesce_o/8192/8/manual_time +0.0587 +0.0534 116146 122963 134282 141449 Scatter/double_coalesce_o/16384/8/manual_time +0.0663 +0.0597 116445 124161 134038 142046 Scatter/double_coalesce_o/32768/8/manual_time +0.0555 +0.0566 122258 129043 139016 146891 Scatter/double_coalesce_o/65536/8/manual_time +0.0553 +0.0498 
133373 140749 150403 157896 Quantiles/no_nulls/65536/4/1/manual_time +0.1394 +0.1370 1 1 1 1 Quantiles/no_nulls/262144/4/1/manual_time +0.1372 +0.1348 1 1 1 1 Quantiles/no_nulls/1048576/4/1/manual_time -0.0944 -0.0943 6 5 6 5 Quantiles/no_nulls/4194304/4/1/manual_time -0.1068 -0.1070 35 32 35 32 Quantiles/no_nulls/16777216/4/1/manual_time -0.0882 -0.0884 210 191 210 191 Quantiles/no_nulls/67108864/4/1/manual_time -0.0855 -0.0858 1148 1050 1148 1050 Quantiles/no_nulls/65536/8/1/manual_time +0.1312 +0.1290 1 1 1 1 Quantiles/no_nulls/262144/8/1/manual_time +0.1058 +0.1044 1 2 1 2 Quantiles/no_nulls/4194304/8/1/manual_time -0.0982 -0.0984 37 33 37 33 Quantiles/no_nulls/16777216/8/1/manual_time -0.0886 -0.0888 215 196 215 196 Quantiles/no_nulls/67108864/8/1/manual_time -0.0866 -0.0868 1173 1071 1173 1071 Quantiles/no_nulls/65536/4/4/manual_time +0.1413 +0.1385 1 1 1 1 Quantiles/no_nulls/262144/4/4/manual_time +0.1355 +0.1332 1 1 1 1 Quantiles/no_nulls/1048576/4/4/manual_time -0.0944 -0.0943 6 5 6 5 Quantiles/no_nulls/4194304/4/4/manual_time -0.1061 -0.1063 35 32 35 32 Quantiles/no_nulls/16777216/4/4/manual_time -0.0877 -0.0879 210 191 210 191 Quantiles/no_nulls/67108864/4/4/manual_time -0.0863 -0.0865 1149 1050 1149 1049 Quantiles/no_nulls/65536/8/4/manual_time +0.1328…
stanleytsang-amd added a commit to ROCm/hipCUB that referenced this pull request on Apr 12, 2022
* test_hipcub_device_radix_sort.cpp Correctly test -NaN.
* `test_utils::native_half` -NaN to `float` fix
* `hipcub::WarpExchange` interface to `::rocprim::warp_exchange`
* Fix after review
* Default CUDA architecture is 53 to fix __half
* Apply 1 suggestion(s) to 1 file(s)
* Added NVGPU_TARGETS to gitlab-ci
* Update .gitlab-ci.yml file
* Changes from [PR346](NVIDIA/cub#346)
* Add deprecation warnings.
* Update of deprecated statement.
* Adding constants from [PR418](NVIDIA/cub#418).
* Fix deprecation warnings.
* Fix a forgotten deprecation warning.
* Fix deprecation warnings.
* Fix deprecation warnings for nvcc.
* Replace '__host__ __device__' by 'HIPCUB_HOST_DEVICE'
* Added Cuda standard
* Bumped referenced CUB and thrust version to 1.16
* Download thrust in test/extra
* Added the interface for UniqueByKey
* Added test for UniqueByKey
* Added benchmark for UniqueByKey
* Add UniqueByKey interface
* Fix alignment of UniqueByKey parameters
* Use 'unsigned int' instead of a one element vector for selected_count_output in UniqueByKey benchmark
* Update interface
* Update tests, add test for int64_t size
* Update CUB interface
* Apply 1 suggestion(s) to 1 file(s)
* Add interfaces for subtract
* Ignore deprecation warnings from rocPRIM for flags API
* Add deprecation warnings for Flags API
* Ignore deprecation warnings for Flags API tests
* Fix Subtract interfaces
* Fix SubtractRightPartial not using the right method
* Add benchmark for AdjacentDifference (Subtract)
* Add test for AdjacentDifference (Subtract)
* Use 'HIPCUB_HOST_DEVICE' macro
* Fix a typo
* Fix interfaces of Subtract not matching the CUB one
* Update the tests and benchmarks to the fixed interfaces of Subtract
* Fix to use temp_storage_ in subtract call
* Fix the tests of Subtract to work with the CUB interfaces
* Add the macros to ignore warning in config.hpp and remove it from block_adjacent_difference file and from the tests
* Device adjacent difference CUB backend
* New thread operators [skip ci]
* Test device adjacent difference [skip ci]
* Device adjacent difference rocPRIM backend
* Added new headers to the hipcub.hpp-s
* Benchmark for device adjacent difference
* Added missing thread operators
* Updated changelog for CUB 1.16
* Updating changelog for hipCUB 1.16 in next release

Co-authored-by: Vince <vince@streamhpc.com>
Co-authored-by: Gergely Mészáros <gergely@streamhpc.com>
Co-authored-by: Théo Battrel <theo@streamhpc.com>
Co-authored-by: Balint Soproni <balint@streamhpc.com>
Co-authored-by: Stanley Tsang <stanley.tsang@amd.com>
Labels
P1: should have
Necessary, but not critical.
testing: gpuCI in progress
Started gpuCI testing.
testing: internal ci in progress
Currently testing on internal NVIDIA CI (DVS).
type: bug: functional
Does not work as intended.
No description provided.