{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":629064630,"defaultBranch":"main","name":"llvm-project","ownerLogin":"lukel97","currentUserCanPush":false,"isFork":true,"isEmpty":false,"createdAt":"2023-04-17T14:44:01.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/2488460?v=4","public":true,"private":false,"isOrgOwned":false},"refInfo":{"name":"","listCacheKey":"v0:1726820318.0","currentOid":""},"activityList":{"items":[{"before":null,"after":"d373bcc5708b58728768fb5bb630b683b4d9e695","ref":"refs/heads/zvfhmin-zvfbfmin-custom-lower-memory-ops","pushedAt":"2024-09-20T08:18:38.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Lower memory ops and VP splat for zvfhmin and zvfbfmin\n\nWe can lower f16/bf16 memory ops without promotion through the existing custom lowering.\n\nSome of the zero strided VP loads get combined to a VP splat, so we need to also handle the lowering for that for f16/bf16 w/ zvfhmin/zvfbfmin. This patch copies the lowering from ISD::SPLAT_VECTOR over to lowerScalarSplat which is used by the VP splat lowering.","shortMessageHtmlLink":"[RISCV] Lower memory ops and VP splat for zvfhmin and zvfbfmin"}},{"before":"77d1032516e7057f185c5137071e4a97c3f3eb30","after":"737f56fdf7d8df4f1349085fe7256e27778e4a51","ref":"refs/heads/main","pushedAt":"2024-09-18T10:20:21.000Z","pushType":"push","commitsCount":10000,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Deduplicate zvfhmin and zvfbfmin operation actions. NFC\n\nAfter #108937 fp16 w/o zvfh and bf16 are now in sync and should have\nthe same lowering.","shortMessageHtmlLink":"[RISCV] Deduplicate zvfhmin and zvfbfmin operation actions. NFC"}},{"before":"d091eb11d9d841d366b0bb2e09fa75cf959a110c","after":"bb08e71be1fa928831d1aa1cb58207887940f826","ref":"refs/heads/zvfbfmin/promote","pushedAt":"2024-09-18T08:42:27.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Promote bf16 ops to f32 with zvfbfmin\n\nFor f16 with zvfhmin, we promote most ops and VP ops to f32. This does the same for bf16 with zvfbfmin, so the two fp types should now be in sync.\n\nThere are a few places in the custom lowering where we need to check for a LMUL 8 f16/bf16 vector that can't be promoted and must be split, this extracts that out into isPromotedOpNeedingSplit.\n\nIn a follow up NFC we can deduplicate the code that sets up the promotions.","shortMessageHtmlLink":"[RISCV] Promote bf16 ops to f32 with zvfbfmin"}},{"before":null,"after":"d091eb11d9d841d366b0bb2e09fa75cf959a110c","ref":"refs/heads/zvfbfmin/promote","pushedAt":"2024-09-17T07:33:43.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Promote bf16 ops to f32 with zvfbfmin\n\nFor f16 with zvfhmin, we promote most ops and VP ops to f32. This does the same for bf16 with zvfbfmin, so the two fp types should now be in sync.\n\nThere are a few places in the custom lowering where we need to check for a LMUL 8 f16/bf16 vector that can't be promoted and must be split, this extracts that out into isPromotedOpNeedingSplit.\n\nIn a follow up NFC we can deduplicate the code that sets up the promotions.","shortMessageHtmlLink":"[RISCV] Promote bf16 ops to f32 with zvfbfmin"}},{"before":"7eb4dfb0524eab81f5509944a04ab64e0c09986c","after":"0122f932e7be58ae2d9288a7ea8dec34598ac8fa","ref":"refs/heads/combineOp_VLToVWOp_VL-zvfbfmin","pushedAt":"2024-09-17T02:07:12.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Adjust test case comment","shortMessageHtmlLink":"Adjust test case comment"}},{"before":"d3b112857eb04ef97cff41edf8f2f5cf670df5ce","after":"7eb4dfb0524eab81f5509944a04ab64e0c09986c","ref":"refs/heads/combineOp_VLToVWOp_VL-zvfbfmin","pushedAt":"2024-09-17T02:01:24.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Add test case for multiple uses, where not all of them are widenable","shortMessageHtmlLink":"Add test case for multiple uses, where not all of them are widenable"}},{"before":null,"after":"d3b112857eb04ef97cff41edf8f2f5cf670df5ce","ref":"refs/heads/combineOp_VLToVWOp_VL-zvfbfmin","pushedAt":"2024-09-16T07:23:08.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Restrict combineOp_VLToVWOp_VL w/ bf16 to vfwmadd_vl with zvfbfwma\n\nWe currently make sure to check that if widening a f16 vector that we have zvfh. We need to do the same for bf16 vectors, but with the further restriction that we can only combine vfmadd_vl to vfwmadd_vl (to get vfwmaccbf16.v{v,f}).\n\nThis moves the checks into the extension support checks to keep it one place.","shortMessageHtmlLink":"[RISCV] Restrict combineOp_VLToVWOp_VL w/ bf16 to vfwmadd_vl with zvf…"}},{"before":"2f4d238eb3a673025a49af05e7a38bd618f8fa86","after":"0350ec6b7d17b8442565b8f1b44cf5d6202e0fe2","ref":"refs/heads/zvfhmin/rounding-ops","pushedAt":"2024-09-16T05:43:13.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Update fixed length vector tests\n\nMarking ftrunc as promoted means these are no longer expanded","shortMessageHtmlLink":"Update fixed length vector tests"}},{"before":null,"after":"2f4d238eb3a673025a49af05e7a38bd618f8fa86","ref":"refs/heads/zvfhmin/rounding-ops","pushedAt":"2024-09-15T16:00:43.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Split fp rounding ops with zvfhmin nxv32f16\n\nThis adds zvfhmin test coverage for fceil, ffloor, fnearbyint, frint, fround and froundeven and splits them at nxv32f16 to avoid crashing, similarly to what we do for other nodes that we promote.\n\nThis also sets ftrunc to promote which was previously missing. We already promote the VP version of it, vp_froundtozero.","shortMessageHtmlLink":"[RISCV] Split fp rounding ops with zvfhmin nxv32f16"}},{"before":null,"after":"bee77b3c2a9b73448587293facf9b33dec1b07a9","ref":"refs/heads/zvfhmin-zvfbfmin-interleave-deinterleave","pushedAt":"2024-09-12T15:00:02.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Lower interleave + deinterleave for zvfhmin and zvfbfmin\n\nFortunately f16 and bf16 are always < EEW, so we can always lower via widening or narrowing. This means we don't need to add patterns for vrgather_vv_vl just yet.","shortMessageHtmlLink":"[RISCV] Lower interleave + deinterleave for zvfhmin and zvfbfmin"}},{"before":"05ea1adc01fb0ea82a79cd1d2b000535225cacc8","after":"c0059801ef3f2a55be0e84a4ba5bdbc57e8dd863","ref":"refs/heads/zvfhmin-zvfbfmin-resuage","pushedAt":"2024-09-12T11:39:12.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Clarify wording in comment","shortMessageHtmlLink":"Clarify wording in comment"}},{"before":null,"after":"05ea1adc01fb0ea82a79cd1d2b000535225cacc8","ref":"refs/heads/zvfhmin-zvfbfmin-resuage","pushedAt":"2024-09-12T11:36:01.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Account for zvfhmin and zvfbfmin promotion in register usage\n\nA half with only zvfhmin or bfloat will end up getting promoted to a f32 for most instructions.\n\nUnless the loop consists only of memory ops and permutation instructions which don't need promoted (is this common?), we'll end up using double the LMUL than what's currently being returned by getRegUsageForType.\n\nSince this is used by the loop vectorizer, it seems better to be conservative and assume that any usage of a zvfhmin half/bfloat will end up being widened to a f32.","shortMessageHtmlLink":"[RISCV] Account for zvfhmin and zvfbfmin promotion in register usage"}},{"before":null,"after":"99a1ed168e47fbc6aefaa867f40bce8f4175d773","ref":"refs/heads/f16/cost-model-promote","pushedAt":"2024-09-12T10:52:48.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Handle zvfhmin promotion to f32 in half arith costs\n\nArithmetic half ops on zvfhmin will be promoted and carried out in f32, so this updates getArithmeticInstrCost to check for this.","shortMessageHtmlLink":"[RISCV] Handle zvfhmin promotion to f32 in half arith costs"}},{"before":null,"after":"932efa82ee330d2c0a1c96f767e30e212e07a813","ref":"refs/heads/bf16-convert-lowering","pushedAt":"2024-09-12T07:24:19.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Lower bf16 {S,U}INT_TO_FP, FP_TO_{S,U}INT and VP variants\n\nThis handles int->fp/fp->int nodes for zvfbfmin, reusing the same parts that f16 uses with zvfhmin.\n\nThere's quite a bit of replication here that can probably be cleaned up at some point.","shortMessageHtmlLink":"[RISCV] Lower bf16 {S,U}INT_TO_FP, FP_TO_{S,U}INT and VP variants"}},{"before":"0e23f24ecc26bd94a0d5523856a00a3d9ac5e08b","after":"b95b1fe985ecf329ec17df9c1f5f6b6268cfd513","ref":"refs/heads/zvfbfmin-fabs-fcopysign-fneg-expand","pushedAt":"2024-09-12T00:45:49.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Rename test files to put bf16 as suffix","shortMessageHtmlLink":"Rename test files to put bf16 as suffix"}},{"before":null,"after":"0e23f24ecc26bd94a0d5523856a00a3d9ac5e08b","ref":"refs/heads/zvfbfmin-fabs-fcopysign-fneg-expand","pushedAt":"2024-09-11T16:31:26.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Expand bf16 FNEG/FABS/FCOPYSIGN\n\nThe motivation for this is to start promoting bf16 ops to f32 so that we can mark bf16 as a supported type in RISCVTTIImpl::isElementTypeLegalForScalableVector and scalably-vectorize it.\n\nThis starts with expanding the nodes that can't be promoted to f32 due to canonicalizing NaNs, similarly to f16 in #106652.","shortMessageHtmlLink":"[RISCV] Expand bf16 FNEG/FABS/FCOPYSIGN"}},{"before":null,"after":"0a36f6f72bec63039b71c709df5d17d1155fa36e","ref":"refs/heads/bf16-extload-truncstore-crash-fix","pushedAt":"2024-09-11T14:44:59.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Expand bf16 vector truncstores and extloads\n\nPreviously they were legal by default, so the truncstore/extload test cases would get combined and crash during selection.\nThese are set to expand for f16 so do the same for bf16.","shortMessageHtmlLink":"[RISCV] Expand bf16 vector truncstores and extloads"}},{"before":null,"after":"50378679ad48f8c1d677e1174c5a4ae71c63a582","ref":"refs/heads/expandmemcmp-scratch","pushedAt":"2024-09-11T13:57:49.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"disable","shortMessageHtmlLink":"disable"}},{"before":null,"after":"cf21ba47d40edde8cbe3f763c434f5cbdda88096","ref":"refs/heads/vfwmaccbf16-fixed-patterns","pushedAt":"2024-09-11T12:13:56.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Add fixed length vector patterns for vfwmaccbf16.vv\n\nThis adds VL patterns for vfwmaccbf16.vv so that we can handle fixed length vectors.\n\nIt does this by teaching combineOp_VLToVWOp_VL to emit RISCVISD::VFWMADD_VL for bf16. The change in getOrCreateExtendedOp is needed because getNarrowType is based off of the bitwidth so returns f16. We need to explicitly check for bf16.\n\nNote that the .vf patterns don't work yet, since the build_vector pattern gets lowered to a vmv.v.x not a vfmv.v.f which SplatFP doesn't pick up, see #106637.","shortMessageHtmlLink":"[RISCV] Add fixed length vector patterns for vfwmaccbf16.vv"}},{"before":"d621e74f0981356f04f1dca97fae18028207dbdb","after":"176055a468db7013d9aee01d6641e6303a014213","ref":"refs/heads/vector-remat/vmv.s.x-vfmv.s.f","pushedAt":"2024-09-11T01:43:45.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vmv.s.x and vfmv.s.f\n\nContinuing with #107993 and #108007, this handles the last of the main rematerializable vector instructions.\n\n    Program            regalloc.NumSpills                regalloc.NumReloads                regalloc.NumRemats\n                       lhs                rhs      diff  lhs                 rhs      diff  lhs                rhs      diff\n            508.namd_r  6598.00            6598.00  0.0% 15509.00            15509.00  0.0%  2387.00            2387.00  0.0%\n             505.mcf_r   141.00             141.00  0.0%   372.00              372.00  0.0%    36.00              36.00  0.0%\n           641.leela_s   356.00             356.00  0.0%   525.00              525.00  0.0%   117.00             117.00  0.0%\n       631.deepsjeng_s   353.00             353.00  0.0%   682.00              682.00  0.0%   124.00             124.00  0.0%\n       623.xalancbmk_s  1548.00            1548.00  0.0%  2466.00             2466.00  0.0%   620.00             620.00  0.0%\n         620.omnetpp_s   946.00             946.00  0.0%  1485.00             1485.00  0.0%  1178.00            1178.00  0.0%\n             605.mcf_s   141.00             141.00  0.0%   372.00              372.00  0.0%    36.00              36.00  0.0%\n              557.xz_r   289.00             289.00  0.0%   505.00              505.00  0.0%   172.00             172.00  0.0%\n           541.leela_r   356.00             356.00  0.0%   525.00              525.00  0.0%   117.00             117.00  0.0%\n       531.deepsjeng_r   353.00             353.00  0.0%   682.00              682.00  0.0%   124.00             124.00  0.0%\n         520.omnetpp_r   946.00             946.00  0.0%  1485.00             1485.00  0.0%  1178.00            1178.00  0.0%\n       523.xalancbmk_r  1548.00            1548.00  0.0%  2466.00             2466.00  0.0%   620.00             620.00  0.0%\n             619.lbm_s    68.00              68.00  0.0%    70.00               70.00  0.0%     1.00               1.00  0.0%\n             519.lbm_r    73.00              73.00  0.0%    75.00               75.00  0.0%     1.00               1.00  0.0%\n              657.xz_s   289.00             289.00  0.0%   505.00              505.00  0.0%   172.00             172.00  0.0%\n          511.povray_r  1937.00            1936.00 -0.1%  3629.00             3628.00 -0.0%   517.00             518.00  0.2%\n             502.gcc_r 12450.00           12442.00 -0.1% 27328.00            27317.00 -0.0%  9409.00            9409.00  0.0%\n             602.gcc_s 12450.00           12442.00 -0.1% 27328.00            27317.00 -0.0%  9409.00            9409.00  0.0%\n         638.imagick_s  4181.00            4178.00 -0.1% 11342.00            11338.00 -0.0%  3366.00            3368.00  0.1%\n         538.imagick_r  4181.00            4178.00 -0.1% 11342.00            11338.00 -0.0%  3366.00            3368.00  0.1%\n       500.perlbench_r  4178.00            4175.00 -0.1%  9162.00             9159.00 -0.0%  2410.00            2410.00  0.0%\n       600.perlbench_s  4178.00            4175.00 -0.1%  9162.00             9159.00 -0.0%  2410.00            2410.00  0.0%\n            525.x264_r  1886.00            1884.00 -0.1%  4561.00             4559.00 -0.0%   471.00             471.00  0.0%\n            625.x264_s  1886.00            1884.00 -0.1%  4561.00             4559.00 -0.0%   471.00             471.00  0.0%\n          510.parest_r 42740.00           42689.00 -0.1% 82400.00            82252.00 -0.2%  5612.00            5620.00  0.1%\n             644.nab_s   753.00             752.00 -0.1%  1183.00             1182.00 -0.1%   318.00             318.00  0.0%\n             544.nab_r   753.00             752.00 -0.1%  1183.00             1182.00 -0.1%   318.00             318.00  0.0%\n         526.blender_r 13105.00           13084.00 -0.2% 26478.00            26442.00 -0.1% 18991.00           18989.00 -0.0%\nGeomean difference                             -0.0%                              -0.0%                              0.0%\n\nThere's an extra spill in one of the test cases, but it's likely noise from the spill weights and isn't an issue in practice.","shortMessageHtmlLink":"[RISCV] Rematerialize vmv.s.x and vfmv.s.f"}},{"before":"287fba4f444663db7a636b8bbb5a143d6443de08","after":"b19c5dfaaca9e77b9f331cdbf041ee89a1feda9d","ref":"refs/heads/vector-remat/vfmv.v.f","pushedAt":"2024-09-11T01:38:10.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vfmv.v.f\n\nThis is the same principle as vmv.v.x in #107993, but for floats.\n\n    Program            regalloc.NumSpills                regalloc.NumReloads                regalloc.NumRemats\n                       lhs                rhs      diff  lhs                 rhs      diff  lhs                rhs      diff\n             519.lbm_r    73.00              73.00  0.0%    75.00               75.00  0.0%     1.00               1.00  0.0%\n             544.nab_r   753.00             753.00  0.0%  1183.00             1183.00  0.0%   318.00             318.00  0.0%\n             619.lbm_s    68.00              68.00  0.0%    70.00               70.00  0.0%     1.00               1.00  0.0%\n             644.nab_s   753.00             753.00  0.0%  1183.00             1183.00  0.0%   318.00             318.00  0.0%\n            508.namd_r  6598.00            6597.00 -0.0% 15509.00            15503.00 -0.0%  2387.00            2393.00  0.3%\n         526.blender_r 13105.00           13084.00 -0.2% 26478.00            26443.00 -0.1% 18991.00           18996.00  0.0%\n          510.parest_r 42740.00           42665.00 -0.2% 82400.00            82309.00 -0.1%  5612.00            5648.00  0.6%\n          511.povray_r  1937.00            1929.00 -0.4%  3629.00             3620.00 -0.2%   517.00             525.00  1.5%\n         538.imagick_r  4181.00            4150.00 -0.7% 11342.00            11125.00 -1.9%  3366.00            3366.00  0.0%\n         638.imagick_s  4181.00            4150.00 -0.7% 11342.00            11125.00 -1.9%  3366.00            3366.00  0.0%\n    Geomean difference                             -0.2%                              -0.4%                              0.2%","shortMessageHtmlLink":"[RISCV] Rematerialize vfmv.v.f"}},{"before":null,"after":"d621e74f0981356f04f1dca97fae18028207dbdb","ref":"refs/heads/vector-remat/vmv.s.x-vfmv.s.f","pushedAt":"2024-09-10T12:11:51.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vmv.s.x and vfmv.s.f\n\nContinuing with #107993 and #108007, this handles the last of the main rematerializable vector instructions.\n\n    Program            regalloc.NumSpills                regalloc.NumReloads                regalloc.NumRemats\n                       lhs                rhs      diff  lhs                 rhs      diff  lhs                rhs      diff\n            508.namd_r  6598.00            6598.00  0.0% 15509.00            15509.00  0.0%  2387.00            2387.00  0.0%\n             505.mcf_r   141.00             141.00  0.0%   372.00              372.00  0.0%    36.00              36.00  0.0%\n           641.leela_s   356.00             356.00  0.0%   525.00              525.00  0.0%   117.00             117.00  0.0%\n       631.deepsjeng_s   353.00             353.00  0.0%   682.00              682.00  0.0%   124.00             124.00  0.0%\n       623.xalancbmk_s  1548.00            1548.00  0.0%  2466.00             2466.00  0.0%   620.00             620.00  0.0%\n         620.omnetpp_s   946.00             946.00  0.0%  1485.00             1485.00  0.0%  1178.00            1178.00  0.0%\n             605.mcf_s   141.00             141.00  0.0%   372.00              372.00  0.0%    36.00              36.00  0.0%\n              557.xz_r   289.00             289.00  0.0%   505.00              505.00  0.0%   172.00             172.00  0.0%\n           541.leela_r   356.00             356.00  0.0%   525.00              525.00  0.0%   117.00             117.00  0.0%\n       531.deepsjeng_r   353.00             353.00  0.0%   682.00              682.00  0.0%   124.00             124.00  0.0%\n         520.omnetpp_r   946.00             946.00  0.0%  1485.00             1485.00  0.0%  1178.00            1178.00  0.0%\n       523.xalancbmk_r  1548.00            1548.00  0.0%  2466.00             2466.00  0.0%   620.00             620.00  0.0%\n             619.lbm_s    68.00              68.00  0.0%    70.00               70.00  0.0%     1.00               1.00  0.0%\n             519.lbm_r    73.00              73.00  0.0%    75.00               75.00  0.0%     1.00               1.00  0.0%\n              657.xz_s   289.00             289.00  0.0%   505.00              505.00  0.0%   172.00             172.00  0.0%\n          511.povray_r  1937.00            1936.00 -0.1%  3629.00             3628.00 -0.0%   517.00             518.00  0.2%\n             502.gcc_r 12450.00           12442.00 -0.1% 27328.00            27317.00 -0.0%  9409.00            9409.00  0.0%\n             602.gcc_s 12450.00           12442.00 -0.1% 27328.00            27317.00 -0.0%  9409.00            9409.00  0.0%\n         638.imagick_s  4181.00            4178.00 -0.1% 11342.00            11338.00 -0.0%  3366.00            3368.00  0.1%\n         538.imagick_r  4181.00            4178.00 -0.1% 11342.00            11338.00 -0.0%  3366.00            3368.00  0.1%\n       500.perlbench_r  4178.00            4175.00 -0.1%  9162.00             9159.00 -0.0%  2410.00            2410.00  0.0%\n       600.perlbench_s  4178.00            4175.00 -0.1%  9162.00             9159.00 -0.0%  2410.00            2410.00  0.0%\n            525.x264_r  1886.00            1884.00 -0.1%  4561.00             4559.00 -0.0%   471.00             471.00  0.0%\n            625.x264_s  1886.00            1884.00 -0.1%  4561.00             4559.00 -0.0%   471.00             471.00  0.0%\n          510.parest_r 42740.00           42689.00 -0.1% 82400.00            82252.00 -0.2%  5612.00            5620.00  0.1%\n             644.nab_s   753.00             752.00 -0.1%  1183.00             1182.00 -0.1%   318.00             318.00  0.0%\n             544.nab_r   753.00             752.00 -0.1%  1183.00             1182.00 -0.1%   318.00             318.00  0.0%\n         526.blender_r 13105.00           13084.00 -0.2% 26478.00            26442.00 -0.1% 18991.00           18989.00 -0.0%\nGeomean difference                             -0.0%                              -0.0%                              0.0%\n\nThere's an extra spill in one of the test cases, but it's likely noise from the spill weights and isn't an issue in practice.","shortMessageHtmlLink":"[RISCV] Rematerialize vmv.s.x and vfmv.s.f"}},{"before":null,"after":"287fba4f444663db7a636b8bbb5a143d6443de08","ref":"refs/heads/vector-remat/vfmv.v.f","pushedAt":"2024-09-10T11:45:37.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vfmv.v.f\n\nThis is the same principle as vmv.v.x in #107993, but for floats.\n\n    Program            regalloc.NumSpills                regalloc.NumReloads                regalloc.NumRemats\n                       lhs                rhs      diff  lhs                 rhs      diff  lhs                rhs      diff\n             519.lbm_r    73.00              73.00  0.0%    75.00               75.00  0.0%     1.00               1.00  0.0%\n             544.nab_r   753.00             753.00  0.0%  1183.00             1183.00  0.0%   318.00             318.00  0.0%\n             619.lbm_s    68.00              68.00  0.0%    70.00               70.00  0.0%     1.00               1.00  0.0%\n             644.nab_s   753.00             753.00  0.0%  1183.00             1183.00  0.0%   318.00             318.00  0.0%\n            508.namd_r  6598.00            6597.00 -0.0% 15509.00            15503.00 -0.0%  2387.00            2393.00  0.3%\n         526.blender_r 13105.00           13084.00 -0.2% 26478.00            26443.00 -0.1% 18991.00           18996.00  0.0%\n          510.parest_r 42740.00           42665.00 -0.2% 82400.00            82309.00 -0.1%  5612.00            5648.00  0.6%\n          511.povray_r  1937.00            1929.00 -0.4%  3629.00             3620.00 -0.2%   517.00             525.00  1.5%\n         538.imagick_r  4181.00            4150.00 -0.7% 11342.00            11125.00 -1.9%  3366.00            3366.00  0.0%\n         638.imagick_s  4181.00            4150.00 -0.7% 11342.00            11125.00 -1.9%  3366.00            3366.00  0.0%\n    Geomean difference                             -0.2%                              -0.4%                              0.2%","shortMessageHtmlLink":"[RISCV] Rematerialize vfmv.v.f"}},{"before":null,"after":"2d7701d6f11b31f9660f43a889470a75f21f350b","ref":"refs/heads/vector-remat/vmv.v.x","pushedAt":"2024-09-10T09:45:42.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vmv.v.x\n\nEven though vmv.v.x has a non constant scalar operand, because we have split register allocation between vectors and scalars on RISC-V we can rematerialize it.\n\n    Program            regalloc.NumSpills                regalloc.NumReloads                regalloc.NumReMaterialization\n                       lhs                rhs      diff  lhs                 rhs      diff  lhs                           rhs      diff\n              657.xz_s   289.00             292.00  1.0%   505.00              484.00 -4.2%   613.00                        612.00 -0.2%\n              557.xz_r   289.00             292.00  1.0%   505.00              484.00 -4.2%   613.00                        612.00 -0.2%\n             505.mcf_r   141.00             141.00  0.0%   372.00              372.00  0.0%   123.00                        123.00  0.0%\n           641.leela_s   356.00             356.00  0.0%   525.00              525.00  0.0%   801.00                        801.00  0.0%\n            625.x264_s  1886.00            1886.00  0.0%  4561.00             4561.00  0.0%  2108.00                       2108.00  0.0%\n       623.xalancbmk_s  1548.00            1548.00  0.0%  2466.00             2466.00  0.0% 13983.00                      13983.00  0.0%\n         620.omnetpp_s   946.00             946.00  0.0%  1485.00             1485.00  0.0%  8413.00                       8413.00  0.0%\n             605.mcf_s   141.00             141.00  0.0%   372.00              372.00  0.0%   123.00                        123.00  0.0%\n           541.leela_r   356.00             356.00  0.0%   525.00              525.00  0.0%   801.00                        801.00  0.0%\n            525.x264_r  1886.00            1886.00  0.0%  4561.00             4561.00  0.0%  2108.00                       2108.00  0.0%\n          510.parest_r 42740.00           42740.00  0.0% 82400.00            82400.00  0.0% 65165.00                      65165.00  0.0%\n         520.omnetpp_r   946.00             946.00  0.0%  1485.00             1485.00  0.0%  8413.00                       8413.00  0.0%\n            508.namd_r  6598.00            6598.00  0.0% 15509.00            15509.00  0.0%  3164.00                       3164.00  0.0%\n             644.nab_s   753.00             753.00  0.0%  1183.00             1183.00  0.0%  1559.00                       1559.00  0.0%\n             619.lbm_s    68.00              68.00  0.0%    70.00               70.00  0.0%    20.00                         20.00  0.0%\n             544.nab_r   753.00             753.00  0.0%  1183.00             1183.00  0.0%  1559.00                       1559.00  0.0%\n             519.lbm_r    73.00              73.00  0.0%    75.00               75.00  0.0%    18.00                         18.00  0.0%\n          511.povray_r  1937.00            1937.00  0.0%  3629.00             3629.00  0.0%  4914.00                       4914.00  0.0%\n       523.xalancbmk_r  1548.00            1548.00  0.0%  2466.00             2466.00  0.0% 13983.00                      13983.00  0.0%\n             502.gcc_r 12450.00           12446.00 -0.0% 27328.00            27312.00 -0.1% 50527.00                      50533.00  0.0%\n             602.gcc_s 12450.00           12446.00 -0.0% 27328.00            27312.00 -0.1% 50527.00                      50533.00  0.0%\n       500.perlbench_r  4178.00            4175.00 -0.1%  9162.00             9061.00 -1.1% 10223.00                      10392.00  1.7%\n       600.perlbench_s  4178.00            4175.00 -0.1%  9162.00             9061.00 -1.1% 10223.00                      10392.00  1.7%\n         526.blender_r 13105.00           13081.00 -0.2% 26478.00            26438.00 -0.2% 65188.00                      65230.00  0.1%\n         638.imagick_s  4181.00            4157.00 -0.6% 11342.00            11316.00 -0.2% 10884.00                      10938.00  0.5%\n         538.imagick_r  4181.00            4157.00 -0.6% 11342.00            11316.00 -0.2% 10884.00                      10938.00  0.5%\n       531.deepsjeng_r   353.00             345.00 -2.3%   682.00              674.00 -1.2%   530.00                        538.00  1.5%\n       631.deepsjeng_s   353.00             345.00 -2.3%   682.00              674.00 -1.2%   530.00                        538.00  1.5%\n    Geomean difference                             -0.1%                              -0.5%                                         0.3%\n\nThe slight increase in spills in the xz benchmarks are from scalar spills (presumably due to more uses of the scalar operand affecting spill weights), we still manage to remove some vector spills in it too.\n\nInlineSpiller will check to make sure that the scalar operand is live at the point where the rematerialization occurs, so this won't extend any scalar live ranges. However this also means we may not be able to rematerialize in some cases, as shown in @vmv.v.x_needs_extended.\n\nIt might be worthwhile teaching InlineSpiller to extend scalar live ranges in a future patch. I experimented with this locally and it reduced spills on 531.deepsjeng_r by a further 3%.","shortMessageHtmlLink":"[RISCV] Rematerialize vmv.v.x"}},{"before":null,"after":"312b2dfac1dacc5c4b1afb346accf7a1ecb294ad","ref":"refs/heads/vector-peephole-vmerge-to-vmv.v.v-same-mask-passthru-fix","pushedAt":"2024-09-09T09:07:33.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Fix same mask vmerge peephole discarding false operand\n\nThis fixes the issue raised in https://github.com/llvm/llvm-project/pull/106108#discussion_r1749677510\n\nTrue's passthru needs to be equivalent to vmerge's false, but we also allow true's passthru to be undef.\n\nHowever if it's undef then we need to replace it with vmerge's false, otherwise we end up discarding the false operand entirely.\n\nThe changes in fixed-vectors-strided-load-store-asm.ll undo the changes in #106108 where we introduced this miscompile.","shortMessageHtmlLink":"[RISCV] Fix same mask vmerge peephole discarding false operand"}},{"before":null,"after":"76d252b877fe3deeebf6a8111deb9f2fc6250d35","ref":"refs/heads/vector-remat/vmv.v.i","pushedAt":"2024-09-06T09:36:36.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vmv.v.i\n\nThis continues the line of work started in #97520, and gives a 2.5% reduction in the number of spills on SPEC CPU 2017.\n\n    Program            regalloc.NumSpills                regalloc.NumReloads                regalloc.NumReMaterialization\n                       lhs                rhs      diff  lhs                 rhs      diff  lhs                           rhs      diff\n             605.mcf_s   141.00             141.00  0.0%   372.00              372.00  0.0%   123.00                        123.00  0.0%\n             505.mcf_r   141.00             141.00  0.0%   372.00              372.00  0.0%   123.00                        123.00  0.0%\n             519.lbm_r    73.00              73.00  0.0%    75.00               75.00  0.0%    18.00                         18.00  0.0%\n             619.lbm_s    68.00              68.00  0.0%    70.00               70.00  0.0%    20.00                         20.00  0.0%\n       631.deepsjeng_s   354.00             353.00 -0.3%   683.00              682.00 -0.1%   529.00                        530.00  0.2%\n       531.deepsjeng_r   354.00             353.00 -0.3%   683.00              682.00 -0.1%   529.00                        530.00  0.2%\n            625.x264_s  1896.00            1886.00 -0.5%  4583.00             4561.00 -0.5%  2086.00                       2108.00  1.1%\n            525.x264_r  1896.00            1886.00 -0.5%  4583.00             4561.00 -0.5%  2086.00                       2108.00  1.1%\n            508.namd_r  6665.00            6598.00 -1.0% 15649.00            15509.00 -0.9%  3014.00                       3164.00  5.0%\n             644.nab_s   761.00             753.00 -1.1%  1199.00             1183.00 -1.3%  1542.00                       1559.00  1.1%\n             544.nab_r   761.00             753.00 -1.1%  1199.00             1183.00 -1.3%  1542.00                       1559.00  1.1%\n         638.imagick_s  4287.00            4181.00 -2.5% 11624.00            11342.00 -2.4% 10551.00                      10884.00  3.2%\n         538.imagick_r  4287.00            4181.00 -2.5% 11624.00            11342.00 -2.4% 10551.00                      10884.00  3.2%\n             602.gcc_s 12771.00           12450.00 -2.5% 28117.00            27328.00 -2.8% 49757.00                      50526.00  1.5%\n             502.gcc_r 12771.00           12450.00 -2.5% 28117.00            27328.00 -2.8% 49757.00                      50526.00  1.5%\n    Geomean difference                             -2.5%                              -2.6%                                         1.8%\n\nI initially held off submitting this patch because it surprisingly introduced a lot of spills in the test diffs, but after #107290 the vmv.v.is that caused them are now gone.\n\nThe gist is that marking vmv.v.i as spillable decreased its spill weight, which actually resulted in more m8 registers getting evicted and spilled during register allocation.\n\nThe SPEC results show this isn't an issue in practice though, and I plan on posting a separate patch to explain this in more detail.","shortMessageHtmlLink":"[RISCV] Rematerialize vmv.v.i"}},{"before":"f1b20d57977be380524c0c0bc026857f525e09b3","after":"22684edceb1bf4390262e602c2728ecdbc43063d","ref":"refs/heads/vector-peephole-vmerge-to-vmv.v.v-same-mask","pushedAt":"2024-09-06T00:00:14.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Assert v0defs non null","shortMessageHtmlLink":"Assert v0defs non null"}},{"before":"34c2cf071232f833b8c5e3c038e2231bc3675a54","after":"f1b20d57977be380524c0c0bc026857f525e09b3","ref":"refs/heads/vector-peephole-vmerge-to-vmv.v.v-same-mask","pushedAt":"2024-09-05T23:50:12.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Remove dead break","shortMessageHtmlLink":"Remove dead break"}},{"before":"b11c15acbb5364edca431ab875ae8e36003b51e2","after":"34c2cf071232f833b8c5e3c038e2231bc3675a54","ref":"refs/heads/vector-peephole-vmerge-to-vmv.v.v-same-mask","pushedAt":"2024-09-05T16:30:15.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Move vmerge same mask peephole to RISCVVectorPeephole\n\nWe currently fold a vmerge.vvm into its true operand if the true operand is a masked pseudo with the same mask.\n\nWe can move this over to RISCVVectorPeephole by instead splitting it up into a smaller peephole which converts it to a vmv.v.v first. The existing foldVMV_V_V peephole will then take care of folding it if needed.\n\nThis is very similar to the existing all-ones mask peephole and we could potentially do it inside of it. I opted to put it in a separate peephole to make it easier to reason about, given that the duplication is small, but I could be persuaded either way.","shortMessageHtmlLink":"[RISCV] Move vmerge same mask peephole to RISCVVectorPeephole"}},{"before":"42fac4be0734125518d6759eafc552732dd4a7f7","after":"1b0da2919e1681bacfbf7cef319502dc4b81c2f4","ref":"refs/heads/vector-peephole-ensure-dominates-update-v0defs","pushedAt":"2024-09-05T15:03:29.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Use V0Def of node being inserted before","shortMessageHtmlLink":"Use V0Def of node being inserted before"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0yMFQwODoxODozOC4wMDAwMDBazwAAAAS7tciG","startCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0yMFQwODoxODozOC4wMDAwMDBazwAAAAS7tciG","endCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0wNVQxNTowMzoyOS4wMDAwMDBazwAAAASt_R1s"}},"title":"Activity · lukel97/llvm-project"}