{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":629064630,"defaultBranch":"main","name":"llvm-project","ownerLogin":"lukel97","currentUserCanPush":false,"isFork":true,"isEmpty":false,"createdAt":"2023-04-17T14:44:01.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/2488460?v=4","public":true,"private":false,"isOrgOwned":false},"refInfo":{"name":"","listCacheKey":"v0:1726820318.0","currentOid":""},"activityList":{"items":[{"before":null,"after":"d373bcc5708b58728768fb5bb630b683b4d9e695","ref":"refs/heads/zvfhmin-zvfbfmin-custom-lower-memory-ops","pushedAt":"2024-09-20T08:18:38.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Lower memory ops and VP splat for zvfhmin and zvfbfmin\n\nWe can lower f16/bf16 memory ops without promotion through the existing custom lowering.\n\nSome of the zero strided VP loads get combined to a VP splat, so we need to also handle the lowering for that for f16/bf16 w/ zvfhmin/zvfbfmin. This patch copies the lowering from ISD::SPLAT_VECTOR over to lowerScalarSplat which is used by the VP splat lowering.","shortMessageHtmlLink":"[RISCV] Lower memory ops and VP splat for zvfhmin and zvfbfmin"}},{"before":"77d1032516e7057f185c5137071e4a97c3f3eb30","after":"737f56fdf7d8df4f1349085fe7256e27778e4a51","ref":"refs/heads/main","pushedAt":"2024-09-18T10:20:21.000Z","pushType":"push","commitsCount":10000,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Deduplicate zvfhmin and zvfbfmin operation actions. NFC\n\nAfter #108937 fp16 w/o zvfh and bf16 are now in sync and should have\nthe same lowering.","shortMessageHtmlLink":"[RISCV] Deduplicate zvfhmin and zvfbfmin operation actions. NFC"}},{"before":"d091eb11d9d841d366b0bb2e09fa75cf959a110c","after":"bb08e71be1fa928831d1aa1cb58207887940f826","ref":"refs/heads/zvfbfmin/promote","pushedAt":"2024-09-18T08:42:27.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Promote bf16 ops to f32 with zvfbfmin\n\nFor f16 with zvfhmin, we promote most ops and VP ops to f32. This does the same for bf16 with zvfbfmin, so the two fp types should now be in sync.\n\nThere are a few places in the custom lowering where we need to check for a LMUL 8 f16/bf16 vector that can't be promoted and must be split, this extracts that out into isPromotedOpNeedingSplit.\n\nIn a follow up NFC we can deduplicate the code that sets up the promotions.","shortMessageHtmlLink":"[RISCV] Promote bf16 ops to f32 with zvfbfmin"}},{"before":null,"after":"d091eb11d9d841d366b0bb2e09fa75cf959a110c","ref":"refs/heads/zvfbfmin/promote","pushedAt":"2024-09-17T07:33:43.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Promote bf16 ops to f32 with zvfbfmin\n\nFor f16 with zvfhmin, we promote most ops and VP ops to f32. This does the same for bf16 with zvfbfmin, so the two fp types should now be in sync.\n\nThere are a few places in the custom lowering where we need to check for a LMUL 8 f16/bf16 vector that can't be promoted and must be split, this extracts that out into isPromotedOpNeedingSplit.\n\nIn a follow up NFC we can deduplicate the code that sets up the promotions.","shortMessageHtmlLink":"[RISCV] Promote bf16 ops to f32 with zvfbfmin"}},{"before":"7eb4dfb0524eab81f5509944a04ab64e0c09986c","after":"0122f932e7be58ae2d9288a7ea8dec34598ac8fa","ref":"refs/heads/combineOp_VLToVWOp_VL-zvfbfmin","pushedAt":"2024-09-17T02:07:12.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Adjust test case comment","shortMessageHtmlLink":"Adjust test case comment"}},{"before":"d3b112857eb04ef97cff41edf8f2f5cf670df5ce","after":"7eb4dfb0524eab81f5509944a04ab64e0c09986c","ref":"refs/heads/combineOp_VLToVWOp_VL-zvfbfmin","pushedAt":"2024-09-17T02:01:24.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Add test case for multiple uses, where not all of them are widenable","shortMessageHtmlLink":"Add test case for multiple uses, where not all of them are widenable"}},{"before":null,"after":"d3b112857eb04ef97cff41edf8f2f5cf670df5ce","ref":"refs/heads/combineOp_VLToVWOp_VL-zvfbfmin","pushedAt":"2024-09-16T07:23:08.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Restrict combineOp_VLToVWOp_VL w/ bf16 to vfwmadd_vl with zvfbfwma\n\nWe currently make sure to check that if widening a f16 vector that we have zvfh. We need to do the same for bf16 vectors, but with the further restriction that we can only combine vfmadd_vl to vfwmadd_vl (to get vfwmaccbf16.v{v,f}).\n\nThis moves the checks into the extension support checks to keep it one place.","shortMessageHtmlLink":"[RISCV] Restrict combineOp_VLToVWOp_VL w/ bf16 to vfwmadd_vl with zvf…"}},{"before":"2f4d238eb3a673025a49af05e7a38bd618f8fa86","after":"0350ec6b7d17b8442565b8f1b44cf5d6202e0fe2","ref":"refs/heads/zvfhmin/rounding-ops","pushedAt":"2024-09-16T05:43:13.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Update fixed length vector tests\n\nMarking ftrunc as promoted means these are no longer expanded","shortMessageHtmlLink":"Update fixed length vector tests"}},{"before":null,"after":"2f4d238eb3a673025a49af05e7a38bd618f8fa86","ref":"refs/heads/zvfhmin/rounding-ops","pushedAt":"2024-09-15T16:00:43.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Split fp rounding ops with zvfhmin nxv32f16\n\nThis adds zvfhmin test coverage for fceil, ffloor, fnearbyint, frint, fround and froundeven and splits them at nxv32f16 to avoid crashing, similarly to what we do for other nodes that we promote.\n\nThis also sets ftrunc to promote which was previously missing. We already promote the VP version of it, vp_froundtozero.","shortMessageHtmlLink":"[RISCV] Split fp rounding ops with zvfhmin nxv32f16"}},{"before":null,"after":"bee77b3c2a9b73448587293facf9b33dec1b07a9","ref":"refs/heads/zvfhmin-zvfbfmin-interleave-deinterleave","pushedAt":"2024-09-12T15:00:02.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Lower interleave + deinterleave for zvfhmin and zvfbfmin\n\nFortunately f16 and bf16 are always < EEW, so we can always lower via widening or narrowing. This means we don't need to add patterns for vrgather_vv_vl just yet.","shortMessageHtmlLink":"[RISCV] Lower interleave + deinterleave for zvfhmin and zvfbfmin"}},{"before":"05ea1adc01fb0ea82a79cd1d2b000535225cacc8","after":"c0059801ef3f2a55be0e84a4ba5bdbc57e8dd863","ref":"refs/heads/zvfhmin-zvfbfmin-resuage","pushedAt":"2024-09-12T11:39:12.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Clarify wording in comment","shortMessageHtmlLink":"Clarify wording in comment"}},{"before":null,"after":"05ea1adc01fb0ea82a79cd1d2b000535225cacc8","ref":"refs/heads/zvfhmin-zvfbfmin-resuage","pushedAt":"2024-09-12T11:36:01.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Account for zvfhmin and zvfbfmin promotion in register usage\n\nA half with only zvfhmin or bfloat will end up getting promoted to a f32 for most instructions.\n\nUnless the loop consists only of memory ops and permutation instructions which don't need promoted (is this common?), we'll end up using double the LMUL than what's currently being returned by getRegUsageForType.\n\nSince this is used by the loop vectorizer, it seems better to be conservative and assume that any usage of a zvfhmin half/bfloat will end up being widened to a f32.","shortMessageHtmlLink":"[RISCV] Account for zvfhmin and zvfbfmin promotion in register usage"}},{"before":null,"after":"99a1ed168e47fbc6aefaa867f40bce8f4175d773","ref":"refs/heads/f16/cost-model-promote","pushedAt":"2024-09-12T10:52:48.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Handle zvfhmin promotion to f32 in half arith costs\n\nArithmetic half ops on zvfhmin will be promoted and carried out in f32, so this updates getArithmeticInstrCost to check for this.","shortMessageHtmlLink":"[RISCV] Handle zvfhmin promotion to f32 in half arith costs"}},{"before":null,"after":"932efa82ee330d2c0a1c96f767e30e212e07a813","ref":"refs/heads/bf16-convert-lowering","pushedAt":"2024-09-12T07:24:19.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Lower bf16 {S,U}INT_TO_FP, FP_TO_{S,U}INT and VP variants\n\nThis handles int->fp/fp->int nodes for zvfbfmin, reusing the same parts that f16 uses with zvfhmin.\n\nThere's quite a bit of replication here that can probably be cleaned up at some point.","shortMessageHtmlLink":"[RISCV] Lower bf16 {S,U}INT_TO_FP, FP_TO_{S,U}INT and VP variants"}},{"before":"0e23f24ecc26bd94a0d5523856a00a3d9ac5e08b","after":"b95b1fe985ecf329ec17df9c1f5f6b6268cfd513","ref":"refs/heads/zvfbfmin-fabs-fcopysign-fneg-expand","pushedAt":"2024-09-12T00:45:49.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Rename test files to put bf16 as suffix","shortMessageHtmlLink":"Rename test files to put bf16 as suffix"}},{"before":null,"after":"0e23f24ecc26bd94a0d5523856a00a3d9ac5e08b","ref":"refs/heads/zvfbfmin-fabs-fcopysign-fneg-expand","pushedAt":"2024-09-11T16:31:26.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Expand bf16 FNEG/FABS/FCOPYSIGN\n\nThe motivation for this is to start promoting bf16 ops to f32 so that we can mark bf16 as a supported type in RISCVTTIImpl::isElementTypeLegalForScalableVector and scalably-vectorize it.\n\nThis starts with expanding the nodes that can't be promoted to f32 due to canonicalizing NaNs, similarly to f16 in #106652.","shortMessageHtmlLink":"[RISCV] Expand bf16 FNEG/FABS/FCOPYSIGN"}},{"before":null,"after":"0a36f6f72bec63039b71c709df5d17d1155fa36e","ref":"refs/heads/bf16-extload-truncstore-crash-fix","pushedAt":"2024-09-11T14:44:59.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Expand bf16 vector truncstores and extloads\n\nPreviously they were legal by default, so the truncstore/extload test cases would get combined and crash during selection.\nThese are set to expand for f16 so do the same for bf16.","shortMessageHtmlLink":"[RISCV] Expand bf16 vector truncstores and extloads"}},{"before":null,"after":"50378679ad48f8c1d677e1174c5a4ae71c63a582","ref":"refs/heads/expandmemcmp-scratch","pushedAt":"2024-09-11T13:57:49.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"disable","shortMessageHtmlLink":"disable"}},{"before":null,"after":"cf21ba47d40edde8cbe3f763c434f5cbdda88096","ref":"refs/heads/vfwmaccbf16-fixed-patterns","pushedAt":"2024-09-11T12:13:56.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Add fixed length vector patterns for vfwmaccbf16.vv\n\nThis adds VL patterns for vfwmaccbf16.vv so that we can handle fixed length vectors.\n\nIt does this by teaching combineOp_VLToVWOp_VL to emit RISCVISD::VFWMADD_VL for bf16. The change in getOrCreateExtendedOp is needed because getNarrowType is based off of the bitwidth so returns f16. We need to explicitly check for bf16.\n\nNote that the .vf patterns don't work yet, since the build_vector pattern gets lowered to a vmv.v.x not a vfmv.v.f which SplatFP doesn't pick up, see #106637.","shortMessageHtmlLink":"[RISCV] Add fixed length vector patterns for vfwmaccbf16.vv"}},{"before":"d621e74f0981356f04f1dca97fae18028207dbdb","after":"176055a468db7013d9aee01d6641e6303a014213","ref":"refs/heads/vector-remat/vmv.s.x-vfmv.s.f","pushedAt":"2024-09-11T01:43:45.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vmv.s.x and vfmv.s.f\n\nContinuing with #107993 and #108007, this handles the last of the main rematerializable vector instructions.\n\n Program regalloc.NumSpills regalloc.NumReloads regalloc.NumRemats\n lhs rhs diff lhs rhs diff lhs rhs diff\n 508.namd_r 6598.00 6598.00 0.0% 15509.00 15509.00 0.0% 2387.00 2387.00 0.0%\n 505.mcf_r 141.00 141.00 0.0% 372.00 372.00 0.0% 36.00 36.00 0.0%\n 641.leela_s 356.00 356.00 0.0% 525.00 525.00 0.0% 117.00 117.00 0.0%\n 631.deepsjeng_s 353.00 353.00 0.0% 682.00 682.00 0.0% 124.00 124.00 0.0%\n 623.xalancbmk_s 1548.00 1548.00 0.0% 2466.00 2466.00 0.0% 620.00 620.00 0.0%\n 620.omnetpp_s 946.00 946.00 0.0% 1485.00 1485.00 0.0% 1178.00 1178.00 0.0%\n 605.mcf_s 141.00 141.00 0.0% 372.00 372.00 0.0% 36.00 36.00 0.0%\n 557.xz_r 289.00 289.00 0.0% 505.00 505.00 0.0% 172.00 172.00 0.0%\n 541.leela_r 356.00 356.00 0.0% 525.00 525.00 0.0% 117.00 117.00 0.0%\n 531.deepsjeng_r 353.00 353.00 0.0% 682.00 682.00 0.0% 124.00 124.00 0.0%\n 520.omnetpp_r 946.00 946.00 0.0% 1485.00 1485.00 0.0% 1178.00 1178.00 0.0%\n 523.xalancbmk_r 1548.00 1548.00 0.0% 2466.00 2466.00 0.0% 620.00 620.00 0.0%\n 619.lbm_s 68.00 68.00 0.0% 70.00 70.00 0.0% 1.00 1.00 0.0%\n 519.lbm_r 73.00 73.00 0.0% 75.00 75.00 0.0% 1.00 1.00 0.0%\n 657.xz_s 289.00 289.00 0.0% 505.00 505.00 0.0% 172.00 172.00 0.0%\n 511.povray_r 1937.00 1936.00 -0.1% 3629.00 3628.00 -0.0% 517.00 518.00 0.2%\n 502.gcc_r 12450.00 12442.00 -0.1% 27328.00 27317.00 -0.0% 9409.00 9409.00 0.0%\n 602.gcc_s 12450.00 12442.00 -0.1% 27328.00 27317.00 -0.0% 9409.00 9409.00 0.0%\n 638.imagick_s 4181.00 4178.00 -0.1% 11342.00 11338.00 -0.0% 3366.00 3368.00 0.1%\n 538.imagick_r 4181.00 4178.00 -0.1% 11342.00 11338.00 -0.0% 3366.00 3368.00 0.1%\n 500.perlbench_r 4178.00 4175.00 -0.1% 9162.00 9159.00 -0.0% 2410.00 2410.00 0.0%\n 600.perlbench_s 4178.00 4175.00 -0.1% 9162.00 9159.00 -0.0% 2410.00 2410.00 0.0%\n 525.x264_r 1886.00 1884.00 -0.1% 4561.00 4559.00 -0.0% 471.00 471.00 0.0%\n 625.x264_s 1886.00 1884.00 -0.1% 4561.00 4559.00 -0.0% 471.00 471.00 0.0%\n 510.parest_r 42740.00 42689.00 -0.1% 82400.00 82252.00 -0.2% 5612.00 5620.00 0.1%\n 644.nab_s 753.00 752.00 -0.1% 1183.00 1182.00 -0.1% 318.00 318.00 0.0%\n 544.nab_r 753.00 752.00 -0.1% 1183.00 1182.00 -0.1% 318.00 318.00 0.0%\n 526.blender_r 13105.00 13084.00 -0.2% 26478.00 26442.00 -0.1% 18991.00 18989.00 -0.0%\nGeomean difference -0.0% -0.0% 0.0%\n\nThere's an extra spill in one of the test cases, but it's likely noise from the spill weights and isn't an issue in practice.","shortMessageHtmlLink":"[RISCV] Rematerialize vmv.s.x and vfmv.s.f"}},{"before":"287fba4f444663db7a636b8bbb5a143d6443de08","after":"b19c5dfaaca9e77b9f331cdbf041ee89a1feda9d","ref":"refs/heads/vector-remat/vfmv.v.f","pushedAt":"2024-09-11T01:38:10.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vfmv.v.f\n\nThis is the same principle as vmv.v.x in #107993, but for floats.\n\n Program regalloc.NumSpills regalloc.NumReloads regalloc.NumRemats\n lhs rhs diff lhs rhs diff lhs rhs diff\n 519.lbm_r 73.00 73.00 0.0% 75.00 75.00 0.0% 1.00 1.00 0.0%\n 544.nab_r 753.00 753.00 0.0% 1183.00 1183.00 0.0% 318.00 318.00 0.0%\n 619.lbm_s 68.00 68.00 0.0% 70.00 70.00 0.0% 1.00 1.00 0.0%\n 644.nab_s 753.00 753.00 0.0% 1183.00 1183.00 0.0% 318.00 318.00 0.0%\n 508.namd_r 6598.00 6597.00 -0.0% 15509.00 15503.00 -0.0% 2387.00 2393.00 0.3%\n 526.blender_r 13105.00 13084.00 -0.2% 26478.00 26443.00 -0.1% 18991.00 18996.00 0.0%\n 510.parest_r 42740.00 42665.00 -0.2% 82400.00 82309.00 -0.1% 5612.00 5648.00 0.6%\n 511.povray_r 1937.00 1929.00 -0.4% 3629.00 3620.00 -0.2% 517.00 525.00 1.5%\n 538.imagick_r 4181.00 4150.00 -0.7% 11342.00 11125.00 -1.9% 3366.00 3366.00 0.0%\n 638.imagick_s 4181.00 4150.00 -0.7% 11342.00 11125.00 -1.9% 3366.00 3366.00 0.0%\n Geomean difference -0.2% -0.4% 0.2%","shortMessageHtmlLink":"[RISCV] Rematerialize vfmv.v.f"}},{"before":null,"after":"d621e74f0981356f04f1dca97fae18028207dbdb","ref":"refs/heads/vector-remat/vmv.s.x-vfmv.s.f","pushedAt":"2024-09-10T12:11:51.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vmv.s.x and vfmv.s.f\n\nContinuing with #107993 and #108007, this handles the last of the main rematerializable vector instructions.\n\n Program regalloc.NumSpills regalloc.NumReloads regalloc.NumRemats\n lhs rhs diff lhs rhs diff lhs rhs diff\n 508.namd_r 6598.00 6598.00 0.0% 15509.00 15509.00 0.0% 2387.00 2387.00 0.0%\n 505.mcf_r 141.00 141.00 0.0% 372.00 372.00 0.0% 36.00 36.00 0.0%\n 641.leela_s 356.00 356.00 0.0% 525.00 525.00 0.0% 117.00 117.00 0.0%\n 631.deepsjeng_s 353.00 353.00 0.0% 682.00 682.00 0.0% 124.00 124.00 0.0%\n 623.xalancbmk_s 1548.00 1548.00 0.0% 2466.00 2466.00 0.0% 620.00 620.00 0.0%\n 620.omnetpp_s 946.00 946.00 0.0% 1485.00 1485.00 0.0% 1178.00 1178.00 0.0%\n 605.mcf_s 141.00 141.00 0.0% 372.00 372.00 0.0% 36.00 36.00 0.0%\n 557.xz_r 289.00 289.00 0.0% 505.00 505.00 0.0% 172.00 172.00 0.0%\n 541.leela_r 356.00 356.00 0.0% 525.00 525.00 0.0% 117.00 117.00 0.0%\n 531.deepsjeng_r 353.00 353.00 0.0% 682.00 682.00 0.0% 124.00 124.00 0.0%\n 520.omnetpp_r 946.00 946.00 0.0% 1485.00 1485.00 0.0% 1178.00 1178.00 0.0%\n 523.xalancbmk_r 1548.00 1548.00 0.0% 2466.00 2466.00 0.0% 620.00 620.00 0.0%\n 619.lbm_s 68.00 68.00 0.0% 70.00 70.00 0.0% 1.00 1.00 0.0%\n 519.lbm_r 73.00 73.00 0.0% 75.00 75.00 0.0% 1.00 1.00 0.0%\n 657.xz_s 289.00 289.00 0.0% 505.00 505.00 0.0% 172.00 172.00 0.0%\n 511.povray_r 1937.00 1936.00 -0.1% 3629.00 3628.00 -0.0% 517.00 518.00 0.2%\n 502.gcc_r 12450.00 12442.00 -0.1% 27328.00 27317.00 -0.0% 9409.00 9409.00 0.0%\n 602.gcc_s 12450.00 12442.00 -0.1% 27328.00 27317.00 -0.0% 9409.00 9409.00 0.0%\n 638.imagick_s 4181.00 4178.00 -0.1% 11342.00 11338.00 -0.0% 3366.00 3368.00 0.1%\n 538.imagick_r 4181.00 4178.00 -0.1% 11342.00 11338.00 -0.0% 3366.00 3368.00 0.1%\n 500.perlbench_r 4178.00 4175.00 -0.1% 9162.00 9159.00 -0.0% 2410.00 2410.00 0.0%\n 600.perlbench_s 4178.00 4175.00 -0.1% 9162.00 9159.00 -0.0% 2410.00 2410.00 0.0%\n 525.x264_r 1886.00 1884.00 -0.1% 4561.00 4559.00 -0.0% 471.00 471.00 0.0%\n 625.x264_s 1886.00 1884.00 -0.1% 4561.00 4559.00 -0.0% 471.00 471.00 0.0%\n 510.parest_r 42740.00 42689.00 -0.1% 82400.00 82252.00 -0.2% 5612.00 5620.00 0.1%\n 644.nab_s 753.00 752.00 -0.1% 1183.00 1182.00 -0.1% 318.00 318.00 0.0%\n 544.nab_r 753.00 752.00 -0.1% 1183.00 1182.00 -0.1% 318.00 318.00 0.0%\n 526.blender_r 13105.00 13084.00 -0.2% 26478.00 26442.00 -0.1% 18991.00 18989.00 -0.0%\nGeomean difference -0.0% -0.0% 0.0%\n\nThere's an extra spill in one of the test cases, but it's likely noise from the spill weights and isn't an issue in practice.","shortMessageHtmlLink":"[RISCV] Rematerialize vmv.s.x and vfmv.s.f"}},{"before":null,"after":"287fba4f444663db7a636b8bbb5a143d6443de08","ref":"refs/heads/vector-remat/vfmv.v.f","pushedAt":"2024-09-10T11:45:37.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vfmv.v.f\n\nThis is the same principle as vmv.v.x in #107993, but for floats.\n\n Program regalloc.NumSpills regalloc.NumReloads regalloc.NumRemats\n lhs rhs diff lhs rhs diff lhs rhs diff\n 519.lbm_r 73.00 73.00 0.0% 75.00 75.00 0.0% 1.00 1.00 0.0%\n 544.nab_r 753.00 753.00 0.0% 1183.00 1183.00 0.0% 318.00 318.00 0.0%\n 619.lbm_s 68.00 68.00 0.0% 70.00 70.00 0.0% 1.00 1.00 0.0%\n 644.nab_s 753.00 753.00 0.0% 1183.00 1183.00 0.0% 318.00 318.00 0.0%\n 508.namd_r 6598.00 6597.00 -0.0% 15509.00 15503.00 -0.0% 2387.00 2393.00 0.3%\n 526.blender_r 13105.00 13084.00 -0.2% 26478.00 26443.00 -0.1% 18991.00 18996.00 0.0%\n 510.parest_r 42740.00 42665.00 -0.2% 82400.00 82309.00 -0.1% 5612.00 5648.00 0.6%\n 511.povray_r 1937.00 1929.00 -0.4% 3629.00 3620.00 -0.2% 517.00 525.00 1.5%\n 538.imagick_r 4181.00 4150.00 -0.7% 11342.00 11125.00 -1.9% 3366.00 3366.00 0.0%\n 638.imagick_s 4181.00 4150.00 -0.7% 11342.00 11125.00 -1.9% 3366.00 3366.00 0.0%\n Geomean difference -0.2% -0.4% 0.2%","shortMessageHtmlLink":"[RISCV] Rematerialize vfmv.v.f"}},{"before":null,"after":"2d7701d6f11b31f9660f43a889470a75f21f350b","ref":"refs/heads/vector-remat/vmv.v.x","pushedAt":"2024-09-10T09:45:42.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vmv.v.x\n\nEven though vmv.v.x has a non constant scalar operand, because we have split register allocation between vectors and scalars on RISC-V we can rematerialize it.\n\n Program regalloc.NumSpills regalloc.NumReloads regalloc.NumReMaterialization\n lhs rhs diff lhs rhs diff lhs rhs diff\n 657.xz_s 289.00 292.00 1.0% 505.00 484.00 -4.2% 613.00 612.00 -0.2%\n 557.xz_r 289.00 292.00 1.0% 505.00 484.00 -4.2% 613.00 612.00 -0.2%\n 505.mcf_r 141.00 141.00 0.0% 372.00 372.00 0.0% 123.00 123.00 0.0%\n 641.leela_s 356.00 356.00 0.0% 525.00 525.00 0.0% 801.00 801.00 0.0%\n 625.x264_s 1886.00 1886.00 0.0% 4561.00 4561.00 0.0% 2108.00 2108.00 0.0%\n 623.xalancbmk_s 1548.00 1548.00 0.0% 2466.00 2466.00 0.0% 13983.00 13983.00 0.0%\n 620.omnetpp_s 946.00 946.00 0.0% 1485.00 1485.00 0.0% 8413.00 8413.00 0.0%\n 605.mcf_s 141.00 141.00 0.0% 372.00 372.00 0.0% 123.00 123.00 0.0%\n 541.leela_r 356.00 356.00 0.0% 525.00 525.00 0.0% 801.00 801.00 0.0%\n 525.x264_r 1886.00 1886.00 0.0% 4561.00 4561.00 0.0% 2108.00 2108.00 0.0%\n 510.parest_r 42740.00 42740.00 0.0% 82400.00 82400.00 0.0% 65165.00 65165.00 0.0%\n 520.omnetpp_r 946.00 946.00 0.0% 1485.00 1485.00 0.0% 8413.00 8413.00 0.0%\n 508.namd_r 6598.00 6598.00 0.0% 15509.00 15509.00 0.0% 3164.00 3164.00 0.0%\n 644.nab_s 753.00 753.00 0.0% 1183.00 1183.00 0.0% 1559.00 1559.00 0.0%\n 619.lbm_s 68.00 68.00 0.0% 70.00 70.00 0.0% 20.00 20.00 0.0%\n 544.nab_r 753.00 753.00 0.0% 1183.00 1183.00 0.0% 1559.00 1559.00 0.0%\n 519.lbm_r 73.00 73.00 0.0% 75.00 75.00 0.0% 18.00 18.00 0.0%\n 511.povray_r 1937.00 1937.00 0.0% 3629.00 3629.00 0.0% 4914.00 4914.00 0.0%\n 523.xalancbmk_r 1548.00 1548.00 0.0% 2466.00 2466.00 0.0% 13983.00 13983.00 0.0%\n 502.gcc_r 12450.00 12446.00 -0.0% 27328.00 27312.00 -0.1% 50527.00 50533.00 0.0%\n 602.gcc_s 12450.00 12446.00 -0.0% 27328.00 27312.00 -0.1% 50527.00 50533.00 0.0%\n 500.perlbench_r 4178.00 4175.00 -0.1% 9162.00 9061.00 -1.1% 10223.00 10392.00 1.7%\n 600.perlbench_s 4178.00 4175.00 -0.1% 9162.00 9061.00 -1.1% 10223.00 10392.00 1.7%\n 526.blender_r 13105.00 13081.00 -0.2% 26478.00 26438.00 -0.2% 65188.00 65230.00 0.1%\n 638.imagick_s 4181.00 4157.00 -0.6% 11342.00 11316.00 -0.2% 10884.00 10938.00 0.5%\n 538.imagick_r 4181.00 4157.00 -0.6% 11342.00 11316.00 -0.2% 10884.00 10938.00 0.5%\n 531.deepsjeng_r 353.00 345.00 -2.3% 682.00 674.00 -1.2% 530.00 538.00 1.5%\n 631.deepsjeng_s 353.00 345.00 -2.3% 682.00 674.00 -1.2% 530.00 538.00 1.5%\n Geomean difference -0.1% -0.5% 0.3%\n\nThe slight increase in spills in the xz benchmarks are from scalar spills (presumably due to more uses of the scalar operand affecting spill weights), we still manage to remove some vector spills in it too.\n\nInlineSpiller will check to make sure that the scalar operand is live at the point where the rematerialization occurs, so this won't extend any scalar live ranges. However this also means we may not be able to rematerialize in some cases, as shown in @vmv.v.x_needs_extended.\n\nIt might be worthwhile teaching InlineSpiller to extend scalar live ranges in a future patch. I experimented with this locally and it reduced spills on 531.deepsjeng_r by a further 3%.","shortMessageHtmlLink":"[RISCV] Rematerialize vmv.v.x"}},{"before":null,"after":"312b2dfac1dacc5c4b1afb346accf7a1ecb294ad","ref":"refs/heads/vector-peephole-vmerge-to-vmv.v.v-same-mask-passthru-fix","pushedAt":"2024-09-09T09:07:33.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Fix same mask vmerge peephole discarding false operand\n\nThis fixes the issue raised in https://github.com/llvm/llvm-project/pull/106108#discussion_r1749677510\n\nTrue's passthru needs to be equivalent to vmerge's false, but we also allow true's passthru to be undef.\n\nHowever if it's undef then we need to replace it with vmerge's false, otherwise we end up discarding the false operand entirely.\n\nThe changes in fixed-vectors-strided-load-store-asm.ll undo the changes in #106108 where we introduced this miscompile.","shortMessageHtmlLink":"[RISCV] Fix same mask vmerge peephole discarding false operand"}},{"before":null,"after":"76d252b877fe3deeebf6a8111deb9f2fc6250d35","ref":"refs/heads/vector-remat/vmv.v.i","pushedAt":"2024-09-06T09:36:36.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Rematerialize vmv.v.i\n\nThis continues the line of work started in #97520, and gives a 2.5% reduction in the number of spills on SPEC CPU 2017.\n\n Program regalloc.NumSpills regalloc.NumReloads regalloc.NumReMaterialization\n lhs rhs diff lhs rhs diff lhs rhs diff\n 605.mcf_s 141.00 141.00 0.0% 372.00 372.00 0.0% 123.00 123.00 0.0%\n 505.mcf_r 141.00 141.00 0.0% 372.00 372.00 0.0% 123.00 123.00 0.0%\n 519.lbm_r 73.00 73.00 0.0% 75.00 75.00 0.0% 18.00 18.00 0.0%\n 619.lbm_s 68.00 68.00 0.0% 70.00 70.00 0.0% 20.00 20.00 0.0%\n 631.deepsjeng_s 354.00 353.00 -0.3% 683.00 682.00 -0.1% 529.00 530.00 0.2%\n 531.deepsjeng_r 354.00 353.00 -0.3% 683.00 682.00 -0.1% 529.00 530.00 0.2%\n 625.x264_s 1896.00 1886.00 -0.5% 4583.00 4561.00 -0.5% 2086.00 2108.00 1.1%\n 525.x264_r 1896.00 1886.00 -0.5% 4583.00 4561.00 -0.5% 2086.00 2108.00 1.1%\n 508.namd_r 6665.00 6598.00 -1.0% 15649.00 15509.00 -0.9% 3014.00 3164.00 5.0%\n 644.nab_s 761.00 753.00 -1.1% 1199.00 1183.00 -1.3% 1542.00 1559.00 1.1%\n 544.nab_r 761.00 753.00 -1.1% 1199.00 1183.00 -1.3% 1542.00 1559.00 1.1%\n 638.imagick_s 4287.00 4181.00 -2.5% 11624.00 11342.00 -2.4% 10551.00 10884.00 3.2%\n 538.imagick_r 4287.00 4181.00 -2.5% 11624.00 11342.00 -2.4% 10551.00 10884.00 3.2%\n 602.gcc_s 12771.00 12450.00 -2.5% 28117.00 27328.00 -2.8% 49757.00 50526.00 1.5%\n 502.gcc_r 12771.00 12450.00 -2.5% 28117.00 27328.00 -2.8% 49757.00 50526.00 1.5%\n Geomean difference -2.5% -2.6% 1.8%\n\nI initially held off submitting this patch because it surprisingly introduced a lot of spills in the test diffs, but after #107290 the vmv.v.is that caused them are now gone.\n\nThe gist is that marking vmv.v.i as spillable decreased its spill weight, which actually resulted in more m8 registers getting evicted and spilled during register allocation.\n\nThe SPEC results show this isn't an issue in practice though, and I plan on posting a separate patch to explain this in more detail.","shortMessageHtmlLink":"[RISCV] Rematerialize vmv.v.i"}},{"before":"f1b20d57977be380524c0c0bc026857f525e09b3","after":"22684edceb1bf4390262e602c2728ecdbc43063d","ref":"refs/heads/vector-peephole-vmerge-to-vmv.v.v-same-mask","pushedAt":"2024-09-06T00:00:14.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Assert v0defs non null","shortMessageHtmlLink":"Assert v0defs non null"}},{"before":"34c2cf071232f833b8c5e3c038e2231bc3675a54","after":"f1b20d57977be380524c0c0bc026857f525e09b3","ref":"refs/heads/vector-peephole-vmerge-to-vmv.v.v-same-mask","pushedAt":"2024-09-05T23:50:12.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Remove dead break","shortMessageHtmlLink":"Remove dead break"}},{"before":"b11c15acbb5364edca431ab875ae8e36003b51e2","after":"34c2cf071232f833b8c5e3c038e2231bc3675a54","ref":"refs/heads/vector-peephole-vmerge-to-vmv.v.v-same-mask","pushedAt":"2024-09-05T16:30:15.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"[RISCV] Move vmerge same mask peephole to RISCVVectorPeephole\n\nWe currently fold a vmerge.vvm into its true operand if the true operand is a masked pseudo with the same mask.\n\nWe can move this over to RISCVVectorPeephole by instead splitting it up into a smaller peephole which converts it to a vmv.v.v first. The existing foldVMV_V_V peephole will then take care of folding it if needed.\n\nThis is very similar to the existing all-ones mask peephole and we could potentially do it inside of it. I opted to put it in a separate peephole to make it easier to reason about, given that the duplication is small, but I could be persuaded either way.","shortMessageHtmlLink":"[RISCV] Move vmerge same mask peephole to RISCVVectorPeephole"}},{"before":"42fac4be0734125518d6759eafc552732dd4a7f7","after":"1b0da2919e1681bacfbf7cef319502dc4b81c2f4","ref":"refs/heads/vector-peephole-ensure-dominates-update-v0defs","pushedAt":"2024-09-05T15:03:29.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"lukel97","name":"Luke Lau","path":"/lukel97","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2488460?s=80&v=4"},"commit":{"message":"Use V0Def of node being inserted before","shortMessageHtmlLink":"Use V0Def of node being inserted before"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0yMFQwODoxODozOC4wMDAwMDBazwAAAAS7tciG","startCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0yMFQwODoxODozOC4wMDAwMDBazwAAAAS7tciG","endCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0wNVQxNTowMzoyOS4wMDAwMDBazwAAAASt_R1s"}},"title":"Activity · lukel97/llvm-project"}