Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VPlan] Add VPIRBasicBlock, use to model pre-preheader. #93398

Merged
merged 15 commits into from
May 30, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -8607,7 +8607,7 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
// loop region contains a header and latch basic blocks.
VPlanPtr Plan = VPlan::createInitialVPlan(
createTripCountSCEV(Legal->getWidestInductionType(), PSE, OrigLoop),
*PSE.getSE());
*PSE.getSE(), OrigLoop->getLoopPreheader());
VPBasicBlock *HeaderVPBB = new VPBasicBlock("vector.body");
VPBasicBlock *LatchVPBB = new VPBasicBlock("vector.latch");
VPBlockUtils::insertBlockAfter(LatchVPBB, HeaderVPBB);
Expand Down Expand Up @@ -8855,7 +8855,7 @@ VPlanPtr LoopVectorizationPlanner::buildVPlan(VFRange &Range) {
// Create new empty VPlan
auto Plan = VPlan::createInitialVPlan(
createTripCountSCEV(Legal->getWidestInductionType(), PSE, OrigLoop),
*PSE.getSE());
*PSE.getSE(), OrigLoop->getLoopPreheader());

// Build hierarchical CFG
VPlanHCFGBuilder HCFGBuilder(OrigLoop, LI, *Plan);
Expand Down
57 changes: 55 additions & 2 deletions llvm/lib/Transforms/Vectorize/VPlan.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -442,6 +442,58 @@ VPBasicBlock::createEmptyBasicBlock(VPTransformState::CFGState &CFG) {
return NewBB;
}

void VPIRWrapperBlock::execute(VPTransformState *State) {
for (VPBlockBase *PredVPBlock : getHierarchicalPredecessors()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-preheader block has no predecessors, can assert, next to asserting it has no successors (currently).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in the initial version patching up the CFG isn't needed. Removed and added assert.

VPBasicBlock *PredVPBB = PredVPBlock->getExitingBasicBlock();
auto &PredVPSuccessors = PredVPBB->getHierarchicalSuccessors();
BasicBlock *PredBB = State->CFG.VPBB2IRBB[PredVPBB];

assert(PredBB && "Predecessor basic-block not found building successor.");
auto *PredBBTerminator = PredBB->getTerminator();
LLVM_DEBUG(dbgs() << "LV: draw edge from" << PredBB->getName() << '\n');

auto *TermBr = dyn_cast<BranchInst>(PredBBTerminator);
if (TermBr) {
// Set each forward successor here when it is created, excluding
// backedges. A backward successor is set when the branch is created.
unsigned idx = PredVPSuccessors.front() == this ? 0 : 1;
assert(!TermBr->getSuccessor(idx) &&
"Trying to reset an existing successor block.");
TermBr->setSuccessor(idx, WrappedBlock);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update DTU?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is gone for now, as at the moment there are neither predecessors nor successors.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TermBr->setSuccessor(idx, WrappedBlock);
TermBr->setSuccessor(idx, getWrappedBlock());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gone for now, thanks!

}
}

assert(getHierarchicalSuccessors().size() == 0 &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert(getHierarchicalSuccessors().size() == 0 &&
assert(getHierarchicalSuccessors().empty() &&

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks!

"VPIRWrapperBlock cannot have successors");
State->CFG.VPBB2IRBB[this] = getWrappedBlock();
State->CFG.PrevVPBB = this;

auto *Term = cast<BranchInst>(getWrappedBlock()->getTerminator());
State->Builder.SetInsertPoint(Term);

for (VPRecipeBase &Recipe : *this)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can iterate over Recipes instead of *this, as done in VPBasicBlock::execute().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

Recipe.execute(*State);

LLVM_DEBUG(dbgs() << "LV: filled BB:" << *getWrappedBlock());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This last part can be outlined into, say, VPBasicBlock::executeRecipesAtEndOfBB(BasicBlock*), to be reused by VPBasicBlock::execute().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)

void VPIRWrapperBlock::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "ir-bb<" << getName() << ">:\n";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set the name to include ir-bb as prefix, and avoid replicating VPBasicBlock::print() by overriding it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped, thanks!


auto RecipeIndent = Indent + " ";
for (const VPRecipeBase &Recipe : *this) {
Recipe.print(O, RecipeIndent, SlotTracker);
O << '\n';
}
assert(getSuccessors().empty() &&
"Wrapper blocks should not have successors");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert belongs to validation routine rather than printing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Print implementation is gone now

printSuccessors(O, Indent);
}
#endif

void VPBasicBlock::execute(VPTransformState *State) {
bool Replica = State->Instance && !State->Instance->isFirstIteration();
VPBasicBlock *PrevVPBB = State->CFG.PrevVPBB;
Expand Down Expand Up @@ -769,8 +821,9 @@ VPlan::~VPlan() {
delete BackedgeTakenCount;
}

VPlanPtr VPlan::createInitialVPlan(const SCEV *TripCount, ScalarEvolution &SE) {
VPBasicBlock *Preheader = new VPBasicBlock("ph");
VPlanPtr VPlan::createInitialVPlan(const SCEV *TripCount, ScalarEvolution &SE,
BasicBlock *PH) {
VPIRWrapperBlock *Preheader = new VPIRWrapperBlock(PH);
VPBasicBlock *VecPreheader = new VPBasicBlock("vector.ph");
auto Plan = std::make_unique<VPlan>(Preheader, VecPreheader);
Plan->TripCount =
Expand Down
65 changes: 60 additions & 5 deletions llvm/lib/Transforms/Vectorize/VPlan.h
Original file line number Diff line number Diff line change
Expand Up @@ -473,7 +473,11 @@ class VPBlockBase {
/// that are actually instantiated. Values of this enumeration are kept in the
/// SubclassID field of the VPBlockBase objects. They are used for concrete
/// type identification.
using VPBlockTy = enum { VPBasicBlockSC, VPRegionBlockSC };
using VPBlockTy = enum {
VPBasicBlockSC,
VPRegionBlockSC,
VPIRWrapperBlockSC
};

using VPBlocksTy = SmallVectorImpl<VPBlockBase *>;

Expand Down Expand Up @@ -2834,6 +2838,10 @@ class VPBasicBlock : public VPBlockBase {
/// The VPRecipes held in the order of output instructions to generate.
RecipeListTy Recipes;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, Recipes can be made protected, now that a subclass may want to access them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks


protected:
VPBasicBlock(const unsigned char BlockSC, const Twine &Name = "")
: VPBlockBase(BlockSC, Name.str()) {}

public:
VPBasicBlock(const Twine &Name = "", VPRecipeBase *Recipe = nullptr)
: VPBlockBase(VPBasicBlockSC, Name.str()) {
Expand Down Expand Up @@ -2882,7 +2890,8 @@ class VPBasicBlock : public VPBlockBase {

/// Method to support type inquiry through isa, cast, and dyn_cast.
static inline bool classof(const VPBlockBase *V) {
return V->getVPBlockID() == VPBlockBase::VPBasicBlockSC;
return V->getVPBlockID() == VPBlockBase::VPBasicBlockSC ||
V->getVPBlockID() == VPBlockBase::VPIRWrapperBlockSC;
}

void insert(VPRecipeBase *Recipe, iterator InsertPt) {
Expand Down Expand Up @@ -2951,6 +2960,50 @@ class VPBasicBlock : public VPBlockBase {
BasicBlock *createEmptyBasicBlock(VPTransformState::CFGState &CFG);
};

/// A special type of VPBasicBlock that wraps an existing IR basic block.
/// Recipes of the block get added before the terminator of the wrapped IR basic
/// block.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can also serve to wrap ExitBB as well, hopefully simplifying VPBasicBlock::execute()? In which case recipes, if added, would be placed at the beginning rather than the end, e.g., replace (LCSSA) header phi's in ExitBB instead of having VPLiveOut::fixPhi() add incoming to existing Phi's.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will update #92651 soon based on this PR. Left the order as is for now, to avoid updating IR tests. Can adjust order as follow-up

class VPIRWrapperBlock : public VPBasicBlock {
BasicBlock *WrappedBlock;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
BasicBlock *WrappedBlock;
BasicBlock *IRBasicBlock;

or IRBB?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed, thanks


public:
VPIRWrapperBlock(BasicBlock *WrappedBlock)
: VPBasicBlock(VPIRWrapperBlockSC, WrappedBlock->getName()),
WrappedBlock(WrappedBlock) {}

~VPIRWrapperBlock() override {}

static inline bool classof(const VPBlockBase *V) {
return V->getVPBlockID() == VPBlockBase::VPIRWrapperBlockSC;
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
/// Print this VPBsicBlock to \p O, prefixing all lines with \p Indent. \p
/// SlotTracker is used to print unnamed VPValue's using consequtive numbers.
///
/// Note that the numbering is applied to the whole VPlan, so printing
/// individual blocks is consistent with the whole VPlan printing.
void print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const override;
using VPBlockBase::print; // Get the print(raw_stream &O) version.
#endif
/// The method which generates the output IR instructions that correspond to
/// this VPBasicBlock, thereby "executing" the VPlan.
void execute(VPTransformState *State) override;

VPIRWrapperBlock *clone() override {
auto *NewBlock = new VPIRWrapperBlock(WrappedBlock);
for (VPRecipeBase &R : *this)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: admittedly VPBasicBlock::clone() also iterates using *this rather than Recipes. Would be good to be consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recipes is private, can move to protected?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

NewBlock->appendRecipe(R.clone());
return NewBlock;
}

void dropAllReferences(VPValue *NewValue) override {}
void resetBlock(BasicBlock *N) { WrappedBlock = N; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is resetBlock() used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed, thanks!


BasicBlock *getWrappedBlock() { return WrappedBlock; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, thanks!

};

/// VPRegionBlock represents a collection of VPBasicBlocks and VPRegionBlocks
/// which form a Single-Entry-Single-Exiting subgraph of the output IR CFG.
/// A VPRegionBlock may indicate that its contents are to be replicated several
Expand Down Expand Up @@ -3139,12 +3192,12 @@ class VPlan {
~VPlan();

/// Create initial VPlan skeleton, having an "entry" VPBasicBlock (wrapping
/// original scalar pre-header) which contains SCEV expansions that need to
/// happen before the CFG is modified; a VPBasicBlock for the vector
/// original scalar pre-header \p PH) which contains SCEV expansions that need
/// to happen before the CFG is modified; a VPBasicBlock for the vector
/// pre-header, followed by a region for the vector loop, followed by the
/// middle VPBasicBlock.
static VPlanPtr createInitialVPlan(const SCEV *TripCount,
ScalarEvolution &PSE);
ScalarEvolution &PSE, BasicBlock *PH);

/// Prepare the plan for execution, setting up the required live-in values.
void prepareToExecute(Value *TripCount, Value *VectorTripCount,
Expand Down Expand Up @@ -3321,6 +3374,8 @@ class VPlanPrinter {
/// its successor blocks.
void dumpBasicBlock(const VPBasicBlock *BasicBlock);

void dumpIRWrapperBlock(const VPIRWrapperBlock *WrapperBlock);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is dumpIRWrapperBlock() defined, called, needed or can dumpBasicBlock() be (or is) reused?
Add documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed with the latest changes, removed thanks!


/// Print a given \p Region of the Plan.
void dumpRegion(const VPRegionBlock *Region);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ define void @test_tc_less_than_16(ptr %A, i64 %N) {
; CHECK-NEXT: Live-in vp<[[VTC:%.+]]> = vector-trip-count
; CHECK-NEXT: vp<[[TC:%.+]]> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ph:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: EMIT vp<[[TC]]> = EXPAND SCEV (zext i4 (trunc i64 %N to i4) to i64)
; CHECK-NEXT: No successors
; CHECK-EMPTY:
Expand Down Expand Up @@ -45,7 +45,7 @@ define void @test_tc_less_than_16(ptr %A, i64 %N) {
; CHECK-NEXT: Live-in vp<[[VFxUF:%.+]]> = VF * UF
; CHECK-NEXT: vp<[[TC:%.+]]> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ph:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: EMIT vp<[[TC]]> = EXPAND SCEV (zext i4 (trunc i64 %N to i4) to i64)
; CHECK-NEXT: No successors
; CHECK-EMPTY:
Expand Down
14 changes: 8 additions & 6 deletions llvm/unittests/Transforms/Vectorize/VPlanTestBase.h
Original file line number Diff line number Diff line change
Expand Up @@ -67,9 +67,10 @@ class VPlanTestBase : public testing::Test {
assert(!verifyFunction(F) && "input function must be valid");
doAnalysis(F);

auto Plan = VPlan::createInitialVPlan(
SE->getBackedgeTakenCount(LI->getLoopFor(LoopHeader)), *SE);
VPlanHCFGBuilder HCFGBuilder(LI->getLoopFor(LoopHeader), LI.get(), *Plan);
Loop *L = LI->getLoopFor(LoopHeader);
auto Plan = VPlan::createInitialVPlan(SE->getBackedgeTakenCount(L), *SE,
L->getLoopPreheader());
VPlanHCFGBuilder HCFGBuilder(L, LI.get(), *Plan);
HCFGBuilder.buildHierarchicalCFG();
return Plan;
}
Expand All @@ -80,9 +81,10 @@ class VPlanTestBase : public testing::Test {
assert(!verifyFunction(F) && "input function must be valid");
doAnalysis(F);

auto Plan = VPlan::createInitialVPlan(
SE->getBackedgeTakenCount(LI->getLoopFor(LoopHeader)), *SE);
VPlanHCFGBuilder HCFGBuilder(LI->getLoopFor(LoopHeader), LI.get(), *Plan);
Loop *L = LI->getLoopFor(LoopHeader);
auto Plan = VPlan::createInitialVPlan(SE->getBackedgeTakenCount(L), *SE,
L->getLoopPreheader());
VPlanHCFGBuilder HCFGBuilder(L, LI.get(), *Plan);
HCFGBuilder.buildPlainCFG();
return Plan;
}
Expand Down
Loading