[Flang][MLIR][OpenMP] Align map clause generation and fix issue with non-shared allocations for assumed shape/size descriptor types

This PR aims to unify map argument generation across both implicit capture (variables captured in a target region) and explicit capture (processMap). Currently, the varPtr field of the MapInfo for the same variable differs depending on how the variable is captured. This PR aligns that across the places where MapInfoOp is generated in the OpenMP lowering.

Currently, I have opted to utilise the rawInput (the input memref to a HLFIR DeclareInfoOp) as opposed to the addr field, which includes more information. The side effect of this is that we have to deal with BoxTypes less often, which results in simpler maps in these cases. The negative side effect is that we don't have access to the bounds information through the resulting value. However, I believe the bounds information we require is still appropriately stored in the map bounds, and this seems to be the case from testing so far.

The other fix is for cases where we end up with a BoxType argument into a function (certain assumed-shape and assumed-size cases do this) that has no fir.ref wrapping it. We need the Box to be a reference type to utilise the operation that accesses the base address stored inside it and to create the correct mappings. In these cases we therefore generate an intermediate allocation, store the Box into it, and utilise that allocation as the map argument instead of the original.

However, as we were not sharing the same intermediate allocation across all of the maps for a variable, this resulted in errors in certain cases when detaching/attaching the data, e.g. via enter and exit. This PR adjusts this for those cases by reusing a single intermediate allocation for every map of the same descriptor.

Currently we only track intermediate allocations for the current function scope, as opposed to module scope. This is primarily because the only case I am aware of where this is required is when we pass certain types of arguments to functions, so I opted to minimize the overhead of the pass for now. It could likely be extended to module scope if we find other cases where it's applicable and causing issues.
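
To make the allocation-sharing idea concrete, here is a minimal standalone sketch in plain C++. It is not the Flang implementation: `Descriptor`, `Slot`, and `FunctionSlotCache` are hypothetical stand-ins for a box-typed value, its intermediate allocation, and the per-function tracking map. It only illustrates the lookup-then-create pattern, keyed on an opaque pointer, that lets every map of the same descriptor see the same allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Flang/MLIR types.
struct Descriptor {
  int baseAddr; // placeholder payload for a box-typed value
};

struct Slot {
  Descriptor value; // what was last stored into the intermediate allocation
};

// Tracks one intermediate slot per descriptor for a single function scope.
class FunctionSlotCache {
public:
  // Returns the index of the slot for `desc`, creating it on first use.
  std::size_t getOrCreate(const Descriptor &desc) {
    const void *key = &desc; // opaque identity of the descriptor
    auto it = indexByKey.find(key);
    if (it == indexByKey.end()) {
      // First map of this descriptor: allocate a slot and store into it.
      slots.push_back(Slot{desc});
      it = indexByKey.emplace(key, slots.size() - 1).first;
    } else {
      // Later maps: store again, but into the slot created earlier.
      slots[it->second].value = desc;
    }
    assert(it->second < slots.size());
    return it->second;
  }

  std::size_t size() const { return slots.size(); }

private:
  std::map<const void *, std::size_t> indexByKey;
  std::vector<Slot> slots;
};
```

Calling `getOrCreate` twice with the same `Descriptor` object yields the same slot index, while a different object gets a fresh slot. In this sketch, creating one `FunctionSlotCache` per function would correspond to the function-scope tracking described above.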
agozillon committed Jul 8, 2024
1 parent 6bd0a22 commit 96134b2
Showing 9 changed files with 147 additions and 149 deletions.
6 changes: 3 additions & 3 deletions flang/include/flang/Optimizer/Transforms/Passes.td
@@ -342,11 +342,11 @@ def LoopVersioning : Pass<"loop-versioning", "mlir::func::FuncOp"> {
}

def OMPMapInfoFinalizationPass
: Pass<"omp-map-info-finalization"> {
: Pass<"omp-map-info-finalization", "mlir::func::FuncOp"> {
let summary = "expands OpenMP MapInfo operations containing descriptors";
let description = [{
Expands MapInfo operations containing descriptor types into multiple
MapInfo's for each pointer element in the descriptor that requires
Expands MapInfo operations containing descriptor types into multiple
MapInfo's for each pointer element in the descriptor that requires
explicit individual mapping by the OpenMP runtime.
}];
let dependentDialects = ["mlir::omp::OpenMPDialect"];
44 changes: 16 additions & 28 deletions flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -970,25 +970,21 @@ bool ClauseProcessor::processMap(
object.ref(), clauseLocation, asFortran, bounds,
treatIndexAsSection);

auto origSymbol = converter.getSymbolAddress(*object.sym());
mlir::Value symAddr = info.addr;
if (origSymbol && fir::isTypeWithDescriptor(origSymbol.getType()))
symAddr = origSymbol;

// Explicit map captures are captured ByRef by default,
// optimisation passes may alter this to ByCopy or other capture
// types to optimise
mlir::Value baseOp = info.rawInput;
auto location = mlir::NameLoc::get(
mlir::StringAttr::get(firOpBuilder.getContext(), asFortran.str()),
symAddr.getLoc());
baseOp.getLoc());
mlir::omp::MapInfoOp mapOp = createMapInfoOp(
firOpBuilder, location, symAddr,
firOpBuilder, location, baseOp,
/*varPtrPtr=*/mlir::Value{}, asFortran.str(), bounds,
/*members=*/{}, /*membersIndex=*/mlir::DenseIntElementsAttr{},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
mapTypeBits),
mlir::omp::VariableCaptureKind::ByRef, symAddr.getType());
mlir::omp::VariableCaptureKind::ByRef, baseOp.getType());

if (object.sym()->owner().IsDerivedType()) {
addChildIndexAndMapToParent(object, parentMemberIndices, mapOp,
@@ -997,9 +993,9 @@ bool ClauseProcessor::processMap(
result.mapVars.push_back(mapOp);
ptrMapSyms->push_back(object.sym());
if (mapSymTypes)
mapSymTypes->push_back(symAddr.getType());
mapSymTypes->push_back(baseOp.getType());
if (mapSymLocs)
mapSymLocs->push_back(symAddr.getLoc());
mapSymLocs->push_back(baseOp.getLoc());
}
}
});
@@ -1102,30 +1098,26 @@ bool ClauseProcessor::processUseDeviceAddr(
object.ref(), location, asFortran, bounds,
treatIndexAsSection);

auto origSymbol = converter.getSymbolAddress(*object.sym());
mlir::Value symAddr = info.addr;
if (origSymbol && fir::isTypeWithDescriptor(origSymbol.getType()))
symAddr = origSymbol;

// Explicit map captures are captured ByRef by default,
// optimisation passes may alter this to ByCopy or other capture
// types to optimise
mlir::Value baseOp = info.rawInput;
mlir::omp::MapInfoOp mapOp = createMapInfoOp(
firOpBuilder, location, symAddr,
firOpBuilder, location, baseOp,
/*varPtrPtr=*/mlir::Value{}, asFortran.str(), bounds,
/*members=*/{}, /*membersIndex=*/mlir::DenseIntElementsAttr{},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
mapTypeBits),
mlir::omp::VariableCaptureKind::ByRef, symAddr.getType());
mlir::omp::VariableCaptureKind::ByRef, baseOp.getType());

if (object.sym()->owner().IsDerivedType()) {
addChildIndexAndMapToParent(object, parentMemberIndices, mapOp,
semaCtx);
} else {
useDeviceSyms.push_back(object.sym());
useDeviceTypes.push_back(symAddr.getType());
useDeviceLocs.push_back(symAddr.getLoc());
useDeviceTypes.push_back(baseOp.getType());
useDeviceLocs.push_back(baseOp.getLoc());
result.useDeviceAddrVars.push_back(mapOp);
}
}
@@ -1167,30 +1159,26 @@ bool ClauseProcessor::processUseDevicePtr(
object.ref(), location, asFortran, bounds,
treatIndexAsSection);

auto origSymbol = converter.getSymbolAddress(*object.sym());
mlir::Value symAddr = info.addr;
if (origSymbol && fir::isTypeWithDescriptor(origSymbol.getType()))
symAddr = origSymbol;

// Explicit map captures are captured ByRef by default,
// optimisation passes may alter this to ByCopy or other capture
// types to optimise
mlir::Value baseOp = info.rawInput;
mlir::omp::MapInfoOp mapOp = createMapInfoOp(
firOpBuilder, location, symAddr,
firOpBuilder, location, baseOp,
/*varPtrPtr=*/mlir::Value{}, asFortran.str(), bounds,
/*members=*/{}, /*membersIndex=*/mlir::DenseIntElementsAttr{},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
mapTypeBits),
mlir::omp::VariableCaptureKind::ByRef, symAddr.getType());
mlir::omp::VariableCaptureKind::ByRef, baseOp.getType());

if (object.sym()->owner().IsDerivedType()) {
addChildIndexAndMapToParent(object, parentMemberIndices, mapOp,
semaCtx);
} else {
useDeviceSyms.push_back(object.sym());
useDeviceTypes.push_back(symAddr.getType());
useDeviceLocs.push_back(symAddr.getLoc());
useDeviceTypes.push_back(baseOp.getType());
useDeviceLocs.push_back(baseOp.getLoc());
result.useDevicePtrVars.push_back(mapOp);
}
}
10 changes: 3 additions & 7 deletions flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -215,22 +215,18 @@ bool ClauseProcessor::processMotionClauses(lower::StatementContext &stmtCtx,
object.ref(), clauseLocation, asFortran, bounds,
treatIndexAsSection);

auto origSymbol = converter.getSymbolAddress(*object.sym());
mlir::Value symAddr = info.addr;
if (origSymbol && fir::isTypeWithDescriptor(origSymbol.getType()))
symAddr = origSymbol;

// Explicit map captures are captured ByRef by default,
// optimisation passes may alter this to ByCopy or other capture
// types to optimise
mlir::Value baseOp = info.rawInput;
mlir::omp::MapInfoOp mapOp = createMapInfoOp(
firOpBuilder, clauseLocation, symAddr,
firOpBuilder, clauseLocation, baseOp,
/*varPtrPtr=*/mlir::Value{}, asFortran.str(), bounds,
/*members=*/{}, /*membersIndex=*/mlir::DenseIntElementsAttr{},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
mapTypeBits),
mlir::omp::VariableCaptureKind::ByRef, symAddr.getType());
mlir::omp::VariableCaptureKind::ByRef, baseOp.getType());

if (object.sym()->owner().IsDerivedType()) {
addChildIndexAndMapToParent(object, parentMemberIndices, mapOp,
150 changes: 74 additions & 76 deletions flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2024,90 +2024,88 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
if (llvm::find(mapSyms, common) != mapSyms.end())
return;

if (llvm::find(mapSyms, &sym) == mapSyms.end()) {
mlir::Value baseOp = converter.getSymbolAddress(sym);
if (!baseOp)
if (const auto *details =
sym.template detailsIf<semantics::HostAssocDetails>()) {
baseOp = converter.getSymbolAddress(details->symbol());
converter.copySymbolBinding(details->symbol(), sym);
}
// If we come across a symbol without a symbol address, we return as we
// cannot process it, this is intended as a catch all early exit for
// symbols that do not have a corresponding extended value. Such as
// subroutines, interfaces and named blocks.
if (!converter.getSymbolAddress(sym))
return;

if (baseOp) {
llvm::SmallVector<mlir::Value> bounds;
std::stringstream name;
fir::ExtendedValue dataExv = converter.getSymbolExtendedValue(sym);
name << sym.name().ToString();

lower::AddrAndBoundsInfo info = getDataOperandBaseAddr(
converter, firOpBuilder, sym, converter.getCurrentLocation());
if (mlir::isa<fir::BaseBoxType>(
fir::unwrapRefType(info.addr.getType())))
bounds = lower::genBoundsOpsFromBox<mlir::omp::MapBoundsOp,
mlir::omp::MapBoundsType>(
firOpBuilder, converter.getCurrentLocation(), dataExv, info);
if (mlir::isa<fir::SequenceType>(
fir::unwrapRefType(info.addr.getType()))) {
bool dataExvIsAssumedSize =
semantics::IsAssumedSizeArray(sym.GetUltimate());
bounds = lower::genBaseBoundsOps<mlir::omp::MapBoundsOp,
mlir::omp::MapBoundsType>(
firOpBuilder, converter.getCurrentLocation(), dataExv,
dataExvIsAssumedSize);
}
if (llvm::find(mapSyms, &sym) != mapSyms.end())
return;

llvm::omp::OpenMPOffloadMappingFlags mapFlag =
llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT;
mlir::omp::VariableCaptureKind captureKind =
mlir::omp::VariableCaptureKind::ByRef;

mlir::Type eleType = baseOp.getType();
if (auto refType = mlir::dyn_cast<fir::ReferenceType>(baseOp.getType()))
eleType = refType.getElementType();

// If a variable is specified in declare target link and if device
// type is not specified as `nohost`, it needs to be mapped tofrom
mlir::ModuleOp mod = firOpBuilder.getModule();
mlir::Operation *op = mod.lookupSymbol(converter.mangleName(sym));
auto declareTargetOp =
llvm::dyn_cast_if_present<mlir::omp::DeclareTargetInterface>(op);
if (declareTargetOp && declareTargetOp.isDeclareTarget()) {
if (declareTargetOp.getDeclareTargetCaptureClause() ==
mlir::omp::DeclareTargetCaptureClause::link &&
declareTargetOp.getDeclareTargetDeviceType() !=
mlir::omp::DeclareTargetDeviceType::nohost) {
mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO;
mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM;
}
} else if (fir::isa_trivial(eleType) || fir::isa_char(eleType)) {
captureKind = mlir::omp::VariableCaptureKind::ByCopy;
} else if (!fir::isa_builtin_cptr_type(eleType)) {
mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO;
mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM;
}
auto location =
mlir::NameLoc::get(mlir::StringAttr::get(firOpBuilder.getContext(),
sym.name().ToString()),
baseOp.getLoc());
mlir::Value mapOp = createMapInfoOp(
firOpBuilder, location, baseOp, /*varPtrPtr=*/mlir::Value{},
name.str(), bounds, /*members=*/{},
/*membersIndex=*/mlir::DenseIntElementsAttr{},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
mapFlag),
captureKind, baseOp.getType());
if (const auto *details =
sym.template detailsIf<semantics::HostAssocDetails>())
converter.copySymbolBinding(details->symbol(), sym);
llvm::SmallVector<mlir::Value> bounds;
std::stringstream name;
fir::ExtendedValue dataExv = converter.getSymbolExtendedValue(sym);
name << sym.name().ToString();

lower::AddrAndBoundsInfo info = getDataOperandBaseAddr(
converter, firOpBuilder, sym, converter.getCurrentLocation());
mlir::Value baseOp = info.rawInput;
if (mlir::isa<fir::BaseBoxType>(fir::unwrapRefType(baseOp.getType())))
bounds = lower::genBoundsOpsFromBox<mlir::omp::MapBoundsOp,
mlir::omp::MapBoundsType>(
firOpBuilder, converter.getCurrentLocation(), dataExv, info);
if (mlir::isa<fir::SequenceType>(fir::unwrapRefType(baseOp.getType()))) {
bool dataExvIsAssumedSize =
semantics::IsAssumedSizeArray(sym.GetUltimate());
bounds = lower::genBaseBoundsOps<mlir::omp::MapBoundsOp,
mlir::omp::MapBoundsType>(
firOpBuilder, converter.getCurrentLocation(), dataExv,
dataExvIsAssumedSize);
}

clauseOps.mapVars.push_back(mapOp);
mapSyms.push_back(&sym);
mapLocs.push_back(baseOp.getLoc());
mapTypes.push_back(baseOp.getType());
llvm::omp::OpenMPOffloadMappingFlags mapFlag =
llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT;
mlir::omp::VariableCaptureKind captureKind =
mlir::omp::VariableCaptureKind::ByRef;

mlir::Type eleType = baseOp.getType();
if (auto refType = mlir::dyn_cast<fir::ReferenceType>(baseOp.getType()))
eleType = refType.getElementType();

// If a variable is specified in declare target link and if device
// type is not specified as `nohost`, it needs to be mapped tofrom
mlir::ModuleOp mod = firOpBuilder.getModule();
mlir::Operation *op = mod.lookupSymbol(converter.mangleName(sym));
auto declareTargetOp =
llvm::dyn_cast_if_present<mlir::omp::DeclareTargetInterface>(op);
if (declareTargetOp && declareTargetOp.isDeclareTarget()) {
if (declareTargetOp.getDeclareTargetCaptureClause() ==
mlir::omp::DeclareTargetCaptureClause::link &&
declareTargetOp.getDeclareTargetDeviceType() !=
mlir::omp::DeclareTargetDeviceType::nohost) {
mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO;
mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM;
}
} else if (fir::isa_trivial(eleType) || fir::isa_char(eleType)) {
captureKind = mlir::omp::VariableCaptureKind::ByCopy;
} else if (!fir::isa_builtin_cptr_type(eleType)) {
mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO;
mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM;
}
auto location = mlir::NameLoc::get(
mlir::StringAttr::get(firOpBuilder.getContext(), sym.name().ToString()),
baseOp.getLoc());
mlir::Value mapOp = createMapInfoOp(
firOpBuilder, location, baseOp, /*varPtrPtr=*/mlir::Value{}, name.str(),
bounds, /*members=*/{},
/*membersIndex=*/mlir::DenseIntElementsAttr{},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
mapFlag),
captureKind, baseOp.getType());

clauseOps.mapVars.push_back(mapOp);
mapSyms.push_back(&sym);
mapLocs.push_back(baseOp.getLoc());
mapTypes.push_back(baseOp.getType());
};
lower::pft::visitAllSymbols(eval, captureImplicitMap);


auto targetOp = firOpBuilder.create<mlir::omp::TargetOp>(loc, clauseOps);
genBodyOfTargetOp(converter, symTable, semaCtx, eval, targetOp, mapSyms,
mapLocs, mapTypes, dsp, loc, queue, item);
44 changes: 31 additions & 13 deletions flang/lib/Optimizer/Transforms/OMPMapInfoFinalization.cpp
@@ -51,6 +51,14 @@ class OMPMapInfoFinalizationPass
: public fir::impl::OMPMapInfoFinalizationPassBase<
OMPMapInfoFinalizationPass> {

/// Tracks any intermediate function/subroutine local allocations we
/// generate for the descriptors of box type dummy arguments, so that
/// we can retrieve it for subsequent reuses within the functions
/// scope.
std::map</*descriptor opaque pointer=*/void *,
/*corresponding local alloca=*/fir::AllocaOp>
localBoxAllocas;

void genDescriptorMemberMaps(mlir::omp::MapInfoOp op,
fir::FirOpBuilder &builder,
mlir::Operation *target) {
@@ -75,14 +83,26 @@ class OMPMapInfoFinalizationPass
// perform an alloca and then store to it and retrieve the data from the new
// alloca.
if (mlir::isa<fir::BaseBoxType>(descriptor.getType())) {
mlir::OpBuilder::InsertPoint insPt = builder.saveInsertionPoint();
mlir::Block *allocaBlock = builder.getAllocaBlock();
assert(allocaBlock && "No alloca block found for this top level op");
builder.setInsertionPointToStart(allocaBlock);
auto alloca = builder.create<fir::AllocaOp>(loc, descriptor.getType());
builder.restoreInsertionPoint(insPt);
builder.create<fir::StoreOp>(loc, descriptor, alloca);
descriptor = alloca;
// If we have already created a local allocation for this BoxType,
// we must be sure to re-use it so that we end up with the same
// allocations being utilised for the same descriptor across all map uses,
// this prevents runtime issues such as not appropriately releasing or
// deleting all mapped data.
auto find = localBoxAllocas.find(descriptor.getAsOpaquePointer());
if (find != localBoxAllocas.end()) {
builder.create<fir::StoreOp>(loc, descriptor, find->second);
descriptor = find->second;
} else {
mlir::OpBuilder::InsertPoint insPt = builder.saveInsertionPoint();
mlir::Block *allocaBlock = builder.getAllocaBlock();
assert(allocaBlock && "No alloca block found for this top level op");
builder.setInsertionPointToStart(allocaBlock);
auto alloca = builder.create<fir::AllocaOp>(loc, descriptor.getType());
builder.restoreInsertionPoint(insPt);
builder.create<fir::StoreOp>(loc, descriptor, alloca);
localBoxAllocas[descriptor.getAsOpaquePointer()] = alloca;
descriptor = alloca;
}
}

mlir::Value baseAddrAddr = builder.create<fir::BoxOffsetOp>(
@@ -243,14 +263,12 @@ class OMPMapInfoFinalizationPass
// operation (usually function) containing the MapInfoOp because this pass
// will mutate siblings of MapInfoOp.
void runOnOperation() override {
mlir::ModuleOp module =
mlir::dyn_cast_or_null<mlir::ModuleOp>(getOperation());
if (!module)
module = getOperation()->getParentOfType<mlir::ModuleOp>();
mlir::func::FuncOp func = getOperation();
mlir::ModuleOp module = func->getParentOfType<mlir::ModuleOp>();
fir::KindMapping kindMap = fir::getKindMapping(module);
fir::FirOpBuilder builder{module, std::move(kindMap)};

getOperation()->walk([&](mlir::omp::MapInfoOp op) {
func->walk([&](mlir::omp::MapInfoOp op) {
// TODO: Currently only supports a single user for the MapInfoOp, this
// is fine for the moment as the Fortran Frontend will generate a
// new MapInfoOp per Target operation for the moment. However, when/if