Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cast global variables to default address space #135026

Merged
merged 2 commits into from
Jan 31, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion compiler/rustc_codegen_llvm/src/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1225,7 +1225,9 @@ impl<'a, 'll, 'tcx> BuilderMethods<'a, 'tcx> for Builder<'a, 'll, 'tcx> {
impl<'ll> StaticBuilderMethods for Builder<'_, 'll, '_> {
fn get_static(&mut self, def_id: DefId) -> &'ll Value {
// Forward to the `get_static` method of `CodegenCx`
self.cx().get_static(def_id)
let s = self.cx().get_static(def_id);
// Cast to default address space if globals are in a different addrspace
self.cx().const_pointercast(s, self.type_ptr())
}
}

Expand Down
9 changes: 6 additions & 3 deletions compiler/rustc_codegen_llvm/src/common.rs
Original file line number Diff line number Diff line change
Expand Up @@ -221,6 +221,8 @@ impl<'ll, 'tcx> ConstCodegenMethods<'tcx> for CodegenCx<'ll, 'tcx> {
llvm::LLVMSetUnnamedAddress(g, llvm::UnnamedAddr::Global);
}
llvm::set_linkage(g, llvm::Linkage::InternalLinkage);
// Cast to default address space if globals are in a different addrspace
let g = self.const_pointercast(g, self.type_ptr());
(s.to_owned(), g)
})
.1;
Expand Down Expand Up @@ -285,7 +287,7 @@ impl<'ll, 'tcx> ConstCodegenMethods<'tcx> for CodegenCx<'ll, 'tcx> {
let alloc = alloc.inner();
let value = match alloc.mutability {
Mutability::Mut => self.static_addr_of_mut(init, alloc.align, None),
_ => self.static_addr_of(init, alloc.align, None),
_ => self.static_addr_of_impl(init, alloc.align, None),
Comment on lines -288 to +290
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this function call the _impl version? I would think that _impl would be a private helper method or something like that. Unless this means that it's the address of an implementation? The naming here is not doing any favors and there are also no comments on the methods to explain what they are for so I'm really struggling to follow. Can you clarify in the code what the intended difference between static_addr_of and static_addr_of_impl is, and comment why we need the _impl version here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look!
I added comments to the functions, let me know if that makes it clearer or if you think something is unclear or missing.

To the point you mentioned here, static_addr_of_impl always returns a llvm::GlobalVariable. But that may not be a ptr (in LLVM IR; this is implicitly an addrspace(0) ptr, 0 is the default and is omitted), but a addrspace(1) ptr.
When we use a global in Rust (or in the ssa codegen), we need it to be just a ptr (so, addrspace(0)). Therefore, static_addr_of must return a ptr and it does that by inserting an addrspacecast on the result of static_addr_of_impl.

So, depending on what we need either function is called. If we need a llvm::GlobalVariable, we need to call static_addr_of_impl, if a ptr is needed, we call static_addr_of.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That all makes sense, but why in these call sites specifically, is it okay to have addrspace(1) ptr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we want to have the llvm::GlobalValue and edit some of its properties like with llvm::set_value_name below.
We cast it to an (addrspace(0)) ptr with the const_pointercast at the end of this Scalar::Ptr match arm.

};
if !self.sess().fewer_names() && llvm::get_value_name(value).is_empty()
{
Expand All @@ -311,7 +313,7 @@ impl<'ll, 'tcx> ConstCodegenMethods<'tcx> for CodegenCx<'ll, 'tcx> {
.global_alloc(self.tcx.vtable_allocation((ty, dyn_ty.principal())))
.unwrap_memory();
let init = const_alloc_to_llvm(self, alloc, /*static*/ false);
let value = self.static_addr_of(init, alloc.inner().align, None);
let value = self.static_addr_of_impl(init, alloc.inner().align, None);
(value, AddressSpace::DATA)
}
GlobalAlloc::Static(def_id) => {
Expand All @@ -323,7 +325,8 @@ impl<'ll, 'tcx> ConstCodegenMethods<'tcx> for CodegenCx<'ll, 'tcx> {
let llval = unsafe {
llvm::LLVMConstInBoundsGEP2(
self.type_i8(),
self.const_bitcast(base_addr, self.type_ptr_ext(base_addr_space)),
// Cast to the required address space if necessary
self.const_pointercast(base_addr, self.type_ptr_ext(base_addr_space)),
&self.const_usize(offset.bytes()),
1,
)
Expand Down
61 changes: 44 additions & 17 deletions compiler/rustc_codegen_llvm/src/consts.rs
Original file line number Diff line number Diff line change
Expand Up @@ -210,6 +210,14 @@ impl<'ll> CodegenCx<'ll, '_> {
unsafe { llvm::LLVMConstBitCast(val, ty) }
}

pub(crate) fn const_pointercast(&self, val: &'ll Value, ty: &'ll Type) -> &'ll Value {
unsafe { llvm::LLVMConstPointerCast(val, ty) }
}

/// Create a global variable.
///
/// The returned global variable is a pointer in the default address space for globals.
/// Fails if a symbol with the given name already exists.
pub(crate) fn static_addr_of_mut(
&self,
cv: &'ll Value,
Expand All @@ -233,6 +241,34 @@ impl<'ll> CodegenCx<'ll, '_> {
gv
}

/// Create a global constant.
///
/// The returned global variable is a pointer in the default address space for globals.
pub(crate) fn static_addr_of_impl(
&self,
cv: &'ll Value,
align: Align,
kind: Option<&str>,
) -> &'ll Value {
if let Some(&gv) = self.const_globals.borrow().get(&cv) {
unsafe {
// Upgrade the alignment in cases where the same constant is used with different
// alignment requirements
let llalign = align.bytes() as u32;
if llalign > llvm::LLVMGetAlignment(gv) {
llvm::LLVMSetAlignment(gv, llalign);
}
}
return gv;
}
let gv = self.static_addr_of_mut(cv, align, kind);
unsafe {
llvm::LLVMSetGlobalConstant(gv, True);
}
self.const_globals.borrow_mut().insert(cv, gv);
gv
}

#[instrument(level = "debug", skip(self))]
pub(crate) fn get_static(&self, def_id: DefId) -> &'ll Value {
let instance = Instance::mono(self.tcx, def_id);
Expand Down Expand Up @@ -505,24 +541,15 @@ impl<'ll> CodegenCx<'ll, '_> {
}

impl<'ll> StaticCodegenMethods for CodegenCx<'ll, '_> {
/// Get a pointer to a global variable.
///
/// The pointer will always be in the default address space. If global variables default to a
/// different address space, an addrspacecast is inserted.
fn static_addr_of(&self, cv: &'ll Value, align: Align, kind: Option<&str>) -> &'ll Value {
if let Some(&gv) = self.const_globals.borrow().get(&cv) {
unsafe {
// Upgrade the alignment in cases where the same constant is used with different
// alignment requirements
let llalign = align.bytes() as u32;
if llalign > llvm::LLVMGetAlignment(gv) {
llvm::LLVMSetAlignment(gv, llalign);
}
}
return gv;
}
let gv = self.static_addr_of_mut(cv, align, kind);
unsafe {
llvm::LLVMSetGlobalConstant(gv, True);
}
self.const_globals.borrow_mut().insert(cv, gv);
gv
let gv = self.static_addr_of_impl(cv, align, kind);
// static_addr_of_impl returns the bare global variable, which might not be in the default
// address space. Cast to the default address space if necessary.
self.const_pointercast(gv, self.type_ptr())
}

fn codegen_static(&self, def_id: DefId) {
Expand Down
25 changes: 25 additions & 0 deletions compiler/rustc_codegen_llvm/src/debuginfo/metadata.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1500,6 +1500,26 @@ fn build_vtable_type_di_node<'ll, 'tcx>(
.di_node
}

/// Get the global variable for the vtable.
///
/// When using global variables, we may have created an addrspacecast to get a pointer to the
/// default address space if global variables are created in a different address space.
/// For modifying the vtable, we need the real global variable. This function accepts either a
/// global variable (which is simply returned), or an addrspacecast constant expression.
/// If the given value is an addrspacecast, the cast is removed and the global variable behind
/// the cast is returned.
fn find_vtable_behind_cast<'ll>(vtable: &'ll Value) -> &'ll Value {
// The vtable is a global variable, which may be behind an addrspacecast.
unsafe {
if let Some(c) = llvm::LLVMIsAConstantExpr(vtable) {
if llvm::LLVMGetConstOpcode(c) == llvm::Opcode::AddrSpaceCast {
return llvm::LLVMGetOperand(c, 0).unwrap();
}
}
}
vtable
}

pub(crate) fn apply_vcall_visibility_metadata<'ll, 'tcx>(
cx: &CodegenCx<'ll, 'tcx>,
ty: Ty<'tcx>,
Expand All @@ -1520,6 +1540,8 @@ pub(crate) fn apply_vcall_visibility_metadata<'ll, 'tcx>(

let Some(trait_ref) = trait_ref else { return };

// Unwrap potential addrspacecast
let vtable = find_vtable_behind_cast(vtable);
let trait_ref_self = trait_ref.with_self_ty(cx.tcx, ty);
let trait_ref_self = cx.tcx.erase_regions(trait_ref_self);
let trait_def_id = trait_ref_self.def_id();
Expand Down Expand Up @@ -1593,6 +1615,9 @@ pub(crate) fn create_vtable_di_node<'ll, 'tcx>(
return;
}

// Unwrap potential addrspacecast
let vtable = find_vtable_behind_cast(vtable);

// When full debuginfo is enabled, we want to try and prevent vtables from being
// merged. Otherwise debuggers will have a hard time mapping from dyn pointer
// to concrete type.
Expand Down
77 changes: 77 additions & 0 deletions compiler/rustc_codegen_llvm/src/llvm/ffi.rs
Original file line number Diff line number Diff line change
Expand Up @@ -660,6 +660,79 @@ pub enum MemoryEffects {
InaccessibleMemOnly,
}

/// LLVMOpcode
#[derive(Copy, Clone, PartialEq, Eq)]
#[repr(C)]
pub enum Opcode {
Ret = 1,
Br = 2,
Switch = 3,
IndirectBr = 4,
Invoke = 5,
Unreachable = 7,
CallBr = 67,
FNeg = 66,
Add = 8,
FAdd = 9,
Sub = 10,
FSub = 11,
Mul = 12,
FMul = 13,
UDiv = 14,
SDiv = 15,
FDiv = 16,
URem = 17,
SRem = 18,
FRem = 19,
Shl = 20,
LShr = 21,
AShr = 22,
And = 23,
Or = 24,
Xor = 25,
Alloca = 26,
Load = 27,
Store = 28,
GetElementPtr = 29,
Trunc = 30,
ZExt = 31,
SExt = 32,
FPToUI = 33,
FPToSI = 34,
UIToFP = 35,
SIToFP = 36,
FPTrunc = 37,
FPExt = 38,
PtrToInt = 39,
IntToPtr = 40,
BitCast = 41,
AddrSpaceCast = 60,
ICmp = 42,
FCmp = 43,
PHI = 44,
Call = 45,
Select = 46,
UserOp1 = 47,
UserOp2 = 48,
VAArg = 49,
ExtractElement = 50,
InsertElement = 51,
ShuffleVector = 52,
ExtractValue = 53,
InsertValue = 54,
Freeze = 68,
Fence = 55,
AtomicCmpXchg = 56,
AtomicRMW = 57,
Resume = 58,
LandingPad = 59,
CleanupRet = 61,
CatchRet = 62,
CatchPad = 63,
CleanupPad = 64,
CatchSwitch = 65,
}

unsafe extern "C" {
type Opaque;
}
Expand Down Expand Up @@ -975,7 +1048,10 @@ unsafe extern "C" {
pub fn LLVMConstPtrToInt<'a>(ConstantVal: &'a Value, ToType: &'a Type) -> &'a Value;
pub fn LLVMConstIntToPtr<'a>(ConstantVal: &'a Value, ToType: &'a Type) -> &'a Value;
pub fn LLVMConstBitCast<'a>(ConstantVal: &'a Value, ToType: &'a Type) -> &'a Value;
pub fn LLVMConstPointerCast<'a>(ConstantVal: &'a Value, ToType: &'a Type) -> &'a Value;
pub fn LLVMGetAggregateElement(ConstantVal: &Value, Idx: c_uint) -> Option<&Value>;
pub fn LLVMGetConstOpcode(ConstantVal: &Value) -> Opcode;
pub fn LLVMIsAConstantExpr(Val: &Value) -> Option<&Value>;

// Operations on global variables, functions, and aliases (globals)
pub fn LLVMIsDeclaration(Global: &Value) -> Bool;
Expand Down Expand Up @@ -1032,6 +1108,7 @@ unsafe extern "C" {
// Operations on instructions
pub fn LLVMIsAInstruction(Val: &Value) -> Option<&Value>;
pub fn LLVMGetFirstBasicBlock(Fn: &Value) -> &BasicBlock;
pub fn LLVMGetOperand(Val: &Value, Index: c_uint) -> Option<&Value>;

// Operations on call sites
pub fn LLVMSetInstructionCallConv(Instr: &Value, CC: c_uint);
Expand Down
Loading