Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion error in function pointer fmt::Debug implementation - LLVM ERROR: Cannot select: t2: i16 = addrspacecast[1 -> 0] #143

Open
dylanmckay opened this issue Jun 8, 2019 · 17 comments

Comments

@dylanmckay
Copy link
Member

dylanmckay commented Jun 8, 2019

Raised from discussion on #137.

LLVM ERROR: Cannot select: t2: i16 = addrspacecast[1 -> 0] undef:i16
  t1: i16 = undef
In function: _ZN4core3ptr87_$LT$impl$u20$core..fmt..Debug$u20$for$u20$unsafe$u20$fn$LP$A$RP$$u20$.$GT$$u20$Ret$GT$3fmt17h0fdf8ca7140def9bE

The bug is exposed by the std::fmt::Debug implementation for functions and function pointers. Rust emits a pointer dereference with a target pointer type of addrspace(0)* i16, using LLVM routines that will implicitly insert a bitcast (iN -> iM bits) or an address space cast if necessary. In the case of function pointers, Rust should codegen a addrspace(1)* i16 pointer, so that LLVM doesn't have to map (addrspacecast) PROGMEM pointers to RAM pointers an impossible task because there is no memory mapping and the address spaces do not overlap.

Here is the full LLVM IR for libcore from the #137 branch libcore.ll.

@dylanmckay
Copy link
Member Author

Here is the IR of the failing function

; <&T as core::fmt::Debug>::fmt
; Function Attrs: uwtable
define internal zeroext i1 @"_ZN42_$LT$$RF$T$u20$as$u20$core..fmt..Debug$GT$3fmt17hbd9cadb576049f93E"({ i8*, i8* } ({}*) addrspace(1)*** noalias nocapture readonly align 1 dereferenceable(2) %self, %"fmt::Formatter"* align 1 dereferenceable(27) %f) unnamed_addr addrspace(1) #6 {
start:
  %buf.i.i.i.i = alloca [128 x i8], align 1
  %0 = bitcast { i8*, i8* } ({}*) addrspace(1)*** %self to {} addrspace(1)***
  %1 = load {} addrspace(1)**, {} addrspace(1)*** %0, align 1, !nonnull !456
  %2 = load {} addrspace(1)*, {} addrspace(1)** %1, align 1, !alias.scope !5154, !nonnull !456
  %3 = addrspacecast {} addrspace(1)* %2 to {}*
  %4 = ptrtoint {}* %3 to i16
  %5 = getelementptr inbounds %"fmt::Formatter", %"fmt::Formatter"* %f, i16 0, i32 7, i32 0
  %6 = load i8, i8* %5, align 1, !range !2, !noalias !5157
  %7 = getelementptr inbounds %"fmt::Formatter", %"fmt::Formatter"* %f, i16 0, i32 7, i32 1
  %8 = load i16, i16* %7, align 1, !noalias !5157
  %9 = bitcast %"fmt::Formatter"* %f to i32*
  %10 = load i32, i32* %9, align 1, !noalias !5157
  %11 = and i32 %10, 4
  %12 = icmp eq i32 %11, 0
  br i1 %12, label %bb6.i.i, label %bb2.i.i

bb2.i.i:                                          ; preds = %start
  %13 = or i32 %10, 8
  store i32 %13, i32* %9, align 1, !noalias !5157
  %14 = icmp eq i8 %6, 0
  br i1 %14, label %bb3.i.i, label %bb6.i.i

bb3.i.i:                                          ; preds = %bb2.i.i
  store i16 6, i16* %7, align 1, !noalias !5157
  store i8 1, i8* %5, align 1, !noalias !5157
  br label %bb6.i.i

bb6.i.i:                                          ; preds = %bb3.i.i, %bb2.i.i, %start
  %15 = phi i32 [ %10, %start ], [ %13, %bb3.i.i ], [ %13, %bb2.i.i ]
  %16 = or i32 %15, 4
  store i32 %16, i32* %9, align 1, !noalias !5157
  %17 = getelementptr inbounds [128 x i8], [128 x i8]* %buf.i.i.i.i, i16 0, i16 0
  call addrspace(1) void @llvm.lifetime.start.p0i8(i64 128, i8* nonnull %17), !noalias !5160
  %18 = getelementptr inbounds [128 x i8], [128 x i8]* %buf.i.i.i.i, i16 0, i16 128
  br label %bb15.i.i.i.i

bb15.i.i.i.i:                                     ; preds = %bb15.i.i.i.i, %bb6.i.i
  %iter.sroa.4.0.i.i.i.i = phi i8* [ %18, %bb6.i.i ], [ %19, %bb15.i.i.i.i ]
  %x.0.i.i.i.i = phi i16 [ %4, %bb6.i.i ], [ %20, %bb15.i.i.i.i ]
  %curr.0.i.i.i.i = phi i16 [ 128, %bb6.i.i ], [ %26, %bb15.i.i.i.i ]
  %19 = getelementptr inbounds i8, i8* %iter.sroa.4.0.i.i.i.i, i16 -1
  %20 = lshr i16 %x.0.i.i.i.i, 4
  %21 = trunc i16 %x.0.i.i.i.i to i8
  %22 = and i8 %21, 15
  %23 = icmp ult i8 %22, 10
  %24 = or i8 %22, 48
  %25 = add nuw nsw i8 %22, 87
  %_0.0.i15.i.i.i.i = select i1 %23, i8 %24, i8 %25
  store i8 %_0.0.i15.i.i.i.i, i8* %19, align 1, !noalias !5160
  %26 = add nsw i16 %curr.0.i.i.i.i, -1
  %27 = icmp eq i16 %20, 0
  br i1 %27, label %bb43.i.i.i.i, label %bb15.i.i.i.i

bb43.i.i.i.i:                                     ; preds = %bb15.i.i.i.i
  %28 = icmp ugt i16 %26, 128
  br i1 %28, label %bb1.i.i.i.i.i.i.i, label %"_ZN4core3ptr87_$LT$impl$u20$core..fmt..Debug$u20$for$u20$unsafe$u20$fn$LP$A$RP$$u20$.$GT$$u20$Ret$GT$3fmt17ha5a3565824dd93e2E.exit"

bb1.i.i.i.i.i.i.i:                                ; preds = %bb43.i.i.i.i
; call core::slice::slice_index_order_fail
  tail call addrspace(1) void @_ZN4core5slice22slice_index_order_fail17h1a521a1329c8fcf9E(i16 %26, i16 128) #17, !noalias !5160
  unreachable

"_ZN4core3ptr87_$LT$impl$u20$core..fmt..Debug$u20$for$u20$unsafe$u20$fn$LP$A$RP$$u20$.$GT$$u20$Ret$GT$3fmt17ha5a3565824dd93e2E.exit": ; preds = %bb43.i.i.i.i
  %29 = getelementptr inbounds [128 x i8], [128 x i8]* %buf.i.i.i.i, i16 0, i16 %26
  %30 = sub i16 129, %curr.0.i.i.i.i
  %_3.sroa.0.0._3.sroa.0.0..cast.i.i.i.i.i = bitcast i8* %29 to [0 x i8]*
; call core::fmt::Formatter::pad_integral
  %31 = call zeroext addrspace(1) i1 @_ZN4core3fmt9Formatter12pad_integral17h8cbb788acd2ae784E(%"fmt::Formatter"* nonnull align 1 dereferenceable(27) %f, i1 zeroext true, [0 x i8]* noalias nonnull readonly align 1 bitcast (<{ [2 x i8] }>* @anon.1eadd1440c2e72f10a9c27e27f9e4574.191 to [0 x i8]*), i16 2, [0 x i8]* noalias nonnull readonly align 1 %_3.sroa.0.0._3.sroa.0.0..cast.i.i.i.i.i, i16 %30), !noalias !5160
  call addrspace(1) void @llvm.lifetime.end.p0i8(i64 128, i8* nonnull %17), !noalias !5160
  store i8 %6, i8* %5, align 1, !noalias !5157
  store i16 %8, i16* %7, align 1, !noalias !5157
  store i32 %10, i32* %9, align 1, !noalias !5157
  ret i1 %31
}

@dylanmckay
Copy link
Member Author

It seems like LLVM/Rust is choking because when debug printing function pointers.

This is the shoddy IR

  %3 = addrspacecast {} addrspace(1)* %2 to {}*
  %4 = ptrtoint {}* %3 to i16

In order to convert an addrspace(1) pointer to an integer for debug printing, it first attempts to convert it to a pointer in the default address space zero, and then convert that pointer to an integer for printing.

Think of it like this PROGMEM function pointer -> RAM pointer -> integer representation.

On AVR, you cannot convert a PROGMEM function pointer into a RAM pointer (the physical memory spaces are disjoint). Also, it probably makes more sense to print the PROGMEM function pointer when debug printing functions.

We should just go PROGMEM function pointer -> integer representation. The address space cast is unnecessary, and in fact, harmful.

I am not sure if it is possible to implement the addrspacecast instruction in the AVR backend.

We need to modify Rust so that it does not attempt this address space conversion and instead use ptrtoint on the addrspace(1) pointer.

@dylanmckay
Copy link
Member Author

#137 (comment)

The underlying issue is that *const T always has addrspace(0). I think explicit casts (ptr as *const T) should preserve the address space of their input here.

@ecstatic-morse

@dylanmckay dylanmckay changed the title Assertion error in function pointer fmt::Debug implementation Assertion error in function pointer fmt::Debug implementation - LLVM ERROR: Cannot select: t2: i16 = addrspacecast[1 -> 0] Jun 8, 2019
@shepmaster
Copy link
Member

Have we yet figured out which function pointer is being printed?

@dylanmckay
Copy link
Member Author

dylanmckay commented Jun 8, 2019

Not sure, but I've managed to work around it by replacing the implementation of the function pointer Debug::fmt implementations with Ok(()).

As of 20 seconds ago, I successfully compiled the blink program to an ELF under release mode with this branch (current tip 8b7240e).

@shepmaster
Copy link
Member

We do have a history of adding hacks like that (especially to the formatting infrastructure) to get things working again. I'd be cool with adding it, changed to hard-code the string "disabled due to #143" so it's not just completely opaque ;-)

@dylanmckay
Copy link
Member Author

Good point, I've updated the commit and pushed 8b7240e.

@dylanmckay
Copy link
Member Author

The current patch is this

diff --git a/src/libcore/ptr/mod.rs b/src/libcore/ptr/mod.rs
index a04bb11311..caef99ca47 100644
--- a/src/libcore/ptr/mod.rs
+++ b/src/libcore/ptr/mod.rs
@@ -2631,14 +2631,14 @@ macro_rules! fnptr_impls_safety_abi {
         #[stable(feature = "fnptr_impls", since = "1.4.0")]
         impl<Ret, $($Arg),*> fmt::Pointer for $FnTy {
             fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
-                "disabled due to avr-rust/rust#143".fmt(f)
+                fmt::Pointer::fmt(&(*self as usize as *const ()), f)
             }
         }
 
         #[stable(feature = "fnptr_impls", since = "1.4.0")]
         impl<Ret, $($Arg),*> fmt::Debug for $FnTy {
             fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
-                "disabled due to avr-rust/rust#143".fmt(f)
+                fmt::Pointer::fmt(&(*self as usize as *const ()), f)
             }
         }
     }

Preserving @shepmaster's comment from #137.

From an end goal standpoint, will the numeric values of the addresses ever overlap? That is, will we ever print 0x1000/space0 and 0x1000/space1 as the same value? If so, that would rather suck from a debugging persepective.

@shepmaster
Copy link
Member

This occurs outside of Debug formatting. For example, the implementations of async/await executors do function pointer conversions. This could be really annoying.

@shepmaster
Copy link
Member

shepmaster commented Jun 9, 2019

Tiny Rust example:

fn alpha(i: i32) -> i32 { i + 1 }
static FOO: fn(i32) -> i32 = alpha;

The arguments to ConstantExpr::getBitCast:

(lldb) p C->dump()

; Function Attrs: minsize optsize uwtable
define internal i32 @_ZN5blink4main5alpha17he7cdc130812da1f7E(i32) unnamed_addr addrspace(1) #4 !dbg !256 {
  %2 = alloca i32, align 1
  store i32 %0, i32* %2, align 1
  call addrspace(1) void @llvm.dbg.declare(metadata i32* %2, metadata !262, metadata !DIExpression()), !dbg !263
  %3 = load i32, i32* %2, align 1, !dbg !264
  %4 = add i32 %3, 1, !dbg !264
  ret i32 %4, !dbg !265
}

(lldb) p DstTy->dump()
i8*

This fails at

if (SrcPtrTy->getAddressSpace() != DstPtrTy->getAddressSpace())

@shepmaster
Copy link
Member

shepmaster commented Jun 12, 2019

Ok, I pinged some smart compiler devs:

Nikita Popov

From a quick search, we have https://github.com/rust-lang/rust/blob/49d139c64b69ec5289f9f81db885ecfc2c7a8366/src/librustc_codegen_llvm/abi.rs#L365 and https://github.com/rust-lang/rust/blob/49d139c64b69ec5289f9f81db885ecfc2c7a8366/src/librustc_codegen_llvm/type_.rs#L308.

Possibly the code in https://github.com/rust-lang/rust/blob/da9ebc828c982d2ed49396886da85011e1b0a6c0/src/librustc_codegen_ssa/mir/rvalue.rs#L167 is relevant, it deals a lot with casting around function types.

eddyb

the i8* is coming from us lowering miri allocations to LLVM

it wouldn't be that hard to change the addrspace when lowering a pointer to a function, as opposed to a data pointer (cc oli)
https://github.com/rust-lang/rust/blob/c4797fa4f4a696b183b3aa1517ee22c78d0f5d7a/src/librustc_codegen_llvm/common.rs#L322

so there's a cast here, where type_i8p shouldn't always be the same addrspace: https://github.com/rust-lang/rust/blob/c4797fa4f4a696b183b3aa1517ee22c78d0f5d7a/src/librustc_codegen_llvm/common.rs#L331

and then here, either check self.tcx.alloc_map.lock().get(alloc_id) for GlobalAlloc::Function or avoid the cast in scalar_to_backend by not passing in a type to scalar_to_backend (and so move the cast at the end of scalar_to_backend to its other 2 callers) https://github.com/rust-lang/rust/blob/c4797fa4f4a696b183b3aa1517ee22c78d0f5d7a/src/librustc_codegen_llvm/consts.rs#L50

oli

yea, in https://github.com/rust-lang/rust/blob/c4797fa4f4a696b183b3aa1517ee22c78d0f5d7a/src/librustc_codegen_llvm/common.rs#L322 we could just return early instead of falling through to the cast

@shepmaster
Copy link
Member

eddyb walked me through something that seems to at least compile; I've hit other issues with async/await so my program as a whole doesn't link, but might be worth someone else trying this patch out and seeing if the weird cast we added for Debug can be avoided with it:

commit f7d396073a3adb5ab8ddcd203cd31abeb8ee7d21
Author: Jake Goulding <jake.goulding@gmail.com>
Date:   Thu Jun 13 15:42:36 2019 -0400

    Special-case this function pointer thing

diff --git a/src/librustc_codegen_llvm/common.rs b/src/librustc_codegen_llvm/common.rs
index 0b23aac522..5cab9f47db 100644
--- a/src/librustc_codegen_llvm/common.rs
+++ b/src/librustc_codegen_llvm/common.rs
@@ -308,6 +308,7 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
                 }
             },
             Scalar::Ptr(ptr) => {
+                let mut do_something_special = false;
                 let alloc_kind = self.tcx.alloc_map.lock().get(ptr.alloc_id);
                 let base_addr = match alloc_kind {
                     Some(GlobalAlloc::Memory(alloc)) => {
@@ -319,6 +320,7 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
                         }
                     }
                     Some(GlobalAlloc::Function(fn_instance)) => {
+                        do_something_special = true;
                         self.get_fn(fn_instance)
                     }
                     Some(GlobalAlloc::Static(def_id)) => {
@@ -327,15 +329,19 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
                     }
                     None => bug!("missing allocation {:?}", ptr.alloc_id),
                 };
-                let llval = unsafe { llvm::LLVMConstInBoundsGEP(
-                    self.const_bitcast(base_addr, self.type_i8p()),
-                    &self.const_usize(ptr.offset.bytes()),
-                    1,
-                ) };
-                if layout.value != layout::Pointer {
-                    unsafe { llvm::LLVMConstPtrToInt(llval, llty) }
+                if do_something_special {
+                    return base_addr; // move to match?
                 } else {
-                    self.const_bitcast(llval, llty)
+                    let llval = unsafe { llvm::LLVMConstInBoundsGEP(
+                        self.const_bitcast(base_addr, self.type_i8p()),
+                        &self.const_usize(ptr.offset.bytes()),
+                        1,
+                    ) };
+                    if layout.value != layout::Pointer {
+                        unsafe { llvm::LLVMConstPtrToInt(llval, llty) }
+                    } else {
+                        self.const_bitcast(llval, llty)
+                    }
                 }
             }
         }

@shepmaster
Copy link
Member

An updated version of the previous patch

diff --git a/src/librustc_codegen_llvm/common.rs b/src/librustc_codegen_llvm/common.rs
index a5cda5949ee..fdbf222549e 100644
--- a/src/librustc_codegen_llvm/common.rs
+++ b/src/librustc_codegen_llvm/common.rs
@@ -244,6 +244,8 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
                 }
             }
             Scalar::Ptr(ptr) => {
+                let mut do_something_special = false;
+
                 let base_addr = match self.tcx.global_alloc(ptr.alloc_id) {
                     GlobalAlloc::Memory(alloc) => {
                         let init = const_alloc_to_llvm(self, alloc);
@@ -256,24 +258,31 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
                         }
                         value
                     }
-                    GlobalAlloc::Function(fn_instance) => self.get_fn_addr(fn_instance),
+                    GlobalAlloc::Function(fn_instance) => {
+                        do_something_special = true;
+                        self.get_fn_addr(fn_instance)
+                    },
                     GlobalAlloc::Static(def_id) => {
                         assert!(self.tcx.is_static(def_id));
                         assert!(!self.tcx.is_thread_local_static(def_id));
                         self.get_static(def_id)
                     }
                 };
-                let llval = unsafe {
-                    llvm::LLVMConstInBoundsGEP(
-                        self.const_bitcast(base_addr, self.type_i8p()),
-                        &self.const_usize(ptr.offset.bytes()),
-                        1,
-                    )
-                };
-                if layout.value != Pointer {
-                    unsafe { llvm::LLVMConstPtrToInt(llval, llty) }
+                if do_something_special {
+                    return base_addr; // move to match?
                 } else {
-                    self.const_bitcast(llval, llty)
+                    let llval = unsafe {
+                        llvm::LLVMConstInBoundsGEP(
+                            self.const_bitcast(base_addr, self.type_i8p()),
+                            &self.const_usize(ptr.offset.bytes()),
+                            1,
+                        )
+                    };
+                    if layout.value != Pointer {
+                        unsafe { llvm::LLVMConstPtrToInt(llval, llty) }
+                    } else {
+                        self.const_bitcast(llval, llty)
+                    }
                 }
             }
         }

Applying this does seem to prevent some of the newly-added function pointer casts I see, but I have other failures down the line.

@dylanmckay
Copy link
Member Author

Thanks for re-pining this @shepmaster, I lost track of this. I've been working on address space cast fixes in the same area here: rust-lang#73270 I will try and merge the patch into the PR.

@shepmaster
Copy link
Member

I actually expect that your changes, being more thorough, may have the possibility of making this diff obsolete (fingers crossed)

@dylanmckay
Copy link
Member Author

dylanmckay commented Jun 16, 2020

That makes sense - the "other failures down the line" - were they related to function pointer casts? I'm curious because it would be an interesting data point if that patch fixed all/majority of them.

I'm still wrestling with rustc (in particular, I believe a PlaceRef containing an LLVM Value with the wrong address space is constructed early on for a function pointer (specifically the one in #169) but the LLVM assertion error only hits later on when the bitcast is done during operand codegen.

@shepmaster
Copy link
Member

other failures down the line" - were they related to function pointer casts?

I believe so, but it was a year ago, so my memory is fuzzy. IIRC, I was trying to get some async executors to work on AVR. Those do make use of function pointers to create a manual vtable.

FWIW, if I combine

  • the diff in this issue
  • your address space PR
  • most of the LLVM cherry picks

Then my error when compiling libcore changes:

Assertion failed: (ExtractValueInst::getIndexedType(Agg->getType(), Idxs) == Val->getType() && "Inserted value must match indexed type!"), function init, file /Users/shep/Projects/rust/src/llvm-project/llvm/lib/IR/Instructions.cpp, line 2110.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants