Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPARC - passing argument from C++ to Rust issue #86163

Closed
psumbera opened this issue Jun 9, 2021 · 14 comments · Fixed by #91003
Closed

SPARC - passing argument from C++ to Rust issue #86163

psumbera opened this issue Jun 9, 2021 · 14 comments · Fixed by #91003
Labels
A-FFI Area: Foreign function interface (FFI) C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-SPARC Target: SPARC processors P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@psumbera
Copy link
Contributor

psumbera commented Jun 9, 2021

This was originally reported as Firefox issue: https://bugzilla.mozilla.org/show_bug.cgi?id=1714064

But now it seems to be rather Rust issue.

Unfortunately I wasn't able to reproduce it outside of Firefox yet:

C++ code calls wr_dp_push_stacking_context() with parameter aParams.mFilterDatas.Length() which is zero (0):

https://searchfox.org/mozilla-central/rev/79d73b4aff88dd4a0f06dd3789e1148c49b0de60/gfx/webrender_bindings/WebRenderAPI.cpp#1028

But Rust code gets instead of 0 some nonsense number (e.g. 133081200598864):

https://searchfox.org/mozilla-central/rev/79d73b4aff88dd4a0f06dd3789e1148c49b0de60/gfx/webrender_bindings/src/bindings.rs#2455

I was able to limit the wr_dp_push_stacking_context() to just:

#[no_mangle]
pub extern "C" fn wr_dp_push_stacking_context(
    bounds: LayoutRect,
    filter_count: usize,
    glyph_raster_space: RasterSpace,
    params: &WrStackingContextParams,
)  {
    debug_assert!(unsafe { !is_in_render_thread() });
    println!("XXX start {}", filter_count);
}

And the problem is still there. But when I swap bounds: LayoutRect with filter_count: usize then filter_count is passed correctly!

LayoutRect is quite complicated (to me):
https://searchfox.org/mozilla-central/rev/79d73b4aff88dd4a0f06dd3789e1148c49b0de60/gfx/wr/webrender_api/src/units.rs#86
Important might be that it uses floating numbers!

Following is disassembled wr_dp_push_stacking_context function as defined above:

wr_dp_push_stacking_context:    save      %sp, -0x160, %sp
wr_dp_push_stacking_context+4:  call      +0x8          <wr_dp_push_stacking_context+0xc>
wr_dp_push_stacking_context+8:  sethi     %hi(0x3b05800), %i0
wr_dp_push_stacking_context+0xc:or        %i0, 0x3f8, %i0
wr_dp_push_stacking_context+0x10:       add       %i0, %o7, %i0
wr_dp_push_stacking_context+0x14:       stx       %i0, [%fp + 0x767]
wr_dp_push_stacking_context+0x18:       ldx       [%fp + 0x8af], %i0
wr_dp_push_stacking_context+0x1c:       st        %f1, [%fp + 0x787]
wr_dp_push_stacking_context+0x20:       st        %f3, [%fp + 0x78b]
wr_dp_push_stacking_context+0x24:       st        %f5, [%fp + 0x78f]
wr_dp_push_stacking_context+0x28:       st        %f7, [%fp + 0x793]
wr_dp_push_stacking_context+0x2c:       ld        [%fp + 0x787], %i0
wr_dp_push_stacking_context+0x30:       sllx      %i0, 0x20, %i0
wr_dp_push_stacking_context+0x34:       ld        [%fp + 0x78b], %i1
wr_dp_push_stacking_context+0x38:       or        %i0, %i1, %i0
wr_dp_push_stacking_context+0x3c:       stx       %i0, [%fp + 0x777]
wr_dp_push_stacking_context+0x40:       ld        [%fp + 0x78f], %i0
wr_dp_push_stacking_context+0x44:       sllx      %i0, 0x20, %i0
wr_dp_push_stacking_context+0x48:       ld        [%fp + 0x793], %i1
wr_dp_push_stacking_context+0x4c:       or        %i0, %i1, %i0
wr_dp_push_stacking_context+0x50:       stx       %i0, [%fp + 0x77f]
wr_dp_push_stacking_context+0x54:       stx       %i4, [%fp + 0x797]
wr_dp_push_stacking_context+0x58:       stx       %i5, [%fp + 0x7a7]
wr_dp_push_stacking_context+0x5c:       call      -0xae63cb0    <is_in_render_thread>
wr_dp_push_stacking_context+0x60:       stx       %i5, [%fp + 0x79f]
wr_dp_push_stacking_context+0x64:       ba        +0x8          <wr_dp_push_stacking_context+0x6c>
wr_dp_push_stacking_context+0x68:       st        %o0, [%fp + 0x773]
wr_dp_push_stacking_context+0x6c:       ld        [%fp + 0x773], %i0
wr_dp_push_stacking_context+0x70:       cmp       %i0, 0x0
wr_dp_push_stacking_context+0x74:       be        +0x38         <wr_dp_push_stacking_context+0xac>
wr_dp_push_stacking_context+0x78:       nop
wr_dp_push_stacking_context+0x7c:       ba        +0x8          <wr_dp_push_stacking_context+0x84>
wr_dp_push_stacking_context+0x80:       nop
wr_dp_push_stacking_context+0x84:       ldx       [%fp + 0x767], %i0
wr_dp_push_stacking_context+0x88:       sethi     %hi(0x35c00), %i1
wr_dp_push_stacking_context+0x8c:       add       %i1, 0x3b8, %i1
wr_dp_push_stacking_context+0x90:       ldx       [%i0 + %i1], %o0
wr_dp_push_stacking_context+0x94:       sethi     %hi(0x36000), %i1
wr_dp_push_stacking_context+0x98:       add       %i1, 0x50, %i1
wr_dp_push_stacking_context+0x9c:       ldx       [%i0 + %i1], %o2
wr_dp_push_stacking_context+0xa0:       call      +0x3c7726c    <PLT:_ZN4core9panicking5panic17he6543f3ddbf0355fE>
wr_dp_push_stacking_context+0xa4:       mov       0x33, %o1
wr_dp_push_stacking_context+0xa8:       ta        %icc, 0x5
wr_dp_push_stacking_context+0xac:       ldx       [%fp + 0x767], %i0
wr_dp_push_stacking_context+0xb0:       add       %fp, 0x797, %o0
wr_dp_push_stacking_context+0xb4:       stx       %o0, [%fp + 0x7ef]
wr_dp_push_stacking_context+0xb8:       stx       %o0, [%fp + 0x7f7]
wr_dp_push_stacking_context+0xbc:       sethi     %hi(0x1400), %i1
wr_dp_push_stacking_context+0xc0:       add       %i1, 0x190, %i1
wr_dp_push_stacking_context+0xc4:       call      +0x3c81028    <PLT:_ZN4core3fmt10ArgumentV13new17ha9bc9565de3aadf7E>
wr_dp_push_stacking_context+0xc8:       ldx       [%i0 + %i1], %o1
wr_dp_push_stacking_context+0xcc:       stx       %o0, [%fp + 0x757]
wr_dp_push_stacking_context+0xd0:       ba        +0x8          <wr_dp_push_stacking_context+0xd8>
wr_dp_push_stacking_context+0xd4:       stx       %o1, [%fp + 0x75f]
wr_dp_push_stacking_context+0xd8:       ldx       [%fp + 0x767], %i0
wr_dp_push_stacking_context+0xdc:       ldx       [%fp + 0x75f], %i1
wr_dp_push_stacking_context+0xe0:       ldx       [%fp + 0x757], %i2
wr_dp_push_stacking_context+0xe4:       stx       %i2, [%fp + 0x7df]
wr_dp_push_stacking_context+0xe8:       stx       %i1, [%fp + 0x7e7]
wr_dp_push_stacking_context+0xec:       sethi     %hi(0x36000), %i1
wr_dp_push_stacking_context+0xf0:       add       %i1, 0x58, %i1
wr_dp_push_stacking_context+0xf4:       ldx       [%i0 + %i1], %o1
wr_dp_push_stacking_context+0xf8:       add       %fp, 0x7af, %o0
wr_dp_push_stacking_context+0xfc:       mov       0x2, %o2
wr_dp_push_stacking_context+0x100:      add       %fp, 0x7df, %o3
wr_dp_push_stacking_context+0x104:      call      +0xe2a0c      <core::fmt::Arguments::new_v1::h337fca81b2e584b1>
wr_dp_push_stacking_context+0x108:      mov       0x1, %o4
wr_dp_push_stacking_context+0x10c:      ba        +0x8          <wr_dp_push_stacking_context+0x114>
wr_dp_push_stacking_context+0x110:      nop
wr_dp_push_stacking_context+0x114:      call      +0x3df5ab0    <PLT:_ZN3std2io5stdio6_print17h02792fbd1097b851E>
wr_dp_push_stacking_context+0x118:      add       %fp, 0x7af, %o0
wr_dp_push_stacking_context+0x11c:      ba        +0x8          <wr_dp_push_stacking_context+0x124>
wr_dp_push_stacking_context+0x120:      nop
wr_dp_push_stacking_context+0x124:      ret
wr_dp_push_stacking_context+0x128:      restore
@psumbera psumbera added the C-bug Category: This is a bug. label Jun 9, 2021
@jonas-schievink jonas-schievink added A-FFI Area: Foreign function interface (FFI) O-SPARC Target: SPARC processors labels Jun 9, 2021
@nagisa nagisa added the I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness label Jun 9, 2021
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jun 9, 2021
@nagisa
Copy link
Member

nagisa commented Jun 9, 2021

Sounds like an ABI definition issue (e.g. something is passed via a value while it should be passed via a pointer). Marking this as unsound, as instead of a pointer, this could be mangling a pointer or a reference. Can we see the disassembly for the calling side as well?

Could you also try to produce a MCVE (a test case that does not rely on external crates?) by inlining the type definitions, etc. Also helpful would be a removal of the panics, e.g. by replacing them with if filter_count != 0 { std::intrinsics::abort() }. Or something along those lines.

@nagisa nagisa added the E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example label Jun 9, 2021
@psumbera
Copy link
Contributor Author

psumbera commented Jun 9, 2021

Following is DisplayListBuilder::PushStackingContext() which I use now for debugging:

Maybe<wr::WrSpatialId> DisplayListBuilder::PushStackingContext(
    const wr::StackingContextParams& aParams, const wr::LayoutRect& aBounds,
    const wr::RasterSpace& aRasterSpace) {
  MOZ_ASSERT(mClipChainLeaf.isNothing(),
             "Non-empty leaf from clip chain given, but not used with SC!");

  wr::LayoutTransform matrix;
  const gfx::Matrix4x4* transform = aParams.mTransformPtr;
  if (transform) {
    matrix = ToLayoutTransform(*transform);
  }
  const wr::LayoutTransform* maybeTransform = transform ? &matrix : nullptr;
  WRDL_LOG("PushStackingContext b=%s t=%s\n", mWrState,
           ToString(aBounds).c_str(),
           transform ? ToString(*transform).c_str() : "none");
  printf("XXX before %ld %ld\n", aParams.mFilters.Length(), aParams.mFilterDatas.Length());
  wr_dp_push_stacking_context(
      aBounds,
      123456789,
      aRasterSpace,
      &aParams);
  return Nothing();
}

And this is its disassembly (_ZN7mozilla2wr18DisplayListBuilder19PushStackingContextERKNS0_21StackingContextParamsERKNS0_4RectIfNS0_11LayoutPixelEEERKNS0_11RasterSpaceE):

0:           save      %sp, -0x180, %sp
+4:          sethi     %hi(0xe374c00), %l7
+8:          add       %l7, 0x3bc, %l7
+0xc:        call      -0x2b0f6dc    <0x10c78ee8>
+0x10:       nop
+0x14:       mov       %l7, %i5
+0x18:       stx       %i0, [%fp + 0x797]
+0x1c:       stx       %i1, [%fp + 0x78f]
+0x20:       stx       %i2, [%fp + 0x787]
+0x24:       stx       %i3, [%fp + 0x77f]
+0x28:       stx       %i4, [%fp + 0x777]
+0x2c:       sethi     %hi(0x0), %g1
+0x30:       xor       %g1, 0x40, %g1
+0x34:       ldx       [%i5 + %g1], %g1
+0x38:       ldx       [%g1], %g2
+0x3c:       stx       %g2, [%fp + 0x7f7]
+0x40:       clr       %g2
+0x44:       ldx       [%fp + 0x78f], %g1
+0x48:       add       %g1, 0x50, %g1
+0x4c:       mov       %g1, %o0
+0x50:       call      +0xe02c       <_ZNK7mozilla5MaybeINS_2wr4RectIfNS1_11LayoutPixelEEEE9isNothingEv>
+0x54:       nop
+0x58:       mov       %o0, %g1
+0x5c:       xor       %g1, 0x1, %g1
+0x60:       and       %g1, 0xff, %g1
+0x64:       mov       %g1, %g2
+0x68:       clr       %g1
+0x6c:       movrne    %g2, 0x1, %g1
+0x70:       and       %g1, 0xff, %g1
+0x74:       cmp       %g1, 0x0
+0x78:       be,pt     %icc, +0x60   <_ZN7mozilla2wr18DisplayListBuilder19PushStackingContextERKNS0_21StackingContextParamsERKNS0_4RectIfNS0_11LayoutPixelEEERKNS0_11RasterSpaceE+0xd8>
+0x7c:       nop
+0x80:       mov       0x3f7, %o2
+0x84:       sethi     %hi(0x128ecc00), %g1
+0x88:       xor       %g1, -0x398, %g1
+0x8c:       add       %i5, %g1, %g1
+0x90:       mov       %g1, %o1
+0x94:       sethi     %hi(0x128ecc00), %g1
+0x98:       xor       %g1, -0xc0, %g1
+0x9c:       add       %i5, %g1, %g1
+0xa0:       mov       %g1, %o0
+0xa4:       call      -0x5408       <MOZ_ReportAssertionFailure>
+0xa8:       nop
+0xac:       sethi     %hi(0x128ecc00), %g1
+0xb0:       xor       %g1, -0x60, %g1
+0xb4:       add       %i5, %g1, %g1
+0xb8:       mov       %g1, %o0
+0xbc:       call      -0x545c       <_ZL22AnnotateMozCrashReasonPKc>
+0xc0:       nop
+0xc4:       clr       %g2
+0xc8:       mov       0x3f7, %g1 _ZN7mozilla2wr18DisplayListBuilder19PushStackingContextERKNS0_21StackingContextParamsERKNS0_4RectIfNS0_11LayoutPixelEEERKNS0_11RasterSpaceE+0xcc:       st        %g1, [%g2]
+0xd0:       call      +0xe440198    <PLT:abort>
+0xd4:       nop
+0xd8:       ldx       [%fp + 0x787], %g1
+0xdc:       ldx       [%g1 + 0x68], %g1
+0xe0:       stx       %g1, [%fp + 0x7a7]
+0xe4:       ldx       [%fp + 0x7a7], %g1
+0xe8:       brz,pt    %g1, +0x3c    <_ZN7mozilla2wr18DisplayListBuilder19PushStackingContextERKNS0_21StackingContextParamsERKNS0_4RectIfNS0_11LayoutPixelEEERKNS0_11RasterSpaceE+0x124>
+0xec:       nop
+0xf0:       add       %fp, 0x72f, %g1
+0xf4:       ldx       [%fp + 0x7a7], %o1
+0xf8:       mov       %g1, %o0
+0xfc:       call      +0x5608       <_ZN7mozilla2wrL17ToLayoutTransformINS_3gfx12UnknownUnitsES3_EENS0_11Transform3DIfNS0_11LayoutPixelES5_EERKNS2_14Matrix4x4TypedIT_T0_fEE>
+0x100:      nop
+0x104:      add       %fp, 0x7b7, %g1
+0x108:      add       %fp, 0x72f, %g2
+0x10c:      mov       0x40, %g3
+0x110:      mov       %g3, %o2
+0x114:      mov       %g2, %o1
+0x118:      mov       %g1, %o0
+0x11c:      call      +0xe440d0c    <PLT:memcpy>
+0x120:      nop
+0x124:      ldx       [%fp + 0x7a7], %g1
+0x128:      brz,pt    %g1, +0x14    <_ZN7mozilla2wr18DisplayListBuilder19PushStackingContextERKNS0_21StackingContextParamsERKNS0_4RectIfNS0_11LayoutPixelEEERKNS0_11RasterSpaceE+0x13c>
+0x12c:      nop
+0x130:      add       %fp, 0x7b7, %g1
+0x134:      ba,pt     %xcc, +0xc    <_ZN7mozilla2wr18DisplayListBuilder19PushStackingContextERKNS0_21StackingContextParamsERKNS0_4RectIfNS0_11LayoutPixelEEERKNS0_11RasterSpaceE+0x140>
+0x138:      nop
+0x13c:      clr       %g1
+0x140:      stx       %g1, [%fp + 0x7af]
+0x144:      ldx       [%fp + 0x787], %g1
+0x148:      add       %g1, 0x40, %g1
+0x14c:      mov       %g1, %o0
+0x150:      call      -0x2ac7a84    <_ZNK13nsTArray_baseI27nsTArrayInfallibleAllocator30nsTArray_RelocateUsingMemutilsE6LengthEv>
+0x154:      nop
+0x158:      mov       %o0, %i4
+0x15c:      ldx       [%fp + 0x787], %g1
+0x160:      add       %g1, 0x48, %g1
+0x164:      mov       %g1, %o0
+0x168:      call      -0x2ac7a9c    <_ZNK13nsTArray_baseI27nsTArrayInfallibleAllocator30nsTArray_RelocateUsingMemutilsE6LengthEv>
+0x16c:      nop
+0x170:      mov       %o0, %g1
+0x174:      mov       %g1, %o2
+0x178:      mov       %i4, %o1
+0x17c:      sethi     %hi(0x128ec800), %g1
+0x180:      xor       %g1, -0x3f8, %g1
+0x184:      add       %i5, %g1, %g1
+0x188:      mov       %g1, %o0
+0x18c:      call      +0xe445dfc    <PLT:printf>
+0x190:      nop
+0x194:      ldx       [%fp + 0x787], %g3
+0x198:      ldx       [%fp + 0x777], %g1
+0x19c:      ld        [%g1], %g2
+0x1a0:      srl       %g2, 0x0, %g2
+0x1a4:      sllx      %g2, 0x20, %g2
+0x1a8:      ld        [%g1 + 0x4], %g1
+0x1ac:      srl       %g1, 0x0, %g1
+0x1b0:      or        %g1, %g2, %g2
+0x1b4:      ldx       [%fp + 0x77f], %g1
+0x1b8:      ld        [%g1], %f11
+0x1bc:      ld        [%g1 + 0x4], %f10
+0x1c0:      ld        [%g1 + 0x8], %f9
+0x1c4:      ld        [%g1 + 0xc], %f8
+0x1c8:      mov       %g3, %o4
+0x1cc:      mov       %g2, %o3
+0x1d0:      sethi     %hi(0x75bcc00), %g1
+0x1d4:      or        %g1, 0x115, %o2
+0x1d8:      fmovs     %f11, %f0
+0x1dc:      fmovs     %f10, %f1
+0x1e0:      fmovs     %f9, %f2
+0x1e4:      fmovs     %f8, %f3
+0x1e8:      call      +0xa86f234    <wr_dp_push_stacking_context>
+0x1ec:      nop
+0x1f0:      mov       %l0, %g1
+0x1f4:      and       %g1, 0xff, %g1
+0x1f8:      sllx      %g1, 0x38, %g1
+0x1fc:      mov       %g1, %o1
+0x200:      ldx       [%fp + 0x797], %o0
+0x204:      call      +0xdec4       <_ZN7mozilla5MaybeINS_2wr11WrSpatialIdEEC1ENS_7NothingE>
+0x208:      nop
+0x20c:      sethi     %hi(0x0), %g1
+0x210:      xor       %g1, 0x40, %g1
+0x214:      ldx       [%i5 + %g1], %g1
+0x218:      ldx       [%fp + 0x7f7], %g2
+0x21c:      ldx       [%g1], %g1
+0x220:      xor       %g2, %g1, %g2
+0x224:      clr       %g1
+0x228:      mov       %g2, %g1
+0x22c:      brz,pt    %g1, +0x10    <_ZN7mozilla2wr18DisplayListBuilder19PushStackingContextERKNS0_21StackingContextParamsERKNS0_4RectIfNS0_11LayoutPixelEEERKNS0_11RasterSpaceE+0x23c>
+0x230:      nop
+0x234:      call      +0xe440074    <PLT:__stack_chk_fail>
+0x238:      nop
+0x23c:      ldx       [%fp + 0x797], %i0
+0x240:      return    %i7 + 0x8
+0x244:      nop

And here is again my wr_dp_push_stacking_context() now without debug_assert:

#[no_mangle]
pub extern "C" fn wr_dp_push_stacking_context(
    bounds: LayoutRect,
    filter_count: usize,
    glyph_raster_space: RasterSpace,
    params: &WrStackingContextParams,
)  {
    println!("XXX start {}", filter_count);
}

And it's disassembly is:

wr_dp_push_stacking_context:    save      %sp, -0x150, %sp
wr_dp_push_stacking_context+4:  call      +0x8          <wr_dp_push_stacking_context+0xc>
wr_dp_push_stacking_context+8:  sethi     %hi(0x3b05800), %i0
wr_dp_push_stacking_context+0xc:or        %i0, 0x3a8, %i0
wr_dp_push_stacking_context+0x10:       add       %i0, %o7, %i0
wr_dp_push_stacking_context+0x14:       stx       %i0, [%fp + 0x75f]
wr_dp_push_stacking_context+0x18:       ldx       [%fp + 0x8af], %i1
wr_dp_push_stacking_context+0x1c:       st        %f1, [%fp + 0x787]
wr_dp_push_stacking_context+0x20:       st        %f3, [%fp + 0x78b]
wr_dp_push_stacking_context+0x24:       st        %f5, [%fp + 0x78f]
wr_dp_push_stacking_context+0x28:       st        %f7, [%fp + 0x793]
wr_dp_push_stacking_context+0x2c:       ld        [%fp + 0x787], %i1
wr_dp_push_stacking_context+0x30:       sllx      %i1, 0x20, %i1
wr_dp_push_stacking_context+0x34:       ld        [%fp + 0x78b], %i2
wr_dp_push_stacking_context+0x38:       or        %i1, %i2, %i1
wr_dp_push_stacking_context+0x3c:       stx       %i1, [%fp + 0x777]
wr_dp_push_stacking_context+0x40:       ld        [%fp + 0x78f], %i1
wr_dp_push_stacking_context+0x44:       sllx      %i1, 0x20, %i1
wr_dp_push_stacking_context+0x48:       ld        [%fp + 0x793], %i2
wr_dp_push_stacking_context+0x4c:       or        %i1, %i2, %i1
wr_dp_push_stacking_context+0x50:       stx       %i1, [%fp + 0x77f]
wr_dp_push_stacking_context+0x54:       stx       %i4, [%fp + 0x797]
wr_dp_push_stacking_context+0x58:       stx       %i5, [%fp + 0x7a7]
wr_dp_push_stacking_context+0x5c:       stx       %i5, [%fp + 0x79f]
wr_dp_push_stacking_context+0x60:       add       %fp, 0x797, %o0
wr_dp_push_stacking_context+0x64:       stx       %o0, [%fp + 0x7ef]
wr_dp_push_stacking_context+0x68:       stx       %o0, [%fp + 0x7f7]
wr_dp_push_stacking_context+0x6c:       sethi     %hi(0x1400), %i1
wr_dp_push_stacking_context+0x70:       add       %i1, 0x190, %i1
wr_dp_push_stacking_context+0x74:       call      +0x3c81078    <PLT:_ZN4core3fmt10ArgumentV13new17ha9bc9565de3aadf7E>
wr_dp_push_stacking_context+0x78:       ldx       [%i0 + %i1], %o1
wr_dp_push_stacking_context+0x7c:       stx       %o0, [%fp + 0x767]
wr_dp_push_stacking_context+0x80:       ba        +0x8          <wr_dp_push_stacking_context+0x88>
wr_dp_push_stacking_context+0x84:       stx       %o1, [%fp + 0x76f]
wr_dp_push_stacking_context+0x88:       ldx       [%fp + 0x75f], %i0
wr_dp_push_stacking_context+0x8c:       ldx       [%fp + 0x76f], %i1
wr_dp_push_stacking_context+0x90:       ldx       [%fp + 0x767], %i2
wr_dp_push_stacking_context+0x94:       stx       %i2, [%fp + 0x7df]
wr_dp_push_stacking_context+0x98:       stx       %i1, [%fp + 0x7e7]
wr_dp_push_stacking_context+0x9c:       sethi     %hi(0x36000), %i1
wr_dp_push_stacking_context+0xa0:       add       %i1, 0x50, %i1
wr_dp_push_stacking_context+0xa4:       ldx       [%i0 + %i1], %o1
wr_dp_push_stacking_context+0xa8:       add       %fp, 0x7af, %o0
wr_dp_push_stacking_context+0xac:       mov       0x2, %o2
wr_dp_push_stacking_context+0xb0:       add       %fp, 0x7df, %o3
wr_dp_push_stacking_context+0xb4:       call      +0xe2a0c      <_ZN4core3fmt9Arguments6new_v117h337fca81b2e584b1E>
wr_dp_push_stacking_context+0xb8:       mov       0x1, %o4
wr_dp_push_stacking_context+0xbc:       ba        +0x8          <wr_dp_push_stacking_context+0xc4>
wr_dp_push_stacking_context+0xc0:       nop
wr_dp_push_stacking_context+0xc4:       call      +0x3df5b00    <PLT:_ZN3std2io5stdio6_print17h02792fbd1097b851E>
wr_dp_push_stacking_context+0xc8:       add       %fp, 0x7af, %o0
wr_dp_push_stacking_context+0xcc:       ba        +0x8          <wr_dp_push_stacking_context+0xd4>
wr_dp_push_stacking_context+0xd0:       nop
wr_dp_push_stacking_context+0xd4:       ret
wr_dp_push_stacking_context+0xd8:       restore

@apiraino apiraino added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 10, 2021
@batrla
Copy link

batrla commented Jun 10, 2021

Agreed looks like Rust's SPARCv9 64-bit ABI definition issue. In more detail, it seems to me Rust thinks the floating point registers are 64-bit, while they are still 32-bit in the processor [1]. The mismatch causes Rust function to read garbage when the argument is structure containing floats and it is passed by value.

Here's a small test case, where C calls into Rust:

$ cat franta.rs 
#[derive(Debug)]
#[no_mangle]
#[repr(C)]
pub struct Franta {
        a:f32,
        b:f32,
        c:f32,
        d:f32,
}

#[no_mangle]
pub extern "C" fn funkce(arg: Franta, i: i32)
{
    println!("XXX start {:?} {}", arg, i);
}

$ cat testfranta.c 
#include <stdio.h>
struct Franta {
        float a;
        float b;
        float c;
        float d;
};

void
funkce(struct Franta a, int b);

int main() {
        struct Franta a = { 1, 2.0, 2.1, 2.2};
        funkce(a, 3);
        return 0;
}

rustc  --crate-type dylib franta.rs 
gcc ./testfranta.c ./libfranta.so 

$ file a.out 
a.out:          ELF 64-bit MSB executable SPARCV9 Version 1, UltraSPARC3
Extensions Required, dynamically linked, not stripped

$ ./a.out 
XXX start Franta { a: 2.0, b: 2.2, c: NaN, d: NaN } 2136736064

Expected output:
XXX start Franta { a: 1, b: 2.0, c: 2.1, d: 2.2 } 3

Actual output:
XXX start Franta { a: 2.0, b: 2.2, c: NaN, d: NaN } 2136736064

The disassembler shows:

main+0x74:                      mov       0x3, %o2
main+0x78:                      fmovs     %f11, %f0
main+0x7c:                      fmovs     %f10, %f1
main+0x80:                      fmovs     %f9, %f2
main+0x84:                      fmovs     %f8, %f3
main+0x88:                      call      +0x100278     <PLT=libfranta.so`funkce>
main+0x8c:                      nop

That the caller passes arguments via %f0, %f1, %f2, %f3, %o2,
But callee reads arguments from:

funkce+0x14:                    fmovs     %f7, %f0
funkce+0x18:                    fmovs     %f5, %f2
funkce+0x1c:                    fmovs     %f3, %f4
funkce+0x20:                    fmovs     %f1, %f6
funkce+0x24:                    mov       %i4, %i1

^ caller reads arguments from %f1, %f3, %f5, %f7 that are lower 32-bit parts of double precision (64-bit) floating point registers. So while Rust correctly interprets the structure members as f32, it seems it thinks they are passed in the aggregate (structure) as f64!? At the end Rust reads the second argument of function funkce from %i4 (skips 4x64-bits -> %i0, %i1, %i2, %i3), however, the caller has put it in %i2 (skipping 4x32 bits -> %i0, %i1 since all %i registers are 64-bit wide). The difference in how many input arguments are skipped by Rust and Gcc is another evidence that there's mismatch in size of floating point members of structure passed by value as first argument.

Is the function is_homogenous_aggregate() in librustc_trans/cabi_sparc64.rs right place to look at to check if it correctly sets argument size?

[1] section 5.1.4 in SPARC Architecture Manual,
https://sparc.org/wp-content/uploads/2014/01/SPARCV9.pdf.gz

@nagisa
Copy link
Member

nagisa commented Jun 10, 2021

That function, or rather the file in its entirety is a correct place to look at, yes. Looking at the LLVM IR, this is the signature generated by Rust today:

@funkce([4 x float] %0, i32 signext %i) unnamed_addr

Whereas it really ought to be

@funkce(float, float, float, float, i32) unnamed_addr

or similar. It does seem that is_homogenous_aggregate is involved in at least some way, here.

@JohnTitor JohnTitor removed the E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example label Jun 17, 2021
@psumbera
Copy link
Contributor Author

I guess that the same problem is also on SPARC Linux. @glaubitz can you please test above test case?

@glaubitz
Copy link
Contributor

On Linux, I'm getting:

glaubitz@gcc202:~$ rustc --crate-type dylib franta.rs
warning: attribute should be applied to a function or static
 --> franta.rs:2:1
  |
2 |   #[no_mangle]
  |   ^^^^^^^^^^^^
3 |   #[repr(C)]
4 | / pub struct Franta {
5 | |         a:f32,
6 | |         b:f32,
7 | |         c:f32,
8 | |         d:f32,
9 | | }
  | |_- not a function or static
  |
  = note: `#[warn(unused_attributes)]` on by default
  = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!

warning: 1 warning emitted

glaubitz@gcc202:~$ gcc ./testfranta.c ./libfranta.so 
glaubitz@gcc202:~$ ./a.out 
./a.out: error while loading shared libraries: libfranta.so: cannot open shared object file: No such file or directory
glaubitz@gcc202:~$ LD_LIBRARY_PATH=. ./a.out 
XXX start Franta { a: 2.0, b: 2.2, c: 0.0, d: 0.0 } 0
glaubitz@gcc202:~$

@batrla
Copy link

batrla commented Jun 17, 2021

Thanks for trying on Linux, that's the same issue. Rust function funkce() gets invalid arguments. The difference of arguments being 0 on Linux vs garbage on Solaris might be caused by something that used registers e.g. library initialization.

@apiraino
Copy link
Contributor

Assigning priority as discussed in the Zulip thread of the Prioritization Working Group.

@rustbot label -I-prioritize +P-high

@rustbot rustbot added P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jun 17, 2021
@psumbera
Copy link
Contributor Author

That function, or rather the file in its entirety is a correct place to look at, yes. Looking at the LLVM IR, this is the signature generated by Rust today:

@funkce([4 x float] %0, i32 signext %i) unnamed_addr

Whereas it really ought to be

@funkce(float, float, float, float, i32) unnamed_addr

How can this be fixed? Where is relevant code doing this?

or similar. It does seem that is_homogenous_aggregate is involved in at least some way, here.

is_homogenous_aggregate() doesn't seem to do anything bad. At least sizes and types looks good when adding some debug print here:

self.mode = PassMode::Cast(target.into());

seelf.mode = Cast(CastTarget { prefix: [None, None, None, None, None, None, None, None], prefix_chunk_size: Size { raw: 0 }, rest: Uniform { unit: Reg { kind: Float, size: Size { raw: 4 } }, total: Size { raw: 16 } } })

@psumbera
Copy link
Contributor Author

@nagisa any suggestion how to proceed with this issue? Where to look, what to try? Thanks!

@Darkspirit
Copy link

Darkspirit commented Sep 14, 2021

It sounds like Firefox will not work on SPARC anymore if this bug has not been fixed until the release of Firefox 93.
Non-WebRender has been removed in 93.

@psumbera
Copy link
Contributor Author

If I understand it right, we need to modify:

https://github.com/rust-lang/rust/blob/master/compiler/rustc_target/src/abi/call/sparc64.rs#L62

So that for aggregate data argument (structure with floats) are split into registers...

@psumbera
Copy link
Contributor Author

Now Rust (--emit=llvm-ir ) for SPARC64 generates for the test case:

define void @funkce([4 x float] %0, i32 signext %1) unnamed_addr #2 {

where Clang (-S -emit-llvm) for C test case equivalent does:

define dso_local void @funkce(float inreg %0, float inreg %1, float inreg %2, float inreg %3, i32 signext %4) #0 {

Clang SPARC64 ABI generation and usage of inreg for structures with 32bits float numbers seems to be defined here:

https://github.com/llvm/llvm-project/blob/b23d22f7d546a525e6fb0e3f32f835c459ac2746/clang/lib/CodeGen/TargetInfo.cpp#L9506

With my limited knowledge of Rust and its internals I was so far only able to come with below patch for sparc64.rs which generates:

define void @funkce(%Franta inreg %0, i32 signext %1) unnamed_addr #2 {

It works for the test case and it works for the original Firefox bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=1714064
where wr_dp_push_stacking_context() now gets sane arguments and no unsuccessful big memory allocation are seen.

Problem is that now it uses Webrender (which I have never tested on SPARC) and Firefox shows window with nothing just black background. So it may be eventually some other SPARC Webrender issue. Not sure whether it's related but in console I see several occurrence of:

[2021-09-29T15:08:02Z WARN webrender::device::gl] Cropping texture upload Box2D((0, 0), (0, 1)) to None

Any suggestions or comments are welcome!

The patch for Rust 1.53.0 is.

--- rustc-1.53.0-src/compiler/rustc_target/src/abi/call/sparc64.rs
+++ rustc-1.53.0-src/compiler/rustc_target/src/abi/call/sparc64.rs
@@ -1,7 +1,7 @@
 // FIXME: This needs an audit for correctness and completeness.

-use crate::abi::call::{ArgAbi, FnAbi, Reg, RegKind, Uniform};
-use crate::abi::{HasDataLayout, LayoutOf, TyAndLayout, TyAndLayoutMethods};
+use crate::abi::call::{ArgAbi, ArgAttribute, FnAbi, PassMode, Reg, RegKind, Uniform};
+use crate::abi::{self, HasDataLayout, LayoutOf, Size, TyAndLayout, TyAndLayoutMethods};

 fn is_homogeneous_aggregate<'a, Ty, C>(cx: &C, arg: &mut ArgAbi<'a, Ty>) -> Option<Uniform>
 where
@@ -14,6 +14,10 @@
             return None;
         }

+        if unit.kind == RegKind::Float && unit.size == (Size { raw: 4 }) {
+            return None;
+        }
+
         let valid_unit = match unit.kind {
             RegKind::Integer => false,
             RegKind::Float => true,
@@ -65,6 +69,41 @@
         return;
     }

+    match arg.layout.fields {
+        abi::FieldsShape::Primitive |
+        abi::FieldsShape::Array { .. } |
+        abi::FieldsShape::Union(_) => { }
+        abi::FieldsShape::Arbitrary { .. } => {
+
+            let mut has_f32_type = false;
+            let mut has_64b_type = false;
+
+            for i in 0..arg.layout.fields.count() {
+
+                let field = arg.layout.field(cx, i);
+                // We only care about aligned doubles
+                if let abi::Abi::Scalar(scalar) = &field.abi {
+                    if let abi::F32 = scalar.value {
+                        has_f32_type = true;
+                    } else if let abi::F64 = scalar.value {
+                        has_64b_type = true;
+                    } else if let abi::Int(i, _signed) = scalar.value {
+                        has_64b_type = i.size().bits() == 64;
+                    } else if let abi::Pointer = scalar.value {
+                        has_64b_type = true;
+                    }
+                }
+             }
+
+             if has_f32_type && !has_64b_type {
+                 if let PassMode::Direct(ref mut attrs) = arg.mode {
+                     attrs.set(ArgAttribute::InReg);
+                     return;
+                 }
+             }
+        }
+    };
+
     let total = arg.layout.size;
     if total.bits() > 128 {
         arg.make_indirect();

@psumbera
Copy link
Contributor Author

psumbera commented Oct 1, 2021

Ok, the above fix doesn't fix all possibilities. Another occurrence is structure with doubles:
https://bugzilla.mozilla.org/show_bug.cgi?id=1731549
(also two floats doesn't work either...)

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 25, 2021
fix sparc64 ABI for aggregates with floating point members

Fixes rust-lang#86163
@bors bors closed this as completed in a2b7b78 Dec 2, 2021
bjorn3 pushed a commit to bjorn3/rust that referenced this issue Dec 31, 2021
fix sparc64 ABI for aggregates with floating point members

Fixes rust-lang#86163
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-FFI Area: Foreign function interface (FFI) C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-SPARC Target: SPARC processors P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants