Fix bus errors with memcpy to handle unaligned access on O2/O3
This patch resolves bus errors that occur when building with clang at -O3 (issue #5844).

When the -O2/-O3 optimization levels of clang are enabled (from clang 3.5 to the
latest version, released on Jun-13-2016), the CoreCLR unit tests produce more than
3000 bus errors. SIGBUS signals (e.g., "misaligned memory access") are easy to
monitor with the kernel's /proc/cpu/alignment facility.
Running "echo 2 > /proc/cpu/alignment" makes the Linux kernel fix up the faulting
accesses, but the performance of the application is degraded.
* source: http://lxr.free-electrons.com/source/Documentation/arm/mem_alignment

According to the ARM Information Center (infocenter.arm.com), by default
the ARM compiler expects normal C and C++ pointers to point
to an aligned word in memory. A type qualifier __packed is provided to
enable unaligned pointer access. If you want to define a pointer to a word
that can be at any address (that is, that can be at a non-natural alignment),
you must specify this using the __packed qualifier when defining the pointer:
__packed int *pi; // pointer to unaligned int

However, clang/LLVM does not support the __packed pointer qualifier; it offers only
type attributes such as __attribute__((packed)) or __attribute__((packed, aligned(4))).

In -O0 (debugging) the innermost block is emitted as the following assembly,
which works properly:
 ldr r1, [r0, #24]
 ldr r2, [r0, #20]

In -O2 (release) however the compiler realizes these fields are adjacent
and generates this assembly:
 ldrdeq  r2, r3, [r0, #20]
Unfortunately, the ldrd instruction always generates an alignment fault (in practice)
when the address is unaligned. It seems that clang uses ldrb instructions whereas
GCC uses ldr, because ARMv7 supports unaligned ldr.
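
Whether a merged load faults depends on the address it is given. As a minimal
sketch (isAlignedFor is a hypothetical helper for illustration, not part of
CoreCLR), the natural alignment of a pointer into a byte stream can be checked
like this:

```cpp
#include <cstdint>

// Hypothetical helper (not part of CoreCLR): returns true when `ptr`
// satisfies the natural alignment of T on the host platform.
template <typename T>
bool isAlignedFor(const void* ptr)
{
    return reinterpret_cast<std::uintptr_t>(ptr) % alignof(T) == 0;
}

// Illustrative byte stream whose first byte is aligned for a 32-bit word.
// Fields decoded from such a stream can start at any byte offset.
alignas(std::uint32_t) static const unsigned char stream[32] = {};
```

Here stream + 20 happens to be word aligned, while stream + 1 or stream + 21 is
not, so a helper that receives an arbitrary BYTE pointer cannot assume alignment.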

Basically, the RISC-based ARM architecture requires aligned access for 4-byte reads.
So, let's use memcpy(3) to copy into a properly aligned buffer instead of
the packing attribute.

Note: If the architecture (e.g., Linux/ARM Emulator) does not support unaligned ldr,
this issue will not occur with the -O2/-O3 optimization levels.
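
The memcpy approach can be sketched generically as follows. This is only an
illustration under stated assumptions: loadUnaligned, TwoFields and
readAdjacentFields are hypothetical names (the patch itself writes one helper
per type), and the offsets 20 and 24 simply mirror the assembly quoted above.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical generic helper: copies sizeof(T) bytes from `ptr` into a
// properly aligned local, so no alignment is assumed for `ptr` itself.
template <typename T>
T loadUnaligned(const unsigned char* ptr)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based loads require a trivially copyable type");
    T temp;
    std::memcpy(&temp, ptr, sizeof(temp));
    return temp;
}

// Hypothetical pair of adjacent 32-bit fields.
struct TwoFields
{
    std::uint32_t first;  // read from byte offset 20
    std::uint32_t second; // read from byte offset 24
};

// Reads both fields through memcpy, so `base` may be at any byte address.
TwoFields readAdjacentFields(const unsigned char* base)
{
    return { loadUnaligned<std::uint32_t>(base + 20),
             loadUnaligned<std::uint32_t>(base + 24) };
}
```

The values come back in host byte order, exactly as with the pointer cast they
replace, so this is only a little-endian read on a little-endian target.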

* Case study: How does the ARM Compiler support unaligned accesses?
  http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html

* Case study: Indicating unaligned access to Clang for ARM compatibility
  http://stackoverflow.com/questions/9185811/indicating-unaligned-access-to-clang-for-arm-compatibility

* Case study: Chromium source for UnalignedLoad32() on ARM
  https://github.com/nwjs/chromium.src/blob/nw15/third_party/cld/base/basictypes.h#L302

Signed-off-by: Geunsik Lim <geunsik.lim@samsung.com>
leemgs committed Jul 21, 2016
1 parent 5257068 commit 2a23c68
Showing 1 changed file with 36 additions and 11 deletions.
47 changes: 36 additions & 11 deletions src/jit/compiler.hpp
@@ -736,34 +736,59 @@ inline unsigned genGetU4(const BYTE *addr)

 /*****************************************************************************/
 // Helpers to pull little-endian values out of a byte stream.

+// Get Unaligned values from a potentially unaligned object
 inline
 unsigned __int8 getU1LittleEndian(const BYTE * ptr)
-{ return *(UNALIGNED unsigned __int8 *)ptr; }
+{
+    unsigned __int8 temp;
+    memcpy(&temp, ptr, sizeof(temp));
+    return temp;
+}

 inline
 unsigned __int16 getU2LittleEndian(const BYTE * ptr)
-{ return *(UNALIGNED unsigned __int16 *)ptr; }
+{
+    unsigned __int16 temp;
+    memcpy(&temp, ptr, sizeof(temp));
+    return temp;
+}

 inline
 unsigned __int32 getU4LittleEndian(const BYTE * ptr)
-{ return *(UNALIGNED unsigned __int32*)ptr; }
-
+{
+    unsigned __int32 temp;
+    memcpy(&temp, ptr, sizeof(temp));
+    return temp;
+}
 inline
 signed __int8 getI1LittleEndian(const BYTE * ptr)
-{ return * (UNALIGNED signed __int8 *)ptr; }
-
+{
+    signed __int8 temp;
+    memcpy(&temp, ptr, sizeof(temp));
+    return temp;
+}
 inline
 signed __int16 getI2LittleEndian(const BYTE * ptr)
-{ return * (UNALIGNED signed __int16 *)ptr; }
-
+{
+    signed __int16 temp;
+    memcpy(&temp, ptr, sizeof(temp));
+    return temp;
+}
 inline
 signed __int32 getI4LittleEndian(const BYTE * ptr)
-{ return *(UNALIGNED signed __int32*)ptr; }
+{
+    signed __int32 temp;
+    memcpy(&temp, ptr, sizeof(temp));
+    return temp;
+}

 inline
 signed __int64 getI8LittleEndian(const BYTE * ptr)
-{ return *(UNALIGNED signed __int64*)ptr; }
+{
+    signed __int64 temp;
+    memcpy(&temp, ptr, sizeof(temp));
+    return temp;
+}

 inline
 float getR4LittleEndian(const BYTE * ptr)