RyuJIT/x86: encoder uses size_t to accumulate encoding bits, but that's not big enough #7061

seanshpark · 2016-11-29T02:51:45Z

While enabling x86/Linux, I get this error

coreclr/src/jit/emitxarch.cpp:1182:13: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
    regBits <<= 35;
            ^   ~~
1 error generated.

This code seems to assume it runs on AMD64, and the block is surrounded by #ifdef FEATURE_AVX_SUPPORT.

My question is,

Can I get any information on what the 35-bit shift is for, and how to solve this for x86/Linux?
Can I disable FEATURE_AVX_SUPPORT for x86/Linux?

Thanks in advance.

This commit disables FEATURE_AVX_SUPPORT for x86/Linux to fix #8331.

mikedn · 2016-11-29T07:47:13Z

It looks like that code assumes that size_t is 64 bits. If that's the case, this will likely affect Windows x86 as well. Probably nobody has noticed any ill effects until now because enabling SSE/AVX on x86 is a work in progress.
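
To make the width assumption concrete, here is a minimal sketch. It is not code from emitxarch.cpp: `bitWidth`, `shiftFits`, `size_t_x86` and `size_t_amd64` are hypothetical names, and the two aliases are fixed-width stand-ins chosen on the assumption that size_t is 32 bits on x86 and 64 bits on AMD64.

```cpp
#include <cstdint>
#include <limits>

// Number of value bits in an unsigned integer type.
template <typename T>
constexpr int bitWidth()
{
    return std::numeric_limits<T>::digits;
}

// Hypothetical guard: a shift is only well-defined when the count is
// smaller than the width of the type being shifted.
template <typename T>
constexpr bool shiftFits(int count)
{
    return count >= 0 && count < bitWidth<T>();
}

// Stand-ins for size_t on each target (assumption, see lead-in).
using size_t_x86   = std::uint32_t;
using size_t_amd64 = std::uint64_t;

// The 35-bit shift from the error message fits only the 64-bit accumulator.
static_assert(!shiftFits<size_t_x86>(35), "35-bit shift overflows a 32-bit type");
static_assert(shiftFits<size_t_amd64>(35), "35-bit shift fits a 64-bit type");
```

A check of this kind would flag the problem on any 32-bit target, Windows included, rather than only where -Wshift-count-overflow is promoted to an error.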

janvorli · 2016-11-29T10:33:18Z

cc: @dotnet/jit-contrib

BruceForstall · 2016-11-29T17:19:48Z

@seanshpark There's a pretty pervasive problem in the emitter where it uses size_t to accumulate the encoding, but that's too small for AVX (and maybe more?). Temporarily, for bring-up, you can disable FEATURE_AVX_SUPPORT (and FEATURE_SIMD as well). I'll look into fixing the encoder for x86.

seanshpark · 2016-11-29T22:34:40Z

> Temporarily, for bring-up, you can disable FEATURE_AVX_SUPPORT (and FEATURE_SIMD as well)

Thanks @BruceForstall

The encoder was using size_t, a 32-bit type on x86, to accumulate the opcode and prefix bits to emit. AVX support uses 3 bytes for prefixes, and these are placed higher than a 32-bit type can handle. So, change all code-byte-related types from size_t to a new code_t, defined as "unsigned __int64" on RyuJIT x86 (there is precedent for this type on the ARM architectures). Fixes #8331
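
A minimal sketch of the idea behind this change, under stated assumptions: `code_t` is spelled here as `std::uint64_t` (a portable equivalent of "unsigned __int64"), and `addPrefix`, `prefixOf` and `truncateTo32` are hypothetical helpers. The bit layout (prefix starting at bit 32) is illustrative only and is not claimed to match the emitter's real layout.

```cpp
#include <cstdint>

// Assumption: portable stand-in for the commit's "unsigned __int64" code_t.
typedef std::uint64_t code_t;

// Hypothetical helper: place a 3-byte prefix above a 32-bit opcode field.
inline code_t addPrefix(code_t opcode, std::uint32_t prefix3)
{
    // Widen before shifting so the shift happens in 64 bits.
    return opcode | (static_cast<code_t>(prefix3 & 0xFFFFFFu) << 32);
}

// Hypothetical helper: read the 3-byte prefix back out.
inline std::uint32_t prefixOf(code_t code)
{
    return static_cast<std::uint32_t>(code >> 32) & 0xFFFFFFu;
}

// What a 32-bit accumulator would retain: the prefix bits are lost.
inline std::uint32_t truncateTo32(code_t code)
{
    return static_cast<std::uint32_t>(code);
}
```

With a 64-bit `code_t` the prefix survives a round trip through `addPrefix` and `prefixOf`, while `truncateTo32` shows that a 32-bit accumulator keeps only the opcode field.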

* Disable FEATURE_AVX_SUPPORT for x86/Linux
  This commit disables FEATURE_AVX_SUPPORT for x86/Linux to fix #8331.
* Disable FEATURE_AVX_SUPPORT only for x86/Linux
* Disable FEATURE_SIMD for x86/Linux
* Simplify nested if in CMakeList.txt

@janvorli

* Disable PrintSEHChain for non-Windows platforms (dotnet#8379)
  PrintSEHChain uses 'EXCEPTION_REGISTRATION_RECORD' which is not available for non-Windows platforms. This commit disables PrintSEHChain for non-Windows platforms to fix build error in x86/Linux.
* Fix x86 encoder to use 64-bit type to accumulate opcode/prefix bits
  The encoder was using size_t, a 32-bit type on x86, to accumulate opcode and prefix bits to emit. AVX support uses 3 bytes for prefixes that are higher than the 32-bit type can handle. So, change all code byte related types from size_t to a new code_t, defined as "unsigned __int64" on RyuJIT x86 (there is precedent for this type on the ARM architectures). Fixes #8331
* Exclude jithelp.asm for x86/Linux (dotnet#8393)
* Add parentheses around logical operations (dotnet#8406)
  This commit fixes logical-op-parentheses compile error for x86/Linux build.
* Skip emitting duplicate clauses for CoreRT (dotnet#8400)
  Fixes dotnet/corert#2262
* Add printing managed assert message to console (dotnet#8399)
  I have discovered that when GUI assertion dialogs are disabled, the assert message is not shown anywhere and the app just silently exits. This change adds printing the message and stack trace to console in such case.
* Remove the BinaryCompatibility class as it is not useful on .NET Core… (dotnet#8396)
* Remove the BinaryCompatibility class as it is not useful on .NET Core and creates issues on Debug builds when the TFM on the AppDomain is not recognized.
* Update the code for DateTimeFormatInfo to not use BinaryCompatibility
* Remove initialization of preferExistingTokens now that we removed its usage
* Fix recent x86 SIMD regressions
  1. Recent PUTARG_STK work didn't consider SIMD arguments.
  2. SSE3_4 work caused underestimation of instruction sizes for SSE4 instructions (e.g., pmulld).
* Add EXCEPTION_REGISTRATION_RECORD for x86/Linux (dotnet#8408)
* Fix build error in ARM64 code (dotnet#8407)
  CONTEXT struct for ARM64 does not contain X29 field.
* Re-enable UMThkCallFrame and fix compile errors (dotnet#8411)
* fix permissive C++ code (MSVC /permissive-) (dotnet#8337)
* fix permissive C++ code (MSVC /permissive-)
  These were found by the C++ compiler group when doing "Real world code" build tests using /permissive-. We are sharing these with you to help you clean up your code before the new version of the compiler comes out. For more information on /permissive- see https://blogs.msdn.microsoft.com/vcblog/2016/11/16/permissive-switch/.
  ----------------------------
  Under /permissive-, skipping the initialization of a variable is not allowed. As an extension the compiler allowed this when there was no destructor for the type.

      void func(bool b)
      {
          if(b) goto END;
          int value = 0; //error C2362: initialization of 'value' is skipped by 'goto END'
          int array[10]; //Okay, not initialized.
          //... value used here
      END:
          return;
      }

  Fix 1) Limit the scope of value:

      {
          int value = 0;
          //... value used here
      }
      END:

  Fix 2) Initialize/declare value before the 'goto'

      int value = 0;
      if(b) goto END;
      //... value used here
      END:

  Fix 3) Don't initialize value in the variable declaration.

      int value;
      value = 0;
      //... value used here
      END:

  -------------------
  Alternative token representations. The following are reserved as alternative representations for operators:
  and and_eq bitand bitor compl not not_eq or or_eq xor xor_eq

      //Can't use reserved names for variables:
      static int and = 0; // Change name (possibly to 'and_')

      void func()
      {
          _asm {
              xor edx,edx // xor is reserved, change to uppercase XOR
              or eax,eax  // or is reserved, change to uppercase OR
          }
      }

* Apply formatting patch.
* fixes from code review.
  I addressed @janvorli requests from the pull request code review.
* Update CoreClr, CoreFx to beta-24801-03, beta-24721-02, respectively * [x86/Linux] Adjust the definition of FnStaticBaseHelper for x86 (dotnet#8390) * Resolve duplicated functions (dotnet#8413) Several functions are implemented in both cgenx86.cpp and unixstubs.cpp, which results in linking errors. This commit disables functions in cgenx86.cpp to resolve linking errors. * [x86/Linux] Add Portable PopSEHRecords as NYI (dotnet#8412) * [x86/Linux] Disable Watson-related code for non-Windows platforms (dotnet#8410) * First step to generate nuget package for ARM32/Linux * [x86/Linux] Use portable JIT helpers (dotnet#8392) * Disable test against #8418 to unblock corefx updates * Introduce CORINFO_EH_CLAUSE_SAMETRY flag for CoreRT ABI (dotnet#8422) CORINFO_EH_CLAUSE_SAMEBLOCK flag is returned on mutually protecting EH clauses for CoreRT ABI. It is set on EH clauses that cover same try block as the previous one. The runtime cannot reliably infer this information from native code offsets without full description of duplicate clauses because of different try blocks can have same offsets. Alternative solution to this problem would be inserting extra nops to ensure that different try blocks have different offsets. * [x86/Linux] fix several parentheses compile warnings (dotnet#8428) * RyuJIT/x86: Implement TYP_SIMD12 support There is no native load/store instruction for Vector3/TYP_SIMD12, so we need to break this type down into two loads or two stores, with an additional instruction to put the values together in the xmm target register. AMD64 SIMD support already implements most of this. For RyuJIT/x86, we need to implement stack argument support (both incoming and outgoing), which is different from the AMD64 ABI. In addition, this change implements accurate alignment-sensitive codegen for all SIMD types. For RyuJIT/x86, the stack is only 4 byte aligned (unless we have double alignment), so SIMD locals are not known to be aligned (TYP_SIMD8 could be with double alignment). 
For AMD64, we were unnecessarily pessimizing alignment information, and were always generating unaligned moves when on AVX2 hardware. Now, all SIMD types are given their preferred alignment in getSIMDTypeAlignment() and alignment determination in isSIMDTypeLocalAligned() takes into account stack alignment (it still needs support for x86 dynamic alignment). X86 still needs to consider dynamic stack alignment for SIMD locals. Fixes #7863 * Change order in .builds and .pkgproj, fix build.sh for not modifying dir.prop * Ensure MSBuild properties get persisted to child MSBuild tasks, fixes a race condition in the build (dotnet#8404) * [x86/Linux] Fix unused function warning (dotnet#8429) * Delete the unused code * [x86/Linux] Revert UMThkCallFrame-related changes (dotnet#8434) * [x86/Linux] Revert UMThkCallFrame-related code * [x86/Linux] Fix dangling 'TheUMEntryPrestub' reference This commit re-enables GenerateUMThunkPrestub and its related code in order to remove TheUMEntryPrestub reference. * [x86/Linux] Re-enable several methods in StubLinkerCPU This commit re-enables the following methods for x86/Linux: - StubLinkerCPU::EmitSetup - StubLinkerCPU::EmitComMethodStubProlog - StubLinkerCPU::EmitComMethodStubEpilog In addtion, EmitComMethodStubEpilog is marked as NYI. * Fix several misspellings of exception and "a exception". (dotnet#8442) * [x86/Linux] Fix dangling DoubleToNumber and NumberToDouble (dotnet#8446) This commit enables portable DoubleToNumber and NumberToDouble for x86/Linux. * Use Portable Floating-point Arithmetic Helpers (dotnet#8447) This commit enables portable floating-point arithmetic helpers for x86/Linux build. 
* [x86/Linux] Fix indirection of non-volatile null pointer will be deleted (dotnet#8452) Fix compile error for x86/Linux - fix error "indirection of non-volatile null pointer will be deleted, not trap [-Werror,-Wnull-dereference]" - using clang 3.8 * [x86/Linux] Use Portable LMul JIT Helper (dotnet#8449) * [x86/Linux] Fix all paths through this function will call itself (dotnet#8451) Fix compile error for x86/Linux - disable "infinite-recursion" for "recursiveFtn" function - only for clang * [x86/Linux] Mark LeaveCatch as NYI (dotnet#8384) * Disable LeaveCatch for non-Windows platforms * Mark LeaveCatch as NYI * Use #ifndef as before * Fix runtest.sh: delete ni file and lock correctly (dotnet#8081) * [x86/Linux] Fix dangling CLR_ImpersonateLoggedOnUser reference (dotnet#8435) src/vm/securityprincipal.cpp is not included in x86/Linux build, and thus all the reference to the functions in it will be dangling. (i.e. COMPrincipal::CLR_ImpersonateLoggedOnUser). This commit hides COMPrincipal for non-Windows platforms, and marks COMPlusThrowCallbackHelper as NYI. * [x86/Linux] Enclose stub-linking methods with FEATURE_STUBS_AS_IL (dotnet#8432) * Fix dangling StubLinkerCPU::EmitDelegateInvoke in x86/Linux (dotnet#8444) Several methods in StublicLinkerCPU (including EmitDelegateInvoke) are available only when FEATURE_STUBS_AS_IL is defined. This commit encloses their declaration with appropriate macro (FEATURE_STUBS_AS_IL), and fix related build erros. * [x86/Linux] Fix dangling ClrCaptureContext (dotnet#8453) * [x86/Linux] Re-enable FrameHandlerExRecord for x86/Linux (dotnet#8409) * Re-enable FrameHandlerExRecord for x86/Linux * Use _TARGET_X86_ instead of WIN64EXCEPTIONS * JIT: enable inline pinvoke in more cases An inline pinvoke is a pinvoke where the managed/native transition overhead is reduced by inlining parts of the transition bookkeeping around the call site. 
A normal pinvoke does this bookkeeping in a stub method that interposes between the managed caller and the native callee. Previously the jit would not allow pinvoke calls that came from inlines to be optimized via inline pinvoke. This sometimes caused performance surprises for users who wrap DLL imports with managed methods. See for instance #2373. This change lifts this limitation. Pinvokes from inlined method bodies are now given the same treatment as pinvokes in the root method. The legality check for inline pinvokes has been streamlined slightly to remove a redundant check. Inline pinvokes introduced by inlining are handled by accumulating the unmanaged method count with the value from inlinees, and deferring insertion of the special basic blocks until after inlining, so that if the only inline pinvokes come from inline instances they are still properly processed. Inline pinvokes are still disallowed in try and handler regions (catches, filters, and finallies). X87 liveness tracking was updated to handle the implicit inline frame var references. This was a pre-existing issue that now can show up more frequently. Added a test case that fails with the stock legacy jit (and also with the new enhancements to pinvoke). Now both the original failing case and this case pass. Inline pinvokes are also now suppressed in rarely executed blocks, for instance blocks leading up to throws or similar. The inliner is now also changed to preferentially report inline reasons as forced instead of always when both are applicable. This change adds a new test case that shows the variety of situations that can occur with pinvoke, inlining, and EH. * Incorporate changes from Jan's dotnet#8437, plus review feedback. Still honoring windows exception interop restrictions on all platforms and runtimes. Will revisit when addressing #8459. * Add Linux perf support to Jenkins This change adds perf support for CoreCLR on Ubuntu 14.04 to Jenkins. 
This is mostly work extending what Smile had already done. The main changes were to build CoreCLR rather then grab it from CI, and work to get the upload portion finished. * Copy CoreFX environment variable code (dotnet#8405) Tweak the core code to match up with what we had done in CoreFX and expose so that we can have a single source of environment truth. This is particularly important for Unix as we use a local copy of the state. * Compare opt against zero involving a shift oper. * Allow remorph of SIMD assignment This fixes an assert exposed by JitStress=1. * x86: Deactivate P/Invoke frames after a native call. Although this does not appear to be strictly necessary, this matches JIT32's behavior. With this change, the stack walker will ignore the P/Invoke frame even while it is still present on its thread's frame list. Fixes VSO 297109. * [x86/Linux] add a stub for THROW_CONTROL_FOR_THREAD_FUNCTION (dotnet#8455) THROW_CONTROL_FOR_THREAD_FUNCTION is defined as ThrowControlForThread for x86/Linux, but unixstubs implements RedirectForThrowControl (which corresponds to x64/Linux). This commit renames RedirectForThrowControl as ThrowControlForThread to fix dangling ThrowControlForThread reference in x86/Linux. * [x86/Linux] Fix no known conversion from 'void ()' to 'void *' (dotnet#8450) Fix compile error for x86/Linux - this will fix "no known conversion from 'void ()' to 'void *'" for "CallRtlUnwindSafe" - for compiler clang 3.8 * [x86/Linux] Enclose ArrayOpStub Exceptions with FEATURE_ARRAYSTUB_AS_IL (dotnet#8445) * Enclose ArrayOpStub Exceptions with FEATURE_ARRAYSTUB_AS_IL * Fix unmatched ifdef * Fix unmatched ifdef * Add UnhandledExceptionHandlerUnix Stub (dotnet#8425) FuncEvalHijack in dbghelpers.S uses UnhandledExceptionHandlerUnix as a personality routine, but UnhandledExceptionHandlerUnix is not avaiable for x86 (UnhandledExceptionHandlerUnix is available only when WIN64EXCEPTIONS which is not defined for x86). 
This commit adds UnhandledExceptionHandlerUnix to fix dangling reference. * [x86/Linux] Mark several Windows-specific functions in excepx86.cpp as NYI (dotnet#8424) * Mark several Windows-specific functions as NYI * Use FEATURE_PAL instead of PLATFORM_UNIX * Revert the change in threads.h * [x86/Linux] Port gmsasm.asm (dotnet#8456) * Fix calls to curl in prep script Before we were calling curl without the -L configuration. This would cause it not follow redirects and several of the files we needed have now started using redirects. This fixes that issue. * Create Blk node for struct vararg When morphing a reference to a struct parameter in a varargs method, it must be a blk node if it is the destination of an assignment. * [x86/Linux] Fix unknown pragma build error (dotnet#8427) * [x86/Linux] Revise COMPlusThrowCallback (dotnet#8430) GetCallerToken and GetImpersonationToken methods in FrameSecurityDescriptorBaseObject are implemented only for Windows-platform. * [x86/Linux] Fix exception handling routine (dotnet#8433) * [x86/Linux] Fix exception handling routine DispatchManagedException requires WIN64EXCEPTIONS to be defined, but it is not defined for x86/Linux. * Extract ARRAYSTUBS_AS_IL code from STUBS_AS_IL region (dotnet#8443) FEATURE_ARRAYSTUBS_AS_IL code seems to be independent from FEATURE_STUBS_AS_IL, but the related code is enclosed with FEATURE_STUBS_AS_IL. This commit extracts the related code from STUBS_AS_IL region. * [x86/Linux] Fix Dacp structure size mismatch (dotnet#8377) Fix compile error for x86/Linux - add __attribute__((__ms_struct__)) as "MSLAYOUT" for those structures - Fix "Dacp structs cannot be modified due to backwards compatibility" error * fix semicolon * [x86/Linux][SOS] Disable ARM target support for xplat (dotnet#8471) * [x86/Linux][SOS] Fix DataTarget::GetPointerSize for x86 (dotnet#8473) * Address PR feedback. * We should not transform a GT_DYN_BLK with a constant zero size into a GT_BLK as we do not support a GT_BLK of size zero. 
Fixes VSO 287663 * Fixed typo * Fix use edge iterator for DYN_BLK nodes. Dynamic block nodes (i.e. DYN_BLK and STORE_DYN_BLK) are not standard nodes. As such, the use order of their operands may be reordered in ways that are not visible via the usual mechanisms. The use edge iterator was not taking these mechanisms into account, which caused mismatches between the use order observed by LSRA and the order observed by code generation. This in turn caused SBCG under circumstances in which one operand needed to be copied from e.g. esi to edi before another operand was unspilled into esi. Fixes VSO 297113. * Fix building against liblttng-ust-dev 2.8+ * Fix to issue 8356. * fix comparison * Streamline LSRA resolution Only do resolution when required, and only for variables that may need it. * GcInfoEncoder: Initialize the BitArrays tracking liveness (dotnet#8485) The non-X86 GcInfoEncoder library uses two bit-arrays to keep track of pointer-liveness. The BitArrays are allocated using the arena allocator which doesn't zero-initialize them. This was causing non-deterministic redundant allocation of unused slots. This change fixes the problem. * [x86/Linux] Port PATCH_LABEL macro (dotnet#8483) * [x86/Linux] Port asmhelpers.asm (dotnet#8489) This commit ports asmhelpers.asm to x86/Linux. (CallRtlUnwind is currently marked as NYI) * [x86/Linux] Port StubLinkerCPU::EmitSetup (dotnet#8494) This commit ports StubLinkerCPU::EmitSetup to x86/Linux. * Move JIT_EndCatch from asmhelpers.asm into jithelp.asm (dotnet#8492) * Move JIT_EndCatch from asmhelpers.asm into jithelp.asm The name of JIT_EndCatch suggests that it is a JIT helper, but its implementation is inside asmhelpers.asm (not in jithelp.asm). This commit moves its implementation into jithelp.asm. 
* Move COMPlusEndCatch declaration * [x86/Linux][SOS] Add definitions for CLR_CMAKE_PLATFORM_ARCH_I386 in CMakeLists.txt file of lldbplugin (dotnet#8499) * Use only lower floats for Vector3 dot and equality For both dot product and comparisons that produce a boolean result, we need to use only the lower 3 floats. The bug was exposed by a case where the result of a call was being used in one of these operations without being stored to a local (which would have caused the upper bits to be cleared). Fix #8220 * [x86/Linux][SOS] Get correct stack pointer from DT_CONTEXT (dotnet#8500) * Remove a use of `gtGetOp` in earlyprop. Instead, use `GenTreeIndir::Addr`, as some indirections are not simple operators. Fixes VSO 289704. * Change ArraySortHelper to use Comparison<T> The Array/List.Sort overloads that take a Comparison<T> have worse performance than the ones that take a IComparer<T>. That's because sorting is implemented around IComparer<T> and a Comparison<T> needs to be wrapped in a comparer object to be used. At the same time, interface calls are slower than delegate calls so the existing implementation doesn't offer the best performance even when the IComparer<T> based overloads are used. By changing the implementation to use Comparison<T> we avoid interface calls in both cases. When IComparer<T> overloads are used a Comparison<T> delegate is created from IComparer<T>.Compare, that's an extra object allocation but sorting is faster and we avoid having two separate sorting implementations. * Remove unused DepthLimitedQuickSort methods These are never used in CoreCLR * Use a left-leaning comma tree when morphing a stelem.ref helper. fgMorphCall may change a call to the stelem.ref helper that is storing a null value into a simple store. This transformation needs to construct a comma tree to hold the argument setup nodes present on the call if any exist. Originally this tree was constructed in right-leaning fashion (i.e. 
the first comma node was the root of the tree and each successive comma node was the RHS of its parent). Unfortunately, this construction did not automatically propagate the flags of a comma node's children to the comma node, since not all of each comma node's actual children were available at the time it was constructed. Constructing the tree in left-leaning fashion (i.e. the first comma node is the left-most child and the final comma node is the root of the tree) allows the flag propagation to be performed correctly by constrution. Fixes VSO 297215. * Enable POGO build and link for CodegenMirror [tfs-changeset: 1640669] * Refactor Span<T> to ease implementation of JIT intrinsics (dotnet#8497) - Introduce internal ByReference<T> type for byref fields and change Span to use it - Generalize handling of byref-like types in the type loader - Make DangerousGetPinnableReference public while I was on it * [x86/Linux] Fix inconsistent GetCLRFunction definitions (dotnet#8472) * [x86/Linux] Fix inconsistency in GetCLRFunction definitions GetCLRFunction is treated as pfnGetCLRFunction_t which has __stdcall convention, but is implemented without __stdcall. This inconsistency causes segmentaion fault while initializing CoreCLR for x86/Linux. This commit fixes such inconsistency via adding __stdcall to GetCLRFunction implementation. In addition, this commit declares GetCLRFuntion in 'utilcode.h' and and revises .cpp files to include 'utilcode.h' instead of declaring 'GetCLRFunction'. * Remove unnecessary includes * Remove another unnecessay include * Strip some conditional compilation in SPCL (dotnet#8511) Removed: FEATURE_FUSION FEATURE_PATHCOMPAT FEATURE_APPDOMAINMANAGER_INITOPTIONS FEATURE_APTCA FEATURE_CLICKONCE FEATURE_IMPERSONATION FEATURE_MULTIMODULE_ASSEMBLIES Removed some: FEATURE_CAS_POLICY !FEATURE_CORECLR FEATURE_REMOTING * Supporting C# 7 deconstruction of certain types. 
See https://github.com/dotnet/corefx/issues/13746 * Remove an unused local variable In lowerxarch.cpp, local variable srcUns is defined but not used at Lowering::LowerCast(GenTree* tree). Signed-off-by: Hyung-Kyu Choi <hk0110.choi@samsung.com> * fix parentheses * Update glossary.md * Move native search paths forward (dotnet#8531) Set native search paths in AppDomain.Setup before doing the rest of the setup steps to get ahead of potential P/Invoke calls. * Simplify TimeZoneInfo.Equals(object) (dotnet#8514) Equals(TimeZoneInfo) already handles null. * Avoid allocating in TimeZoneInfo.GetHashCode() (dotnet#8513) Avoid the intermediate ToUpper string allocation. * Disable special put args for LIMIT_CALLER on x86. On x86, `LSRA_LIMIT_CALLER` is too restrictive to allow the use of special put args: this stress mode leaves only three registers allocatable--eax, ecx, and edx--of which the latter two are also used for the first two integral arguments to a call. This can leave us with too few registers to succesfully allocate in situations like the following: t1026 = lclVar ref V52 tmp35 u:3 REG NA <l:$3a1, c:$98d> /--* t1026 ref t1352 = * putarg_reg ref REG NA t342 = lclVar int V14 loc6 u:4 REG NA $50c t343 = const int 1 REG NA $41 /--* t342 int +--* t343 int t344 = * + int REG NA $495 t345 = lclVar int V04 arg4 u:2 REG NA $100 /--* t344 int +--* t345 int t346 = * % int REG NA $496 /--* t346 int t1353 = * putarg_reg int REG NA t1354 = lclVar ref V52 tmp35 (last use) REG NA /--* t1354 ref t1355 = * lea(b+0) byref REG NA Here, the first `putarg_reg` would normally be considered a special put arg, which would remove `ecx` from the set of allocatable registers, leaving only `eax` and `edx`. 
The allocator will then fail to allocate a register for the def of `t345` if arg4 is not a register candidate: the corresponding ref position will be constrained to { `ecx`, `ebx`, `esi`, `edi` }, which `LSRA_LIMIT_CALLER` will further constrain to `ecx`, which will not be available due to the special put arg. * The fix is to set the GTF_EXCEPT and GTF_GLOB_REF for every GT_DYN_BLK node that we create. We typically don't have any information about the address supplied to a GT_DYN_BLK so we should conservatively allow that it can either be a null pointer or could point into the GC heap. * Correct an assertion in LSRA. `verifyFinalAllocation` asserts that if a non-BB interval RefPosition that is either spilled or is the interval's last use does not have a register, then that ref position must be marked `AllocateIfProfitable`. However, this situation can also arise in at least one other situation: an unused parameter will have at least one ref position that may not be allocated to a register. This change corrects the assertion to check `RefPosition::RequiresRegister` rather than `RefPosition::AllocateIfProfitable`. Fixes VSO 299207. * Remove sscanf and sprintf usage (dotnet#8508) * Remove sscanf * Remove sprintf * Fix unix unwind info Windows uses offset from stack pointer, when unix has to use offset from caninical frame address, * Add script generator and generate test scripts to adapt debuggertests repo for coreclr infrastructure * Remove private TimeZoneInfoComparer (dotnet#8512) Use Comparison<T> instead of IComparer<T> to sort the list of TimeZoneInfos, which moves the comparison code to the sole place where it is used, and now that Array.Sort is implemented in terms of Comparison<T> instead of IComparer<T>, avoids some unnecessary intermediate allocations. 
* Simplify TimeZoneInfo.AdjustmentRule.Equals (dotnet#8527) * Preallocate the TimeZoneInfo.Utc instance (dotnet#8530) There doesn't appear to be a good reason why the TimeZoneInfo.Utc instance needs to be cleared when TimeZoneInfo.ClearCachedData() is called. Instead, we can pre-allocate and reuse a singleton instance, obviating the need for the lazy-initialization/locking mechanics. * Make TimeZoneInfo.AdjustmentRule fields readonly (dotnet#8528) AdjustmentRule is immutable. Help enforce this by making its fields readonly. * [x86/Linux] Port Several Stubs as NYI (dotnet#8515) This commit adds SinglecastDelegateInvokeStub and VSD-related Stubs as NYI. * Make TimeZoneInfo.TransitionTime fields readonly (dotnet#8529) TransitionTime is immutable. Help enforce this by making its fields readonly. * TimeZoneInfo: Use string.Concat instead of string.Format (dotnet#8540) It's more efficient to concatenate the strings. * Make TimeZoneInfo fields readonly (dotnet#8526) TimeZoneInfo is immutable. Help enforce this by making its fields readonly. * [x86/Linux] Revise asmhelper.S using macro (dotnet#8523) * [x86/Linux] Revise asmhelper.S using macro This commit revises asmhelper.S using macros that inserts CFI directives. * [x86/Linux] Fix PAL unit test paltest_pal_sxs_test1 (dotnet#8522) Fix unit test error for x86/Linux - fix fail of exception_handling/pal_sxs/test1/paltest_pal_sxs_test1 * Fix to issue 8287. * Port ConditionalWeakTable from CoreRT The CoreRT ConditionalWeakTable was modified to support lock-free reads. This ports the implementation back to coreclr. * Fix perf regression with lots of objects in a ConditionalWeakTable The CoreRT implementation of ConditionalWeakTable that was ported back to CoreCLR uses a special scheme to make reads lock-free. When the container needs to grow, it allocates new arrays and duplicates all of the dependency handles from the previous array, rather than just copying them. 
This avoids issues stemming from a thread getting a dependency handle in an operation on the container, then having that handle destroyed, and then trying to use it; the handle won't be destroyed as long as the container is referenced. However, this also leads to a significant cost in a certain situation. Every time the container grows, it allocates another N dependency handles where N is the current size of the container. So, for example, with an initial size of 8, if 64 objects are added to the container, it'll allocate 8 dependency handles, then another 16, then another 32, and then another 64, resulting in significantly more handles than in the old implementation, which would only allocate 64 handles total. This commit fixes that by changing the scheme slightly. A container still frees its handles in its finalizer. However, rather than duplicating all handles, that responsibility for freeing is transferred from one container to the next. Then to avoid issues where, for example, the second container is released while the first is still in use, a reference is maintained from the first to the second, so that the second can't be finalized while the first is still in use. The commit also fixes a race condition with resurrection and finalization, whereby dependency handles could be used while or after they're being freed by the finalizer. It's addressed by only freeing handles in a second finalization after clearing out state in the first finalization to guarantee no possible usage during the second. 
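
The handle counts in the ConditionalWeakTable example above can be checked with a short calculation. This is a sketch of the arithmetic only, assuming the capacity doubles on each growth as in the 8, 16, 32, 64 sequence quoted; `handlesWithDuplication` is a hypothetical name, not a runtime API.

```cpp
#include <cstddef>

// Total dependency handles allocated when every growth step duplicates all
// handles into a new array of doubled capacity, counting the initial array.
// Assumes initialCapacity > 0; returns 0 otherwise to avoid looping forever.
inline std::size_t handlesWithDuplication(std::size_t initialCapacity, std::size_t items)
{
    if (initialCapacity == 0)
    {
        return 0;
    }
    std::size_t capacity = initialCapacity;
    std::size_t total    = capacity; // handles for the initial array
    while (capacity < items)
    {
        capacity *= 2;     // container grows
        total += capacity; // a fresh set of handles for the new array
    }
    return total;
}
```

For an initial size of 8 and 64 objects this gives 8 + 16 + 32 + 64 = 120 handles, against the 64 the old implementation would allocate.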
* Strip more defines from CoreLib (dotnet#8545) * Strip more defines from CoreLib Removes the rest of FEATURE_CAS_POLICY FEATURE_REMOTING FEATURE_MACL And another significant chunk of !FEATURE_CORECLR * Address feedback * [x86/Linux] Fix getcpuid calling convention (dotnet#8552) Fix getcpuid(), getextcpuid() with STDCALL Fix xmmYmmStateSupport() with STDCALL * Fix incremental build when dummy version.cpp is generated (dotnet#8547) This change fixes a problem with incremental build on Unix. When the version.cpp is generated by the build.sh as a dummy one with no real version stamp in it, it is recreated every time the build.sh is run. That means that build needs to rebuild that file and also re-link all the components that include it. This change tests the file presence and contents before actually regenerating it. * [x86/Linux] Use Portable FastGetDomain (dotnet#8556) FastGetDomain function (with __declspec(naked)) causes segmentation fault. * [x86/Linux] Port ResolveWorkerAsmStub (dotnet#8557) * model.xml * Add TPA/Trusted Platform Assemblies description to the glossary (From https://github.com/dotnet/coreclr/issues/6470#issuecomment-235161459 ) * StringBuilder.AppendJoin (appending lists to StringBuilder) (dotnet#8350) Adding StringBuilder.AppendJoin * Make it easier to iterate through an ArraySegment (dotnet#8559) * Make it easier to iterate through an ArraySegment. See Make it easier to iterate through an ArraySegment * Fix path separator in CrossGen help on Linux * Change CWT use of GetPrimaryAndSecondary to GetPrimary This snuck in as part of my previous ConditionalWeakTable changes. We don't need the secondary here, and it's more expensive to get than just the primary. * Strip some security related attributes (dotnet#8571) Strips SecurityCritical, SecuritySafeCritical, SecurityPermission, EnvironmentPermission, and PermissionSet attributes. Also removes empty defines these left behind. 
Patterns used: ^.*\[(System\.Security\.)?SecurityCritical\](\s*//.*|\s*)$[\r\n]* ^.*#if FEATURE_CORECLR[\s\r\n]*(#else)?[\s\r\n]*#endif.*[\r\n]* ^.*\[(System\.Security\.Permissions\.)?SecurityPermission(Attribute)?$[^)]*$\](\s*//.*|\s*)$[\r\n]* * Change ConditionalWeakTable.Clear The original CoreRT implementation just dropped the current table and replaced it with a new one. In the process of porting the CoreRT implementation to CoreCLR, I'd changed it to instead remove each item from the table (by setting its hashcode to -1, as does Remove), in order to work around some code in the finalizer that would null out te parent's reference to the container, and that would cause problems with dropping this table... but that code in the finalizer changed before it got merged, and in its current form, the old CoreRT clear implementation was fine. It's also likely better, as it'll let the handles be cleaned up earlier, and it's simple. So reverting back to it. * Fix to issue 8286. * Update CoreClr, CoreFx to beta-24810-01, beta-24810-02, respectively * Adding API ConditionalWeakTable.AddOrUpdate (dotnet#8490) * Added ConditionalWeakTable.AddOrUpdate * Removes final FEATURE_CORECLR defines (dotnet#8555) * Fix misguided lock in CurrentTimeZone (dotnet#8569) CurrentTimeZone locks against a static lock object when modifying a non-static Hashtable. Instead, use the Hashtable instance itself as the lock object. * Fix typos and grammer in coreclr README.md (dotnet#8561) * Fix typeos and grammer in README.md * Fix a small grammar issue and remove a comma * TimeZoneInfo: Avoid cloning privately-created ArgumentRule[] arrays (dotnet#8575) TimeZoneInfo currently always creates a defensive copy of the specified ArgumentRule[] array when created. This makes sense for the public static factory methods. However, there's no need to create a defensive copy of arrays created privately as part of its implementation (e.g. reading the rules from the registry/disk). 
  This change avoids the unnecessary cloning.
* Improve ConditionalWeakTable.Remove (dotnet#8580)
  Clear the key of the deleted entry to allow the GC to collect objects pointed to by it. Fixes #8577
* Use JitHelpers.UnsafeCast in ConditionalWeakTable
  We know the types and can use UnsafeCast when reading out the objects as TKey and TValue. This improves reading speed by ~30%.
* Fix typo in clang-format directive and reformat end of flowgraph.cpp (dotnet#8573)
  Last 4K or so lines of flowgraph.cpp were not being formatted because the clang-format on directive had a typo. Fix the typo and reformat the latter part of the file.
* Local GC: Decouple write barrier operations between the GC and EE (dotnet#8568)
* Decouple write barrier operations between the GC and EE
* Address code review feedback
* Address code review feedback
* Repair the standalone GC build
* [x86/Linux][SOS] Add CALLBACK (aka stdcall) to function declarations (dotnet#8476)
* [x86/Linux][SOS] Don't include utilcode.h in gcdumpx86.cpp and gcdump.cpp (dotnet#8475)
* Span<T> api update (dotnet#8583)
* Changing method/property order to match CoreFX impl
  To make diffing the files easier
* Added other missing methods to match CoreFX impl
  Added:
  - public void CopyTo(Span<T> destination)
  - public static bool operator ==(Span<T> left, Span<T> right)
  - public static bool operator !=(Span<T> left, Span<T> right)
  - public override bool Equals(object obj)
  - public override int GetHashCode()
  Also removed 'public void Set(ReadOnlySpan<T> values)' as it's no longer part of the Span<T> public API, see https://github.com/dotnet/apireviews/tree/master/2016/11-04-SpanOfT#spantset
* Disable GetGenerationWR2 for GCStress on x86
  Disable this test for GCStress on x86 until the cause for its failure can be investigated.
* // BLOCKED (do not add now): [EditorBrowsable(EditorBrowsableState.Never)]
* Fix longname DAC to enable arm[64] symbol packages (dotnet#8574)
* [x86/Linux] implement TheUMEntryPrestub (dotnet#8589)
  Initial code for x86 TheUMEntryPrestub, UMThunkStub
* Rename Contract.Assert to Debug.Assert (dotnet#8600)
* Add ability to give a name to a PR run
* Ryujit/ARM32 Implement Lowering::LowerCast for ARM
  Simple integer to integer type conversion is passed. Add comment for LowerCast with C++ comment style. Signed-off-by: Hyung-Kyu Choi <hk0110.choi@samsung.com>
* Ryujit/ARM32 Initial Lowering::IsContainableImmed for ARM
  Initial implementation of IsContainableImmed for ARM. Signed-off-by: Hyung-Kyu Choi <hk0110.choi@samsung.com>
* Delete HostProtection attributes (dotnet#8610)
* Remove managed environment cache (dotnet#8604)
  The PAL already caches; no need to do it in managed. Fold the code back into Environment.cs.
* Synchronize src\mscorlib\corefx with the CoreRT fork (dotnet#8619)
* Enable interop debugging for Windows amd64 and x86. (dotnet#8603)
  Found sos portable pdb problem on x86. Fixed interop problems between the native sos and SOS.NetCore managed helper assembly.
* Fix incorrect compare narrowing in TreeNodeInfoInitCmp
  TreeNodeInfoInitCmp attempts to eliminate the cast from `cmp(cast<ubyte>(x), icon)` by narrowing the compare to ubyte. This should only happen if the constant fits in a byte so it can be narrowed too, otherwise codegen produces an int sized compare (or a byte sized compare with a truncated constant if we try to use GTF_RELOP_SMALL).
* Fix consume-order checking in codegen.
  The switch to LIR invalidated the correspondence between a node's sequence number and the order in which it must be consumed with respect to other nodes. This in turn made the consume-order checks during code generation incorrect. This change introduces a new field, `gtUseNum`, that is used during code generation to check the order in which nodes are consumed.
  Re-enabling these checks revealed a bug in code generation for locked instructions on x86, which were consuming their operands out-of-order; this change also contains a fix for that bug. Fixes #7963.
* Remove no-op file security (dotnet#8611)
  Deletes FileSecurityState and pulls redundant methods. Also removes DriveInfo, which isn't in use in Core.
* Fix the ARM32 build.
* [x86/Linux] Port jithelp.asm (dotnet#8491)
* [x86/Linux] Port jithelp.asm
  This commit ports jithelp.asm for x86/Linux. The following Tailcall helpers are marked as NYI:
  - JIT_TailCall
  - JIT_TailCallReturnFromVSD
  - JIT_TailCallVSDLeave
  - JIT_TailCallLeave
* Revise macro and indentation
* [x86/Linux] Fix "Bad opcode" assert in unwindLazyState (dotnet#8609)
* [x86/Linux] Fix "Bad opcode" assert in unwindLazyState
  This commit suppresses the "Bad opcode" assert while running the "Hello, World" example. This commit addresses the following three code patterns discovered while digging into the assert failure:
  - and $0x1, %al
  - xor $0xff, %al
  - stack protection code: mov %gs:<off>, <reg>; cmp <off>(%esp), <reg>; mov <reg>, <off>(%esp); jne <disp32>
  This commit revises LazyMachState::unwindLazyState to handle the first two patterns, and revises compile options not to emit the third pattern.
* [x86/Linux] Fix incorrect __fastcall definition (dotnet#8585)
  In x86/Linux, __fastcall is defined as __stdcall, which causes a stack corruption issue for the following functions:
  - HelperMethodFrameRestoreState
  - LazyMachStateCaptureState
  This commit removes the __fastcall definition, as clang recognizes __fastcall.
* [x86/Linux] Enforce 16-byte stack alignment (dotnet#8587)
  Clang (and GCC) requires 16-byte stack alignment, but the current implementation of CallDescrInternal and ThePreStub does not provide any guarantee on stack alignment. This commit adds 16-byte stack alignment adjust code inside these functions.
* Adding method implementations to fast span/readonlyspan (dotnet#8607)
* [ARM32/Linux] Initial bring up of FEATURE_INTERPRETER (dotnet#8594)
* Bring up FEATURE_INTERPRETER for ARM32/Linux
* Add a disclaimer message for GC preemption workaround
* [x86/Linux] Adds Dummy Exception Handler (dotnet#8613)
  This commit adds a dummy exception handler to enable x86/Linux build. This commit also reverts 7b92136 to make it easy to enable WIN64EXCEPTIONS in x86/Linux.
* Move RUNTIME_FUNCTION__BeginAddress into clrnt.h (dotnet#8632)
  RUNTIME_FUNCTION__BeginAddress is defined in corcompile.h for x86, but is defined in clrnt.h for all the other architectures. This commit moves RUNTIME_FUNCTION__BeginAddress defines for x86 into clrnt.h to make it consistent.
* Update CoreClr, CoreFx to beta-24814-03, beta-24814-02, respectively
* Add support for R2R ldvirtftn helpers (dotnet#8608)
  The codegen for the non-readytorun path (used on CoreCLR) requires the runtime to support general purpose virtual method resolution based on a method handle. CoreRT doesn't have such helpers.
* [Linux][GDB-JIT] Add simple C++ mangling of method names in GDBJIT DWARF (dotnet#8638)
* Add simple C++ mangling of method names in GDBJIT DWARF
  Example: Namespace1.Class1.Method -> void Namespace1_Class1::Method()
* Do not convert a name during mangling if target buffer is NULL
* Fix ref count adjustment in `fgMorphBlockStmt`.
  LclVar ref counts must be incremented before attempting to remove the morphed statement, since doing so decrements ref counts (and thus requires refcounts to be conservatively correct). Fixes VSO 359734.
* Add a regression test.
* Correctly sequence fgMorphModToSubMulDiv
  This method was creating a temp, but the final result was a GT_SUB with a use of the temp as its op1, and it was not setting GTF_REVERSE_OPS. This led to a liveness assert in LSRA.
* Fix SIMD Scalar Move Encoding: VEX.L should be 0
  For SIMD scalar move instructions such as vmovlpd, vmovlps, vmovhps, vmovhps and vmovss on AVX systems, the JIT should ensure that those instructions are encoded with VEX.L=0, because encoding them with VEX.L=1 may cause unpredictable behavior across different processor generations. VEX.L is encoded as 1 because the JIT calls compiler->getSIMDVectorType(), which returns EA_32BYTE, and that value is passed into the emitter's AddVexPrefix(), which ends up encoding VEX.L=1. The fix is to pass the target type and base type for those instructions to ensure VEX.L=0 in the emitter's AddVexPrefix(). Fix #8328
* Update CoreFx to beta-24815-01
* Remove API Set dependency (dotnet#8624)
* Update CoreClr, CoreFx to beta-24815-03, beta-24815-03, respectively
* Switch GCSample to the canonical GCToOSInterface implementation (dotnet#8653)
* Make it easier to use StringComparison & StringComparer with GetHashCode (dotnet#8633)
* Make it easier to use StringComparison & StringComparer with GetHashCode
* model.xml
* [Linux][GDB-JIT] Add try/catch blocks to methods in DWARF (dotnet#8650)
* Add try/catch blocks to DWARF info
  Use info about exception handling from IL and map it to native code as try/catch blocks in DWARF.
* Improve locals naming convention consistency in gdbjit
* Drop pointer to line info from FunctionMember after it is dumped
* Update CoreFx to beta-24816-02
* Makes CultureInfo.get_Parent thread safe (dotnet#8656)
* Fix buildsystem for linux cross-architecture component build (dotnet#8646)
* Fix buildsystem for linux cross-architecture component build
* refactoring build.sh, bug fix and typo fix
* Update build.sh
* Update CoreClr to beta-24816-04
* Dictionary.GetValueOrDefault (dotnet#8641)
* Dictionary.GetValueOrDefault
* Fixed IDictionary.TryGetValue
* remove extensions
* // Method similar to TryGetValue that returns the value instead of putting it in an out param.
* public TValue GetValueOrDefault(TKey key) => GetValueOrDefault(key, default(TValue));
* Avoid Unsafe.AsRef in Span<T> implementation (dotnet#8657)
  The JIT is not able to inline the current implementation of Unsafe.AsRef because it has a type mismatch on return. Change the corelib Span to call Unsafe.As instead, since fixing the type mismatch is not easy in the internal corelib version of Unsafe.AsRef.
* Packaging support for portable Linux binaries.
* Widen basic block flag field to 64 bits
  Flag field is currently full, and I need at least one more bit to identify the start of a cloned finally. Widen to 64 bits and update a few uses. Also removed an unused copy.
* Use ExecutionPolicy ByPass to execute probe-win.ps1 script during the build (dotnet#8673)
  Port of dotnet/corert#2377
* Remove Read/WriteProcessMemory from PAL. (dotnet#8655)
  Ifdef more unused code that uses ReadProcessMemory. Move the current memory probing in the transport to PAL_ProbeMemory. Add PAL_ProbeMemory to dac PAL exports. PAL_ProbeMemory may be changed to use write/read on a pipe to validate the memory as soon as we make it perform as well as the current code. Remove ReadProcessMemory tests and add PAL_ProbeMemory pal tests.
* Fixing up the arm subsystem versions (dotnet#8676)
* Adding arm64 and updating default subsystems
* [x86/Linux] Add UMThunkStub (dotnet#8627)
  Add UMThunkStub method with logic from that of AMD64
* [Linux][GDB-JIT] Fix bugs in gdbjit that break lldb stepping (dotnet#8637)
* Fix .text and .thunk symbols overlapping
  When current method calls itself, a __thunk* symbol might be generated with the same address as the method symbol in .text section. Avoid generating such __thunk* symbol.
* Do not create DWARF line table entries with the same address
* For each HiddenLine assign a zero line number in DWARF
  Allow LLDB to skip HiddenLines when stepping.
* Fix __thunk symbols containing garbage
  Fix a bug when __thunk* symbols of previously compiled methods cause generation of __thunk* symbols for currently compiled method without filling symbol info.
* Fix missing check for the end of list of compiled methods
* Remove unnecessary check for zero prevLine in gdbjit
* Fix DllImport of IdnToAscii & IdnToUnicode (dotnet#8666)
  Fix an issue found in OneCoreUAP testing. According to MSDN, the official exporting DLL for IdnToAscii and IdnToUnicode is normaliz.dll, not kernel32.dll. While most Windows SKUs export these functions from both normaliz.dll and kernel32.dll, recent tests revealed that some Windows SKUs export them from normaliz.dll only.
* Add Encoding.GetBytes(string, offset, count) (dotnet#8651)
* Update build-packages.sh to support portableLinux

parjong referenced this issue in parjong/coreclr Nov 29, 2016

Disable FEATURE_AVX_SUPPORT for x86/Linux

e101afd

This commit disables FEATURE_AVX_SUPPORT for x86/Linux to fix #8331.

BruceForstall self-assigned this Nov 29, 2016

BruceForstall changed the title ~~Q) [x86/Linux] shift count >= width of type error~~ RyuJIT/x86: encoder uses size_t to accumulate encoding bits, but that's not big enough Nov 29, 2016

janvorli closed this as completed in dotnet/coreclr#8335 Nov 30, 2016

BruceForstall reopened this Nov 30, 2016

BruceForstall closed this as completed in dotnet/coreclr@b90516f Dec 1, 2016
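
For readers skimming the timeline, here is a minimal sketch of why the accumulator width matters. It is not the JIT's code: `narrow_t`, `place_bits_wide`, `keep_low_32` and `check_accumulator_widths` are hypothetical names made up for illustration, and `code_t` here is only a stand-in for the 64-bit type the thread describes. The shift count 35 is the one from the reported compiler error.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the code_t described in the thread: 64 bits wide
// even when the target's size_t is only 32 bits.
typedef std::uint64_t code_t;

// Hypothetical stand-in for size_t on a 32-bit x86 build.
typedef std::uint32_t narrow_t;

// Place a 3-bit register field at bit position `shift` of a 64-bit accumulator.
constexpr code_t place_bits_wide(unsigned regBits, unsigned shift)
{
    return static_cast<code_t>(regBits & 0x7u) << shift;
}

// Keep only what a 32-bit accumulator could hold of the same encoding.
constexpr narrow_t keep_low_32(code_t code)
{
    return static_cast<narrow_t>(code & 0xFFFFFFFFull);
}

// Exercise both widths with the shift count from the compiler error (35).
inline void check_accumulator_widths()
{
    const code_t wide = place_bits_wide(0x5u, 35);
    assert(wide == 0x2800000000ull);  // the field survives in 64 bits
    assert(keep_low_32(wide) == 0u);  // and is lost entirely in 32 bits
}
```

Calling `check_accumulator_widths()` from any test or `main` runs both checks. Any field placed at bit 32 or above needs the wider type, which is why widening the accumulator resolves the error where casting at the shift site alone would not.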

msftgits transferred this issue from dotnet/coreclr Jan 31, 2020

msftgits added this to the 2.0.0 milestone Jan 31, 2020

ghost locked as resolved and limited conversation to collaborators Dec 27, 2020
