Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Expr(:loopinfo) codegen #50663

Merged
merged 14 commits into from
Aug 14, 2023
Merged

Fix Expr(:loopinfo) codegen #50663

merged 14 commits into from
Aug 14, 2023

Conversation

vchuravy
Copy link
Member

Fixes #50660

We used a loop-marker intrinsic because the LoopID used to
be dropped by optimization passes (this seems no longer true).

#50660 is an example of a miscompilation where a loop of length 1
got optimized by simplifycfg to the point where the loop-marker
is now attached to the wrong back-edge.

This PR drops the loop-marker and uses the LoopID metadata node directly.

@vchuravy vchuravy added compiler:codegen Generation of LLVM IR and native code bugfix This change fixes an existing bug backport 1.9 Change should be backported to release-1.9 backport 1.10 Change should be backported to the 1.10 release labels Jul 24, 2023
@vchuravy vchuravy requested a review from pchintalapudi July 24, 2023 21:37
src/codegen.cpp Show resolved Hide resolved
src/llvm-simdloop.cpp Outdated Show resolved Hide resolved
src/llvm-simdloop.cpp Show resolved Hide resolved
@vchuravy
Copy link
Member Author

vchuravy commented Jul 26, 2023

Currently I am miss-attaching the loop id, in some cases.

@vchuravy vchuravy marked this pull request as ready for review August 7, 2023 22:42
@vchuravy vchuravy added needs nanosoldier run This PR should have benchmarks run on it needs pkgeval Tests for all registered packages should be run with this change labels Aug 7, 2023
@vchuravy
Copy link
Member Author

vchuravy commented Aug 9, 2023

@pchintalapudi

Loop passes are also required to preserve ScalarEvolution, which is almost certainly affected by this pass.

I don't think there is a SCEV updater, but there is forgetValue

So you think that in:

    for (chainVector::const_iterator K=chain.begin(); K!=chain.end(); ++K) {
        LLVM_DEBUG(dbgs() << "LSL: marking " << **K << "\n");
        REMARK([&]() {
            return OptimizationRemark(DEBUG_TYPE, "MarkedUnsafeAlgebra", *K)
                   << "marked unsafe algebra on " << ore::NV("Instruction", *K);
        });
        (*K)->setHasAllowReassoc(true);
        (*K)->setHasAllowContract(true);
        ++length;
    }

We should forgetValue(*K)?

@pchintalapudi
Copy link
Member

Yes, and if we change the loop we should also call SE->forgetLoopDispositions(L)

@vchuravy
Copy link
Member Author

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@vchuravy
Copy link
Member Author

@nanosoldier runtests(vs = ":master", configuration = (buildflags=["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"],), vs_configuration = (buildflags = ["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"],))

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@KristofferC KristofferC mentioned this pull request Aug 10, 2023
35 tasks
@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@vchuravy
Copy link
Member Author

Sigh...

julia: /workspace/srcdir/llvm-project/llvm/include/llvm/Support/Casting.h:566: decltype(auto) llvm::cast(const From&) [with To = llvm::MDNode; From = llvm::MDOperand]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

[9] signal (6.-6): Aborted
in expression starting at /opt/julia/share/julia/stdlib/v1.11/Dates/test/arithmetic.jl:518
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7f0a72f1f40e)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_ZN4llvm23addStringMetadataToLoopEPNS_4LoopEPKcj at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
_ZN4llvm8peelLoopEPNS_4LoopEjPNS_8LoopInfoEPNS_15ScalarEvolutionERNS_13DominatorTreeEPNS_15AssumptionCacheEb at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
_ZL15tryToUnrollLoopPN4llvm4LoopERNS_13DominatorTreeEPNS_8LoopInfoERNS_15ScalarEvolutionERKNS_19TargetTransformInfoERNS_15AssumptionCacheERNS_25OptimizationRemarkEmitterEPNS_18BlockFrequencyInfoEPNS_18ProfileSummaryInfoEbibbNS_8OptionalIjEESK_NSJ_IbEESL_SL_SL_SL_SK_ at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
_ZN4llvm18LoopFullUnrollPass3runERNS_4LoopERNS_15AnalysisManagerIS1_JRNS_27LoopStandardAnalysisResultsEEEES5_RNS_10LPMUpdaterE at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
run at /source/usr/include/llvm/IR/PassManagerInternal.h:88
_ZN4llvm11PassManagerINS_4LoopENS_15AnalysisManagerIS1_JRNS_27LoopStandardAnalysisResultsEEEEJS4_RNS_10LPMUpdaterEEE13runSinglePassIS1_St10unique_ptrINS_6detail11PassConceptIS1_S5_JS4_S7_EEESt14default_deleteISD_EEEENS_8OptionalINS_17PreservedAnalysesEEERT_RT0_RS5_S4_S7_RNS_19PassInstrumentationE at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
_ZN4llvm11PassManagerINS_4LoopENS_15AnalysisManagerIS1_JRNS_27LoopStandardAnalysisResultsEEEEJS4_RNS_10LPMUpdaterEEE24runWithoutLoopNestPassesERS1_RS5_S4_S7_ at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
_ZN4llvm11PassManagerINS_4LoopENS_15AnalysisManagerIS1_JRNS_27LoopStandardAnalysisResultsEEEEJS4_RNS_10LPMUpdaterEEE3runERS1_RS5_S4_S7_ at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
run at /source/usr/include/llvm/IR/PassManagerInternal.h:88
_ZN4llvm25FunctionToLoopPassAdaptor3runERNS_8FunctionERNS_15AnalysisManagerIS1_JEEE at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
run at /source/usr/include/llvm/IR/PassManagerInternal.h:88
_ZN4llvm11PassManagerINS_8FunctionENS_15AnalysisManagerIS1_JEEEJEE3runERS1_RS3_ at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
run at /source/usr/include/llvm/IR/PassManagerInternal.h:88
_ZN4llvm27ModuleToFunctionPassAdaptor3runERNS_6ModuleERNS_15AnalysisManagerIS1_JEEE at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
run at /source/usr/include/llvm/IR/PassManagerInternal.h:88
_ZN4llvm11PassManagerINS_6ModuleENS_15AnalysisManagerIS1_JEEEJEE3runERS1_RS3_ at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)

@vchuravy
Copy link
Member Author

@nanosoldier runtests(configuration = (buildflags=["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"],), vs_configuration = (buildflags = ["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"],))

@vchuravy vchuravy removed the needs nanosoldier run This PR should have benchmarks run on it label Aug 10, 2023
@vchuravy vchuravy requested a review from pchintalapudi August 10, 2023 14:14
@vchuravy
Copy link
Member Author

@nanosoldier runtests(vs = ":master", configuration = (buildflags=["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"],), vs_configuration = (buildflags = ["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"],))

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@vchuravy vchuravy removed the needs pkgeval Tests for all registered packages should be run with this change label Aug 13, 2023
@vchuravy vchuravy merged commit 90494c2 into master Aug 14, 2023
@vchuravy vchuravy deleted the vc/loopid branch August 14, 2023 15:59
KristofferC added a commit that referenced this pull request Aug 16, 2023
Backported PRs:
- [x] #50637 <!-- Remove SparseArrays legacy code -->
- [x] #50665 <!-- print `@time` msg into print buffer -->
- [x] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [x] #50635 <!-- `versioninfo()`: include build info and unofficial
warning -->
- [x] #50670 <!-- Make reinterpret specialize fully. -->
- [x] #50666 <!-- include `--pkgimage=no` caches for stdlibs -->
- [x] #50765 
- [x] #50764
- [x] #50768
- [x] #50767
- [x] #50618 <!-- inference: continue const-prop' when concrete-eval
returns non-inlineable -->
- [x] #50689 <!-- Attach `tanpi` docstring to method -->
- [x] #50671 <!-- Fix rdiv of complex lhs by real factorizations -->
- [x] #50598 <!-- only limit types in stack traces in the REPL -->
- [x] #50766 <!-- Don't partition alwaysinline functions -->
- [x] #50771 <!-- re-allow non-string values in ENV `get!` -->
- [x] #50682 <!-- Add fallback if we have make a weird GC decision. -->
- [x] #50781 <!-- fix `bit_map!` with aliasing -->
- [x] #50172 <!-- print feature flags used for matching pkgimage -->
- [x] #50844 <!-- Bump OpenBLAS binaries to use the new GEMM
multithreading threshold -->
- [x] #50826 <!-- Update dependency builds -->
- [x] #50845 <!-- fix #50438, use default pool for at-threads -->
- [x] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [x] #50655 <!-- fix hashing regression. -->
- [x] #50779 <!-- Minor refactor to image generation -->
- [x] #50791 <!-- Make symbols internal in jl_create_native, and only
externalize them when partitioning -->
- [x] #50724 <!-- Merge opaque closure modules with the rest of the
workqueue -->
- [x] #50738 <!-- Add alignment to constant globals -->
- [x] #50871 <!-- macOS: Don't inspect dead threadtls during exception
handling. -->

Need manual backport:

Contains multiple commits, manual intervention needed:

Non-merged PRs with backport label:
- [ ] #50850 <!-- Remove weird Rational dispatch and add pi functions to
list -->
- [ ] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [ ] #50809 <!-- Limit type-printing in MethodError -->
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [ ] #50385 <!-- Precompile pidlocks: add to NEWS and docs -->
- [ ] #49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
KristofferC added a commit that referenced this pull request Aug 18, 2023
Backported PRs:
- [x] #47782 <!-- Generalize Bool parse method to AbstractString -->
- [x] #48634 <!-- Remove unused "deps" mechanism in internal sorting
keywords [NFC] -->
- [x] #49931 <!-- Lock finalizers' lists at exit -->
- [x] #50064 <!-- Fix numbered prompt with input only with comment -->
- [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized -->
- [x] #50516 <!-- Fix visibility of assert on GCC12/13 -->
- [x] #50635 <!-- `versioninfo()`: include build info and unofficial
warning -->
- [x] #49915 <!-- Revert "Remove number / vector (#44358)" -->
- [x] #50781 <!-- fix `bit_map!` with aliasing -->
- [x] #50845 <!-- fix #50438, use default pool for at-threads -->
- [x] #49031 <!-- Update inference.md -->
- [x] #50289 <!-- Initialize prev_nold and nold in gc_reset_page -->
- [x] #50559 <!-- Expand kwcall lowering positional default check to
vararg -->
- [x] #49582 <!-- Update HISTORY.md for `DelimitedFiles` -->
- [x] #50341 <!-- invokelatest docs should say not exported before 1.9
-->
- [x] #50525 <!-- only check that values are finite in `generic_lufact`
when `check=true` -->
- [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some
cases -->
- [x] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [x] #50860 <!-- Add `Base.get_extension` to docs/API -->
- [x] #50164 <!-- codegen: handle dead code with unsafe_store of FCA
pointers -->
- [x] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [x] #50871 <!-- macOS: Don't inspect dead threadtls during exception
handling. -->

Need manual backport:
- [ ] #48542 <!-- Add docs on task-specific buffering using
multithreading -->
- [ ] #50591 <!-- build: fix various makefile bugs -->


Non-merged PRs with backport label:
- [ ] #50842 <!-- Avoid race conditions with recursive rm -->
- [ ] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #49716 <!-- Update varinfo() docstring signature -->
- [ ] #49713 <!-- prevent REPL from erroring in numbered mode in some
situations -->
- [ ] #49573 <!-- Implement jl_cpu_pause on PPC64 -->
- [ ] #48726 <!-- fix macro expansion of property destructuring -->
- [ ] #48642 <!-- Use gc alloc instead of alloc typed in lowering -->
- [ ] #48183 <!-- Don't use pkgimage for package if any includes fall in
tracked path for coverage or alloc tracking -->
- [ ] #48050 <!-- improve `--heap-size-hint` arg handling -->
- [ ] #47615 <!-- Allow threadsafe access to buffer of type inference
profiling trees -->
@IanButterworth
Copy link
Member

IanButterworth commented Aug 19, 2023

FYI there are a lot of conflicts when trying to backport this to 1.9 and 1.10

KristofferC added a commit that referenced this pull request Oct 2, 2023
Backported PRs:
- [x] #48625 <!-- add replace(io, str, patterns...) -->
- [x] #48387 <!-- Improve documentation of sort-related functions -->
- [x] #48363 <!-- Revise sort.md and docstrings in sort.jl -->
- [x] #48977 <!-- Update SparseArrays.jl stdlib for SuiteSparse 7 -->
- [x] #50719 <!-- fix `CyclePadding(::DataType)` -->
- [x] #50694 <!-- inference: permit recursive type traits -->
- [x] #50860 <!-- Add `Base.get_extension` to docs/API -->
- [x] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [x] #50802 <!-- Makes IntrusiveLinkedListSynchronized mutable to avoid
allocation on poptask -->
- [x] #50858 <!-- Add a `threadpool` parameter to `Channel` constructor
-->
- [x] #50874 <!-- Restrict COFF to a single thread when symbol count is
high -->
- [x] #50822 <!-- Add default method for setmodifiers! -->
- [x] #50730 <!-- Fix integer overflow in `isapprox` -->
- [x] #50850 <!-- Remove weird Rational dispatch and add pi functions to
list -->
- [x] #50809 <!-- Limit type-printing in MethodError -->
- [x] #50915 <!-- Add note the `Task` about sticky bit -->
- [x] #50929 <!-- when widening tuple types in tmerge, only widen the
complex parts -->
- [x] #50928 <!-- Bump JuliaSyntax to 0.4.6 -->
- [x] #50959 <!-- Update libssh2 patches -->
- [x] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [x] #48542 <!-- Add docs on task-specific buffering using
multithreading -->
- [x] #50912 <!-- Separate foreign threads into a :foreign threadpool
-->
- [x] #51010 <!-- Add ORIGIN to SuiteSparse rpath on Linux/FreeBSD -->
- [x] #50753 <!-- cat: remove unused promote_eltype methods that confuse
inference -->
- [x] #51027 <!-- Implement realloc accounting correctly -->
- [x] #51019 <!-- fix a case of potentially use of undefined variable
when handling error in distributed message processing -->
- [x] #51039 <!-- Revert "Optimize findall(f, ::AbstractArray{Bool})
(#42202)" -->
- [x] #51036 <!-- add missing invoke edge for nospecialize targets -->
- [x] #51042 <!-- inference: fix return_type_tfunc modeling of concrete
functions -->
- [x] #51114 <!-- Workaround upstream FreeBSD issue #272992 -->
- [x] #50892 <!-- Add `JL_DLLIMPORT` to `small_typeof` declaration -->
- [x] #51154 <!-- broadcast: use recursion rather than ntuple to map
over a tuple -->
- [x] #51153 <!-- fix debug typo in "add missing invoke edge for
nospecialize targets (#51036)" -->
- [x] #51222 <!-- Check again if the tty is open inside the IO lock -->
- [x] #51236 <!-- Add lock around uv_unref during init -->
- [x] #51243 <!-- GMP: Gracefully handle more overflows. -->
- [x] #51254 <!-- Ryu: make sure adding zeros does not overwrite
trailing dot -->
- [x] #51175 <!-- shorten stale_age for cachefile lock -->
- [x] #51300 <!-- fix method definition error for bad vararg -->
- [x] #51307 <!-- fix force-throw ctrl-C on Windows -->
- [x] #51303 <!-- ensure revising structs is safe -->
- [x] #51393 
- [x] #51403 

Need manual backport:
- [x] #51009 <!-- fix #50562, regression in `in` of tuple of Symbols -->
- [x] #51053 <!-- Bump Statistics stdlib -->
- [x] #51013 <!-- fix #50709, type bound error due to free typevar in
sparam env -->
- [x] #51305 <!-- reduce test time for rounding and floatfuncs -->

Contains multiple commits, manual intervention needed:
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #51035 <!-- refactor GC scanning code to reflect jl_binding_t are
now first class -->
- [ ] #51092 <!-- inference: fix bad effects for recursion -->
- [x] #51247 <!-- Check if malloc has succeeded before incrementing gc
stats -->
- [x] #51294 <!-- use LibGit2_jll for LibGit2 library -->

Non-merged PRs with backport label:
- [ ] #51132 <!-- Handle `AbstractQ` in concatenation -->
- [x] #51029 <!-- testdefs: make sure that if a test set changes the
active project, they change it back when they're done -->
- [ ] #50919 <!-- Code loading: do the "skipping mtime check for stdlib"
check regardless of the value of `ispath(f)` -->
- [ ] #50824 <!-- Add some aliasing warnings to docstrings for mutating
functions -->
- [x] #50385 <!-- Precompile pidlocks: add to NEWS and docs -->
- [ ] #49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
@KristofferC KristofferC mentioned this pull request Oct 3, 2023
31 tasks
vchuravy added a commit that referenced this pull request Oct 28, 2023
KristofferC added a commit that referenced this pull request Nov 2, 2023
Backported PRs:
- [x] #50932 <!-- types: fix hash values of Vararg -->
- [x] #50975 <!-- Use rr-safe `nopl; rdtsc` sequence -->
- [x] #50989 <!-- fix incorrect results in `expm1(::Union{Float16,
Float32})` -->
- [x] #51284 <!-- Avoid infinite loop when doing SIGTRAP in arm64-apple
-->
- [x] #51332 <!-- Add s4 field to Xoshiro -->
- [x] #51397 <!-- call Pkg precompile hook in latest world -->
- [x] #51405 <!-- Remove fallback that assigns a module to inlined
frames. -->
- [x] #51491 <!-- Throw clearer ArgumentError for strip with two string
args -->
- [x] #51531 <!-- fix `_tryonce_download_from_cache` (busybox.exe
download error) -->
- [x] #51541 <!-- Fix string index error in tab completion code -->
- [x] #51530 <!-- Don't mark nonlocal symbols as hidden -->
- [x] #51557 <!-- Fix last startup & shutdown precompiles -->
- [x] #51512 <!-- avoid limiting Type{Any} to Type -->
- [x] #51595 <!-- reset `maxprobe` on `empty!` -->
- [x] #51582 <!-- Aggressive constprop in LinearAlgebra.wrap -->
- [x] #51592 <!-- correctly track element pointer in heap snapshot -->
- [x] #51326 <!-- complete false & true more generally as vals -->
- [x] #51376 <!-- make `hash(::Xoshiro)` compatible with `==` -->
- [x] #51557 <!-- Fix last startup & shutdown precompiles -->
- [x] #51845 
- [x] #51840 
- [x] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [x] #51863 <!-- LLVM 15.0.7-9 -->

Contains multiple commits, manual intervention needed:

- [ ] #51035 <!-- refactor GC scanning code to reflect jl_binding_t are
now first class -->
- [ ] #51092 <!-- inference: fix bad effects for recursion -->

Non-merged PRs with backport label:
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
- [ ] #51414 <!-- improvements on GC scheduler shutdown -->
- [ ] #51366 <!-- Handle infix operators in REPL completion -->
- [ ] #50919 <!-- Code loading: do the "skipping mtime check for stdlib"
check regardless of the value of `ispath(f)` -->
- [ ] #50824 <!-- Add some aliasing warnings to docstrings for mutating
functions in Base -->
- [ ] #49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
@KristofferC KristofferC removed the backport 1.10 Change should be backported to the 1.10 release label Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.9 Change should be backported to release-1.9 bugfix This change fixes an existing bug compiler:codegen Generation of LLVM IR and native code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

julia.loopinfo_marker can be attaced to wrong loop.
6 participants