Can not handle @llvm.maximum.f32(float %38, float %39) #1568

Closed
jgreener64 opened this issue Dec 4, 2023 · 4 comments

Comments

@jgreener64
Contributor

I am on Enzyme main (9d6963c) and CUDA 5.1.1. This worked on CUDA 4.4.1 but now errors:

using Enzyme, CUDA, Atomix, StaticArrays, LinearAlgebra

struct HarmonicAngle{K, D}
    k::K
    θ0::D
end

Base.zero(::HarmonicAngle{K, D}) where {K, D} = HarmonicAngle(zero(K), zero(D))

Base.:+(a1::HarmonicAngle, a2::HarmonicAngle) = HarmonicAngle(a1.k + a2.k, a1.θ0 + a2.θ0)

function f(a::HarmonicAngle, coords_i, coords_j, coords_k)
    vec_ji = coords_i - coords_j
    vec_jk = coords_k - coords_j
    θ = acos(dot(vec_ji, vec_jk) / (norm(vec_ji) * norm(vec_jk)))
    return (a.k / 2) * (θ - a.θ0) ^ 2
end

function kernel!(energy, coords_var, is_var, js_var, ks_var, inters_var)
    coords = CUDA.Const(coords_var)
    is = CUDA.Const(is_var)
    js = CUDA.Const(js_var)
    ks = CUDA.Const(ks_var)
    inters = CUDA.Const(inters_var)

    inter_i = (blockIdx().x - 1) * blockDim().x + threadIdx().x

    @inbounds if inter_i <= length(is)
        i, j, k = is[inter_i], js[inter_i], ks[inter_i]
        pe = f(inters[inter_i], coords[i], coords[j], coords[k])
        Atomix.@atomic :monotonic energy[1] += pe
    end
    return nothing
end

function grad_kernel!(energy, d_energy, coords, d_coords, is, js, ks, inters, d_inters)
    Enzyme.autodiff_deferred(
        Enzyme.Reverse,
        kernel!,
        Const,
        Duplicated(energy, d_energy),
        Duplicated(coords, d_coords),
        Const(is),
        Const(js),
        Const(ks),
        Duplicated(inters, d_inters),
    )
    return nothing
end

pe_vec = CuArray([0.0f0])
d_pe_vec = CuArray([1.0f0])
coords = CuArray([
    SVector(1.0f0, 1.0f0, 1.0f0),
    SVector(2.0f0, 2.1f0, 2.0f0),
    SVector(3.0f0, 3.2f0, 3.3f0),
    SVector(4.0f0, 4.1f0, 4.5f0),
])
d_coords = zero(coords)
is = CuArray([1, 2])
js = CuArray([2, 3])
ks = CuArray([3, 4])
inters = CuArray([
    HarmonicAngle(100.0f0, deg2rad(90.0f0)),
    HarmonicAngle(100.0f0, deg2rad(90.0f0)),
])
d_inters = CuArray([
    HarmonicAngle(0.0f0, 0.0f0),
    HarmonicAngle(0.0f0, 0.0f0),
])

CUDA.@sync @cuda threads=128 kernel!(pe_vec, coords, is, js, ks, inters) # Works

CUDA.@sync @cuda threads=128 grad_kernel!(
    pe_vec, d_pe_vec, coords, d_coords, is, js, ks, inters, d_inters) # Errors
ERROR: LoadError: InvalidIRError: compiling MethodInstance for grad_kernel!(::CuDeviceVector{Float32, 1}, ::CuDeviceVector{Float32, 1}, ::CuDeviceVector{SVector{3, Float32}, 1}, ::CuDeviceVector{SVector{3, Float32}, 1}, ::CuDeviceVector{Int64, 1}, ::CuDeviceVector{Int64, 1}, ::CuDeviceVector{Int64, 1}, ::CuDeviceVector{HarmonicAngle{Float32, Float32}, 1}, ::CuDeviceVector{HarmonicAngle{Float32, Float32}, 1}) resulted in invalid LLVM IR
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
  [1] #abs
    @ ~/.julia/dev/CUDA/src/device/intrinsics/math.jl:225
  [2] maxabs_nested
    @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:237
  [3] maxabs_nested
    @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:243
  [4] macro expansion
    @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:257
  [5] _norm_scaled
    @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:249
  [6] macro expansion
    @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:279
  [7] _norm
    @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:266
  [8] norm
    @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:265
  [9] f
    @ ~/dms/molly_dev/enzyme_err32.jl:20
 [10] multiple call sites
    @ unknown:0
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
 [1] #max
   @ ~/.julia/dev/CUDA/src/device/intrinsics/math.jl:328
 [2] maxabs_nested
   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:243
 [3] macro expansion
   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:257
 [4] _norm_scaled
   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:249
 [5] macro expansion
   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:279
 [6] _norm
   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:266
 [7] norm
   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:265
 [8] f
   @ ~/dms/molly_dev/enzyme_err32.jl:20
 [9] multiple call sites
   @ unknown:0
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/validation.jl:147
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:440 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/RsWnF/src/TimerOutput.jl:253 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:439 [inlined]
  [5] emit_llvm(job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:92
  [6] emit_llvm
    @ ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:86 [inlined]
  [7] codegen(output::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:129
  [8] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:106
  [9] compile
    @ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:98 [inlined]
 [10] #1075
    @ ~/.julia/dev/CUDA/src/compiler/compilation.jl:247 [inlined]
 [11] JuliaContext(f::CUDA.var"#1075#1077"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:47
 [12] compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/dev/CUDA/src/compiler/compilation.jl:246
 [13] actual_compilation(cache::Dict{Any, CuFunction}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/execution.jl:125
 [14] cached_compilation(cache::Dict{Any, CuFunction}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/execution.jl:103
 [15] macro expansion
    @ ~/.julia/dev/CUDA/src/compiler/execution.jl:382 [inlined]
 [16] macro expansion
    @ ./lock.jl:267 [inlined]
 [17] cufunction(f::typeof(grad_kernel!), tt::Type{Tuple{CuDeviceVector{Float32, 1}, CuDeviceVector{Float32, 1}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceVector{Int64, 1}, CuDeviceVector{Int64, 1}, CuDeviceVector{Int64, 1}, CuDeviceVector{HarmonicAngle{Float32, Float32}, 1}, CuDeviceVector{HarmonicAngle{Float32, Float32}, 1}}}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/dev/CUDA/src/compiler/execution.jl:377
 [18] cufunction(f::typeof(grad_kernel!), tt::Type{Tuple{CuDeviceVector{Float32, 1}, CuDeviceVector{Float32, 1}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceVector{Int64, 1}, CuDeviceVector{Int64, 1}, CuDeviceVector{Int64, 1}, CuDeviceVector{HarmonicAngle{Float32, Float32}, 1}, CuDeviceVector{HarmonicAngle{Float32, Float32}, 1}}})
    @ CUDA ~/.julia/dev/CUDA/src/compiler/execution.jl:374
 [19] macro expansion
    @ ~/.julia/dev/CUDA/src/compiler/execution.jl:104 [inlined]
 [20] top-level scope
    @ ~/.julia/dev/CUDA/src/utilities.jl:35

Attached are the printall_error.txt and the device_code.zip.
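
Both stack traces above go through `norm` into StaticArrays' `_norm_scaled` and `maxabs_nested`, which is where `abs` and `max` are called. A possible workaround, sketched here as an assumption and not tested against this issue, is to compute the norm directly so that no `max` call is emitted. The names `simple_norm` and `f_simple` are hypothetical and are not part of the original reproducer. The sketch uses plain vectors and separate `k` and `θ0` arguments so that it runs on the CPU with only the standard library.

```julia
using LinearAlgebra

# Hypothetical helper (an assumption, not from the report): Euclidean norm
# without the rescaling fallback, so the code contains no call to `max`.
# Note that this gives up the overflow/underflow protection of `norm`.
simple_norm(v) = sqrt(sum(abs2, v))

# Same harmonic angle energy as `f` in the reproducer, written against
# `simple_norm` instead of `norm`.
function f_simple(k, θ0, coords_i, coords_j, coords_k)
    vec_ji = coords_i .- coords_j
    vec_jk = coords_k .- coords_j
    θ = acos(dot(vec_ji, vec_jk) / (simple_norm(vec_ji) * simple_norm(vec_jk)))
    return (k / 2) * (θ - θ0) ^ 2
end
```

Whether swapping this into the kernel avoids the `llvm.maximum.f32` failure would still need to be checked on the GPU with Enzyme.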

@vchuravy
Member

vchuravy commented Dec 4, 2023

@0 = private unnamed_addr constant [26221 x i8] c"Enzyme compilation failed.\0ACurrent scope: \0A; Function Attrs: mustprogress willreturn\0Adefine internal fastcc float @preprocess_julia_f_6109([2 x float] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(8) %0, [1 x [3 x float]] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) %1, [1 x [3 x float]] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) %2, [1 x [3 x float]] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) %3) unnamed_addr EnzymeAD/Enzyme.jl#10 !dbg !413 {\0Atop:\0A  %4 = call {}*** @julia.get_pgcstack() EnzymeAD/Enzyme.jl#11\0A  %5 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %1, i64 0, i64 0, i64 0, !dbg !414\0A  %6 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %2, i64 0, i64 0, i64 0, !dbg !414\0A  %7 = load float, float addrspace(11)* %5, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %8 = load float, float addrspace(11)* %6, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %9 = fsub float %7, %8, !dbg !421\0A  %10 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %1, i64 0, i64 0, i64 1, !dbg !414\0A  %11 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %2, i64 0, i64 0, i64 1, !dbg !414\0A  %12 = load float, float addrspace(11)* %10, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %13 = load float, float addrspace(11)* %11, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %14 = fsub float %12, %13, !dbg !421\0A  %15 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %1, i64 0, i64 0, i64 2, !dbg !414\0A  %16 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x 
float]] addrspace(11)* %2, i64 0, i64 0, i64 2, !dbg !414\0A  %17 = load float, float addrspace(11)* %15, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %18 = load float, float addrspace(11)* %16, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %19 = fsub float %17, %18, !dbg !421\0A  %20 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %3, i64 0, i64 0, i64 0, !dbg !422\0A  %21 = load float, float addrspace(11)* %20, align 4, !dbg !429, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %22 = fsub float %21, %8, !dbg !429\0A  %23 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %3, i64 0, i64 0, i64 1, !dbg !422\0A  %24 = load float, float addrspace(11)* %23, align 4, !dbg !429, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %25 = fsub float %24, %13, !dbg !429\0A  %26 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %3, i64 0, i64 0, i64 2, !dbg !422\0A  %27 = load float, float addrspace(11)* %26, align 4, !dbg !429, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %28 = fsub float %27, %18, !dbg !429\0A  %29 = fmul float %9, %9, !dbg !430\0A  %30 = fmul float %14, %14, !dbg !430\0A  %31 = fadd float %29, %30, !dbg !437\0A  %32 = fmul float %19, %19, !dbg !430\0A  %33 = fadd float %31, %32, !dbg !437\0A  %34 = call float @__nv_sqrtf(float %33) EnzymeAD/Enzyme.jl#11, !dbg !438\0A  %35 = fcmp ule float %34, 0.000000e+00, !dbg !439\0A  br i1 %35, label %L85, label %L78, !dbg !440\0A\0AL78:                                              ; preds = %top\0A  %36 = call i32 @__nv_finitef(float %34) EnzymeAD/Enzyme.jl#11, !dbg !441\0A  %37 = icmp eq i32 %36, 0, !dbg !442\0A  br i1 %37, label %L85, label %L170, !dbg !440\0A\0AL85:                                              ; preds = %L78, %top\0A  %38 = call float @__nv_fabsf(float %9) EnzymeAD/Enzyme.jl#11, 
!dbg !445\0A  %39 = call float @__nv_fabsf(float %14) EnzymeAD/Enzyme.jl#11, !dbg !451\0A  %40 = call float @llvm.maximum.f32(float %38, float %39) EnzymeAD/Enzyme.jl#11, !dbg !454\0A  %41 = call float @__nv_fabsf(float %19) EnzymeAD/Enzyme.jl#11, !dbg !451\0A  %42 = call float @llvm.maximum.f32(float %40, float %41) EnzymeAD/Enzyme.jl#11, !dbg !454\0A  %43 = call i32 @__nv_finitef(float %42) EnzymeAD/Enzyme.jl#11, !dbg !455\0A  %.not37 = icmp eq i32 %43, 0, !dbg !457\0A  br i1 %.not37, label %L170, label %L151, !dbg !456\0A\0AL151:                                             ; preds = %L85\0A  %44 = fcmp une float %42, 0.000000e+00, !dbg !460\0A  br i1 %44, label %L155, label %L153, !dbg !462\0A\0AL153:                                             ; preds = %L151\0A  %45 = call float @__nv_fabsf(float noundef 0.000000e+00) EnzymeAD/Enzyme.jl#11, !dbg !463\0A  br label %L170, !dbg !467\0A\0AL155:                                             ; preds = %L151\0A  %46 = fdiv float %9, %42, !dbg !469\0A  %47 = fmul float %46, %46, !dbg !471\0A  %48 = fdiv float %14, %42, !dbg !469\0A  %49 = fmul float %48, %48, !dbg !471\0A  %50 = fadd float %47, %49, !dbg !474\0A  %51 = fdiv float %19, %42, !dbg !469\0A  %52 = fmul float %51, %51, !dbg !471\0A  %53 = fadd float %52, %50, !dbg !474\0A  %54 = call float @__nv_sqrtf(float %53) EnzymeAD/Enzyme.jl#11, !dbg !475\0A  %55 = fmul float %42, %54, !dbg !476\0A  br label %L170, !dbg !467\0A\0AL170:                                             ; preds = %L155, %L153, %L85, %L78\0A  %value_phi5 = phi float [ %34, %L78 ], [ %45, %L153 ], [ %55, %L155 ], [ %42, %L85 ]\0A  %56 = fmul float %22, %22, !dbg !430\0A  %57 = fmul float %25, %25, !dbg !430\0A  %58 = fadd float %56, %57, !dbg !437\0A  %59 = fmul float %28, %28, !dbg !430\0A  %60 = fadd float %58, %59, !dbg !437\0A  %61 = call float @__nv_sqrtf(float %60) EnzymeAD/Enzyme.jl#11, !dbg !438\0A  %62 = fcmp ule float %61, 0.000000e+00, !dbg !439\0A  br i1 %62, label %L185, label %L178, 
!dbg !440\0A\0AL178:                                             ; preds = %L170\0A  %63 = call i32 @__nv_finitef(float %61) EnzymeAD/Enzyme.jl#11, !dbg !441\0A  %64 = icmp eq i32 %63, 0, !dbg !442\0A  br i1 %64, label %L185, label %L270, !dbg !440\0A\0AL185:                                             ; preds = %L178, %L170\0A  %65 = call float @__nv_fabsf(float %22) EnzymeAD/Enzyme.jl#11, !dbg !445\0A  %66 = call float @__nv_fabsf(float %25) EnzymeAD/Enzyme.jl#11, !dbg !451\0A  %67 = call float @llvm.maximum.f32(float %65, float %66) EnzymeAD/Enzyme.jl#11, !dbg !454\0A  %68 = call float @__nv_fabsf(float %28) EnzymeAD/Enzyme.jl#11, !dbg !451\0A  %69 = call float @llvm.maximum.f32(float %67, float %68) EnzymeAD/Enzyme.jl#11, !dbg !454\0A  %70 = call i32 @__nv_finitef(float %69) EnzymeAD/Enzyme.jl#11, !dbg !455\0A  %.not35 = icmp eq i32 %70, 0, !dbg !457\0A  br i1 %.not35, label %L270, label %L251, !dbg !456\0A\0AL251:                                             ; preds = %L185\0A  %71 = fcmp une float %69, 0.000000e+00, !dbg !460\0A  br i1 %71, label %L255, label %L253, !dbg !462\0A\0AL253:                                             ; preds = %L251\0A  %72 = call float @__nv_fabsf(float noundef 0.000000e+00) EnzymeAD/Enzyme.jl#11, !dbg !463\0A  br label %L270, !dbg !467\0A\0AL255:                                             ; preds = %L251\0A  %73 = fdiv float %22, %69, !dbg !469\0A  %74 = fmul float %73, %73, !dbg !471\0A  %75 = fdiv float %25, %69, !dbg !469\0A  %76 = fmul float %75, %75, !dbg !471\0A  %77 = fadd float %74, %76, !dbg !474\0A  %78 = fdiv float %28, %69, !dbg !469\0A  %79 = fmul float %78, %78, !dbg !471\0A  %80 = fadd float %79, %77, !dbg !474\0A  %81 = call float @__nv_sqrtf(float %80) EnzymeAD/Enzyme.jl#11, !dbg !475\0A  %82 = fmul float %69, %81, !dbg !476\0A  br label %L270, !dbg !467\0A\0AL270:                                             ; preds = %L255, %L253, %L185, %L178\0A  %value_phi6 = phi float [ %61, %L178 ], [ %72, %L253 ], [ %82, 
%L255 ], [ %69, %L185 ]\0A  %83 = fmul float %9, %22, !dbg !477\0A  %84 = fmul float %14, %25, !dbg !477\0A  %85 = fadd fast float %84, %83, !dbg !483\0A  %86 = fmul float %19, %28, !dbg !477\0A  %87 = fadd fast float %85, %86, !dbg !483\0A  %88 = fmul float %value_phi5, %value_phi6, !dbg !484\0A  %89 = fdiv float %87, %88, !dbg !485\0A  %90 = call float @__nv_acosf(float %89) EnzymeAD/Enzyme.jl#12, !dbg !486\0A  %91 = getelementptr inbounds [2 x float], [2 x float] addrspace(11)* %0, i64 0, i64 0, !dbg !487\0A  %92 = load float, float addrspace(11)* %91, align 4, !dbg !489, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %93 = fmul float %92, 5.000000e-01, !dbg !489\0A  %94 = getelementptr inbounds [2 x float], [2 x float] addrspace(11)* %0, i64 0, i64 1, !dbg !487\0A  %95 = load float, float addrspace(11)* %94, align 4, !dbg !491, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %96 = fsub float %90, %95, !dbg !491\0A  %97 = fmul float %96, %96, !dbg !492\0A  %98 = fmul float %93, %97, !dbg !494\0A  ret float %98, !dbg !488\0A}\0A\0A; Function Attrs: mustprogress willreturn\0Adefine internal fastcc float @preprocess_julia_f_6109([2 x float] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(8) %0, [1 x [3 x float]] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) %1, [1 x [3 x float]] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) %2, [1 x [3 x float]] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) %3) unnamed_addr EnzymeAD/Enzyme.jl#10 !dbg !413 {\0Atop:\0A  %4 = call {}*** @julia.get_pgcstack() EnzymeAD/Enzyme.jl#11\0A  %5 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %1, i64 0, i64 0, i64 0, !dbg !414\0A  %6 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %2, i64 0, i64 0, i64 0, !dbg !414\0A  %7 = load float, float 
addrspace(11)* %5, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %8 = load float, float addrspace(11)* %6, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %9 = fsub float %7, %8, !dbg !421\0A  %10 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %1, i64 0, i64 0, i64 1, !dbg !414\0A  %11 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %2, i64 0, i64 0, i64 1, !dbg !414\0A  %12 = load float, float addrspace(11)* %10, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %13 = load float, float addrspace(11)* %11, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %14 = fsub float %12, %13, !dbg !421\0A  %15 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %1, i64 0, i64 0, i64 2, !dbg !414\0A  %16 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %2, i64 0, i64 0, i64 2, !dbg !414\0A  %17 = load float, float addrspace(11)* %15, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %18 = load float, float addrspace(11)* %16, align 4, !dbg !421, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %19 = fsub float %17, %18, !dbg !421\0A  %20 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %3, i64 0, i64 0, i64 0, !dbg !422\0A  %21 = load float, float addrspace(11)* %20, align 4, !dbg !429, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %22 = fsub float %21, %8, !dbg !429\0A  %23 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %3, i64 0, i64 0, i64 1, !dbg !422\0A  %24 = load float, float addrspace(11)* %23, align 4, !dbg !429, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %25 = fsub float %24, %13, !dbg !429\0A  %26 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x 
float]] addrspace(11)* %3, i64 0, i64 0, i64 2, !dbg !422\0A  %27 = load float, float addrspace(11)* %26, align 4, !dbg !429, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %28 = fsub float %27, %18, !dbg !429\0A  %29 = fmul float %9, %9, !dbg !430\0A  %30 = fmul float %14, %14, !dbg !430\0A  %31 = fadd float %29, %30, !dbg !437\0A  %32 = fmul float %19, %19, !dbg !430\0A  %33 = fadd float %31, %32, !dbg !437\0A  %34 = call float @__nv_sqrtf(float %33) EnzymeAD/Enzyme.jl#11, !dbg !438\0A  %35 = fcmp ule float %34, 0.000000e+00, !dbg !439\0A  br i1 %35, label %L85, label %L78, !dbg !440\0A\0AL78:                                              ; preds = %top\0A  %36 = call i32 @__nv_finitef(float %34) EnzymeAD/Enzyme.jl#11, !dbg !441\0A  %37 = icmp eq i32 %36, 0, !dbg !442\0A  br i1 %37, label %L85, label %L170, !dbg !440\0A\0AL85:                                              ; preds = %L78, %top\0A  %38 = call float @__nv_fabsf(float %9) EnzymeAD/Enzyme.jl#11, !dbg !445\0A  %39 = call float @__nv_fabsf(float %14) EnzymeAD/Enzyme.jl#11, !dbg !451\0A  %40 = call float @llvm.maximum.f32(float %38, float %39) EnzymeAD/Enzyme.jl#11, !dbg !454\0A  %41 = call float @__nv_fabsf(float %19) EnzymeAD/Enzyme.jl#11, !dbg !451\0A  %42 = call float @llvm.maximum.f32(float %40, float %41) EnzymeAD/Enzyme.jl#11, !dbg !454\0A  %43 = call i32 @__nv_finitef(float %42) EnzymeAD/Enzyme.jl#11, !dbg !455\0A  %.not37 = icmp eq i32 %43, 0, !dbg !457\0A  br i1 %.not37, label %L170, label %L151, !dbg !456\0A\0AL151:                                             ; preds = %L85\0A  %44 = fcmp une float %42, 0.000000e+00, !dbg !460\0A  br i1 %44, label %L155, label %L153, !dbg !462\0A\0AL153:                                             ; preds = %L151\0A  %45 = call float @__nv_fabsf(float noundef 0.000000e+00) EnzymeAD/Enzyme.jl#11, !dbg !463\0A  br label %L170, !dbg !467\0A\0AL155:                                             ; preds = %L151\0A  %46 = fdiv float %9, %42, !dbg 
!469\0A  %47 = fmul float %46, %46, !dbg !471\0A  %48 = fdiv float %14, %42, !dbg !469\0A  %49 = fmul float %48, %48, !dbg !471\0A  %50 = fadd float %47, %49, !dbg !474\0A  %51 = fdiv float %19, %42, !dbg !469\0A  %52 = fmul float %51, %51, !dbg !471\0A  %53 = fadd float %52, %50, !dbg !474\0A  %54 = call float @__nv_sqrtf(float %53) EnzymeAD/Enzyme.jl#11, !dbg !475\0A  %55 = fmul float %42, %54, !dbg !476\0A  br label %L170, !dbg !467\0A\0AL170:                                             ; preds = %L155, %L153, %L85, %L78\0A  %value_phi5 = phi float [ %34, %L78 ], [ %45, %L153 ], [ %55, %L155 ], [ %42, %L85 ]\0A  %56 = fmul float %22, %22, !dbg !430\0A  %57 = fmul float %25, %25, !dbg !430\0A  %58 = fadd float %56, %57, !dbg !437\0A  %59 = fmul float %28, %28, !dbg !430\0A  %60 = fadd float %58, %59, !dbg !437\0A  %61 = call float @__nv_sqrtf(float %60) EnzymeAD/Enzyme.jl#11, !dbg !438\0A  %62 = fcmp ule float %61, 0.000000e+00, !dbg !439\0A  br i1 %62, label %L185, label %L178, !dbg !440\0A\0AL178:                                             ; preds = %L170\0A  %63 = call i32 @__nv_finitef(float %61) EnzymeAD/Enzyme.jl#11, !dbg !441\0A  %64 = icmp eq i32 %63, 0, !dbg !442\0A  br i1 %64, label %L185, label %L270, !dbg !440\0A\0AL185:                                             ; preds = %L178, %L170\0A  %65 = call float @__nv_fabsf(float %22) EnzymeAD/Enzyme.jl#11, !dbg !445\0A  %66 = call float @__nv_fabsf(float %25) EnzymeAD/Enzyme.jl#11, !dbg !451\0A  %67 = call float @llvm.maximum.f32(float %65, float %66) EnzymeAD/Enzyme.jl#11, !dbg !454\0A  %68 = call float @__nv_fabsf(float %28) EnzymeAD/Enzyme.jl#11, !dbg !451\0A  %69 = call float @llvm.maximum.f32(float %67, float %68) EnzymeAD/Enzyme.jl#11, !dbg !454\0A  %70 = call i32 @__nv_finitef(float %69) EnzymeAD/Enzyme.jl#11, !dbg !455\0A  %.not35 = icmp eq i32 %70, 0, !dbg !457\0A  br i1 %.not35, label %L270, label %L251, !dbg !456\0A\0AL251:                                             ; preds = %L185\0A  %71 = 
fcmp une float %69, 0.000000e+00, !dbg !460\0A  br i1 %71, label %L255, label %L253, !dbg !462\0A\0AL253:                                             ; preds = %L251\0A  %72 = call float @__nv_fabsf(float noundef 0.000000e+00) EnzymeAD/Enzyme.jl#11, !dbg !463\0A  br label %L270, !dbg !467\0A\0AL255:                                             ; preds = %L251\0A  %73 = fdiv float %22, %69, !dbg !469\0A  %74 = fmul float %73, %73, !dbg !471\0A  %75 = fdiv float %25, %69, !dbg !469\0A  %76 = fmul float %75, %75, !dbg !471\0A  %77 = fadd float %74, %76, !dbg !474\0A  %78 = fdiv float %28, %69, !dbg !469\0A  %79 = fmul float %78, %78, !dbg !471\0A  %80 = fadd float %79, %77, !dbg !474\0A  %81 = call float @__nv_sqrtf(float %80) EnzymeAD/Enzyme.jl#11, !dbg !475\0A  %82 = fmul float %69, %81, !dbg !476\0A  br label %L270, !dbg !467\0A\0AL270:                                             ; preds = %L255, %L253, %L185, %L178\0A  %value_phi6 = phi float [ %61, %L178 ], [ %72, %L253 ], [ %82, %L255 ], [ %69, %L185 ]\0A  %83 = fmul float %9, %22, !dbg !477\0A  %84 = fmul float %14, %25, !dbg !477\0A  %85 = fadd fast float %84, %83, !dbg !483\0A  %86 = fmul float %19, %28, !dbg !477\0A  %87 = fadd fast float %85, %86, !dbg !483\0A  %88 = fmul float %value_phi5, %value_phi6, !dbg !484\0A  %89 = fdiv float %87, %88, !dbg !485\0A  %90 = call float @__nv_acosf(float %89) EnzymeAD/Enzyme.jl#12, !dbg !486\0A  %91 = getelementptr inbounds [2 x float], [2 x float] addrspace(11)* %0, i64 0, i64 0, !dbg !487\0A  %92 = load float, float addrspace(11)* %91, align 4, !dbg !489, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %93 = fmul float %92, 5.000000e-01, !dbg !489\0A  %94 = getelementptr inbounds [2 x float], [2 x float] addrspace(11)* %0, i64 0, i64 1, !dbg !487\0A  %95 = load float, float addrspace(11)* %94, align 4, !dbg !491, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %96 = fsub float %90, %95, !dbg !491\0A  %97 = fmul float %96, %96, !dbg 
!492\0A  %98 = fmul float %93, %97, !dbg !494\0A  ret float %98, !dbg !488\0A}\0A\0A; Function Attrs: mustprogress willreturn\0Adefine internal fastcc { {} addrspace(10)*, float } @fakeaugmented_julia_f_6109([2 x float] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(8) %0, [2 x float] addrspace(11)* nocapture nofree align 4 %\22'\22, [1 x [3 x float]] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) %1, [1 x [3 x float]] addrspace(11)* nocapture nofree align 4 %\22'1\22, [1 x [3 x float]] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) %2, [1 x [3 x float]] addrspace(11)* nocapture nofree align 4 %\22'2\22, [1 x [3 x float]] addrspace(11)* nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) %3, [1 x [3 x float]] addrspace(11)* nocapture nofree align 4 %\22'3\22) unnamed_addr EnzymeAD/Enzyme.jl#10 !dbg !495 {\0Atop:\0A  %4 = call {}*** @julia.get_pgcstack() EnzymeAD/Enzyme.jl#11\0A  %5 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %1, i64 0, i64 0, i64 0, !dbg !496\0A  %6 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %2, i64 0, i64 0, i64 0, !dbg !496\0A  %7 = load float, float addrspace(11)* %5, align 4, !dbg !503, !tbaa !34, !invariant.load !13, !alias.scope !504, !noalias !507\0A  %8 = load float, float addrspace(11)* %6, align 4, !dbg !503, !tbaa !34, !invariant.load !13, !alias.scope !509, !noalias !512\0A  %9 = fsub float %7, %8, !dbg !503\0A  %10 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %1, i64 0, i64 0, i64 1, !dbg !496\0A  %11 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %2, i64 0, i64 0, i64 1, !dbg !496\0A  %12 = load float, float addrspace(11)* %10, align 4, !dbg !503, !tbaa !34, !invariant.load !13, !alias.scope !504, !noalias !507\0A  %13 = load float, float addrspace(11)* %11, align 4, !dbg !503, 
!tbaa !34, !invariant.load !13, !alias.scope !509, !noalias !512\0A  %14 = fsub float %12, %13, !dbg !503\0A  %15 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %1, i64 0, i64 0, i64 2, !dbg !496\0A  %16 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %2, i64 0, i64 0, i64 2, !dbg !496\0A  %17 = load float, float addrspace(11)* %15, align 4, !dbg !503, !tbaa !34, !invariant.load !13, !alias.scope !504, !noalias !507\0A  %18 = load float, float addrspace(11)* %16, align 4, !dbg !503, !tbaa !34, !invariant.load !13, !alias.scope !509, !noalias !512\0A  %19 = fsub float %17, %18, !dbg !503\0A  %20 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %3, i64 0, i64 0, i64 0, !dbg !514\0A  %21 = load float, float addrspace(11)* %20, align 4, !dbg !521, !tbaa !34, !invariant.load !13, !alias.scope !522, !noalias !525\0A  %22 = fsub float %21, %8, !dbg !521\0A  %23 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %3, i64 0, i64 0, i64 1, !dbg !514\0A  %24 = load float, float addrspace(11)* %23, align 4, !dbg !521, !tbaa !34, !invariant.load !13, !alias.scope !522, !noalias !525\0A  %25 = fsub float %24, %13, !dbg !521\0A  %26 = getelementptr inbounds [1 x [3 x float]], [1 x [3 x float]] addrspace(11)* %3, i64 0, i64 0, i64 2, !dbg !514\0A  %27 = load float, float addrspace(11)* %26, align 4, !dbg !521, !tbaa !34, !invariant.load !13, !alias.scope !522, !noalias !525\0A  %28 = fsub float %27, %18, !dbg !521\0A  %29 = fmul float %9, %9, !dbg !527\0A  %30 = fmul float %14, %14, !dbg !527\0A  %31 = fadd float %29, %30, !dbg !534\0A  %32 = fmul float %19, %19, !dbg !527\0A  %33 = fadd float %31, %32, !dbg !534\0A  %34 = call float @__nv_sqrtf(float %33) EnzymeAD/Enzyme.jl#11, !dbg !535\0A  %35 = fcmp ule float %34, 0.000000e+00, !dbg !536\0A  br i1 %35, label %L85, label %L78, !dbg !537\0A\0AL78:                                              ; preds = %top\0A  %36 = 
call i32 @__nv_finitef(float %34) EnzymeAD/Enzyme.jl#11, !dbg !538\0A  %37 = icmp eq i32 %36, 0, !dbg !539\0A  br i1 %37, label %L85, label %L170, !dbg !537\0A\0AL85:                                              ; preds = %L78, %top\0A  %38 = call float @__nv_fabsf(float %9) EnzymeAD/Enzyme.jl#11, !dbg !542\0A  %39 = call float @__nv_fabsf(float %14) EnzymeAD/Enzyme.jl#11, !dbg !548\0A  %40 = call float @llvm.maximum.f32(float %38, float %39) EnzymeAD/Enzyme.jl#11, !dbg !551\0A  %41 = call float @__nv_fabsf(float %19) EnzymeAD/Enzyme.jl#11, !dbg !548\0A  %42 = call float @llvm.maximum.f32(float %40, float %41) EnzymeAD/Enzyme.jl#11, !dbg !551\0A  call void inttoptr (i64 139987984951216 to void (i8*)*)(i8* getelementptr inbounds ([26073 x i8], [26073 x i8]* @0, i32 0, i32 0)) EnzymeAD/Enzyme.jl#12, !dbg !552\0A  %43 = call i32 @__nv_finitef(float %42) EnzymeAD/Enzyme.jl#11, !dbg !552\0A  %.not37 = icmp eq i32 %43, 0, !dbg !554\0A  br i1 %.not37, label %L170, label %L151, !dbg !553\0A\0AL151:                                             ; preds = %L85\0A  %44 = fcmp une float %42, 0.000000e+00, !dbg !557\0A  br i1 %44, label %L155, label %L153, !dbg !559\0A\0AL153:                                             ; preds = %L151\0A  %45 = call float @__nv_fabsf(float noundef 0.000000e+00) EnzymeAD/Enzyme.jl#11, !dbg !560\0A  br label %L170, !dbg !564\0A\0AL155:                                             ; preds = %L151\0A  %46 = fdiv float %9, %42, !dbg !566\0A  %47 = fmul float %46, %46, !dbg !568\0A  %48 = fdiv float %14, %42, !dbg !566\0A  %49 = fmul float %48, %48, !dbg !568\0A  %50 = fadd float %47, %49, !dbg !571\0A  %51 = fdiv float %19, %42, !dbg !566\0A  %52 = fmul float %51, %51, !dbg !568\0A  %53 = fadd float %52, %50, !dbg !571\0A  %54 = call float @__nv_sqrtf(float %53) EnzymeAD/Enzyme.jl#11, !dbg !572\0A  %55 = fmul float %42, %54, !dbg !573\0A  br label %L170, !dbg !564\0A\0AL170:                                             ; preds = %L155, %L153, %L85, 
%L78\0A  %value_phi5 = phi float [ %34, %L78 ], [ %45, %L153 ], [ %55, %L155 ], [ %42, %L85 ]\0A  %56 = fmul float %22, %22, !dbg !527\0A  %57 = fmul float %25, %25, !dbg !527\0A  %58 = fadd float %56, %57, !dbg !534\0A  %59 = fmul float %28, %28, !dbg !527\0A  %60 = fadd float %58, %59, !dbg !534\0A  %61 = call float @__nv_sqrtf(float %60) EnzymeAD/Enzyme.jl#11, !dbg !535\0A  %62 = fcmp ule float %61, 0.000000e+00, !dbg !536\0A  br i1 %62, label %L185, label %L178, !dbg !537\0A\0AL178:                                             ; preds = %L170\0A  %63 = call i32 @__nv_finitef(float %61) EnzymeAD/Enzyme.jl#11, !dbg !538\0A  %64 = icmp eq i32 %63, 0, !dbg !539\0A  br i1 %64, label %L185, label %L270, !dbg !537\0A\0AL185:                                             ; preds = %L178, %L170\0A  %65 = call float @__nv_fabsf(float %22) EnzymeAD/Enzyme.jl#11, !dbg !542\0A  %66 = call float @__nv_fabsf(float %25) EnzymeAD/Enzyme.jl#11, !dbg !548\0A  %67 = call float @llvm.maximum.f32(float %65, float %66) EnzymeAD/Enzyme.jl#11, !dbg !551\0A  %68 = call float @__nv_fabsf(float %28) EnzymeAD/Enzyme.jl#11, !dbg !548\0A  %69 = call float @llvm.maximum.f32(float %67, float %68) EnzymeAD/Enzyme.jl#11, !dbg !551\0A  %70 = call i32 @__nv_finitef(float %69) EnzymeAD/Enzyme.jl#11, !dbg !552\0A  %.not35 = icmp eq i32 %70, 0, !dbg !554\0A  br i1 %.not35, label %L270, label %L251, !dbg !553\0A\0AL251:                                             ; preds = %L185\0A  %71 = fcmp une float %69, 0.000000e+00, !dbg !557\0A  br i1 %71, label %L255, label %L253, !dbg !559\0A\0AL253:                                             ; preds = %L251\0A  %72 = call float @__nv_fabsf(float noundef 0.000000e+00) EnzymeAD/Enzyme.jl#11, !dbg !560\0A  br label %L270, !dbg !564\0A\0AL255:                                             ; preds = %L251\0A  %73 = fdiv float %22, %69, !dbg !566\0A  %74 = fmul float %73, %73, !dbg !568\0A  %75 = fdiv float %25, %69, !dbg !566\0A  %76 = fmul float %75, %75, !dbg 
!568\0A  %77 = fadd float %74, %76, !dbg !571\0A  %78 = fdiv float %28, %69, !dbg !566\0A  %79 = fmul float %78, %78, !dbg !568\0A  %80 = fadd float %79, %77, !dbg !571\0A  %81 = call float @__nv_sqrtf(float %80) EnzymeAD/Enzyme.jl#11, !dbg !572\0A  %82 = fmul float %69, %81, !dbg !573\0A  br label %L270, !dbg !564\0A\0AL270:                                             ; preds = %L255, %L253, %L185, %L178\0A  %value_phi6 = phi float [ %61, %L178 ], [ %72, %L253 ], [ %82, %L255 ], [ %69, %L185 ]\0A  %83 = fmul float %9, %22, !dbg !574\0A  %84 = fmul float %14, %25, !dbg !574\0A  %85 = fadd fast float %84, %83, !dbg !580\0A  %86 = fmul float %19, %28, !dbg !574\0A  %87 = fadd fast float %85, %86, !dbg !580\0A  %88 = fmul float %value_phi5, %value_phi6, !dbg !581\0A  %89 = fdiv float %87, %88, !dbg !582\0A  %90 = call float @__nv_acosf(float %89) EnzymeAD/Enzyme.jl#13, !dbg !583\0A  %91 = getelementptr inbounds [2 x float], [2 x float] addrspace(11)* %0, i64 0, i64 0, !dbg !584\0A  %92 = load float, float addrspace(11)* %91, align 4, !dbg !586, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %93 = fmul float %92, 5.000000e-01, !dbg !586\0A  %94 = getelementptr inbounds [2 x float], [2 x float] addrspace(11)* %0, i64 0, i64 1, !dbg !584\0A  %95 = load float, float addrspace(11)* %94, align 4, !dbg !588, !tbaa !34, !invariant.load !13, !alias.scope !38, !noalias !41\0A  %96 = fsub float %90, %95, !dbg !588\0A  %97 = fmul float %96, %96, !dbg !589\0A  %98 = fmul float %93, %97, !dbg !591\0A  %99 = insertvalue { {} addrspace(10)*, float } undef, float %98, 1, !dbg !585\0A  ret { {} addrspace(10)*, float } %99, !dbg !585\0A\0AallocsForInversion:                               ; No predecessors!\0A}\0A\0Acannot handle (augmented) unknown intrinsic\0A  %40 = call float @llvm.maximum.f32(float %38, float %39) EnzymeAD/Enzyme.jl#11, !dbg !98\0A\0AStacktrace:\0A [1] #max\0A   @ ~/.julia/dev/CUDA/src/device/intrinsics/math.jl:328\0A [2] maxabs_nested\0A   @ 
~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:243\0A [3] macro expansion\0A   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:257\0A [4] _norm_scaled\0A   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:249\0A [5] macro expansion\0A   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:279\0A [6] _norm\0A   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:266\0A [7] norm\0A   @ ~/.julia/packages/StaticArrays/yXGNL/src/linalg.jl:265\0A [8] f\0A   @ ~/dms/molly_dev/enzyme_err32.jl:20\0A\00", align 1

E.g. if you look into the grad_kernel.ll, Enzyme will tell you what's going wrong; we just fail at the printing.
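To read the message embedded in the dumped IR, one option is to decode the string escapes by hand. The following is a minimal sketch, assuming the string uses LLVM's two-hex-digit escapes (`\0A` for newline, `\00` for the terminating NUL). The helper name `unescape_llvm` is hypothetical and is not part of Enzyme or CUDA.jl, and the `escaped` value is a shortened excerpt of the string above.

```julia
# Hypothetical helper: decode LLVM IR string escapes of the form \XX
# (backslash followed by two hex digits), e.g. \0A becomes a newline.
function unescape_llvm(s::AbstractString)
    return replace(s, r"\\[0-9A-Fa-f]{2}" => m -> Char(parse(UInt8, m[2:3]; base=16)))
end

# Shortened excerpt of the escaped error string from the dumped IR.
escaped = "cannot handle (augmented) unknown intrinsic\\0A  %40 = call float @llvm.maximum.f32(float %38, float %39)\\0A\\00"

# Strip the trailing NUL terminator before printing the readable message.
println(rstrip(unescape_llvm(escaped), '\0'))
```

Applied to the full string constant, this should give the error text and the stack trace on separate lines rather than as one long escaped line.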

@vchuravy
Member

vchuravy commented Dec 4, 2023

E.g. cannot handle (augmented) unknown intrinsic\0A %40 = call float @llvm.maximum.f32(float %38, float %39) EnzymeAD/Enzyme.jl#11, !dbg !98\

@vchuravy vchuravy changed the title Float32 norm error on GPU Can not handle @llvm.maximum.f32(float %38, float %39) Dec 4, 2023
@vchuravy vchuravy transferred this issue from EnzymeAD/Enzyme.jl Dec 4, 2023
@wsmoses wsmoses closed this as completed Dec 13, 2023
@vchuravy
Member

Which PR fixed this?

@wsmoses
Member

wsmoses commented Dec 13, 2023

#1566

MilesCranmer pushed a commit to MilesCranmer/Enzyme that referenced this issue Jul 24, 2024