crash with Julia 1.8.4 under Windows 11 #95

Closed
mlesnoff opened this issue Jan 3, 2023 · 15 comments

@mlesnoff

mlesnoff commented Jan 3, 2023

I am using LIBSVM.jl v0.8.0, under Windows 11.

It works fine with Julia 1.8.3:

julia> versioninfo()
Julia Version 1.8.3
Commit 0434deb161 (2022-11-14 20:14 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 16 × Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
Threads: 8 on 16 virtual cores
Environment:
JULIA_EDITOR = code
JULIA_NUM_THREADS = 8

But when I use Julia 1.8.4:

julia> versioninfo()
Julia Version 1.8.4
Commit 00177ebc4f (2022-12-23 21:32 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 16 × Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
Threads: 8 on 16 virtual cores
Environment:
JULIA_EDITOR = code
JULIA_NUM_THREADS = 8

running any function kills my Julia session, for instance after doing:

using LIBSVM
(X, y) = (randn(100,4), randn(100))

and then running the command below:

svmtrain(X', y)

kills the process and closes Julia immediately.

Has anybody observed the same problem, and does anyone know what is happening?

@barucden
Member

barucden commented Jan 4, 2023

Do you see any error?

@mlesnoff
Author

mlesnoff commented Jan 4, 2023

No. When I run Julia directly (without an IDE), no error is printed; the Julia application just closes and disappears.
When I use VS Code, the terminal closes (without printing an error), and a VS Code alert says: "The terminal process terminated with exit code: 3221226356".

I observe the same problem with the package XGBoost.jl. Both packages are wrappers for external libraries; maybe this is related.
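
For reference, the decimal exit code reported by VS Code is easier to look up once it is written in hexadecimal, which is how Windows status codes are usually listed. A minimal sketch (the constant name mirrors the Windows `STATUS_HEAP_CORRUPTION` status; the variable names are only illustrative):

```julia
# Windows NTSTATUS value for heap corruption (UInt32 literal).
const STATUS_HEAP_CORRUPTION = 0xC0000374

# Exit code shown in the VS Code alert.
exitcode = 3221226356

# Render the code in hexadecimal so it can be compared with the NTSTATUS tables.
hexcode = "0x" * uppercase(string(exitcode, base = 16))
println(hexcode)

# Compare against the heap-corruption status value.
is_heap_corruption = UInt32(exitcode) == STATUS_HEAP_CORRUPTION
println(is_heap_corruption)
```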

@barucden
Member

barucden commented Jan 4, 2023

Okay. The exit code should mean heap corruption and since I have no idea how to debug that... can you try running the following code in a fresh Julia session? It is basically the body of svmtrain. Maybe we will be able to figure out which line causes the crash.

using LIBSVM
(X, y) = (randn(100,4), randn(100))
X = X'
LIBSVM.set_num_threads(1)
degree = 3
_svmtype = 0
_kernel = Int32(Kernel.RadialBasis)
LIBSVM.check_train_input(X, y, Kernel.RadialBasis)
idx, reverse_labels, weights, weight_labels = LIBSVM.indices_and_weights(y, X, nothing)
param = LIBSVM.SVMParameter(
        _svmtype, _kernel, Int32(3), Float64(1.0 / size(X, 1)),
        0.0, 200.0, 0.001, 1.0, Int32(length(weights)),
        pointer(weight_labels), pointer(weights), 0.5, 0.1, Int32(true),
        Int32(false))
ninstances = size(X, 2)
nodes, nodeptrs =  LIBSVM.instances2nodes(X)
problem = LIBSVM.SVMProblem(Int32(ninstances), pointer(idx), pointer(nodeptrs))
LIBSVM.libsvm_set_verbose(true)
@GC.preserve nodes begin
    LIBSVM.libsvm_check_parameter(problem, param)
    ptr_model = LIBSVM.libsvm_train(problem, param)
end
svm = LIBSVM.SVM(unsafe_load(ptr_model), y, X, nothing, reverse_labels, SVC, Kernel.RadialBasis)
LIBSVM.libsvm_free_model(ptr_model)

@mlesnoff
Author

mlesnoff commented Jan 4, 2023

Thanks, I ran it:

julia> using LIBSVM

julia> (X, y) = (randn(100,4), randn(100))
([1.1571690985215213 -1.4655808933016536 -0.5977168643349797 0.23670932033058822; 1.045503287624356 1.522046538174527 0.4086712588952419 -0.9324406722062056; … ; 0.015344693271484194 -1.2274585735423154 -1.0148089998062437 0.1819779353324507; 1.164254578419492 -0.4848194088689624 0.8331051401309986 -1.1937463750802504], [-0.27092475097381574, -2.7869592919398616, 0.39722507000723495, -0.6144566882437535, -0.508227597172905, -0.04517546023972688, 0.3874286075603758, 0.4228250301983162, -0.06756199848523661, 0.8778665954489996  …  1.008174872315394, -0.3929995471881523, 0.7032768374006367, 0.38025931107838473, -1.7270008907829422, 0.22935366733471255, 0.17882862687997875, 0.44289178845636096, -1.3425301258045377, 0.4706486988581435])

julia> X = X'
4×100 adjoint(::Matrix{Float64}) with eltype Float64:
  1.15717    1.0455    -1.70691   -0.938373  -1.29473   …   1.65808    0.0553481   1.56421    0.0153447   1.16425
 -1.46558    1.52205    0.189132   1.12947    2.11767      -2.15229    0.746509    1.48085   -1.22746    -0.484819
 -0.597717   0.408671   1.82956    0.707923   1.52135       2.06081   -0.726431    0.278311  -1.01481     0.833105
  0.236709  -0.932441   0.569482  -0.361493   0.128428     -0.243661   0.88279    -0.243693   0.181978   -1.19375

julia> LIBSVM.set_num_threads(1)

julia> degree = 3
3

julia> _svmtype = 0
0

julia> _kernel = Int32(Kernel.RadialBasis)
2

julia> LIBSVM.check_train_input(X, y, Kernel.RadialBasis)

julia> idx, reverse_labels, weights, weight_labels = LIBSVM.indices_and_weights(y, X, nothing)
([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0  …  91.0, 92.0, 93.0, 94.0, 95.0, 96.0, 97.0, 98.0, 99.0, 100.0], [-0.27092475097381574, -2.7869592919398616, 0.39722507000723495, -0.6144566882437535, -0.508227597172905, -0.04517546023972688, 0.3874286075603758, 0.4228250301983162, -0.06756199848523661, 0.8778665954489996  …  1.008174872315394, -0.3929995471881523, 0.7032768374006367, 0.38025931107838473, -1.7270008907829422, 0.22935366733471255, 0.17882862687997875, 0.44289178845636096, -1.3425301258045377, 0.4706486988581435], Float64[], Int32[])

julia> param = LIBSVM.SVMParameter(
               _svmtype, _kernel, Int32(3), Float64(1.0 / size(X, 1)),
               0.0, 200.0, 0.001, 1.0, Int32(length(weights)),
               pointer(weight_labels), pointer(weights), 0.5, 0.1, Int32(true),
               Int32(false))
LIBSVM.SVMParameter(0, 2, 3, 0.25, 0.0, 200.0, 0.001, 1.0, 0, Ptr{Int32} @0x000001bf0562a440, Ptr{Float64} @0x000001bf0562a480, 0.5, 0.1, 1, 0)

julia> ninstances = size(X, 2)
100

julia> nodes, nodeptrs =  LIBSVM.instances2nodes(X)
(LIBSVM.SVMNode[LIBSVM.SVMNode(1, 1.1571690985215213) LIBSVM.SVMNode(1, 1.045503287624356) … LIBSVM.SVMNode(1, 0.015344693271484194) LIBSVM.SVMNode(1, 1.164254578419492); LIBSVM.SVMNode(2, -1.4655808933016536) LIBSVM.SVMNode(2, 1.522046538174527) … LIBSVM.SVMNode(2, -1.2274585735423154) LIBSVM.SVMNode(2, -0.4848194088689624); … ; LIBSVM.SVMNode(4, 0.23670932033058822) LIBSVM.SVMNode(4, -0.9324406722062056) … LIBSVM.SVMNode(4, 0.1819779353324507) LIBSVM.SVMNode(4, -1.1937463750802504); LIBSVM.SVMNode(-1, NaN) LIBSVM.SVMNode(-1, NaN) … LIBSVM.SVMNode(-1, NaN) LIBSVM.SVMNode(-1, NaN)], Ptr{LIBSVM.SVMNode}[Ptr{LIBSVM.SVMNode} @0x000001befa9c9080, Ptr{LIBSVM.SVMNode} @0x000001befa9c90d0, Ptr{LIBSVM.SVMNode} @0x000001befa9c9120, Ptr{LIBSVM.SVMNode} @0x000001befa9c9170, Ptr{LIBSVM.SVMNode} @0x000001befa9c91c0, Ptr{LIBSVM.SVMNode} @0x000001befa9c9210, Ptr{LIBSVM.SVMNode} @0x000001befa9c9260, Ptr{LIBSVM.SVMNode} @0x000001befa9c92b0, Ptr{LIBSVM.SVMNode} @0x000001befa9c9300, Ptr{LIBSVM.SVMNode} @0x000001befa9c9350  …  Ptr{LIBSVM.SVMNode} @0x000001befa9caca0, Ptr{LIBSVM.SVMNode} @0x000001befa9cacf0, Ptr{LIBSVM.SVMNode} @0x000001befa9cad40, Ptr{LIBSVM.SVMNode} @0x000001befa9cad90, Ptr{LIBSVM.SVMNode} @0x000001befa9cade0, Ptr{LIBSVM.SVMNode} @0x000001befa9cae30, Ptr{LIBSVM.SVMNode} @0x000001befa9cae80, Ptr{LIBSVM.SVMNode} @0x000001befa9caed0, Ptr{LIBSVM.SVMNode} @0x000001befa9caf20, Ptr{LIBSVM.SVMNode} @0x000001befa9caf70])

julia> problem = LIBSVM.SVMProblem(Int32(ninstances), pointer(idx), pointer(nodeptrs))
LIBSVM.SVMProblem(100, Ptr{Float64} @0x000001bef986c740, Ptr{Ptr{LIBSVM.SVMNode}} @0x000001bf06487140)

julia> LIBSVM.libsvm_set_verbose(true)

Then this command below:

ptr_model = LIBSVM.libsvm_train(problem, param)

kills Julia (with no printed info). What I don't understand is that LIBSVM.jl works fine with my Julia 1.8.3 (same for XGBoost.jl).
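
Since the crash closes the session before anything can be inspected, one option is to run the failing call in a child Julia process and read its exit code from the parent. The following is only a sketch: `exitcode_of` is a hypothetical helper name, and it assumes LIBSVM is installed in the environment that the child process sees.

```julia
# Run a code snippet in a separate Julia process and return its exit code.
# A hard crash in the child does not take down the current session.
function exitcode_of(code::AbstractString)
    cmd = `$(Base.julia_cmd()) --startup-file=no -e $code`
    proc = run(ignorestatus(cmd))  # ignorestatus: do not throw on a nonzero exit
    return proc.exitcode
end

# The reproduction from this issue, passed as a string to the child process.
repro = """
using LIBSVM
X, y = randn(100, 4), randn(100)
svmtrain(X', y)
"""

println("child exit code: ", exitcode_of(repro))
```

A nonzero result only says that the child failed; it would also be nonzero if, for example, LIBSVM could not be loaded there.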

@barucden
Member

barucden commented Jan 5, 2023

I guess you could try removing the precompile cache at ~/.julia/compiled/ (although I don't think that precompiled packages are shared between julia versions so it probably won't help).

Apart from that, I don't know how to proceed. If nobody else responds here, I suggest you create a thread on Discourse.
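
To see which precompile cache directory the running Julia version uses, the path can be assembled from `DEPOT_PATH` and `VERSION`. This is a small sketch for inspection only (it deletes nothing; `cache_dirs` is just an illustrative name). The `vMAJOR.MINOR` subdirectory is why caches are not shared between, say, 1.8 and 1.9, although patch releases such as 1.8.3 and 1.8.4 do use the same directory.

```julia
# Precompile caches live under <depot>/compiled/v<major>.<minor>/ for each depot.
version_dir = "v$(VERSION.major).$(VERSION.minor)"
cache_dirs = [joinpath(depot, "compiled", version_dir) for depot in DEPOT_PATH]

# List the candidate directories and whether each one exists on this machine.
for dir in cache_dirs
    println(dir, isdir(dir) ? "" : "  (not present)")
end
```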

@mlesnoff
Author

mlesnoff commented Jan 5, 2023

Removing the cache did not work.
Actually, I had already created a thread on Discourse but received no answer, which is why I opened this issue. Maybe I will open one in XGBoost.jl as well.
Thanks for your help above.

@mlesnoff
Author

mlesnoff commented Feb 2, 2023

Under Windows 11, neither XGBoost.jl nor LIBSVM.jl works from Julia 1.8.4 onward, including v1.9.0-beta3.
Currently, to use these packages under Windows, I see no solution other than staying on Julia 1.8.3.

@barucden
Member

barucden commented Feb 3, 2023

I see in the Discourse thread that the problem was already bisected, and it is mentioned in the XGBoost.jl issue that the problem is being investigated on Julia's side.

I don't think we can do much on the side of LIBSVM.jl. Let's just keep the issue open here so that other users who potentially observe the same problem know it has been reported. We can close it once it is resolved in Julia.

@mlesnoff
Author

I see in the Discourse thread that the problem was already bisected, and it is mentioned in the XGBoost.jl issue that the problem is being investigated on the Julia's side.

FYI, it seems that the issue is not exactly on the Julia side, as explained by @mkitti here.

@barucden
Member

LIBSVM also seems to use OpenMP so @mkitti is probably right.

Still, as I understand it, crashing svm_train is just a symptom, and the root cause is not within LIBSVM(.jl).

@aminadibi

Same issue with Julia 1.9.0

@giordano

Fixed in Julia master by JuliaLang/julia#50135

@barucden
Member

Thanks @giordano! Can anyone with a Windows machine download a nightly version and verify that the issue is gone so we can close it?

@mkitti

mkitti commented Jun 12, 2023

I built his branch. The tests pass.

Test Summary: | Pass  Total   Time
LibSVM        |   56     56  30.3s
     Testing LIBSVM tests passed

julia> versioninfo()
Julia Version 1.10.0-DEV.1468
Commit a523212dd8* (2023-06-11 18:17 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 48 × Intel(R) Xeon(R) Gold 5220R CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, cascadelake)
  Threads: 1 on 96 virtual cores

@barucden
Member

Great! Thank you, @mkitti, for the verification. Let's close this issue then.

For anyone affected: the fix will be part of Julia 1.10 (expected in several months). The fix is also scheduled for backporting to 1.9, so it should be present in the future version 1.9.2 (expected in several weeks). In the meantime, consider using the nightly version of Julia, which already contains the fix.
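
A script that depends on LIBSVM.jl could warn about the versions discussed in this thread before calling into the library. The sketch below is based only on what was reported here: crashes on Windows from 1.8.4 onward, and a fix expected in 1.9.2 and 1.10. The function name `possibly_affected` is hypothetical, and the upper bound is an assumption that should be adjusted if the backport lands in a different release.

```julia
# Returns true if the given Julia version falls in the range reported as crashing
# in this thread (Windows only). Assumption: the fix ships in 1.9.2 as planned.
# Note: 1.10 development builds from before the fix are not detected by this check.
function possibly_affected(v::VersionNumber = VERSION; windows::Bool = Sys.iswindows())
    windows || return false
    return v"1.8.4" <= v < v"1.9.2"
end

if possibly_affected()
    @warn "This Julia version may crash in LIBSVM.jl on Windows; see LIBSVM.jl issue #95."
end
```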
