Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve performance #38

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

SimonDanisch
Copy link
Member

After benchmarking BSON to figure out if it's a good replacement for JSON, I noticed that its by far the slowest. This PR fixes the performance partly:

@time BSON.bson("test.bson_master", Dict(:a=>x)) # 3.2s
@time open(io-> JSON.print(io, x), "test.json", "w") # 1.6s
@time open(io-> JSON2.write(io, x), "test.json2", "w") # 1.1s
@time BSON.bson("test.bson_this_pr", Dict(:a=>x)) # 0.26s
@time open(io-> CBOR.encode(io, x), "test.cbor", "w") # 0.16

It's still slower than CBOR, which also isn't doing the fastest it could - so I guess there are still some other low hanging fruits in here ;)

@richiejp
Copy link
Contributor

richiejp commented Jun 2, 2019

This appears to give ~10% on my benchmarks, but it would be way higher for more numerical data I guess. Added it to my test branch, see #34

@KristofferC
Copy link
Member

This doesn't seem safe in general? What if x goes out of scope before the unsafe_wrap Array.

@KristofferC
Copy link
Member

KristofferC commented Jun 11, 2019

I tried this on VGG19 from Metalhead and got

julia> @time Metalhead.weights("vgg19.bson")
  0.662478 seconds (1.65 M allocations: 631.681 MiB, 15.95% gc time)
Dict{Symbol,Any} with 38 entries:
Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x1f353f3d -- getindex at .\array.jl:730 [inlined]
getindex at .\subarray.jl:228 [inlined]
_show_nonempty at .\arrayshow.jl:386
in expression starting at no file:0
getindex at .\array.jl:730 [inlined]
getindex at .\subarray.jl:228 [inlined]
_show_nonempty at .\arrayshow.jl:386
#383 at .\arrayshow.jl:404
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2219
show_nd at .\arrayshow.jl:294
_show_nonempty at .\arrayshow.jl:403 [inlined]
show at .\arrayshow.jl:421

@SimonDanisch
Copy link
Member Author

Guess it's unsafe indeed :P
I was assuming, it's only used just before sending it to write, but it's a bit hard to tell where things go.
Maybe it'd be better to implement a write that works with an reinterpret array efficiently.

@KristofferC
Copy link
Member

Maybe it'd be better to implement a write that works with an reinterpret array efficiently.

Yeah, probably. #46 is at least a lot better than what we have now.

@SimonDanisch
Copy link
Member Author

Yup, I was thinking to just submit something like #46, but then I got greedy :D

@SimonDanisch
Copy link
Member Author

I'd say just merge #46 and leave this unmerged until we find something safer ;)

@KristofferC
Copy link
Member

Interestingly, just defining

function reinterpret_(::Type{T}, x) where T
  reinterpret(T, x)
end

I can load VGG19 pretty damn fast

julia> @time Metalhead.weights("vgg19.bson")
  0.269944 seconds (1.97 k allocations: 548.196 MiB, 37.98% gc time)
Dict{Symbol,Any} with 38 entries:

but it crashes when testing BSON itself.

@richiejp
Copy link
Contributor

@KristofferC probably because reinterpret returns ReinterpretArray which wraps the original array and lazily casts its contents when accessed.

function reinterpret_(::Type{T}, x) where T
len = Int(length(x) * (sizeof(eltype(x)) / sizeof(T)))
GC.@preserve x begin
return unsafe_wrap(Array, Ptr{T}(pointer(x)), len)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may need to set own = true for unsafe_wrap.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that doesn't work on its own, but if I copy the memory then it works

function reinterpret_(::Type{T}, x::Vector{S}) where {T, S}
  length(x) < 1 && return T[]

  len = Int(length(x) * (sizeof(eltype(x)) / sizeof(T)))

  GC.@preserve x begin
    p = Ptr{UInt8}(pointer(x))
    t = Ptr{UInt8}(Base.Libc.calloc(len, sizeof(T)))
    ccall(:memset, Cvoid, (Ptr{UInt8}, Cint, Csize_t), t, 0, len * sizeof(T))
    ccall(:memcpy, Cvoid, (Ptr{UInt8}, Ptr{UInt8}, Csize_t), t, p, sizeof(x))
    Base.unsafe_wrap(Array, Ptr{T}(t), len; own=true)
  end
end

This brings the time down on my fork to 0.479740 seconds from 0.51184. It would be better to copy the bytes directly from the input stream and only memset the trailing bytes.

I also tried creating a second reference to the original memory and wrapping it, but unsafe_wrap just treats the Ref as a pointer AFAICT.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I add mmap and copy directly from the IOBuffers byte vector, then we get the following:

julia> @timev BSONqs.load("../../../julia/serbench/vgg19.bson"; mmap=true);
  0.250118 seconds (15.08 k allocations: 548.829 MiB, 19.26% gc time)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants