Improve performance of Dict{K,V} (~5%) by storing elements in pairs::Vector{Pair{K,V}} #44332
Conversation
This looks great! I think #38145 is probably the direction we want to move in the longer term, but free performance improvements are always great! |
What's the actual benchmark that you ran? Did you measure iterating over |
This is probably a good idea, since we can now store |
The remaining test failure is about precompilation only.
Master

typeof(dict) = Dict{Int64, Any}
SET 1.331 s, GET 1.040 s, GET! 1.153 s, ITER. KEYS 0.080 s, ITER. VALS 0.525 s
typeof(dict) = Dict{Any, Int64}
SET 2.706 s, GET 1.306 s, GET! 5.691 s, ITER. KEYS 0.536 s, ITER. VALS 0.080 s
typeof(dict) = Dict{Any, Any}
SET 2.557 s, GET 1.922 s, GET! 7.591 s, ITER. KEYS 0.533 s, ITER. VALS 0.558 s
Total time 27.608 s

PR

typeof(dict) = Dict{Int64, Any}
SET 0.976 s, GET 0.927 s, GET! 1.897 s, ITER. KEYS 0.086 s, ITER. VALS 0.533 s
typeof(dict) = Dict{Any, Int64}
SET 1.837 s, GET 2.029 s, GET! 5.541 s, ITER. KEYS 0.551 s, ITER. VALS 0.089 s
typeof(dict) = Dict{Any, Any}
SET 2.941 s, GET 1.988 s, GET! 6.506 s, ITER. KEYS 0.591 s, ITER. VALS 0.577 s
Total time 27.070 s |
Very interesting; we should look into that. |
The problem seems to be caused by extra allocations because of lines 386 to 387 in 8306858. |
Seems like someone accidentally wrote k=>v later in that function, which had the wrong type |
Thank you, fixed. I naively assumed this could be optimized out automatically. |
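As an aside, a minimal sketch of the allocation issue being discussed (the store! helper and its signature are hypothetical, not code from this PR): an untyped k => v builds Pair{typeof(k), typeof(v)}, which then has to be converted on assignment into a Vector{Pair{K,V}}.

# Hypothetical illustration, not dict.jl code: constructing the fully
# typed pair directly avoids an intermediate Pair of the wrong type
# (and the extra allocation when K or V is abstract).
function store!(pairs::Vector{Pair{K,V}}, i::Int, k, v) where {K,V}
    pairs[i] = Pair{K,V}(k, v)   # instead of `pairs[i] = k => v`
    return pairs
end

store!(Vector{Pair{Any,Int}}(undef, 1), 1, :a, 1)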
Can you post the updated benchmarks with concrete as well as abstract types? |
All results should be updated now. |
This is a great PR! I have a few comments and I think it would be good to have another review from someone more familiar with dict internals than me, but overall I think this is a very nice improvement.
  function getindex(h::Dict{K,V}, key) where V where K
      index = ht_keyindex(h, key)
-     @inbounds return (index < 0) ? throw(KeyError(key)) : h.vals[index]::V
+     @inbounds return (index < 0) ? throw(KeyError(key)) : h.pairs[index].second::V
I wonder why all these type annotations here were added in the first place. They probably don't hurt, but I also don't see why they'd be needed.
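To make the layout change in the diff above concrete, here is a minimal sketch assuming only the fields visible in the snippets (the real Base.Dict carries additional bookkeeping such as slots; SplitStorage and PairStorage are illustrative names, not types from the PR):

# Master: parallel vectors -- a successful lookup touches two separate
# heap arrays (one for the key comparison, one for the value).
struct SplitStorage{K,V}
    keys::Vector{K}
    vals::Vector{V}
end

# PR: a single vector of pairs -- the key and value of a slot are
# adjacent in memory, so a hit costs one random memory access.
struct PairStorage{K,V}
    pairs::Vector{Pair{K,V}}
end

getvalue(s::SplitStorage, i) = s.vals[i]
getvalue(s::PairStorage, i) = s.pairs[i].second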
Co-authored-by: Simeon Schaub <simeondavidschaub99@gmail.com>
Are any of the benchmarks here useful? https://github.com/JuliaCollections/DataStructures.jl/blob/master/benchmark/bench_heap.jl |
I'm closing this in favor of #44513. |
Reopening to run CI. I'll update benchmarks once I have some time. |
I've updated the evaluation. The speed-up is still about 5% for concrete types, but none for abstract types. Furthermore, Julia currently doesn't support packing of |
One thing that might be worth trying is storing 8 keys followed by 8 values; this would fix the alignment issues at least. |
I've updated the PR against master, since I found it beneficial for small collections:

julia> @btime Set(x) setup=(x=rand()); # PR
  62.848 ns (3 allocations: 336 bytes)

julia> @btime Set(x) setup=(x=rand()); # master
  87.330 ns (4 allocations: 400 bytes)

julia> @btime Dict(x => x) setup=(x=rand()); # PR
  64.738 ns (3 allocations: 480 bytes)

julia> @btime Dict(x => x) setup=(x=rand()); # master
  90.961 ns (4 allocations: 544 bytes)

julia> @btime Base.ImmutableDict(x => x) setup=(x=rand()); # only as a ground truth
  10.300 ns (2 allocations: 64 bytes)
|
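A plausible reading of these numbers, assuming the layout change is the only difference: merging the separate keys and vals vectors into one pairs vector saves one array allocation per container, which would match the drop from 4 to 3 allocations (and the 64-byte difference) in each pair of measurements.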
@nanosoldier |
Your package evaluation job has completed - possible new issues were detected. A full report can be found here. |
(you can leave out the |
Your package evaluation job has completed - possible new issues were detected. A full report can be found here. |
@nanosoldier |
Your package evaluation job has completed - possible new issues were detected. A full report can be found here. |
@oscardssmith Nice idea, but I'm not sure how to implement that in pure Julia. The closest way I can imagine is using:

julia> v = Pair{NTuple{8,Int64}, NTuple{8,Int8}}[]
Pair{NTuple{8, Int64}, NTuple{8, Int8}}[]

julia> resize!(v, 4)
4-element Vector{Pair{NTuple{8, Int64}, NTuple{8, Int8}}}:
 (0, 0, 0, 0, 0, 0, 0, 0) => (0, 0, 0, 0, 0, 0, 0, 0)
 (0, 0, 0, 0, 0, 0, 0, 0) => (0, 0, 0, 0, 0, 0, 0, 0)
 (0, 0, 0, 0, 0, 0, 0, 0) => (0, 0, 0, 0, 0, 0, 0, 0)
 (0, 0, 0, 0, 0, 0, 0, 0) => (0, 0, 0, 0, 0, 0, 0, 0)

julia> isbitstype(Pair{NTuple{8,Int64}, NTuple{8,Int8}})
true |
Good point. @JeffBezanson this is another great example of why we should have a simple buffer type. |
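For illustration, a hedged sketch of reads against such a blocked layout (the KVBlock alias, blocked_getindex, and the fixed block size of 8 are assumptions, not code from this thread):

# Hypothetical: element i lives in block (i - 1) ÷ 8 + 1 at offset
# (i - 1) % 8 + 1 within the isbits Pair-of-NTuples element type
# shown above.
const KVBlock{K,V} = Pair{NTuple{8,K},NTuple{8,V}}

function blocked_getindex(v::Vector{KVBlock{K,V}}, i::Int) where {K,V}
    b, o = divrem(i - 1, 8)
    blk = @inbounds v[b + 1]
    return blk.first[o + 1] => blk.second[o + 1]   # key => value
end

Writes are the awkward part: NTuples are immutable, so updating one slot means rebuilding an 8-element tuple and storing the whole block back, which is where a mutable buffer type would help.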
This PR has been approved but was never merged; it has various conflicts now. Also apparently the performance benefit in the current version is minimal to non-existent. Thus I think it is OK to close this. Feel free to re-open should I be mistaken, or just submit a new PR with a pointer to this one (I believe this will increase its chance of being "seen" by reviewers). |
Updated on March 28, 2022: when testing on multiple sizes, the difference is less significant, or zero.

I have noticed that Dict performance can be improved by storing keys and values together in a single vector of pairs. It can provide up to about a 5% performance improvement for large dictionaries, because it limits the number of random memory accesses. This PR is a kind of proof of concept; there is no change to the algorithm. Do you think it's worth it?

Although the PR is considered to be non-breaking, it may break code that relies on the internal representation of Dict.

Master:

PR:

I've measured the total elapsed time and total allocated memory over large dictionaries with various sizes.

Master:

PR:

Testing code
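The actual testing script lives in the collapsed section above; below is a rough reconstruction of a harness that would produce the SET/GET/GET!/ITER timings quoted earlier (the bench name, sizes, key distribution, and output format are guesses):

# Hedged sketch, not the original script.
function bench(::Type{K}, ::Type{V}; n::Int = 10^7) where {K,V}
    ks = rand(Int, n)                # random keys; duplicates are fine
    dict = Dict{K,V}()
    @show typeof(dict)
    tset = @elapsed for k in ks; dict[k] = k; end        # SET
    tget = @elapsed for k in ks; dict[k]; end            # GET
    tgb  = @elapsed for k in ks; get!(dict, k, k); end   # GET!
    tik  = @elapsed for k in keys(dict); end             # ITER. KEYS
    tiv  = @elapsed for v in values(dict); end           # ITER. VALS
    println("SET $tset s, GET $tget s, GET! $tgb s, ",
            "ITER. KEYS $tik s, ITER. VALS $tiv s")
end

bench(Int, Any); bench(Any, Int); bench(Any, Any)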