Skip to content

nalimilan/FreqTables.jl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

40 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

FreqTables

Build Status Coverage Status FreqTables

This package allows computing one- or multi-way frequency tables (a.k.a. contingency or pivot tables) from any type of vector or array. It includes support for CategoricalArray and DataFrames, as well as for weighted counts.

Tables are represented as NamedArray objects.

julia> using FreqTables
julia> x = repeat(["a", "b", "c", "d"], outer=[100]);
julia> y = repeat(["A", "B", "C", "D"], inner=[10], outer=[10]);
julia> freqtable(x)
julia>  freqtable(x)
4-element Named Array{Int64,1}
Dim1  │
──────┼────
a     │ 100
b     │ 100
c     │ 100
d     │ 100

julia> freqtable(x, y)
4×4 Named Array{Int64,2}
Dim1 ╲ Dim2 │  A   B   C   D
────────────┼───────────────
a           │ 30  20  30  20
b           │ 30  20  30  20
c           │ 20  30  20  30
d           │ 20  30  20  30

julia> freqtable(x, y, subset=1:20)
4×2 Named Array{Int64,2}
Dim1 ╲ Dim2 │ A  B
────────────┼─────
a           │ 3  2
b           │ 3  2
c           │ 2  3
d           │ 2  3

julia> freqtable(x, y, subset=1:20, weights=repeat([1, .5], outer=[10]))
4×2 Named Array{Float64,2}
Dim1 ╲ Dim2 │   A    B
────────────┼─────────
a           │ 3.0  2.0
b           │ 1.5  1.0
c           │ 2.0  3.0
d           │ 1.0  1.5

For convenience, when working with a data frame, one can also pass the DataFrame object and columns as symbols:

julia> using DataFrames, CSV

julia> iris = CSV.read(joinpath(Pkg.dir("DataFrames"), "test/data/iris.csv"));

julia> iris[:LongSepal] = iris[:SepalLength] .> 5.0;

julia> freqtable(iris, :Species, :LongSepal)
3×2 Named Array{Int64,2}
Species ╲ LongSepal │ false   true
────────────────────┼─────────────
setosa              │    28     22
versicolor          │     3     47
virginica           │     1     49

julia> freqtable(iris, :Species, :LongSepal, subset=iris[:PetalLength] .< 4.0)
julia> freqtable(iris, :Species, :LongSepal, subset=iris[:PetalLength] .< 4.0)
2×2 Named Array{Int64,2}
Species ╲ LongSepal │ false   true
────────────────────┼─────────────
setosa              │    28     22
versicolor          │     3      8