Printing vectors of NamedTuples as a markdown table #24627

Closed
bramtayl opened this issue Nov 16, 2017 · 11 comments
Labels
collections: Data structures holding multiple items, e.g. sets
display and printing: Aesthetics and correctness of printed representations of objects.

Comments

@bramtayl
Contributor

bramtayl commented Nov 16, 2017

A vector of NamedTuples has the potential to become a "lightweight table". See #12131 (comment). To that end, printing a vector of NamedTuples that share the same fields as a markdown table would be a good start, and it is a convenient way to view the contents of such a vector in any case. In Keys.jl I had the beginning of the code for this. I was waiting for a Windows binary with NamedTuples to come out so that I could port the code, but I got bored, so here is the code I had. I'm somewhat indifferent about the actual implementation details.

export KeyedTable
const KeyedTable = AbstractVector{T} where T <: KeyedTuple{K, V} where V <: Tuple where K <: Tuple

show_row(io, atuple) = begin
    print(io, "| ")
    join(io, atuple, " | ")
    println(io, " |")
end

show_row(io, atuple, widths, n) =
    show_row(io, map(rpad, atuple, widths)[1:n])

function Base.summary(t::KeyedTable)
    "$(length(t)) x $(length(type_keys(eltype(t)))) keyed table"
end

struct Repeated
    text::String
    number::Int
end

function Base.show(io::IO, r::Repeated)
    text = r.text
    for i in 1:r.number
        print(io, text)
    end
end

function Base.showarray(io::IO, t::KeyedTable, ::Bool)
    println(io, summary(t))
    if length(t) == 0
        nothing
    else
        row_number, column_number = displaysize(io)
        limit = get(io, :limit, false)
        names = string.(type_keys(eltype(t)))
        # subset rows for long arrays
        subset =
            if limit
                t[1:min(row_number - 3, length(t))]
            else
                t
            end
        rows = map(subset) do row
            string.(row.values)
        end
        # find maximum widths for rows
        row_widths = mapreduce(
            row -> map(strwidth, row),
            (x, y) -> map(max, x, y),
            rows
        )
        # and then also for names
        widths = map(max, map(strwidth, names), row_widths)
        # figure out how many columns we can safely print
        if limit
            n = findfirst(x -> x > column_number - 2, cumsum([(widths .+ 3)...])) - 1
            if n == -1
                n = length(widths)
            end
        else
            n = length(widths)
        end

        if n > 0
            show_row(io, names, widths, n)
            show_row(io, (Repeated("-", i) for i in [widths...][1:n]))
            for row in rows
                show_row(io, row, widths, n)
            end
            nothing
        else
            nothing
        end
    end
end
@ararslan added the "collections" and "display and printing" labels on Nov 16, 2017
@JeffBezanson
Member

I think this is a good idea. The usual printing of arrays of named tuples is pretty annoying, repeating the names on every line. This also allows implementing things like #12131. What's your concern, @ararslan ?

One issue is that we don't want too many different ways of printing tables. We should perhaps move DataFrames' printing code into Base as a showtable function.
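
To make the repetition concrete, here is a minimal illustration, assuming a current Julia release (the variable name `rows` is only for this example). Each element prints with its own copy of the field names:

```julia
# A small vector of NamedTuples that all share the fields `a` and `b`.
rows = [(a = 1, b = 2), (a = 3, b = 4)]

# Printing element by element repeats the names on every line.
for row in rows
    println(row)
end
# (a = 1, b = 2)
# (a = 3, b = 4)
```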

@ararslan
Member

My concern is that it's too specific; arrays of named tuples could potentially be used for things other than table-like structures. Plus, what do we do if a given element in the array has different names than the others? That would mess up the table formatting. And it would be weirdly inconsistent to print things nicely only in the case of an array where each element has the same set of names.

@bramtayl
Contributor Author

I think you should be able to tell whether or not the names are consistent from the type signature alone.
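
As a rough sketch of what "from the type signature alone" could look like, dispatch can pull the shared names out of the vector's element type and fall back when there is no single set of names. The function name `shared_names` is hypothetical and not part of Base or of any proposal in this thread:

```julia
# Matches only when every element type is a NamedTuple with one fixed
# tuple of names; `names` is then available as a static parameter.
shared_names(::AbstractVector{<:NamedTuple{names}}) where {names} = names

# Fallback for any other vector, including vectors whose NamedTuples
# do not all share the same names.
shared_names(::AbstractVector) = nothing

uniform = [(a = 1, b = "x"), (a = 2, b = "y")]
mixed = [(a = 1,), (b = 2.0, c = 3)]

shared_names(uniform)  # (:a, :b)
shared_names(mixed)    # nothing
```

No element is inspected at run time here; the decision rests entirely on the type of the vector.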

@ararslan
Member

Yes, but that doesn't change or address anything I said above.

@JeffBezanson
Member

> weirdly inconsistent to print things nicely only in the case of an array where each element has the same set of names

I don't see it as inconsistent, since having the same names is what makes the nicer printing make sense. Similar to how we will print the type of each element if the array is heterogeneous, but not if all elements have the same type.

@ararslan
Member

> Similar to how we will print the type of each element if the array is heterogeneous, but not if all elements have the same type.

?

julia> Any[1, 1.0, "1"]
3-element Array{Any,1}:
 1
 1.0
  "1"

@JeffBezanson
Member

Well, that's how it should work; see #24651. It doesn't apply to all types since many types like strings don't ever have a type as part of their representation.

@rfourquet
Member

I'm not sure yet, but I think I prefer a dedicated showtable function or a KeyedTable struct which is not an alias to NamedTuple. My two small concerns are how the formatting will degrade when one row doesn't fit on one screen line, and the big difference between the printing of two NamedTuple arrays depending on whether all the elements have the same names. Also FWIW, with the PR linked by Jeff, it's already easy to avoid repeating the field names when all the elements share them.

@bramtayl
Contributor Author

How about a table struct wrapper for a vector of named tuples which prints as a markdown table?
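
A minimal sketch of such a wrapper, assuming a current Julia release and a concrete element type. The name `MarkdownTable` and the hand-rolled row printing are illustrative assumptions, not an existing API:

```julia
# Hypothetical wrapper: opting in to table printing is explicit, so plain
# vectors of NamedTuples keep their usual display.
struct MarkdownTable{T <: NamedTuple}
    rows::Vector{T}
end

function Base.show(io::IO, ::MIME"text/plain", t::MarkdownTable{T}) where {T}
    # Field names come from the element type, not from the elements.
    header = collect(string.(fieldnames(T)))
    body = [collect(string.(values(row))) for row in t.rows]
    cells = vcat([header], body)
    # Column width: widest cell including the header, at least 3 for the rule.
    widths = [max(3, maximum(textwidth(r[j]) for r in cells)) for j in eachindex(header)]
    print_row(r) = println(io, "| ", join(rpad.(r, widths), " | "), " |")
    print_row(header)
    print_row(["-"^w for w in widths])
    foreach(print_row, body)
end

t = MarkdownTable([(a = 1, b = "x"), (a = 22, b = "yy")])
show(stdout, MIME("text/plain"), t)
# | a   | b   |
# | --- | --- |
# | 1   | x   |
# | 22  | yy  |
```

This sketch does not truncate rows or columns to the display size; it only shows the opt-in wrapper idea.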

@bramtayl
Contributor Author

An updated version now that the Windows 0.7 binaries are out:

const Table{T} = AbstractVector{NamedTuple{T, T1}} where {T, T1 <: Tuple}

Base.names(t::Table{T}) where T = T

function Base.summary(t::Table)
    "$(length(t)) x $(length(names(t))) Table"
end

function Base.showarray(io::IO, t::Table, ::Bool)
    println(io, summary(t))
    if length(t) != 0
        display_length, display_width = displaysize(io)
        limit = get(io, :limit, false)
        table_names = string.(names(t))
        number_of_columns = length(table_names)
        rows = map(
            if limit
                t[1:min(display_length - 3, length(t))]
            else
                t
            end
        ) do row
            string.(Tuple(row))
        end
        unshift!(rows, table_names)
        if limit
            n = findfirst(
                x -> x > display_width - 2, 
                cumsum([(mapreduce(
                    row -> map(textwidth, row),
                    (x, y) -> map(max, x, y),
                    rows
                ) .+ 3)...])
            ) - 1
            if n != -1
                number_of_columns = n
            end
        end
        if number_of_columns > 0
            Base.show(io, Markdown.MD([Markdown.Table(
                map(rows) do row
                    [row...][1:number_of_columns]
                end, 
                repeat([:r], inner = number_of_columns)
            )]))
        end
    end
end
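
The column-limit step above subtracts 1 from the result of `findfirst` and treats -1 as "nothing matched", which fits a `findfirst` that returns 0 on failure. On Julia 1.x `findfirst` returns `nothing` instead, so the subtraction would throw. A hedged sketch of the same calculation for current Julia follows; the helper name `columns_that_fit` is made up for illustration:

```julia
# Number of leading columns that fit in `display_width` characters.
# As in the code above, each column costs its text width plus 3 characters
# of padding and separator, and 2 characters are kept in reserve.
function columns_that_fit(widths, display_width)
    used = cumsum(collect(widths) .+ 3)
    first_too_wide = findfirst(x -> x > display_width - 2, used)
    # `nothing` means every column fits.
    first_too_wide === nothing ? length(widths) : first_too_wide - 1
end

columns_that_fit([5, 5, 5], 80)  # 3: everything fits
columns_that_fit([5, 5, 5], 12)  # 1: the second column would overflow
columns_that_fit([20], 10)       # 0: not even the first column fits
```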

@bramtayl
Contributor Author

Still think this might be nice, but I guess a little unexpected


4 participants