Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend "whos" to print the size (in bytes) of Julia objects #11461

Closed
wants to merge 3 commits into from
Closed

Extend "whos" to print the size (in bytes) of Julia objects #11461

wants to merge 3 commits into from

Conversation

dancasimiro
Copy link
Contributor

This pull request changes the whos function to print out the size of Julia objects (#3393). I added a totalsizeof function to measure the size of Julia objects (#7603). Finally, it adds an eachindex for Tuple types so that totalsizeof can treat Tuples like other iterable objects.


sz = 0
try
sz = sizeof(x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, which types of objects does this fail on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fails on the Symbol type. An Error is raised that says "value does not have a canonical binary representation."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defaulting to 0 in such cases is also a bit odd, but OTOH since symbols are reused perhaps that's as good as any other choice.

But if Symbols are the main way in which this fails, it might be better to say

if !isa(x, Symbol)
    try
        sz = sizeof(x)
...

because exceptions are a slow way of doing control flow. (More relevant for programmatic usage of totalsize than for usages of whos(), of course.) For reference, the builtin definition of sizeof might be helpful.

@timholy
Copy link
Member

timholy commented May 28, 2015

It would be great to add some tests of the totalsize function, and to add documentation for it. Once those are addressed, this looks merge-worthy to me.

@dancasimiro
Copy link
Contributor Author

@timholy, what do you think about the eachindex implementation for Tuple types? I wasn't sure if it was omitted intentionally.

@timholy
Copy link
Member

timholy commented May 28, 2015

I think it's a good addition. Thanks for doing it.

@dancasimiro
Copy link
Contributor Author

I think that I addressed everything that @timholy pointed out. The CI failures seem unrelated to my change.

This function is equivalent to the "eachindex" function that is already
defined for the AbstractArray type. This allows code that can generically
iterate over Tuple indices in addition to AbstractArray indices.
This function measures the total number of bytes used by Julia objects.

Fixes #7603

I don't think that this function should get exported, so it doesn't make sense to me that it should be documented in the manual.
Fixes #3393: whos() should print the memory usage of the objects

As of this change, "whos" prints the total size of an object in bytes. A new
column was added to the output.

julia> whos()
Base                                  16 bytes Module
Compat                                16 bytes Module
Core                                  16 bytes Module
Main                                  16 bytes Module
WAV                                   16 bytes Module
ans                                    8 bytes Int64
wav1                               33066 KB    Tuple{Array{Int16,2},UInt32,UInt16,Dict{Symbol,Any}}
wav2                                 129 MB    Tuple{Array{Float64,2},UInt32,UInt16,Dict{Symbol,Any}}

It isn't expected, but totalsizeof might throw an exception. Prepare whos so that the user interaction doesn't break.
@tkelman
Copy link
Contributor

tkelman commented Jun 27, 2015

appveyor failure was #11890

@ScottPJones
Copy link
Contributor

👍

@ViralBShah
Copy link
Member

Bump - can we merge this?

@jakebolewski
Copy link
Member

I think this needs a bit of work before getting merged (or some follow up work if we merge this). Instead of using a Set and taking the pointer of the underlying object it should just use an ObjectIdDict (which does the same thing). It also should also not have to iterate through the entire array if the array type is isbits.

I am hesitant to merge this as the result will often be wrong if users do not carefully implement the sizeof operator for custom types. It also does not pass the "Big Data" REPL test. Printing whos with this PR will be painfully slow if the workspace is filled with a number of data structures that take significant amounts of memory.

@StefanKarpinski
Copy link
Member

I wonder if a different approach might be to allow a GC pass to attribute all allocated bytes to some gc root object. You could even generalize that to take a set of arbitrary "roots" and follow them, attributing data to them as it goes. It's unclear what to do when accounting for multiply reachable memory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants