Why is `length` defined on numbers? #11769

samuela · 2015-06-19T21:30:24Z

I discovered from a typo today that the length method is defined on numbers:

julia> length(5)
1

julia> length(pi)
1

It's defined here: https://github.com/JuliaLang/julia/blob/master/base/number.jl#L12. Why is length defined on numbers? This seems counterintuitive. Numbers aren't iterable in any way after all.

The text was updated successfully, but these errors were encountered:

joehuchette · 2015-06-19T21:35:08Z

They are in Julia:

julia> for a in 1
       @show a
       end
a = 1

samuela · 2015-06-19T21:36:45Z

AHH, BUT WHYYY?

nalimilan · 2015-06-19T21:37:04Z

It's considered practical by some people. The question is rather whether this has drawbacks or not. Else there's no reason to remove that.

FWIW, please use the mailing list for questions in the future, and keep GitHub for bug reports.

Keno · 2015-06-19T21:37:33Z

FWIW, I consider this a mistake, it has bitten me way too many times.

carnaval · 2015-06-19T21:38:11Z

I also find it a bad idea, but github may not be the place to revive the debate

samuela · 2015-06-19T21:38:59Z

This just reeks of MATLAB to me.

jakebolewski · 2015-06-19T21:44:40Z

@samuela the supported iteration / indexing interface is to basically treat scalars as zero dimensional arrays.

Keno · 2015-06-19T21:47:35Z

It is way too easy to miss a colon in a for loop and only do one iteration - I remember vividly how I spent a week debugging why my locality sensitive hash wasn't local - I had for i = x instead of for i = 1:x for the number of rounds boom.

nalimilan · 2015-06-19T21:49:17Z

Yeah, I'm not a great fan of this either. :-/

JeffBezanson · 2015-06-19T21:49:39Z

Yes I really think we should experiment with removing iteration of numbers. I think the array indexing redesign might make this possible.

johnmyleswhite · 2015-06-19T21:55:15Z

It is way too easy to miss a colon in a for loop and only do one iteration

This is the bug I most often hit these days.

mbauman · 2015-06-19T22:22:50Z

Here was my last 15-minute attempt at making numbers non-iterable: #10331 (comment)

carlobaldassi · 2015-06-19T22:33:56Z

Just to balance the opinions here, I'll mention I use this feature a lot and find it quite handy.

joehuchette · 2015-06-19T22:35:26Z

@carlobaldassi what does a typical use look like?

carlobaldassi · 2015-06-19T22:49:20Z

@joehuchette : for example: I have functions which explore ranges of parameters for collecting data. Most of the time I need to change one parameter or two, and keep the rest at some value. So I just have function signatures like f(A, B, C) and in the code there are loops which look like for a in A, b in B, c in C. Then I can call either f(1.0, 2.0, 0.1:0.1:1.0) or f(1:0.5:10, 2.0, 1.0) etc.
This is just the first example which comes to mind and may not be the best; in general I have lots of places in the code I've written which use the iterable interface with the assumption that I can pass scalars to them.

samuela · 2015-06-19T23:04:02Z

Why not f([1], [2], 0.1:0.1:1.0) or f(1:0.5:10, [2.0], [1.0]) then?

ScottPJones · 2015-06-19T23:29:26Z

@carlobaldassi What does that do to the code generated? Doesn't that cause problems of type instability, if the argument could either be a scalar or a range? Instead of [1], which I would think would also have type instability problems, would 1.0:1.0 work? A little more typing, but if it makes the code run faster... (I guess you'd need to look at @code_native to see for sure). Of course, I may be all wet on this!

carlobaldassi · 2015-06-19T23:34:38Z

@samuela; Here's a line I just entered in the REPL minutes ago:

julia> span(α=0.75,S=0.991:0.001:0.999, x=5.935451778194593, q0=0.653270041192449,q1=0.9806204541529127,δq=0.0007726138979231578,q̂0=0.140152147115123,q̂1=0.5386512290281629,δq̂=0.030249748497618967,Ŝ=0.48377607373470927)

I suppose you can see how it might become annoying to add [] around each entry which might be iterated over. Of course, this is not "nice" code which would get into julia base or a public package — it's a rather dirty and hands-on data exploration task. The code is also kind of fluid and follows the results and ideas which come up as the task proceeds, so what's iterated may change (often).

Anyway, as I said, it's probably not the best example, just one possible way to show how my life is a little easier because numbers are iterable.

@ScottPJones : this is not a hot loop, it's just the interface. Each iteration takes at least tens of seconds, at worst hours (and I can assure you I've been optimizing that quite aggressively). As I said, it's basically an API for me interacting with the REPL, and you may see that I already type plenty.

ScottPJones · 2015-06-19T23:45:15Z

OK, thanks for the explanation... with all this math stuff, I feel it's way over my head sometimes! 😀

timholy · 2015-06-20T00:35:24Z

The biggest advantage of our current system is that it allows you to use a single algorithm for vectors or scalars without having to allocate an array to pass a scalar. There are many instances where simply wrapping that scalar as [s] would cause an unacceptable performance hit (we're not talking 20% here, it's many-fold slower execution). For what it's worth, here's a trivial demo:

julia> function sumelts(A)
           s = 0.0
           for a in A
               s += a
           end
           s
       end
sumelts (generic function with 1 method)

julia> function timeit1(n)
           S = 0.0
           for i = 1:n
               S += sumelts(i)
           end
           S
       end
timeit1 (generic function with 1 method)

julia> function timeit2(n)
           S = 0.0
           for i = 1:n
               S += sumelts([i])
           end
           S
       end
timeit2 (generic function with 1 method)

# Warmup suppressed

julia> @time timeit1(10^6)
  12.690 milliseconds (6 allocations: 176 bytes)
5.000005e11

julia> @time timeit2(10^6)
  94.928 milliseconds (1999 k allocations: 93742 KB, 21.42% gc time)
5.000005e11

While on balance I favor this feature, at the same time I am aware that it can cause problems. If folks decide it's simply too bug-prone, I will learn to live with its removal and write two implementations of algorithms where I am exploiting this feature.

Keno · 2015-06-20T00:38:15Z

I know it's not idea, but what about an iterator that for scalars makes them iterable and for everything else just passes through. That way in the limited cases where this is useful, it is still possible (albeit a little ugly), while avoiding the trap in the common case.

timholy · 2015-06-20T00:40:38Z

Like this?

julia> immutable Container{T}
           val::T
       end

julia> Base.start(c::Container) = 1
start (generic function with 47 methods)

julia> Base.next(c::Container, index::Int) = (c.val, 2)
next (generic function with 60 methods)

julia> Base.done(c::Container, index::Int) = index != 1
done (generic function with 49 methods)

julia> function timeit3(n)
           S = 0.0
           for i = 1:n
               S += sumelts(Container(i))
           end
           S
       end
timeit3 (generic function with 1 method)

julia> @time timeit3(10^6)
  75.680 milliseconds (3000 k allocations: 46875 KB, 13.54% gc time)
5.000005e11

Sniff.

johnmyleswhite · 2015-06-20T00:44:48Z

What about a range? Is that going to cause numeric problems?

function timeit4(n)
    S = 0.0
    for i in 1:n
        S += sumelts(i:i)
    end
    S
end

Keno · 2015-06-20T00:47:49Z

immutable Container{T}
   val::T
end
Base.start{T<:Number}(c::Container{T}) = false
Base.next{T<:Number}(c::Container{T}, index::Bool) = (c.val, true)
Base.done{T<:Number}(c::Container{T}, index::Bool) = index
Base.start(c::Container) = start(v.cal)
Base.next(c::Container,i) = next(v.cal,i)
Base.done(c::Container,i) = done(v.cal,i)

julia> @time timeit1(10^6)
   5.942 milliseconds (6 allocations: 176 bytes)
5.000005e11
julia> @time timeit3(10^6)
   4.753 milliseconds (6 allocations: 176 bytes)
5.000005e11

It is possible that we have a codegen regression within the past week, there have been a number of changes to codegen, but on my week old version of julia the generated code is identical.

timholy · 2015-06-20T00:53:50Z

Much better---amazing what a difference a small tweak makes. @johnmyleswhite, your version is also faster than what I posted, though worse than @Keno's if you use Float64 rather than Int.

Assuming this solution generalizes to less-trivial functions, I have no substantive objections to changing this.

mbauman · 2015-06-20T00:56:57Z

This is effectively what I did to simplify setindex! with broadcasting scalars (except instead of only iterating once they repeat forever): multidimensional.jl:304-307.

JeffBezanson · 2015-06-20T02:00:32Z

I tried @timholy 's code and I don't see any allocation. Maybe a slow definition was polluting the method cache somehow?

Another good option is to use a 1-tuple (S += sumelts((i,))). It's just as fast and doesn't require a new type.

Keno · 2015-06-20T02:02:02Z

@JeffBezanson well, there's the concern on not wanting to type those three extra characters at the REPL every time.

JeffBezanson · 2015-06-20T02:18:30Z

Libraries could also use

containerize(x::Number) = (x,)
containerize(y) = y

instead of a new type.

nalimilan · 2015-06-20T09:30:49Z

FWIW, one reason not to treat scalars as collections is the inconsistency with indexing dropping or not (trailing) dimensions. For example, if x is a matrix, x[:,i] can be either a vector or a matrix depending on whether i is a scalar, which goes against the idea of treating and 1 the same [1] in other places.

JeffBezanson · 2015-06-20T18:16:34Z

Numbers are actually 0-dimensional, so they have a single element without being vectors.

matthieugomez · 2015-07-15T22:13:46Z

I also lost a couple of hours due to

for i in length(x)

It'd be great to catch this in some way (which I now do via syntax highlighting, at least).

malmaud · 2015-07-15T22:19:41Z

While I still support making scalars non-iterable, the rise of the eachindex(X) paradigm instead of 1:length(X) has made it less pressing.

rofinn · 2015-08-18T20:19:50Z

+1 for making scalars non-iterable. I think this would help solve several other consistency issues. For example findin(array, x) works fine when x is scalar, but you need to do findin(array, [x]) for everything else.

KristofferC · 2015-08-18T21:37:45Z

If something is messed up with my code, it feels like it is 90% of the times because I have written for i in length(x). Guess I just gotta get used to eachindex.

ScottPJones · 2015-08-18T23:21:18Z

Maybe make scalars non-iterable first thing in 0.5? If eachindex is a better replacement, why keep something causing so much grief?

samuela · 2015-08-19T04:07:42Z

@ScottPJones +1

nalimilan closed this as completed Jun 19, 2015

yuyichao mentioned this issue Jul 15, 2015

Typo #12170

Closed

yuyichao mentioned this issue Aug 16, 2015

Deprecate colon(::AbstractFloat, ::AbstractFloat) #12624

Closed

jakebolewski added this to the 0.5 milestone Aug 19, 2015

roelstappers mentioned this issue Aug 25, 2020

collect on a Adjoint{T,Array{T,1}} should return a Adjoint{T,Array{T,1}} JuliaLang/LinearAlgebra.jl#762

Closed

Why is length defined on numbers? #11769

Why is length defined on numbers? #11769

Comments

samuela commented Jun 19, 2015

joehuchette commented Jun 19, 2015

samuela commented Jun 19, 2015

nalimilan commented Jun 19, 2015

Keno commented Jun 19, 2015

carnaval commented Jun 19, 2015

samuela commented Jun 19, 2015

jakebolewski commented Jun 19, 2015

Keno commented Jun 19, 2015

nalimilan commented Jun 19, 2015

JeffBezanson commented Jun 19, 2015

johnmyleswhite commented Jun 19, 2015

mbauman commented Jun 19, 2015

carlobaldassi commented Jun 19, 2015

joehuchette commented Jun 19, 2015

carlobaldassi commented Jun 19, 2015

samuela commented Jun 19, 2015

ScottPJones commented Jun 19, 2015

carlobaldassi commented Jun 19, 2015

ScottPJones commented Jun 19, 2015

timholy commented Jun 20, 2015

Keno commented Jun 20, 2015

timholy commented Jun 20, 2015

johnmyleswhite commented Jun 20, 2015

Keno commented Jun 20, 2015

timholy commented Jun 20, 2015

mbauman commented Jun 20, 2015

JeffBezanson commented Jun 20, 2015

Keno commented Jun 20, 2015

JeffBezanson commented Jun 20, 2015

nalimilan commented Jun 20, 2015

JeffBezanson commented Jun 20, 2015

matthieugomez commented Jul 15, 2015

malmaud commented Jul 15, 2015

rofinn commented Aug 18, 2015

KristofferC commented Aug 18, 2015

ScottPJones commented Aug 18, 2015

samuela commented Aug 19, 2015

Why is `length` defined on numbers? #11769

Why is `length` defined on numbers? #11769