Improving indexing operations of ranges of bit integers #26608

haampie · 2018-03-24T18:52:44Z

Currently, computations in getindex(::UnitRange, ::Integer) can overflow during the bounds check. If so, it throws an InexactError; not a BoundsError. Compare:

> (-100:100)[400] # ok
ERROR: BoundsError: attempt to access 201-element UnitRange{Int64} at index [400]
> (Int8(-100):Int8(100))[400] # :(
ERROR: InexactError: trunc(Int8, 299)
> (-100:100)[typemax(UInt)] # :(
ERROR: InexactError: check_top_bit(UInt64, 18446744073709551514)

I think I have fixed this for all bit integer types (both the range type and index type) with this hack:

using Base: BitInteger

function getindex(r::UnitRange{Tr}, i::Integer) where {Tr<:BitInteger}
    @_inline_meta
    # if i ≥ 1 we can safely do i - 1 without ending up at typemax(i)
    # if r.start ≤ r.stop we could compute r.stop - r.start, but this number may not be representable in signed integers, so switch to unsigned integers with a hack.
    @boundscheck i ≥ 1 && r.start ≤ r.stop && i - 1 ≤ unsigned(r.stop) - unsigned(r.start) || throw_boundserror(r, i)

    return (r.start + (i - 1)) % Tr
end

But I'm not 100% sure this is the way to go and if I'm covering everything. At least it fixes the problems listed above:

> (Int8(-100):Int8(100))[400]
ERROR: BoundsError: attempt to access 201-element UnitRange{Int8} at index [400]
> (-100:100)[typemax(UInt)]
ERROR: BoundsError: attempt to access 201-element UnitRange{Int64} at index [18446744073709551615]

The text was updated successfully, but these errors were encountered:

haampie · 2018-03-25T14:54:42Z

Another issue not solved by the above:

> (false : true)[3]
ERROR: InexactError()
Stacktrace:
 [1] convert(::Type{Bool}, ::Int64) at ./bool.jl:7
 [2] getindex(::UnitRange{Bool}, ::Int64) at ./range.jl:476

mbauman · 2018-03-25T20:41:10Z

The other thing to keep in mind here is just how performance-sensitive bounds checks are. That really constrains our ability to do fancy things. I'd love for this to be better, though! The abstract checkbounds works in all your error cases, but I'm certain that we're using a custom bounds checking implementation as an optimization. It might be worth doing an updated check on the difference in performance there.

haampie · 2018-03-26T08:31:39Z

Thanks for the reply! A few observations:

My implementation suggested above is indeed slower than the current implementation
The checkbounds method is nearly twice as fast as the current implementation, but unfortunately fails when length(r) overflows
```
> checkbounds(typemin(Int) : typemax(Int), 10)
ERROR: OverflowError: 9223372036854775807 - -9223372036854775808 overflowed for type Int64
```
Now length(r) will always overflow for typemin(Int) : typemax(Int), since the length is typemax(UInt)+1. I think this issue was the rationale for the current implementation rather than performance.
The current implementation has overhead even when @inbounds is used; the bottleneck seems convert(T, ...) which is called outside of @boundscheck.

Considering all this, I think it's best to keep the current implementation, but remove the convert call:

function new_getindex(v::UnitRange{T}, i::Integer) where T
    @_inline_meta
    value = v.start + i - 1 # Overflow is well-defined behaviour; if this overflows then value < start regardless
    @boundscheck ((i > 0) & (value ≥ v.start) & (value ≤ v.stop)) || throw_boundserror(v, i)
    value % T # This improves performance
end

It works for all edge cases above. I've benchmarked it with r = Int8(-128):Int8(127):

function sum_impl_1(r::UnitRange{T}) where {T}
    sum = zero(T)
    for idx = linearindices(r) # idx of type Int64
        sum += r[idx]
    end
    sum
end

function sum_impl_2(r::UnitRange{T}) where {T}
    sum = zero(T)
    for idx = linearindices(r) # idx of type Int64
        sum += new_getindex(r, idx)
    end
    sum
end

function sum_impl_3(r::UnitRange{T}) where {T}
    sum = zero(T)
    val = r.start
    while true
        sum += val
        val += T(1)

        if val == r.stop
            break
        end
    end
    sum
end

Which gives:

With bounds checking: 141.935 ns vs 74.925 ns vs 1.451 ns respectively;
Without bounds checking: 142.058 ns vs 1.452 ns vs 1.452 ns respectively.

Interpretation: new implementation is faster with bounds checking, and without bounds checking the compiler is able to optimize the whole for-loop out. I'll just make this a PR and see whether it passes tests.

mbauman added the arrays [a, r, r, a, y, s] label Mar 25, 2018

haampie mentioned this issue Mar 26, 2018

Fix conversion in getindex #26615

Merged

KristofferC closed this as completed in #26615 Jun 26, 2018

ethomag mentioned this issue May 3, 2020

range(0, length=Uint(0)) throws InexactError #35711

Closed

wildart mentioned this issue Nov 22, 2020

counts(x::IntegerArray) doesn't play nice with arrays of random integers JuliaStats/StatsBase.jl#617

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improving indexing operations of ranges of bit integers #26608

Improving indexing operations of ranges of bit integers #26608

haampie commented Mar 24, 2018 •

edited

Loading

haampie commented Mar 25, 2018 •

edited

Loading

mbauman commented Mar 25, 2018

haampie commented Mar 26, 2018 •

edited

Loading

Improving indexing operations of ranges of bit integers #26608

Improving indexing operations of ranges of bit integers #26608

Comments

haampie commented Mar 24, 2018 • edited Loading

haampie commented Mar 25, 2018 • edited Loading

mbauman commented Mar 25, 2018

haampie commented Mar 26, 2018 • edited Loading

haampie commented Mar 24, 2018 •

edited

Loading

haampie commented Mar 25, 2018 •

edited

Loading

haampie commented Mar 26, 2018 •

edited

Loading