Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possibly undocumented behavior of indexing assignment to arrays #40018

Open
piyushsrv opened this issue Mar 14, 2021 · 6 comments
Open

Possibly undocumented behavior of indexing assignment to arrays #40018

piyushsrv opened this issue Mar 14, 2021 · 6 comments
Labels
arrays [a, r, r, a, y, s]

Comments

@piyushsrv
Copy link

piyushsrv commented Mar 14, 2021

This is based on a post on the forum.

The documentation for indexing assignment says:

If any index I_k selects more than one location, then the right hand side X must be an array with the same shape as the result of indexing A[I_1, I_2, ..., I_n] or a vector with the same number of elements. The value in location I_1[i_1], I_2[i_2], ..., I_n[i_n] of A is overwritten with the value X[I_1, I_2, ..., I_n] , converting if necessary.

I am not sure if this is a bug, but the behavior in 1.5.4 (and also 1.5.3) seems to slightly different:

A = zeros(10, 10); #10 x 10 Array{T, 2}
                                                                   
b = ones(100); #100 element Array{T, 1}
                                                                   
s = zeros(1, 100); #1 x 100 Array{T, 2}
                                                                   
t = zeros(100, 1); # 100 x 1 Array{T, 2}

Now as documented,

 A[:, :] = b # works as documented

fills A with ones. The behavior with s and t, however, is a bit puzzling. Neither of them are vectors. However,

 A[:, :] = s

fills A with zeros, even though this behavior does not seem to be documented in the quote from the “Indexing assignment” section reproduced above, as s neither has the same dimensions as A, nor is it a “vector”. Indeed,

 A[:, :] = t

raises precisely this error:

ERROR: DimensionMismatch("tried to assign 100×1 array to 10×10 destination")

A comment on the forum by Tamas Papp also suggests that this might be undocumented behavior. More precisely, I am not sure if it is a documented feature that the assignment with s above should work, while the one with t should fail. (The quote from the documentation seems to suggest both should fail with an error).

julia> versioninfo()
Julia Version 1.5.4
Commit 69fcb5745b (2021-03-11 19:13 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
@JeffBezanson JeffBezanson added the arrays [a, r, r, a, y, s] label Mar 15, 2021
@piyushsrv
Copy link
Author

piyushsrv commented Dec 23, 2021

The same (possibly undocumented) indexing behavior as in the original report persists in julia 1.8.4. To summarize, it seems that some indexing operations treat 1 x 100 Matrix{T} as a Vector{T}, but do not treat 100 x 1 Matrix{T} as a Vector{T}. According to the part of the documentation quoted in the original report above, neither of two kinds of Matrix{T} instances should be treated as a Vector{T}.

julia> versioninfo()
Julia Version 1.8.4
Commit 00177ebc4fc (2022-12-23 21:32 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 1 on 4 virtual cores

@piyushsrv
Copy link
Author

I should note that the wording of the documentation has changed slightly to the following:

If any index I_k is itself an array, then the right hand side X must also be an array with the same shape as the result of indexing A[I_1, I_2, ..., I_n] or a vector with the same number of elements. The value in location I_1[i_1], I_2[i_2], ..., I_n[i_n] of A is overwritten with the value X[I_1, I_2, ..., I_n], converting if necessary. The element-wise assignment operator .= may be used to broadcast X across the selected locations:

However, this does not affect the problem above. julia.1.8.4 still treats a 100 x 1 array as a Vector while not treating a 1 x 100 array as a vector, as described in the original report for julia 1.5.4 above.

@StefanKarpinski
Copy link
Member

Requesting the attention of @mbauman.

@piyushsrv
Copy link
Author

piyushsrv commented Jan 10, 2023

On the forum post where I first posted this as a question, User Tamas_Papp linked to the code of the function setindex_shape_check, which seems to check when A[I_1, I_2, ..., I_n] = X (where X <: AbstractArray) is to be allowed.

Based on my reading of the code, it appears to me that the documentation quoted above is not a correct description of the behavior of setindex_shape_check. What the function setindex_shape_check is doing is better described by the following text:

Let J_1, J_2, ..., J_m be the lengths of the m (possibly different from n) axes of X. Then

A[I_1, I_2, ..., I_n] = X

is allowed if, and only if, after inserting an arbitrary number of 1's in the two tuples

L1 = (length(I_1), length(I_2), ..., length(I_n)),

and

L2 = (length(J_1), length(J_2), ..., length(J_m)),

they can be converted into two tuples L1' and L2' of lengths n' >= n and m' >= m respectively for which the following are true:

  1. L1'[n'] = L1[n] and L2'[m'] = L2[m] (i.e., no 1's are inserted at the end).
  2. If $n' &lt; m'$ then L1'[i] = L2'[i] for $1 \leq i \leq n' - 1$, and L1'[n'] is equal to the product $\prod_{n' \leq j \leq m'}$ L2'[j].
  3. If $m' \le n'$ then L1'[i] = L2'[i] for $1 \leq i \leq m' - 1$, and L2'[m'] is equal to the product $\prod_{m' \leq j \leq n'}$ L1'[j].

The behavior of setindex_shape_check described above explains the code snippets I posted. However, it is clear that its behavior is indeed different from what is described currently in the julia language documentation.

I can try to work on a different version of the text that describes the behavior of setindex_shape_check.

The behavior of setindex_shape_check is also different from the documented behavior of broadcast (i.e., broadcast is more conservative in handling axes of dimension 1 than the current implementation of setindex_shape_check). It is also therefore different from what other comparable packages (e.g. numpy) document their behavior to be: for example, numpy documentation says:

You may use slicing to set values in the array, but (unlike lists) you can never grow the array. The size of the value to be set in x[obj] = value must be (broadcastable to) the same shape as x[obj].

@mbauman
Copy link
Member

mbauman commented Jan 10, 2023

I like the documented behavior significantly better than the implemented one. The behavior of setindex_shape_check is quirky and old (like, it hasn't been meaningfully touched since v0.3). Its behaviors align with Julia of that vintage, namely:

I simply missed updating this check as we changed those behaviors.

So that's the how and the why... the more interesting question is the what are we gonna do about it. I don't really want to change the documentation to be more complicated here — it's already more complicated than I'd like. And I'd wager there is code in the wild relying upon, e.g., a horizontal "row vector" assignment that only works because we skip those unitary dimensions.

@tpapp
Copy link
Contributor

tpapp commented Jan 11, 2023

code in the wild relying upon, e.g., a horizontal "row vector" assignment that only works because we skip those unitary dimensions

Would this code error if the implementation is made consistent with the docs, or silently do the wrong thing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s]
Projects
None yet
Development

No branches or pull requests

5 participants