The default dtype of an array is platform-dependent (#9464).
When tests run in continuous integration on multiple platforms (Windows, macOS, Linux), the fact that the default dtype of arrays can vary must be taken into account.
The issue appears for tests that rely on NumPy's array representations. The default dtype of an array is not displayed in its representation, so the expected output representation depends on the platform. Writing OS-specific tests becomes unavoidable.
What I would like is to be able to write platform-independent, repeatable outputs that can be used for automated testing.
Example
Actual
On my machine, the default dtype for integer arrays is int64. Here are some examples of array creations and their representations:
When creating an array with no dtype kwarg, the default dtype is used. The array representation alone is not enough to know the actual dtype.
When creating an array with a dtype kwarg matching the default integer dtype of the platform, the resulting array representation is the same, and dtype is also implicit.
The last case is the most explicit: the user provides the expected dtype, and the representation reflects that. This only works for non-default dtypes.
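The three cases above can be reproduced with a short sketch. It assumes only public NumPy calls; `np.dtype(int)` is used here to look up the platform's default integer dtype, and `int16` stands in for "a non-default dtype".

```python
import numpy as np

# Platform default integer dtype (int64 on most 64-bit systems).
default_int = np.dtype(int)

implicit = np.array([1, 2, 3])                              # no dtype kwarg
explicit_default = np.array([1, 2, 3], dtype=default_int)   # dtype matches the default
explicit_other = np.array([1, 2, 3], dtype=np.int16)        # non-default dtype

print(repr(implicit))          # array([1, 2, 3])
print(repr(explicit_default))  # array([1, 2, 3])
print(repr(explicit_other))    # array([1, 2, 3], dtype=int16)
```

The first two representations are identical, so the text alone does not reveal which integer width the platform picked.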
Desired
The dtype is always printed, and the default dtype does not influence the representation. Today the default dtype depends on the platform and the representation depends on the default dtype; always printing the dtype breaks that chain, so the representation no longer depends on the platform. Writing platform-independent tests that rely on the representation becomes easier.
Existing solutions I looked for
I first looked into https://numpy.org/doc/stable/reference/generated/numpy.set_printoptions.html
I experimented with the kwarg legacy='1.13' and legacy='1.21', without success. Also, even if I had been successful, I would have disliked relying on a kwarg named legacy, which strongly implies it should not be used in new code.
Proposed solution
Adding a new dtype printing option
import numpy as np
np.set_printoptions(dtype="default")  # current behaviour
np.set_printoptions(dtype="always")   # always print dtype
np.set_printoptions(dtype="never")    # never print dtype
Technical Analysis
The function _array_repr_implementation implements the array representation logic. We can see where it adds the dtype suffix, and there is currently no way to force the dtype to be printed or to force it to be omitted.
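To make the requested behaviour concrete, here is a minimal user-side sketch of the three modes. It is not NumPy's implementation and does not touch `_array_repr_implementation`; the helper name `repr_with_dtype_mode` and its `mode` argument are hypothetical, and the sketch only targets simple numeric dtypes where `str(arr.dtype)` yields a plain name such as `int64`.

```python
import numpy as np


def repr_with_dtype_mode(arr, mode="default"):
    """Hypothetical helper mimicking the proposed 'default'/'always'/'never' modes."""
    if mode == "default":
        # Defer to NumPy: the suffix is printed only for non-default dtypes.
        return repr(arr)
    # Format the elements the way repr does, aligned under the "array(" prefix.
    body = np.array2string(arr, separator=", ", prefix="array(")
    if mode == "always":
        return f"array({body}, dtype={arr.dtype})"
    if mode == "never":
        return f"array({body})"
    raise ValueError(f"unknown mode: {mode!r}")


a64 = np.array([1, 2, 3], dtype=np.int64)
a16 = np.array([1, 2, 3], dtype=np.int16)

print(repr_with_dtype_mode(a64, "always"))  # array([1, 2, 3], dtype=int64)
print(repr_with_dtype_mode(a16, "never"))   # array([1, 2, 3])
```

With `mode="always"` the output is the same string on every platform, which is the property the tests need.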
FWIW, I changed things so that in NumPy 2.0 the default on Windows is also 64-bit. It is still 32-bit on 32-bit platforms, though, so it doesn't fully remove the platform issue. Just hopefully the worst caveat.
I don't have an opinion on always printing it. But since we hide it, having an option in the printoptions for it seems very reasonable to me. (Not sure I think there is much reason to always hide it.)
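To see which default a given environment actually uses, a quick check along these lines may help. It is a sketch that relies only on `np.dtype(int)` and `np.__version__`; the variable names are illustrative.

```python
import numpy as np

# The dtype NumPy picks for Python integers when none is specified.
default_int = np.dtype(int)
bits = default_int.itemsize * 8

print(f"NumPy {np.__version__}: default integer dtype is {default_int} ({bits}-bit)")

# An array created without a dtype kwarg uses that same default.
sample = np.array([1, 2, 3])
print(sample.dtype == default_int)  # True
```

Running this in each CI job shows which platforms would produce a different representation for the same explicit dtype.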
Proposed new feature or change:
Motive
The default dtype of an array is platform-dependent (#9464).
When tests run in continuous integration on multiple platforms (Windows, macOS, Linux), the fact that the default dtype of arrays can vary must be taken into account.
The issue appears for tests that rely on NumPy's array representations. The default dtype of an array is not displayed in its representation, so the expected output representation depends on the platform. Writing OS-specific tests becomes unavoidable.
What I would like is to be able to write platform-independent, repeatable outputs that can be used for automated testing.
Example
Actual
On my machine, the default dtype for integer arrays is int64. Here are some examples of array creations and their representations:
We can see that:
Desired
The dtype is always printed, and the default dtype does not influence the representation. Today the default dtype depends on the platform and the representation depends on the default dtype; always printing the dtype breaks that chain, so the representation no longer depends on the platform. Writing platform-independent tests that rely on the representation becomes easier.
from
platform <- default dtype <- repr
=> platform <- repr
to
platform <- default dtype </- repr
=> platform </- repr
Existing solutions I looked for
np.set_printoptions
I first looked into https://numpy.org/doc/stable/reference/generated/numpy.set_printoptions.html
I experimented with the kwarg legacy='1.13' and legacy='1.21', without success. Also, even if I had been successful, I would have disliked relying on a kwarg named legacy, which strongly implies it should not be used in new code.
Proposed solution
Adding a new dtype printing option
Technical Analysis
The function _array_repr_implementation implements the array representation logic. We can see where it adds the dtype suffix, and there is currently no way to force the dtype to be printed or to force it to be omitted.
Allowing this param to be overridden could be helpful:
Role of the proposed new skipdtype three-valued kwarg:
None: current behaviour, platform-dependent
False: always print the ", dtype=..." suffix
True: never print the ", dtype=..." suffix
Additional links
nbytes representation in DataArrays and Dataset repr
pydata/xarray#8702