Provide wrapper for nan_array to give lazy object correct dtype #2530

djkirkham · 2017-05-09T08:36:00Z

As I suggested yesterday, this adds a wrapper class around an array to make dask think it has a particular dtype. This should make it significantly easier to fix the issue of getting the correct dtype for cube operations (#2528).

Currently the tests fail.

djkirkham · 2017-05-09T08:36:15Z

@pp-mo @dkillick @bjlittle

djkirkham · 2017-05-09T09:59:33Z

This causes concatenating cubes to give incorrect results in some cases. Here's some code that demonstrates what goes wrong at the dask level:

import dask.array as da
import numpy as np

class _DtypeWrapper(object):
    def __init__(self, array, dtype):
        self.dtype = dtype
        self.array = array

    @property
    def shape(self):
        return self.array.shape

    @property
    def ndim(self):
        return self.array.ndim

    def __getitem__(self, item):
        return self.array.__getitem__(item)


a = np.array([1,2])
b = np.array([3.,np.nan])
b_wrapped = _DtypeWrapper(b, dtype=int)
a_dask = da.from_array(a, 1)
b_dask = da.from_array(b_wrapped, 1)
c = da.concatenate([a_dask, b_dask])

print(c.compute())

yields:

[                   1                    2                    3
 -9223372036854775808]

This is because the second array is being interpreted as integer (compare with b.astype(int)).
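For comparison, here is a minimal NumPy-only sketch of the two behaviours involved, reusing the a and b arrays from the example above. It assumes nothing about dask internals: it only contrasts ordinary dtype promotion with a forced integer cast. The variable names promoted and forced are illustrative, and the integer that NaN turns into is platform-dependent, so no specific value is asserted for it.

```python
import warnings

import numpy as np

a = np.array([1, 2])
b = np.array([3.0, np.nan])

# Ordinary promotion: mixing int and float arrays yields float64,
# so the NaN survives the concatenation.
promoted = np.concatenate([a, b])
print(promoted.dtype)
print(promoted)

# Forced integer interpretation: NaN has no integer representation,
# so the cast result for that element is platform-dependent.
# Newer NumPy versions emit a RuntimeWarning here, which is silenced.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    forced = b.astype(np.int64)
print(forced.dtype)
print(forced)
```

The wrapper declares dtype=int for an array that actually holds floats, so the result looks like the second case rather than the first.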

Provide wrapper for nan_array to give lazy object correct dtype

b81578f

QuLogic added the Status: Work in Progress label May 9, 2017

djkirkham closed this May 9, 2017

QuLogic removed the Status: Work in Progress label May 9, 2017

djkirkham deleted the nan_proxy branch October 26, 2017 13:02
