read_dates returns `dtype('O')` instead of `dtype('<M8[ns]')` #49

sebhahn · 2020-04-27T08:42:57Z

netCDF4.num2date used here seems to return an array with numpy.dtype('O'), which later leads to trouble because a datetime dtype such as numpy.dtype('<M8[ns]') is expected. I couldn't track down whether this is related to a certain version of netCDF4 or numpy (or even cftime).

Using return self.dates.astype('datetime64[ns]') here seems to be a workaround, but I'm not sure if this is the best solution, given that it is unknown which package/version is causing the problem.
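For illustration, a minimal sketch of what the workaround above does, using only numpy. The `dates` array is a hypothetical stand-in for the object array coming back from num2date and holds plain datetime.datetime values; whether the same cast works for every cftime object type is not shown here.

```python
import datetime

import numpy as np

# Hypothetical stand-in for the num2date result: an object array
# holding datetime.datetime instances.
dates = np.array(
    [
        datetime.datetime(2020, 4, 27, 8, 42, 57),
        datetime.datetime(2020, 4, 30, 10, 31, 46),
    ],
    dtype=object,
)
print(dates.dtype)

# The workaround: cast the object array to a proper numpy datetime dtype.
converted = dates.astype('datetime64[ns]')
print(converted.dtype)
```

The cast leaves the values unchanged and only replaces the generic object dtype with a datetime dtype that downstream code can rely on.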

sebhahn · 2020-04-30T10:31:46Z

Ok, the problem is that the default behavior of netCDF4.num2date has changed. I already added the type conversion mentioned above in version 0.2.1, but it seems some tests in other packages (like ascat) are failing because the data type returned by netCDF4.num2date has changed to cftime.datetime instead of datetime.datetime (see issue Unidata/netcdf4-python#994 and the related Unidata/cftime#136). Ultimately this leads to some rounding issues in the millisecond region of time stamps.
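If the tests were updated rather than the default reverted, one option would be to compare time stamps with a tolerance instead of exactly. The following is only a sketch with made-up values: `expected`, `actual`, and the 1 ms tolerance are assumptions for illustration and are not taken from the ascat tests.

```python
import numpy as np

# Made-up reference time stamp with millisecond precision.
expected = np.array(['2020-04-30T10:31:46.123000000'], dtype='datetime64[ns]')

# Hypothetical conversion result that is off by a few hundred microseconds.
actual = expected + np.timedelta64(400, 'us')

# An exact comparison fails on the sub-millisecond difference ...
exact_match = np.array_equal(actual, expected)

# ... while a tolerance-based comparison accepts it.
tolerance = np.timedelta64(1, 'ms')  # assumed tolerance, adjust as needed
close_enough = bool(np.all(np.abs(actual - expected) <= tolerance))

print(exact_match, close_enough)
```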

@cpaulik: Do you have an opinion on this issue? Should we change back to the previous default behavior, forcing it to return datetime.datetime, or should we stick to the new convention and update the tests accordingly?

I think the type conversion to numpy.datetime64 should stay in any case.

Reverting it to the "old behavior" would look like:

    def read_dates(self, loc_id):
        """
        Read time stamps and convert them.
        """
        self.dates = netCDF4.num2date(
            self.read_time(loc_id),
            units=self.dataset.variables[self.time_var].units,
            calendar='standard', only_use_cftime_datetimes=False,
            only_use_python_datetimes=True)

        return self.dates.astype('datetime64[ns]')

cpaulik · 2020-04-30T12:04:30Z

Is there any benefit to the cftime object? If not then I don't really care. Whatever is easier.

sebhahn · 2020-04-30T12:15:41Z

As discussed in these issues Unidata/netcdf4-python#981 and Unidata/cftime#134 there were some problems with rounding and negative times, which can be surprising sometimes.

Probably a good summary is this comment:

rkouznetsov commented on Nov 23, 2019
Looks like a mess... Certainly, returning different objects depending on subtle differences in the input is confusing. Having two time-handling libraries, one precise, but limited in time, another imprecise, but more flexible and universal, i would prefer to explicitly force use of one or another depending on my application, maybe even with different functions. In the current implementation one can force cftime, but, unfortunately, one can not force using datetime. That seems to be a missing feature.

Automatic choice and/or fallbacks is a way to an unexpected behaviour, so if datetime is forced, an exception should be generated.

Makes sense?

sebhahn · 2020-04-30T15:06:46Z

Reverted it to old behavior with a7ee3a9 for now

sebhahn closed this as completed Apr 30, 2020
