Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read_dates returns dtype('O') instead of dtype('<M8[ns]') #49

Closed
sebhahn opened this issue Apr 27, 2020 · 4 comments
Closed

read_dates returns dtype('O') instead of dtype('<M8[ns]') #49

sebhahn opened this issue Apr 27, 2020 · 4 comments

Comments

@sebhahn
Copy link
Member

sebhahn commented Apr 27, 2020

netCDF4.num2date used here seems to return a numpy.dtype('O'), which laters leads to troubles because it is expected to get a datetime object, like numpy.dtype('<M8[ns]'). I couldn't track down if this is related to a certain version of netCDF4 or numpy (or even cftime)

Using return self.dates.astype('datetime64[ns]') here seems to be a workaround, but I'm not sure if this is the best solution give the fact that is unknown which package/version is causing the problem.

@sebhahn
Copy link
Member Author

sebhahn commented Apr 30, 2020

Ok the problem is related to the fact that the default behavior of netCDF4.num2date has changed. I already added the type conversion mentioned above in version 0.2.1, but it seems like some tests in other packages (like ascat) are failing because the returned data type of netCDF4.num2date has changed from cftime.datetime instead of datetime.datetime (see this issue Unidata/netcdf4-python#994 and this related Unidata/cftime#136). Ultimately this leads to some rounding issues in the millisecond region of time stamps.

@cpaulik: Do you have an opinion on this issue? Should we change to the previous default behavior forcing to return datetime.datetime or should we stick to the new convention but update the tests accordingly.

I think the type conversion to numpy.datetime64 should stay in any case.

Reverting it to the "old behavior" would look like:

    def read_dates(self, loc_id):
        """
        Read time stamps and convert them.
        """
        self.dates = netCDF4.num2date(
            self.read_time(loc_id),
            units=self.dataset.variables[self.time_var].units,
            calendar='standard', only_use_cftime_datetimes=False,
            only_use_python_datetimes=True)

        return self.dates.astype('datetime64[ns]')

@cpaulik
Copy link
Collaborator

cpaulik commented Apr 30, 2020

Is there any benefit to the cftime object? If not then I don't really care. Whatever is easier.

@sebhahn
Copy link
Member Author

sebhahn commented Apr 30, 2020

As discussed in these issues Unidata/netcdf4-python#981 and Unidata/cftime#134 there were some problems with rounding and negative times, which can be surprising sometimes.

Probably a good summary is this comment:

rkouznetsov commented on Nov 23, 2019
Looks like a mess... Certainly, returning different objects depending on subtle differences in the input is confusing. Having two time-handling libraries, one precise, but limited in time, another imprecise, but more flexible and universal, i would prefer to explicitly force use of one or another depending on my application, may be even with different functoins. In the current implementation one can force cftime, but, unfortunately, one can not force using datetime. That seems to be a missing feature.

Automatic choice and/or fallbacks is a way to an unexpected behaviour, so if datetime is forced, an exception should be generated.

Makes sense?

@sebhahn
Copy link
Member Author

sebhahn commented Apr 30, 2020

Reverted it to old behavior with a7ee3a9 for now

@sebhahn sebhahn closed this as completed Apr 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants