Error when using engine='scipy' reading CM2.6 ocean output #1704

Closed
jbusecke opened this issue Nov 9, 2017 · 7 comments

Comments

@jbusecke
Contributor

jbusecke commented Nov 9, 2017

Code Sample, a copy-pastable example if possible

import os
import xarray as xr

path = '/work/Julius.Busecke/CM2.6_staged/CM2.6_A_V03_1PctTo2X/annual_averages'
ds_ocean = xr.open_mfdataset(os.path.join(path, 'ocean.*.ann.nc'), chunks={'time': 1},
                             decode_times=False, engine='scipy')
ds_ocean

gives

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-278556ff881c> in <module>()
      1 path = '/work/Julius.Busecke/CM2.6_staged/CM2.6_A_V03_1PctTo2X/annual_averages'
----> 2 ds_ocean = xr.open_mfdataset(os.path.join(path,'ocean.*.ann.nc'), chunks={'time':1}, decode_times=False, engine='scipy')
      3 ds_ocean

~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/api.py in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, lock, **kwargs)
    503         lock = _default_lock(paths[0], engine)
    504     datasets = [open_dataset(p, engine=engine, chunks=chunks or {}, lock=lock,
--> 505                              **kwargs) for p in paths]
    506     file_objs = [ds._file_obj for ds in datasets]
    507 

~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/api.py in <listcomp>(.0)
    503         lock = _default_lock(paths[0], engine)
    504     datasets = [open_dataset(p, engine=engine, chunks=chunks or {}, lock=lock,
--> 505                              **kwargs) for p in paths]
    506     file_objs = [ds._file_obj for ds in datasets]
    507 

~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, group, decode_cf, mask_and_scale, decode_times, autoclose, concat_characters, decode_coords, engine, chunks, lock, cache, drop_variables)
    283         elif engine == 'scipy':
    284             store = backends.ScipyDataStore(filename_or_obj,
--> 285                                             autoclose=autoclose)
    286         elif engine == 'pydap':
    287             store = backends.PydapDataStore(filename_or_obj)

~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/scipy_.py in __init__(self, filename_or_obj, mode, format, group, writer, mmap, autoclose)
    133                                    filename=filename_or_obj,
    134                                    mode=mode, mmap=mmap, version=version)
--> 135         self.ds = opener()
    136         self._autoclose = autoclose
    137         self._isopen = True

~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/scipy_.py in _open_scipy_netcdf(filename, mode, mmap, version)
     81     try:
     82         return scipy.io.netcdf_file(filename, mode=mode, mmap=mmap,
---> 83                                     version=version)
     84     except TypeError as e:  # netcdf3 message is obscure in this case
     85         errmsg = e.args[0]

~/code/miniconda/envs/standard/lib/python3.6/site-packages/scipy/io/netcdf.py in __init__(self, filename, mode, mmap, version, maskandscale)
    264 
    265         if mode in 'ra':
--> 266             self._read()
    267 
    268     def __setattr__(self, attr, value):

~/code/miniconda/envs/standard/lib/python3.6/site-packages/scipy/io/netcdf.py in _read(self)
    591         self._read_dim_array()
    592         self._read_gatt_array()
--> 593         self._read_var_array()
    594 
    595     def _read_numrecs(self):

~/code/miniconda/envs/standard/lib/python3.6/site-packages/scipy/io/netcdf.py in _read_var_array(self)
    696             # Build rec array.
    697             if self.use_mmap:
--> 698                 rec_array = self._mm_buf[begin:begin+self._recs*self._recsize].view(dtype=dtypes)
    699                 rec_array.shape = (self._recs,)
    700             else:

ValueError: new type not compatible with array.

xarray version: '0.9.6'

Problem description

I am trying to lazily read in a large number of high resolution ocean model output files. If I omit the engine='scipy' it works but takes forever.
Is there a known reason why this would fail with the 'scipy' option?

I found #1313, and checked my conda environment:

$ conda list hdf
# packages in environment at /home/Julius.Busecke/code/miniconda/envs/standard:
#
hdf4                      4.2.12                        0    conda-forge
hdf5                      1.8.18                        1    conda-forge
$ conda list netcdf
# packages in environment at /home/Julius.Busecke/code/miniconda/envs/standard:
#
h5netcdf                  0.4.2                      py_0    conda-forge
libnetcdf                 4.4.1.1                       6    conda-forge
netcdf4                   1.3.0                    py36_0    conda-forge

I can also import netCDF4 and load a single file using netCDF, so I am unsure whether this is the same error as in #1313.

I keep getting this error with some of the files for this particular model but not with others.

Any help would be greatly appreciated.

@shoyer
Member

shoyer commented Nov 9, 2017

Can you share an example file?

Note that there are some performance improvements for open_mfdataset in the next version of xarray, which you could test out with the release candidate if you like.

@jbusecke
Contributor Author

Ok I just tried to read the files with the new RC. Same error.

here is the output of xr.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-642.15.1.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US
LOCALE: en_US.ISO8859-1

xarray: 0.10.0rc1
pandas: 0.21.0
numpy: 1.13.3
scipy: 0.19.1
netCDF4: 1.3.1
h5netcdf: None
Nio: None
bottleneck: None
cyordereddict: None
dask: 0.15.4
matplotlib: 2.1.0
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 36.6.0
pip: 9.0.1
conda: None
pytest: 3.2.3
IPython: 6.2.1
sphinx: None

and here is an ncdump -h of one of the files:

$ ncdump -h ocean.0198.ann.nc
netcdf ocean.0198.ann {
dimensions:
	xt_ocean = 3600 ;
	yt_ocean = 2700 ;
	time = UNLIMITED ; // (1 currently)
	nv = 2 ;
	xu_ocean = 3600 ;
	yu_ocean = 2700 ;
	st_ocean = 50 ;
	st_edges_ocean = 51 ;
	sw_edges_ocean = 51 ;
	sw_ocean = 50 ;
variables:
	double xt_ocean(xt_ocean) ;
		xt_ocean:long_name = "tcell longitude" ;
		xt_ocean:units = "degrees_E" ;
		xt_ocean:cartesian_axis = "X" ;
	double yt_ocean(yt_ocean) ;
		yt_ocean:long_name = "tcell latitude" ;
		yt_ocean:units = "degrees_N" ;
		yt_ocean:cartesian_axis = "Y" ;
	double time(time) ;
		time:long_name = "time" ;
		time:units = "days since 0001-01-01 00:00:00" ;
		time:cartesian_axis = "T" ;
		time:calendar_type = "JULIAN" ;
		time:calendar = "JULIAN" ;
		time:bounds = "time_bounds" ;
	double nv(nv) ;
		nv:long_name = "vertex number" ;
		nv:units = "none" ;
		nv:cartesian_axis = "N" ;
	double xu_ocean(xu_ocean) ;
		xu_ocean:long_name = "ucell longitude" ;
		xu_ocean:units = "degrees_E" ;
		xu_ocean:cartesian_axis = "X" ;
	double yu_ocean(yu_ocean) ;
		yu_ocean:long_name = "ucell latitude" ;
		yu_ocean:units = "degrees_N" ;
		yu_ocean:cartesian_axis = "Y" ;
	double st_ocean(st_ocean) ;
		st_ocean:long_name = "tcell zstar depth" ;
		st_ocean:units = "meters" ;
		st_ocean:cartesian_axis = "Z" ;
		st_ocean:positive = "down" ;
		st_ocean:edges = "st_edges_ocean" ;
	double st_edges_ocean(st_edges_ocean) ;
		st_edges_ocean:long_name = "tcell zstar depth edges" ;
		st_edges_ocean:units = "meters" ;
		st_edges_ocean:cartesian_axis = "Z" ;
		st_edges_ocean:positive = "down" ;
	double sw_edges_ocean(sw_edges_ocean) ;
		sw_edges_ocean:long_name = "ucell zstar depth edges" ;
		sw_edges_ocean:units = "meters" ;
		sw_edges_ocean:cartesian_axis = "Z" ;
		sw_edges_ocean:positive = "down" ;
	float geolon_t(yt_ocean, xt_ocean) ;
		geolon_t:long_name = "tracer longitude" ;
		geolon_t:units = "degrees_E" ;
		geolon_t:valid_range = -281.f, 361.f ;
		geolon_t:missing_value = 1.e+20f ;
		geolon_t:_FillValue = 1.e+20f ;
		geolon_t:cell_methods = "time: point" ;
		geolon_t:coordinates = "geolon_t geolat_t" ;
	float geolat_t(yt_ocean, xt_ocean) ;
		geolat_t:long_name = "tracer latitude" ;
		geolat_t:units = "degrees_N" ;
		geolat_t:valid_range = -91.f, 91.f ;
		geolat_t:missing_value = 1.e+20f ;
		geolat_t:_FillValue = 1.e+20f ;
		geolat_t:cell_methods = "time: point" ;
		geolat_t:coordinates = "geolon_t geolat_t" ;
	float geolon_c(yu_ocean, xu_ocean) ;
		geolon_c:long_name = "uv longitude" ;
		geolon_c:units = "degrees_E" ;
		geolon_c:valid_range = -281.f, 361.f ;
		geolon_c:missing_value = 1.e+20f ;
		geolon_c:_FillValue = 1.e+20f ;
		geolon_c:cell_methods = "time: point" ;
		geolon_c:coordinates = "geolon_c geolat_c" ;
	float geolat_c(yu_ocean, xu_ocean) ;
		geolat_c:long_name = "uv latitude" ;
		geolat_c:units = "degrees_N" ;
		geolat_c:valid_range = -91.f, 91.f ;
		geolat_c:missing_value = 1.e+20f ;
		geolat_c:_FillValue = 1.e+20f ;
		geolat_c:cell_methods = "time: point" ;
		geolat_c:coordinates = "geolon_c geolat_c" ;
	float temp(time, st_ocean, yt_ocean, xt_ocean) ;
		temp:long_name = "Potential temperature" ;
		temp:units = "degrees C" ;
		temp:valid_range = -10.f, 500.f ;
		temp:missing_value = -1.e+20f ;
		temp:_FillValue = -1.e+20f ;
		temp:cell_methods = "time: mean" ;
		temp:time_avg_info = "average_T1,average_T2,average_DT" ;
		temp:coordinates = "geolon_t geolat_t" ;
		temp:standard_name = "sea_water_potential_temperature" ;
	double time_bounds(time, nv) ;
		time_bounds:long_name = "time axis boundaries" ;
		time_bounds:units = "days" ;
		time_bounds:missing_value = 1.e+20 ;
		time_bounds:_FillValue = 1.e+20 ;
	float salt(time, st_ocean, yt_ocean, xt_ocean) ;
		salt:long_name = "Practical Salinity" ;
		salt:units = "psu" ;
		salt:valid_range = -10.f, 100.f ;
		salt:missing_value = -1.e+20f ;
		salt:_FillValue = -1.e+20f ;
		salt:cell_methods = "time: mean" ;
		salt:time_avg_info = "average_T1,average_T2,average_DT" ;
		salt:coordinates = "geolon_t geolat_t" ;
		salt:standard_name = "sea_water_salinity" ;
	float u(time, st_ocean, yu_ocean, xu_ocean) ;
		u:long_name = "i-current" ;
		u:units = "m/sec" ;
		u:valid_range = -10.f, 10.f ;
		u:missing_value = -1.e+20f ;
		u:_FillValue = -1.e+20f ;
		u:cell_methods = "time: mean" ;
		u:time_avg_info = "average_T1,average_T2,average_DT" ;
		u:coordinates = "geolon_c geolat_c" ;
		u:standard_name = "sea_water_x_velocity" ;
	float v(time, st_ocean, yu_ocean, xu_ocean) ;
		v:long_name = "j-current" ;
		v:units = "m/sec" ;
		v:valid_range = -10.f, 10.f ;
		v:missing_value = -1.e+20f ;
		v:_FillValue = -1.e+20f ;
		v:cell_methods = "time: mean" ;
		v:time_avg_info = "average_T1,average_T2,average_DT" ;
		v:coordinates = "geolon_c geolat_c" ;
		v:standard_name = "sea_water_y_velocity" ;
	float pot_rho_0(time, st_ocean, yt_ocean, xt_ocean) ;
		pot_rho_0:long_name = "potential density referenced to 0 dbar" ;
		pot_rho_0:units = "kg/m^3" ;
		pot_rho_0:valid_range = -10.f, 100000.f ;
		pot_rho_0:missing_value = -1.e+20f ;
		pot_rho_0:_FillValue = -1.e+20f ;
		pot_rho_0:cell_methods = "time: mean" ;
		pot_rho_0:time_avg_info = "average_T1,average_T2,average_DT" ;
		pot_rho_0:coordinates = "geolon_t geolat_t" ;
		pot_rho_0:standard_name = "sea_water_potential_density" ;
	float ty_trans(time, st_ocean, yu_ocean, xt_ocean) ;
		ty_trans:long_name = "T-cell j-mass transport" ;
		ty_trans:units = "Sv (10^9 kg/s)" ;
		ty_trans:valid_range = -1.e+20f, 1.e+20f ;
		ty_trans:missing_value = -1.e+20f ;
		ty_trans:_FillValue = -1.e+20f ;
		ty_trans:cell_methods = "time: mean" ;
		ty_trans:time_avg_info = "average_T1,average_T2,average_DT" ;
		ty_trans:coordinates = "geolon_t geolat_c" ;
		ty_trans:standard_name = "ocean_y_mass_transport" ;
	float eta_t(time, yt_ocean, xt_ocean) ;
		eta_t:long_name = "surface height on T cells [Boussinesq (volume conserving) model]" ;
		eta_t:units = "meter" ;
		eta_t:valid_range = -1000.f, 1000.f ;
		eta_t:missing_value = -1.e+20f ;
		eta_t:_FillValue = -1.e+20f ;
		eta_t:cell_methods = "time: mean" ;
		eta_t:time_avg_info = "average_T1,average_T2,average_DT" ;
		eta_t:coordinates = "geolon_t geolat_t" ;
	float eta_u(time, yu_ocean, xu_ocean) ;
		eta_u:long_name = "surface height on U cells" ;
		eta_u:units = "meter" ;
		eta_u:valid_range = -1000.f, 1000.f ;
		eta_u:missing_value = -1.e+20f ;
		eta_u:_FillValue = -1.e+20f ;
		eta_u:cell_methods = "time: mean" ;
		eta_u:time_avg_info = "average_T1,average_T2,average_DT" ;
		eta_u:coordinates = "geolon_c geolat_c" ;
	float frazil_2d(time, yt_ocean, xt_ocean) ;
		frazil_2d:long_name = "ocn frazil heat flux over time step" ;
		frazil_2d:units = "W/m^2" ;
		frazil_2d:valid_range = -1.e+10f, 1.e+10f ;
		frazil_2d:missing_value = -1.e+20f ;
		frazil_2d:_FillValue = -1.e+20f ;
		frazil_2d:cell_methods = "time: mean" ;
		frazil_2d:time_avg_info = "average_T1,average_T2,average_DT" ;
		frazil_2d:coordinates = "geolon_t geolat_t" ;
	float hblt(time, yt_ocean, xt_ocean) ;
		hblt:long_name = "T-cell boundary layer depth from KPP" ;
		hblt:units = "m" ;
		hblt:valid_range = -100000.f, 1000000.f ;
		hblt:missing_value = -1.e+20f ;
		hblt:_FillValue = -1.e+20f ;
		hblt:cell_methods = "time: mean" ;
		hblt:time_avg_info = "average_T1,average_T2,average_DT" ;
		hblt:coordinates = "geolon_t geolat_t" ;
		hblt:standard_name = "ocean_mixed_layer_thickness_defined_by_mixing_scheme" ;
	float mld(time, yt_ocean, xt_ocean) ;
		mld:long_name = "mixed layer depth determined by density criteria" ;
		mld:units = "m" ;
		mld:valid_range = 0.f, 1000000.f ;
		mld:missing_value = -1.e+20f ;
		mld:_FillValue = -1.e+20f ;
		mld:cell_methods = "time: mean" ;
		mld:time_avg_info = "average_T1,average_T2,average_DT" ;
		mld:coordinates = "geolon_t geolat_t" ;
		mld:standard_name = "ocean_mixed_layer_thickness_defined_by_sigma_t" ;
	float mld_dtheta(time, yt_ocean, xt_ocean) ;
		mld_dtheta:long_name = "mixed layer depth determined by temperature criteria" ;
		mld_dtheta:units = "m" ;
		mld_dtheta:valid_range = 0.f, 1000000.f ;
		mld_dtheta:missing_value = -1.e+20f ;
		mld_dtheta:_FillValue = -1.e+20f ;
		mld_dtheta:cell_methods = "time: mean" ;
		mld_dtheta:time_avg_info = "average_T1,average_T2,average_DT" ;
		mld_dtheta:coordinates = "geolon_t geolat_t" ;
	float net_sfc_heating(time, yt_ocean, xt_ocean) ;
		net_sfc_heating:long_name = "surface ocean heat flux coming through coupler and mass transfer" ;
		net_sfc_heating:units = "Watts/m^2" ;
		net_sfc_heating:valid_range = -10000.f, 10000.f ;
		net_sfc_heating:missing_value = -1.e+20f ;
		net_sfc_heating:_FillValue = -1.e+20f ;
		net_sfc_heating:cell_methods = "time: mean" ;
		net_sfc_heating:time_avg_info = "average_T1,average_T2,average_DT" ;
		net_sfc_heating:coordinates = "geolon_t geolat_t" ;
	float pme_river(time, yt_ocean, xt_ocean) ;
		pme_river:long_name = "mass flux of precip-evap+river via sbc (liquid, frozen, evaporation)" ;
		pme_river:units = "(kg/m^3)*(m/sec)" ;
		pme_river:valid_range = -1000000.f, 1000000.f ;
		pme_river:missing_value = -1.e+20f ;
		pme_river:_FillValue = -1.e+20f ;
		pme_river:cell_methods = "time: mean" ;
		pme_river:time_avg_info = "average_T1,average_T2,average_DT" ;
		pme_river:coordinates = "geolon_t geolat_t" ;
		pme_river:standard_name = "water_flux_into_sea_water" ;
	float river(time, yt_ocean, xt_ocean) ;
		river:long_name = "mass flux of river (runoff + calving) entering ocean" ;
		river:units = "(kg/m^3)*(m/sec)" ;
		river:valid_range = -1000000.f, 1000000.f ;
		river:missing_value = -1.e+20f ;
		river:_FillValue = -1.e+20f ;
		river:cell_methods = "time: mean" ;
		river:time_avg_info = "average_T1,average_T2,average_DT" ;
		river:coordinates = "geolon_t geolat_t" ;
	float salt_int_rhodz(time, yt_ocean, xt_ocean) ;
		salt_int_rhodz:long_name = "vertical sum of Practical Salinity * rho_dzt" ;
		salt_int_rhodz:units = "psu*(kg/m^3)*m" ;
		salt_int_rhodz:valid_range = -1.e+20f, 1.e+20f ;
		salt_int_rhodz:missing_value = -1.e+20f ;
		salt_int_rhodz:_FillValue = -1.e+20f ;
		salt_int_rhodz:cell_methods = "time: mean" ;
		salt_int_rhodz:time_avg_info = "average_T1,average_T2,average_DT" ;
		salt_int_rhodz:coordinates = "geolon_t geolat_t" ;
	float sea_level(time, yt_ocean, xt_ocean) ;
		sea_level:long_name = "effective sea level (eta_t + patm/(rho0*g)) on T cells" ;
		sea_level:units = "meter" ;
		sea_level:valid_range = -1000.f, 1000.f ;
		sea_level:missing_value = -1.e+20f ;
		sea_level:_FillValue = -1.e+20f ;
		sea_level:cell_methods = "time: mean" ;
		sea_level:time_avg_info = "average_T1,average_T2,average_DT" ;
		sea_level:coordinates = "geolon_t geolat_t" ;
		sea_level:standard_name = "sea_surface_height_above_geoid" ;
	float sea_levelsq(time, yt_ocean, xt_ocean) ;
		sea_levelsq:long_name = "square of effective sea level (eta_t + patm/(rho0*g)) on T cells" ;
		sea_levelsq:units = "m^2" ;
		sea_levelsq:valid_range = -1000.f, 1000.f ;
		sea_levelsq:missing_value = -1.e+20f ;
		sea_levelsq:_FillValue = -1.e+20f ;
		sea_levelsq:cell_methods = "time: mean" ;
		sea_levelsq:time_avg_info = "average_T1,average_T2,average_DT" ;
		sea_levelsq:coordinates = "geolon_t geolat_t" ;
		sea_levelsq:standard_name = "square_of_sea_surface_height_above_geoid" ;
	float sfc_hflux_coupler(time, yt_ocean, xt_ocean) ;
		sfc_hflux_coupler:long_name = "surface heat flux coming through coupler" ;
		sfc_hflux_coupler:units = "Watts/m^2" ;
		sfc_hflux_coupler:valid_range = -10000.f, 10000.f ;
		sfc_hflux_coupler:missing_value = -1.e+20f ;
		sfc_hflux_coupler:_FillValue = -1.e+20f ;
		sfc_hflux_coupler:cell_methods = "time: mean" ;
		sfc_hflux_coupler:time_avg_info = "average_T1,average_T2,average_DT" ;
		sfc_hflux_coupler:coordinates = "geolon_t geolat_t" ;
	double sw_ocean(sw_ocean) ;
		sw_ocean:long_name = "ucell zstar depth" ;
		sw_ocean:units = "meters" ;
		sw_ocean:cartesian_axis = "Z" ;
		sw_ocean:positive = "down" ;
		sw_ocean:edges = "sw_edges_ocean" ;
	float tau_x(time, yu_ocean, xu_ocean) ;
		tau_x:long_name = "i-directed wind stress forcing u-velocity" ;
		tau_x:units = "N/m^2" ;
		tau_x:valid_range = -10.f, 10.f ;
		tau_x:missing_value = -1.e+20f ;
		tau_x:_FillValue = -1.e+20f ;
		tau_x:cell_methods = "time: mean" ;
		tau_x:time_avg_info = "average_T1,average_T2,average_DT" ;
		tau_x:coordinates = "geolon_c geolat_c" ;
		tau_x:standard_name = "surface_downward_x_stress" ;
	float tau_y(time, yu_ocean, xu_ocean) ;
		tau_y:long_name = "j-directed wind stress forcing v-velocity" ;
		tau_y:units = "N/m^2" ;
		tau_y:valid_range = -10.f, 10.f ;
		tau_y:missing_value = -1.e+20f ;
		tau_y:_FillValue = -1.e+20f ;
		tau_y:cell_methods = "time: mean" ;
		tau_y:time_avg_info = "average_T1,average_T2,average_DT" ;
		tau_y:coordinates = "geolon_c geolat_c" ;
		tau_y:standard_name = "surface_downward_y_stress" ;
	float temp_int_rhodz(time, yt_ocean, xt_ocean) ;
		temp_int_rhodz:long_name = "vertical sum of Potential temperature * rho_dzt" ;
		temp_int_rhodz:units = "deg_C*(kg/m^3)*m" ;
		temp_int_rhodz:valid_range = -1.e+20f, 1.e+20f ;
		temp_int_rhodz:missing_value = -1.e+20f ;
		temp_int_rhodz:_FillValue = -1.e+20f ;
		temp_int_rhodz:cell_methods = "time: mean" ;
		temp_int_rhodz:time_avg_info = "average_T1,average_T2,average_DT" ;
		temp_int_rhodz:coordinates = "geolon_t geolat_t" ;
	float wt(time, sw_ocean, yt_ocean, xt_ocean) ;
		wt:long_name = "dia-surface velocity T-points" ;
		wt:units = "m/sec" ;
		wt:valid_range = -100000.f, 100000.f ;
		wt:missing_value = -1.e+20f ;
		wt:_FillValue = -1.e+20f ;
		wt:cell_methods = "time: mean" ;
		wt:time_avg_info = "average_T1,average_T2,average_DT" ;
		wt:coordinates = "geolon_t geolat_t" ;

// global attributes:
		:filename = "01980101.ocean.nc" ;
		:title = "CM2.6_miniBling" ;
		:grid_type = "mosaic" ;
		:grid_tile = "1" ;
		:history = "Tue Feb 25 16:32:17 2014: ncks --64bit --hdr_pad 15000 -A ocean.0198.ann2d.nc ocean.0198.ann.nc\n",
			"Tue Feb 25 16:23:35 2014: ncks --64bit --hdr_pad 15000 -A frazil_2d.nc ocean.0198.ann2d.nc\n",
			"Tue Feb 25 16:23:32 2014: ncra -O -v nv,time_bounds,geolat_c,geolat_t,geolon_c,geolon_t,st_edges_ocean,sw_edges_ocean,frazil_2d 01980101.ocean.nc frazil_2d.nc" ;
		:nco_openmp_thread_number = 1 ;
		:NCO = "4.1.0" ;
}

I am unsure if I am allowed to share them publicly, is there another way to diagnose what is going on?

Thanks a lot!

@shoyer
Member

shoyer commented Nov 10, 2017

Can you figure out which variable name it's erroring on? Try dropping into a debugger (e.g., %debug in IPython).

It is not immediately obvious to me what the issue is with this file, but I suspect this is probably a SciPy issue given that scipy.io.netcdf_file() cannot even open it. Access to the file would help because then I can use a debugger to see exactly what went wrong.
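
Since the error appears for some files but not others, one way to narrow it down is to try opening each file on its own and record which ones fail. The following is a minimal sketch, not part of xarray or SciPy: `find_failing_files` and its `opener` argument are hypothetical names introduced here. The commented usage assumes `scipy.io.netcdf_file` as the opener; passing `mmap=False` is only a diagnostic guess, based on the traceback ending in the memory-mapped branch.

```python
def find_failing_files(paths, opener):
    """Try to open every path with `opener`; return {path: error message}.

    `opener` is any callable that takes a path and either returns an
    object (optionally with a .close() method) or raises an exception.
    """
    failures = {}
    for path in paths:
        try:
            handle = opener(path)
        except Exception as exc:  # keep going so every bad file is reported
            failures[path] = f"{type(exc).__name__}: {exc}"
        else:
            close = getattr(handle, "close", None)
            if close is not None:
                close()
    return failures


# Possible usage with SciPy (assumes scipy is installed and the files exist):
#
#   import glob
#   import scipy.io
#   paths = sorted(glob.glob("ocean.*.ann.nc"))
#   bad = find_failing_files(
#       paths, lambda p: scipy.io.netcdf_file(p, mode="r", mmap=False)
#   )
#   print(bad)
```

Running it once with `mmap=True` and once with `mmap=False` would show whether the failure is specific to memory-mapped reads.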

@jbusecke
Contributor Author

I believe I solved the issue: it turns out the slow performance was caused by inconsistencies between the files (some contain additional data_vars and coords). Specifying drop_variables yields the expected performance.

I am not sure if that is a feasible option, but would it be possible to implement a check for such errors that displays a warning?

The error regarding engine='scipy' remains, but I believe this is due to the netCDF format changing between files.
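
Whether the on-disk format really differs between files can be checked without any netCDF library, because each variant starts with a distinct byte signature. Below is a minimal sketch under that assumption; `netcdf_format` and the labels it returns are names made up for this illustration, and it only inspects the first four bytes (an HDF5 signature placed at a later offset would be reported as "unknown").

```python
# Leading bytes that identify each netCDF variant on disk.
MAGIC = {
    b"CDF\x01": "netCDF3 classic",
    b"CDF\x02": "netCDF3 64-bit offset",
    b"CDF\x05": "netCDF3 64-bit data (CDF-5)",
    b"\x89HDF": "netCDF4 / HDF5",
}


def netcdf_format(path):
    """Return a label for the file format based on its first four bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    return MAGIC.get(head, "unknown")


# Possible usage (assumes the files exist in the working directory):
#
#   import glob
#   for p in sorted(glob.glob("ocean.*.ann.nc")):
#       print(p, netcdf_format(p))
```

Files reported as "netCDF4 / HDF5" cannot be read by `scipy.io.netcdf_file`, which only handles the netCDF3 variants.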

@jhamman
Member

jhamman commented Nov 21, 2017

@jbusecke -

... would it be possible to implement a check for such errors that displays a warning?

In your case, what would that have looked like? Many users have actually requested that xarray provide a more streamlined reader function that does less checking. Perhaps it would have been useful for you to have xarray raise an error when it encountered files that weren't consistent (in their variables, dimensions, or coordinates)?
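
As an illustration of what such a consistency check could look like on the user side, here is a minimal sketch. It is not an xarray feature: `find_inconsistencies` is a hypothetical helper, and the second file name and the variable lists in the example are invented. It compares the variable names of every file against those of the first file.

```python
def find_inconsistencies(variables_by_file):
    """Compare each file's variable names with those of the first file.

    `variables_by_file` maps a file name to an iterable of variable names.
    Returns {file name: {"extra": [...], "missing": [...]}} for every file
    that differs from the reference (the first file in sorted order).
    """
    files = sorted(variables_by_file)
    if not files:
        return {}
    reference = set(variables_by_file[files[0]])
    report = {}
    for name in files[1:]:
        current = set(variables_by_file[name])
        extra = sorted(current - reference)
        missing = sorted(reference - current)
        if extra or missing:
            report[name] = {"extra": extra, "missing": missing}
    return report


# Invented example data; with xarray the mapping could be built as
#   {p: list(xr.open_dataset(p, decode_times=False).variables) for p in paths}
example = {
    "ocean.0198.ann.nc": ["temp", "salt", "frazil_2d"],
    "ocean.0199.ann.nc": ["temp", "salt"],
}
report = find_inconsistencies(example)
print(report)
# {'ocean.0199.ann.nc': {'extra': [], 'missing': ['frazil_2d']}}
```

The names listed under "extra" are candidates for the `drop_variables` argument of `open_dataset`, which `open_mfdataset` forwards through its keyword arguments.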

@jbusecke
Contributor Author

jbusecke commented Nov 21, 2017 via email

@dcherian
Contributor

Closing in favour of #1823
