Problem while reading a self-made netcdf file with io.return_xarray_mfdataset #32

Open
simon3122 opened this issue Jun 3, 2016 · 6 comments

Comments

@simon3122

simon3122 commented Jun 3, 2016

Hello,

I am trying to use io.return_xarray_mfdataset to read a NetCDF file that I created myself.
The NetCDF header is:

group: netcdf4 {
  dimensions:
    y = 3454 ;
    x = 5422 ;
    t = 1 ;
  variables:
    float nav_lat(y, x) ;
      nav_lat:axis = "Y" ;
      nav_lat:standard_name = "latitude" ;
      nav_lat:long_name = "Latitude" ;
      nav_lat:units = "degrees_north" ;
      nav_lat:nav_model = "grid_T" ;
    float nav_lon(y, x) ;
      nav_lon:axis = "X" ;
      nav_lon:standard_name = "longitude" ;
      nav_lon:long_name = "Longitude" ;
      nav_lon:units = "degrees_east" ;
      nav_lon:nav_model = "grid_T" ;
    double time_centered(t) ;
      time_centered:standard_name = "time" ;
      time_centered:long_name = "Time axis" ;
      time_centered:title = "Time" ;
      time_centered:time_origin = "1958-01-01 00:00:00" ;
      time_centered:bounds = "time_centered_bounds" ;
      time_centered:units = "seconds since 1958-01-01" ;
      time_centered:calendar = "gregorian" ;
    double t(t) ;
      t:axis = "T" ;
      t:standard_name = "time" ;
      t:long_name = "Time axis" ;
      t:title = "Time" ;
      t:time_origin = "1958-01-01 00:00:00" ;
      t:bounds = "time_counter_bounds" ;
      t:units = "seconds since 1958-01-01" ;
      t:calendar = "gregorian" ;
    float sossheig(t, y, x) ;
      sossheig:_FillValue = 0.f ;
      sossheig:long_name = "sea surface height" ;
      sossheig:units = "m" ;
      sossheig:online_operation = "average" ;
      sossheig:interval_operation = "40s" ;
      sossheig:interval_write = "5d" ;
      sossheig:coordinates = "nav_lon nav_lat time_centered" ;
} // group netcdf4

My code is:
from oocgcm.core import io
chunks = (1727, 2711)
xr_chunks_tmean = {'y': chunks[-2], 'x': chunks[-1], 't':1}
vmean_xrt = io.return_xarray_mfdataset(filemean, chunks=xr_chunks_tmean)[vdict[vkey]['vname']][:]
I get the following error:
ValueError: some chunks keys are not dimensions on this object: ['y', 'x', 't']

@simon3122
Author

This problem occurs when reading a NetCDF4 file.
It does not occur when reading a NetCDF3 file written with these options:

ds.to_netcdf(filenam,
'w',
format='NETCDF3_64BIT',
engine='scipy',
encoding={ vkey2:{'dtype':'float32'}})

@lesommer
Owner

lesommer commented Jun 6, 2016

Hi Simon,
Could you please look into this in more detail and let me know which vkey/vname raises the exception?
Thanks.

@simon3122
Author

simon3122 commented Jun 7, 2016

I am working on NEMO data with the following variables:
vkey = 'sea level'
vkey2 = vdict[vkey]['vname']
The dictionary is defined as follows:
vdict['sea level']={'vname': 'sossheig', ...

In case it helps, here are the options I use to write the NetCDF4 file (the one that fails at the subsequent reading stage):
ds.to_netcdf(filenam,'w',format='NETCDF4', engine='netcdf4', encoding={ vkey2:{'_FillValue':0,'dtype':'float32'}})

@lesommer
Owner

lesommer commented Jun 7, 2016

I suspect this is related to the options used for reading NetCDF files in core.io.
NB: these options have changed in the trunk (see)

  1. Could you try opening your newly created dataset directly with xarray methods?
  2. Try without specifying a chunk size for the 't' dimension: {'y': chunks[-2], 'x': chunks[-1]}

@simon3122
Author

First, I noticed that I cannot create the xarray.Dataset at all: the error is raised immediately by the open_mfdataset function.

Here are my results:

  1. Replacing
    io.return_xarray_mfdataset(filemean,chunks=xr_chunks_tmean)
    with
    xr.open_mfdataset(filemean,chunks=xr_chunks_tmean,engine='netcdf4',lock=False)

    does not change the result

  2. Leaving out the 't' dimension gives this error:
    ValueError: some chunks keys are not dimensions on this object: ['y', 'x']

One more result: when specifying chunks=None in the xarray function, the dataset still cannot be read properly:
print xr.open_mfdataset(filemean,chunks=None,engine='netcdf4',lock=False)
gives

<xarray.Dataset>
Dimensions:  ()
Coordinates:
    *empty*
Data variables:
    *empty*
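
The empty dataset above is consistent with the ValueError: if the root of the file exposes no dimensions, every requested chunk key is unknown. A minimal pure-Python sketch of that kind of check follows. The function name unknown_chunk_keys is hypothetical and this is not xarray's actual implementation; the dimension sizes are taken from the header in the first comment.

```python
def unknown_chunk_keys(chunks, dims):
    """Return the chunk keys that are not dimension names, in insertion order."""
    return [key for key in chunks if key not in dims]


# Chunk specification used in the original call.
chunks = {"y": 1727, "x": 2711, "t": 1}

# Dimensions as listed inside the group in the ncdump header.
group_dims = {"y": 3454, "x": 5422, "t": 1}

# Dimensions of the empty dataset printed above (none at all).
root_dims = {}

print(unknown_chunk_keys(chunks, group_dims))
print(unknown_chunk_keys(chunks, root_dims))
```

With the group's dimensions nothing is reported, while with the empty root every key is reported, which matches the list of keys in the error message.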

@simon3122
Author

simon3122 commented Jun 9, 2016

The failing version was produced by writing the file with
ds.to_netcdf(filenam,'w','NETCDF4', 'netcdf4'...}})
which gives the following header

group: netcdf4 {
  dimensions:
    y = 3454 ;
    x = 5422 ;

In fact, it works when writing the file
ds.to_netcdf(filenam,'w',format='NETCDF4', engine='netcdf4'...}})
which gives the following header

dimensions:
    y = 3454 ;
    x = 5422 ;

The first version seems to create an inner group named netcdf4, which might not be a standard NetCDF4 file.
I understand that the xarray reading function may not read the first version (with an inner group) properly if the 'group' option is not given.
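
A likely explanation for the inner group is argument order: as far as I know, the parameters of Dataset.to_netcdf are ordered path, mode, format, group, engine, encoding, so a fourth positional value lands in group rather than engine. The sketch below illustrates this with a hypothetical stand-in function that only mirrors that assumed parameter order; it does not call xarray and writes no file, and "out.nc" is a placeholder path.

```python
import inspect


# Hypothetical stand-in that mirrors the assumed parameter order of
# xarray's Dataset.to_netcdf. It does nothing; only its signature matters.
def to_netcdf(path=None, mode="w", format=None, group=None, engine=None, encoding=None):
    return None


def bound_arguments(*args, **kwargs):
    """Return a dict of parameter name -> value for the given call arguments."""
    return dict(inspect.signature(to_netcdf).bind(*args, **kwargs).arguments)


# Call style of the failing version: everything positional.
positional = bound_arguments("out.nc", "w", "NETCDF4", "netcdf4")

# Call style of the working version: format and engine passed by keyword.
keyword = bound_arguments("out.nc", "w", format="NETCDF4", engine="netcdf4")

print(positional)
print(keyword)
```

In the positional call 'netcdf4' is bound to group, which would match the "group: netcdf4" line in the header. If that is what happened, passing group='netcdf4' to xr.open_dataset should read the existing file, and writing with keyword arguments avoids the problem.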
