Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some problems with the netcdf-cf mapper #412

Closed
5 tasks done
mortenwh opened this issue Jun 28, 2019 · 3 comments
Closed
5 tasks done

Some problems with the netcdf-cf mapper #412

mortenwh opened this issue Jun 28, 2019 · 3 comments

Comments

@mortenwh
Copy link
Contributor

mortenwh commented Jun 28, 2019

Demo file: https://thredds.met.no/thredds/fileServer/aromearcticarchive/2019/06/15/arome_arctic_vtk_20190615T15Z.nc (download with, e.g., wget)

  • The projection based on a subdataset is slightly different from the one based on the projection variable (projection_lambert) provided in the netcdf file. It is probably best to use the projection variable if it is present.

  • During work with the above file, I discovered that the selection of bands based on the netcdf_dim kwarg is wrong. The time selection works, but it is not possible to filter on the pressure dimension (ref above file).

  • Nansat fails to open a downloaded S1A file (wget http://nbstds.met.no/thredds/fileServer/NBS/S1A/2019/06/15/IW/S1A_IW_GRDM_1SDV_20190615T151935_20190615T152004_027692_03202F_6E58.nc). Also, the scatterometer mapper is selected by Nansat, although it should be the netcdf-cf mapper. We may need to make a specific S1 netcdf-cf mapper.

  • If I reproject the arome dataset to an S1 dataset (http://nbstds.met.no/thredds/dodsC/NBS/S1A/2019/06/15/IW/S1A_IW_GRDM_1SDV_20190615T151935_20190615T152004_027692_03202F_6E58.nc) available from opendap), the bands cannot be read. Changing the projection as suggested above (point 1) makes it readable, but all the data is NaN. It works fine if I use the opendap version of the arome dataset (https://thredds.met.no/thredds/dodsC/aromearcticarchive/2019/06/15/arome_arctic_vtk_20190615T15Z.nc).

  • Also check S2 netcdf datasets

@mortenwh mortenwh changed the title Mapper for netcdf-cf does not correctly choose bands from netcdf_dim kwarg Some problems with the netcdf-cf mapper Jun 28, 2019
mortenwh added a commit that referenced this issue Jul 2, 2019
mortenwh added a commit that referenced this issue Jul 2, 2019
…and added tests. It seems to work correctly now.
mortenwh added a commit that referenced this issue Jul 2, 2019
mortenwh added a commit that referenced this issue Jul 2, 2019
mortenwh added a commit that referenced this issue Jul 2, 2019
…rs package. It should only import those that has 'mapper' in the module name. Changed code and added a test.
mortenwh added a commit that referenced this issue Jul 2, 2019
…ded lists ALLOWED_SPATIAL_DIMENSIONS_X and ALLOWED_SPATIAL_DIMENSIONS_Y that will be checked when selecting band numbers from the netcdf_dim dict.
mortenwh added a commit that referenced this issue Jul 2, 2019
…py) for generic handling of both netcdf and opendap datasets (see #391). Did not write tests but tried the code on the dataset linked in the issue description. Nansat can now read s1 netcdf files from the Norwegian ground segment but reading some of the bands is *very* slow. This is most likely an issue with the way the files are created. Have e-mailed met.no about it.
mortenwh added a commit that referenced this issue Jul 2, 2019
mortenwh added a commit that referenced this issue Jul 2, 2019
@mortenwh
Copy link
Contributor Author

mortenwh commented Jul 3, 2019

Regarding reading the netcdf's from satellittdata.no:

OPeNDAP (http://nbstds.met.no/thredds/dodsC/NBS/S1A/2019/06/15/IW/S1A_IW_GRDM_1SDV_20190615T151935_20190615T152004_027692_03202F_6E58.nc):

unavailable at the moment of writing the comment - will edit later...

Downloaded NetCDF:

In [1]: import gdal                                                                                                              

In [2]: import time                                                                                                              

In [3]: gds = gdal.Open('NETCDF:"S1A_IW_GRDM_1SDV_20190615T151935_20190615T152004_027692_03202F_6E58.nc":Amplitude_VV')          

In [4]: start, amp0, end = time.time(), gds.ReadAsArray(), time.time()                                                           

In [5]: end-start # seconds                                                                                                      
Out[5]: 598.0722901821136

In [6]: (end-start)/60 # minutes                                                                                                 
Out[6]: 9.967871503035228

In [7]: from netCDF4 import Dataset                                                                                              

In [8]: ds = Dataset('S1A_IW_GRDM_1SDV_20190615T151935_20190615T152004_027692_03202F_6E58.nc')                                   

In [9]: start, amp, end = time.time(), ds.variables['Amplitude_VV'][:].data, time.time()                                         

In [10]: end-start # seconds                                                                                                     
Out[10]: 0.4655623435974121

10 minutes with gdal vs 0.5 seconds with netCDF4!

Then, transform the netcdf file:

gdal_translate -of netcdf -co "WRITE_BOTTOMUP=NO" -sds S1A_IW_GRDM_1SDV_20190615T151935_20190615T152004_027692_03202F_6E58.nc out.nc
nccopy -7 -c x/3,y/3 out_04.nc out_04_chunked.nc

This shortens the gdal reading time (but now there is only one band in out*.nc):

In [7]: gds = gdal.Open('out_04_chunked.nc')                                                                                     

In [8]: start, amp, end = time.time(), gds.ReadAsArray(), time.time()                                                            

In [9]: end-start # seconds                                                                                                      
Out[9]: 3.566657304763794

In [10]:                                                                                                                         

In [10]: gds = gdal.Open('out_04.nc')                                                                                            

In [11]: start, amp, end = time.time(), gds.ReadAsArray(), time.time()                                                           

In [12]: end-start # seconds                                                                                                     
Out[12]: 2.888197183609009

In [13]: 

ln [14]: ds = Dataset('out_04.nc')                                                                                               

In [15]: start, amp, end = time.time(), ds.variables['Amplitude_VV'][:].data, time.time()                                        

In [16]: end-start # seconds                                                                                                     
Out[16]: 1.4296467304229736

netCDF4 takes more time to read the new file compared to the original but is still much faster than gdal...

@mortenwh
Copy link
Contributor Author

mortenwh commented Jul 3, 2019

Reprojection of the Arome dataset now works with the new updates using the downloaded netCDF files:

In [1]: from nansat.nansat import Nansat                                                                                         

In [2]: import numpy as np                                                                                                       

In [3]: a0fn='arome_arctic_vtk_20190615T15Z.nc'                                                                                  

In [4]: fn = 'S1A_IW_GRDM_1SDV_20190615T151935_20190615T152004_027692_03202F_6E58.nc'                                            

In [5]: a0n = Nansat(a0fn, bands=['x_wind', 'y_wind'], netcdf_dim={'time': np.datetime64('2019-06-15T15:00')})                   

In [6]: n = Nansat(fn)                                                                                                           
=>Sentinel-1 GRD data<=
/home/vagrant/Miniconda3-4.6.14-Linux-x86_64/envs/py3openwind/lib/python3.6/site-packages/nansat/mappers/mapper_netcdf_cf.py:365: UserWarning: GDAL cannot determine the dataset projection - using Nansat spatial reference WKT, assuming a regular longitude/latitude grid
  warnings.warn('GDAL cannot determine the dataset projection - using Nansat ' \
/home/vagrant/Miniconda3-4.6.14-Linux-x86_64/envs/py3openwind/lib/python3.6/site-packages/nansat/mappers/mapper_netcdf_cf.py:304: UserWarning: KeyError: 'NETCDF_DIM_time' - Mapping key not found. Continuing without time metadata for band lat
  band_metadata['NETCDF_VARNAME']))
/home/vagrant/Miniconda3-4.6.14-Linux-x86_64/envs/py3openwind/lib/python3.6/site-packages/nansat/mappers/mapper_netcdf_cf.py:304: UserWarning: KeyError: 'NETCDF_DIM_time' - Mapping key not found. Continuing without time metadata for band lon
  band_metadata['NETCDF_VARNAME']))
/home/vagrant/Miniconda3-4.6.14-Linux-x86_64/envs/py3openwind/lib/python3.6/site-packages/nansat/mappers/mapper_netcdf_cf.py:304: UserWarning: KeyError: 'NETCDF_DIM_time' - Mapping key not found. Continuing without time metadata for band swathList
  band_metadata['NETCDF_VARNAME']))

In [7]: a0n.reproject(n)                                                                                                         

In [8]: a0n['x_wind_10m']                                                                                                        
Out[8]: 
array([[ 9.09290028,  9.09290028,  9.09290028, ...,  9.08801746,
         9.08801746,  9.08801746],
       [ 9.09290028,  9.09290028,  9.09290028, ...,  9.08801746,
         9.08801746,  9.08801746],
       [ 9.09290028,  9.09290028,  9.09290028, ...,  9.08801746,
         9.08801746,  9.08801746],
       ..., 
       [ 5.96741199,  5.96741199,  5.96741199, ...,  8.51916981,
         8.51916981,  8.51916981],
       [ 5.96741199,  5.96741199,  5.96741199, ...,  8.51916981,
         8.51916981,  8.51916981],
       [ 5.96741199,  5.96741199,  5.96741199, ...,  8.51916981,
         8.51916981,  8.51916981]], dtype=float32)

In [9]: n['Amplitude_VV']
Out[9]: 
array([[54, 49, 50, ...,  0,  0,  0],
       [67, 60, 58, ...,  0,  0,  0],
       [75, 68, 67, ...,  0,  0,  0],
       ..., 
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0]], dtype=uint16)

mortenwh added a commit that referenced this issue Jul 3, 2019
…o. If a dimension was not added as a variable in the netcdf file, the function _get_band_from_subfile failed. Made test to reproduce the issue and fixed the bug. GDAL complains with GDAL_ERROR 1: b'No 1D variable is indexed by dimension dimension_rgb' but it still works. Also removed a warning issued in a try/except clause.
@mortenwh
Copy link
Contributor Author

mortenwh commented Jul 3, 2019

S2 file: wget http://nbstds.met.no/thredds/fileServer/NBS/S2A/2019/07/01/S2A_MSIL1C_20190701T160911_N0207_R140_T29XNK_20190701T180203.nc

mortenwh added a commit that referenced this issue Jul 3, 2019
…his depends on an external resource and will fails if something is changed there. Added a test for NSR just to increase the coverage.
mortenwh added a commit that referenced this issue Jul 3, 2019
mortenwh added a commit that referenced this issue Jul 3, 2019
…and added tests. It seems to work correctly now.
mortenwh added a commit that referenced this issue Jul 3, 2019
mortenwh added a commit that referenced this issue Jul 3, 2019
mortenwh added a commit that referenced this issue Jul 3, 2019
…rs package. It should only import those that has 'mapper' in the module name. Changed code and added a test.
mortenwh added a commit that referenced this issue Jul 3, 2019
…ded lists ALLOWED_SPATIAL_DIMENSIONS_X and ALLOWED_SPATIAL_DIMENSIONS_Y that will be checked when selecting band numbers from the netcdf_dim dict.
mortenwh added a commit that referenced this issue Jul 3, 2019
…py) for generic handling of both netcdf and opendap datasets (see #391). Did not write tests but tried the code on the dataset linked in the issue description. Nansat can now read s1 netcdf files from the Norwegian ground segment but reading some of the bands is *very* slow. This is most likely an issue with the way the files are created. Have e-mailed met.no about it.
mortenwh added a commit that referenced this issue Jul 3, 2019
mortenwh added a commit that referenced this issue Jul 3, 2019
mortenwh added a commit that referenced this issue Jul 3, 2019
…o. If a dimension was not added as a variable in the netcdf file, the function _get_band_from_subfile failed. Made test to reproduce the issue and fixed the bug. GDAL complains with GDAL_ERROR 1: b'No 1D variable is indexed by dimension dimension_rgb' but it still works. Also removed a warning issued in a try/except clause.
mortenwh added a commit that referenced this issue Jul 3, 2019
…his depends on an external resource and will fails if something is changed there. Added a test for NSR just to increase the coverage.
@mortenwh mortenwh closed this as completed Jul 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant