Add bitgroom support to NCZarr
re: PR Unidata#2088

Primary changes:
* Add NCZarr-specific quantize function to the dispatch table.
* Copy quantize code from libhdf5
* Add quantize invocation to zvar.c
* Add support for _QuantizeBitgroomNumberOfSignificantDigits to ncgen.
* Copy quantize test from nc_test4 to nczarr_tests. Remove some parts that are not relevant to NCZarr.

Other Changes:
* Break zsync.c into zsync.c (writing) and zload.c (reading).
* Clean up the fill value handling (many changes)
* Disable atexit() under Windows
* Move ncjson to libdispatch
* Add documentation of differences between netcdf-4 and NCZarr, especially WRT fill value.
* Some mingw fixes
* Remove some cruft
* Clean up the handling of scalars
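
For readers unfamiliar with the BitGroom quantization named in the primary changes above: the general idea is to alternately clear and set the trailing mantissa bits of successive floating-point values, so that the data compress better while roughly the requested number of significant digits is kept. The following is a minimal sketch of that idea for 32-bit floats. It is not the quantize code copied from libhdf5; the names `groom_bits_f32` and `bitgroom_f32`, the bits-per-digit rule, and the extra guard bit are assumptions made for illustration, and fill-value handling is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define F32_MANTISSA_BITS 23u /* explicit mantissa bits in an IEEE 754 float */

/* Number of trailing mantissa bits to alter for `nsd` significant decimal
 * digits. Assumption: about 3.322 bits per digit, rounded up, plus one
 * guard bit. Hypothetical helper, not a netCDF function. */
unsigned groom_bits_f32(int nsd)
{
    assert(nsd >= 1 && nsd <= 7);
    unsigned keep = ((unsigned)nsd * 3322u + 999u) / 1000u + 1u;
    return keep >= F32_MANTISSA_BITS ? 0u : F32_MANTISSA_BITS - keep;
}

/* Quantize `n` floats in place. Hypothetical helper, not a netCDF function. */
void bitgroom_f32(float *values, size_t n, int nsd)
{
    assert(values != NULL || n == 0);
    unsigned nbits = groom_bits_f32(nsd);
    if (nbits == 0u)
        return; /* no bits can be discarded at this precision */
    uint32_t ones = (UINT32_C(1) << nbits) - 1u; /* low nbits set */
    for (size_t i = 0; i < n; i++) {
        uint32_t bits;
        memcpy(&bits, &values[i], sizeof bits);
        if ((bits & UINT32_C(0x7F800000)) == UINT32_C(0x7F800000))
            continue; /* leave NaN and infinity untouched */
        if (i % 2 == 0)
            bits &= ~ones; /* even index: shave trailing bits to zero */
        else if ((bits & UINT32_C(0x7FFFFFFF)) != 0u)
            bits |= ones; /* odd index: set trailing bits, except for zero */
        memcpy(&values[i], &bits, sizeof bits);
    }
}
```

Calling `bitgroom_f32(data, n, 3)` on an array would, under these assumptions, alter the lowest 12 mantissa bits of each finite, non-zero element. Alternating between shaving and setting is intended to keep the mean of the data from drifting in one direction.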
DennisHeimbigner committed Nov 3, 2021
1 parent 0e205f9 commit c5ddd15
Showing 37 changed files with 4,441 additions and 3,010 deletions.
7 changes: 6 additions & 1 deletion CMakeLists.txt
@@ -1715,10 +1715,15 @@ CHECK_FUNCTION_EXISTS(atexit HAVE_ATEXIT)

# Control invoking nc_finalize at exit
OPTION(ENABLE_ATEXIT_FINALIZE "Invoke nc_finalize at exit." ON)
IF(ENABLE_ATEXIT_FINALIZE)
IF(NOT HAVE_ATEXIT)
IF(ENABLE_ATEXIT_FINALIZE AND NOT HAVE_ATEXIT)
SET(ENABLE_ATEXIT_FINALIZE OFF CACHE BOOL "Enable ATEXIT" FORCE)
MESSAGE(WARNING "ENABLE_ATEXIT_FINALIZE set but atexit() function not defined")
ELSE()
IF(MSVC)
SET(ENABLE_ATEXIT_FINALIZE OFF CACHE BOOL "Enable ATEXIT" FORCE)
MESSAGE(WARNING "ENABLE_ATEXIT_FINALIZE not supported under Windows")
ENDIF()
ENDIF()
ENDIF()

2 changes: 2 additions & 0 deletions RELEASE_NOTES.md
@@ -7,10 +7,12 @@ This file contains a high-level description of this package's evolution. Release

## 4.8.2 - TBD

* [Enhancement] Add bitgroom support to NCZarr. See [Github #2???](https://github.com/Unidata/netcdf-c/pull/2???)
* [Enhancement] Support byte-range reading of netcdf-3 files stored in private buckets in S3. See [Github #2134](https://github.com/Unidata/netcdf-c/pull/2134)
* [Enhancement] Support Amazon S3 access for NCZarr. Also support use of the existing Amazon SDK credentials system. See [Github #2114](https://github.com/Unidata/netcdf-c/pull/2114)
* [Bug Fix] Fix string allocation error in H5FDhttp.c. See [Github #2127](https://github.com/Unidata/netcdf-c/pull/2127).
* [Bug Fix] Apply patches for ezxml and for selected oss-fuzz detected errors. See [Github #2125](https://github.com/Unidata/netcdf-c/pull/2125).
* [Enhancement] Support Amazon S3 access for NCZarr. Also support use of the existing Amazon SDK credentials system. See [Github #2114](https://github.com/Unidata/netcdf-c/pull/2114)
* [Bug Fix] Ensure that internal Fortran APIs are always defined. See [Github #2098](https://github.com/Unidata/netcdf-c/pull/2098).
* [Enhancement] Support filters for NCZarr. See [Github #2101](https://github.com/Unidata/netcdf-c/pull/2101)
* [Bug Fix] Make PR 2075 long file name be idempotent. See [Github #2094](https://github.com/Unidata/netcdf-c/pull/2094).
23 changes: 22 additions & 1 deletion docs/nczarr.md
@@ -293,6 +293,28 @@ Examples of currently unsupported types are as follows:

Again, this list should diminish over time.

# NCZarr versus netCDF-4 {#nczarr_netcdf4}

If ncgen is used to create both a netCDF-4 file and an NCZarr store using
the same .cdl file, then some differences may be observed.

## _FillValue
The Zarr format stores the fill value in the fill_value key of the .zarray metadata,
while netCDF-4 stores it in the _FillValue attribute. The .zattrs for that
array may also contain a _FillValue attribute, so in NCZarr the
fill value may occur in two places.

The rule is that if nc_def_var_fill was called or the .cdl file defines the _FillValue attribute,
then that attribute will appear in the .zattrs metadata; otherwise it will not.
However, if the fill_value key is defined, then it is used in place of the _FillValue attribute.

If a Zarr store created by some other Zarr implementation is read, then
the fill_value key may be set, but there will probably not be any _FillValue attribute.
As above, the value of the fill_value key is then used.

The net result is that NCZarr stores will carry the fill value and use it in subsequent
reads and writes.
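
The precedence described above can be sketched in C as follows. This is a minimal illustration and not NCZarr code: the `FillSources` struct and the `choose_fill` function are hypothetical, the fill value is simplified to a `double`, and falling back to a per-type default when neither source is present is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the two places a fill value may occur. */
typedef struct {
    bool   has_zarray_fill;   /* fill_value key present in .zarray */
    double zarray_fill;
    bool   has_fillvalue_att; /* _FillValue attribute present */
    double fillvalue_att;
} FillSources;

/* Pick the fill value to use when reading an array. */
double choose_fill(const FillSources *src, double type_default)
{
    assert(src != NULL);
    if (src->has_zarray_fill)
        return src->zarray_fill;   /* the fill_value key takes priority */
    if (src->has_fillvalue_att)
        return src->fillvalue_att; /* otherwise use the attribute */
    return type_default;           /* assumption: default for the type */
}
```

For a store written by another Zarr implementation, only `has_zarray_fill` would typically be true, so the first branch is the one taken.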

# Notes on Debugging NCZarr Access {#nczarr_debug}

The NCZarr support has a trace facility.
@@ -320,7 +342,6 @@ aws_secret_access_key=YYYY...
```
See Appendix E for additional information.


## Addressing Style

The notion of "addressing style" may need some expansion.
10 changes: 6 additions & 4 deletions libdispatch/dpathmgr.c
@@ -678,7 +678,7 @@ parsepath(const char* inpath, struct Path* path)
&& (tmp1[0] == '/')
&& strchr(windrive,tmp1[1]) != NULL
&& (tmp1[2] == '/' || tmp1[2] == '\0')) {
/* Assume this is a mingw path */
/* Assume this is a msys path */
path->drive = tmp1[1];
/* Remainder */
if(tmp1[2] == '\0')
@@ -869,11 +869,13 @@ static int
getlocalpathkind(void)
{
int kind = NCPD_UNKNOWN;
#ifdef __CYGWIN__
#if defined __CYGWIN__
kind = NCPD_CYGWIN;
#elif __MSYS__
#elif defined __MINGW32__
kind = NCPD_WIN; /* Do not understand the relationship of MSYS to MINGW */
#elif defined __MSYS__
kind = NCPD_MSYS;
#elif _MSC_VER /* not _WIN32 */
#elif defined _MSC_VER /* not _WIN32 */
kind = NCPD_WIN;
#else
kind = NCPD_NIX;
1 change: 1 addition & 0 deletions libnczarr/CMakeLists.txt
@@ -28,6 +28,7 @@ zodom.c
zopen.c
zprov.c
zsync.c
zload.c
ztype.c
zutil.c
zvar.c
1 change: 1 addition & 0 deletions libnczarr/Makefile.am
@@ -46,6 +46,7 @@ zodom.c \
zopen.c \
zprov.c \
zsync.c \
zload.c \
ztype.c \
zutil.c \
zvar.c \
1 change: 1 addition & 0 deletions libnczarr/zarr.h
@@ -72,6 +72,7 @@ EXTERNL int NCZ_create_fill_chunk(size64_t chunksize, size_t typesize, const voi
EXTERNL int NCZ_s3clear(NCS3INFO* s3map);
EXTERNL int NCZ_ischunkname(const char* name,char dimsep);
EXTERNL char* NCZ_chunkpath(struct ChunkKey key);
EXTERNL int ncz_rebuild_fill_chunk(NC_VAR_INFO_T* var);

/* zwalk.c */
EXTERNL int NCZ_read_chunk(int ncid, int varid, size64_t* zindices, void* chunkdata);
6 changes: 4 additions & 2 deletions libnczarr/zattr.c
@@ -997,9 +997,10 @@ ncz_del_attr(NC_FILE_INFO_T* file, NC_OBJ* container, const char* name)
}
#endif

/* If we do not have a _FillValue, then go ahead and create it */
#if 0
/* If we do not have a _FillValue attribute, then go ahead and create it */
int
ncz_create_fillvalue(NC_VAR_INFO_T* var)
ncz_create_fillvalue_att(NC_VAR_INFO_T* var)
{
int stat = NC_NOERR;
int i;
@@ -1025,6 +1026,7 @@ ncz_create_fillvalue(NC_VAR_INFO_T* var)
done:
return THROW(stat);
}
#endif

/* Create an attribute; This is an abbreviated form
of ncz_put_att above */
4 changes: 2 additions & 2 deletions libnczarr/zdispatch.c
@@ -104,8 +104,8 @@ static const NC_Dispatch NCZ_dispatcher = {
NC4_get_var_chunk_cache,
NCZ_inq_var_filter_ids,
NCZ_inq_var_filter_info,
NC_NOTNC4_def_var_quantize,
NC_NOTNC4_inq_var_quantize,
NCZ_def_var_quantize,
NCZ_inq_var_quantize,
};

const NC_Dispatch* NCZ_dispatch_table = NULL; /* moved here from ddispatch.c */
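
The hunk above swaps the "not supported" stubs in the last two dispatch table slots for NCZarr-specific functions. As a rough illustration of why that is enough to enable the feature, here is a minimal sketch of a function-pointer dispatch table. Every name in it (`DemoDispatch`, the `demo_*` functions, the `DEMO_*` codes) is hypothetical and far simpler than the real `NC_Dispatch` structure.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NOERR          0
#define DEMO_ENOTSUPPORTED (-1) /* hypothetical error code */

/* Hypothetical two-slot dispatch table. */
typedef struct {
    int (*def_var_quantize)(int ncid, int varid, int mode, int nsd);
    int (*inq_var_quantize)(int ncid, int varid, int *modep, int *nsdp);
} DemoDispatch;

/* Stubs for a format that does not support quantization. */
int demo_stub_def(int ncid, int varid, int mode, int nsd)
{
    (void)ncid; (void)varid; (void)mode; (void)nsd;
    return DEMO_ENOTSUPPORTED;
}
int demo_stub_inq(int ncid, int varid, int *modep, int *nsdp)
{
    (void)ncid; (void)varid; (void)modep; (void)nsdp;
    return DEMO_ENOTSUPPORTED;
}

/* Format-specific versions; they remember a single setting for brevity. */
static int demo_saved_mode = 0, demo_saved_nsd = 0;
int demo_zarr_def(int ncid, int varid, int mode, int nsd)
{
    (void)ncid; (void)varid;
    demo_saved_mode = mode;
    demo_saved_nsd = nsd;
    return DEMO_NOERR;
}
int demo_zarr_inq(int ncid, int varid, int *modep, int *nsdp)
{
    (void)ncid; (void)varid;
    if (modep) *modep = demo_saved_mode;
    if (nsdp) *nsdp = demo_saved_nsd;
    return DEMO_NOERR;
}

const DemoDispatch demo_before = { demo_stub_def, demo_stub_inq };
const DemoDispatch demo_after  = { demo_zarr_def, demo_zarr_inq };

/* Generic entry point: forwards to whatever the table holds. */
int demo_def_var_quantize(const DemoDispatch *d, int ncid, int varid,
                          int mode, int nsd)
{
    assert(d != NULL && d->def_var_quantize != NULL);
    return d->def_var_quantize(ncid, varid, mode, nsd);
}
```

With `demo_before` the generic call returns the error code; with `demo_after` the same call succeeds, without any change to the caller.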
5 changes: 2 additions & 3 deletions libnczarr/zdispatch.h
@@ -173,9 +173,8 @@ EXTERNL int NCZ_def_var_filter(int ncid, int varid, unsigned int filterid, size_
EXTERNL int NCZ_inq_var_filter_ids(int ncid, int varid, size_t* nfiltersp, unsigned int *filterids);
EXTERNL int NCZ_inq_var_filter_info(int ncid, int varid, unsigned int filterid, size_t* nparamsp, unsigned int *params);

EXTERNL int NCZ_def_var_filterx(int ncid, int varid, const char* text);
EXTERNL int NCZ_inq_var_filterx_ids(int ncid, int varid, char** textp);
EXTERNL int NCZ_inq_var_filterx_info(int ncid, int varid, const char* id, char** textp);
EXTERNL int NCZ_def_var_quantize(int ncid, int varid, int quantize_mode, int nsd);
EXTERNL int NCZ_inq_var_quantize(int ncid, int varid, int *quantize_modep, int *nsdp);

#if defined(__cplusplus)
}
6 changes: 4 additions & 2 deletions libnczarr/zfile.c
@@ -115,11 +115,13 @@ NCZ_enddef(int ncid)
var = (NC_VAR_INFO_T *)ncindexith(g->vars, j);
assert(var);
/* set the fill value and _FillValue attribute */
if((stat = ncz_get_fill_value(h5,var,NULL))) goto done; /* ensure var->fill_value is set */
if((stat = ncz_ensure_fill_value(var))) goto done; /* ensure var->fill_value is set */
assert(var->fill_value != NULL);
var->written_to = NC_TRUE; /* mark it written */
/* rebuild the fill chunk */
/* ensure cache is correct */
if((stat = NCZ_adjust_var_cache(var))) goto done;
/* rebuild the fill chunk */
if((stat = ncz_rebuild_fill_chunk(var))) goto done;
/* Build the filter working parameters for any filters */
if((stat = NCZ_filter_setup(var))) goto done;
}
21 changes: 5 additions & 16 deletions libnczarr/zinternal.c
@@ -636,19 +636,19 @@ ncz_find_grp_var_att(int ncid, int varid, const char *name, int attnum,
* @internal What fill value should be used for a variable?
* Side effects: set as default if necessary and build _FillValue attribute.
*
* @param h5 Pointer to file info struct.
* @param var Pointer to variable info struct.
* @param fillp Pointer that gets pointer to fill value.
* @param fillp Pointer that gets pointer to fill value; do not free
*
* @returns NC_NOERR No error.
* @returns NC_ENOMEM Out of memory.
* @author Ed Hartnett, Dennis Heimbigner
*/
int
ncz_get_fill_value(NC_FILE_INFO_T *h5, NC_VAR_INFO_T *var, void **fillp)
ncz_ensure_fill_value(NC_VAR_INFO_T *var)
{
size_t size;
int retval = NC_NOERR;
NC_FILE_INFO_T* h5 = NULL;

#if 0 /*LOOK*/
/* Find out how much space we need for this type's fill value. */
@@ -659,7 +659,8 @@ ncz_get_fill_value(NC_FILE_INFO_T *h5, NC_VAR_INFO_T *var, void **fillp)
else
#endif
{
if ((retval = nc4_get_typelen_mem(h5, var->type_info->hdr.id, &size))) goto done;
h5 = var->container->nc4_info;
if ((retval = nc4_get_typelen_mem(h5, var->type_info->hdr.id, &size))) goto done;
}
assert(size);

@@ -695,7 +696,6 @@ ncz_get_fill_value(NC_FILE_INFO_T *h5, NC_VAR_INFO_T *var, void **fillp)
fv_vlen->len = in_vlen->len;
if (!(fv_vlen->p = malloc(basetypesize * in_vlen->len)))
{
free(*fillp);
*fillp = NULL;
return NC_ENOMEM;
}
@@ -712,17 +712,6 @@ ncz_get_fill_value(NC_FILE_INFO_T *h5, NC_VAR_INFO_T *var, void **fillp)
}
}
#endif /*0*/
/* Create _FillValue Attribute */
if((retval = ncz_create_fillvalue(var))) goto done;
if(fillp) {
void* fill = NULL;
/* Allocate the return space. */
if((fill = calloc(1, size))==NULL)
{retval = NC_ENOMEM; goto done;}
memcpy(fill, var->fill_value, size);
*fillp = fill;
fill = NULL;
}

done:
return retval;
5 changes: 3 additions & 2 deletions libnczarr/zinternal.h
@@ -225,7 +225,7 @@ int NCZ_initialize(void);
int NCZ_finalize(void);
int NCZ_initialize_internal(void);
int NCZ_finalize_internal(void);
int ncz_get_fill_value(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, void **fillp);
int ncz_ensure_fill_value(NC_VAR_INFO_T* var);
int ncz_find_grp_var_att(int ncid, int varid, const char *name, int attnum,
int use_name, char *norm_name, NC_FILE_INFO_T** file,
NC_GRP_INFO_T** grp, NC_VAR_INFO_T** var,
@@ -245,12 +245,13 @@ int ncz_close_ncz_file(NC_FILE_INFO_T* file, int abort);

/* zattr.c */
int ncz_getattlist(NC_GRP_INFO_T *grp, int varid, NC_VAR_INFO_T **varp, NCindex **attlist);
int ncz_create_fillvalue(NC_VAR_INFO_T* var);
int ncz_create_fillvalue_att(NC_VAR_INFO_T* var);
int ncz_makeattr(NC_OBJ*, NCindex* attlist, const char* name, nc_type typid, size_t len, void* values, NC_ATT_INFO_T**);

/* zvar.c */
int ncz_gettype(NC_FILE_INFO_T*, NC_GRP_INFO_T*, int xtype, NC_TYPE_INFO_T** typep);
int ncz_find_default_chunksizes2(NC_GRP_INFO_T *grp, NC_VAR_INFO_T *var);
int NCZ_ensure_quantizer(int ncid, NC_VAR_INFO_T* var);

/* Undefined */
/* Find var, doing lazy var metadata read if needed. */