Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand the NC_INMEMORY capabilities #879

Merged
merged 19 commits into from
May 7, 2018
Merged

Expand the NC_INMEMORY capabilities #879

merged 19 commits into from
May 7, 2018

Conversation

DennisHeimbigner
Copy link
Collaborator

re: esupport MQO-415619
and #708

Expand the NC_INMEMORY capabilities to support writing and accessing
the final modified memory.

Three new functions have been added:
nc_open_memio, nc_create_mem, and nc_close_memio.

The following new capabilities were added.

  1. nc_open_memio() allows the NC_WRITE mode flag
    so a chunk of memory can be passed in and be modified
  2. nc_create_mem() allows the NC_INMEMORY flag to be set
    to cause the created file to be kept in memory.
  3. nc_close_mem() allows the final in-memory contents to be
    retrieved at the time the file is closed.
  4. A special flag, NC_MEMIO_LOCK, is provided to ensure that
    the provided memory will not be freed or reallocated.

Note the following.

  1. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is not set,
    then the netcdf-c library will take control of the incoming memory.
    This means that the original memory block should not be freed
    but the block returned by nc_close_mem() must be freed.
  2. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is set,
    then modifications to the original memory may fail if the space available
    is insufficient.

Documentation is provided in the file docs/inmemory.md.
A test case is provided: nc_test/tst_inmemory.c driven by
nc_test/run_inmemory.sh

WARNING: changes were made to the dispatch table for
the close entry. From int (*close)(int) to int (close)(int,void).

and #708

Expand the NC_INMEMORY capabilities to support writing and accessing
the final modified memory.

Three new functions have been added:
nc_open_memio, nc_create_mem, and nc_close_memio.

The following new capabilities were added.
1. nc_open_memio() allows the NC_WRITE mode flag
   so a chunk of memory can be passed in and be modified
2. nc_create_mem() allows the NC_INMEMORY flag to be set
   to cause the created file to be kept in memory.
3. nc_close_mem() allows the final in-memory contents to be
   retrieved at the time the file is closed.
4. A special flag, NC_MEMIO_LOCK, is provided to ensure that
   the provided memory will not be freed or reallocated.

Note the following.
1. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is not set,
   then the netcdf-c library will take control of the incoming memory.
   This means that the original memory block should not be freed
   but the block returned by nc_close_mem() must be freed.
2. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is set,
   then modifications to the original memory may fail if the space available
   is insufficient.

Documentation is provided in the file docs/inmemory.md.
A test case is provided: nc_test/tst_inmemory.c driven by
nc_test/run_inmemory.sh

WARNING: changes were made to the dispatch table for
the close entry. From int (*close)(int) to int (*close)(int,void*).
@WardF
Copy link
Member

WardF commented Feb 26, 2018

This pull request introduces 1 alert when merging ccc70d6 into fc6ab98 - view on lgtm.com

new alerts:

  • 1 for Comparison result is always the same

Comment posted by lgtm.com

@edhartnett
Copy link
Contributor

edhartnett commented Feb 26, 2018

In order to complete #856 some of this code (all HDF5 calls) must be relocated to libhdf5 out of libsrc4. libsrc4 will not have access to HDF5 header files. (See the discussion in #856 for more details.)

What would be helpful would be to start functions with nc4_ for code that is not hdf5 dependent (i.e. will remain in libsrc4), and with hdf5_ for functions that use HDF5 data structs and functions. That will make it easier to separate without changing function names. This is the convention I adopted when separating the code in my prototype. Happy to use some other convention if that is preferred. (I used the same convention for file names, for example hdf5file.c and nc4file.c, to handle HDF5 file functions and internal metadata file functions, respectively.)

Perhaps the inmemory code deserves it's own dispatch directory?

Can you have inmemory without HDF5? Or is HDF5 always going to be required when building inmemory capabilities?

Also these changes fail address sanitizer, so some memory leaks are present.

@DennisHeimbigner
Copy link
Collaborator Author

In memory works for netcdf classic, cdf5, and netcdf enhanced.
Its implementation is intimately associated with those format dispatchers
so a separate dispatch is not possible without IMO enormous code duplication.
Also, your proposed rename will violate the goal of having non-static
functions begin with nc or NC.
Can you send me the sanitizer output?

@DennisHeimbigner
Copy link
Collaborator Author

Ward- I cannot fix the alert because that lgtm page is failing to load for me.
I will fix as soon as I can access it, or you can send me the text of the failure.

@WardF
Copy link
Member

WardF commented Feb 26, 2018

I will send that shortly, thanks Dennis!

@edhartnett
Copy link
Contributor

WRT inmemory I have not studied it, but I noticed some HDF5 calls. Presumably they are used when the user wants to turn an in-memory file into a real HDF5 file?

Does this happen in nc_close? Will it call the nc_close in the HDF5 layer? If so, then that would make it easy to call HDF5 code there, and non-HDF5 code in the libsrc4 layer. HDF5_close() gets called by the dispatch table, and when it has done it's work, calls NC_close() in libsrc4. So the HDF5 part has to be separated out into functions that are only called in HDF5_close().

WRT function names: Sorry, I didn't realize that nc_ was meant for all non-static functions. I thought you meant just the ones in the libsrc4 layer. I have already violated that rule with the hdf4 code. For example the HDF4 nc_open is called HDF4_OPEN.

The convention I was using was NC_ and nc_ in the libsrc4 directory, HDF5_ and hdf5_ int the HDF5 dispatch code, HDF4_ and hdf4_ for HDF4, etc. I was following the pnetcdf model, which uses ncp_.

But I can just add an NC_ and nc_ everywhere in front. So it would be NC_HDF5_open() for the HDF5 nc_open, and NC_HDF4_open for the HDF4 nc_open. Sound good?

I will send some sanitizer output via email.

@DennisHeimbigner
Copy link
Collaborator Author

The inmemory code for netcdf4 uses two parts of hdf5 library.

  1. "core" driver which allows the use of memory for storing the file contents
  2. the image operations (see https://support.hdfgroup.org/HDF5/doc1.8/Advanced/FileImageOperations/HDF5FileImageOperations.pdf) to set and get the
    initial and final contents of the inmemory file.

The netcdf-3 and cdf5 code use a modified ncio subclass: memio.c

As far as I know, HDF4 does not support inmemory operation, so if that flag
is set for hdf5 open/create, then it should be an error.

@WardF
Copy link
Member

WardF commented Feb 26, 2018

This pull request introduces 1 alert when merging 25045b7 into fc6ab98 - view on lgtm.com

new alerts:

  • 1 for Comparison result is always the same

Comment posted by lgtm.com

@WardF
Copy link
Member

WardF commented Feb 26, 2018

lgtm is something Ryan turned on across the board for Unidata projects; I will figure out how to get you access to the results, but it is a minor thing that can be potentially ignored. I will share with you shortly in person.

2. fix lgtm alert
4. fix problem in examples/C/hdf5plugins
   where a file overwrite problem occurs.
@DennisHeimbigner
Copy link
Collaborator Author

I am putting this on hold for a while. It appears that I had made some changes
in another branch that I intended to merge with this one before final merge.
In any case, I apparently (1) forgot to do the merge and (2) deleted the other branch.
So I will have to recreate those changes.

There is an error in handling the nc_create_memio function
as noted in the above esupport thread.
This attempts to fix it.

Also do a master merge
@DennisHeimbigner
Copy link
Collaborator Author

Ok, this appears to be ready to go again. So it should be merged.

@WardF
Copy link
Member

WardF commented May 4, 2018

This will be the next one merged!

@WardF WardF merged commit 0123000 into master May 7, 2018
@WardF WardF deleted the inmemory.dmh branch May 7, 2018 19:17
@mlandiech
Copy link

Hello,

I have some issue with InMemory buffer returned in case of write mode.
First, if I create a new file with InMemory and write mode, when I close it, the returned buffer is not valid. When I try to open buffer with InMemory open function it does not work because version is missing on first bytes.
Second, if I open a file with InMemory and write mode, then I modify its content , when I close it, the returned buffer does not contain modifications. When I try to open buffer with InMemory open function, values do not match

Do you have an idea about what can happen?
I think that returned buffer is the initial one.

Regards,

@DennisHeimbigner
Copy link
Collaborator Author

Do you have a simple program to show this bug?

@mlandiech
Copy link

Yes I modified your test file. You can find modified file in attached archive.
I added 2 sections prefix with :
// [ADD mlandiech]

One try to reopen a newly created buffer.
Second try to reopen and check a modified buffer.

Regards

tst_inmemory.zip

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants