Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NetCDF] Simplify NetCDF build script #5319

Merged

Conversation

Alexander-Barth
Copy link
Contributor

@Alexander-Barth Alexander-Barth commented Aug 16, 2022

  • require LibCURL_jll 7.73.0
  • drop support for julia 1.3
  • single build script
  • enable Zstandard

See also issue #4511

* drop support for julia 1.3
* single build script
* enable Zstandard
@mkitti
Copy link
Contributor

mkitti commented Aug 16, 2022

I'm just curious how the Zstandard integration works here. I would like to test how that interacts with https://github.com/JuliaIO/HDF5.jl/tree/master/filters/H5Zzstd

@visr
Copy link
Contributor

visr commented Aug 16, 2022

@mkitti it hasn't quite been hooked up yet in the netcdf wrapper packages, though see JuliaGeo/NetCDF.jl#160 (comment).

Both approaches would use the same Zstd_jll. What kind of interaction are you thinking about? I'd be very strange for someone to use both on the same file, since they'd probably do similar things, one just going through the libnetcdf interface.

@Alexander-Barth
Copy link
Contributor Author

After looking into a how the these filters work, there is certainly more work needed before zstandard (and NetCDF plugins for HDF5 in general) become usable with NetCDF_jll.
So far, in they do not compile on Windows (Unidata/netcdf-c#2468) and on Mac it seems that some plugins have the .so extension while they should have the .dylib. Currently it is a mix of both on aarch64-apple-darwin:

$ ls .libs/lib__nc*dylib
.libs/lib__nch5bzip2.0.dylib    .libs/lib__nch5fletcher32.0.dylib  .libs/lib__nch5noop.0.dylib   .libs/lib__nch5shuffle.0.dylib  .libs/lib__nch5zstd.0.dylib
.libs/lib__nch5bzip2.dylib      .libs/lib__nch5fletcher32.dylib    .libs/lib__nch5noop1.0.dylib  .libs/lib__nch5shuffle.dylib    .libs/lib__nch5zstd.dylib
.libs/lib__nch5deflate.0.dylib  .libs/lib__nch5misc.0.dylib        .libs/lib__nch5noop1.dylib    .libs/lib__nch5unknown.0.dylib  .libs/lib__nczmisc.0.dylib
.libs/lib__nch5deflate.dylib    .libs/lib__nch5misc.dylib          .libs/lib__nch5noop.dylib     .libs/lib__nch5unknown.dylib    .libs/lib__nczmisc.dylib
$ ls .libs/lib__nc*so
.libs/lib__nczhdf5filters.so  .libs/lib__nczstdfilters.so

which leads to this error if lib__nczhdf5filters is declared as a LibraryProduct

[ Info: Could not locate lib__nczhdf5filters inside ["/mnt/data1/abarth/src/Yggdrasil/N/NetCDF/build/aarch64-apple-darwin/McixRJ5o/aarch64-apple-darwin20-libgfortran5-cxx11/destdir/lib64", "/mnt/data1/abarth/src/Yggdrasil/N/NetCDF/build/aarch64-apple-darwin/McixRJ5o/aarch64-apple-darwin20-libgfortran5-cxx11/destdir/lib"]
┌ Error: Built NetCDF but lib__nczhdf5filters still unsatisfied:
└ @ BinaryBuilder ~/.julia/packages/BinaryBuilder/wohhx/src/AutoBuild.jl:906

Once this is resolved, there is the question where to install the plugins so that HDF5 can find them. Here is a relevant discussion Unidata/netcdf-c#2294 and there will probably some changed in the upcoming version 4.9.1.
The approach of using the environment variable HDF5_PLUGIN_PATH (or dropping file in the HDF5 installation directory) does not seem to be well suited to the way we install libraries in julia.

I think it would be good to track the integration of HDF5 plugins in a separate issue.

@mkitti
Copy link
Contributor

mkitti commented Aug 18, 2022

Just to elaborate on the follwoing Unidata/netcdf comment from Unidata from the Julia perspective:

It turns out that we are not building and installing the zstandard filter code itself. This is a small C program that we must compile and install on the users system, or else zstandard cannot work for them.

The HDF5 plugin they refer to is located here:
https://github.com/aparamon/HDF5Plugin-Zstandard
This is small piece of C code that glues HDF5 and ZStandard together.

For Julia, rather than compiling this small piece of C code, we just translated it to Julia here:
https://github.com/JuliaIO/HDF5.jl/blob/master/filters/H5Zzstd/src/H5Zzstd.jl

Other translated filter plugins can be found here:
https://github.com/JuliaIO/HDF5.jl/blob/master/filters

Technically, we could package up the C plugin in BinaryBuilder, but so far I have not seen a strong incentive to do so.

@Alexander-Barth
Copy link
Contributor Author

Do you register these julia plugins with H5Pset_filter when HDF5.jl is loaded?

@Alexander-Barth
Copy link
Contributor Author

I removed the Zstd_jll dependency because the support for Zstandard and plugins is not jet functional.

@visr
Copy link
Contributor

visr commented Aug 18, 2022

Sounds good! So this is good to go, right? I see julia 1.8 got tagged, would be great to have a fully functional netcdf binary by the time it is announced :)

@mkitti
Copy link
Contributor

mkitti commented Aug 18, 2022

Do you register these julia plugins with H5Pset_filter when HDF5.jl is loaded?

The filters are registered via H5Zregister:
https://github.com/JuliaIO/HDF5.jl/blob/e3252afec2dadecc5a6ba644924dae2922c4e5b0/src/filters/filters.jl#L233

via their __init__:
https://github.com/JuliaIO/HDF5.jl/blob/master/filters/H5Zzstd/src/H5Zzstd.jl#L122

If you want to use Zstd compression with HDF5.jl, then you just have to do:

using HDF5
using H5Zzstd

This is documented here:
https://juliaio.github.io/HDF5.jl/stable/filters/

H5Pset_filter is used to add a filter to a dataset's filter pipeline. Via HDF5.jl this is usually done by passing a vector of Filters to the filters keyword of create_dataset.

If we're talking about filter ID 32015, then using H5Zzstd should work for you.

@mkitti
Copy link
Contributor

mkitti commented Aug 18, 2022

One can confirm the availability of the filter via H5Zfilter_avail

julia> using HDF5

julia> HDF5.API.h5z_filter_avail(32015)
false

julia> using H5Zzstd

julia> H5Z_FILTER_ZSTD
32015

julia> HDF5.API.h5z_filter_avail(32015)
true

julia> HDF5.API.h5z_filter_avail(H5Z_FILTER_ZSTD)
true

julia> HDF5.Filters.isavailable(ZstdFilter)
true

julia> HDF5.Filters.isavailable(H5Z_FILTER_ZSTD)
true

@Alexander-Barth
Copy link
Contributor Author

Thank you for this detailed information! So there are good chances that this mechanism already works for NetCDF too.
As @visr said, it would be good to have this release so that we have again a working NetCDF_jll for all major OSes.

@staticfloat
Copy link
Member

So to be clear, the expectation here is that this will build a single version of NetCDF that can be loaded in Julia 1.6, 1.7 and 1.8. I see the dependency on LibCURL is specifically set to the version in 1.6, I think there's a good chance this will work, so let's roll with it!

@staticfloat staticfloat merged commit c3edfae into JuliaPackaging:master Aug 18, 2022
@maleadt
Copy link
Contributor

maleadt commented Aug 19, 2022

Since yesterday, quite some packages have started to fail on Julia 1.9 segfaulting in libnetcdf during initialization, so I guess that may be related to this PR. See https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/f7144f7/report.html, e.g., https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/f7144f7/ArgoData.primary.log (for one of the shorter failed tests). This happens during precompilation, when NCDatasets.init_certificate_authority executes. It's possible this is related to the PkgEval sandbox, but regardless it shouldn't segfault like that.

@mkitti
Copy link
Contributor

mkitti commented Aug 19, 2022

Julia 1.8 uses libcurl 7.84.0, which is the same version as Julia master (1.9) after JuliaLang/julia#45967. ArgoData seems to pass the tests fine on my Julia 1.8 running on Windows.

Test Summary: | Pass  Total     Time
ArgoData.jl   |    6      6  2m23.7s

@mkitti
Copy link
Contributor

mkitti commented Aug 19, 2022

The segmentation faults seem to be occurring on the call to strdup:
https://github.com/Unidata/netcdf-c/blob/717e022e80cc696ed2855b6e38d58b26ebedb8ba/libdispatch/drc.c#L168-L182

@mkitti
Copy link
Contributor

mkitti commented Aug 19, 2022

ClimateModels fails after nc_open
https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/f7144f7/ClimateModels.primary.log

@Alexander-Barth
Copy link
Contributor Author

Alexander-Barth commented Aug 20, 2022

I think all this is related to Unidata/netcdf-c#2337. We are currently calling a private API to define the path of the certificate authority for HTTPS because the (new) public API produces a segfault. Now it seems that the private API segfaults too. I guess I will need to disable HTTPS in NCDatasets until this upstream issue is resolved (at least under Julia 1.9).

@maleadt
Copy link
Contributor

maleadt commented Aug 20, 2022

I guess I will need to disable HTTPS in NCDatasets until this upstream issue is resolved (at least under Julia 1.9).

That would be great. Alternatively, if this is related to PkgEval's sandbox environment I could tweak that, as long as we get rid of the 30+ segfaulting packages on PkgEval :-)

@Alexander-Barth
Copy link
Contributor Author

OK, I have released NCDatasets 0.12.7 which does not call NCDatasets.init_certificate_authority on julia 1.9. Surprisingly NCDatasets 0.12.6 works on Julia 1.9.0-DEV.1160 (nightly) in CI:

https://github.com/Alexander-Barth/NCDatasets.jl/runs/7931319355?check_suite_focus=true

and 1.9.0-DEV.1162 on my machine. I am wondering how I can reproduce this error locally. Does PkgEval also use nightly builds?

@maleadt
Copy link
Contributor

maleadt commented Aug 22, 2022

Does PkgEval also use nightly builds?

The way to test this is explained in the PkgEval README: https://github.com/JuliaCI/PkgEval.jl#why-does-my-package-fail

Interestingly however, it does not fail when using julia=nightly (which uses pre-built binaries from Julia CI), whereas it does when forcing PkgEval.jl to build Julia from scratch:

julia> using PkgEval

julia> config = Configuration(julia="master", buildflags=["JULIA_CPU_TARGET=native", "JULIA_PRECOMPILE=0"])

julia> PkgEval.sandboxed_julia(config)

# in the sandbox here

pkg> add NCDatasets@0.12.6

julia> using NCDatasets

# segfault

That's not great, because PkgEval source builds are supposed to be identical to the ones performed by CI... @staticfloat, any thoughts? The build script is here: https://github.com/JuliaCI/PkgEval.jl/blob/893b93e5723782653dc87aaec2e00b1b08b6cce5/src/julia.jl#L107-L186

@maleadt
Copy link
Contributor

maleadt commented Aug 22, 2022

Interestingly, if I switch to the rootfs used by Julia CI, the issue disappears... Let's do that for PkgEval then.

@Alexander-Barth
Copy link
Contributor Author

Alexander-Barth commented Aug 23, 2022

This issue Unidata/netcdf-c#2486 might be relevant where a memory problem during initialization is observed (independently of julia) in ncrc_setrchome.

@Alexander-Barth
Copy link
Contributor Author

For the record: the C option -std=c99 (used before) results that the function strdup is undefined (as it is not part of C99) and any operation on strings in NetCDF like using the home path leads in a crash.

@giordano
Copy link
Member

strdup is defined conditionally:

sandbox:${WORKSPACE} # grep -B3 -r '\*strdup' /opt/${target}/${target}/sys-root/usr/include/
/opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/include/string.h-#if defined __USE_SVID || defined __USE_BSD || defined __USE_XOPEN_EXTENDED \
/opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/include/string.h-    || defined __USE_XOPEN2K8
/opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/include/string.h-/* Duplicate S, returning an identical malloc'd string.  */
/opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/include/string.h:extern char *strdup (__const char *__s)

The gnu99 standard defines some of those macros, thus enabling strdup to be defined, we had to deal with this yesterday in

cc zipflow.c -shared -fPIC -std=gnu99 -o ${libdir}/libzipflow.${dlext} -I${includedir} -L${libdir} -lz

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants