Tests hang on command line, throw an exception under Visual Studio on Windows, with S3 support enabled #2739

Open
WardF opened this issue Aug 23, 2023 · 12 comments

@WardF
Member

WardF commented Aug 23, 2023

netcdf-c specific issue mirroring what's been observed over at conda-forge/libnetcdf-feedstock#182

Under Windows, using the dependencies provided by Conda, I see a hang when running the netCDF-C tests from the command line. Specifically, the very first test, ncdump/tst_create_files, hangs. When running under the Visual Studio debugger, an exception is thrown.

Follow the instructions below to reproduce this issue.

Setting up the environment

  1. Install Miniconda from https://docs.conda.io/en/latest/miniconda.html
  2. From the Windows menu, open Anaconda Prompt (Miniconda3)
  3. Add conda-forge channel: $ conda config --add channels conda-forge
  4. Set up environment (thanks to @dopplershift): $ conda create -n bad python=3.11 netcdf4 && conda install -n bad -c conda-forge/label/broken libnetcdf=4.9.2=nompi_h624ddae_109
  5. Switch to new environment: $ conda activate bad

Compiling netCDF-C

Once this environment is configured, change to the top-level netCDF-C directory. From there, build netCDF-C as you normally would under Windows. My workflow is as follows (starting from the top-level netCDF-C directory):

  1. mkdir build
  2. cd build
  3. cmake .. -DENABLE_NCZARR_S3=TRUE
  4. cmake --build . --config Debug -j 4

At this point, you've compiled and can observe the hang on the command line as follows:

  1. cmake --build . --config Debug --target RUN_TESTS
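
Because a hung test never returns, it can help to run a single test binary under a timeout so that a hang is reported separately from an ordinary failure. The following is a minimal Python sketch, not part of netCDF-C, assuming the Python interpreter from the conda environment is available. The helper name `run_with_timeout` and the example path in the comment are illustrative assumptions.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return 'hang' if it exceeds timeout_s, else 'pass' or 'fail'."""
    try:
        # On timeout, subprocess.run kills the direct child process and
        # raises TimeoutExpired (grandchild processes are not handled here).
        result = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    return "pass" if result.returncode == 0 else "fail"


# Example call (hypothetical path; adjust to where your build puts the binary):
#   run_with_timeout([r"ncdump\Debug\tst_create_files.exe"], timeout_s=60)
```

A result of "hang" for tst_create_files would match the behavior described above, while "fail" would point to a different problem.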

Debugging in Visual Studio

You can launch Visual Studio for debugging, but you will need to make a couple of additional changes. Once you've opened Visual Studio, browse to the project file ALL_BUILD in the build/ directory.

  1. Scroll down to tst_create_files
  2. Right-click, go to Properties
  3. Under Debugging, find the Environment entry. Modify it as follows:
    PATH=[output of 'echo %PATH%' from the Anaconda command window]
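
The Environment entry can also be generated rather than copied by hand. Below is a small Python sketch meant to be run from the Anaconda command window with the environment activated. The helper name `vs_environment_entry` is a hypothetical choice for illustration.

```python
import os


def vs_environment_entry(env=None):
    """Build the 'PATH=...' line for the Debugging > Environment field."""
    env = os.environ if env is None else env
    # Windows variable names are case-insensitive, so accept 'Path' as well.
    for name, value in env.items():
        if name.upper() == "PATH":
            return "PATH=" + value
    raise KeyError("PATH is not set in the given environment")
```

Calling `print(vs_environment_entry())` prints a single line that can be pasted into the Environment entry.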


At this point, you can right-click on tst_create_files in Visual Studio, navigate to Debug, and select Step into new instance.

@WardF WardF added platform/windows area/nczarr nczarr related topics. labels Aug 23, 2023
@WardF WardF added this to the 4.9.3 milestone Aug 23, 2023
@WardF WardF changed the title Tests Hang on command line, thrown Exception under Visual Studio, on Windows with S3 support. Tests hang on command line, throw an exception under Visual Studio on Windows, with S3 support enabled Aug 23, 2023
@DennisHeimbigner
Collaborator

I have never successfully installed aws-sdk-cpp on my windows machine. Without that, I am not sure I can fix this problem.

@WardF
Member Author

WardF commented Aug 23, 2023

@DennisHeimbigner If you follow the steps I laid out, using conda, you will have a full environment to replicate this issue. It's how I've managed to recreate it on Windows :).

@dopplershift
Member

@DennisHeimbigner Well, right now it seems entirely possible that NcZarr S3 support is completely busted on Windows, to the point of locking up the library. Fixing this problem is not optional, so the top priority is figuring out how to make our Windows builds robust. conda-forge seems to have figured out how to build aws-sdk-cpp on Windows, so using those builds, or at least the recipe, would seem to be a good place to start.

@WardF
Member Author

WardF commented Aug 23, 2023

@DennisHeimbigner The first step I think is that I need to turn on the tests in our CI; can you remind me where we're at in terms of how to enable them w/ credentials? Once I know what I need, I'll talk to the right people to get the credentials securely stored for our tests to have access to the S3 buckets they'll need.

@DennisHeimbigner
Collaborator

Ryan -
I have a second system for accessing S3. It is based on the HDF5 ros3, and obviates the need for aws-sdk-cpp.
It is in PR #2686, which is merged into master.
It appears to work under Windows.
I have never had any luck with aws-sdk-cpp under Windows. I am trying again with the newest main branch of aws-sdk-cpp.

@dopplershift
Member

dopplershift commented Aug 23, 2023

@DennisHeimbigner If the alternate system strikes the best engineering trade-offs (runtime performance, dependencies vs. code we have to understand and maintain), that's great, provided @WardF is on-board.

I do need to point out though that conda-forge is able to REGULARLY build the released version of aws-sdk-cpp and I've not heard of that library creating lock-up problems anywhere else. So from my perspective, we have some internal challenges that we should be resolving rather than considering aws-sdk-cpp somehow fundamentally flawed.

@WardF
Member Author

WardF commented Aug 23, 2023

I'm currently fixing some issues with how cmake was finding libaws and telling it to link (it wasn't). There are additional steps I'll need to take for Windows-based builds as well. I'd rather avoid internalizing more functionality and the associated technical debt wherever possible. I'd like to prioritize getting the AWS SDK working as the primary solution, after which we can focus on a fallback solution. As it stands, cmake was never linking against the SDK libraries, even if it found them.
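
One way to check whether a generated project actually links the SDK is to inspect the linker inputs that the Visual Studio generator writes into the .vcxproj files. The Python sketch below rests on several assumptions: the Visual Studio generator is in use, linker inputs appear in `<AdditionalDependencies>` elements, and the SDK libraries have "aws" in their file names (for example aws-cpp-sdk-s3). The helper name and the project path in the comment are hypothetical.

```python
import re


def aws_link_dependencies(vcxproj_text):
    """Return AWS-looking libraries listed in <AdditionalDependencies>."""
    found = []
    pattern = r"<AdditionalDependencies>(.*?)</AdditionalDependencies>"
    for deps in re.findall(pattern, vcxproj_text, flags=re.DOTALL):
        # Dependencies are separated by semicolons inside each element.
        for lib in deps.split(";"):
            lib = lib.strip()
            if "aws" in lib.lower() and lib not in found:
                found.append(lib)
    return found


# Example (hypothetical project path inside the build/ directory):
#   with open("liblib/netcdf.vcxproj", encoding="utf-8") as f:
#       print(aws_link_dependencies(f.read()))
```

An empty list for the library's project file would be consistent with the SDK being found but never passed to the linker.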

@DennisHeimbigner
Collaborator

Ward- a possible experiment.

  1. Find the function ncs3sdk.cpp#NC_s3sdkinitialize
  2. Just before the line: Aws::InitAPI(ncs3options); insert this line:
    ncs3options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Debug;

Perhaps this will produce some useful debug output.

@bnlawrence

Hi folks. Has there been any progress on this? We've got a lot of things depending on it, and if there's a substantial issue here it'd be good to know! If, as I expect, you've dealt with this, but not yet updated the ticket, that'd be good to know too!

@WardF
Member Author

WardF commented Sep 25, 2023

Work continues on this; it is proving problematic. Everything works under Linux, but the tests hang (when using the S3 C++ SDK) on macOS and Windows. It's been the primary focus of my work for the last month. We have an internal S3 API that I'm hoping will work as a stop-gap on Windows, but there are other issues I've had to address first. @bnlawrence

@WardF
Member Author

WardF commented Sep 26, 2023

Thanks for your patience; this has been interesting, in the way 'undergraduate Computer Science student' was often interesting. There's no clear answer as to why things are behaving the way they are, so there's been a lot of exploration, and a lot of time invested in writing test programs to answer simple questions.

@bnlawrence

bnlawrence commented Sep 27, 2023

> Thanks for your patience; this has been interesting, in the way 'undergraduate Computer Science student' was often interesting. There's no clear answer as to why things are behaving the way they are, so there's been a lot of exploration, and a lot of time invested in writing test programs to answer simple questions.

OK, good luck! No one wants to live in "interesting times" (copyright Confucius).

4 participants