./configure does not look into hwloc's "lib" directory #471
Comments
Odd - it looks like the configure code should first be checking the with_hwloc/lib and then with_hwloc/lib64, so it should have worked. However, I do note that a couple of variables aren't initialized in that code, and so maybe that's the problem? See if this patch helps:
diff --git a/opal/mca/hwloc/external/configure.m4 b/opal/mca/hwloc/external/configure.m4
index c31dc2c..eb89c4f 100644
--- a/opal/mca/hwloc/external/configure.m4
+++ b/opal/mca/hwloc/external/configure.m4
@@ -109,6 +109,8 @@ AC_DEFUN([MCA_opal_hwloc_external_CONFIG],[
AS_IF([test "$with_hwloc" = "no"], [opal_hwloc_external_want=no])
# If we still want external support, try it
+ opal_hwloc_libdir=
+ opal_hwloc_dir=
AS_IF([test "$opal_hwloc_external_want" = "yes"],
[OPAL_CHECK_WITHDIR([hwloc-libdir], [$with_hwloc_libdir],
[libhwloc.*])
Note that you will be required to re-run autogen.pl before configure so this can take effect. |
Can you put the full configure output and config.log in a gist? |
Ignore my above comment - my bad. The 1.8 series definitely has a bug in it. Specifically, if you don't explicitly set the libdir, then we leave that variable empty. OMPI_CHECK_PACKAGE will then only look at the default LD_LIBRARY_PATH location - it never looks at any with_hwloc/lib or with_hwloc/lib64 options. Looks like it applies to master as well. |
Jeff: consider the following patch (this is against 1.8, but should apply to master):
diff --git a/opal/mca/hwloc/external/configure.m4 b/opal/mca/hwloc/external/configure.m4
index c096100..c137d33 100644
--- a/opal/mca/hwloc/external/configure.m4
+++ b/opal/mca/hwloc/external/configure.m4
@@ -108,6 +108,8 @@ AC_DEFUN([MCA_opal_hwloc_external_CONFIG],[
# If we still want external support, try it
AS_IF([test "$opal_hwloc_external_want" = "yes"],
+ opal_hwloc_dir=
+ opal_hwloc_libdir=
[OMPI_CHECK_WITHDIR([hwloc-libdir], [$with_hwloc_libdir],
[libhwloc.*])
@@ -117,7 +119,8 @@ AC_DEFUN([MCA_opal_hwloc_external_CONFIG],[
AC_MSG_RESULT([($opal_hwloc_dir)])],
[AC_MSG_RESULT([(default search paths)])])
AS_IF([test ! -z "$with_hwloc_libdir" -a "$with_hwloc_libdir" != "yes"],
- [opal_hwloc_libdir="$with_hwloc_libdir"])
+ [opal_hwloc_libdir="$with_hwloc_libdir"],
+ [opal_hwloc_libdir="$opal_hwloc_dir"])
opal_hwloc_external_CPPFLAGS_save=$CPPFLAGS
opal_hwloc_external_CFLAGS_save=$CFLAGS |
👍 |
Wait, no -- that patch is wrong. Let me investigate... |
@eschnett I'd still like to see your OMPI configure output and config.log -- I'm not sure why OMPI would pick $hwloc_dir/lib64 over $hwloc_dir/lib (especially if $hwloc_dir/lib64 does not exist). EDIT Oops -- I meant "...especially if $hwloc_dir/lib64 does not exist..." Are you sure that OMPI is not checking $hwloc_dir/lib64 after it checks $hwloc_dir/lib? It should be checking for both. And if it fails with $hwloc_dir/lib, that area in config.log should provide some illumination as to why it failed there. |
Are you sure that patch is wrong? Here is what I see when I trace the code. Let's assume that we have one hwloc version in a standard installation location, but we want to use the one we hand-built under our home directory. So we configure In the current code, this means that OPAL_CHECK_PACKAGE (in the master) is called with the following arguments:
OPAL_CHECK_PACKAGE([opal_hwloc_external],
[hwloc.h],
[hwloc],
[hwloc_topology_init],
[],
[<foo>],
[],
[opal_hwloc_external_support=yes],
[opal_hwloc_external_support=no])
If I then look at OPAL_CHECK_PACKAGE, I find that the code will correctly find the hwloc.h header in the My proposed change just ensures that if you specify |
@rhc54 Heh -- I was very confused by your comment, but then I realized that your use of Anyhoo... to reply to your question... When you pass a non-empty value to the libdir argument, it's expected to be the exact directory where the library lives -- not a prefix. For example, it's expected to be Your patch passes When you pass an empty value to the libdir argument, the macro will try the base dir argument suffixed with both lib and lib64. I.e., it'll first try This is why @eschnett is seeing |
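The libdir rule described in the comment above (a non-empty libdir is used as the exact directory, while an empty libdir falls back to the base directory suffixed with lib and then lib64) can be sketched in plain shell. This is only an illustration of the described search order, not the actual macro code: the function name `candidate_libdirs` and the path `/opt/hwloc` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print, one per line, the
# library directories that would be tried under the rule described above.
#   $1 = base directory (e.g. the value given to --with-hwloc)
#   $2 = explicit libdir (e.g. the value given to --with-hwloc-libdir); may be empty
candidate_libdirs() {
    base=$1
    libdir=$2
    if [ -n "$libdir" ]; then
        # A non-empty libdir is taken as the exact directory; no suffix is added.
        printf '%s\n' "$libdir"
    elif [ -n "$base" ]; then
        # An empty libdir means: try the base directory with lib, then lib64.
        printf '%s\n' "$base/lib" "$base/lib64"
    fi
    # With both arguments empty nothing is printed: only the default
    # linker search paths would apply.
}

# /opt/hwloc is an assumed install prefix used purely for illustration.
candidate_libdirs /opt/hwloc ""
candidate_libdirs /opt/hwloc /opt/hwloc/lib
```

Under this reading, passing a prefix such as `/opt/hwloc` as the libdir would make the linker look in the prefix itself rather than in its lib subdirectory.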
I didn't realize Github did that - sorry. I'll try to remember they do in the future. I hear what you say, but that isn't what I see in the code. This is what is in OPAL_CHECK_PACKAGE:
AS_IF([test "$opal_check_package_lib_happy" = "no"],
[AS_IF([test "$opal_check_package_libdir" != ""],
[$1_LDFLAGS="$$1_LDFLAGS -L$opal_check_package_libdir/lib"
LDFLAGS="$LDFLAGS -L$opal_check_package_libdir/lib"
AC_VERBOSE([looking for library in lib])
AC_SEARCH_LIBS([$3], [$2],
[opal_check_package_lib_happy="yes"],
[opal_check_package_lib_happy="no"], [$4])
AS_IF([test "$opal_check_package_lib_happy" = "no"],
[ # no go on the as is.. see what happens later...
LDFLAGS="$opal_check_package_$1_save_LDFLAGS"
$1_LDFLAGS="$opal_check_package_$1_orig_LDFLAGS"
unset opal_Lib])])])
Note that it only checks the lib subdirectory if the provided libdir is not empty. So an empty libdir doesn't go down that path. Is this a bug in the check_package code? |
I find that things are working fine if I use gcc instead of clang. With gcc, I was not able to reproduce the problem at all. I am still using clang to build hwloc -- I only switch to gcc for OpenMPI. At the same time, even when I manage to configure hwloc for OpenMPI correctly (using both
I found other places on the internet that describe this OpenMPI/clang incompatibility. I assume that this is some kind of misunderstanding regarding the meaning of Given that OpenMPI has trouble building code with clang, I assume that this may also trigger configuration errors. I have nevertheless created a gist with the output |
Sorry for the delay in replying; I traveled / presented at a workshop this week, which always puts me behind in actual work... @rhc54 No, I think opal_check_package (ompi_check_package in v1.8) is ok. The hwloc/external/configure.m4 will pass an empty libdir value if --with-hwloc-libdir was not specified (and in this case, it wasn't). Hence, opal(ompi)_check_package will check both @eschnett Thanks for the gist. According to that config.log, it looks like you configured it with:
and that the hwloc external component was able to successfully configure. I.e., it found the libhwloc library in
and it ultimately concludes that it can build the hwloc external component (which is the chunk of code that builds against an external hwloc, not the embedded hwloc):
Indeed, it looks like this config.log is from a successful run of configure. Can you send the config.log from a failed configure? As for building issues with clang, other than the annoying warnings about |
FWIW: @rhc54 and I just talked through the opal(ompi)_check_package.m4 code on the phone, and we convinced ourselves that it appears to be correct. So I think that's a red herring in this issue -- let's see what the config.log from a failed configure shows, and go from there. Thanks! |
I have created a new gist with the contents of This is OS X:
and clang installed via MacPorts (that's not Apple's system clang):
|
I have now identified a similar issue when building OpenMPI on Linux with gcc. This is a self-built gcc:
Make screen output and OpenMPI configure log file are at https://gist.github.com/eschnett/0cc565ce3768af4e9119. Configuring fails when configuring libevent; libevent's configure output is at https://gist.github.com/eschnett/75316b734bdee0a8b08e. |
Regarding the Linux/gcc issue: If I configure OpenMPI like this (taken from
then it configures and builds fine. The difference is that I explicitly pass |
This is great information; thank you. I may be swamped and unable to look at this until Monday, but I'll definitely get to it. Sent from my phone. No type good.
On Mar 19, 2015, at 8:57 AM, Erik Schnetter <notifications@github.com> wrote:
Regarding the Linux/gcc issue: If I configure OpenMPI like this (taken from config.log):
$ /project/eschnett/shelob/src/cc/FunHPC.cxx/external/openmpi-1.8.4/configure --prefix=/project/eschnett/shelob/src/cc/FunHPC.cxx/openmpi-1.8.4 --with-hwloc=/project/eschnett/shelob/src/cc/FunHPC.cxx/hwloc-1.10.1 CC=/project/eschnett/shelob/gcc-4.9.2/bin/gcc CXX=/project/eschnett/shelob/gcc-4.9.2/bin/g++ CFLAGS=-m128bit-long-double -march=native -Wall -g -O3 CXXFLAGS=-m128bit-long-double -march=native -Wall -g -O3 LDFLAGS=-L/project/eschnett/shelob/src/cc/FunHPC.cxx/hwloc-1.10.1/lib -Wl,-rpath,/project/eschnett/shelob/src/cc/FunHPC.cxx/hwloc-1.10.1/lib
then it configures and builds fine. The difference is that I explicitly pass LDFLAGS. |
@jsquyres did you ever resolve this? Is it something that needs to be done before releasing 1.8.5? |
(I know, I'm dreadfully late in following up on this issue... :-( ) I finally spent some quality time with this issue today. I believe that part of what has been confusing me is that there are (at least?) two distinct issues being reported on this issue:
First issue: duplicate __sputc symbol
I was finally able to duplicate the first issue. I'm able to replicate at the v1.8 branch head with a ports-installed clang 3.7 on the latest stable Yosemite (10.10.3) with the XCode clang (i.e., no need for a ports-installed clang). The key is the CFLAGS:
$ ./configure --prefix=/Users/jsquyres/bogus 'CFLAGS=-march=native -Wall -g -O3'
(note, also, the lack of a need for an external hwloc to replicate the issue) What's sketchy is that It turns out that even Now that I'm able to replicate, I'm digging deeper...
Second issue: libevent configure fails with external hwloc
I finally asked myself the Right question today: "why on earth does libevent's configure care anything about hwloc?" Specifically: libevent doesn't depend on hwloc, so libevent's configure should be orthogonal to however we're configuring/using hwloc. But. When we have an external hwloc, we add
You can see that the test program fails to run because it can't find libhwloc.so.5. Notice that even though the test program doesn't invoke any hwloc functionality, it was compiled with Notice that this will only happen when hwloc is installed in a path that is not searched by the run-time linker (e.g., it's not in your $LD_LIBRARY_PATH). I have to think about this a little to develop a correct fix. In the meantime, @eschnett's LDFLAGS-on-the-configure-command-line workaround is a good one. Adding the hwloc library path to LD_LIBRARY_PATH would likely also work. |
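The two workarounds mentioned above (passing LDFLAGS with an rpath on the configure command line, or adding the hwloc library directory to LD_LIBRARY_PATH) can be sketched as small shell helpers. The function names and the `/opt/hwloc` prefix are hypothetical; the flag layout mirrors the LDFLAGS value quoted earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: build linker flags for an hwloc install prefix so that
# test programs both link against libhwloc and can find it at run time.
hwloc_ldflags() {
    prefix=$1
    printf '%s' "-L$prefix/lib -Wl,-rpath,$prefix/lib"
}

# Hypothetical helper: prepend a directory to an existing search path value
# without leaving a stray colon when the existing value is empty.
prepend_path() {
    dir=$1
    current=$2
    if [ -n "$current" ]; then
        printf '%s' "$dir:$current"
    else
        printf '%s' "$dir"
    fi
}

# /opt/hwloc is an assumed prefix used purely for illustration.
HWLOC_PREFIX=/opt/hwloc
echo "LDFLAGS=$(hwloc_ldflags "$HWLOC_PREFIX")"
echo "LD_LIBRARY_PATH=$(prepend_path "$HWLOC_PREFIX/lib" "${LD_LIBRARY_PATH-}")"
```

The first value would be passed as `LDFLAGS=...` on the configure command line; the second would be exported before running configure. Either way the run-time linker gets a chance to locate libhwloc when configure runs its test executables.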
Regarding |
Regarding Alternatively, you could add the respective I circumvent this issue now by setting |
Regarding the duplicate __sputc issue: well, this was tangled. The root cause of the issue is that AC_C_INLINE somehow determines that the compiler does not support the #define inline ...which causes all kinds of Badness, the most obvious of which is that it screws up the actual inlining of Terrible. However, OMPI doesn't need to use the AC_C_INLINE test any more because we require a C99 compiler (which is guaranteed to support the As some added bonuses:
|
Fixed the first issue on master in 1029dea (I accidentally referred to the wrong bug number in the commit message -- curses!). |
…igure
== Short version
Do not export special variables into the environment (e.g., LIBS, LDFLAGS, etc.) when invoking subdir configure scripts. This prevents problems described in #471.
== More detail
Exporting special env variables before invoking a subdir configure script causes problems in some cases. E.g., in #471, when the user configures with `--with-hwloc=/path/to/hwloc` and that directory is *not* in a default linker search location, the libevent subdir configuration will fail. This happens because:
1. We'll pass LIBS="-L/path/to/hwloc/lib -lhwloc" to the libevent configure script
2. Meaning: configure-generated executables will link successfully
3. But unless LD_LIBRARY_PATH (or some other tell-the-linker-where-to-find-things mechanism) includes /path/to/hwloc/lib, the executable can't run.
Specifically, the libevent "hey, does the compiler generate proper executables?" check will fail, and configure will abort (because OMPI needs libevent).
I checked the history: exporting these vars dates all the way back to LAM/MPI. I can't think of a reason why we need to export these variables -- AC_CONFIG_SUBDIRS doesn't do it; subdir configure scripts should be orthogonal from the upper-layer configure script (and its variables). So let's remove these export statements and see if anything breaks.
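The mechanism the commit message relies on, that a child process only sees variables the parent has exported, can be shown with a small shell sketch. Here `sh -c` stands in for a subdir configure script and the helper name `child_sees_libs` is hypothetical. This illustrates the shell behavior only; it is not the code that was changed in Open MPI.

```shell
#!/bin/sh
# Stand-in for a subdir configure script: report the LIBS value the child
# process sees, or the word "unset" if LIBS is absent from its environment.
child_sees_libs() {
    sh -c 'printf "%s" "${LIBS-unset}"'
}

# Case 1: LIBS is set in the parent shell but not exported.
unset LIBS
LIBS="-L/path/to/hwloc/lib -lhwloc"
not_exported=$(child_sees_libs)

# Case 2: LIBS is exported, so it leaks into the child's environment.
export LIBS
exported=$(child_sees_libs)

echo "not exported: $not_exported"
echo "exported:     $exported"
```

In the first case the child runs with its own defaults. In the second it inherits `-lhwloc`, which is the situation that would make libevent's test executables depend on a library the run-time linker may not be able to find.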
…igure (cherry picked/adapted from commit open-mpi/ompi@9ac9be1)
This issue is now fixed on master, and the 2nd (of 2) PRs is opened to bring the fix over to v1.8. Yippie! So I'm closing this issue. |
I configured OpenMPI 1.8.4 with a self-built hwloc library using the command
This was on OS X; I also tried Linux.
The configure stage aborted with the error (taken from config.log):
A few lines further up I see
This indicates that OpenMPI does not look into hwloc's lib directory, but only into its lib64 directory. Unfortunately, there is no lib64 directory, and hwloc's libraries are installed into the lib directory. (This is how hwloc installed itself; I did not modify the install procedure.)
As a workaround, I can specify --with-hwloc-libdir to make things work.