./configure does not look into hwloc's "lib" directory #471

Closed
eschnett opened this issue Mar 12, 2015 · 24 comments

@eschnett

I configured OpenMPI 1.8.4 with a self-built hwloc library using the command

/Users/eschnett/src/cc/FunHPC.cxx/external/openmpi-1.8.4/configure --prefix=/Users/eschnett/src/cc/FunHPC.cxx/openmpi-1.8.4 --disable-shared --with-hwloc=/Users/eschnett/src/cc/FunHPC.cxx/hwloc-1.10.1 CC=clang CXX=clang++ CFLAGS=-march=native -Wall -g -O3 CXXFLAGS=-march=native -Wall -g -O3

This was on OS X; I also tried Linux.

The configure stage aborted with the error (taken from config.log):

configure:76897: result: no
configure:77822: checking if MCA component hwloc:external can compile
configure:77824: result: no
configure:78090: WARNING: Did not find a suitable static opal hwloc component
configure:78092: error: Cannot continue

A few lines further up I see

configure:76888: clang -o conftest -DNDEBUG -march=native -Wall -g -O3 -finline-functions -fno-strict-aliasing    -I/Users/eschnett/src/cc/FunHPC.cxx/hwloc-1.10.1/include  -Wl,-flat_namespace  -L/Users/eschnett/src/cc/FunHPC.cxx/hwloc-1.10.1/lib64 conftest.c -lhwloc    >&5

This indicates that OpenMPI does not look into hwloc's lib directory, but only into its lib64 directory. Unfortunately, there is no lib64 directory, and hwloc's libraries are installed into the lib directory. (This is how hwloc installed itself; I did not modify the install procedure.)

As a workaround, I can specify --with-hwloc-libdir to make things work.
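A minimal sketch of that workaround, for anyone hitting the same error. The helper name `hwloc_configure_args` and the default value of `HWLOC_PREFIX` are illustrative assumptions; only the `--with-hwloc` and `--with-hwloc-libdir` options come from the report above.

```shell
#!/bin/sh
# Hypothetical helper: print the two configure options for an hwloc
# installed under the prefix given as $1, naming its lib directory explicitly.
hwloc_configure_args() {
    printf '%s %s\n' "--with-hwloc=$1" "--with-hwloc-libdir=$1/lib"
}

# Example prefix; an assumption, substitute your own install location.
HWLOC_PREFIX="${HWLOC_PREFIX:-$HOME/hwloc-1.10.1}"

# Show the resulting command line rather than running configure.
echo "./configure $(hwloc_configure_args "$HWLOC_PREFIX")"
```

This only prints the command; paste the output into your own configure invocation alongside whatever other options you use.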

@rhc54
Contributor

rhc54 commented Mar 12, 2015

Odd - it looks like the configure code should first be checking the with_hwloc/lib and then with_hwloc/lib64, so it should have worked. However, I do note that a couple of variables aren't initialized in that code, and so maybe that's the problem? See if this patch helps:

diff --git a/opal/mca/hwloc/external/configure.m4 b/opal/mca/hwloc/external/configure.m4
index c31dc2c..eb89c4f 100644
--- a/opal/mca/hwloc/external/configure.m4
+++ b/opal/mca/hwloc/external/configure.m4
@@ -109,6 +109,8 @@ AC_DEFUN([MCA_opal_hwloc_external_CONFIG],[
     AS_IF([test "$with_hwloc" = "no"], [opal_hwloc_external_want=no])

     # If we still want external support, try it
+    opal_hwloc_libdir=
+    opal_hwloc_dir=
     AS_IF([test "$opal_hwloc_external_want" = "yes"],
           [OPAL_CHECK_WITHDIR([hwloc-libdir], [$with_hwloc_libdir], 
                               [libhwloc.*])

Note that you will be required to re-run autogen.pl before configure so this can take effect.

@jsquyres
Member

Can you put the full configure output and config.log in a gist?

@rhc54
Contributor

rhc54 commented Mar 12, 2015

Ignore my above comment - my bad. The 1.8 series definitely has a bug in it. Specifically, if you don't explicitly set the libdir, then we leave that variable empty. OMPI_CHECK_PACKAGE will then only look at the default LD_LIBRARY_PATH location - it never looks at any with_hwloc/lib or with_hwloc/lib64 options.

Looks like it applies to master as well.

@rhc54
Contributor

rhc54 commented Mar 13, 2015

Jeff: consider the following patch (this is against 1.8, but should apply to master):

diff --git a/opal/mca/hwloc/external/configure.m4 b/opal/mca/hwloc/external/configure.m4
index c096100..c137d33 100644
--- a/opal/mca/hwloc/external/configure.m4
+++ b/opal/mca/hwloc/external/configure.m4
@@ -108,6 +108,8 @@ AC_DEFUN([MCA_opal_hwloc_external_CONFIG],[

     # If we still want external support, try it
     AS_IF([test "$opal_hwloc_external_want" = "yes"],
+          opal_hwloc_dir=
+          opal_hwloc_libdir=
           [OMPI_CHECK_WITHDIR([hwloc-libdir], [$with_hwloc_libdir], 
                               [libhwloc.*])

@@ -117,7 +119,8 @@ AC_DEFUN([MCA_opal_hwloc_external_CONFIG],[
                   AC_MSG_RESULT([($opal_hwloc_dir)])],
                  [AC_MSG_RESULT([(default search paths)])])
            AS_IF([test ! -z "$with_hwloc_libdir" -a "$with_hwloc_libdir" != "yes"],
-                 [opal_hwloc_libdir="$with_hwloc_libdir"])
+                 [opal_hwloc_libdir="$with_hwloc_libdir"],
+                 [opal_hwloc_libdir="$opal_hwloc_dir"])

            opal_hwloc_external_CPPFLAGS_save=$CPPFLAGS
            opal_hwloc_external_CFLAGS_save=$CFLAGS

@jsquyres
Member

👍

@jsquyres
Member

Wait, no -- that patch is wrong. Let me investigate...

@jsquyres
Member

@eschnett I'd still like to see your OMPI configure output and config.log -- I'm not sure why OMPI would pick $hwloc_dir/lib64 over $hwloc_dir/lib (especially if $hwloc_dir/lib64 does not exist).

EDIT Oops -- I meant "...especially if $hwloc_dir/lib64 does not exist..."

Are you sure that OMPI is not checking $hwloc_dir/lib64 after it checks $hwloc_dir/lib? It should be checking for both. And if it fails with $hwloc_dir/lib, that area in config.log should provide some illumination as to why it failed there.

@rhc54
Contributor

rhc54 commented Mar 13, 2015

Are you sure that patch is wrong? Here is what I see when I trace the code. Let's assume that we have one hwloc version in a standard installation location, but we want to use the one we hand-built under our home directory. So we configure --with-hwloc=<foo>.

In the current code, this means that OPAL_CHECK_PACKAGE (in the master) is called with the following arguments:

           OPAL_CHECK_PACKAGE([opal_hwloc_external],
                              [hwloc.h],
                              [hwloc],
                              [hwloc_topology_init],
                              [],
                              [<foo>],
                              [],
                              [opal_hwloc_external_support=yes],
                              [opal_hwloc_external_support=no])

If I then look at OPAL_CHECK_PACKAGE, I find that the code will correctly find the hwloc.h header in the <foo>/include directory. However, because the libdir argument is empty, it will look for the hwloc library only in the standard install locations - it never looks at the <foo>/lib[64] locations. Worse, since we have a default install in the standard location, it will pick that library up and use it - even though the header it uses will be the one in <foo>.

My proposed change just ensures that if you specify --with-hwloc=<foo>, then we check for the library in the lib and lib64 under that directory.

@jsquyres
Member

@rhc54 Heh -- I was very confused by your comment, but then I realized that your use of <foo> was unescaped, so Github turned it into the HTML tag <foo>, which renders as "empty" in HTML email and the Github web interface. So I edited your comment in the web interface and put all cases of <foo> in single quotes (i.e., verbatim) mode, and now <foo> shows up in all the proper places... and your comment makes much more sense. :-)

Anyhoo... to reply to your question...

When you pass a non-empty value to the libdir argument, it's expected to be the exact directory where the library lives -- not a prefix. For example, it's expected to be <foo>/lib (or <foo>/lib64), not <foo>.

Your patch passes <foo>, which will try to link against -lhwloc by adding -L<foo>, which will fail.

When you pass an empty value to the libdir argument, the macro will try the base dir argument suffixed with both lib and lib64. I.e., it'll first try -L<foo>/lib -lhwloc, and if that doesn't work, it'll try -L<foo>/lib64 -lhwloc.

This is why @eschnett is seeing -L<foo>/lib64 in his config.log -- because it tried that second. What I'd like to see is the reason why the -L<foo>/lib test failed.
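To make the search order described above concrete, here is a rough shell sketch. It is not the macro itself: per the description, the real check runs a link test for each candidate directory, whereas this sketch only looks for a `libhwloc.*` file. The function name `find_hwloc_libdir` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: try <prefix>/lib first, then <prefix>/lib64, and print the
# first directory that contains a libhwloc.* file. Returns 1 if neither does.
find_hwloc_libdir() {
    prefix=$1
    for sub in lib lib64; do
        for f in "$prefix/$sub"/libhwloc.*; do
            # An unmatched glob stays literal, so -e filters it out.
            if [ -e "$f" ]; then
                printf '%s\n' "$prefix/$sub"
                return 0
            fi
        done
    done
    return 1
}
```

With an install that has only `lib`, the first candidate wins and `lib64` is never consulted, which is why a `lib64` line in config.log suggests the `lib` attempt was tried and rejected first.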

@rhc54
Contributor

rhc54 commented Mar 14, 2015

I didn't realize Github did that - sorry. I'll try to remember they do in the future.

I hear what you say, but that isn't what I see in the code. This is what is in OPAL_CHECK_PACKAGE:

           AS_IF([test "$opal_check_package_lib_happy" = "no"],
               [AS_IF([test "$opal_check_package_libdir" != ""],
                    [$1_LDFLAGS="$$1_LDFLAGS -L$opal_check_package_libdir/lib"
                     LDFLAGS="$LDFLAGS -L$opal_check_package_libdir/lib"
                     AC_VERBOSE([looking for library in lib])
                     AC_SEARCH_LIBS([$3], [$2],
                               [opal_check_package_lib_happy="yes"],
                               [opal_check_package_lib_happy="no"], [$4])
                     AS_IF([test "$opal_check_package_lib_happy" = "no"],
                         [ # no go on the as is..  see what happens later...
                          LDFLAGS="$opal_check_package_$1_save_LDFLAGS"
                          $1_LDFLAGS="$opal_check_package_$1_orig_LDFLAGS"
                          unset opal_Lib])])])

Note that it only checks the lib subdirectory if the provided libdir is not empty. So an empty libdir doesn't go down that path.

Is this a bug in the check_package code?

@eschnett
Author

I find that things are working fine if I use gcc instead of clang. With gcc, I was not able to reproduce the problem at all. I am still using clang to build hwloc -- I only switch to gcc for OpenMPI.

At the same time, even when I manage to configure hwloc for OpenMPI correctly (using both --with-hwloc-libdir, and setting both -L and -Wl,-rpath, via LDFLAGS), I cannot build OpenMPI with clang. I receive errors of the kind

duplicate symbol ___sputc in:

I found other places on the internet that describe this OpenMPI/clang incompatibility. I assume that this is some kind of misunderstanding regarding the meaning of static and/or inline as interpreted via the C89, C99, and C++ standards and certain GNU extensions.

Given that OpenMPI has trouble building code with clang, I assume that this may also trigger configuration errors.

I have nevertheless created a gist with the output config.log at https://gist.github.com/eschnett/617370edda0365c81238.

@jsquyres
Member

Sorry for the delay in replying; I traveled / presented at a workshop this week, which always makes me behind in actual work...

@rhc54 No, I think opal_check_package (ompi_check_package in v1.8) is ok. The hwloc/external/configure.m4 will pass an empty libdir value if --with-hwloc-libdir was not specified (and in this case, it wasn't). Hence, opal(ompi)_check_package will check both <foo>/lib and <foo>/lib64.

@eschnett Thanks for the gist. According to that config.log, it looks like you configured it with:

  $ /Users/eschnett/src/cc/FunHPC.cxx/external/openmpi-1.8.4/configure --prefix=/Users/eschnett/src/cc/FunHPC.cxx/openmpi-1.8.4 --with-hwloc=/Users/eschnett/src/cc/FunHPC.cxx/hwloc-1.10.1 --disable-shared CC=clang CXX=clang++ CFLAGS=-march=native -Wall -g -O3 CXXFLAGS=-march=native -Wall -g -O3

and that the hwloc external component was able to successfully configure. I.e., it found the libhwloc library in /Users/eschnett/src/cc/FunHPC.cxx/hwloc-1.10.1/lib:

configure:76804: result: looking for library in lib
configure:76806: checking for hwloc_topology_init in -lhwloc
configure:76831: clang -o conftest -DNDEBUG -march=native -Wall -g -O3 -finline-functions -fno-strict-aliasing    -I/Users/eschnett/src/cc/FunHPC.cxx/hwloc-1.10.1/include  -Wl,-flat_namespace  -L/Users/eschnett/src/cc/FunHPC.cxx/hwloc-1.10.1/lib conftest.c -lhwloc    >&5
clang: warning: optimization flag '-finline-functions' is not supported
configure:76831: $? = 0
configure:76840: result: yes

and it ultimately concludes that it can build the hwloc external component (which is the chunk of code that builds against an external hwloc, not the embedded hwloc):

configure:77359: checking if MCA component hwloc:external can compile
configure:77361: result: yes

Indeed, it looks like this config.log is from a successful run of configure.

Can you send the config.log from a failed configure?

As for building issues with clang, other than the annoying warnings about -finline-functions (which I really should fix...), I'm unaware of clang build issues. Indeed, I build Open MPI with clang in my nightly regression tests -- can you provide some pointers to internetage with details of problems that people are having? Perhaps there are some platforms / compiler combinations that we're not testing that have a problem...?

@jsquyres
Member

FWIW: @rhc54 and I just talked through the opal(ompi)_check_package.m4 code on the phone, and we convinced ourselves that it appears to be correct. So I think that's a red herring in this issue -- let's see what the config.log from a failed configure shows, and go from there. Thanks!

@eschnett
Author

I have created a new gist with the contents of config.log as well as the captured output of the complete build: https://gist.github.com/eschnett/6d0855fa5a0413e156a8. This is with clang, and shows the errors I receive in the end ("duplicate symbol").

This is OS X:

$ uname -a
Darwin Redshift 14.1.0 Darwin Kernel Version 14.1.0: Thu Feb 26 19:26:47 PST 2015; root:xnu-2782.10.73~1/RELEASE_X86_64 x86_64

and clang installed via MacPorts (that's not Apple's system clang):

$ clang --version
clang version 3.6.0 (tags/RELEASE_360/final)
Target: x86_64-apple-darwin14.1.0
Thread model: posix

@eschnett
Author

I have now identified a similar issue when building OpenMPI on Linux with gcc. This is a self-built gcc:

[eschnett@shelob1 FunHPC.cxx]$ /project/eschnett/shelob/gcc-4.9.2/bin/gcc --version
gcc (GCC) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Make screen output and OpenMPI configure log file are at https://gist.github.com/eschnett/0cc565ce3768af4e9119. Configuring fails when configuring libevent; libevent's configure output is at https://gist.github.com/eschnett/75316b734bdee0a8b08e.

@eschnett
Author

Regarding the Linux/gcc issue: If I configure OpenMPI like this (taken from config.log):

  $ /project/eschnett/shelob/src/cc/FunHPC.cxx/external/openmpi-1.8.4/configure --prefix=/project/eschnett/shelob/src/cc/FunHPC.cxx/openmpi-1.8.4 --with-hwloc=/project/eschnett/shelob/src/cc/FunHPC.cxx/hwloc-1.10.1 CC=/project/eschnett/shelob/gcc-4.9.2/bin/gcc CXX=/project/eschnett/shelob/gcc-4.9.2/bin/g++ CFLAGS=-m128bit-long-double -march=native -Wall -g -O3 CXXFLAGS=-m128bit-long-double -march=native -Wall -g -O3 LDFLAGS=-L/project/eschnett/shelob/src/cc/FunHPC.cxx/hwloc-1.10.1/lib -Wl,-rpath,/project/eschnett/shelob/src/cc/FunHPC.cxx/hwloc-1.10.1/lib

then it configures and builds fine. The difference is that I explicitly pass LDFLAGS.
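For readers who want to reuse this, the LDFLAGS value can be sketched as a small helper. The name `hwloc_ldflags` and the `/path/to/hwloc` placeholder are assumptions for illustration; the two flags themselves (`-L` and `-Wl,-rpath,`) are the ones in the command above.

```shell
#!/bin/sh
# Hypothetical helper: build LDFLAGS that both link against <prefix>/lib
# and record that directory as a run-time search path.
hwloc_ldflags() {
    printf '%s\n' "-L$1/lib -Wl,-rpath,$1/lib"
}

# Placeholder prefix; substitute the real hwloc install location.
HWLOC_PREFIX=/path/to/hwloc

# Show the resulting command line rather than running configure.
echo "./configure --with-hwloc=$HWLOC_PREFIX \"LDFLAGS=$(hwloc_ldflags "$HWLOC_PREFIX")\""
```

The `-L` part lets the link step find the library; the rpath part is what lets the resulting test programs find it again when they are run.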

@jsquyres
Member

This is great information; thank you.

I may be swamped and unable to look at this until Monday, but I'll definitely get to it.


@rhc54
Contributor

rhc54 commented Apr 3, 2015

@jsquyres did you ever resolve this? Is it something that needs to be done before releasing 1.8.5?

@rhc54 rhc54 added the bug label Apr 3, 2015
@rhc54 rhc54 added this to the Open MPI 1.8.5 milestone Apr 3, 2015
@jsquyres
Member

(I know, I'm dreadfully late in following up on this issue... :-( )

I finally spent some quality time with this issue today.

I believe that part of what has been confusing me is that there are (at least?) two distinct issues being reported on this issue:

  1. duplicate symbol ___sputc output when building with clang on OS X
  2. libevent failure to configure when using an external hwloc

First issue: duplicate __sputc symbol

I was finally able to duplicate the first issue. I'm able to replicate at the v1.8 branch head with a ports-installed clang 3.7 on the latest stable Yosemite (10.10.3) with the XCode clang (i.e., no need for a ports-installed clang). The key is the CFLAGS:

$ ./configure --prefix=/Users/jsquyres/bogus 'CFLAGS=-march=native -Wall -g -O3'

(note, also, the lack of a need for an external hwloc to replicate the issue)

What's sketchy is that it turns out that even -Wall is necessary. Specifically, 'CFLAGS=-march=native -g -O3' is not sufficient to make the __sputc duplicate symbol issue appear. That's pretty shady. I'm not ruling out the possibility of an OMPI bug here (e.g., we're missing some compiler/linker flag?), but a bug requiring the presence of -Wall is... weird.

Now that I'm able to replicate, I'm digging deeper...

Second issue: libevent configure fails with external hwloc

I finally asked myself the Right question today: "why on earth does libevent's configure care anything about hwloc?"

Specifically: libevent doesn't depend on hwloc, so libevent's configure should be orthogonal to however we're configuring/using hwloc.

But.

When we have an external hwloc, we add -lhwloc to $LIBS. And since $LIBS is in the environment, it's effectively passed to libevent's configure script. The test that fails in the libevent configure script is just checking basic compiler functionality:

configure:3734: /project/eschnett/shelob/gcc-4.9.2/bin/gcc -o conftest -m128bit-long-double -march=native -Wall -g -O3 -I/project/eschnett/shelob/src/cc/FunHPC.cxx/external/openmpi-1.8.4 -I/project/eschnett/shelob/src/cc/FunHPC.cxx/external/openmpi-1.8.4-build -I/project/eschnett/shelob/src/cc/FunHPC.cxx/external/openmpi-1.8.4/opal/include    -I/project/eschnett/shelob/src/cc/FunHPC.cxx/hwloc-1.10.1/include -I/project/eschnett/shelob/src/cc/FunHPC.cxx/external/openmpi-1.8.4-build/opal/mca/hwloc/external/hwloc/include    -L/project/eschnett/shelob/src/cc/FunHPC.cxx/hwloc-1.10.1/lib conftest.c -lm -lutil   -lhwloc  >&5
configure:3738: $? = 0
configure:3745: ./conftest
./conftest: error while loading shared libraries: libhwloc.so.5: cannot open shared object file: No such file or directory
configure:3749: $? = 127
configure:3756: error: in `/project/eschnett/shelob/src/cc/FunHPC.cxx/external/openmpi-1.8.4-build/opal/mca/event/libevent2021/libevent':
configure:3758: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details

You can see that the test program fails to run because it can't find libhwloc.so.5. Notice that even though the test program doesn't invoke any hwloc functionality, it was compiled with -lhwloc -- hence, the problem.

Notice that this will only happen when hwloc is installed in a path that is not searched by the run-time linker (e.g., it's not in your $LD_LIBRARY_PATH).

I have to think about this a little to develop a correct fix. In the meantime, @eschnett's LDFLAGS-on-the-configure-command-line workaround is a good one. Adding the hwloc library path to LD_LIBRARY_PATH would likely also work.
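A sketch of the LD_LIBRARY_PATH variant, assuming a POSIX shell. The helper `prepend_dir` is a made-up name and `/path/to/hwloc` is a placeholder; nothing here is part of Open MPI.

```shell
#!/bin/sh
# Hypothetical helper: print a colon-separated search path with directory $1
# placed in front of the existing list $2 (which may be empty).
prepend_dir() {
    if [ -n "$2" ]; then
        printf '%s\n' "$1:$2"
    else
        # Avoid a trailing colon: an empty entry means "current directory".
        printf '%s\n' "$1"
    fi
}

# Placeholder path; substitute the real hwloc lib directory.
LD_LIBRARY_PATH=$(prepend_dir /path/to/hwloc/lib "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```

Run this in the same shell before invoking configure so that the test programs configure builds can load libhwloc when they are executed.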

@eschnett
Author

Regarding -Wall: This sounds a bit as if configure was looking at compiler output, either directly, or indirectly by using -Werror that then triggers because of additional warnings.

@eschnett
Author

Regarding -lhwloc: You could check whether building a program and then running it works. If it doesn't, then that's an error that should be caught before libevent catches it.

Alternatively, you could add the respective -Wl,-rpath flag to LDFLAGS, so that hwloc will be found at run time.

I circumvent this issue now by setting LD_LIBRARY_PATH (and DYLD_LIBRARY_PATH, since I'm using OS X).

@jsquyres
Member

Regarding the duplicate __sputc issue: well, this was tangled.

The root cause of the issue is that AC_C_INLINE somehow determines that the compiler does not support the inline keyword, and therefore plops this into opal_config.h:

#define inline

...which causes all kinds of Badness, the most obvious of which is that it screws up the actual inlining of /usr/include/stdio.h's __sputc symbol on OS X.

Terrible.
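If you want to check whether your own build tree is affected, something like the following may help. It is a sketch, not an official diagnostic: the function name `defines_inline_away` is invented here, and you need to point it at the opal_config.h that configure generated for your build.

```shell
#!/bin/sh
# Return success if the header named by $1 contains a line that defines
# `inline` to nothing, i.e. "#define inline" with no replacement text.
defines_inline_away() {
    grep -Eq '^[[:space:]]*#[[:space:]]*define[[:space:]]+inline[[:space:]]*$' "$1"
}
```

For example, `defines_inline_away opal_config.h && echo "inline is being defined away"`. A line such as `#define inline __inline__` does not match, since that still leaves a usable keyword in place.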

However, OMPI doesn't need to use the AC_C_INLINE test any more because we require a C99 compiler (which is guaranteed to support the inline keyword).

As some added bonuses:

  • we don't need to use AC_C_RESTRICT, either (because C99 is guaranteed to support the restrict keyword)
  • I trimmed some gcc 2.96-specific code (!) out of configure.ac

@jsquyres
Member

Fixed the first issue on master in 1029dea (I accidentally referred to the wrong bug number in the commit message -- curses!).

jsquyres added a commit that referenced this issue Apr 14, 2015
…igure

== Short version

Do not export special variables into the environment (e.g., LIBS,
LDFLAGS, etc.) when invoking subdir configure scripts.  This prevents
problems described in #471.

== More detail

Exporting special env variables before invoking a subdir configure
script causes problems in some cases.  E.g., in #471,
when the user configures with `--with-hwloc=/path/to/hwloc` and that
directory is *not* in a default linker search location, the
libevent subdir configuration fails.

This happens because:

1. We'll pass LIBS="-L/path/to/hwloc/lib -lhwloc" to the libevent
   configure script
2. Meaning: configure-generated executables will link successfully
3. But unless LD_LIBRARY_PATH (or some other
   tell-the-linker-where-to-find-things mechanism) includes
   /path/to/hwloc/lib, the executable can't run.

Specifically, the libevent "hey, does the compiler generate proper
executables?" check will fail, and configure will abort (because OMPI
needs libevent).

I checked the history: exporting these vars dates all the way back to
LAM/MPI.  I can't think of a reason why we need to export these
variables -- AC_CONFIG_SUBDIRs doesn't do it; subdir configure scripts
should be orthogonal from the upper-layer configure script (and its
variables).  So let's remove these export statements and see if
anything breaks.
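
The mechanism the commit message relies on is ordinary shell behaviour: a child process (such as a subdir configure script) only sees variables that were exported. A minimal sketch, assuming a POSIX shell; the names `child_libs`, `seen_without_export` and `seen_with_export` are illustrative and not taken from the Open MPI build system.

```shell
#!/bin/sh
# Report what a child process sees in LIBS ("not-inherited" if it is absent).
child_libs() {
    sh -c 'printf "%s\n" "${LIBS-not-inherited}"'
}

unset LIBS                               # start from a clean slate
LIBS="-L/path/to/hwloc/lib -lhwloc"      # plain shell variable
seen_without_export=$(child_libs)        # the child does not inherit it

export LIBS                              # now part of the environment
seen_with_export=$(child_libs)           # the child inherits it

echo "without export: $seen_without_export"
echo "with export:    $seen_with_export"
```

Dropping the export therefore keeps `-lhwloc` out of the libevent configure run, so its basic "can the compiler produce a runnable program" check no longer depends on libhwloc being loadable.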
jsquyres added a commit to jsquyres/ompi-release that referenced this issue Apr 14, 2015
(cherry picked/adapted from commit open-mpi/ompi@9ac9be1)
@jsquyres
Member

This issue is now fixed on master, and the 2nd (of 2) PRs is opened to bring the fix over to v1.8.

Yippie!

So I'm closing this issue.

bosilca pushed a commit to ICLDisco/ompi that referenced this issue May 1, 2015
bosilca pushed a commit to eddy16112/ompi that referenced this issue Jun 18, 2015
jsquyres added a commit to jsquyres/ompi that referenced this issue Nov 10, 2015
…pdate

.gitignore: add man page in CUDA extension