Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v6-30 Backport cling fixes for CMS #14358

Merged

Conversation

vepadulano
Copy link
Member

Backports of #14287 and #14352

@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-2.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2024-01-16T13:36:46.790Z] /home/sftnight/build/workspace/root-pullrequests-build/root/core/metacling/src/TClingCallbacks.cxx:90:120: error: cannot allocate an object of abstract type ‘llvm::orc::MaterializationUnit’

@dpiparo dpiparo self-requested a review January 16, 2024 13:54
Copy link
Member

@dpiparo dpiparo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if tests pass, we can merge.

@vepadulano
Copy link
Member Author

Unfortunately the commits do not even compile with some scary error

 /home/sftnight/build/workspace/root-pullrequests-build/root/core/metacling/src/TClingCallbacks.cxx: In constructor ‘AutoloadLibraryMU::AutoloadLibraryMU(const TClingCallbacks&, const string&, const SymbolNameVector&)’:
/home/sftnight/build/workspace/root-pullrequests-build/root/core/metacling/src/TClingCallbacks.cxx:90:120: error: cannot allocate an object of abstract type ‘llvm::orc::MaterializationUnit’
   90 |       : MaterializationUnit({getSymbolFlagsMap(Symbols), nullptr}), fCallbacks(cb), fLibrary(Library), fSymbols(Symbols)

Can this be some difference between LLVM 13 and 16 that's causing it? @vgvassilev

@vgvassilev
Copy link
Member

Nice we need to re-solve this for 6.30...

AutoloadLibraryMU(const std::string &Library, const llvm::orc::SymbolNameVector &Symbols)
: MaterializationUnit(getSymbolFlagsMap(Symbols), nullptr), fLibrary(Library), fSymbols(Symbols)
AutoloadLibraryMU(const TClingCallbacks &cb, const std::string &Library, const llvm::orc::SymbolNameVector &Symbols)
: MaterializationUnit({getSymbolFlagsMap(Symbols), nullptr}), fCallbacks(cb), fLibrary(Library), fSymbols(Symbols)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
: MaterializationUnit({getSymbolFlagsMap(Symbols), nullptr}), fCallbacks(cb), fLibrary(Library), fSymbols(Symbols)
: MaterializationUnit(getSymbolFlagsMap(Symbols), nullptr), fCallbacks(cb), fLibrary(Library), fSymbols(Symbols)

How about that one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build finishes fine on my machine, I have updated the commit

@phsft-bot
Copy link
Collaborator

Build failed on windows10/default.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Errors:

  • [2024-01-16T14:06:48.405Z] C:\build\workspace\root-pullrequests-build\root\core\metacling\src\TClingCallbacks.cxx(90,29): error C2259: 'llvm::orc::MaterializationUnit': cannot instantiate abstract class [C:\build\workspace\root-pullrequests-build\build\core\metacling\src\MetaCling.vcxproj]

Copy link

github-actions bot commented Jan 16, 2024

Test Results

     9 files       9 suites   1d 11h 20m 11s ⏱️
 2 481 tests  2 479 ✅ 0 💤 2 ❌
21 361 runs  21 359 ✅ 0 💤 2 ❌

For more details on these failures, see this check.

Results for commit d4f8f6a.

♻️ This comment has been updated with latest results.

@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@smuzaffar
Copy link
Contributor

thanks for the backport, I have started CMSSW tests here cms-sw#195

@vepadulano
Copy link
Member Author

vepadulano commented Jan 16, 2024

thanks for the backport, I have started CMSSW tests here cms-sw#195

Hi @smuzaffar , thanks for your promptness! We discovered there was a small wrinkle preventing proper compilation in 6.30, I have uploaded an updated commit which should work properly literally a few minutes ago. I just wanted to make sure that the PR on CMSSW takes this latest version of the commits

EDIT: I checked on the linked CMSSW PR and it seems the right version of the commit was picked, so should be fine.

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-2.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@vepadulano
Copy link
Member Author

Ok now it compiles but we get the following error in the new test

  input_line_49:1:21: error: unknown type name 'ROOT_7459'
  int ROOT_7459 = 42; ROOT_7459++;
                      ^
  input_line_49:1:30: error: expected unqualified-id
  int ROOT_7459 = 42; ROOT_7459++;

@phsft-bot
Copy link
Collaborator

Build failed on windows10/default.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Failing tests:

@dpiparo
Copy link
Member

dpiparo commented Jan 16, 2024

I would like to point out that this is a backport of a very generic fix, which works very well with llvm16, which is not in the branch 6.30. A solution of the issue reported by CMS was already made available in December: if you want, this is a more refined version of that.

@vgvassilev
Copy link
Member

vgvassilev commented Jan 16, 2024

Ok now it compiles but we get the following error in the new test

  input_line_49:1:21: error: unknown type name 'ROOT_7459'
  int ROOT_7459 = 42; ROOT_7459++;
                      ^
  input_line_49:1:30: error: expected unqualified-id
  int ROOT_7459 = 42; ROOT_7459++;

Somehow the parsing seems broken. We should probably try using ProcessLine instead of Declare... still a mystery why this fails only for the PCH based builds...

EDIT: Parsing is broken because Declare does not support statements on the global scope but ProcessLine does. In addition it works in master because clang already supports statements on the global scope (thanks to clang-repl)...

@vepadulano
Copy link
Member Author

We should probably try using ProcessLine instead of Declare

Seems like an easy test, let me try that 👍

@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@vepadulano
Copy link
Member Author

At least on my machine substituting Declare with ProcessLine fixed the test failure, let's see what the CI has to say

@vgvassilev
Copy link
Member

At least on my machine substituting Declare with ProcessLine fixed the test failure, let's see what the CI has to say

We need to land that change in the master for consistency reasons.

@vepadulano
Copy link
Member Author

We need to land that change in the master for consistency reasons.

Good point! I will then separate this change from the original commit and create a separate commit that we can forward-port to master

The llvm9 JIT issued callbacks when a symbol was missing and we reacted on it
by loading the relevant library. In root-project/root@9b2041e3 we have kept the
logic but now the JIT started querying more often even for symbols which are
okay to be missing. In turn that leads to scanning all libraries causing
performance issues.

This patch tries to limit this functionality only in contexts where automatic
loading is allowed.
vgvassilev and others added 5 commits January 16, 2024 21:49
Co-authored-by: Jonas Hahnfeld <jonas.hahnfeld@cern.ch>
Symbol lookup is a quite expensive operation and might result in JIT
compilation and library loading.

Co-authored-by: Jonas Hahnfeld <jonas.hahnfeld@cern.ch>
This allows us to avoid generating symbols in libCore for these constants
keeping the same amount of open calls at ROOT startup time.
`TInterpreter::Declare` does not support issuing statements on the global scope, while `TInterpreter::ProcessLine` does. This test was first introduced in a development version of ROOT (6.31), where the upgrade to LLVM16 was already in place. In that scenario, clang supports global scope statements (thanks to changes made for clang-repl). Applying the test to ROOT 6.30 uncovered the problem, since the clang of LLVM13 does not support global scope statements. Fix the test by using `ProcessLine` instead of `Declare` which does the intended thing independently from the ROOT version.

Co-authored-by: Vassil Vassilev <vasil.georgiev.vasilev@cern.ch>
@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@vepadulano
Copy link
Member Author

@vgvassilev The test now passes on all platforms. Let me know what you think about the commit and if we can merge this PR

Copy link
Member

@vgvassilev vgvassilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@vepadulano vepadulano merged commit be7addb into root-project:v6-30-00-patches Jan 17, 2024
10 of 15 checks passed
@dpiparo
Copy link
Member

dpiparo commented Jan 17, 2024

Dear @smuzaffar , also thanks to your help, we converged. This PR merged the changes that for us are an improvement wrt the fix provided in December 2023. This the commit we would propose to CMS for the CMSSW 14 release: would you be so kind to give it another try to verify once again the code works as expected for CMS?

@smuzaffar
Copy link
Contributor

@dpiparo , thanks for the backport. CMSSW tests via cms-sw#195 for this change look good. I just have opened cms-sw/cmsdist#8949 to integrate latest v6-30-00-patches changes in to CMSSW 14.0.X

@pcanal pcanal changed the title Backport cling fixes for CMS v6-30 Backport cling fixes for CMS Jan 17, 2024
@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-3.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-performance-centos8-multicore/soversion.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Failing tests:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants