Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Temporary: abort if cutParser fails to find methods #44590

Merged
merged 1 commit into from
Apr 3, 2024

Conversation

Dr15Jones
Copy link
Contributor

PR description:

This is to help us debug a threading issue in ROOT. The abort will cause cmsRun to pause all threads and allow us to correlate what else is running when this problem occurs.

The intent is to only use this for a very short time in the IBs to help us try to debug the underlying problem. Once we get more info, this PR should be reverted.

PR validation:

Code compiles.

@Dr15Jones
Copy link
Contributor Author

assign core

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 2, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 2, 2024

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44590/39762

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 2, 2024

New categories assigned: core

@Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 2, 2024

A new Pull Request was created by @Dr15Jones for master.

It involves the following packages:

  • CommonTools/Utils (reconstruction)

@jfernan2, @smuzaffar, @cmsbuild, @makortel, @mandrenguyen, @Dr15Jones can you please review it and eventually sign? Thanks.
@missirol this is something you requested to watch as well.
@sextonkennedy, @rappoccio, @antoniovilela you are the release manager for this.

cms-bot commands are listed here

@Dr15Jones
Copy link
Contributor Author

please test

@makortel
Copy link
Contributor

makortel commented Apr 2, 2024

This is to help us debug a threading issue in ROOT

The issue is #33084

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 2, 2024

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-092000/38551/summary.html
COMMIT: 03fcaa4
CMSSW: CMSSW_14_1_X_2024-04-02-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44590/38551/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 1 errors in the following unit tests:

---> test testCommonToolsUtil had ERRORS

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 45 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3297469
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3297449
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

This is to help us debug a threading issue in ROOT. The abort
will cause cmsRun to pause all threads and allow us to correlate
what else is running when this problem occurs.
@Dr15Jones
Copy link
Contributor Author

please test

@jfernan2
Copy link
Contributor

jfernan2 commented Apr 3, 2024

+1

@Dr15Jones
Copy link
Contributor Author

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 3, 2024

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @rappoccio, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

@Dr15Jones
Copy link
Contributor Author

@cms-sw/orp-l2 please sign so we have more time to possibly trigger the problem.

@antoniovilela
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit aee8fc7 into cms-sw:master Apr 3, 2024
11 checks passed
@antoniovilela
Copy link
Contributor

In the core software meeting today this was agreed to be the best (or least bad) way forward to figure out the data race. The plan is to revert this PR before the next pre-prelease (14_1_0_pre3), and in fact I would open a PR doing so immediately after this PR is merged (but put it on hold).

Good idea.

@antoniovilela
Copy link
Contributor

@Dr15Jones @makortel
Just to confirm, the unit test we have failing in all architectures in 14_1_X is coming from this, correct?

@makortel
Copy link
Contributor

makortel commented Apr 8, 2024

@Dr15Jones @makortel Just to confirm, the unit test we have failing in all architectures in 14_1_X is coming from this, correct?

Yes, as noted in #33084 (comment)

@antoniovilela
Copy link
Contributor

@Dr15Jones @makortel Just to confirm, the unit test we have failing in all architectures in 14_1_X is coming from this, correct?

Yes, as noted in #33084 (comment)

Thanks. Ok, this seems fun..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants