Improve behavior after exception in begin/end global lumi #44840

Merged May 1, 2024 (1 commit)

@wddgit wddgit commented Apr 24, 2024

PR description:

Improve the behavior of the Framework after exceptions in global begin/end lumi transitions. This is the second in a series of PRs intended to make the behavior after exceptions more consistent across all begin/end transitions. The first PR handled stream begin/end lumi exceptions (see #44624); the comments at the head of that PR describe the design being implemented here.

The intent is that the output does not change when no exceptions occur.

This work was motivated by discussions related to Issues #43831 and #42501.

This PR also adds some new exception context information for exceptions occurring in service functions related to begin/end transitions.

PR validation:

An existing unit test covering exceptions in different transitions is extended to cover the most salient cases. Many other cases were also tested manually. Existing unit tests pass.

cmsbuild commented Apr 24, 2024

cms-bot internal usage

@cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44840/40088

@cmsbuild

A new Pull Request was created by @wddgit for master.

It involves the following packages:

  • FWCore/Framework (core)
  • FWCore/Integration (core)

@cmsbuild, @Dr15Jones, @makortel, @smuzaffar can you please review it and eventually sign? Thanks.
@missirol, @makortel this is something you requested to watch as well.
@antoniovilela, @sextonkennedy, @rappoccio you are the release manager for this.

cms-bot commands are listed here

wddgit commented Apr 24, 2024

enable threading

wddgit commented Apr 24, 2024

please test

@cmsbuild

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ba46e8/39075/summary.html
COMMIT: daed9f8
CMSSW: CMSSW_14_1_X_2024-04-24-1100/el8_amd64_gcc12
Additional Tests: THREADING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/44840/39075/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
24834.78 step 2
The comparison results for these workflows could be incomplete.
This most likely means that the IB has errors in the relvals. The error does NOT come from this pull request.

Summary:

wddgit commented Apr 25, 2024

The comparison failures are all ones seen before:

  • Non reproducibility in wf 136.793 (#43293)
  • Non-reproducibility in TriggerResults in 24834.0, 24834.911, and 25034.999 (#43790)
  • Failures related to MessageLogger: 10224.0, 13034.0, 25202.0
  • Failures related to logErrorHarvester: 14234.0

@cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44840/40110

wddgit commented Apr 26, 2024

please test

I pushed a commit; combined with the responses above, it should resolve all the comments. Let me know if there are any more.

@cmsbuild

Pull request #44840 was updated. @makortel, @smuzaffar, @Dr15Jones can you please check and sign again.

@cmsbuild

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ba46e8/39125/summary.html
COMMIT: 8794d0e
CMSSW: CMSSW_14_1_X_2024-04-26-1100/el8_amd64_gcc12
Additional Tests: THREADING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/44840/39125/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
24834.78 step 2
The comparison results for these workflows could be incomplete.
This most likely means that the IB has errors in the relvals. The error does NOT come from this pull request.

Summary:

@makortel

+core

@cmsbuild

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@rappoccio

+1
