ECAL skip GPU unpacking of the rest of the block if a bad block is detected - 130x #42395

Conversation

thomreis
Contributor

PR description:

Skip the GPU unpacking of the rest of the block if a bad block is detected in one thread. This matches the behaviour of the CPU unpacker.
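
For readers unfamiliar with the pattern, here is a minimal host-side C++ sketch of the idea described above. It is not the code changed in this PR: the `Channel` struct, the `unpackBlock` function, and the two-phase check-then-write layout are all hypothetical, and the GPU threads are emulated by a plain strided loop. It only illustrates how a flag shared by all threads of a block can make them stop once any one of them sees bad data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one unit of raw data handled by one GPU thread.
struct Channel {
  std::uint32_t id;
  bool corrupt;  // true if the integrity check on this unit fails
};

// Emulates on the host what the threads of one block might do: they walk the
// data in rounds of nThreads units, and a flag shared by the whole block
// stops all of them as soon as any thread has seen bad data.
std::vector<std::uint32_t> unpackBlock(const std::vector<Channel>& block, unsigned nThreads) {
  assert(nThreads > 0);
  std::vector<std::uint32_t> unpacked;
  bool badBlock = false;  // assumption: this would be a shared flag in a GPU kernel

  for (std::size_t base = 0; base < block.size(); base += nThreads) {
    std::size_t end = base + nThreads;
    if (end > block.size()) {
      end = block.size();
    }

    // Phase 1: each emulated thread checks its own unit and raises the flag on failure.
    for (std::size_t i = base; i < end; ++i) {
      if (block[i].corrupt) {
        badBlock = true;
      }
    }

    // A real kernel would need a barrier here so that every thread sees the flag.
    if (badBlock) {
      break;  // skip the rest of the block, including the current round
    }

    // Phase 2: no thread reported a problem, so this round is unpacked.
    for (std::size_t i = base; i < end; ++i) {
      unpacked.push_back(block[i].id);
    }
  }
  return unpacked;
}
```

In this sketch, data unpacked before the bad unit is kept and nothing from the round in which the problem is found is written; whether the actual unpacker makes the same choice is not stated in this PR.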

Backport of #42301 for HLT.

PR validation:

No crashes with integrity errors were observed in runs 367771, 368547, and 368724 from #39568. Passes WF 12434.512.

@cmsbuild
Contributor

cmsbuild commented Jul 27, 2023

A new Pull Request was created by @thomreis (Thomas Reis) for CMSSW_13_0_X.

It involves the following packages:

  • EventFilter/EcalRawToDigi (reconstruction)

@cmsbuild, @mandrenguyen, @clacaputo can you please review it and eventually sign? Thanks.
@rchatter, @argiro, @Martin-Grunewald, @missirol, @thomreis, @wang0jin this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@thomreis
Contributor Author

type ecal

@cmsbuild cmsbuild added the ecal label Jul 27, 2023
@thomreis
Contributor Author

enable gpu

@thomreis
Contributor Author

backport of #42301

@thomreis
Contributor Author

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ade098/33943/summary.html
COMMIT: a95ff38
CMSSW: CMSSW_13_0_X_2023-07-27-1100/el8_amd64_gcc11
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/42395/33943/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 61 lines to the logs
  • Reco comparison results: 10 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3281270
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3281241
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 3
  • DQMHistoTests: Total histograms compared: 40086
  • DQMHistoTests: Total failures: 22
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 40064
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 2 files compared)
  • Checked 8 log files, 6 edm output root files, 3 DQM output files
  • TriggerResults: no differences found

@mandrenguyen
Contributor

I'll let @rappoccio and @perrotta comment, but normally one would make the 13_1_X backport as well.
It doesn't cost much.

@perrotta
Contributor

> I'll let @rappoccio and @perrotta comment, but normally one would make the 13_1_X backport as well. It doesn't cost much.

I agree. If we backport to an older release cycle, it is a good habit to also backport to all open cycles in between, unless there are contraindications to doing so.
That said, I wonder if a backport to the pp data-taking release 13_0_X is still really needed, now that it is clear that there will be no more pp data taking with 13_0_X in 2023. Whether to keep maintaining that release for data taking will be discussed during the next ORP: stay tuned.

@thomreis
Contributor Author

OK, I will make the 13_1_X backport as well then.
For this one I'll let you decide. If you do not want it in the end, you can close the PR.

@thomreis
Contributor Author

13_1_X backport: #42406

@clacaputo
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_13_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_13_3_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

perrotta commented Aug 1, 2023

+1

@cmsbuild cmsbuild merged commit 82f0200 into cms-sw:CMSSW_13_0_X Aug 1, 2023
@thomreis thomreis deleted the ecal-gpu-unpacker-integrity-checks-part2-130x branch August 2, 2023 07:13