Fixes for Alpaka Phase2 Pixel Reco #44874

Merged (1 commit) on May 9, 2024

AdrianoDee

PR description:

A few fixes for the Alpaka chain for Phase 2 pixel reconstruction:

  1. keep clusterBinning below 32*32 (warpSize*warpSize for CUDA);
  2. (minor, and also applies to Phase 1) properly size the SiPixelDigisHost buffer in the CopyToHost method;
  3. remove all the error-related parts from the Phase 2 clusterizer (errors are not defined for Phase 2 at the moment);
  4. fix ALPAKA_ASSERT_ACC(TrackerTraits::numberOfModules < 2048).
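
Items 1 and 4 are both upper bounds on compile-time constants, so a guard of roughly the following shape could catch a violation at build time. This is only a sketch in plain C++, not the CMSSW code: ExampleTraits, kWarpSize, kMaxModules and fitsKernelLimits are hypothetical names, and the two values inside ExampleTraits are made up for illustration. Only the bounds 32*32 and 2048 come from the description above.

```cpp
#include <cstdint>

// Hypothetical stand-in for a tracker-traits struct; both values are illustrative only.
struct ExampleTraits {
  static constexpr std::uint32_t clusterBinning = 1000;   // made-up value
  static constexpr std::uint32_t numberOfModules = 1856;  // made-up value
};

constexpr std::uint32_t kWarpSize = 32;      // CUDA warp size, as quoted in item 1
constexpr std::uint32_t kMaxModules = 2048;  // bound quoted in the assert of item 4

// True when both constants respect the limits named in the PR description.
template <typename Traits>
constexpr bool fitsKernelLimits() {
  return Traits::clusterBinning < kWarpSize * kWarpSize && Traits::numberOfModules < kMaxModules;
}

// Fails the build instead of failing at run time on the device.
static_assert(fitsKernelLimits<ExampleTraits>(), "traits exceed the limits assumed by the kernels");
```

A compile-time check like this complements, rather than replaces, a device-side assert such as ALPAKA_ASSERT_ACC, which guards values that are only known at run time.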

PR validation:

Ran the *.402 workflows for Phase 2 conditions (e.g. 25034 TTbar D98 with PU).

@cmsbuild

cmsbuild commented Apr 30, 2024

cms-bot internal usage

@AdrianoDee

type bug-fix

@cmsbuild

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44874/40132

  • This PR adds an extra 48KB to repository

  • Found files with invalid states:

    • DataFormats/SiPixelDigiSoA/src/classes_def.xml.generated:

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44874/40133

  • This PR adds an extra 40KB to repository

@cmsbuild

A new Pull Request was created by @AdrianoDee for master.

It involves the following packages:

  • DataFormats/SiPixelDigiSoA (reconstruction, heterogeneous)
  • Geometry/CommonTopologies (geometry)
  • RecoLocalTracker/SiPixelClusterizer (reconstruction)

@mandrenguyen, @bsunanda, @fwyzard, @mdhildreth, @cmsbuild, @civanch, @Dr15Jones, @jfernan2, @makortel can you please review it and eventually sign? Thanks.
@felicepantaleo, @bsunanda, @gpetruc, @GiacomoSguazzoni, @mtosi, @tvami, @missirol, @dkotlins, @mroguljic, @ferencek, @VourMa, @JanFSchulte, @VinInn, @fabiocos, @threus, @tsusa, @rovere, @mmusich this is something you requested to watch as well.
@rappoccio, @antoniovilela, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@AdrianoDee

enable gpu

@AdrianoDee

test parameters:

  • enable = gpu
  • workflows = 29834.402, 29834.403, 29661.402, 29661.403
  • workflows_gpu = 29834.402, 29834.403, 29834.404, 29661.402, 29661.403
  • workflow_opts = -w upgrade
  • workflow_opts_gpu = -w upgrade

@AdrianoDee

Testing with TTbar and NuGun.

@AdrianoDee

please test

@@ -111,9 +125,18 @@ namespace pixelClustering {
       for (auto i : cms::alpakatools::independent_group_elements(acc, nclus)) {
         newclusId[i] = ok[i] = (charge[i] >= chargeCut) ? 1 : 0;
         if (0 == ok[i])
-          good = false;
+          good &= false;  //better than simple assignment in case of race?

Contributor

No, I think that a simple assignment is better, because it does not need to read the old value.

Contributor Author

Understood, reverting.

Contributor

About the "in case of race": good is a local variable, so there should be no data races on it.
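
The two points made in this thread can be illustrated with a small sequential sketch. It is plain C++ rather than alpaka code, and allPassAssign, allPassAndAssign and kCharges are hypothetical names with made-up sample values. It only shows that the two forms give the same result, that the plain assignment does not read the previous value, and that the flag lives in a local variable.

```cpp
#include <cstddef>
#include <cstdint>

// Plain store: `good` is overwritten without reading its previous value.
constexpr bool allPassAssign(const std::int32_t* charge, std::size_t n, std::int32_t chargeCut) {
  bool good = true;  // local to this call, so nothing else can race on it
  for (std::size_t i = 0; i < n; ++i) {
    if (charge[i] < chargeCut)
      good = false;
  }
  return good;
}

// Read-modify-write: `good &= false` loads `good`, ANDs it with false, then stores it.
constexpr bool allPassAndAssign(const std::int32_t* charge, std::size_t n, std::int32_t chargeCut) {
  bool good = true;
  for (std::size_t i = 0; i < n; ++i) {
    if (charge[i] < chargeCut)
      good &= false;
  }
  return good;
}

// Made-up sample charges, for illustration only.
constexpr std::int32_t kCharges[] = {12, 4, 9};

// Both forms agree whether or not an element falls below the cut.
static_assert(allPassAssign(kCharges, 3, 5) == allPassAndAssign(kCharges, 3, 5), "one element below the cut");
static_assert(allPassAssign(kCharges, 3, 4) == allPassAndAssign(kCharges, 3, 4), "no element below the cut");
```

Since the outcome is identical, the plain assignment is the simpler choice, which matches the decision to revert.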

@cmsbuild

cmsbuild commented May 6, 2024

Pull request #44874 was updated. @mdhildreth, @bsunanda, @civanch, @Dr15Jones, @fwyzard, @cmsbuild, @mandrenguyen, @jfernan2, @makortel can you please check and sign again.

@AdrianoDee

please test

@fwyzard

fwyzard commented May 6, 2024

+heterogeneous

@cmsbuild

cmsbuild commented May 6, 2024

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-045e02/39253/summary.html
COMMIT: d3ab5ba
CMSSW: CMSSW_14_1_X_2024-05-06-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/44874/39253/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 2 lines from the logs
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3364549
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3364522
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 210 log files, 175 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 46 differences found in the comparisons
  • DQMHistoTests: Total files compared: 5
  • DQMHistoTests: Total histograms compared: 71813
  • DQMHistoTests: Total failures: 3629
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 68184
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 4 files compared)
  • Checked 19 log files, 22 edm output root files, 5 DQM output files
  • TriggerResults: no differences found

@fwyzard

fwyzard commented May 6, 2024

@AdrianoDee could you open a PR to backport the changes to 14.0.x ?

@civanch

civanch commented May 6, 2024

+1

@AdrianoDee

> @AdrianoDee could you open a PR to backport the changes to 14.0.x ?

Opened here: #44915.

@fwyzard

fwyzard commented May 8, 2024

@cms-sw/reconstruction-l2 any comments from your side, to this PR or its 14.0.x backport (#44915) ?

@mandrenguyen

+1

@cmsbuild

cmsbuild commented May 9, 2024

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @rappoccio, @sextonkennedy (and backports should be raised in the release meeting by the corresponding L2)

@antoniovilela

+1

@cmsbuild merged commit b54f95f into cms-sw:master on May 9, 2024 (15 checks passed).
@AdrianoDee deleted the phase2_alpaka_fixes branch on May 14, 2024 05:48.