Add customize to include Alpaka HCal PF Clustering at HLT #43971

Merged

waredjeb
Contributor

This PR adds customizeHLTforAlpakaParticleFlowClustering() to customizeHLTforAlpaka() to integrate the Alpaka HCAL PF Clustering at HLT.

The function defines all the needed modules and replaces the existing ones in the relevant Sequences. Currently it also runs the Alpaka CPU-serial version up to the PFClusterSoA collection, before the Legacy conversion.

To compare the GPU version against the CPU version, a follow-up PR will be needed to perform the CPU legacy conversion and the comparison in DQM.
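The delegation pattern described here can be pictured with a small stand-in sketch. This is not the code from this PR: `types.SimpleNamespace` stands in for a `cms.Process`, the `_sketch` function names and the dictionary placeholder are hypothetical, and only the module label `hltPFClusterSoAProducer` is taken from the review snippets in this thread. It only illustrates a top-level customisation calling a sub-customisation and returning the process.

```python
from types import SimpleNamespace


def customize_pf_clustering_sketch(process):
    """Hypothetical stand-in for customizeHLTforAlpakaParticleFlowClustering()."""
    # Placeholder for a real module definition; the actual function
    # attaches CMSSW module objects to the process instead of a dict.
    process.hltPFClusterSoAProducer = {"type": "placeholder"}
    return process


def customize_alpaka_sketch(process):
    """Hypothetical stand-in for customizeHLTforAlpaka()."""
    # The top-level customisation delegates to each sub-customisation in turn.
    process = customize_pf_clustering_sketch(process)
    return process


process = customize_alpaka_sketch(SimpleNamespace())
print(hasattr(process, "hltPFClusterSoAProducer"))
```

Returning the process from every function keeps the customisations chainable, which is the convention the `--customise` option of `hltGetConfiguration` relies on.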

Validation

Validated by running the following HLT configuration:

hltGetConfiguration /dev/CMSSW_14_0_0/GRun --unprescale --output all --globaltag auto:phase1_2024_realistic --mc --max-events 10 --input /store/mc/Run3Winter24Digi/TT_TuneCP5_13p6TeV_powheg-pythia8/GEN-SIM-RAW/133X_mcRun3_2024_realistic_v8-v2/80000/dc984f7f-2e54-48c4-8950-5daa848b6db9.root --customise HLTrigger/Configuration/customizeHLTforAlpaka.customizeHLTforAlpaka

It was also validated by merging #43701 and running workflow 12434.423.

FYI @jsamudio @hatakeyamak

@cmsbuild
Contributor

cmsbuild commented Feb 15, 2024

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43971/38876

@cmsbuild
Contributor

A new Pull Request was created by @waredjeb (Wahid Redjeb) for master.

It involves the following packages:

  • HLTrigger/Configuration (hlt)

@cmsbuild, @Martin-Grunewald, @mmusich can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @missirol, @silviodonato this is something you requested to watch as well.
@antoniovilela, @sextonkennedy, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@mmusich
Contributor

mmusich commented Feb 15, 2024

enable gpu

@mmusich
Contributor

mmusich commented Feb 15, 2024

test parameters:

  • workflow_opts_gpu= -w upgrade
  • workflows_gpu= 12434.423

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43971/38877

@cmsbuild
Contributor

Pull request #43971 was updated. @Martin-Grunewald, @cmsbuild, @mmusich can you please check and sign again.

@fwyzard
Contributor

fwyzard commented Feb 15, 2024

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2b6a17/37478/summary.html
COMMIT: 2b472df
CMSSW: CMSSW_14_1_X_2024-02-14-2300/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43971/37478/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 48 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 40949
  • DQMHistoTests: Total failures: 1835
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 39114
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 15 edm output root files, 4 DQM output files
  • TriggerResults: found differences in 1 / 3 workflows

@mmusich
Contributor

mmusich commented Feb 15, 2024

Some preliminary checks from hlt side:

  • customization function looks OK from visual spot-check
  • workflow 12434.423, in particular step2, runs without errors / warnings
  • private TSG check with [1] doesn't reveal issues.

@waredjeb, where is the 14.0.X backport? It was privately communicated that this needs to enter 14_0_0.


[1]

hltGetConfiguration /dev/CMSSW_14_0_0/GRun --unprescale --output all --globaltag auto:phase1_2024_realistic --mc --max-events 10 --input /store/mc/Run3Winter24Digi/TT_TuneCP5_13p6TeV_powheg-pythia8/GEN-SIM-RAW/133X_mcRun3_2024_realistic_v8-v2/80000/dc984f7f-2e54-48c4-8950-5daa848b6db9.root --customise HLTrigger/Configuration/customizeHLTforAlpaka.customizeHLTforAlpaka > & hlt.py &
cmsRun hlt.py > & hlt.log &

    src = cms.InputTag("hltPFRecHitSoAProducerHCALCPUSerial")
)

process.particleFlowRecHitHF = cms.EDProducer("PFRecHitProducer",
Contributor

Modules run at HLT start their label with lower-case hlt. Also, it actually seems this module is not further used anywhere?

Contributor Author

That's right, I am checking that everything works correctly without this module

Contributor Author

updated removing this module, thanks for spotting it!

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43971/38893

@cmsbuild
Contributor

Pull request #43971 was updated. @mmusich, @Martin-Grunewald, @cmsbuild can you please check and sign again.

@mmusich
Contributor

mmusich commented Feb 15, 2024

@cmsbuild, please test

@waredjeb
Contributor Author

waredjeb commented Feb 15, 2024

Is it possible that the tests failed with this error?

17:23:23 Traceback (most recent call last):
17:23:23   File "/cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43979/37499/CMSSW_14_0_X_2024-02-15-1100/bin/el8_amd64_gcc12/runTheMatrix.py", line 762, in <module>
17:23:23     ret = runSelected(opt)
17:23:23   File "/cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43979/37499/CMSSW_14_0_X_2024-02-15-1100/bin/el8_amd64_gcc12/runTheMatrix.py", line 31, in runSelected
17:23:23     if len(undefSet)>0: raise ValueError('Undefined workflows: '+', '.join(map(str,list(undefSet))))
17:23:23 ValueError: Undefined workflows: 12434.423

@mmusich
Contributor

mmusich commented Feb 15, 2024

Possible that the tests failed with this error?

It shows as failed here because it's the same branch as the backport, and the backport is failing for that reason.
I assume the reason is that the backport of #43701 is not yet in an IB, and thus workflow 12434.423 is undefined.
Tests should finish successfully here, as they did before.
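The failure mode can be reproduced in miniature. The sketch below is not the `runTheMatrix.py` source; it only mirrors the check visible in the traceback, where any requested workflow missing from the known set raises a `ValueError`. The function name `check_workflows` and the contents of the defined list are hypothetical.

```python
def check_workflows(requested, defined):
    """Return the requested workflows, or raise if any of them is not defined."""
    undefined = sorted(set(requested) - set(defined))
    if undefined:
        raise ValueError('Undefined workflows: ' + ', '.join(map(str, undefined)))
    return list(requested)


# Illustrative workflow IDs: a release area that does not yet contain
# the workflow introduced by #43701.
defined_without_backport = ['12434.0']

try:
    check_workflows(['12434.423'], defined_without_backport)
    message = 'all workflows defined'
except ValueError as err:
    message = str(err)

print(message)  # Undefined workflows: 12434.423
```

Once the defining PR is part of the release area, the same request passes the check unchanged.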

@waredjeb
Contributor Author

Possible that the tests failed with this error?

it shows as failed here because it's the same branch as the backport and it's failing because of that reason. I assume the reason is that the backport of #43701 is not yet in an IB and thus workflow 12434.423 is undefined. Tests should finish successfully here as they did before.

I see, thanks for the clarification

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2b6a17/37498/summary.html
COMMIT: 28e9220
CMSSW: CMSSW_14_1_X_2024-02-15-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43971/37498/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 11 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 40949
  • DQMHistoTests: Total failures: 506
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 40443
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 15 edm output root files, 4 DQM output files
  • TriggerResults: found differences in 1 / 3 workflows
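As a quick sanity check on the two GPU comparison summaries in this thread, the counts can be cross-checked with a few lines of Python. The numbers are copied from the summaries for commits 2b472df and 28e9220; the helper name `failure_fraction` is hypothetical, and what counts as a "failure" is whatever the comparison tool reports. The fractions come out at roughly 4.5% for the first test and 1.2% for the second.

```python
TOTAL_HISTOGRAMS = 40949

# Failure and success counts as reported by the two GPU comparison summaries.
runs = {
    "2b472df": {"failures": 1835, "successes": 39114},
    "28e9220": {"failures": 506, "successes": 40443},
}


def failure_fraction(counts, total=TOTAL_HISTOGRAMS):
    """Fraction of compared histograms reported as failures."""
    # Nulls, skipped and missing objects are all 0 in both summaries,
    # so failures plus successes must add up to the total.
    if counts["failures"] + counts["successes"] != total:
        raise ValueError("counts do not add up to the total")
    return counts["failures"] / total


for commit, counts in runs.items():
    print(f"{commit}: {failure_fraction(counts):.2%} of histograms differ")
```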

@Martin-Grunewald
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@antoniovilela
Contributor

+1

@cmsbuild cmsbuild merged commit 82470f5 into cms-sw:master Feb 16, 2024
29 checks passed
Contributor

@missirol missirol left a comment

A couple of late questions.

Comment on lines +37 to +54
process.hltPFRecHitHCALParamsESProducer = cms.ESProducer('PFRecHitHCALParamsESProducer@alpaka',
    energyThresholdsHB = cms.vdouble(
        0.1,
        0.2,
        0.3,
        0.3
    ),
    energyThresholdsHE = cms.vdouble(
        0.1,
        0.2,
        0.2,
        0.2,
        0.2,
        0.2,
        0.2
    ),
    appendToDataLabel = cms.string(''),
)
Contributor

I thought these numbers should now be taken from the GT, so is this ESProducer still needed? (I'm not thinking about the customisation per se, but about what we will actually integrate in the HLT menu.)

Contributor

@hatakeyamak hatakeyamak Feb 18, 2024

My understanding is that, yes, these numbers are now taken from the GT, but we still need to provide some "default" numbers for the code to function (somewhat historic). Perhaps we can check whether the code can be designed differently.

Contributor

When adding DB thresholds, my understanding was that the code should keep working with hardcoded parameters if needed. An example of the current implementation starts here:

If there is a need to change the functionality to DB-only, I can look into it.

Contributor

Thank you for the reminder. It's probably wise to keep the functionality to use numbers via config.
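The behaviour discussed in this thread (database values take precedence, configuration values remain as a fallback) can be sketched as follows. This is only an illustration: the actual CMSSW implementation is not shown in this thread and may differ, the function name `select_thresholds` is hypothetical, and the "database" values in the usage example are made up. The default lists are copied from the ESProducer snippet quoted above.

```python
# Defaults copied from the hltPFRecHitHCALParamsESProducer snippet in this review.
CONFIG_THRESHOLDS_HB = [0.1, 0.2, 0.3, 0.3]
CONFIG_THRESHOLDS_HE = [0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]


def select_thresholds(config_values, db_values=None):
    """Prefer database-provided thresholds; otherwise use the configured ones."""
    if db_values is not None:
        return list(db_values)
    return list(config_values)


# No database payload available: the hardcoded configuration is used.
hb_from_config = select_thresholds(CONFIG_THRESHOLDS_HB)

# Made-up database payload: it overrides the configuration.
hb_from_db = select_thresholds(CONFIG_THRESHOLDS_HB, db_values=[0.4, 0.3, 0.3, 0.3])

print(hb_from_config, hb_from_db)
```

Keeping the configuration path means the module can still be run and tested without a matching database payload, which is the point made in the replies above.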


process.hltLegacyPFClusterProducer = cms.EDProducer("LegacyPFClusterProducer",
    src = cms.InputTag("hltPFClusterSoAProducer"),
    pfClusterParams = cms.ESInputTag("pfClusterParamsESProducer:"),
Contributor

Should this be hltPFClusterParamsESProducer: instead? (I guess it does not matter, since the tests passed? From a quick look, it seems that pfClusterParams is not really used inside LegacyPFClusterProducer.)

Contributor Author

@waredjeb waredjeb Feb 19, 2024

Right, I guess we can directly drop this parameter from LegacyPFClusterProducer
