Make repack process be able to generate new data tiers: L1SCOUT, HLTSCOUT #44381

Merged
merged 2 commits into from
Mar 15, 2024

Conversation

haozturk
Contributor

PR description:

This PR updates the REPACK process such that it is able to generate two new data tiers: HLTSCOUT and L1SCOUT in addition to RAW. I've updated the event content of HLTSCOUT and L1SCOUT such that they inherit their output commands from HLTriggerRAW and L1TriggerRAW respectively. Additionally, I've included the output commands of L1SCOUT and HLTSCOUT in the output module of the Repack process.

I've coordinated with @drkovalskyi @germanfgv @dynamic-entropy on this work.

Related tickets:

PR validation:

Note that I made these changes using "CMSSW_14_0_1" as a base and tested them on the same release. The original branch is in this PR. I wanted to rebase my changes onto upstream master and re-test; however, the rebase produced conflicts in files that I haven't changed and that I don't know how to resolve. The tests shown below were performed on the version that the backport PR uses as its source. If anybody can show me how to test this version, I'm happy to do so. You can find the details of the testing I've performed in the backport PR below:

I used Configuration/DataProcessing/test/RunRepack.py for testing. I added a new option, --data-tier, to this script:


$ python3 DataProcessing/test/RunRepack.py --lfn /store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat --data-tier RAW
Now do:
cmsRun -e RunRepackCfg.py
[haozturk@lxplus802 Configuration]$ cmsRun -e RunRepackCfg.py
08-Mar-2024 15:55:15 CET  Initiating request to open file root://eoscms.cern.ch//eos/cms/store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat
08-Mar-2024 15:55:17 CET  Successfully opened file root://eoscms.cern.ch//eos/cms/store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat
Begin processing the 1st record. Run 377350, Event 172727, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:20.650 CET
Begin processing the 2nd record. Run 377350, Event 255073, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:20.739 CET
Begin processing the 3rd record. Run 377350, Event 495959, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:20.825 CET
Begin processing the 4th record. Run 377350, Event 5487, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:20.907 CET
Begin processing the 5th record. Run 377350, Event 5498, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:20.990 CET
Begin processing the 6th record. Run 377350, Event 4244, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:21.075 CET
Begin processing the 7th record. Run 377350, Event 5482, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:21.164 CET
Begin processing the 8th record. Run 377350, Event 4232, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:21.250 CET
Begin processing the 9th record. Run 377350, Event 4252, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:21.329 CET
Begin processing the 10th record. Run 377350, Event 5499, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:21.419 CET
08-Mar-2024 15:55:21 CET  Closed file root://eoscms.cern.ch//eos/cms/store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat
[haozturk@lxplus802 Configuration]$ python3 DataProcessing/test/RunRepack.py --lfn /store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat --data-tier L1SCOUT
Now do:
cmsRun -e RunRepackCfg.py
[haozturk@lxplus802 Configuration]$ cmsRun -e RunRepackCfg.py
08-Mar-2024 15:55:38 CET  Initiating request to open file root://eoscms.cern.ch//eos/cms/store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat
08-Mar-2024 15:55:40 CET  Successfully opened file root://eoscms.cern.ch//eos/cms/store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat
Begin processing the 1st record. Run 377350, Event 172727, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:41.687 CET
Begin processing the 2nd record. Run 377350, Event 255073, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:41.760 CET
Begin processing the 3rd record. Run 377350, Event 495959, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:41.837 CET
Begin processing the 4th record. Run 377350, Event 5487, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:41.905 CET
Begin processing the 5th record. Run 377350, Event 5498, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:41.973 CET
Begin processing the 6th record. Run 377350, Event 4244, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:42.041 CET
Begin processing the 7th record. Run 377350, Event 5482, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:42.108 CET
Begin processing the 8th record. Run 377350, Event 4232, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:42.174 CET
Begin processing the 9th record. Run 377350, Event 4252, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:42.238 CET
Begin processing the 10th record. Run 377350, Event 5499, LumiSection 1 on stream 0 at 08-Mar-2024 15:55:42.305 CET
08-Mar-2024 15:55:42 CET  Closed file root://eoscms.cern.ch//eos/cms/store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat
[haozturk@lxplus802 Configuration]$ 
[haozturk@lxplus802 Configuration]$ 
[haozturk@lxplus802 Configuration]$ python3 DataProcessing/test/RunRepack.py --lfn /store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat --data-tier HLTSCOUT
Now do:
cmsRun -e RunRepackCfg.py
[haozturk@lxplus802 Configuration]$ cmsRun -e RunRepackCfg.py
08-Mar-2024 15:56:00 CET  Initiating request to open file root://eoscms.cern.ch//eos/cms/store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat
08-Mar-2024 15:56:02 CET  Successfully opened file root://eoscms.cern.ch//eos/cms/store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat
Begin processing the 1st record. Run 377350, Event 172727, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:03.414 CET
Begin processing the 2nd record. Run 377350, Event 255073, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:03.495 CET
Begin processing the 3rd record. Run 377350, Event 495959, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:03.577 CET
Begin processing the 4th record. Run 377350, Event 5487, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:03.656 CET
Begin processing the 5th record. Run 377350, Event 5498, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:03.737 CET
Begin processing the 6th record. Run 377350, Event 4244, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:03.818 CET
Begin processing the 7th record. Run 377350, Event 5482, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:03.903 CET
Begin processing the 8th record. Run 377350, Event 4232, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:03.984 CET
Begin processing the 9th record. Run 377350, Event 4252, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:04.071 CET
Begin processing the 10th record. Run 377350, Event 5499, LumiSection 1 on stream 0 at 08-Mar-2024 15:56:04.160 CET
08-Mar-2024 15:56:04 CET  Closed file root://eoscms.cern.ch//eos/cms/store/t0streamer/Data/PhysicsCommissioning/000/377/350/run377350_ls0001_streamPhysicsCommissioning_StorageManager.dat
[haozturk@lxplus802 Configuration]$ 
[haozturk@lxplus802 Configuration]$ ls -l
total 3195
drwxr-xr-x. 2 haozturk zp   4096 Mar  8 15:51 DataProcessing
drwxr-xr-x. 2 haozturk zp   4096 Mar  8 15:51 EventContent
-rw-r--r--. 1 haozturk zp  14137 Mar  8 15:56 FrameworkJobReport.xml
-rw-r--r--. 1 haozturk zp   8411 Mar  8 15:55 RunRepackCfg.py
-rw-r--r--. 1 haozturk zp 561807 Mar  8 15:56 write_PrimDS1_HLTSCOUT.root
-rw-r--r--. 1 haozturk zp 495347 Mar  8 15:55 write_PrimDS1_L1SCOUT.root
-rw-r--r--. 1 haozturk zp 562065 Mar  8 15:55 write_PrimDS1_RAW.root
-rw-r--r--. 1 haozturk zp 561807 Mar  8 15:56 write_PrimDS2_HLTSCOUT.root
-rw-r--r--. 1 haozturk zp 495347 Mar  8 15:55 write_PrimDS2_L1SCOUT.root
-rw-r--r--. 1 haozturk zp 562065 Mar  8 15:55 write_PrimDS2_RAW.root
[haozturk@lxplus802 Configuration]$ edmDumpEventContent write_PrimDS1_RAW.root 
Type                      Module                   Label     Process   
-----------------------------------------------------------------------
GlobalObjectMapRecord     "hltGtStage2ObjectMap"   ""        "HLT"     
edm::TriggerResults       "TriggerResults"         ""        "HLT"     
trigger::TriggerEvent     "hltTriggerSummaryAOD"   ""        "HLT"     
FEDRawDataCollection      "rawDataCollector"       ""        "LHC"     
[haozturk@lxplus802 Configuration]$ edmDumpEventContent write_PrimDS1_L1SCOUT.root 
Type                     Module               Label     Process   
------------------------------------------------------------------
FEDRawDataCollection     "rawDataCollector"   ""        "LHC"     
[haozturk@lxplus802 Configuration]$ edmDumpEventContent write_PrimDS1_HLTSCOUT.root 
Type                      Module                   Label     Process   
-----------------------------------------------------------------------
GlobalObjectMapRecord     "hltGtStage2ObjectMap"   ""        "HLT"     
edm::TriggerResults       "TriggerResults"         ""        "HLT"     
trigger::TriggerEvent     "hltTriggerSummaryAOD"   ""        "HLT"     
FEDRawDataCollection      "rawDataCollector"       ""        "LHC"  
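
For readers unfamiliar with the test script, the --data-tier option seen in the commands above could be handled roughly as in the sketch below. This is only an illustration: the real RunRepack.py may parse its arguments differently, and the function name parse_args, the KNOWN_TIERS tuple and the example LFN are assumptions introduced here.

```python
import argparse

# Hypothetical stand-in for the option handling in RunRepack.py.
# The three tier names come from the PR; everything else is assumed.
KNOWN_TIERS = ("RAW", "L1SCOUT", "HLTSCOUT")


def parse_args(argv):
    """Parse the two options used in the test commands."""
    parser = argparse.ArgumentParser(description="Build a repack test configuration")
    parser.add_argument("--lfn", required=True, help="streamer file to repack")
    parser.add_argument(
        "--data-tier",
        dest="data_tier",
        default="RAW",          # assumption: RAW stays the default tier
        choices=KNOWN_TIERS,    # reject tiers the repack step does not know
        help="data tier of the repacked output",
    )
    return parser.parse_args(argv)


# Example call with a placeholder LFN (not a real file).
args = parse_args(["--lfn", "/store/t0streamer/example.dat", "--data-tier", "L1SCOUT"])
print(args.lfn, args.data_tier)
```

Restricting the option to a fixed set of choices makes a mistyped tier fail at configuration time rather than at cmsRun time.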

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Backport is #44380

@cmsbuild
Contributor

cmsbuild commented Mar 12, 2024

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44381/39439

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @haozturk for master.

It involves the following packages:

  • Configuration/DataProcessing (operations)
  • Configuration/EventContent (operations)

@davidlange6, @rappoccio, @cmsbuild, @fabiocos, @antoniovilela can you please review it and eventually sign? Thanks.
@fabiocos, @Martin-Grunewald, @missirol this is something you requested to watch as well.
@sextonkennedy, @antoniovilela, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@drkovalskyi
Contributor

Could someone please review it? It's urgently needed for Tier0 to start testing for Run2024 data taking.

compressionAlgorithm=cms.untracked.string("LZMA"),
compressionLevel=cms.untracked.int32(4)
)
HLTSCOUTEventContent.outputCommands.extend(HLTriggerRAW.outputCommands)
Contributor

@mmusich mmusich Mar 13, 2024

We've been discussing internally within TSG that this could be just HLTScouting instead of HLTriggerRAW or, even better, we could add a new HLTriggerHLTSCOUT to HLTrigger_EventContent_cff.py replacing HLTScouting (same keep instructions, but adding a 'drop *_hlt*_*_*' at the beginning). This would need to be done internally by TSG and can go in as a refinement after this PR is integrated.
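
To illustrate why a leading 'drop *_hlt*_*_*' matters, here is a rough sketch of how an ordered list of keep/drop commands selects branches, with later commands overriding earlier ones. It is a plain-Python approximation, not the framework's selector: fnmatch-style matching stands in for the real wildcard rules, the initial "nothing kept" state is an assumption, and the hltScoutingExamplePacker branch and its type name are hypothetical.

```python
from fnmatch import fnmatchcase


def select_branches(branches, output_commands):
    """Apply 'keep'/'drop' commands in order; the last matching command wins."""
    kept = {name: False for name in branches}  # assumption: start with nothing kept
    for command in output_commands:
        action, pattern = command.split(None, 1)
        if action not in ("keep", "drop"):
            raise ValueError("unknown command: %r" % command)
        for name in kept:
            if fnmatchcase(name, pattern):  # case-sensitive glob match on the branch name
                kept[name] = action == "keep"
    return [name for name in branches if kept[name]]


# Branch names follow the Type_Module_Instance_Process pattern.
branches = [
    "FEDRawDataCollection_rawDataCollector__LHC",
    "edmTriggerResults_TriggerResults__HLT",
    "triggerTriggerEvent_hltTriggerSummaryAOD__HLT",
    "ExampleScoutingType_hltScoutingExamplePacker__HLT",  # hypothetical product
]

# Commands assumed to be inherited from an earlier block.
earlier = [
    "drop *",
    "keep FEDRawDataCollection_rawDataCollector_*_*",
    "keep *_hltTriggerSummaryAOD_*_*",
]
# Hypothetical scouting block: reset all hlt* products, then keep selected ones.
scouting = [
    "drop *_hlt*_*_*",
    "keep *_hltScoutingExamplePacker_*_*",
]

selected = select_branches(branches, earlier + scouting)
print(selected)
```

In this toy setup only the raw data collection and the hypothetical scouting product survive: the leading drop in the scouting block cancels the earlier keep of hltTriggerSummaryAOD.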

@mmusich
Contributor

mmusich commented Mar 13, 2024

@cmsbuild, please test

@mmusich
Contributor

mmusich commented Mar 13, 2024

urgent

  • it's urgently needed for Tier0 to start testing for Run2024 data taking.

@drkovalskyi
Contributor

@mmusich, we were thinking along these lines as well, but since it doesn't have a drop statement, we had to proceed with this proposal. Let's start with something that can be used right away and refine later.

@mmusich
Contributor

mmusich commented Mar 13, 2024

@drkovalskyi

Let's start with something that can be used right away and refine later.

agreed.

@mmusich
Contributor

mmusich commented Mar 13, 2024

tagging @cms-sw/l1-l2 in case there are comments on the L1SCOUT part.

@aloeliger
Contributor

I'm going to redirect that question to @dinyar, our scouting expert.

@aloeliger
Contributor

Otherwise, it looks fine to me.

@dinyar
Contributor

dinyar commented Mar 13, 2024

Hi @aloeliger, thanks for the tag! I'll pass it right on to @Mmiglio who is working on the CMSSW-based parts of the L1T scouting system.

@Mmiglio
Contributor

Mmiglio commented Mar 13, 2024

Hi, thanks for the tag. It looks good from the L1TScouting perspective as well, but I think that the event content needs to be changed.

compressionAlgorithm=cms.untracked.string("LZMA"),
compressionLevel=cms.untracked.int32(4)
)
L1SCOUTEventContent.outputCommands.extend(L1TriggerRAW.outputCommands)
Contributor

@Mmiglio Mmiglio Mar 13, 2024

Should this be changed in a different PR? The L1Scout event content is different from the L1Trigger event content.
For example, we would need something along the lines of these commands:

"keep *_GmtUnpacker_*_*",
"keep *_CaloUnpacker_*_*",
"keep *_BmtfStubsUnpacker_*_*",

I can pick a different name for the modules if needed.
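
As a rough sketch of what a dedicated L1 scouting content could look like, the snippet below extends a command list with the three keep statements quoted above instead of the L1TriggerRAW commands. Plain Python objects stand in for the CMSSW configuration types so the snippet runs on its own; the name L1ScoutingExample and the initial 'drop *' are assumptions, not part of this PR.

```python
from types import SimpleNamespace

# Stand-in for a cms.PSet holding outputCommands (assumption: a plain
# namespace with a list replaces the CMSSW configuration types here).
L1ScoutingExample = SimpleNamespace(
    outputCommands=[
        "keep *_GmtUnpacker_*_*",
        "keep *_CaloUnpacker_*_*",
        "keep *_BmtfStubsUnpacker_*_*",
    ]
)

# Assumed starting point: an event content that drops everything first.
L1SCOUTEventContent = SimpleNamespace(outputCommands=["drop *"])

# Same extend pattern as in the diff, but with the scouting-specific block.
L1SCOUTEventContent.outputCommands.extend(L1ScoutingExample.outputCommands)

for command in L1SCOUTEventContent.outputCommands:
    print(command)
```

If the unpacker modules are renamed later, only the hypothetical L1ScoutingExample block would need to change.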

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-755f0c/38090/summary.html
COMMIT: b0e72f7
CMSSW: CMSSW_14_1_X_2024-03-12-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/44381/38090/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 5 lines from the logs
  • Reco comparison results: 52 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3297383
  • DQMHistoTests: Total failures: 9
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3297354
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@drkovalskyi
Contributor

@davidlange6, what trivial change do you have in mind? Maybe I missed it.

In general, we have a long chain of things that we need to wire to make it all work in production. Indeed, L1 scouting seems incomplete and will need to be updated, but this can be done later and we can start testing things already.

fileName = cms.untracked.string("%s.root" % moduleLabel)
)

outputModule.dataset = cms.untracked.PSet(dataTier = cms.untracked.string("RAW"))
if dataTier != defaultDataTier:
    outputModule.outputCommands = copy.copy(eventContent.outputCommands)
Contributor

@drkovalskyi - my suggestion is to remove this line and the previous one, which removes the concern I've raised.

Contributor

I think for us it would be ok. We can still use the data tier name to differentiate L1 scouting, HLT scouting, and regular RAW. It also means that T0 will not enforce any content on these new data tiers. What do you think @drkovalskyi?

Contributor

I agree. Let's not enforce event content in repacking. It introduces more problems than it solves. We may still need the new EventContent for central production, so I would still keep that part, but as far as repacking itself is concerned, I agree with David's proposal.
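
A minimal sketch of the behaviour agreed on here, assuming the output module can be pictured as a plain dictionary: the data tier only labels the output and sets the file name, while no outputCommands are attached unless explicitly requested. The function build_output_module and its enforce_content flag are hypothetical, and the "write_<dataset>_<tier>" label format is inferred from the file names in the test listing.

```python
def build_output_module(primary_dataset, data_tier, event_content=None, enforce_content=False):
    """Describe one repack output; the data tier is a label, not a content filter."""
    module_label = "write_%s_%s" % (primary_dataset, data_tier)
    module = {
        "label": module_label,
        "fileName": "%s.root" % module_label,
        "dataTier": data_tier,
    }
    # Behaviour before the review: copy the tier's commands into the module.
    # After the review this branch is not taken, so nothing is enforced.
    if enforce_content and event_content is not None:
        module["outputCommands"] = list(event_content)
    return module


for tier in ("RAW", "L1SCOUT", "HLTSCOUT"):
    module = build_output_module("PrimDS1", tier)
    print(module["fileName"], "outputCommands" in module)
```

With the flag left at its default, the three tiers differ only in their label and file name, which matches the idea of using the tier name to tell L1 scouting, HLT scouting and regular RAW apart.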

Contributor

They are needed for MC production and/or derived data sets, but those should be correct if defined... (I guess the one for HLT scouting is already present somewhere, as it's in MiniAOD already, IIUC.)

Contributor

For MC use cases we will have time to fine-tune and iterate.

Contributor

please open an issue so that you don't forget to clean things up

Contributor Author

I've removed these two lines as you suggested.

Contributor

please open an issue so that you don't forget to clean things up

@drkovalskyi

Please let us know once you open the issue.
Thanks.

Contributor

@haozturk could you please follow up on this request

Contributor Author

Here is the issue: #44409. Please correct me if there are any missing or incorrect statements there.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44381/39475

@cmsbuild
Contributor

Pull request #44381 was updated. @antoniovilela, @fabiocos, @cmsbuild, @davidlange6, @rappoccio can you please check and sign again.

@mmusich
Contributor

mmusich commented Mar 14, 2024

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-755f0c/38132/summary.html
COMMIT: 0255266
CMSSW: CMSSW_14_1_X_2024-03-13-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/44381/38132/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 39 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3297383
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3297360
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@antoniovilela
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

@cmsbuild cmsbuild merged commit 28fb17c into cms-sw:master Mar 15, 2024
11 checks passed