add layer-1 monitoring for new slot-7 cards [13_0_7] #42021

Merged: 1 commit into cms-sw:CMSSW_13_0_X on Jul 4, 2023

Conversation

hftsoi
Contributor

@hftsoi hftsoi commented Jun 19, 2023

PR description:

This PR modifies the Calo-Layer1 unpacker to accommodate a new CTP7 card added in slot 7 of each of the three Layer-1 crates (FEDs 1354, 1356, 1358). Each new card sends the same payload header and trailer as the existing calo cards, but with a fixed payload size of six 32-bit words, regardless of whether normal or FAT events are being sent. New monitoring elements are added to the Layer-1 DQM for the resulting 3x6x32 bits. The change is written so that the unpacker works both before and after the cards are added.
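As a rough illustration of the "works before and after" requirement, the following is a minimal sketch and not the code in this PR. The `CardBlock` type, the constant names, and `extractSlot7Words` are hypothetical; only the slot number (7) and the fixed size of six 32-bit words come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Values taken from the PR description; the names are made up for this sketch.
constexpr int kNewCardSlot = 7;
constexpr std::size_t kNewCardPayloadWords = 6;

// Hypothetical view of one card's data inside a crate payload.
struct CardBlock {
  int slot;                          // slot number decoded from the card header
  std::vector<std::uint32_t> words;  // payload words between header and trailer
};

// Returns the six words of the slot-7 card if it is present and has the
// expected size. Returns std::nullopt if the card is absent (data taken
// before the card was installed) or if its payload size is unexpected.
std::optional<std::vector<std::uint32_t>> extractSlot7Words(const std::vector<CardBlock>& crate) {
  for (const auto& card : crate) {
    assert(card.slot >= 0 && "slot numbers are assumed to be non-negative");
    if (card.slot != kNewCardSlot) {
      continue;  // existing calo cards are handled elsewhere
    }
    if (card.words.size() != kNewCardPayloadWords) {
      return std::nullopt;  // malformed: size is fixed for normal and FAT events
    }
    return card.words;
  }
  return std::nullopt;  // no slot-7 card in this crate
}
```

Treating an absent card as "nothing to monitor" rather than as an error is what lets the same sketch handle payloads recorded before and after the card addition.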

Note that the monitoring elements for HCAL FB4-5 are commented out; we will put them back once HCAL fixes the issue. FB4-5 are reserved bits and not used for LLP, but HCAL is sending unphysical data there that Layer-1 cannot read out, which causes discrepancies when the two are compared.

PR validation:

Validated by running offline DQM on past commissioning runs; it works as expected for the current production firmware. This adds minor firmware status checks on top of #41383, which has been tested online with the new firmware.

@cmsbuild
Contributor

cmsbuild commented Jun 19, 2023

A new Pull Request was created by @hftsoi (Ho-Fung Tsoi) for CMSSW_13_0_X.

It involves the following packages:

  • DQM/L1TMonitor (dqm)
  • EventFilter/L1TRawToDigi (l1)

@aloeliger, @epalencia, @nothingface0, @emanueleusai, @cmsbuild, @pmandrik, @syuvivida, @tjavaid, @micsucmed, @rvenditti can you please review it and eventually sign? Thanks.
@dinyar, @missirol, @Martin-Grunewald, @thomreis, @eyigitba this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@hftsoi hftsoi marked this pull request as draft June 19, 2023 20:27
@hftsoi
Contributor Author

hftsoi commented Jun 19, 2023

hi @syuvivida this is the 13_0_7 version on top of #41383 with additional minor checks added, could you please integrate this into online DQM production, we will do a quick test on it as soon as this PR is there. Thanks!

@emanueleusai
Member

please test

@emanueleusai
Member

type hcal

@cmsbuild cmsbuild added the hcal label Jun 20, 2023
@emanueleusai
Member

hi @syuvivida this is the 13_0_7 version on top of #41383 with additional minor checks added, could you please integrate this into online DQM production, we will do a quick test on it as soon as this PR is there. Thanks!

working on it

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9db053/33255/summary.html
COMMIT: e6133b0
CMSSW: CMSSW_13_0_X_2023-06-19-2300/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/42021/33255/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 206 lines to the logs
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3315876
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3315851
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -925.8799999999998 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 10024.0,... ): -46.294 KiB L1T/L1TStage2CaloLayer1
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found
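
The DQMHistoTests figures above can be cross-checked with simple arithmetic. The sketch below assumes that failures, nulls, successes, skipped, and missing objects partition the total number of compared histograms; that is an assumption consistent with the numbers shown, not a documented property of the comparison tool. The type and function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical tally of the DQMHistoTests counters; field names are made up.
struct HistoTally {
  long long total;
  long long failures;
  long long nulls;
  long long successes;
  long long skipped;
  long long missing;
};

// True if the per-category counts add up to the reported total.
constexpr bool addsUp(const HistoTally& t) {
  return t.failures + t.nulls + t.successes + t.skipped + t.missing == t.total;
}

// Checks the figures quoted in the comparison summary above:
// 3 + 0 + 3315851 + 22 + 0 == 3315876.
inline void checkSummaryAbove() {
  const HistoTally t{3315876, 3, 0, 3315851, 22, 0};
  assert(addsUp(t));
}
```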

@micsucmed

Hi @hftsoi, we have run the test in our playback systems with this PR. You can have a look at the output ROOT file on EOS at this path:
/eos/cms/store/group/comm_dqm/temp_DQMGUI_data_repository/totem/DQM_V0001_L1T_R000528572.root

@hftsoi
Contributor Author

hftsoi commented Jun 20, 2023

thanks @micsucmed it looks good, could you please deploy it?

@micsucmed

@hftsoi, great, we will deploy it tomorrow after the RC meeting

@emanueleusai
Member

+1

  • p5 tests ok

Comment on lines +122 to +123
//dqm::reco::MonitorElement *hcalOccFg4Discrepancy_;
//dqm::reco::MonitorElement *hcalOccFg5Discrepancy_;
Contributor

If these are no longer needed, could you remove them?

Comment on lines +461 to +462
//const bool Hfg4Agreement = (abs(ieta) < 29) ? (layer1fg4 == uHTRfg4) : true;
//const bool Hfg5Agreement = (abs(ieta) < 29) ? (layer1fg5 == uHTRfg5) : true;
Contributor

These as well

Comment on lines +526 to +531
//if (not Hfg4Agreement) {
// eventMonitors.hcalOccFg4Discrepancy_->Fill(ieta, iphi);
//}
//if (not Hfg5Agreement) {
// eventMonitors.hcalOccFg5Discrepancy_->Fill(ieta, iphi);
//}
Contributor

These as well

Comment on lines +761 to +764
//eventMonitors.hcalOccFg4Discrepancy_ =
// bookHcalOccupancy("hcalOccFg4Discrepancy", "HCal Fine Grain 4 Discrepancy between uHTR and Layer1");
//eventMonitors.hcalOccFg5Discrepancy_ =
// bookHcalOccupancy("hcalOccFg5Discrepancy", "HCal Fine Grain 5 Discrepancy between uHTR and Layer1");
Contributor

These as well

Contributor Author

These are commented out for now to mute monitoring of channels where HCAL is sending unphysical data to Layer-1; they will be uncommented once HCAL fixes it.

Contributor

Fine with me then
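
For reference, the muting discussed in this thread could also be expressed as a runtime switch instead of commented-out code. This is only a sketch of that alternative and not what the PR does: the `fgAgreement` function and its `enabled` flag are hypothetical, while the `abs(ieta) < 29` condition mirrors the commented-out lines shown above.

```cpp
#include <cassert>
#include <cstdlib>

// Returns true when the Layer-1 and uHTR fine-grain bits agree.
// When `enabled` is false the comparison is muted and always reports
// agreement, so no discrepancy histogram would be filled.
// As in the commented-out code, only towers with |ieta| < 29 are compared.
inline bool fgAgreement(int ieta, bool layer1Bit, bool uhtrBit, bool enabled) {
  if (!enabled) {
    return true;
  }
  return (std::abs(ieta) < 29) ? (layer1Bit == uhtrBit) : true;
}

// Small usage example: a mismatch is reported only when the check is enabled
// and the tower lies in the compared ieta range.
inline void fgAgreementExample() {
  assert(fgAgreement(10, true, false, /*enabled=*/false));  // muted
  assert(!fgAgreement(10, true, false, /*enabled=*/true));  // mismatch seen
  assert(fgAgreement(30, true, false, /*enabled=*/true));   // outside range
}
```

A switch like this would keep the comparison code compiled and covered by tests while HCAL's fix is pending, at the cost of one more configuration parameter to maintain.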

@hftsoi
Contributor Author

hftsoi commented Jun 22, 2023

Reopening this PR: the P5 test went fine and we want these changes to stay permanently. A PR has been opened to master (#42048), and this one will be its backport.

@hftsoi hftsoi marked this pull request as ready for review June 22, 2023 10:28
@aloeliger
Contributor

backport of #42048

@emanueleusai
Member

please test

@hftsoi
Contributor Author

hftsoi commented Jul 3, 2023

-1

Failed Tests: RelVals RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9db053/33503/summary.html
COMMIT: ded0751
CMSSW: CMSSW_13_0_X_2023-07-02-0000/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/42021/33503/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

----- Begin Fatal Exception 02-Jul-2023 21:50:44 CEST-----------------------
An exception of category 'FileOpenError' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 5 stream: 0
   [1] Running path 'HLTAnalyzerEndpath'
   [2] Prefetching for module L1TRawToDigi/'hltGtStage2Digis'
   [3] Prefetching for module RawDataCollectorByLabel/'rawDataCollector'
   [4] Prefetching for module SiStripDigiToRawModule/'SiStripDigiToRaw'
   [5] Calling method for module MixingModule/'mix'
   [6] Calling RootInputFileSequence::initTheFile()
   [7] Calling StorageFactory::open()
   [8] Calling XrdFile::open()
Exception Message:
Failed to open the file 'root://xrootd-cms.infn.it//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root'
   Additional Info:
      [a] Calling RootInputFileSequence::initTheFile(): fail to open the file with name root://cms-xrd-global.cern.ch//eos/cms/store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root
      [b] Calling RootInputFileSequence::initTheFile(): fail to open the file with name root://eoscms.cern.ch//eos/cms/store/user/cmsbuild/store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root
      [c] Input file root://xrootd-cms.infn.it//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root could not be opened.
      [d] XrdCl::File::Open(name='root://xrootd-cms.infn.it//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root', flags=0x10, permissions=0660) => error '[ERROR] Server responded with an error: [3011] No servers are available to read the file.
' (errno=3011, code=400). No additional data servers were found.
      [e] Last URL tried: root://cms-xrd-global.cern.ch:1094//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root?tried=+1213xrootd-redic.pi.infn.it,&xrdcl.requuid=b50b5a30-952e-4aa9-b564-2f7dfe661149
      [f] Problematic data server: cms-xrd-global.cern.ch:1094
      [g] Disabled source: cms-xrd-global.cern.ch:1094
----- End Fatal Exception -------------------------------------------------

RelVals-INPUT

  • 4.29 4.29_RunMinBias2011B/step2_RunMinBias2011B.log
  • 23.0 23.0_JpsiMM/step2_JpsiMM.log
  • 134.806 134.806_RunMuonEG2015C/step2_RunMuonEG2015C.log


Hi, I don't think the failed tests have anything to do with this PR (a similar failure has been seen in #38361). Could you please check? Thank you!

@cmsbuild
Contributor

cmsbuild commented Jul 3, 2023

-1

Failed Tests: RelVals RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9db053/33505/summary.html
COMMIT: ded0751
CMSSW: CMSSW_13_0_X_2023-07-02-2300/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/42021/33505/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

----- Begin Fatal Exception 03-Jul-2023 07:13:45 CEST-----------------------
An exception of category 'FileOpenError' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=MixingModule label='mix'
   [2] Calling RootInputFileSequence::initTheFile()
   [3] Calling StorageFactory::open()
   [4] Calling XrdFile::open()
Exception Message:
Failed to open the file 'root://xrootd-cms.infn.it//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/17fac9a9-98f1-43d3-9dbd-d26d638e04dd.root'
   Additional Info:
      [a] Calling RootInputFileSequence::initTheFile(): fail to open the file with name root://cms-xrd-global.cern.ch//eos/cms/store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/17fac9a9-98f1-43d3-9dbd-d26d638e04dd.root
      [b] Calling RootInputFileSequence::initTheFile(): fail to open the file with name root://eoscms.cern.ch//eos/cms/store/user/cmsbuild/store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/17fac9a9-98f1-43d3-9dbd-d26d638e04dd.root
      [c] Input file root://xrootd-cms.infn.it//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/17fac9a9-98f1-43d3-9dbd-d26d638e04dd.root could not be opened.
      [d] XrdCl::File::Open(name='root://xrootd-cms.infn.it//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/17fac9a9-98f1-43d3-9dbd-d26d638e04dd.root', flags=0x10, permissions=0660) => error '[ERROR] Server responded with an error: [3011] No servers are available to read the file.
' (errno=3011, code=400). No additional data servers were found.
      [e] Last URL tried: root://cms-xrd-global.cern.ch:1094//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/17fac9a9-98f1-43d3-9dbd-d26d638e04dd.root?tried=+1213llrxrd-redir.in2p3.fr,&xrdcl.requuid=1d2df3fc-8f0b-42ed-98bc-aa191ce3d000
      [f] Problematic data server: cms-xrd-global.cern.ch:1094
      [g] Disabled source: cms-xrd-global.cern.ch:1094
----- End Fatal Exception -------------------------------------------------

RelVals-INPUT

  • 4.29 4.29_RunMinBias2011B/step2_RunMinBias2011B.log
  • 23.0 23.0_JpsiMM/step2_JpsiMM.log
  • 134.808 134.808_RunSingleMuPrpt2015C/step2_RunSingleMuPrpt2015C.log

@perrotta
Contributor

perrotta commented Jul 4, 2023

please test

@cmsbuild
Contributor

cmsbuild commented Jul 4, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9db053/33534/summary.html
COMMIT: ded0751
CMSSW: CMSSW_13_0_X_2023-07-03-2300/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/42021/33534/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 199 lines from the logs
  • Reco comparison results: 120 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3317096
  • DQMHistoTests: Total failures: 4109
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3312965
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -925.8799999999998 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 10024.0,... ): -46.294 KiB L1T/L1TStage2CaloLayer1
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@aloeliger
Contributor

+l1

  • Just resigning after the backport

@perrotta
Contributor

perrotta commented Jul 4, 2023

@emanueleusai you signed this before the latest minor updates, which made this backport PR identical to its already merged master version (also signed by @cms-sw/dqm-l2).
I assume that you will also sign after those minor updates, and I will merge this PR so that it is included in the upcoming 13_0_10 release.
Please complain if there is anything wrong with it.

@perrotta
Contributor

perrotta commented Jul 4, 2023

+1

@perrotta
Contributor

perrotta commented Jul 4, 2023

merge

@cmsbuild cmsbuild merged commit 7785a52 into cms-sw:CMSSW_13_0_X Jul 4, 2023
@emanueleusai
Member

+1

@cmsbuild
Contributor

cmsbuild commented Jul 5, 2023

This pull request is fully signed and it will be integrated in one of the next CMSSW_13_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_13_2_X is complete. This pull request will be automatically merged.
