[13_0_X] Online DQM: replace the hard-coded output directory name of event display clients with an input argument, backport of 41986 and 42128 #42132

Merged: 1 commit, Jul 8, 2023


syuvivida
Contributor

PR description:

This is a backport of PR 41986 and PR 42128.

We are upgrading the online DQM machines [0][1]. For a few months, the new and the current machines will share the same CMSSW code.
On the current (old) DQM machines, the disks of bu-c2f11-09-01 and bu-c2f11-13-01 are mounted on our fu machines as /fff/BU0. The event display clients visualization-live and visualization-live-secondInstance write their output ROOT files to /fff/BU0/output.

However, the mount point (path) is different on the new online DQM machines [1]. To use the same event display client code on both old and new machines, and to make the path more flexible, we replace the hard-coded output path with an input argument, outputBaseDir, whose default value is /fff/BU0/output. The old machines will keep running an old hltd version and take the default value of outputBaseDir, while the values for the new machines will be set by hltd and startDqmRun.sh.
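As a rough illustration of the default-with-override pattern described above, here is a minimal plain-Python sketch. It is not the code changed in this PR: the actual clients presumably register the argument through the CMSSW configuration option parser, which is not reproduced here. The helper names `parse_key_value_args`, `resolve_output_base_dir`, and `output_file_path` are hypothetical, and the `key=value` argument style is an assumption; only the argument name outputBaseDir and its default /fff/BU0/output come from the description.

```python
import posixpath

# Default quoted in the PR description; used when no argument is passed.
DEFAULT_OUTPUT_BASE_DIR = "/fff/BU0/output"


def parse_key_value_args(argv):
    """Collect key=value style arguments into a dict (hypothetical helper)."""
    options = {}
    for arg in argv:
        key, sep, value = arg.partition("=")
        if sep and key:  # ignore tokens that are not of the form key=value
            options[key] = value
    return options


def resolve_output_base_dir(argv):
    """Return outputBaseDir from argv, or the default if it is absent."""
    return parse_key_value_args(argv).get("outputBaseDir", DEFAULT_OUTPUT_BASE_DIR)


def output_file_path(argv, file_name):
    """Build the full output path for a file under the resolved base directory."""
    return posixpath.join(resolve_output_base_dir(argv), file_name)
```

With no arguments the sketch falls back to the old mount point, which mirrors how the old machines are expected to behave; passing `outputBaseDir=<path>` overrides it, which is the role hltd and startDqmRun.sh play on the new machines.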

[0] twiki about the upgrade of DQM machines
[1] JIRA ticket that includes the communication with DAQ
[2] JIRA ticket of the tests during TS1

PR validation:

  • This PR has been tested on lxplus by running the hlt, hcal, and ecal clients standalone with CMSSW_13_0_X_2023-06-27-1100, CMSSW_13_1_X_2023-06-27-1100, and CMSSW_13_2_X_2023-06-26-2300, using the streamers at /eos/cms/store/group/comm_dqm/Collisions23_tempStreamers/.
  • This PR has been tested on the current (old) online DQM playback machines: all clients ran without problems when using the default value of the input argument.
  • This PR was deployed and tested during the data-transfer tests in TS1 [2].

@cmsbuild
Contributor

cmsbuild commented Jun 29, 2023

A new Pull Request was created by @syuvivida for CMSSW_13_0_X.

It involves the following packages:

  • DQM/Integration (dqm)

@nothingface0, @emanueleusai, @cmsbuild, @pmandrik, @syuvivida, @tjavaid, @micsucmed, @rvenditti can you please review it and eventually sign? Thanks.
@threus, @batinkov, @francescobrivio this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@emanueleusai
Member

please test

@cmsbuild
Contributor

cmsbuild commented Jul 3, 2023

-1

Failed Tests: RelVals RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c31fb2/33508/summary.html
COMMIT: 9a024fe
CMSSW: CMSSW_13_0_X_2023-07-02-2300/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/42132/33508/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

----- Begin Fatal Exception 03-Jul-2023 07:19:44 CEST-----------------------
An exception of category 'FileOpenError' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 5 stream: 0
   [1] Running path 'HLTAnalyzerEndpath'
   [2] Prefetching for module L1TRawToDigi/'hltGtStage2Digis'
   [3] Prefetching for module RawDataCollectorByLabel/'rawDataCollector'
   [4] Prefetching for module SiStripDigiToRawModule/'SiStripDigiToRaw'
   [5] Calling method for module MixingModule/'mix'
   [6] Calling RootInputFileSequence::initTheFile()
   [7] Calling StorageFactory::open()
   [8] Calling XrdFile::open()
Exception Message:
Failed to open the file 'root://xrootd-cms.infn.it//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root'
   Additional Info:
      [a] Calling RootInputFileSequence::initTheFile(): fail to open the file with name root://cms-xrd-global.cern.ch//eos/cms/store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root
      [b] Calling RootInputFileSequence::initTheFile(): fail to open the file with name root://eoscms.cern.ch//eos/cms/store/user/cmsbuild/store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root
      [c] Input file root://xrootd-cms.infn.it//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root could not be opened.
      [d] XrdCl::File::Open(name='root://xrootd-cms.infn.it//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root', flags=0x10, permissions=0660) => error '[ERROR] Server responded with an error: [3011] No servers are available to read the file.
' (errno=3011, code=400). No additional data servers were found.
      [e] Last URL tried: root://cms-xrd-global.cern.ch:1094//store/relval/CMSSW_12_0_0_pre4/RelValMinBias_13/GEN-SIM/113X_mc2017_realistic_v5-v1/00000/a21693e9-4d25-496a-96e2-c28232a7a712.root?tried=+1213llrxrd-redir.in2p3.fr,&xrdcl.requuid=64739e27-a3ab-4443-b1af-da4ebde1f1ce
      [f] Problematic data server: cms-xrd-global.cern.ch:1094
      [g] Disabled source: cms-xrd-global.cern.ch:1094
----- End Fatal Exception -------------------------------------------------

RelVals-INPUT

  • 4.29: 4.29_RunMinBias2011B/step2_RunMinBias2011B.log
  • 134.806: 134.806_RunMuonEG2015C/step2_RunMuonEG2015C.log
  • 134.808: 134.808_RunSingleMuPrpt2015C/step2_RunSingleMuPrpt2015C.log

@perrotta
Contributor

perrotta commented Jul 4, 2023

backport of #42128
(also #41986)

@syuvivida
Contributor Author

Hello, just a note that the test failure is unrelated to this PR.

@perrotta
Contributor

perrotta commented Jul 4, 2023

please test

@cmsbuild
Contributor

cmsbuild commented Jul 5, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c31fb2/33550/summary.html
COMMIT: 9a024fe
CMSSW: CMSSW_13_0_X_2023-07-04-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/42132/33550/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 8 lines to the logs
  • Reco comparison results: 1 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3317136
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3317111
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@emanueleusai
Member

+1

@cmsbuild
Contributor

cmsbuild commented Jul 5, 2023

This pull request is fully signed and it will be integrated in one of the next CMSSW_13_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_13_2_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

perrotta commented Jul 8, 2023

+1
