Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix LHCInfoPer* PopCons to produce lumi-type IOVs when running in duringFill mode #43837

Conversation

francescobrivio
Copy link
Contributor

@francescobrivio francescobrivio commented Feb 1, 2024

PR description:

Since, when used in the duringFill mode, both the LHCInfoPerLSand LHCInfoPeFill PopCons will have to populate tags in the HLT GT (with an ad hoc customization in the HLT menu to get updated conditions via the refreshTime parameter), the time-type of the produced payloads/IOVs must be lumi based (see cond::TimeType::lumiid).

This PR introduces a differentiation in how the payload/IOVs are produced depending on the "mode" (passed as external paramenter to the cmsRun command) used to run the O2O:

  • endFill mode: produce timestamp-type payloads
  • duringFill mode: produce lumiid-type payloads

Further considerations

Currently the IOVs are produced directly from the LSs analyzed to get the payload data: if the appends of such IOVs to an hlt-sync'ed tag were to be performed via the uploadConditions.py script, they would be all rejected due to the synchronization policy.
But, after discussion with @mmusich, we realized that the PopCon upload mechanism does not exploits the uploader script, but rather uses the PoolDBOutputService::writeMany method, so the appends should be possible.
Given the values of the parameters of the payloads (mainly beta* and crossinging angle) do not change very frequently, I suppose it is fine to upload the IOVs even though the HLT has moved on to processing a next LS: the next time the refreshParameter mechanism kicks in, the conditions consumed by HLT should be updated with the updated values.

Alternatively one could use the OnlineDBOutputService::writeIOVForNextLumisection method (which is used for the OnlineBeamSpot workflow), which reads the LS currently being processed in HLT and writes the payload to currentLS+latency, to guarantee an upload "in the future" wrt what HLT is processing.
The drawback is that this would require significant changes in the code (probably the whole module would have to be moved outside the PopCon infrastructure).

@JanChyczynski @mmusich @vavati @fwyzard Any suggestion is more than welcome, especially on this last point!

I'm opening this as draft PR since there are still a few points to be addressed:

  • Make sure that all the IOVs insertions are performed with the correct time-type, especially when the addEmptyPayload() method is used
  • Make sure the tests already implemented cover all possible modes of running
  • Decide if the IOVs used for appending new payloads is fine or not (see description above)

PR validation:

Some preliminary private tests, forcing the time-type of the tags to be produced as lumiid even in endFill mode, show that the IOVs are correctly created as LS based and the intervals match the LSs from OMS.
Command examples:

cmsRun CondTools/RunInfo/python/LHCInfoPerFillPopConAnalyzer.py mode=endFill
cmsRun CondTools/RunInfo/python/LHCInfoPerLSPopConAnalyzer.py mode=endFill
cmsRun CondTools/RunInfo/python/LHCInfoPerFillPopConAnalyzer.py mode=duringFill
cmsRun CondTools/RunInfo/python/LHCInfoPerLSPopConAnalyzer.py mode=duringFill

Backport:

Not a backport, eventual 133X backports will be opened depending on the 2024 data-taking schedule - to be decided.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2024

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43837/38664

  • This PR adds an extra 28KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2024

A new Pull Request was created by @francescobrivio for master.

It involves the following packages:

  • CondTools/RunInfo (db)

@francescobrivio, @perrotta, @consuegs, @cmsbuild, @saumyaphor4252 can you please review it and eventually sign? Thanks.
@mmusich, @rsreds, @PonIlya, @yuanchao this is something you requested to watch as well.
@antoniovilela, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@vavati
Copy link

vavati commented Feb 1, 2024

About "Further considerations" : The two mechanisms should be asynchronous so according to me it is ok . PoPCon should update more frequent (let's say 10 min) and HLT_refresh ~ 15 min
Even if they start together at the beginning of the run (not sure about that) they are shifted by ~ 5-10 min. For the plateau of the levelling it is ok. At least this is a good starting point. Then we can see if we need more frequent upload/refresh.

@mmusich
Copy link
Contributor

mmusich commented Feb 1, 2024

PoPCon should update more frequent (let's say 10 min) and HLT_refresh ~ 15 min

for the BeamSpot (with a similar use case) we refresh every 2 LS in the HLT:

GlobalTag = cms.ESSource( "PoolDBESSource",
    DBParameters = cms.PSet( 
      connectionRetrialTimeOut = cms.untracked.int32( 60 ),
      idleConnectionCleanupPeriod = cms.untracked.int32( 10 ),
      enableReadOnlySessionOnUpdateConnection = cms.untracked.bool( False ),
      enablePoolAutomaticCleanUp = cms.untracked.bool( False ),
      messageLevel = cms.untracked.int32( 0 ),
      authenticationPath = cms.untracked.string( "." ),
      connectionRetrialPeriod = cms.untracked.int32( 10 ),
      connectionTimeOut = cms.untracked.int32( 0 ),
      enableConnectionSharing = cms.untracked.bool( True )
    ),
    connect = cms.string( "frontier://FrontierProd/CMS_CONDITIONS" ),
    globaltag = cms.string( "None" ),
    snapshotTime = cms.string( "" ),
    toGet = cms.VPSet( 
      cms.PSet(  refreshTime = cms.uint64( 2 ),
        record = cms.string( "BeamSpotOnlineLegacyObjectsRcd" )
      ),
      cms.PSet(  refreshTime = cms.uint64( 2 ),
        record = cms.string( "BeamSpotOnlineHLTObjectsRcd" )
      )
    ),
    DumpStat = cms.untracked.bool( False ),
    ReconnectEachRun = cms.untracked.bool( True ),
    RefreshAlways = cms.untracked.bool( False ),
    RefreshEachRun = cms.untracked.bool( True ),
    RefreshOpenIOVs = cms.untracked.bool( False ),
    pfnPostfix = cms.untracked.string( "" ),
    pfnPrefix = cms.untracked.string( "" )
)

IIRC trying to balance the load on the DB vs reconstruction accuracy.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 2, 2024

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43837/38683

  • This PR adds an extra 24KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 2, 2024

Pull request #43837 was updated. @saumyaphor4252, @consuegs, @perrotta, @francescobrivio, @cmsbuild can you please check and sign again.

@francescobrivio francescobrivio marked this pull request as ready for review February 2, 2024 16:27
@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 6, 2024

Milestone for this pull request has been moved to CMSSW_14_1_X. Please open a backport if it should also go in to CMSSW_14_0_X.

@francescobrivio
Copy link
Contributor Author

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 1, 2024

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43837/39307

  • This PR adds an extra 28KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 1, 2024

Pull request #43837 was updated. @francescobrivio, @perrotta, @consuegs, @saumyaphor4252 can you please check and sign again.

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 1, 2024

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a82a19/37859/summary.html
COMMIT: 96cce8c
CMSSW: CMSSW_14_1_X_2024-03-01-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43837/37859/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 4 lines to the logs
  • Reco comparison results: 49 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3338784
  • DQMHistoTests: Total failures: 9
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3338753
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 205 log files, 166 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@perrotta
Copy link
Contributor

perrotta commented Mar 4, 2024

+db

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 4, 2024

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @rappoccio, @sextonkennedy (and backports should be raised in the release meeting by the corresponding L2)

@antoniovilela
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 38abdb6 into cms-sw:master Mar 4, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants