Regarding failures during submission in McM for WWJ + NNLOPS sample #42716
Comments
A new Issue was created by @sv3048 SADHANA VERMA. @Dr15Jones, @rappoccio, @smuzaffar, @makortel, @sextonkennedy, @antoniovilela can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign generators |
New categories assigned: generators @mkirsano,@menglu21,@alberto-sanchez,@SiewYan,@GurpreetSinghChahal,@Saptaparna you have been requested to review this Pull request/Issue and eventually sign? Thanks |
@smuzaffar does cmsdist just pick up libgsl from the OS? |
No, we should be picking up gsl from the cms externals |
Does |
Hello @ALL! Thanks and Regards, |
cc'ing @bbilin @menglu21 @sunilUIET Thanks ! |
Hi, |
i guess the key issue is the "0" in |
The Makefile in WWJ has |
To add: you can fix the Makefile and recompile the executables. All grids can be copied, so there is no need to resubmit jobs. Only the executables need to be fixed; once all pre-sampled info is added, you are good to go. |
hum, cmssw doesn't put gsl-config into the externals bin area. @smuzaffar - is that expected? [I guess this is because the gsl toolfile does not include a path...] |
@davidlange6 , you are right. Our
|
cms-sw/cmsdist#8711 adds our |
A new scram runtime hook has been deployed on cvmfs ( cms-sw/cmsdist#8712 ) which now properly adds our [a]
|
Perfect. Thanks for all your work @smuzaffar ! @sv3048 can you update the executables? |
Hi @agrohsje ! |
Hi @agrohsje @smuzaffar ! I have a query. could you please confirm that ? Thanks and Regards, |
yes that is correct @sv3048. No need to change cmssw version, just recompiling the WWJ should now pick the gsl from cvmfs |
Okay, Thanks ! |
You can double-check that all works well by doing |
Dear @smuzaffar, @agrohsje and other experts, It hurts me a lot to resurrect this issue, which I hoped was finally solved. Following the previous discussion, I recompiled the WWJ process after the cmssw external package was updated, and indeed it could pick the right
Therefore we proceeded with the sample injection (after some delay). However, the That's really unfortunate and I'm really sorry that I couldn't catch this earlier, but I'm far from being an expert in computing. I have cross checked the I would like to ask you if you could please fix this again, I hope it's doable. Of course, this is just my interpretation, if you think the problem can be solved differently please tell me. Thank you very much in advance, let me know if something is not clear and you need more details. Best regards, |
Dear Mattia, |
Hi @agrohsje , Thanks for replying, I didn't mean to overload you with extra work and I totally understand if you have other business to run. To be honest, I don't have any clue about whether that specific library is actually needed by the underlying Powheg code (and where it's used eventually). I can give it a try at what you suggested, just recompiling the code without that extra flag and see what happens. Thanks again for your feedback, if that doesn't work out I will ask other people in this thread to help me (if they can). Cheers, |
if you picked up the right
instead using the default gsl-config on lxplus (eg, from /usr/bin)
so what is the output of |
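A quick way to see which `gsl-config` the current shell environment resolves is to look it up on `PATH` and classify its location. The snippet below is only a minimal sketch: the helper name `gsl_config_origin` and the prefix rules (`/cvmfs/` for the CMSSW externals, `/usr/` or `/bin/` for the OS copy) are assumptions made for illustration, not something defined by CMSSW or scram.

```python
import shutil
from typing import Optional


def gsl_config_origin(path: Optional[str]) -> str:
    """Classify where a gsl-config executable lives (hypothetical helper)."""
    if path is None:
        return "missing"   # no gsl-config on PATH at all
    if path.startswith("/cvmfs/"):
        return "cvmfs"     # assumed: the copy shipped with the CMSSW externals
    if path.startswith(("/usr/", "/bin/")):
        return "system"    # assumed: the OS default, e.g. /usr/bin on lxplus
    return "other"


# Look up the first gsl-config on PATH in the current environment.
found = shutil.which("gsl-config")
print(found, gsl_config_origin(found))
```

If this prints `system` after `cmsenv`, the build would link against the OS copy of gsl rather than the one distributed with CMSSW.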
Dear @davidlange6 , If I run the |
I see - I had just done if instead I go to CMSSW_10_6_21, I do get the same environment as you. The difference being
that cmsenv misses. I'm not sure why that would be.. Anyway, you can either set this envvar, or perhaps use |
Indeed, if I source that script I get what is perhaps the "right" output, thanks a lot! I was about to go with the second approach you suggested, but I think it's cleanest to set up the proper initialization script as you just showed. Thank you very much, I'll try to compile the code again and see what happens. Maybe I can test the new gridpack via crab to check that it doesn't fail on a grid node, so that next time we inject the sample we don't encounter any undesired behaviour. Cheers and thanks again for your support, |
fwiw, doing
just sets up gsl and its dependencies. If you are relying on other things from CMSSW, then they may not be configured properly. |
I would go with --libs-without-cblas. In fact that was the reason why I asked if you need that or not. I would pick all from CMSSW and remove cblas. |
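For readers unfamiliar with the two options: `gsl-config --libs` normally includes `-lgslcblas` in the link line, while `gsl-config --libs-without-cblas` omits it. The sketch below only illustrates that difference on a flag string; the helper name and the sample flags (including the `/path/to/gsl/lib` directory) are made up for illustration and are not taken from the WWJ Makefile.

```python
import shlex


def without_cblas(link_flags: str) -> str:
    """Drop -lgslcblas from a linker flag string (illustrative helper)."""
    kept = [flag for flag in shlex.split(link_flags) if flag != "-lgslcblas"]
    return " ".join(kept)


# Hypothetical output of `gsl-config --libs`; the library directory is a placeholder.
libs = "-L/path/to/gsl/lib -lgsl -lgslcblas -lm"
print(without_cblas(libs))
```

Whether the executable can actually be linked without a CBLAS implementation depends on which gsl routines the code calls, so this is a question for the process authors rather than something the flags alone can answer.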
ok thank you both, I'll just remove that flag then, cheers |
SCRAM gsl tool hook |
cms-sw/cms-common#10 should fix the gsl scram runtime hook to set [a]
|
Hi @smuzaffar thanks a lot for taking care of it. Just for my understanding, so the cblas library must be linked to gsl in any case and it's not safe to simply remove it, right? Following yesterday's comments, that's what I've been doing but I can wait for the deployment of the new feature if that's the correct way of doing this |
@mlizzo , |
Thanks for the clarification, I will wait for the new build and recompile the code again, cheers |
@mlizzo , cms-sw/cms-common#10 has been deployed on cvmfs . Can you please try rebuilding ? |
Hi @smuzaffar , it works perfectly:
Thank you very much for your prompt help |
Dear experts!
We are facing the error reported at https://cms-unified.web.cern.ch/cms-unified/showlog/?search=task_HIG-RunIISummer20UL16wmLHEGENAPV-13411 in the WWJ + NNLOPS sample submission.
You can also check :
https://cms-unified.web.cern.ch/cms-unified/report/cmsunified_task_HIG-RunIISummer20UL16wmLHEGEN-14330__v1_T_230821_125219_1993
Here is the ongoing JIRA ticket for this request; the last few comments there are useful for debugging.
https://its.cern.ch/jira/projects/HIGHPRIOREQ/issues/HIGHPRIOREQ-631?filter=allissues
According to our previous MC contact Mattia, who created this gridpack (the discussion is also in the JIRA ticket), and judging from one of the log files [1], the Powheg executable fails because it cannot find the "libgsl.so.0" library. This is unexpected to us because 1) it never occurred during validation and 2) as far as we know, this is a common GNU library that is installed on every lxplus machine in /usr/lib (this can be checked by running "gsl-config --prefix"). Is it possible that this library is missing? We do not know where exactly the runcmsgrid script is executed and have no further clue, so we would be very grateful if someone could help us understand the root of this issue.
[1] cms-unified.web.cern.ch/cms-unified/joblogs/cmsunified_task_HIG-RunIISummer20UL16wmLHEGEN-14330__v1_T_230821_125219_1993/8001/HIG-RunIISummer20UL16wmLHEGEN-14330_0/09f37ff3-c996-49d0-8391-4e670bc78024-149-0-logArchive/job/WMTaskSpace/cmsRun1/cmsRun1-stdout.log
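One way to check this kind of failure outside the production system is to run `ldd` on the executable in the target environment and look for entries marked `not found`. The snippet below is a minimal sketch of that check; the helper name `missing_libraries` is made up here, and the sample `ldd` output is hypothetical, not copied from the job log.

```python
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the shared libraries that `ldd` reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libgsl.so.0 => not found"
        name, sep, target = line.strip().partition("=>")
        if sep and target.strip() == "not found":
            missing.append(name.strip())
    return missing


# Hypothetical `ldd` output for illustration only.
sample = """\
    linux-vdso.so.1 (0x00007ffd3a5f2000)
    libgsl.so.0 => not found
    libm.so.6 => /lib64/libm.so.6 (0x00007f1c2a000000)
"""
print(missing_libraries(sample))
```

A library that resolves on lxplus can still be reported as missing on a grid node if it was taken from the OS rather than from a location distributed with the software.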
Please let us know if we need to provide anything else.
Thanks and Regards,
Sadhana Verma
cc'ing @sunilUIET too here !
@sunilUIET Please feel free to add other responsible people to this issue who can help us in this regard.
Best,
Sadhana for HWW MC contact