Porting Pixel Tracks to Alpaka [Not to Merge] #41117

Closed
wants to merge 35 commits into from

AdrianoDee
Contributor

@AdrianoDee AdrianoDee commented Mar 21, 2023

PR description:

Common work with @borzari and @nothingface0.

This PR makes it possible to run the Pixel Tracks reconstruction in Alpaka. It is still a work in progress and needs to be properly tested. We are opening it so that it is (more) public and can be reviewed by experts.

We will update the description as the PR evolves.

This includes #40932 with the latest comments received addressed.

This PR is not meant to be merged; it is here for testing purposes. It has been split into 8 smaller PRs, to be merged in sequence, to ease the review:

(@ericcano)


21st November

Tested with #43064, everything is fine. Some general clean-up and renaming:

  • all the SoA DataFormats are now in the form DataFormats/XYZSoA/;
  • the SoA classes are now uniformly named XYZHost, XYZDevice, XYZsSoACollection;
  • started replacing all the remaining *GPU names in the Alpaka code with either *Device or nothing (e.g. GPUAlgo -> Algo);
  • fixed the CopyToHost methods to avoid a useless specialization for the Host to Host copy;
  • added ASSERT_DEVICE_MATCHES_HOST_COLLECTION everywhere;
  • used the "automatic dictionary" generator with SET_PORTABLEHOSTCOLLECTION_READ_RULES;
  • used std::conditional_t for the Host/Device collection definition.

The resolution problem was solved by @borzari spotting this (a great catch!):

--- a/RecoTracker/PixelTrackFitting/interface/alpaka/BrokenLine.h
+++ b/RecoTracker/PixelTrackFitting/interface/alpaka/BrokenLine.h
@@ -121,7 +121,7 @@ namespace ALPAKA_ACCELERATOR_NAMESPACE {
       scalar tempC = -rho * y0 + tempSmallU * cosPhi;
       scalar tempB = rho * x0 + tempSmallU * sinPhi;
       scalar tempA = 2. * deltaOrth + rho * (riemannFit::sqr(deltaOrth) + riemannFit::sqr(deltaPara));
-      scalar tempU = alpaka::math::sin(acc, 1. + rho * tempA);
+      scalar tempU = alpaka::math::sqrt(acc, 1. + rho * tempA);
 
       // Intermediate computations for the error matrix transform
       scalar xi = 1. / (riemannFit::sqr(tempB) + riemannFit::sqr(tempC));
diff --git a/RecoTracker/PixelTrackFitting/plugins/PixelTrackDumpAlpaka.cc b/RecoTracker/PixelTrackFitting/plugins/PixelTrackDumpAlpaka.cc
index 7524fa012eb..2b70db60900 100644
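
To see why the one-word change in the diff above matters, here is a minimal numerical sketch in Python (not the Alpaka code itself). It assumes only the expression shown in the diff, where tempU is computed from 1 + rho * tempA; the function names and the sample values of rho and tempA are hypothetical and chosen purely for illustration.

```python
import math


def temp_u_buggy(rho: float, temp_a: float) -> float:
    """Expression before the fix: sine of (1 + rho * tempA)."""
    return math.sin(1.0 + rho * temp_a)


def temp_u_fixed(rho: float, temp_a: float) -> float:
    """Expression after the fix: square root of (1 + rho * tempA)."""
    return math.sqrt(1.0 + rho * temp_a)


# Hypothetical sample values, for illustration only.
for rho, temp_a in [(0.0, 1.0), (0.01, 0.5), (0.1, 0.5)]:
    buggy = temp_u_buggy(rho, temp_a)
    fixed = temp_u_fixed(rho, temp_a)
    print(f"rho={rho:<5} tempA={temp_a:<4} sin: {buggy:.4f}  sqrt: {fixed:.4f}")
```

When rho * tempA vanishes the square root is exactly 1, while the sine is sin(1), roughly 0.84, so the two expressions disagree by about 16% even in that limit.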

15th November

This now includes #43064 up to 5f9c2e6.


19th October

We will use this PR as a proxy for the full development in order to be able to run the integration tests. Changing the status to "Ready to review" to be able to run the bot commands and checks.

Module Naming

For the moment we have applied the following naming rule:

  1. where the module had CUDA in its name, we simply drop the CUDA suffix;
  2. where this is not possible or doesn't apply, we append Alpaka to the module name.

Rule 2 usually applies to the SoA-to-legacy converters.
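
The naming rule above can be sketched as a small Python helper. This is only an illustration: the function name, the example module names and the `taken` argument are hypothetical, and reading "not possible" as "the shortened name is already in use" is an assumption, not something stated in this PR.

```python
def alpaka_module_name(name: str, taken: frozenset = frozenset()) -> str:
    """Return the Alpaka-side name for a module, following rules 1 and 2.

    `taken` is a hypothetical set of names that are already in use; a
    collision with it is treated as the "not possible" case of rule 1.
    """
    base = name.replace("CUDA", "")
    if base != name and base not in taken:
        return base  # rule 1: the name contained CUDA, so just drop it
    return base + "Alpaka"  # rule 2: otherwise append Alpaka


# Hypothetical module names, for illustration only.
print(alpaka_module_name("PixelClusterizerCUDA"))
print(alpaka_module_name("PixelHitConverter"))
print(alpaka_module_name("PixelClusterizerCUDA", taken=frozenset({"PixelClusterizer"})))
```

The last call shows the fallback: when the shortened name is already taken, the helper appends Alpaka to it instead.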

Additional workflows

An alpaka process modifier is added together with a set of new workflows:

  • *.55 running Pixel only in Alpaka;
  • *.554 running Pixel only in Alpaka for profiling;
  • *.557 running Pixel only in Alpaka for CPU vs GPU validation.

A note: to coexist with the CUDA workflows, the modules that provide the conversion to legacy formats still have to go through the SwitchProducerCUDA logic. For example, in the local reco configuration, siPixelRecHitsPreSplitting is defined as:

# SwitchProducer wrapping the legacy pixel rechit producer
siPixelRecHitsPreSplitting = SwitchProducerCUDA(
    cpu = siPixelRecHits.clone(
        src = 'siPixelClustersPreSplitting'
    )
)

and, in order to be able to modify or replace it with toModify or toReplaceWith, the alpaka modifier acts on the cpu branch of the SwitchProducerCUDA:

(alpaka & ~phase2_tracker).toModify(siPixelRecHitsPreSplitting,
    cpu = _siPixelRecHitFromSoAAlpakaPhase1.clone(
            pixelRecHitSrc = cms.InputTag('siPixelRecHitsPreSplittingAlpaka'),
            src = cms.InputTag('siPixelClustersPreSplitting'))
)

This was the only way we found to keep the same naming for the final AoS products.
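
The mechanism described above can be illustrated with a toy model in plain Python. It deliberately does not use the CMSSW configuration API: `ToySwitchProducer`, `ToyModifier` and `to_modify` are hypothetical stand-ins, and the branch contents are just descriptive strings. It only shows the idea that the switch keeps its label while the content of its cpu branch is swapped when the modifier is active.

```python
from dataclasses import dataclass, field


@dataclass
class ToySwitchProducer:
    """Toy stand-in for a switch producer: one module per named branch."""
    branches: dict = field(default_factory=dict)


@dataclass
class ToyModifier:
    """Toy stand-in for a process modifier that is either active or not."""
    active: bool

    def to_modify(self, switch: ToySwitchProducer, **new_branches) -> None:
        # Only an active modifier changes the configuration.
        if self.active:
            switch.branches.update(new_branches)


# Default configuration: the cpu branch holds the legacy producer.
rec_hits = ToySwitchProducer(branches={"cpu": "legacy rechit producer"})

# With the modifier active, the cpu branch is swapped for the converter
# that reads the Alpaka SoA; the switch itself keeps the same label.
alpaka = ToyModifier(active=True)
alpaka.to_modify(rec_hits, cpu="SoA-to-legacy converter reading the Alpaka product")

print(rec_hits.branches["cpu"])
```

Downstream consumers that refer to the switch by its label therefore do not need to change, which is the point of keeping the same naming for the final products.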

Run3 Physics Results

Find here all the validation plots from MTV for Run3 ttbar.

The results overlap almost perfectly, with the exception of the $d_{xy}$ resolution, which is degraded (see e.g. here). We are investigating this and believe we have spotted the culprit.

Run3 Throughput

Running a profiling workflow on Run3 data (Run 370293) on fu-c2a02-37-02, we see a performance degradation of around 20% in throughput.

Note that when running a single EDM stream CUDA and Alpaka throughput are the same.


20th October

Tests fixed with 66f48f9 (thanks to @ericcano). For the moment the testOneHistoContainer tests are commented out, since the issue is solved in #43064. RecoTracker/PixelTrackFitting/testEigenGPUNoFit_t also fails in a clean CMSSW_13_3_X_2023-10-18-1100.

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41117/34753

ERROR: Build errors found during clang-tidy run.

/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-4b4e61dd13deaab9037e250657f620e6/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:60:41: error: an attribute list cannot appear here [clang-diagnostic-error]
    static constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE void copyFromDense(TrackSoAView &tracks,
                                        ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-4b4e61dd13deaab9037e250657f620e6/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:71:41: error: an attribute list cannot appear here [clang-diagnostic-error]
    static constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE void copyToDense(const TrackSoAConstView &tracks,
                                        ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-4b4e61dd13deaab9037e250657f620e6/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:83:41: error: an attribute list cannot appear here [clang-diagnostic-error]
    static constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE int computeNumberOfLayers(const TrackSoAConstView &tracks,
                                        ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-4b4e61dd13deaab9037e250657f620e6/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:97:41: error: an attribute list cannot appear here [clang-diagnostic-error]
    static constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE int nHits(const TrackSoAConstView &tracks, int i) {
                                        ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-4b4e61dd13deaab9037e250657f620e6/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
gmake: *** [config/SCRAM/GMake/Makefile.coderules:129: code-checks] Error 2
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41117/34770

ERROR: Build errors found during clang-tidy run.

/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:60:34: error: an attribute list cannot appear here [clang-diagnostic-error]
    constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE static void copyFromDense(TrackSoAView &tracks,
                                 ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:71:34: error: an attribute list cannot appear here [clang-diagnostic-error]
    constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE static void copyToDense(const TrackSoAConstView &tracks,
                                 ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:83:34: error: an attribute list cannot appear here [clang-diagnostic-error]
    constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE static int computeNumberOfLayers(const TrackSoAConstView &tracks,
                                 ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:97:34: error: an attribute list cannot appear here [clang-diagnostic-error]
    constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE static int nHits(const TrackSoAConstView &tracks, int i) {
                                 ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
gmake: *** [config/SCRAM/GMake/Makefile.coderules:129: code-checks] Error 2
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41117/34807

ERROR: Build errors found during clang-tidy run.

/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:60:34: error: an attribute list cannot appear here [clang-diagnostic-error]
    constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE static void copyFromDense(TrackSoAView &tracks,
                                 ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:71:34: error: an attribute list cannot appear here [clang-diagnostic-error]
    constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE static void copyToDense(const TrackSoAConstView &tracks,
                                 ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:83:34: error: an attribute list cannot appear here [clang-diagnostic-error]
    constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE static int computeNumberOfLayers(const TrackSoAConstView &tracks,
                                 ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
DataFormats/Track/interface/alpaka/PixelTrackUtilities.h:97:34: error: an attribute list cannot appear here [clang-diagnostic-error]
    constexpr ALPAKA_FN_HOST_ACC ALPAKA_FN_INLINE static int nHits(const TrackSoAConstView &tracks, int i) {
                                 ^
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02777/el8_amd64_gcc11/external/alpaka/develop-20230215-1b72c1c284fb9ec0aa4ade70e3a15b56/include/alpaka/core/Common.hpp:89:34: note: expanded from macro 'ALPAKA_FN_INLINE'
--
gmake: *** [config/SCRAM/GMake/Makefile.coderules:129: code-checks] Error 2
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

@tvami
Contributor

tvami commented Mar 23, 2023

@AdrianoDee in case you didn't notice: you'll need to run the code checks

- general renaming to have *Device, *Host, *Collection data formats
- consistent package naming with XSoA
- fix for resolutions
- adding various new functionalities to all data formats (automatic dictionaries for Host SoA and Device-Host assert, ...)

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41117/37986
