
Merge develop into CK_upstream (Please don't squash when merging) #1165

Merged: 40 commits merged on Sep 21, 2021

@asroy (Contributor) commented Sep 21, 2021

CK_upstream is the branch used to sync code between MIOpen and CK (Composable Kernel).

Please don't squash when merging develop into CK_upstream, so that the commit history stays simple.

shaojiewang and others added 30 commits August 22, 2021 00:09
* fix typo in ctest is_anabled->is_enabled

* Extend timeout for retry stage w/ historical data

Co-authored-by: Jun Liu <Liu.Jun@amd.com>
* Remove the num_cu and device name from the db classes

* Fix db installation and remove miopen.db

* pdb fallback if cu count no match

* Enable fallback for missing perf info

* handle empty handle

* handle string exceptions, add cu count to files.

* convert function to lambda

* convert function to lambda

* remove unsupported embed target

* update embedded db logic

* guard miopen_data include

* Remove unable

* fix test verification error due to fft

* address review comments: reduce logging verbosity

* update perf data

* vacuum files and remove old db file

* fix cmake prompt help
…ESTS] Support for GFX90A_ENABLED/DISABLED. (#1110)

* remove test_conv2d from skip test for gfx90a

* add gfx90a to blacklist of conv_winoRxS_f2x3

* skip test_conv2d for gfx90a

* use develop's ctest file

* remove fix in solver. working for W/A in ctest

* add W/A in winoRxS f2x3 solver
* [MLIR] Fixing WRW invoker by setting zero to weights

* Set dw tensor to zero for only fp32 case
* Update rocm version to 4.3

* address return brace init list tidy check

* disable navi timeout

Co-authored-by: Jun Liu <Liu.Jun@amd.com>
#1108)

* add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files

* make inner product compatible on gfx900

* Update src/include/miopen/solver/ck_utility_common.hpp

* compiler parameter use stream

* use int instead of index_t in kernel wrapper

* DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element

* Add dynamic generic reduction kernel layer (kernel wrappers, kernel implementations and utilities)

* Some updates to dynamic composable kernel facility for the need of dynamic generic reduction

* Update to generic reduction C++ host interface layer to support dynamic generic reduction

* Update to remove tidy complaints in host interface layer

* Change the unary operator form from void op(T &x) to T op(T x)

* Update to pass single workspace pointer for all kernels (fix for OpenCL backend)

* Use cppcheck-suppress to prevent some strange warnings

* Re-use operator [] and () for DynamicBuffer and update to depending codes

* Remove useless codes in first call threadwise/warpwise/blockwise kernel wrappers

* [performance] Remove un-needed local buffer initialization

Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: JD <Jehandad.Khan@amd.com>
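
The commit list above mentions changing the unary operator form from `void op(T &x)` to `T op(T x)`. As a rough illustration of what such a change looks like, here is a minimal sketch. The functor names `SquareInPlace` and `Square` and the helper `apply_to_all` are hypothetical and are not taken from the MIOpen or CK sources.

```cpp
#include <vector>

// Hypothetical "before" form: the operator mutates its argument in place.
struct SquareInPlace
{
    template <typename T>
    void operator()(T& x) const
    {
        x = x * x;
    }
};

// Hypothetical "after" form: the operator takes a value and returns the result.
struct Square
{
    template <typename T>
    T operator()(T x) const
    {
        return x * x;
    }
};

// Illustrative helper: applies a value-returning unary operator to each element
// and collects the results in a new vector.
template <typename T, typename UnaryOp>
std::vector<T> apply_to_all(const std::vector<T>& in, UnaryOp op)
{
    std::vector<T> out;
    out.reserve(in.size());
    for(const T& v : in)
        out.push_back(op(v));
    return out;
}
```

One plausible benefit of the value-returning form (an assumption, since the commit message gives no rationale) is that calls can be nested in expressions, for example `Square{}(Square{}(2))`, and applied to temporaries or const values without a separate mutable variable.
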
* make IsTunable function available from anysolver

git-subtree-dir: fin
git-subtree-split: 684ce5b0fc5853cb2b23213e94f4ea3ef3821745
…default for HIP backend && ROCm 4.3 (#1125)

* disable dynamic reduction by default for HIP backend && ROCm 4.3

* [CI] Upgrade ROCm to 4.3.1

Co-authored-by: Jun Liu <Liu.Jun@amd.com>
* add gfx90a asm igemm config part

* add asm igemm kernels for gfx90a

* debug version

* remove debug code

* enable fwd and bwd asm igemm for nhwc layout

* rename symbol

* fix bug in fwd kernel

* enable gfx90a test for asm igemm nhwc

* fix error in cmake function

* use GFX90a instead of GFX90A

* WA 1093 disable test_conv2d

* remove test_conv2d from skip test for gfx90a

* re-open test_conv2d for ctest

* add gfx90a to blacklist of conv_winoRxS_f2x3

* skip test_conv2d for gfx90a

* fix typo

Co-authored-by: Jun Liu <Liu.Jun@amd.com>
* Fwd spatial multi-kernel bnorm solver

* Scope reduction

* Fixed solver ids order after merges

Co-authored-by: Vasilii Filippov <vfilippo@amd.com>
* Reduce number of dropout test case

* define dropout single ctest
…=4x, 2x (#1114)

* add support for several NHWC bwd ssd config, when k=4x, 2x

* add ctest
* Implement tuning support for ConvMlirIgemmFwd

* Bump MLIR commit to latest

Co-authored-by: Jun Liu <Liu.Jun@amd.com>
* Implemented the 3rd solver

* format

Co-authored-by: Vasilii Filippov <vfilippo@amd.com>
…n error in small cyx case (#1137)

* add gfx908 fwd kernel for fix small cyx

* add gfx90a fwd kernel for fix small cyx

* add ctest for the failed case
…e on ROCM 4.3 (#1131)

* Fix the calculation of ws_buf2_bytes_offset for dynamic reduction in src/reducetensor.cpp

* Just remove IsDynamicReductionEnabled()

* Tiny fix in ReduceTensorDescriptor::GetWorkspaceSize()

* Update to the calculation of ws_buf2_bytes_offset
* Implement tuning support for ConvMlirIgemmFwdXdlops

* Narrowing KPACK size tuning range to 4/8
* disable all asm kernels if xnack enabled

* remove test_find_db, test_main, test_immed_conv2d from skip tests for gfx90a

* fix clang-tidy

Co-authored-by: Jun Liu <Liu.Jun@amd.com>
* Fixed missing checkNumericsOutput in bnorm forward

Co-authored-by: Vasilii Filippov <vfilippo@amd.com>
Co-authored-by: Jun Liu <Liu.Jun@amd.com>
* [Test][CI] Fix #1064 #1095 remove work arounds for test_dropout

* Resolve merge issues
* ci-add-parameter-gpu-type(01) Revert "[TESTS][Navi21] Move Navi21 tests for release schedule (#1135)"

This reverts commit 8b2f260.

* Remove redundant DISABLE_ALL_STAGES parameter

* ci-add-parameter-gpu-type(03) Refactor SMOKE_MIOPENTENSILE_LATEST

* ci-add-parameter-gpu-type(04) Formatting

* ci-add-parameter-gpu-type(05) Finally remove redundant DISABLE_ALL_STAGES

* ci-add-parameter-gpu-type(06) Remove unused BUILD_PACKAGES

* ci-add-parameter-gpu-type(07) Add all TARGET params. Support selection of TARGET_NOGPU

* ci-add-parameter-gpu-type(08) Support selection of VEGA targets

* ci-add-parameter-gpu-type(09) Support selection of TARGET_GFX908

* ci-add-parameter-gpu-type(10) Support selection of TARGET_GFX90A

* ci-add-parameter-gpu-type(11) Support selection of TARGET_NAVI21

* ci-add-parameter-gpu-type(12) Re-implement [TESTS][Navi21] Move Navi21 tests for release schedule (#1135)

* ci-add-parameter-gpu-type(13) Remove WORKAROUND_iGemm_936 from FP32 tests (resolve https://github.com/ROCmSoftwarePlatform/MIOpen/pull/147/files#r706276953)
Slimakanzer and others added 6 commits September 14, 2021 14:40
* explicitly disable asm solvers for gfx90a

* revert unnecessary gfx90a restrictions

* fix ConvBinWinogradRxS formatting
* Bnorm backward solver core

* Added return and CheckNumerics to the case when invoker was found.

Co-authored-by: Vasilii Filippov <vfilippo@amd.com>
* tuning updates for 4.4

* [TESTS][Vega][WORKAROUND] disable test_conv_embed_db on Vega

Co-authored-by: Jehandad Khan <jahandad@gmail.com>
Co-authored-by: Artem Tamazov <artem.tamazov@gmail.com>
Co-authored-by: Jun Liu <Liu.Jun@amd.com>
…ult perf_config (#1159)

* [MLIR] Add heuristic init request to represent the default perf_config

* Refactor bwd with default perf_config

* Refactor fwd xdlops with default perf_config

* Refactor bwd xdlops with default perf_config

* Address review feedbacks
@asroy requested a review from junliume on September 21, 2021 01:55
junliume previously approved these changes Sep 21, 2021

@junliume (Collaborator) left a comment

@asroy this PR is more of a procedure to keep consistency and history. Please be careful not to select the "default" merge option; select from the drop-down instead. The first option would be okay?

* Revert find-db changes from #1133 "[Tuning] rocm 4.4 update for develop"

* Revert "Revert find-db changes from #1133 "[Tuning] rocm 4.4 update for develop""

This reverts commit 86e7e0a.

* Revert "[Tuning] rocm 4.4 update for develop (#1133)"

This reverts commit 068b6b1.
@atamazov (Contributor)

@asroy The CK_upstream branch is not protected, so you can merge this PR (but approval is still required).

BTW I do not see any special need to create PRs for updates of CK_upstream, but you know better.

atamazov previously approved these changes Sep 21, 2021
…n' before entering agent. Syntax fixes/workarounds. (#1162)

* Squashed commit of the following:

commit a062e5f
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Sat Sep 18 00:15:17 2021 +0300

    [CI] 3rd fix attempt

commit 5a6ab2d
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Fri Sep 17 18:46:00 2021 +0300

    [CI] More syntax fixes

commit 7546281
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Fri Sep 17 18:29:58 2021 +0300

    [CI] Fix syntax error

commit c4aa858
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Fri Sep 17 18:12:13 2021 +0300

    [CI] Fix: evaluate TARGET param in 'when' before entering agent.

commit 43ee6c4
Merge: 7bf603a 7177b7c
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Thu Sep 16 00:26:16 2021 +0300

    Merge branch 'develop' into wip-rocmtest-trial-vega10

    # RESOLVED Conflicts:
    #	Jenkinsfile

commit 7bf603a
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Wed Jun 30 21:37:56 2021 +0300

    Remove gfx908 from Smoke MLIR

commit abe0930
Merge: ea52135 afbcf4c
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Wed Jun 30 21:20:33 2021 +0300

    Merge branch 'develop' into wip-rocmtest-trial-vega10

commit ea52135
Merge: 691f4cf 7e51452
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Wed Jun 30 21:19:34 2021 +0300

    Merge branch 'develop' into wip-rocmtest-trial-vega10

    # RESOLVED Conflicts:
    #	Jenkinsfile

commit 691f4cf
Merge: 0523a04 856055f
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Fri Apr 30 01:48:05 2021 +0300

    Merge branch 'develop' into wip-rocmtest-trial-vega10

    # RESOLVED Conflicts:
    #	Jenkinsfile

commit 0523a04
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Mon Apr 26 17:38:17 2021 +0300

    Disable SMOKE_MIOPENTENSILE_LATEST (Vega20, gfx908)

commit eb5a1f3
Merge: 4d84648 312cc22
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Mon Apr 26 17:34:19 2021 +0300

    Merge branch 'develop' into wip-rocmtest-trial-vega10

commit 4d84648
Merge: 05c96e7 00d5754
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Wed Apr 21 16:36:53 2021 +0300

    Merge branch 'develop' into wip-rocmtest-trial-vega10

    # RESOLVED Conflicts:
    #	Jenkinsfile

commit 05c96e7
Merge: 9a642d4 6dc32fe
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Wed Apr 14 23:35:56 2021 +0300

    Merge branch 'develop' into wip-rocmtest-trial-vega10

    # RESOLVED Conflicts:
    #	Jenkinsfile

commit 9a642d4
Author: omkar kakarparthi <omkarkakarparthi@omkars-MacBook-Pro.local>
Date:   Sat Apr 10 15:56:22 2021 -0500

    adding back rocm-3.7

commit 4da4fcc
Author: omkar kakarparthi <omkarkakarparthi@omkars-MacBook-Pro.local>
Date:   Sat Apr 10 01:06:47 2021 -0500

    adding rocm-4.1

commit 7ac2301
Merge: 1a57bab 1825f34
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Thu Apr 8 00:08:48 2021 +0300

    Merge branch 'develop' into wip-rocmtest-trial-vega10

    # RESOLVED Conflicts:
    #	Jenkinsfile

commit 1a57bab
Author: Artem Tamazov <artem.tamazov@gmail.com>
Date:   Mon Mar 29 23:26:29 2021 +0300

    Targeted to rocmtest-trial nodes. Removed all tests except intended for vega/vega10.

* [CI] HOTFIX: Add docker user to the `render` group for Ubuntu 20.04

* [CI] Add render group to the dockerfile
@junliume (Collaborator)

@asroy new reviews will be dismissed whenever develop is updated. Since we will do a staging at the current commit (2d7b2dd) anyway, I am merging it back to CK.

@junliume merged commit 1108ea5 into CK_upstream on Sep 21, 2021
ltqin pushed a commit that referenced this pull request on Oct 28, 2021
646fcc268 Merge pull request #47 from ROCmSoftwarePlatform/develop
6014185ac [Bug Fix] GridwiseGemm_bk0mk1_bk0nk1_mn_xdlops_v2r4 loop issue (#44)
3e9113707 Merge pull request #46 from ROCmSoftwarePlatform/miopen_downstream_all
211dae822 Merge branch 'develop' into miopen_downstream_all
5890e3007 [Composable Kernel] update develop branch code to ck_upstream
d5297abae fix bug in gridwise gemm xdlops v2r3 (#45)
38a90b6ed Merge pull request #43 from ROCmSoftwarePlatform/develop
c3018794b bug fix (#39)
fd49ff808 add nchw atomic , nhwc and nhwc atomic method   for backward weight (#30)
b2dc55f82 [MIOpen Downstream] Fix Reduction Kernel (#34)
b3e8d57d5 Tweak GEMM kernel (#38)
846f462bd Add VectorType support into StaticBuffer (#27)
dfb80c4e3 [Enhancements] Several bugfixes and refactoring of dynamic generic reduction  (#1156)
8557901d0 Merge pull request #1165 from ROCmSoftwarePlatform/develop
f305bebdc Merge pull request #31 from ROCmSoftwarePlatform/miopen_downstream-dynamic_reduction_pr
b725e3fc8 Merge remote-tracking branch 'origin/develop' into miopen_downstream-dynamic_reduction_pr
88833bd9a Merge pull request #32 from ROCmSoftwarePlatform/develop
df0d68106 :Merge remote-tracking branch 'origin/develop' into CK_upstream
f3acd2510 Add  a version of Merge transform that use integerdivision and mod (#25)
19613902b GEMM driver and kernel (#29)
627d8ef35 Backward weight v4r4r2 with xdlops (#18)
10bb81106 Misc fixes (#24)
9e80cdceb [SWDEV-281541][MSRCHA-100] Implementation of Dynamic Generic Reduction  (#1108)
a7a758d8c GlobalAtomicAdd for fp32/int32 (#23)
9d3f634a3 Xdlops refactor fix (#22)
c6f26bb48 magic division use __umulhi() (#19)
6fe3627a9 Composable kernel init integration v3 (#1097)
a2ad6d353 refactor dynamic xdlops iGemm (#13)
ba6f79a75 Added host_conv_wrw for verification (#15)

git-subtree-dir: src/composable_kernel
git-subtree-split: 646fcc268ede841a16cdaafb68aa64803d8390e1
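
One of the subtree commits listed above, c6f26bb48, is titled "magic division use __umulhi()". For readers unfamiliar with the technique, the following is a minimal host-side sketch of magic-number division in general: division by a fixed divisor is replaced with a multiply-high, an add, and a shift. It is not the CK code. The names `umulhi`, `MagicNumbers`, `calculate_magic_numbers`, and `magic_divide` are illustrative assumptions, and `umulhi` is a portable stand-in for the GPU intrinsic `__umulhi()`.

```cpp
#include <cstdint>

// Portable stand-in for the GPU intrinsic __umulhi(a, b):
// returns the upper 32 bits of the 64-bit product a * b.
inline std::uint32_t umulhi(std::uint32_t a, std::uint32_t b)
{
    return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) >> 32);
}

// Hypothetical container for the precomputed constants of one divisor.
struct MagicNumbers
{
    std::uint32_t multiplier;
    std::uint32_t shift;
};

// Precompute the constants once per divisor (assumes divisor >= 1).
// shift is the smallest value with 2^shift >= divisor.
inline MagicNumbers calculate_magic_numbers(std::uint32_t divisor)
{
    std::uint32_t shift = 0;
    while((std::uint64_t{1} << shift) < divisor)
        ++shift;

    const std::uint64_t one = 1;
    const std::uint64_t multiplier =
        ((one << 32) * ((one << shift) - divisor)) / divisor + 1;

    return {static_cast<std::uint32_t>(multiplier), shift};
}

// Compute dividend / divisor using only a multiply-high, an add, and a shift.
// The sum is formed in 64 bits so that it cannot wrap around.
inline std::uint32_t magic_divide(std::uint32_t dividend, MagicNumbers m)
{
    const std::uint64_t sum =
        static_cast<std::uint64_t>(umulhi(dividend, m.multiplier)) + dividend;
    return static_cast<std::uint32_t>(sum >> m.shift);
}
```

In a kernel, the constants would typically be computed once on the host and passed in, so that each thread avoids an integer division, for example `magic_divide(n, calculate_magic_numbers(7))` in place of `n / 7`.
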