-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MIOpenTensile Part1 : Add MIOpenTensile in GEMM path #139
Conversation
Congratulations 🎉. DeepCode analyzed your code in 2.794 seconds and we found no issues. Enjoy a moment of no bugs ☀️. 👉 View analysis in DeepCode’s Dashboard | Configure the bot |
The Invokers work is complicated by frequent changes in the baseline code. For example, recent additions of ImplicitGemm solvers have created substantial additional work for @DrizztDoUrden, and, thus have delayed the overall Invoker's ETA. @DrizztDoUrden is also reworking Gemm code to Solvers and Invokers. This work is currently paused because Vasya is doing something more important (ImplicitGemm Invokers), but I heard it is almost ready. If this PR will be merged first, then its code will need to be reworked once again almost immediately after merging, and that is we all would like to avoid, I guess. I am not aware of the priorities of this MIOpenTensile thing. @daniellowell @ce1adon @DrizztDoUrden could you please communicate and decide the optimal order? |
Doesn't matter much unless it's conv...ocl.cpp changes, which are not here. I don't change gemm internals there. |
Okay, then. @ce1adon Could you please set priority label. |
@ce1adon Please describe how the PR affects the library's behavior. Or this is NFC? |
It's not a new feature. The influence is updated in top comment. |
Jira ticket related to the blocker of this PR |
@ce1adon Up to you. However, AFAIU |
hip-clang.docker
Outdated
RUN export HIPCC_LINK_FLAGS_APPEND='-O3 -parallel-jobs=4' | ||
RUN export HIPCC_COMPILE_FLAGS_APPEND='-O3 -Wno-format-nonliteral -parallel-jobs=4' | ||
# install last released miopentensile in default, install latest commits when MIOTENSILE_VER="latest" | ||
RUN if [ "$MIOTENSILE_VER" = "latest" ] ; then cget -p $PREFIX install -DAMDGPU_TARGETS=${GPU_ARCH} ROCmSoftwarePlatform/MIOpenTensile@e258bbaed0bd9de546ea38c9a5c42f71fa41d9a0; else cget -p $PREFIX install -DAMDGPU_TARGETS=${GPU_ARCH} ROCmSoftwarePlatform/MIOpenTensile@2e3e792b2674bf8cdf2620749a298ed5313351bb; fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-DAMDGPU_TARGETS=${GPU_ARCH}
should be passed cget init
. Also this should install tensile from a file so that hcc and hip-clang install the same version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will move gpu_arch to init.
Currently both hcc and hip-clang are using the same version. The version divergence here is both for hip-clang. It intends to test both a stable version and a latest version of yaml+tensile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved gpu_arch flag to cget init
…-clang dockerfile
The last test passed but is marked as failed. |
NOTE: d0969a0 passed at # 38 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* add miopentensile path * use miotensile in hip backend * update miotensile * generalize miotensile stride * disable miopentensile in ocl backend * remove ocl backend consecutive identical branches * disable use of miotensile in hip-clang * update miopentensile version * use const arg in gemm * miot introduce transpose feature * update miot version * disable miotensile in hip-clang in cmakelists * disable miotensile except in hcc in cmakelists * disable ocl wrw2 solvers * enable miopentensile built with hipclang * add miotensile in hip-clang docker * add python3-venv in hip-clang docker * hip-clang docker update * update hip-clang docker compiler * update miopentensile info * disable miotensile with hcc * set miopentensile off & add dedicated test stage * Revert "clang tidy" This reverts commit bd706c5. * Revert "disable ocl wrw2 solvers" This reverts commit 17f5a59. * workaround for issue 2534 * Adjust test config * Adjust gemm env_var * use miotensile in winograd3x3multipass solver * revert gemm enforce backend values * skip igemm tests in test stage iv (miotensile tests) * skip igemm tests in test stage iv * use parallel jobs for tensile * specify gpu architecture(s) in miotensile build process * fix jenkins docker arg * fix jenkins docker arg string * rename gpu target param for miotensile * specify gpu architecture(s) in Jenkinsfile buildJob * relocate miotensile gpu targets arg * add symlink to /opt/rocm in Dockerfile * revise Dockerfile to pass GPU architecture flag for dependencies * add jenkins tests for last/latest version of miopentensile * remove workaround for issue #2534 * winograd workaround condition update: current miopentensile doesn't support bf16/fp16 * fix jenkinsfile typo & move GPU architecture flag to cget init in hip-clang dockerfile * revise conv_multipass_wino3x3WrW GEMM workaround for miopentensile * miopentensile stage rename * reset miopentensile version in hcc * revert wino3x3WrW datatype constraint in miopentensile * remove miopentensile from dependency list & update version Co-authored-by: Jing Zhou <zhou@ixt-rack-95.local.lan>
OR (MIOPEN_TEST_BFLOAT16 AND ${PARSE_ALLOW_BFLOAT16}) | ||
OR (MIOPEN_TEST_HALF AND ${PARSE_ALLOW_HALF}) | ||
OR (MIOPEN_TEST_INT8 AND ${PARSE_ALLOW_INT8})) | ||
OR (MIOPEN_TEST_INT8 AND ${PARSE_ALLOW_INT8}) | ||
OR (NOT (MIOPEN_TEST_MIOTENSILE AND (NAME IN_LIST SKIP_TESTS)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ce1adon It should be
OR (MIOPEN_TEST_MIOTENSILE AND NOT (NAME IN_LIST SKIP_TESTS)))
Current code enables ALL tests when MIOPEN_TEST_MIOTENSILE=off, thus affecting non-miopentensile tests (half etc). I will fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is under the condition of (NOT (MIOPEN_TEST_INT8 OR MIOPEN_TEST_BFLOAT16 OR MIOPEN_TEST_HALF OR MIOPEN_TEST_MIOTENSILE)), which means this only applies to fp32 cases. Do you mean fp32 doesn't need to test all the cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is not "under" that confition. It is just one of ORed conditions. Alternate formatting:
if(
(NOT (MIOPEN_TEST_INT8 OR MIOPEN_TEST_BFLOAT16 OR MIOPEN_TEST_HALF OR MIOPEN_TEST_MIOTENSILE)) // (1)
OR (MIOPEN_TEST_BFLOAT16 AND ${PARSE_ALLOW_BFLOAT16}) // (2)
OR ( MIOPEN_TEST_HALF AND ${PARSE_ALLOW_HALF}) // (3)
OR (MIOPEN_TEST_INT8 AND ${PARSE_ALLOW_INT8}) // (4)
OR (NOT (MIOPEN_TEST_MIOTENSILE AND (NAME IN_LIST SKIP_TESTS))) // (5)
) ...
When MIOPEN_TEST_MIOTENSILE=Off, condition (5) is always true. This works properly for MIOpenTensile, that's why you've missed that, I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 318f480 (not yet merged to develop)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #750
If MIOpenTensile is the default for FP32, then shall we remove FP32 testing from "MIOpenTensile" stages in the Jenkinsfile? |
The purpose of this PR is to add MIOpenTensile path in GEMM algorithm for fp32 on hip backend. It replaces the default GEMM path through rocblas in HIP backend for fp32 type.