[Tensor reorder] Universal tensor transform feature, a fallback of batched transpose kernel #1419

aska-0096 · 2022-02-10T07:59:58Z

This kernel supports transforming the data layout of a 4-dimensional tensor (FP32, FP16, INT8) into any other layout.

The input tensor is treated as having the default order [0, 1, 2, 3] and can be reordered into any of the other 23 orders:
[0, 1, 3, 2], [0, 2, 1, 3], [0, 2, 3, 1], [0, 3, 1, 2], [0, 3, 2, 1],
[1, 0, 2, 3], [1, 0, 3, 2], [1, 2, 0, 3], [1, 2, 3, 0], [1, 3, 0, 2], [1, 3, 2, 0],
[2, 0, 1, 3], [2, 0, 3, 1], [2, 1, 0, 3], [2, 1, 3, 0], [2, 3, 0, 1], [2, 3, 1, 0],
[3, 0, 1, 2], [3, 0, 2, 1], [3, 1, 0, 2], [3, 1, 2, 0], [3, 2, 0, 1], [3, 2, 1, 0]
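
The count of 23 is simply 4! - 1: every permutation of four dimensions except the identity. As a minimal sketch (the helper name `non_identity_orders` is hypothetical and not part of MIOpen), the list above can be regenerated with the standard library:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

using Order = std::array<int, 4>;

// Hypothetical helper: returns every permutation of {0, 1, 2, 3}
// except the identity, in lexicographic order.
std::vector<Order> non_identity_orders()
{
    Order order = {0, 1, 2, 3};
    std::vector<Order> result;
    // std::next_permutation steps to the next lexicographic permutation,
    // so the very first call already moves past the identity.
    while(std::next_permutation(order.begin(), order.end()))
        result.push_back(order);
    assert(result.size() == 23); // 4! - 1
    return result;
}
```

The lexicographic order produced here matches the order in which the list above is written, starting at [0, 1, 3, 2] and ending at [3, 2, 1, 0].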

Commonly used INPUT & OUTPUT tensor reorders:
NCHW to NCHWc = [0, 2, 3, 1], NHWC to NCHW = [0, 3, 1, 2], NCHW to NHWC = [0, 2, 3, 1]
FILTER tensor reorders that may be used:
kyxc to cyxkc = [2, 1, 0, 3], kcyx to cyxkc = [1, 3, 0, 2]
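
To make the order notation concrete, here is a host-side reference sketch. It is not the GPU kernel from this PR. The function name `reorder_4d` is hypothetical, and the convention (output dimension i is input dimension order[i]) is an assumption inferred from the NCHW to NHWC = [0, 2, 3, 1] example above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, assuming a packed (row-major) input tensor.
// Assumed convention: output dimension i is input dimension order[i].
template <typename T>
std::vector<T> reorder_4d(const std::vector<T>& src,
                          const std::array<std::size_t, 4>& dims,
                          const std::array<int, 4>& order)
{
    assert(src.size() == dims[0] * dims[1] * dims[2] * dims[3]);

    // Strides of the packed input tensor.
    const std::array<std::size_t, 4> stride = {
        dims[1] * dims[2] * dims[3], dims[2] * dims[3], dims[3], 1};

    // Lengths of the output tensor, dimension by dimension.
    const std::array<std::size_t, 4> out = {
        dims[order[0]], dims[order[1]], dims[order[2]], dims[order[3]]};

    std::vector<T> dst(src.size());
    std::size_t o = 0; // linear output index, advances in row-major order
    for(std::size_t i0 = 0; i0 < out[0]; ++i0)
        for(std::size_t i1 = 0; i1 < out[1]; ++i1)
            for(std::size_t i2 = 0; i2 < out[2]; ++i2)
                for(std::size_t i3 = 0; i3 < out[3]; ++i3)
                    dst[o++] = src[i0 * stride[order[0]] + i1 * stride[order[1]] +
                                   i2 * stride[order[2]] + i3 * stride[order[3]]];
    return dst;
}
```

Under this convention, calling `reorder_4d<int>(src, {1, 2, 1, 3}, {0, 2, 3, 1})` on an NCHW tensor holding 0..5 interleaves the two channels, and applying [0, 3, 1, 2] to the result restores the original buffer.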

Among these orders,
[0, 1, 3, 2], [0, 2, 3, 1], [0, 3, 1, 2], [2, 3, 0, 1], [3, 0, 1, 2]
use the batched transpose kernel to achieve higher performance.
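
The selection described above amounts to a table lookup. The sketch below only illustrates that idea; the function name `uses_batched_transpose` is hypothetical, and the actual dispatch code in the PR may be structured differently.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

using Order = std::array<int, 4>;

// Hypothetical predicate: true for the five orders that the PR description
// says are served by the batched transpose kernel; every other order would
// fall back to the general reorder kernel.
bool uses_batched_transpose(const Order& order)
{
    static const std::array<Order, 5> batched = {{
        {0, 1, 3, 2}, {0, 2, 3, 1}, {0, 3, 1, 2}, {2, 3, 0, 1}, {3, 0, 1, 2}}};

    // The argument must be a permutation of {0, 1, 2, 3}.
    assert(std::is_permutation(order.begin(), order.end(), batched[0].begin()));

    return std::find(batched.begin(), batched.end(), order) != batched.end();
}
```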

shaojiewang

Do we need to add some ctest cases for tensor reorder? @carlushuang
Could you please attach some perf data?

src/hip/general_tensor_reorder_sol.cpp

src/include/miopen/tensor_reorder_util.hpp

aska-0096 · 2022-02-14T07:51:08Z

[tensor_reorder]FP32&FP16_perf_data_gfx908.txt
Here is the FP32&FP16 performance data.


aska-0096 · 2022-02-16T09:15:10Z

@atamazov gentle ping for request review on this PR :)

junliume · 2022-02-25T01:11:03Z

Ping reviewers: @shaojiewang @carlushuang @atamazov :)

junliume · 2022-03-11T02:00:33Z

@aska-0096 please ping reviewers to push this PR through. Thanks!

aska-0096 · 2022-03-11T02:02:50Z

@shaojiewang @carlushuang @atamazov ping for requesting review

Re-request review

junliume · 2022-03-14T23:10:05Z

@aska-0096 sorry for the late review, can we add some tests for the functionality?

aska-0096 · 2022-03-18T13:45:12Z

A functional test has been added in test/tensor_reorder.cpp.

atamazov

Sorry for delaying the review.

[Future] @DrizztDoUrden Can we align this tensor transform with the Solver/Solution architecture and then "fuse" solutions/invokers? What do you think?

test/tensor_reorder.cpp

atamazov · 2022-03-24T14:00:52Z

test/tensor_reorder.cpp

+    run_test<reorder_test<float, miopen::TensorReorderSolution>>();
+    run_test<reorder_test<uint16_t, miopen::TensorReorderSolution>>();
+    run_test<reorder_test<uint8_t, miopen::TensorReorderSolution>>();


The test should not cover all data types at once. It should support the following options: --float (the default), --half, --double, --int8 etc. Please look at how test_conv2d is designed.

Not resolved yet. #1481 says:

Modify CTest design, now only --all flag will conduct all the data type test while the default one is --float

This does not match the design of other tests. The data type should be controlled in a different way, as described in the comment above.

The --all option controls how many test configs should be tested. With it, a reasonably big set of configs is to be tested. Without it, only one. The test should provide a set of options that allows the user to specify a particular config for testing. This is how test_conv*d is designed.
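
The separation requested here can be sketched as follows. This is not MIOpen's test driver; the names `DataType`, `TestOptions`, and `parse_options` are hypothetical and only illustrate that --all and the data type flags are independent of each other.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class DataType { Float, Half, Double, Int8 };

struct TestOptions
{
    DataType type = DataType::Float; // --float is the default
    bool all      = false;           // --all only widens the set of configs
};

// Hypothetical parser: data type flags select the type, --all selects how
// many configs are run. Neither option changes the other.
TestOptions parse_options(const std::vector<std::string>& args)
{
    TestOptions opts;
    for(const auto& arg : args)
    {
        if(arg == "--all")
            opts.all = true;
        else if(arg == "--float")
            opts.type = DataType::Float;
        else if(arg == "--half")
            opts.type = DataType::Half;
        else if(arg == "--double")
            opts.type = DataType::Double;
        else if(arg == "--int8")
            opts.type = DataType::Int8;
        // Unknown arguments are ignored in this sketch.
    }
    return opts;
}
```

With this shape, `--all` alone still runs FP32, and `--all --half` runs the large config set in FP16.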

We can run the test like this:

./bin/test_tensor_reorder --all
./bin/test_tensor_reorder --double
./bin/test_tensor_reorder --float
./bin/test_tensor_reorder --half
./bin/test_tensor_reorder --int8

to specify the data type we want to test.
This is updated in PR #1515.

Good, but this won't fully resolve this review comment. Please ask questions if you need more info.

Luckily, this is not very important right now.

As I understand it, conv2d selects the test driver for each data type based on command-line arguments, and those arguments are defined in test/CMakeLists.txt using the add_custom_test function.
Do you mean we should add this ctest to test/CMakeLists.txt using the add_custom_test feature and control the data type via MIOPEN_TEST_FLOAT_ARG?

@aska-0096 My comment is about the test executable itself.

[Notice] WRT adding custom tests. Please look into test/CMakeLists.txt. You can see there:

file(GLOB TESTS *.cpp)
...
foreach(TEST ${TESTS})
    get_filename_component(BASE_NAME ${TEST} NAME_WE)
    add_test_executable(test_${BASE_NAME} ${TEST})
endforeach()

The above instructs CMake to build all .cpp files into executables and then add the executables to the test suite. Options like "--all", "--half" etc. are added to the command line of the executable in accordance with the MIOPEN_TEST_ALL, MIOPEN_TEST_HALF and similar CMake variables.

So you do not need to specifically add your new test executable to test/CMakeLists.txt, but you have to ensure that the executable accepts and properly handles options like "--all", "--float" etc. For example, "--all" should not affect the data type. Please revisit the first two review comments in this thread. Please also look into the source of the add_test_executable() CMake function for some details.

We use custom tests for adding specific test cases that are not covered by "--all", which can be handy for regression testing. That is why I've asked for options that allow specifying a single test case at #1419 (comment).

src/kernels/gpu_general_tensor_reorder_kernel/order.hpp

atamazov · 2022-03-24T14:13:32Z

test/tensor_reorder.cpp

+    }
+};
+
+int main()


The test should support running a single test case, so we need to add relevant options.

All test cases should be run only when --all is specified.

Not resolved in #1481. Please see #1419 (comment)

Updated in PR #1515

~~Let's assume this is resolved; #1419 (comment) can be used as a place for further discussion.~~

src/kernels/gpu_general_tensor_reorder_kernel/general_tensor_reorder.cpp

src/include/miopen/general_tensor_reorder_sol.hpp

src/hip/general_tensor_reorder_sol.cpp

src/include/miopen/tensor_reorder_util.hpp

aska-0096 · 2022-03-25T04:14:11Z

> Sorry for delaying the review.

Thanks for reviewing. I'll solve them as soon as possible.

atamazov · 2022-03-25T14:01:07Z

@aska-0096 The comments marked with 🔴 are the most important ones.

atamazov · 2022-04-08T18:59:47Z

@junliume @aska-0096 Please mark all 🟢 review comments as Resolved (I can't do that), thanks.

atamazov · 2022-04-10T22:22:14Z

> @junliume @aska-0096 Please mark all 🟢 review comments as Resolved (I can't do that), thanks.

Let's do it once more; thanks!

aska-0096 added 30 commits January 24, 2022 17:40

test_file commit

c52547a

add all files

60d4564

fix some bugs and try

569044f

fix bug

682a725

fix bug

7fc0de7

fix bugs

b1f5c89

fix bugs

9573861

fix bug

ca1bb57

fix bugs

b0c188c

fix bug

57dab09

fixbug

45894a7

fixbug

84863c4

test 1

54d1f2e

General test, (Batched passed)

e5f8617

0321 test

4dba45c

explicit template instance

c3c5303

fix bug

b539c9c

fix bug

b9e8684

move instantiation into sol.hpp

c766a69

fix bug

37b1926

fixbug

a36ce98

fix bug

923e4b3

fix bug

7802205

fixbug

541a1e7

fix bug

45f1a6f

fixbug

3cc7c61

fixbug

08a9c82

fixbug

3374fa6

fixbug

0dfac32

batched test

e9ac702

fix format: add a new line

04f48d6

shaojiewang previously requested changes Feb 11, 2022

src/hip/general_tensor_reorder_sol.cpp (outdated)

src/hip/general_tensor_reorder_sol.cpp (outdated)

src/include/miopen/tensor_reorder_util.hpp

shaojiewang added the TESTING_CI_PASSED label Feb 14, 2022

aska-0096 added 3 commits February 14, 2022 07:56

[skip ci] Update: add double data type suppport.

e42f13f

Merge branch 'tensor_reorder' of https://github.com/ROCmSoftwarePlatf…

d0198e2

…orm/MIOpen into tensor_reorder

Update: add explanation comments on specific order.

a5099b0

junliume added the urgency_normal label Mar 11, 2022

junliume requested review from shaojiewang and removed request for atamazov March 14, 2022 23:08

junliume approved these changes Mar 14, 2022


shaojiewang approved these changes Mar 15, 2022


junliume merged commit 3efba70 into develop Mar 21, 2022

atamazov reviewed Mar 24, 2022


atamazov mentioned this pull request Mar 24, 2022

Post-merge review of #1419 "Universal tensor transform feature" and #1481... #1476

Closed

carlushuang mentioned this pull request Mar 26, 2022

[Tensor reorder][Quality][#issue 1476] Improve naming style and CTest design #1481

Merged

atamazov mentioned this pull request Apr 7, 2022

[tests] Enabled PCH for testing builds that have COMGR enabled (removed WORKAROUND_ISSUE_898) #1478

Merged

junliume mentioned this pull request Apr 7, 2022

[tests] [by Artem] Enabled PCH for testing builds that have COMGR enabled (removed WORKAROUND_ISSUE_898) #1508

Merged

atamazov mentioned this pull request Apr 11, 2022

[Tensor reorder][Quality][#issue 1476] Split kernel file & resolve unsolved issues #1515

Merged
