Refactor BnCKFwdInference::GetSolution for NHWC #3120
Conversation
…KSolution to implicitgemm_ck_util.hpp
@@ -45,6 +50,142 @@ struct ProblemDescription;

namespace solver {
#if MIOPEN_BACKEND_HIP && MIOPEN_USE_COMPOSABLEKERNEL
namespace batchnorm {
This code should be moved back to a header used by batchnorm solvers only. If there is common code between forward and backward batchnorm CKArgs etc., create a separate header; otherwise, move it back to the cpp files.
Structure the CKArgs class similar to how convolution's CKArgs classes are structured, where there's a method called MakeArgumentPointer.
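For reference, the shape being suggested looks roughly like the sketch below. The member names and the argument list are assumptions pieced together from the snippets quoted later in this conversation, not the final code.

    // Sketch of the suggested structure: the args class owns the tensor
    // lengths/strides derived from the problem and exposes MakeArgumentPointer,
    // so call sites never touch the individual fields.
    struct CKArgsBNormFwd
    {
        explicit CKArgsBNormFwd(const miopen::batchnorm::ProblemDescription& problem)
        {
            // fill xyLengths / xyStrides (and the scale/bias/mean/var strides) from `problem`
        }

        template <typename DeviceOpPtr, typename InvokeParams>
        auto MakeArgumentPointer(const DeviceOpPtr& bn_ptr, const InvokeParams& params) const
        {
            // Only the arguments visible in the diff are shown; the stride
            // arrays passed by the real call are elided here.
            return bn_ptr->MakeArgumentPointer(
                xyLengths,
                /* ...input/output stride arrays... */
                {params.x, params.estimatedMean, params.estimatedVariance, params.bnScale, params.bnBias},
                {params.y},
                Normalize{params.epsilon});
        }

        std::vector<int> xyLengths; // index type simplified for the sketch
        std::vector<int> xyStrides;
    };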
@amberhassaan, I have reverted the changes in implicitgemm_ck_util.hpp. I have also refactored BnCKFwdInference::GetSolution.
src/solver/batchnorm/backward_ck.cpp
src/solver/conv/conv_hip_implicit_gemm_grouped_fwd_xdlops.cpp
{
    const auto& args = CKArgsBNormFwd{problem};
    ConvSolution result;
    result.invoker_factory = [bn_problem](const std::vector<Kernel>& kernels) {
Do not capture problem_description by value, as it's a rather large object. Create variables above line 132 for what you need from problem_description and capture those variables by value.
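A minimal sketch of that pattern; the accessor names are illustrative placeholders, not the actual interface of the problem description:

    // Copy the few cheap values the invoker actually needs up front...
    const auto data_type = bn_problem.GetXDesc().GetType();   // illustrative accessors
    const auto lengths   = bn_problem.GetXDesc().GetLengths();

    // ...and capture those by value instead of the large problem object itself.
    result.invoker_factory = [data_type, lengths](const std::vector<Kernel>& /*kernels*/) {
        return [data_type, lengths](const Handle& handle, const AnyInvokeParams& primitive_parameters) {
            // use data_type / lengths here rather than bn_problem
        };
    };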
bn_problem,
[&](auto data_type_val) {
    using T = decltype(data_type_val);
    if constexpr(std::is_same_v<T, F16>)
This if/else-if logic can be simplified as follows:

    using AccT = std::conditional_t<std::is_same_v<T, F64>,
                                    T,    // pick this if true
                                    F32>; // pick this if false
    InvokerFactoryNHWC<T, T, AccT, T, T, AccT>(bn_problem);

In fact, InvokerFactory could also be changed to take just two template type parameters, T and AccT.
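If that second change were made, the dispatch could shrink to roughly the following, assuming the remaining CK type parameters can all be derived from T and AccT inside the factory:

    // Sketch: AccT resolves to F64 only for double problems and F32 otherwise;
    // the other type parameters are assumed to be derivable from T and AccT.
    using AccT = std::conditional_t<std::is_same_v<T, F64>, T, F32>;
    InvokerFactoryNHWC<T, AccT>(bn_problem);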
MeanVarDataType>(bn_problem)](
    const std::vector<Kernel>& kernels) {
    std::ignore = kernels;
    return [&](const Handle& handle, const AnyInvokeParams& primitive_parameters) {
@atamazov @DrizztDoUrden can we guarantee that the invoker_factory context exists longer than its generated lambda?

I'm asking about this case:

    result.invoker_factory = [=](...)
    {
        return [&](...){...};
    };

versus that one:

    result.invoker_factory = [=](...)
    {
        return [=](...){...};
    };

If we can guarantee that, we can always capture by reference in the second lambda, save some memory, and avoid extra copy operations. If we can't, we must always capture by value.
No, it almost always exists for a shorter period. Capturing by reference here is an error; both lambdas should always capture by copy or move, never by reference.
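The hazard is the usual lambda-lifetime one: the factory lambda runs, returns the inner invoker, and is then destroyed together with its by-value captures, so any references the inner lambda holds into them dangle. A self-contained illustration of the pattern (not MIOpen code):

    #include <functional>
    #include <iostream>
    #include <vector>

    // Returns an "invoker" the way invoker_factory does: the factory lambda's
    // captures live only as long as the factory closure object itself.
    std::function<int()> make_invoker_by_ref()
    {
        std::vector<int> data{1, 2, 3};             // stands in for bn_problem/args
        auto factory = [data]() {                   // data copied into the factory
            return [&]() { return data.front(); };  // BUG: references factory's copy
        };
        return factory();                           // factory (and its copy) die on return
    }

    std::function<int()> make_invoker_by_copy()
    {
        std::vector<int> data{1, 2, 3};
        auto factory = [data]() {
            return [data]() { return data.front(); }; // OK: the invoker owns its own copy
        };
        return factory();
    }

    int main()
    {
        std::cout << make_invoker_by_copy()() << '\n'; // prints 1
        // Calling make_invoker_by_ref()() would read through a dangling reference:
        // undefined behavior, which is why both lambdas must capture by copy or move.
        return 0;
    }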
const std::vector<Kernel>& kernels) {
    std::ignore = kernels;
Suggested change:

    - const std::vector<Kernel>& kernels) {
    -     std::ignore = kernels;
    + const std::vector<Kernel>& /* kernels */) {
@CAHEK7, thanks for the comments. Code updated. Could you mark this as resolved?
Some minor refactoring is needed.
BiasDataType,
MeanVarDataType>(bn_problem)](
    const std::vector<Kernel>& /*kernels*/) mutable {
    return [=, args = std::move(args)](const Handle& handle,
Don't use = by itself; rather, name each variable captured by value.
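In other words, spell the captures out, which is essentially what the later revision quoted further down does:

    // Instead of "[=, args = std::move(args)]", list what the invoker needs so
    // no implicit (and possibly expensive) copies hide behind the "=".
    return [kernel_index, args = std::move(args)](
               const Handle& handle, const AnyInvokeParams& primitive_parameters) {
        // the body only sees kernel_index and args
    };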
@amberhassaan done
{params.x, params.estimatedMean, params.estimatedVariance, params.bnScale, params.bnBias},
{params.y},
Normalize{params.epsilon});
auto argument_ptr = bn_ptr->MakeArgumentPointer(args.xyLengths,
Add a method to BnArgs called MakeArgPtr; you can then pick the fields from inside that class. Better design IMO for composability and hiding details.
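With such a method in place, the call site quoted above would reduce to something like the following (MakeArgPtr and its parameter list are hypothetical names taken from this suggestion):

    // Hypothetical call site: the lengths/strides stay inside the args class,
    // and the invoker only forwards the CK device op and the invoke params.
    auto argument_ptr = args.MakeArgPtr(bn_ptr, params);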
@amberhassaan done
const std::vector<Kernel>& /*kernels*/) mutable {
return [args = std::move(args), kernel_index = kernel_index](
           const Handle& handle, const AnyInvokeParams& primitive_parameters) {
    using DeviceOp = ck::tensor_operation::device::DeviceElementwise<
Sorry, I missed this earlier. Please move lines 161-171, i.e. the computation of bn_ptr, outside the invoker lambda and capture bn_ptr by move semantics. See the InitInvokerFactory for the convolution solvers.
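A rough sketch of the requested shape; the actual construction of bn_ptr (lines 161-171) is not quoted here, so MakeBnPtr below is a hypothetical stand-in for it:

    // Build the CK device-op pointer once, outside the invoker, then move it
    // into the capture so the invoker owns it without re-creating it per launch.
    auto bn_ptr = MakeBnPtr(); // hypothetical stand-in for lines 161-171

    return [args = std::move(args), bn_ptr = std::move(bn_ptr), kernel_index](
               const Handle& handle, const AnyInvokeParams& primitive_parameters) {
        // use bn_ptr and args here; both were moved in, not rebuilt per call
    };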
@amberhassaan, changes have been checked in. Thanks.
This looks OK now. @junliume: ready to merge when it passes CI.
Rename RunCKSolution to InitInvokerFactoryBnCKFwdInferenceNHWC to differentiate it from the upcoming new API InitInvokerFactoryBnCKFwdInferenceNCHW
Move common code to implicitgemm_ck_util.hpp