igemm WrW invokers #233
Congratulations 🎉. DeepCode analyzed your code in 2.588 seconds and we found no issues. Enjoy a moment of no bugs ☀️.
Reviews?
I just pray not to have any more solver merges here. They are trivial but require a hell of a lot of testing to be sure it's good to go. I will enjoy the stuff in the Run and Measure PR anyway.
Ok, let's have this one next to merge.
Sure
else
{
    result.invoker_factory = [](const std::vector<Kernel>& kernels) {
        return [=](const Handle& handle, const boost::any& primitve_params) {
SetTensor is missing for fp32. Please take a look at the deleted line 4212 in the convolutionocl.cpp file in this PR.
Thank you, will fix that.
OMG
I wonder how it passed tests.
It didn't win in your tests. This issue leads to correctness issues.
I mean there are long tests which should test something?
It seems like this solver is a new iGemm that produces much more "universal" kernels (like Winograd, so that one kernel fits many problem configs), but the cost is a performance drop. So it is quite possible that it never wins in our CI tests.
/cc @daniellowell
Right, what would be interesting to have is a Find mode that also does verification. A self-testing feature.
Interesting idea 🔥
Actually, we can make it so right now. The driver could register callbacks in the library. The library should invoke the callback when Solvers are being evaluated, after each Invok'ation. The driver verifies the output buffer. Done. Ha.
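The callback idea above could be sketched roughly as follows. This is only an illustration, not MIOpen code: `Library`, `Solver`, `Buffer`, `VerifyCallback`, `RegisterVerifyCallback`, `Find` and `RunSelfTestingFind` are all hypothetical names, and plain host vectors stand in for GPU buffers.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these names are MIOpen API.
using Buffer         = std::vector<float>;
using VerifyCallback = std::function<bool(const std::string& solver_id, const Buffer& output)>;

struct Solver
{
    std::string id;
    std::function<void(const Buffer& in, Buffer& out)> invoke;
};

class Library
{
public:
    // The driver registers its verification routine here.
    void RegisterVerifyCallback(VerifyCallback cb) { verify_ = std::move(cb); }

    // Evaluates every solver; after each invocation the callback (if any)
    // checks the output. Returns the ids of solvers that failed verification.
    std::vector<std::string> Find(const std::vector<Solver>& solvers, const Buffer& in) const
    {
        std::vector<std::string> failed;
        for(const auto& solver : solvers)
        {
            Buffer out(in.size(), 0.0f);
            solver.invoke(in, out);
            if(verify_ && !verify_(solver.id, out))
                failed.push_back(solver.id);
        }
        return failed;
    }

private:
    VerifyCallback verify_;
};

// Driver side: supplies a reference result and two toy "solvers",
// one of which forgets to write the last output element.
std::vector<std::string> RunSelfTestingFind()
{
    const Buffer in        = {1.0f, 2.0f, 3.0f};
    const Buffer reference = {2.0f, 4.0f, 6.0f};

    Library lib;
    lib.RegisterVerifyCallback([reference](const std::string&, const Buffer& out) {
        if(out.size() != reference.size())
            return false;
        for(std::size_t i = 0; i < out.size(); ++i)
            if(std::fabs(out[i] - reference[i]) > 1e-6f)
                return false;
        return true;
    });

    const std::vector<Solver> solvers = {
        {"good",
         [](const Buffer& i, Buffer& o) {
             for(std::size_t k = 0; k < i.size(); ++k)
                 o[k] = 2.0f * i[k];
         }},
        {"buggy",
         [](const Buffer& i, Buffer& o) {
             for(std::size_t k = 0; k + 1 < i.size(); ++k) // skips the last element
                 o[k] = 2.0f * i[k];
         }},
    };
    return lib.Find(solvers, in);
}
```

With this shape, a solver that never wins on performance still gets its output checked, because verification happens per invocation during evaluation rather than only for the selected solution.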
Please throw in MakeImplGemmDataInvokerFactory() if ctx.direction.IsBackwardWrW() is true.
else
{
    result.invoker_factory = [](const std::vector<Kernel>& kernels) {
        return [=](const Handle& handle, const boost::any& primitve_params) {
I know I'm nit-picking, but are you trying to say 'primitive_params' instead of primitve_params throughout?
Yeah, thank you. I guess I should install a spellchecker for the VS xD
Good catch 👁️
@atamazov I have fixed the missing invoker.
I see, thanks. Will re-test soon. Besides, develop is currently blocked for merging due to CI issues; @daniellowell is working on resolving those.
I merged develop into this branch. It should pass the CI. Final reviews please.
8e1c802 to 3a19178
@atamazov @alexandraBara Let's move this PR forward.
Just finished a rigorous testing session, collecting results... @alexandraBara Please feel free to ask @DrizztDoUrden about details of design & implementation. We would like to share knowledge with anyone who wants it. Questions make this easier for us.
LGTM!
🌀 Testing results
No performance or correctness regressions among 93 FP32 and 93 FP16 configs, Radeon VII, ROCm 3.5. Both Normal Find and Immediate modes examined.
Can we get a design document?
@JehandadKhan On what specifically? This PR is about WrW igemm invokers, which are 99% copy-paste from old calling code from conv...ocl.cpp
Basic info can be found at #216. We plan to further elaborate explanations using the questions coming from interested individuals.
Blocker for #203 as some of the solvers here are tunable.
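For readers new to the pattern in the `result.invoker_factory` snippet quoted in the review above, here is a minimal, self-contained sketch of a factory that captures kernels and returns an invoker taking type-erased parameters. It is an illustration under assumptions, not the PR's code: `Kernel`, `Handle` and `WrWParams` are toy stand-ins, `MakeSketchInvokerFactory` is a made-up name, `std::any` (C++17) is swapped in for `boost::any`, and the handle is non-const here only so the toy can record launches.

```cpp
#include <any>
#include <functional>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; the real types live in the library.
struct Kernel
{
    std::string name;
};

struct Handle
{
    std::vector<std::string> launched; // records "launches" instead of running GPU code
};

struct WrWParams // hypothetical parameter pack for one invocation
{
    int repeat;
};

using Invoker        = std::function<void(Handle&, const std::any&)>;
using InvokerFactory = std::function<Invoker(const std::vector<Kernel>&)>;

// The outer lambda runs once, when kernels are built; the inner lambda is
// the invoker and runs on every call with per-call parameters.
InvokerFactory MakeSketchInvokerFactory()
{
    return [](const std::vector<Kernel>& kernels) -> Invoker {
        // [=] copies the kernels so the invoker can outlive the factory call.
        return [=](Handle& handle, const std::any& primitive_params) {
            // Throws std::bad_any_cast if the caller passed the wrong type.
            const auto& params = std::any_cast<const WrWParams&>(primitive_params);
            for(int i = 0; i < params.repeat; ++i)
                for(const auto& kernel : kernels)
                    handle.launched.push_back(kernel.name);
        };
    };
}
```

A caller would build the invoker once, for example `auto invoker = MakeSketchInvokerFactory()(kernels);`, and then call `invoker(handle, std::any(WrWParams{2}))` for each run.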