Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[performance] Only allow non-uniform grids for kernels that actually require it. #2708

Open
atamazov opened this issue Jan 29, 2024 · 2 comments

Comments

@atamazov
Copy link
Contributor

atamazov commented Jan 29, 2024

Leftover of

The above PRs have a potential side effect. These unconditionally switch OFF some optimizations in the compiler. As a result, we may not get the expected performance gains (or, worse, get a drop).

value_unknown We don't know how much additional performance we can regain. AFAICS almost nothing with ROCm 6.0. However, as the compiler develops, this may change in the future.

How to resolve the issue: The flags that allow non-uniform grids should be applied only to the kernels that actually use non-uniform grids. This work requires manual investigation of the solver's GetSolution() code.

⚠️ Old .kdb files may contain kernels built with the assumption that block sizes are non-uniform. Therefore, in order to get the expected performance gain, the binary kernel cache needs to be regenerated (after all other fixes are done).

Implementation tips

  • Look for WORKAROUND_SWDEV_413293 in the code
  • Unconditional enforcement of -fno-offload-uniform-block introduced in #2307 should be removed. The code that invokes HIP compiler should parse options and, if -foffload-uniform-block is found, just keep it. Otherise, it should add -fno-offload-uniform-block.
  • HIP: Pass -foffload-uniform-block from the solvers that use only uniform grids.
  • OpenCL: Pass -cl-uniform-work-group-size from the solvers that use only uniform grids.

[Attribution] @junliume @JehandadKhan

@atamazov
Copy link
Contributor Author

[Informative] Description updated with a note on importance/value.

@atamazov
Copy link
Contributor Author

@junliume [process] Setting urgency_unknown is not useless. For example, it is possible to quickly find the tickets that require clarification of urgency/importance, and revisit them, which I would recommend doing periodically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants