Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Jenkinsfile] Improvements and simplifications #806

Merged
merged 14 commits into from
Mar 20, 2021

Conversation

atamazov
Copy link
Contributor

@atamazov atamazov commented Mar 19, 2021

  • [coverage] Use installable builds in 4 Full stages out of 9 (BF16/FP16/FP32, OCL/HIP, MI100/Vega)
  • [improvement] Get rid of MIOPEN_GPU_SYNC
  • [code quality] Remove useless cargo cult docker image name handling (resolves issue [Jenkinsfile][NFC] Remove docker image name handling -- it is useless and just clutters the file. #801)
  • [UX] Always show datatype in stage names. Remove useless comments and empty lines.
  • [UX] Remove 'Release' from stage names. Rename 'Fast' and 'Aux' stages to 'Smoke'.
  • [code quality] Use only nodes marked with 'nogpu' for tests that require no GPU.
  • [code quality] Require 'rocmtest' and 'miopen' labels on all nodes, including gfx908.

[Side note] All relevant CI nodes updated with 'nogpu', 'miopen and 'rocmtest' labels:

  • 'nogpu' is set on nodes that are allowed to run tests that do not require a GPU. Currently, these are all nodes except MI100.
  • 'rocmtest' means that the node belongs to the "ROCmSoftwarePlatform" github organization (which now includes MIOpen, AMDMIGraphX, MIOpenGEMM, MIOpenTensile).
  • 'miopen' means that the node is allowed to run MIOpen tests.

@codecov

This comment has been minimized.

@atamazov
Copy link
Contributor Author

@pfultz2 This Codecov works like a random number generator ;)

@atamazov
Copy link
Contributor Author

@ce1adon @pfultz2 @shurale-nkn @JehandadKhan All CI tests passed. None substantial performance gain or drop observed. Please approve.


def default_image_name() {
return 'miopen-v2'
return 'rocmtest && miopen && ' + name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be (rocmtest || miopen) so you can allocate nodes just for miopen in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to allocate nodes exclusively for miopen by removing other labels (like migraphx) from the nodes. From the PR description:

[Side note] All relevant CI nodes updated with 'nogpu', 'miopen and 'rocmtest' labels:

  • 'nogpu' is set on nodes that are allowed to run tests that do not require a GPU. Currently, these are all nodes except MI100.
  • 'rocmtest' means that the node belongs to the "ROCmSoftwarePlatform" github organization (which now includes MIOpen, AMDMIGraphX, MIOpenGEMM, MIOpenTensile).
  • 'miopen' means that the node is allowed to run MIOpen tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIGraphX uses rocmtest || migraphx so you would need to remove both migraphx label and the rocmtest label.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Currently I do not plan to be greedy and allocate nodes for MIOpen exclusively.

However -- I would appreciate if you find the hierarchic approach in this file more logical and borrow it for MIGraphX.

Jenkinsfile Show resolved Hide resolved
if(flags.contains('-DBUILD_DEV=Off'))
{
config_targets = 'install ' + config_targets
flags = '-DCMAKE_INSTALL_PREFIX=../install ' + flags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why aren't we using the default install prefix? The packages we build should be able to be used by others.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried once -- it didn't work and I do not know why. I can try once more , if you wish.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yea it may not work because it may need to be sudo. Why are we adding an install step here? When we call make package it will run install into the build directory, but the package will use the correct install prefix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to test "installed" MIOpen. Some aspects of its behavior differ for DEV builds, e.g. binary cache is enabled. This may accelerate the tests. And recently I fiddled around with an issue which is not reproducible with DEV builds (#798).

Or you mean that install step is not necessary to test non-DEV build? IIRC at least installation of system databases is required.

Jenkinsfile Show resolved Hide resolved
@pfultz2
Copy link
Contributor

pfultz2 commented Mar 20, 2021

This Codecov works like a random number generator ;)

Its not random. This is comparing against 7bd5999 which is not the latest on develop. The code coverage changes are from 1c72099.

It seems like Codecov is not finding the base commit correctly, I am not sure why as I dont see this problem with migraphx.

@atamazov
Copy link
Contributor Author

@pfultz2 pls let me know if this can be merged (or we can discuss more..)

@atamazov atamazov merged commit 7ea7f9f into develop Mar 20, 2021
@pfultz2
Copy link
Contributor

pfultz2 commented Mar 21, 2021

pls let me know if this can be merged (or we can discuss more..)

This shouldn't have been merged yet until the broken packages was fixed.

@atamazov
Copy link
Contributor Author

@pfultz2

This shouldn't have been merged yet until the broken packages was fixed.

Sorry, but I do not understand. Could you please explain a bit more?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants