This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Fix cub::Debug for builds without RDC #533

Merged
merged 1 commit into from
Jul 20, 2022

Conversation

gevtushenko
Collaborator

cub::Debug is used in most device algorithms and calls cudaGetLastError to reset errors that occurred during device-scope algorithm invocations. The cudaGetLastError call was introduced in 22b0573. The problem is that calling cudaGetLastError from __host__ __device__ functions breaks builds with -rdc=false:

__global__ void kernel() {
  cudaGetLastError();
}

int main() {
  kernel<<<1, 1>>>();
}
:nvcc main.cu
ptxas fatal   : Unresolved extern function 'cudaGetLastError'

The issue was caused by 3c20a56, so we have to backport this into 1.17. The fix is to invoke cudaGetLastError only for host invocations, or for device invocations when building with -rdc=true.
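
A minimal sketch of the guard described above, not the actual patch. The names SKETCH_RUNTIME_ENABLED, debug_sketch, and launch_and_check are hypothetical and used only for illustration. The sketch assumes the condition can be expressed with the nvcc-provided macros __CUDA_ARCH__ (defined only in device compilation passes) and __CUDACC_RDC__ (defined when relocatable device code is enabled):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical macro for this sketch: the CUDA runtime API is callable when
// compiling host code (__CUDA_ARCH__ undefined) or when relocatable device
// code is enabled (__CUDACC_RDC__ defined by nvcc under -rdc=true).
#if !defined(__CUDA_ARCH__) || defined(__CUDACC_RDC__)
#define SKETCH_RUNTIME_ENABLED 1
#endif

// Hypothetical stand-in for a cub::Debug-style helper.
__host__ __device__ inline cudaError_t debug_sketch(cudaError_t error,
                                                    const char* file,
                                                    int line)
{
#ifdef SKETCH_RUNTIME_ENABLED
  // Reset the last-error state so it does not leak into unrelated calls.
  // Compiled out in device passes without RDC, where the symbol would
  // otherwise be an unresolved extern.
  cudaGetLastError();
#endif
  if (error != cudaSuccess)
  {
    printf("CUDA error %d [%s, %d]\n", static_cast<int>(error), file, line);
  }
  return error;
}

__global__ void kernel()
{
  // With -rdc=false the runtime call above is not compiled into this kernel.
  debug_sketch(cudaSuccess, __FILE__, __LINE__);
}

// Host-side helper: launches the kernel and reports any resulting error.
inline cudaError_t launch_and_check()
{
  kernel<<<1, 1>>>();
  return debug_sketch(cudaDeviceSynchronize(), __FILE__, __LINE__);
}
```

Calling launch_and_check() from main should build with a plain nvcc main.cu as well as with -rdc=true; in the latter case the device-side cudaGetLastError call is kept, which also requires linking against the device runtime.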

@gevtushenko gevtushenko added the P1: should have Necessary, but not critical. label Jul 20, 2022
@gevtushenko gevtushenko added this to the 2.0.0 milestone Jul 20, 2022
gevtushenko added a commit to gevtushenko/thrust that referenced this pull request Jul 20, 2022
@gevtushenko gevtushenko added testing: gpuCI in progress Started gpuCI testing. testing: gpuCI passed Passed gpuCI testing. and removed testing: gpuCI in progress Started gpuCI testing. labels Jul 20, 2022
@gevtushenko gevtushenko merged commit a8ef99a into NVIDIA:main Jul 20, 2022
Labels
P1: should have Necessary, but not critical. testing: gpuCI passed Passed gpuCI testing.
Development

Successfully merging this pull request may close these issues.

Unguarded call to CUDART API from __host__ __device__ function cub::Debug
2 participants