Issue with `devices` flag on multi-GPU system #189

While benchmarking libcudf on a DGX system recently, I ran into an issue where the memory resource (MR) set up by libcudf was only respected by nvbench on GPU0. On other GPUs, we observed a CUDA MR being used in place of the MR provided by libcudf. However, the compute did run on the correct GPU as specified by `devices`, so the root cause may be different from that of some related issues (e.g. https://github.com/NVIDIA/nvbench/pull/113).

This works and uses the default pool MR on GPU4:
```
nsys profile -f true --gpu-metrics-device=all --output=report_cudavis --env-var CUDA_VISIBLE_DEVICES=4 ./STREAM_COMPACTION_NVBENCH -d 0 -b 1 -a NumRows=100000000 --timeout 0.3 -a Type=[I32] -a keep=[any] -a cardinality=10000000
```

This does not work and somehow uses a CUDA MR on GPU4:
```
nsys profile -f true --gpu-metrics-device=all --output=report ./STREAM_COMPACTION_NVBENCH -d 4 -b 1 -a NumRows=100000000 --timeout 0.3 -a Type=[I32] -a keep=[any] -a cardinality=10000000
```
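To make the difference between the two invocations easier to see, here is a minimal sketch that prints both command lines from one shared argument list. `bench_cmd` is a hypothetical helper, not part of nvbench or libcudf. The `nsys` wrapper is left out on purpose, and setting `CUDA_VISIBLE_DEVICES` as a shell prefix instead of through `nsys --env-var` is an assumption that the two are equivalent for this purpose. Nothing is executed; the function only echoes the command.

```shell
# Benchmark arguments shared by both commands in this report.
BENCH_ARGS="-b 1 -a NumRows=100000000 --timeout 0.3 -a Type=[I32] -a keep=[any] -a cardinality=10000000"

# Hypothetical helper: print the command for one device-selection mode.
#   env  -> restrict visibility with CUDA_VISIBLE_DEVICES, then pass -d 0
#   flag -> leave all GPUs visible and pass the physical index to -d
bench_cmd() {
  mode=$1
  gpu=$2
  case "$mode" in
    env)  echo "CUDA_VISIBLE_DEVICES=$gpu ./STREAM_COMPACTION_NVBENCH -d 0 $BENCH_ARGS" ;;
    flag) echo "./STREAM_COMPACTION_NVBENCH -d $gpu $BENCH_ARGS" ;;
    *)    echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}
```

`bench_cmd env 4` prints the variant that works, and `bench_cmd flag 4` prints the variant that ends up with a CUDA MR. The only difference between them is how GPU4 is selected.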

[20241018_report.zip](https://github.com/user-attachments/files/17525862/20241018_report.zip)



