Use CONCURRENT_KERNEL tracing in KernelNameTracer #97280
Merged
+104
−46
The standard kernel tracing serializes kernel execution, which doesn't play well with command buffers. I see different results with command buffers enabled than with command buffers disabled.
Switching to the concurrent kernel tracing fixes the issue and leads to consistent results.
This change also backfills the kernel names for B200 in the DotAlgorithm tests.