[QST] CuTeDSL should drop cuda.stream when creating mangle name. #2763

@PannenetsF

What is your question?
Hi everyone, I modified the example to run in JIT mode, i.e., gemm(mA, mB, mC, stream), but observed cache misses across different processes.

In fact, the stream is only used at kernel launch and does not affect compilation, so it should not contribute to the mangled name. Replacing it with a placeholder when building the mangled name might help.
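To illustrate the idea, here is a minimal sketch of a cache key that ignores the stream value. It is not the CuTeDSL implementation: `FakeStream`, `FakeTensor`, `key_part`, and this `mangle_name` are hypothetical names made up for the example, and it assumes the key is built from a string form of each argument.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeStream:
    """Hypothetical stand-in for a CUDA stream; its handle differs per process."""
    handle: int


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a tensor argument that does affect codegen."""
    dtype: str
    shape: tuple


def key_part(arg):
    # Launch-only arguments contribute a fixed placeholder, not their value.
    if isinstance(arg, FakeStream):
        return "<stream>"
    return repr(arg)


def mangle_name(func_name, *args):
    # Hash the function name together with the per-argument key parts.
    parts = [func_name] + [key_part(a) for a in args]
    digest = hashlib.sha256("|".join(parts).encode()).hexdigest()[:16]
    return f"{func_name}_{digest}"


mA = FakeTensor("float16", (128, 64))
key_proc_1 = mangle_name("gemm", mA, FakeStream(0x7F00))
key_proc_2 = mangle_name("gemm", mA, FakeStream(0x5500))
assert key_proc_1 == key_proc_2  # different stream handles, same key
```

With a placeholder like this, two processes holding different stream handles would compute the same key, while a change in tensor dtype or shape would still produce a new one.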

Do you have any suggestions? Thanks!

http://github.com/NVIDIA/cutlass/blob/bd96096d58e4886e204cd1d71a385ca73e7719b8/examples/python/CuTeDSL/hopper/dense_gemm.py#L381

http://github.com/NVIDIA/cutlass/blob/bd96096d58e4886e204cd1d71a385ca73e7719b8/python/CuTeDSL/cutlass/base_dsl/dsl.py#L555

Furthermore, could you kindly expose the mangle_name API so that users can check whether recompilation is needed (i.e., for the AoT use case)?
