这是indexloc提供的服务,不要输入任何密码
Skip to content

Pattern to hoist pack unpack ops from scf.for op #21431

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

YashDeshpande25
Copy link
Contributor

@YashDeshpande25 YashDeshpande25 commented Jul 21, 2025

This pattern looks for a scf.yield op within an scf.for, whose producer is a linalg.unpack op & hoist both the unpack & the corresponding pack ops outside the scf.for within GPUPackToIntrinsicsPass.This pattern is helpful if we want to do reduction tiling before packToIntrinsics.

@YashDeshpande25 YashDeshpande25 marked this pull request as draft July 21, 2025 16:51
Copy link
Contributor

@nirvedhmeshram nirvedhmeshram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, just some suggestions to harden the pattern and testing

@YashDeshpande25 YashDeshpande25 force-pushed the direct_conv_hoist_pack_unpack branch 2 times, most recently from bfd1ee6 to 9b2603a Compare July 23, 2025 00:24
@YashDeshpande25 YashDeshpande25 force-pushed the direct_conv_hoist_pack_unpack branch 2 times, most recently from 8d43595 to 144417a Compare July 25, 2025 19:33
Signed-off-by: Yash Deshpande <ydeshpan@amd.com>
@YashDeshpande25 YashDeshpande25 force-pushed the direct_conv_hoist_pack_unpack branch from 144417a to 19a1783 Compare July 25, 2025 19:39
for (auto operand : yieldOp.getOperands()) {
unpackOp = operand.getDefiningOp<linalg::UnPackOp>();
if (!unpackOp) {
tiedResultIdx++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use for for (auto [tiedResultIdx, operand] = llvm::enumerate(yieldOp.getOperands()), this way you dont need to increment tiedResultIdx at different points.

#map = affine_map<(d0, d1, d2) -> (d0, d2)>
#map1 = affine_map<(d0, d1, d2) -> (d2, d1)>
#map2 = affine_map<(d0, d1, d2) -> (d0, d1)>
func.func @hoist_pack_unpack(%arg0 : tensor<4x8x16x4xf32>, %arg1 : tensor<8x8x16x4xf32>, %arg2 : tensor<64x127xf32>) -> tensor<64x127xf32>{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this test, testing something that the later test with two pack/unpack pairs is not? If it is not, you can remove this one.

#map = affine_map<(d0, d1, d2) -> (d0, d2)>
#map1 = affine_map<(d0, d1, d2) -> (d2, d1)>
#map2 = affine_map<(d0, d1, d2) -> (d0, d1)>
func.func @no_hoist_pack_unpack(%arg0 : tensor<4x8x16x4xf32>, %arg1 : tensor<8x4x16x4xf32>, %arg2 : tensor<64x64xf32>) -> tensor<64x64xf32>{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make the test names more descriptive, e.g no_hoist_pack_unpack_mismatched no_hoist_pack_unpack_mismatched_otherusers

%c0 = arith.constant 0 : index
%c32 = arith.constant 32 : index
%padding_value = arith.constant 0.000000e+00 : f32
%1, %result0 = scf.for %arg3 = %c0 to %c512 step %c32 iter_args(%arg5 = %input, %arg4 = %arg2) -> (tensor<64x128xf32>, tensor<64x127xf32>) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe do another argument in the middle that doesn't have a pack/unpack on it?

linalg::PackOp packOp;
int64_t tiedResultIdx = 0;

// Iterate through all operands of yieldOp & hoist each available
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is misleading, you hoist the first legal pack/unpack pair?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants