[AArch64] Add matmul microkernels for SVE/FEAT_I8MM #21491

Draft
wants to merge 1 commit into
base: main

momchil-velikov
@momchil-velikov momchil-velikov commented Jul 25, 2025

Note that the microkernels depend on the SVE vector length/scale, as described in this comment:

// The next three microkernels work by replicating a single 2x8 LHS tile
// VSCALE times to fill a SVE register and multiplying it by several
// RHS SVE registers to compute two full rows of the output tile.
// The tile size and VSCALE are dependent on each other - the VSCALE must be
// such that the RHS registers (`rh0`, `rhs`, etc) cover the entire RHS tile.
// For example the "4x2VSx8" kernel should work correctly for
// 4x2x8/VSCALE == 1, 4x4x8/VSCALE == 2, 4x8x8/VSCALE == 4 (tested only with
// VSCALE == 2)
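
To make the dependency in the comment above concrete, here is a minimal sketch of the arithmetic it describes. All names (`tile_shape_t`, `vscale_from_vl_bits`, `tile_4x2VSx8`, and so on) are hypothetical and are not taken from the patch. The sketch assumes int8 operands (as implied by FEAT_I8MM) and that VSCALE is the vector length in units of 128 bits.

```c
#include <stdbool.h>

// Illustrative only: these names are hypothetical and do not come from the patch.
typedef struct {
  int m0;  // rows of the output tile
  int n0;  // columns of the output tile
  int k0;  // reduction dimension, in int8 elements
} tile_shape_t;

// VSCALE is the SVE vector length divided by the 128-bit granule.
static int vscale_from_vl_bits(int vl_bits) { return vl_bits / 128; }

// Tile shape the "4x2VSx8" kernel expects for a given VSCALE, following the
// quoted comment: 4x2x8, 4x4x8, 4x8x8 for VSCALE == 1, 2, 4.
static tile_shape_t tile_4x2VSx8(int vscale) {
  tile_shape_t t = {4, 2 * vscale, 8};
  return t;
}

// A 2x8 int8 LHS tile is 16 bytes; replicated VSCALE times it occupies
// exactly one SVE register of vl_bits bits.
static bool replicated_lhs_fills_register(int vl_bits) {
  int vscale = vscale_from_vl_bits(vl_bits);
  int lhs_tile_bytes = 2 * 8;  // 2 rows x 8 int8 elements
  return lhs_tile_bytes * vscale == vl_bits / 8;
}

// True when a tile shape matches what the kernel expects at this vector length.
static bool tile_matches_vl(tile_shape_t t, int vl_bits) {
  tile_shape_t want = tile_4x2VSx8(vscale_from_vl_bits(vl_bits));
  return t.m0 == want.m0 && t.n0 == want.n0 && t.k0 == want.k0;
}
```

With 256-bit vectors, `vscale_from_vl_bits(256)` gives 2 and `tile_4x2VSx8(2)` gives a 4x4x8 tile, while a 4x2x8 tile only matches at 128 bits. A check along the lines of `tile_matches_vl` is one way a selection step could reject a kernel whose tile shape does not fit the runtime vector length.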

The patch was tested (using mmt4d_test/mmt4d_benchmark) with 256-bit SVE vectors, i.e. VSCALE == 2. With 128-bit SVE vectors, the microkernel selection process would need to be adjusted.

@momchil-velikov momchil-velikov marked this pull request as draft July 25, 2025 11:16