Fig. 5: Speed-ups of the AlphaTensor-discovered algorithm. | Nature

From: Discovering faster matrix multiplication algorithms with reinforcement learning

a,b, Speed-ups (%) of the AlphaTensor-discovered algorithms tailored for a GPU (a) and a TPU (b), each optimized for a matrix multiplication of size 8,192 × 8,192. Speed-ups are measured relative to standard matrix multiplication on the same hardware (for example, cuBLAS on the GPU). Speed-ups are reported for various matrix sizes, even though each algorithm was optimized for only one matrix size. We also report the speed-up of the Strassen-square algorithm. The reported value is the median speed-up over 200 runs. The standard deviation over runs is <0.4 percentage points (see Supplementary Information for more details). c, Speed-up of both algorithms (tailored to a GPU and a TPU) benchmarked on both devices.
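
As a rough illustration of how a median speed-up and its spread could be summarized from repeated timings, here is a minimal Python sketch. It is not the authors' benchmarking code. The function names, the pairing of baseline and candidate runs, and the speed-up formula (baseline time divided by candidate time, minus one, as a percentage) are assumptions made for this example; the caption does not state the exact definition, which is left to the Supplementary Information.

```python
import statistics


def speedup_percent(baseline_time, candidate_time):
    """Assumed definition: how much longer the baseline takes than the
    candidate, as a percentage. The caption does not spell out the formula."""
    return (baseline_time / candidate_time - 1.0) * 100.0


def summarize_speedups(baseline_times, candidate_times):
    """Return (median speed-up in %, standard deviation in percentage points).

    Assumes run i of the baseline is paired with run i of the candidate;
    this pairing is an illustrative choice, not taken from the source.
    """
    if len(baseline_times) != len(candidate_times):
        raise ValueError("expected the same number of runs for both algorithms")
    per_run = [
        speedup_percent(b, c) for b, c in zip(baseline_times, candidate_times)
    ]
    return statistics.median(per_run), statistics.pstdev(per_run)


# Hypothetical timings in seconds (three runs here; the figure uses 200).
baseline = [1.10, 1.08, 1.12]
candidate = [1.00, 1.00, 1.00]
median_speedup, spread = summarize_speedups(baseline, candidate)
print(f"median speed-up: {median_speedup:.1f}%, std: {spread:.2f} pp")
```

Taking the median rather than the mean keeps a single slow run, for instance one delayed by device warm-up, from skewing the reported figure.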
