Li et al., 2020 - Google Patents

VBSF: a new storage format for SIMD sparse matrix–vector multiplication on modern processors.

Document ID: 16062402257633582260
Author: Li Y; Xie P; Chen X; Liu J; Yang B; Li S; Gong C; Gan X; Xu H
Publication year: 2020
Publication venue: Journal of Supercomputing

Snippet

Sparse matrix–vector multiplication (SpMV) is one of the most indispensable kernels for solving problems in numerous applications, but its performance is limited by the need for frequent memory access. Modern processors exploit data-level parallelism to …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G06F17/142—Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/147—Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/78—Power analysis and optimization

Similar Documents

Publication	Publication Date	Title
Li et al.	2020	VBSF: a new storage format for SIMD sparse matrix–vector multiplication on modern processors
Beamer et al.	2017	Reducing pagerank communication via propagation blocking
Demmel et al.	2012	Communication-optimal parallel and sequential QR and LU factorizations
Anderson et al.	2011	Communication-avoiding QR decomposition for GPUs
Lin et al.	2012	K-means implementation on FPGA for high-dimensional data using triangle inequality
Bahn et al.	2009	Parallel FFT algorithms on network-on-chips
Chen et al.	2018	An efficient SIMD compression format for sparse matrix‐vector multiplication
CA3186227A1 (en)	2022-01-27	System and method for accelerating training of deep learning networks
Schenk et al.	2011	PARDISO
US20180373677A1 (en)	2018-12-27	Apparatus and Methods of Providing Efficient Data Parallelization for Multi-Dimensional FFTs
Martínez-del-Amor et al.	2012	Population Dynamics P systems on CUDA
Sun et al.	2011	An I/O bandwidth-sensitive sparse matrix-vector multiplication engine on FPGAs
Ziane Khodja et al.	2014	Parallel sparse linear solver with GMRES method using minimization techniques of communications for GPU clusters
Oyarzun et al.	2017	Portable implementation model for CFD simulations. Application to hybrid CPU/GPU supercomputers
Shen et al.	2019	A high-performance systolic array accelerator dedicated for CNN
Zhang et al.	2016	Efficient sparse matrix–vector multiplication using cache oblivious extension quadtree storage format
Kim et al.	2012	Compute Spearman correlation coefficient with MATLAB/CUDA
Magoulès et al.	2015	Auto-tuned Krylov methods on cluster of graphics processing unit
Lu et al.	2023	TileSpTRSV: a tiled algorithm for parallel sparse triangular solve on GPUs
Chen et al.	2024	HDReason: Algorithm-hardware codesign for hyperdimensional knowledge graph reasoning
Steffl et al.	2017	LACore: A supercomputing-like linear algebra accelerator for SoC-based designs
Tao et al.	2015	GPU accelerated sparse matrix‐vector multiplication and sparse matrix‐transpose vector multiplication
Gustavson et al.	2000	Minimal-storage high-performance Cholesky factorization via blocking and recursion
Gao et al.	2017	Adaptive Optimization l1-Minimization Solvers on GPU
Hofmann et al.	2015	Performance analysis of the Kahan-enhanced scalar product on current multicore processors