spaffy edited this page Feb 29, 2012 · 1 revision

Sparse Matrix-Vector Multiplication (SpMV)

Description: Measures the performance of sparse matrix-vector multiplication using several different algorithms and data structures. By default, the matrices are randomly generated, square, and have a sparsity of 1 percent (that is, about 1 percent of the entries are non-zero). Alternatively, a Matrix Market file can be loaded with the --mm_filename example.mm argument.
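As a rough illustration of the default input, the sketch below builds a random square matrix in CSR form with about 1 percent of its entries non-zero. It is not the benchmark's generator: the function name `random_csr`, its parameters, and the value range 0 to 10 are assumptions made for this example only.

```python
import random


def random_csr(n, density=0.01, seed=None):
    """Return (row_ptr, cols, vals) for a random n x n matrix in CSR form.

    Hypothetical helper: the name, signature, and value range are assumptions.
    """
    rng = random.Random(seed)
    nnz = round(n * n * density)
    # Pick nnz distinct cells of the n x n grid; sorting orders them by row, then column.
    positions = sorted(rng.sample(range(n * n), nnz))

    row_ptr = [0] * (n + 1)
    cols, vals = [], []
    for pos in positions:
        row, col = divmod(pos, n)
        row_ptr[row + 1] += 1          # count the non-zeros in each row
        cols.append(col)
        vals.append(rng.uniform(0.0, 10.0))

    # Prefix sum turns the per-row counts into row start offsets.
    for row in range(n):
        row_ptr[row + 1] += row_ptr[row]
    return row_ptr, cols, vals
```

For example, `random_csr(1024, seed=0)` yields a 1024x1024 matrix with 10486 stored non-zeros, where row `r` occupies the index range `row_ptr[r]` to `row_ptr[r + 1]` in `cols` and `vals`.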

Problem Sizes: (NxN Matrix) - 1024, 8192, 12288, 16384

Precision: Both single and double

Includes PCIe Transfer Time: Only in the [testName]_PCIe measurements

SpMV uses three kernels. The first two are based on the compressed sparse row (CSR) data structure. The first kernel assigns one thread (or, in OpenCL terms, one work-item) to each row of the matrix. The second kernel takes the same approach, except that it assigns a full warp, or a small group of threads, to each row, so the threads in the group cooperate on that row's dot product. Both CSR kernels are tested on unpadded and padded data. The third kernel uses the recently proposed ELLPACKR data structure. For more information on sparse matrix-vector multiplication on GPUs, see the excellent paper by M. Garland.
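The three strategies can be sketched sequentially on the CPU as follows. This is a minimal illustration and not the benchmark's GPU code: the function names are hypothetical, the "threads" are emulated with ordinary loops, `lanes` stands in for the warp or group size and is assumed to be a power of two, and the ELLPACKR layout shown (column-major padded arrays plus a per-row length array) is the commonly described form of that format rather than something confirmed by this page.

```python
def csr_scalar(row_ptr, cols, vals, x):
    """One 'thread' per row: each row's dot product is accumulated serially."""
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += vals[k] * x[cols[k]]
        y.append(acc)
    return y


def csr_vector(row_ptr, cols, vals, x, lanes=4):
    """A group of `lanes` 'threads' per row: strided partial sums, then a tree reduction."""
    y = []
    for row in range(len(row_ptr) - 1):
        start, end = row_ptr[row], row_ptr[row + 1]
        partial = [0.0] * lanes
        for lane in range(lanes):
            # Lane i handles elements start+i, start+i+lanes, ...
            for k in range(start + lane, end, lanes):
                partial[lane] += vals[k] * x[cols[k]]
        stride = lanes // 2
        while stride > 0:              # pairwise reduction, as within a warp
            for lane in range(stride):
                partial[lane] += partial[lane + stride]
            stride //= 2
        y.append(partial[0])
    return y


def csr_to_ellpack_r(row_ptr, cols, vals):
    """Pad every row to the longest row's length, stored column-major, and keep row lengths."""
    n_rows = len(row_ptr) - 1
    row_len = [row_ptr[r + 1] - row_ptr[r] for r in range(n_rows)]
    width = max(row_len, default=0)
    ell_cols = [0] * (width * n_rows)
    ell_vals = [0.0] * (width * n_rows)
    for r in range(n_rows):
        for j in range(row_len[r]):
            ell_cols[j * n_rows + r] = cols[row_ptr[r] + j]
            ell_vals[j * n_rows + r] = vals[row_ptr[r] + j]
    return ell_cols, ell_vals, row_len


def ellpack_r(ell_cols, ell_vals, row_len, x):
    """One 'thread' per row; row_len lets each row stop before its padding."""
    n_rows = len(row_len)
    y = []
    for r in range(n_rows):
        acc = 0.0
        for j in range(row_len[r]):
            acc += ell_vals[j * n_rows + r] * x[ell_cols[j * n_rows + r]]
        y.append(acc)
    return y


# Example: the 4x4 matrix [[1,0,2,0],[0,3,0,0],[0,0,0,0],[4,0,5,6]] times x.
row_ptr = [0, 2, 3, 3, 6]
cols = [0, 2, 1, 0, 2, 3]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0, 4.0]
```

All three functions return the same product for the example data. On a GPU, the column-major ELLPACKR layout means consecutive threads read consecutive memory locations, while the vector CSR kernel pays off mainly when rows are long enough to keep a whole group busy.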

Specific Tests (All report GFLOPS)

  • CSR-Scalar - performance of the single thread per row CSR kernel
  • CSR-Vector - performance of the warp/vector of threads per row CSR kernel
  • ELLPACKR - performance of the kernel using the ELLPACKR data structure
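
A GFLOPS figure for SpMV is conventionally derived from the number of stored non-zeros, counting one multiply and one add per non-zero. The sketch below assumes that convention; the page does not state the exact formula the benchmark uses, and the function name `spmv_gflops` is hypothetical.

```python
def spmv_gflops(nnz, seconds):
    """Billions of floating-point operations per second for one SpMV.

    Assumes 2 operations (multiply + add) per stored non-zero.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return (2.0 * nnz) / seconds / 1e9
```

Under this convention, a matrix with one million non-zeros multiplied in one millisecond corresponds to roughly 2 GFLOPS.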