Compare MKL inv speed
... to https://github.com/pc2/SubmatrixMethod/blob/master/mkl-matrix-inv.c.
Test runs
Input matrix: sprandsym-s1000-d1-c2-n1.txt
Results:
➜ bauerc@cn-0252 SubmatrixMethod git:(master) MKL_NUM_THREADS=1 ./mkl-matrix-inv 1000 sprandsym-s1000-d1-c2-n1.txt sprandsym-s1000-d1-c2-n1.out.txt
Reading input matrix took 0.042s
Doing a precise inversion took 0.081s
Writing result matrix took 0.253s
➜ bauerc@cn-0252 SubmatrixMethod git:(master) MKL_NUM_THREADS=1 ./mkl-matrix-inv 1000 sprandsym-s1000-d1-c2-n1.txt sprandsym-s1000-d1-c2-n1.out.txt
Reading input matrix took 0.027s
Doing a precise inversion took 0.077s
Writing result matrix took 0.296s
➜ bauerc@cn-0252 SubmatrixMethod git:(master) MKL_NUM_THREADS=1 ./mkl-matrix-inv 1000 sprandsym-s1000-d1-c2-n1.txt sprandsym-s1000-d1-c2-n1.out.txt
Reading input matrix took 0.027s
Doing a precise inversion took 0.077s
Writing result matrix took 0.296s
➜ bauerc@cn-0252 SubmatrixMethod git:(master) MKL_NUM_THREADS=1 ./mkl-matrix-inv 1000 sprandsym-s1000-d1-c2-n1.txt sprandsym-s1000-d1-c2-n1.out.txt
Reading input matrix took 0.027s
Doing a precise inversion took 0.065s
Writing result matrix took 0.252s
➜ bauerc@cn-0252 SubmatrixMethod git:(master) MKL_NUM_THREADS=20 ./mkl-matrix-inv 1000 sprandsym-s1000-d1-c2-n1.txt sprandsym-s1000-d1-c2-n1.out.txt
Reading input matrix took 0.027s
Doing a precise inversion took 0.080s
Writing result matrix took 0.257s
➜ bauerc@cn-0252 SubmatrixMethod git:(master) MKL_NUM_THREADS=20 ./mkl-matrix-inv 1000 sprandsym-s1000-d1-c2-n1.txt sprandsym-s1000-d1-c2-n1.out.txt
Reading input matrix took 0.027s
Doing a precise inversion took 0.054s
Writing result matrix took 0.256s
➜ bauerc@cn-0252 SubmatrixMethod git:(master) MKL_NUM_THREADS=20 ./mkl-matrix-inv 1000 sprandsym-s1000-d1-c2-n1.txt sprandsym-s1000-d1-c2-n1.out.txt
Reading input matrix took 0.027s
Doing a precise inversion took 0.056s
Writing result matrix took 0.257s
➜ bauerc@cn-0252 SubmatrixMethod git:(master) MKL_NUM_THREADS=20 ./mkl-matrix-inv 1000 sprandsym-s1000-d1-c2-n1.txt sprandsym-s1000-d1-c2-n1.out.txt
Reading input matrix took 0.027s
Doing a precise inversion took 0.094s
Writing result matrix took 0.261s
Taking the minimum inversion time over the four runs of each configuration, we find 65 ms for 1 thread and 54 ms for 20 threads.
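The minimum-of-runs comparison can be reproduced with a short script. This is only a sketch: the timings are copied by hand from the "Doing a precise inversion took" lines above, and the names `inversion_s` and `best_ms` are made up for illustration and are not part of the SubmatrixMethod repository.

```python
# Inversion timings in seconds, copied by hand from the logs above,
# keyed by the MKL_NUM_THREADS value used for the run.
inversion_s = {
    1: [0.081, 0.077, 0.077, 0.065],
    20: [0.080, 0.054, 0.056, 0.094],
}


def best_ms(samples):
    """Return the fastest sample, converted from seconds to whole milliseconds."""
    return round(min(samples) * 1000)


# Best (minimum) inversion time per thread count, in milliseconds.
best = {threads: best_ms(samples) for threads, samples in inversion_s.items()}

# Ratio of the single-threaded best time to the 20-thread best time.
speedup = best[1] / best[20]

for threads, ms in best.items():
    print(f"{threads:>2} thread(s): {ms} ms")
print(f"speedup 1 -> 20 threads: {speedup:.2f}x")
```

On these numbers the ratio comes out at roughly 1.2. With only four runs per setting and a spread of up to 40 ms between runs, that should be read as a rough indication rather than a precise measurement.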
Edited by Carsten Bauer