gemm
Here are 63 public repositories matching this topic...
Low Precision Arithmetic for Convolutional Neural Network Inference (Updated Oct 29, 2017, C++)
This repository targets performance optimization of the OpenCL GEMM function. It compares several libraries, clBLAS, CLBlast, MIOpenGEMM, Intel MKL (CPU), and cuBLAS (CUDA), across different matrix sizes, hardware vendors, and operating systems. Ready-to-run x86_64 binaries are provided for MSVC, MinGW, and Linux (CentOS). (Updated Mar 28, 2019, C)
My experiments with convolution (Updated Jun 21, 2020, C++)
Serial and parallel implementations of matrix multiplication (Updated Feb 19, 2021, C++)
Fast matrix multiplication implementation in the C programming language. The algorithm is similar to what NumPy uses to compute dot products. (Updated Jun 6, 2021, C)
BLISlab: A Sandbox for Optimizing GEMM (Updated Jun 17, 2021, C)
Manual optimization of the GEMM (GEneral Matrix Multiply) operation; still a long way to go. (Updated Aug 22, 2021, C++)
Optimizing SGEMM kernels on NVIDIA GPUs to close-to-cuBLAS performance. (Updated Nov 28, 2021, Cuda)
Benchmarking of matrix-matrix multiplication implementations (Updated Dec 2, 2021, Rust)