gemm
Here are 63 public repositories matching this topic...
Low Precision Arithmetic for Convolutional Neural Network Inference (Updated Oct 29, 2017, C++)
This repository targets performance optimization of the OpenCL GEMM function. It compares several libraries, clBLAS, CLBlast, MIOpenGEMM, Intel MKL (CPU), and cuBLAS (CUDA), across different matrix sizes, hardware vendors, and operating systems. Ready-to-run x86_64 binaries are provided for MSVC, MinGW, and Linux (CentOS). (Updated Mar 28, 2019, C)
My experiments with convolution (Updated Jun 21, 2020, C++)
Serial and parallel implementations of matrix multiplication (Updated Feb 19, 2021, C++)
Fast matrix multiplication implementation in the C programming language. The algorithm is similar to what NumPy uses to compute dot products. (Updated Jun 6, 2021, C)
BLISlab: A Sandbox for Optimizing GEMM (Updated Jun 17, 2021, C)
Manual optimization of the GEMM (GEneral Matrix Multiply) operation; still a long way to go. (Updated Aug 22, 2021, C++)
Optimizing SGEMM kernels on NVIDIA GPUs to close-to-cuBLAS performance. (Updated Nov 28, 2021, Cuda)
Benchmarking of matrix-matrix multiplication implementations (Updated Dec 2, 2021, Rust)