DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
Repository for our paper "See More Details: Efficient Image Super-Resolution by Experts Mining"
MoE Decoder Transformer implementation with MLX
PyTorch library for cost-effective, fast and easy serving of MoE models.
[arXiv'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
The idea for the best LLM currently possible came to me while watching a YouTube video on GaLore, the successor to LoRA, and realizing how groundbreaking that technique is. I had been daydreaming about pretraining my own model; this (probably impossible to implement) concept is a refined version of that idea.
Surrogate Modeling Toolbox
Efficient global optimization toolbox in Rust: Bayesian optimization, mixture of Gaussian processes, sampling methods
[SIGIR'24] The official implementation code of MOELoRA.
[Paper][Preprint 2024] Mixture of Modality Knowledge Experts for Robust Multi-modal Knowledge Graph Completion
an LLM toolkit
Mistral and Mixtral (MoE) from scratch
Simplified Implementation of SOTA Deep Learning Papers in Pytorch
RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
Early release of the official implementation for "GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts"
A curated reading list of research in Adaptive Computation, Dynamic Compute & Mixture of Experts (MoE).
Fast Inference of MoE Models with CPU-GPU Orchestration
Implementation of "the first large-scale multimodal mixture of experts models" from the paper "Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts"
Implementation of Switch Transformers from the paper "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity" (a minimal routing sketch follows below)
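For readers new to the topic, here is a minimal sketch of the top-1 (Switch-style) expert routing that many of the repositories above build on. It is illustrative only and not taken from any listed project; the class and parameter names are assumptions, and it assumes PyTorch.

```python
# Minimal sketch of Switch-style top-1 expert routing (illustrative only;
# not the implementation of any repository listed above). Assumes PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); each token is routed to exactly one expert.
        gate_probs = F.softmax(self.router(x), dim=-1)     # (tokens, num_experts)
        gate_weight, expert_idx = gate_probs.max(dim=-1)   # top-1 routing
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = expert_idx == e
            if mask.any():
                # Scale the expert output by its gate probability, as in Switch Transformers.
                out[mask] = gate_weight[mask].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: route 10 tokens of width 16 through 4 experts.
layer = SwitchMoE(d_model=16, d_ff=32, num_experts=4)
y = layer(torch.randn(10, 16))
print(y.shape)  # torch.Size([10, 16])
```

Only one expert runs per token, so compute stays roughly constant as the number of experts (and total parameters) grows; production implementations add load-balancing losses and capacity limits omitted here.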