A high-throughput and memory-efficient inference and serving engine for LLMs
TensorRT C++ API Tutorial
Wingman is the fastest and easiest way to run Llama models on your PC or Mac.
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Template designed to kickstart your machine learning training projects in Python
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Substrate Python SDK
Substrate TypeScript SDK
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
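To illustrate the "single line of code" claim: Xinference exposes an OpenAI-compatible HTTP API, so an app keeps the same request format and only swaps the endpoint URL. A minimal stdlib sketch (the local URL, port, and model name are assumptions for illustration; nothing is actually sent):

```python
import json
import urllib.request

# Swapping this one line is the change: point the OpenAI-format request at a
# local Xinference server instead of api.openai.com (URL/port are assumptions).
BASE_URL = "http://localhost:9997/v1"

# A chat-completions request body in the OpenAI wire format.
body = json.dumps({
    "model": "llama-2-chat",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello"}],
}).encode()

# Build (but do not send) the request, to show the endpoint shape.
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
print(req.full_url)
```

Because the wire format is unchanged, the rest of the application code — message construction, response parsing — stays exactly as it was with OpenAI.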
Large Language Model Text Generation Inference
A universal scalable machine learning model deployment solution
⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW.
Python library for structure and parameter learning, probabilistic and causal inference, and simulation in Bayesian networks.
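The probabilistic inference such a library automates can be sketched in plain Python — a toy two-node network (Rain → WetGrass) queried by brute-force enumeration. All probabilities are illustrative, and this is not the library's own API:

```python
# Toy Bayesian network: Rain -> WetGrass, with made-up probabilities.
P_RAIN = {True: 0.2, False: 0.8}
P_WET_GIVEN_RAIN = {
    True:  {True: 0.9, False: 0.1},
    False: {True: 0.1, False: 0.9},
}

def posterior_rain_given_wet() -> float:
    """P(Rain | WetGrass=True) via enumeration (Bayes' rule)."""
    # Joint probability of each Rain value with WetGrass=True observed.
    joint = {r: P_RAIN[r] * P_WET_GIVEN_RAIN[r][True] for r in (True, False)}
    # Normalize over the evidence.
    return joint[True] / sum(joint.values())

print(round(posterior_rain_given_wet(), 3))  # → 0.692
```

Real libraries replace this exhaustive enumeration with algorithms such as variable elimination, which scale to networks with many variables.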
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (GPU support planned; PRs welcome).
Search, Knowledge, Uncertainty, Optimization, Learning, Neural Networks and Language.
A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.