inference

Here are 531 public repositories matching this topic...

hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible

ai deep-learning hpc distributed-computing inference big-model large-scale data-parallelism model-parallelism pipeline-parallelism foundation-models heterogeneous-training

Updated Jan 24, 2025
Python

microsoft / DeepSpeed

Star

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

machine-learning compression deep-learning gpu inference pytorch zero data-parallelism model-parallelism mixture-of-experts pipeline-parallelism billion-parameters trillion-parameters

Updated Jan 29, 2025
Python

vllm-project / vllm

Sponsor

Star

A high-throughput and memory-efficient inference and serving engine for LLMs

amd cuda inference pytorch transformer llama gpt rocm model-serving tpu hpu mlops xpu llm inferentia llmops llm-serving trainium

Updated Jan 29, 2025
Python

SYSTRAN / faster-whisper

Star

Faster Whisper transcription with CTranslate2

deep-learning inference transformer speech-recognition openai speech-to-text quantization whisper

Updated Jan 1, 2025
Python

huggingface / text-generation-inference

Star

Large Language Model Text Generation Inference

nlp bloom deep-learning inference pytorch falcon transformer gpt starcoder

Updated Jan 29, 2025
Python

triton-inference-server / server

Star

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

machine-learning cloud deep-learning gpu inference edge datacenter

Updated Jan 29, 2025
Python

sgl-project / sglang

Star

SGLang is a fast serving framework for large language models and vision language models.

cuda inference pytorch transformer moe llama vlm llm llm-serving llava llama2 deepseek-llm deepseek llama3 llama3-1 deepseek-v3

Updated Jan 28, 2025
Python

Linzaer / Ultra-Light-Fast-Generic-Face-Detector-1MB

Star

💎1MB lightweight face detection model (1MB轻量级人脸检测模型)

arm inference face-detection mnn ncnn

Updated Dec 29, 2023
Python

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Updated Jan 29, 2025
Python

Trusted-AI / adversarial-robustness-toolbox

Star

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

python machine-learning privacy ai attack extraction inference artificial-intelligence evasion red-team poisoning adversarial-machine-learning blue-team adversarial-examples adversarial-attacks trusted-ai trustworthy-ai

Updated Jan 27, 2025
Python

NVIDIA-AI-IOT / torch2trt

Star

An easy to use PyTorch to TensorRT converter

inference pytorch classification tensorrt jetson-tx2 jetson-xavier jetson-nano

Updated Aug 17, 2024
Python

AutoGPTQ / AutoGPTQ

Star

An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.

nlp deep-learning transformers inference pytorch transformer quantization large-language-models llms

Updated Jan 21, 2025
Python

openvinotoolkit / open_model_zoo

Star

Pre-trained Deep Learning models and demos (high quality and extremely fast)

demo model-zoo model models inference cnn-model caffemodel tensorflow-models pytorch-models deep-learning-models openvino onnx-models openvino-toolkit openvino-models openvino-model-zoo

Updated Jan 16, 2025
Python

neuralmagic / deepsparse

Star

Sparsity-aware deep learning inference runtime for CPUs

nlp performance computer-vision inference machinelearning pruning object-detection pretrained-models quantization cpus onnx sparsification llm-inference deepsparse

Updated Jul 19, 2024
Python

pgmpy / pgmpy

Sponsor

Star

Python Library for learning (Structure and Parameter), inference (Probabilistic and Causal), and simulations in Bayesian Networks.

python inference simulations bayesian-networks causal-inference structure-learning causal-discovery causal-identification

Updated Jan 28, 2025
Python

huggingface / optimum

Star

🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools

training optimization intel transformers inference pytorch quantization onnx tflite onnxruntime graphcore habana

Updated Jan 29, 2025
Python

microsoft / DeepSpeed-MII

Star

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

deep-learning inference pytorch

Updated Jan 24, 2025
Python

tobegit3hub / tensorflow_template_application

Star

TensorFlow template application for deep learning

machine-learning csv deep-learning tensorflow inference cnn lstm tensorboard mlp libsvm tfrecords wide-and-deep serving

Updated Jul 5, 2023
Python

pytorch / ao

Star

PyTorch native quantization and sparsity for training and inference

training sparsity cuda inference optimizer pytorch transformer offloading llama quantization mx brrr dtypes float8

Updated Jan 29, 2025
Python

ELS-RD / transformer-deploy

Star

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀

machine-learning natural-language-processing deep-learning deployment server inference

Updated Oct 23, 2024
Python

Improve this page

Add a description, image, and links to the inference topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the inference topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

inference

Here are 531 public repositories matching this topic...

hpcaitech / ColossalAI

microsoft / DeepSpeed

vllm-project / vllm

SYSTRAN / faster-whisper

huggingface / text-generation-inference

triton-inference-server / server

sgl-project / sglang

Linzaer / Ultra-Light-Fast-Generic-Face-Detector-1MB

xorbitsai / inference

Trusted-AI / adversarial-robustness-toolbox

NVIDIA-AI-IOT / torch2trt

AutoGPTQ / AutoGPTQ

openvinotoolkit / open_model_zoo

neuralmagic / deepsparse

pgmpy / pgmpy

huggingface / optimum

microsoft / DeepSpeed-MII

tobegit3hub / tensorflow_template_application

pytorch / ao

ELS-RD / transformer-deploy

Improve this page

Add this topic to your repo