Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.
Updated May 29, 2024 - Python
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Unify Efficient Fine-Tuning of 100+ LLMs
SimPO: Simple Preference Optimization with a Reference-Free Reward
Achieving Efficient Alignment through Learned Correction
Official release of InternLM2 7B and 20B base and chat models. 200K context support
Python client library for improving your LLM app accuracy
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
AI research lab🔬: implementations of AI papers and theoretical research: InstructGPT, llama, transformers, diffusion models, RLHF, etc...
A curated list of reinforcement learning with human feedback resources (continually updated)
Robust recipes to align language models with human and AI preferences
RewardBench: the first evaluation tool for reward models.