rlhf

Here are 116 public repositories matching this topic...

MOONLAPSED / cognOS

Python package for the cognosis knowledge base, syntax, and markup language. Under construction.

agent rlhf local-llm llama2

Updated May 27, 2024
Python

vualidon / rewrite_retrieve_read_law

RAG system for law, based on Google Search and Gemini Pro

law rag google-search-api llm rlhf gemini-pro

Updated Mar 14, 2024
Python

himanshuvnm / Foundation-Model-Large-Language-Model-FM-LLM

This repository covers key tasks that underpin modern generative AI concepts. In particular, it focuses on three coding exercises with large language models. Further details are given in the README.md file.

aws python3 pytorch lora rnn-pytorch attention-is-all-you-need fine-tuning hate-speech-detection huggingface huggingface-transformers foundation-models large-language-models generative-ai rlhf flan-t5 peft-fine-tuning-llm ml-m5-2xlarge low-rank-ada

Updated Mar 28, 2024
Jupyter Notebook

AMfeta99 / NLP_LLM

This repository is dedicated to small projects and some theoretical material that I used to get into NLP and LLMs in a practical and efficient way.

Updated May 6, 2024
Jupyter Notebook

akain0 / Reinforcement-Learning-

Projects and models built in Python with PyTorch, implementing reinforcement learning algorithms for reward-based tasks.

reinforcement-learning reinforcement-learning-algorithms a3c lstm-neural-networks bellman-equation rlhf

Updated May 7, 2024
Jupyter Notebook

saschaschramm / tiny-chatgpt

Researching the reinforcement learning algorithm of ChatGPT

gae temporal-differencing-learning ppo chatgpt rlhf general-advantage-estimation

Updated Apr 7, 2023
Jupyter Notebook

OpenRL-Lab / RL_Tutorial

Reinforcement Learning Tutorial (强化学习教程)

reinforcement-learning deep-reinforcement-learning tutorials pytorch dqn on-policy rlhf

Updated Sep 10, 2023

ZiyiZhang27 / tdpo

[ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"

reinforcement-learning alignment text-to-image diffusion-models stable-diffusion human-feedback rlhf

Updated May 20, 2024
Python

jddunn / rlhf

Library built on TextRL for easy training and use of models fine-tuned with RLHF, a reward model, and PPO

ppo rlhf reward-model textrl

Updated Feb 28, 2024
Python

lyndskg / ChatGPT4Me

A program that enhances and customizes ChatGPT's underlying pre-trained, transformer-based LLM. Based on OpenAI's beta InstructGPT fine-tuned model.

supervised-learning gpt fine-tuning gpt-3 llm chatgpt chatgpt-api chatgpt3 rlhf instructgpt

Updated Jul 30, 2023

AlignInc / aligner-replication

A reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction"

alignment aligner rlhf

Updated Mar 29, 2024
Python

ssbuild / t5_rlhf

chatyuan_rlhf_training

lora reward ppo t5 rlhf adalora qlora

Updated Sep 19, 2023
Python

vicgalle / awesome-rlaif

A curated and updated list of relevant articles and repositories on Reinforcement Learning from AI Feedback (RLAIF)

awesome research language-model llm rlhf rlaif

Updated Jan 24, 2024

phonism / llm4cp

Large Language Model for Competitive Programming

competitive-programming llama ppo large-language-models rlhf

Updated Apr 28, 2023
Python

log10-io / log10js

JavaScript client library for managing your LLM data in one place

javascript debugging ai monitoring logging artificial-intelligence openai autonomous-agents openai-api langchain rlhf llmops langchain-js

Updated May 3, 2023
JavaScript

jeremy-collins / robot-rlhf

Robot Learning from Human Feedback. Inspired by advancements in NLP, we train a robot policy via reinforcement learning using a reward function learned exclusively from human preferences.

reinforcement-learning robotics alignment chatgpt rlhf