dpo

An open-source framework designed to adapt pre-trained Large Language Models (LLMs), such as Llama, Mistral, and Mixtral, to a wide array of domains and languages.

lora finetuning dpo llm finetuning-llms continual-pre-training

Updated May 27, 2024
Python

armbues / SiLLM

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.

lora mlx dpo apple-silicon large-language-models llm llm-training llm-inference

Updated May 24, 2024
Python

ContextualAI / HALOs

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

alignment ppo halos dpo kto rlhf

Updated May 24, 2024
Python

ducnh279 / Align-LLMs-with-DPO

Align a Large Language Model (LLM) with DPO loss

python transformers pytorch alignment dpo llms

Updated May 17, 2024

armbues / SiLLM-examples

Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon

lora mlx dpo apple-silicon large-language-models llm llm-training llm-inference

Updated May 17, 2024
Python

shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, with implementations of incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.

medical llama gpt dpo llm chatgpt medicalgpt

Updated May 15, 2024
Python

golang-malawi / go-dpo

Unofficial Go library for DPO Group

golang library payments dpo

Updated May 3, 2024
Go

DPO-Group / DPO_Gravity_Forms

This is the DPO Group plugin for Gravity Forms.

gravityforms gravity-forms gravityforms-payment dpo

Updated Apr 29, 2024
PHP

ssbuild / llm_dpo

DPO fine-tuning

lora dpo qlora

Updated Apr 23, 2024
Python

vicgalle / configurable-safety-tuning

Data and models for the paper "Configurable Safety Tuning of Language Models with Synthetic Preference Data"

alignment safety preference-learning dpo llm

Updated Apr 23, 2024
Python

sugarandgugu / Simple-Trl-Training

Fine-tune large language models with the DPO algorithm; simple and easy to get started.

simple dpo trl llm rlhf

Updated Apr 16, 2024
Python

RobinSmits / Dutch-LLMs

Various training, inference, and validation code and results related to open LLMs that were pretrained (fully or partially) on the Dutch language.

transformers pytorch alpaca peft dpo trl large-language-models open-llama polylm qwen2

Updated Apr 9, 2024
Jupyter Notebook

levje / resnet-dpo

Proof of concept that uses the DPO loss to fine-tune a ResNet to classify images from the CIFAR10 dataset.

pytorch alignment classification dpo

Updated Mar 29, 2024
Python

M4-ai / Mulimega4-ai

We're improving Yi-9B-200K with a ton of new abilities for high performance in generalist and specialist tasks.

ai math checkpoint codegen yi finetune dpo function-calling agentics

Updated Mar 27, 2024

martin-wey / CodeUltraFeedback

CodeUltraFeedback: aligning large language models to coding preferences

alignment code-generation dpo large-language-models llm-as-a-judge codeultrafeedback codal-bench

Updated Mar 17, 2024
Python

eyess-glitch / phi-2-fine-tuning

This repository contains the source code used for fine-tuning the LLM phi-2 with several techniques, such as DPO.

fine-tuning dpo llm human-eval retrieval-augmented-generation phi-2

Updated Mar 3, 2024
Jupyter Notebook

ukairia777 / tensorflow-nlp-tutorial

A deep learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for topic models and recent models such as BERT, GPT, and LLMs.

nlp natural-language-processing tensorflow transformers named-entity-recognition question-answering llama lora trainer bert keras-tutorial sft dpo nlp-tutorial huggingface bert-ner llm

Updated Feb 22, 2024
Jupyter Notebook


dpo

Here are 43 public repositories matching this topic...

jianzhnie / LLamaTuner

modelscope / swift

DPO-Group / DPO_WooCommerce

adithya-s-k / Indic-llm
