FRENCH INTELLIGENCE
frenchintelligence.org
reinforcement-learning
RLHF – The Key to Building Safe AI Models Across Industries
October 7, 2024
Human Study Validates GPT-4 Win Rates for TL;DR Summarization
August 26, 2024
Performance of Best of N Baseline for Various N and Sample Responses and GPT-4 Judgments
August 26, 2024
The Unlikelihood Baseline in Sentiment Experiments
August 26, 2024
GPT-4 Prompts for Computing Summarization and Dialogue Win Rates
August 26, 2024
Fine-Tuning GPT-2 for IMDb Sentiment Analysis
August 26, 2024
DPO Hyperparameters and Implementation Details
August 26, 2024
Analyzing Reward Functions and Equivalence Classes
August 26, 2024
Deriving the DPO Objective Under the Plackett-Luce Model
August 25, 2024
Deriving the DPO Objective Under the Bradley-Terry Model
August 25, 2024
Deriving the Optimum of the KL-Constrained Reward Maximization Objective
August 25, 2024
Behind the Scenes: The Team Behind DPO
August 25, 2024
GPT-4 vs. Humans: Validating AI Judgment in Language Model Training
August 25, 2024
Theoretical Analysis of Direct Preference Optimization
August 25, 2024
Bypassing the Reward Model: A New RLHF Paradigm
August 25, 2024
How AI Learns from Human Preferences
August 25, 2024
Simplifying AI Training: Direct Preference Optimization vs. Traditional RL
August 25, 2024
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
August 25, 2024
Exploration of Reinforcement Learning in LLMs
June 20, 2024