Tags: Large Language Models, Online Alignment, Reinforcement Learning, Human Feedback, Reward Models
4 Jun, 2024
Revolutionizing Large Language Models: Active Preference Elicitation for Online Alignment
By Desmond Morales