Evaluation (2 posts)
AI, Large Language Models, LiveBench, Benchmark, Evaluation • 13 Jun, 2024
LiveBench: A New Standard for Evaluating Large Language Models
By Elise Montgomery
LLMs, Evaluation, Generation-Based Metrics, Artificial Intelligence • 6 Mar, 2024
Rethinking LLM Evaluation: A Shift to Generation-Based Metrics
By Avery Parks