
Reading Group (+🧋): JudgmentBench: Comparing Rubric and Preference Evaluation for Quality Assessment
About this event
Join the Snorkel AI Reading Group, a dynamic forum dedicated to exploring groundbreaking advancements in AI while fostering meaningful connections in our community. 🤝
In this insightful afternoon session, Russell Yang, an AI Engineering Fellow at Stanford Law School, will present his recent research paper: JudgmentBench: Comparing Rubric and Preference Evaluation for Quality Assessment.
Agenda:
- 3pm - Doors open
- 3:30pm - Talk begins
🧋 Enjoy boba tea and other refreshments while you learn! 🧋🧋🧋
Key Takeaways:
- What is JudgmentBench? A unique dataset comprising 30 real-world legal tasks with 1,539 rubric scores and 1,530 pairwise preference judgments, sourced from practicing attorneys including those from major U.S. law firms.
- Learn why this is the first public dataset in a specialized domain where both supervision signals are gathered from the same experts on identical items.
- Explore why the choice between rubric scoring and comparative judgment often goes unjustified, even though both methods dominate current benchmarking.
- Discover how comparative judgments significantly outperform rubrics in quality ordering, with a mean Spearman correlation of 0.908 vs. 0.150, while also requiring less than half the annotation time.
- Understand how this pattern holds true for both human annotators and LLM autograders.
- Delve into the broader research agenda opened by this paired dataset on how expert judgment should be effectively elicited, aggregated, and utilized in fields lacking verifiable ground truth.
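For attendees unfamiliar with the Spearman correlation cited above, here is a minimal, generic sketch of how a rank correlation between a reference quality ordering and a set of evaluation scores can be computed. It is not the paper's evaluation pipeline; the function names and the example numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant (ranks have no variance).
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: five responses with a known quality ordering,
# scored by two made-up evaluation methods (not data from the paper).
reference_quality = [1, 2, 3, 4, 5]
method_a_scores = [0.10, 0.35, 0.50, 0.80, 0.95]  # preserves the ordering
method_b_scores = [3, 3, 2, 3, 3]                 # mostly tied, little ordering signal

print(spearman(reference_quality, method_a_scores))
print(spearman(reference_quality, method_b_scores))
```

A value near 1 means the scores order the items almost exactly as the reference does; a value near 0 means the scores carry little ordering information.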
JudgmentBench is a collaborative effort among Stanford, Harvey AI, and Snorkel AI.
📍 Location: 101 Second Street
🔗 Reserve Your Spot!








