Strange Evals - Benching Benchmarks

Fri, Jun 19
03:30 AM – 05:00 AM
Free · See website
About the event

Join us for an engaging paper reading club where we delve into the intricate world of benchmarks!


Each session explores widely cited benchmarks, helping us build a robust intuition about the landscape we're navigating. We'll dissect papers and examine raw benchmark samples; surprisingly, many of them don't make a lot of sense!

This week's discussion is especially intriguing: instead of zeroing in on a single benchmark, we'll explore whether benchmark scores actually predict how models are used in the real world.

Presenter: Jake Boggs himself


Pre-reading: Can We Predict Model Usage from Benchmarks?


A special shoutout to HUD for hosting us! Check them out at HUD.ai

Please note that attendance will be limited to keep the discussion focused.
