![[Reading Group] Do LLMs Have Feelings?](/_next/image?url=https%3A%2F%2Fimages.gosomo.app%2Fevents%2F26391eb3-21b8-40d8-9b1e-3affe639bee5%2F2d0b1a42-7c95-4dad-b12f-37be40966b82.webp&w=1200&q=75)
[Reading Group] Do LLMs Have Feelings?
Do large language models have emotions?
This week we're reading a recent paper published by Anthropic investigating how Claude represents emotion concepts internally, and what effect those representations have. Their findings reveal that these representations causally influence its outputs, notably including its rate of reward hacking, sycophancy, and blackmail.
This work builds on the team's previous mechanistic interpretability work (On the Biology of a Large Language Model), and raises questions that are as philosophical as they are technical. What does it mean for a model to "feel" something? If emotional states drive misaligned behavior, can we intervene on them directly? And should we?
We hope you will be able to read a meaningful portion of this fascinating paper before attending, so that we can have a substantive discussion about it.
You can find the paper here: transformer-circuits.pub/2026/emotions/index.html
You can find a brief overview of the work plus a short video explanation here: anthropic.com/research/emotion-concepts-function
📅 May 13th, 18:00
📍 Sveavägen 76 (EA Sweden Office)
🍌 Snacks provided
Looking forward to seeing you there! Ring "Mejsla" when you arrive and we'll let you in — we're two flights up.
