AI Safety Diary: September 27, 2025
A diary entry on the third chapter of the AI Safety Book, which covers the core challenges of single-agent safety, such as specifying correct reward functions and preventing unintended behavior in a single AI system.