AI Safety Diary: September 8, 2025
A diary entry on common use cases for AI models and the risks of models obfuscating their reasoning to evade safety monitors.