AI Safety Diary: September 22, 2025

A diary entry on Chapter 8 of the Effective Altruism Handbook, focusing on applying EA principles to career choices, donations, and community involvement to maximize positive impact.

September 22, 2025 · 1 min

AI Safety Diary: September 16, 2025

Completed Chapter 7 of the Effective Altruism Handbook, ‘What do you think?’, which emphasizes the importance of critical thinking and actively contributing personal insights to the community discourse.

September 16, 2025 · 1 min

AI Safety Diary: September 10, 2025

A diary entry on AI risks, including misalignment, misuse, and s-risks, and an exploration of emergent misalignment due to prompt sensitivity in LLMs.

September 10, 2025 · 1 min

AI Safety Diary: September 9, 2025

A diary entry on longtermism and its moral implications for the future, and on a paper about teaching models to verbalize reward hacking in chain-of-thought reasoning.

September 9, 2025 · 1 min

AI Safety Diary: August 25, 2025

A diary entry on Chapter 4 of the Effective Altruism Handbook, ‘Our Final Century?’, which examines existential risks, particularly human-made pandemics, and strategies for biosecurity.

August 25, 2025 · 1 min

AI Safety Diary: August 18, 2025

A diary entry on Chapter 3 of the Effective Altruism Handbook, ‘Radical Empathy’, which explores impartial care and extending empathy to non-human animals.

August 18, 2025 · 1 min

AI Safety Diary: August 15, 2025

A diary entry on Chapter 2 of the Effective Altruism Handbook, focusing on the significant differences in the impact of interventions aimed at alleviating global poverty.

August 15, 2025 · 1 min

AI Safety Diary: August 8, 2025

A diary entry exploring the ‘Effectiveness Mindset’ from the Effective Altruism Handbook in the context of AI safety and governance.

August 8, 2025 · 1 min