AI Safety Diary: September 22, 2025
A diary entry on the 8th chapter of the Effective Altruism Handbook, focusing on the practical application of EA principles in career choices, donations, and community involvement to maximize positive impact.
Completed Chapter 7 of the Effective Altruism Handbook, ‘What do you think?’, which emphasizes the importance of critical thinking and actively contributing personal insights to the community discourse.
A diary entry on AI risks, including misalignment, misuse, and s-risks, and an exploration of emergent misalignment due to prompt sensitivity in LLMs.
A diary entry on longtermism and its moral implications for the future, and a paper on teaching models to verbalize reward hacking in Chain-of-Thought reasoning.
A diary entry on Chapter 4 of the Effective Altruism Handbook, ‘Our Final Century?’, which examines existential risks, particularly human-made pandemics, and strategies for biosecurity.
A diary entry on Chapter 3 of the Effective Altruism Handbook, ‘Radical Empathy’, which explores impartial care and extending empathy to non-human animals.
A diary entry on Chapter 2 of the Effective Altruism Handbook, focusing on the significant differences in the impact of interventions aimed at alleviating global poverty.
A diary entry exploring the ‘Effectiveness Mindset’ from the Effective Altruism Handbook in the context of AI safety and governance.