AI Safety Diary: September 15, 2025
Draws parallels between AI ‘scheming’ and ape language experiments, exploring deceptive tendencies in LLMs and the need for advanced monitoring to ensure AI safety.
Examines the challenges of predicting AI agent behavior from observed actions, and the implications for AI safety, alignment, and robust monitoring.
A diary entry on tracing the reasoning processes of Large Language Models (LLMs) to improve interpretability, and a discussion of the inherent challenges of achieving AI alignment.
A diary entry on Unit 1 of the BlueDot AI Alignment course, covering foundational concepts such as neural networks, gradient descent, and transformers, along with the future impacts of AI.