AI Safety Diary: August 13, 2025

Today, I began the BlueDot AI Alignment course and completed its first unit as part of my AI safety studies. Below is the resource I reviewed.

Resource: AI and the Years Ahead
Source: Unit 1: AI and the Years Ahead, BlueDot Impact AI Alignment Course.
Summary: This unit introduces the foundational concepts of AI and its potential future impacts. It describes AI as a collection of approaches, focusing on the key techniques used to train large language models (LLMs) such as ChatGPT: neural networks, gradient descent, and transformers. The unit explains how hardware advances have driven AI progress and covers essential machine learning terms such as weights, biases, parameters, neurons, and activations. It also explores the economic and non-economic incentives behind developing transformative AI systems and highlights recent advances in AI capabilities, providing a framework for understanding AI's societal and economic implications.

August 13, 2025 · 1 min · Serhat Giydiren

AI Safety Diary: August 12, 2025

Today, I completed Unit 1: How AI Systems Work of the BlueDot AI Governance course. Below is a summary of each resource I explored.

Resource: How Does AI Learn? A Beginner's Guide with Examples
Source: How Does AI Learn? A Beginner's Guide with Examples, AI Safety Fundamentals.
Summary: This guide provides an accessible introduction to how AI systems learn, focusing on machine learning. It explains key concepts like supervised learning (where models learn from labeled data, e.g., identifying spam emails), unsupervised learning (finding patterns in unlabeled data, e.g., clustering customer preferences), and reinforcement learning (learning through rewards, e.g., training a game-playing AI). The article uses simple examples to illustrate how models are trained and highlights challenges like overfitting and the need for diverse datasets to avoid bias.

Resource: Large Language Models Explained Briefly
Source: Large Language Models Explained Briefly, YouTube video.
Summary: This short video offers a concise overview of large language models (LLMs). It explains that LLMs, like GPT-3, are trained on vast text datasets to predict the next word in a sequence, enabling them to generate coherent text. The video covers their architecture, primarily transformers, and their applications, such as chatbots and text generation. It also touches on limitations, including high computational costs and potential biases in training data.

Resource: Intro to Large Language Models
Source: Intro to Large Language Models, YouTube video by Andrej Karpathy.
Summary: This one-hour talk provides a detailed introduction to large language models (LLMs). It explains how LLMs are built on transformer architectures and trained on massive text corpora to perform tasks like text generation, translation, and question answering. The video discusses the importance of scaling compute and data for performance improvements, the emergence of capabilities like reasoning, and challenges such as alignment with human values and mitigating harmful outputs.

Resource: Visualizing the Deep Learning Revolution
Source: Visualizing the Deep Learning Revolution by Richard Ngo, Medium.
Summary: This article illustrates the rapid progress in AI over the past decade, driven by deep learning. It covers advances in four domains: vision (e.g., image and video generation with GANs, transformers, and diffusion models), games (e.g., AlphaGo, AlphaStar, and AI in open-ended environments like Minecraft), language-based tasks (e.g., GPT-2, GPT-3, and ChatGPT's text generation and reasoning capabilities), and science (e.g., AlphaFold 2's protein structure predictions and AI-generated chemical compounds). The article emphasizes that scaling compute and data, rather than new algorithms, has been the primary driver of progress. It also notes that AI's rapid advancement has surprised experts and raises concerns about existential risks from unaligned AGI.
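The next-word objective the two LLM videos describe can be illustrated, in miniature, with bigram counts: record which word follows which, then predict the most frequent follower. This sketch is my own invented example (names and corpus included), not anything from the videos; real LLMs learn the same kind of conditional distribution with transformers over vastly larger corpora.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
using namespace std;

// Count, for every word in the corpus, how often each word follows it.
map<string, map<string, int>> count_bigrams(const string& corpus) {
    map<string, map<string, int>> counts;
    istringstream in(corpus);
    string prev, word;
    while (in >> word) {
        if (!prev.empty()) counts[prev][word]++;
        prev = word;
    }
    return counts;
}

// Predict the most frequent follower of w ("" if w was never seen).
string predict_next(const map<string, map<string, int>>& counts, const string& w) {
    auto it = counts.find(w);
    if (it == counts.end()) return "";
    string best; int best_n = 0;
    for (const auto& [next, n] : it->second)
        if (n > best_n) { best = next; best_n = n; }
    return best;
}
```

For the corpus "the cat sat on the mat the cat ran", "the" is followed by "cat" twice and "mat" once, so the model predicts "cat". Replacing the count table with a learned neural network is, at heart, the step from this toy to an LLM.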

August 12, 2025 · 2 min · Serhat Giydiren

AI Safety Diary: August 11, 2025

Today, I began Unit 1: How AI Systems Work of the BlueDot AI Governance course. Below is the resource I explored.

Resource: The AI Triad and What It Means for National Security Strategy
Source: The AI Triad and What It Means for National Security Strategy by Ben Buchanan, Center for Security and Emerging Technology (CSET), August 2020.
Summary: This paper introduces the "AI Triad" framework—algorithms, data, and computing power—to explain modern machine learning and its implications for national security. It describes algorithms as instructions for processing information, covering supervised learning (predicting outcomes from labeled data), unsupervised learning (finding patterns in unorganized data), and reinforcement learning (learning through trial and error). Data is critical for training AI systems, particularly for supervised learning, but requires careful management to avoid bias and address privacy concerns. Computing power is highlighted as a key driver of AI progress, with a 300,000-fold increase in compute used for top AI projects from 2012 to 2018. The paper connects these components to national security applications, such as analyzing drone footage, targeting propaganda, and powering autonomous military vehicles. It also discusses policy levers like talent recruitment for algorithms, privacy regulations for data, and export controls for compute.

August 11, 2025 · 1 min · Serhat Giydiren

AI Safety Diary: August 10, 2025

Today, I continued exploring the Introduction to AI Safety, Ethics, and Society textbook as part of my AI safety studies. Below is the resource I reviewed.

Resource: Introduction to AI Safety, Ethics, and Society (Chapters 6–10 Slides)
Source: Introduction to AI Safety, Ethics, and Society by Dan Hendrycks, Taylor & Francis, 2024.
Summary: The slides for chapters 6–10 of this textbook, developed by Dan Hendrycks, director of the Center for AI Safety, conclude the introduction to AI safety, ethics, and societal impacts. The chapters covered are:

Chapter 6: Beneficial AI and Machine Ethics - Explores the design of AI systems that align with human values and ethical principles, discussing frameworks for ensuring AI contributes positively to society.
Chapter 7: Collective Action Problems - Examines challenges in coordinating AI development across stakeholders, addressing issues like competition and cooperation that impact safe AI deployment.
Chapter 8: Governance - Covers approaches to AI governance, including safety standards, international treaties, and trade-offs between centralized and decentralized access to advanced AI systems.
Chapter 9: Appendix: Ethics - Provides additional insights into ethical considerations for AI, focusing on moral frameworks and their application to AI decision-making.
Chapter 10: Appendix: Utility Functions - Discusses the role of utility functions in AI systems, exploring how they shape AI behavior and the challenges of defining safe and effective objectives.

August 10, 2025 · 2 min · Serhat Giydiren

AI Safety Diary: August 9, 2025

Today, I explored the Introduction to AI Safety, Ethics, and Society textbook as part of my AI safety studies. Below is the resource I reviewed.

Resource: Introduction to AI Safety, Ethics, and Society (Chapters 1–5 Slides)
Source: Introduction to AI Safety, Ethics, and Society by Dan Hendrycks, Taylor & Francis, 2024.
Summary: The slides for the first five chapters of this textbook, developed by Dan Hendrycks, director of the Center for AI Safety, provide an introduction to AI safety, ethics, and societal impacts. The chapters covered are:

Chapter 1: Overview of Catastrophic AI Risks - Introduces potential catastrophic risks from advanced AI, such as malicious use, accidents, and rogue AI systems.
Chapter 2: AI Fundamentals - Covers the basics of modern AI systems, focusing on deep learning, transformer architectures, and scaling laws that drive AI performance.
Chapter 3: Single-Agent Safety - Discusses technical challenges in ensuring the safety of individual AI systems, including issues like opaqueness, proxy gaming, and adversarial attacks.
Chapter 4: Safety Engineering - Explores principles of safety engineering applied to AI, emphasizing methods to design robust and reliable AI systems.
Chapter 5: Complex Systems - Examines AI within the context of complex sociotechnical systems, highlighting the role of systems theory in managing risks from AI deployment.

August 9, 2025 · 1 min · Serhat Giydiren

AI Safety Diary: August 8, 2025

Today, I explored the Effective Altruism Handbook and completed its first chapter as part of my studies related to AI safety and governance. Below is the resource I reviewed.

Resource: The Effectiveness Mindset
Source: The Effectiveness Mindset, Effective Altruism Forum, Chapter 1 of the Effective Altruism Handbook.
Summary: This chapter introduces the core idea of effective altruism: maximizing the impact of one's time and resources to help others. It emphasizes the importance of focusing on interventions that benefit the most people, rather than those with lesser impact. The chapter highlights the challenge of identifying effective interventions, which requires a "scout mindset"—an approach focused on seeking truth and questioning existing ideas rather than defending preconceived notions.

August 8, 2025 · 1 min · Serhat Giydiren

3-way Partitioning & Quick Select & Quick Sort & Find Kth Largest Element

#include <algorithm>
#include <random>
#include <vector>
using namespace std;

// Bounds of the middle (== pivot) region after a 3-way partition.
struct bound { int lt, gt; };

// Dutch-national-flag partition: afterwards arr[lt..gt] == pivot,
// everything left of lt is smaller, everything right of gt is larger.
bound partition_3way(vector<int>& arr, int lo, int hi) {
    int pivot = arr[lo], i = lo;
    while (i <= hi) {
        if (arr[i] < pivot) swap(arr[i++], arr[lo++]);
        else if (arr[i] > pivot) swap(arr[i], arr[hi--]);
        else i++;
    }
    return {lo, hi};
}

// Returns the k-th smallest element (0-indexed). Shuffling first makes the
// quadratic worst case unlikely, so the expected running time is O(n).
int quick_select(vector<int>& nums, int k) {
    static mt19937 rng(random_device{}());   // random_shuffle was removed in C++17
    shuffle(nums.begin(), nums.end(), rng);
    int lo = 0, hi = int(nums.size()) - 1;
    while (lo <= hi) {
        bound b = partition_3way(nums, lo, hi);
        if (k > b.gt) lo = b.gt + 1;
        else if (k < b.lt) hi = b.lt - 1;
        else return nums[b.lt];
    }
    return -1;                               // k out of range
}

void quick_sort(vector<int>& nums, int lo, int hi) {
    if (lo >= hi) return;
    bound b = partition_3way(nums, lo, hi);
    quick_sort(nums, lo, b.lt - 1);          // left of the pivot region
    quick_sort(nums, b.gt + 1, hi);          // right of the pivot region
}

// The k-th largest is the (n - k)-th smallest, 0-indexed.
int findKthLargest(vector<int>& nums, int k) {
    return quick_select(nums, int(nums.size()) - k);
}

January 1, 2022 · 1 min