Saturday, August 6 • 4:00pm - 5:00pm
MIRI - Logical Induction: Progress in AI Alignment

MIRI is interested in reasoning about highly advanced AI systems before they exist, specifically for developing models of safety and control for such systems. Current agent models used in game theory and economics are insufficient for this purpose, because of their highly unrealistic assumption that a rational agent must immediately know all the logical consequences of its beliefs at any given time. Logical induction refers to the more realistic process by which an agent's beliefs should change over time as it thinks and realizes the consequences of its past observations. In this talk, we will examine (1) some criteria for "ideal" logical induction, (2) a new algorithm for logical induction that satisfies many desirable properties, and (3) some implications for what a very powerful AI system is able to learn. About 50% of the talk will be technical in nature, with adjustments made for audience composition and interest.


Andrew Critch

Andrew, who usually goes by “Critch”, earned his PhD in mathematics at UC Berkeley, researching algebro-geometric properties of machine learning models. He completed his B.Sc. with Honors in pure mathematics in just 20 months, and earned the Governor General’s Medal as the top undergraduate student at the Memorial University of Newfoundland that year. At the age of 19 he began his PhD in analytic geometry at the University of...

