
How AI Could Be A Mathematician's Co-Pilot by 2026 (Prof. Swarat Chaudhuri)

2024/11/25

Machine Learning Street Talk (MLST)

People
Swarat Chaudhuri
Topics
Professor Swarat Chaudhuri discusses breakthrough advances in AI reasoning, theorem proving, and mathematical discovery. He explains neurosymbolic methods and shares his pioneering work on COPRA (a GPT-based proof agent) along with his insights on large language models. He argues that neurosymbolic programming can represent hybrid models by combining neural networks with classical symbolic code, and that letting a machine work out its own instructions in a very low-level language (such as assembly) could uncover things humans cannot imagine, enabling new breakthroughs in mathematics, physics, and other disciplines.

He also discusses using AI for theorem proving and concept discovery in mathematics, and how to scale and modularize mathematical proofs. He explores formal methods and open challenges in AI mathematics, including formal proofs, empirical predicates, and uncertainty, as well as how to address contamination and concept learning in AI systems. In his view, large language models have made remarkable progress in mathematical reasoning but still have substantial room to improve, for example in modularity and compositionality. He introduces his work on type-directed neural program synthesis, neural relaxations as program-search heuristics, and using large language models for library learning and abstraction: LLMs can act as abstraction tools that identify common patterns in programs and explain them in natural language, thereby learning concepts. He also discusses symbolic regression and how LLMs can guide program or expression search and perform abstraction.

He believes that by 2026, AI could serve as an important assistant to mathematicians, helping them solve complex problems and potentially leading to new forms of mathematics and new modes of human-AI collaboration. He further discusses the role of formal proof frameworks such as Lean and how to combine large language models with them to improve the efficiency and reliability of mathematical proofs. Finally, he addresses the problem of test-set contamination and how to mitigate it.
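For readers unfamiliar with proof assistants like Lean, mentioned above: a formal proof is a machine-checkable script, not prose. A minimal sketch in Lean 4 (using the standard-library lemma `Nat.add_comm`) looks like this:

```lean
-- A tiny machine-checked theorem: addition on natural numbers commutes.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

Systems like COPRA interact with exactly this kind of interface: the LLM proposes tactic steps, and the proof assistant accepts or rejects each one, so the final proof is verified regardless of how it was found.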

Deep Dive

Chapters
Professor Swarat Chaudhuri discusses the definition of reasoning in AI, emphasizing the importance of performance in mathematical, programming, and planning tasks. He also touches on the challenges of defining reasoning based on human-like behavior and the quantitative measures needed to assess progress.
  • Reasoning in AI is defined by performance in tasks like mathematics, programming, and planning.
  • There is no binary definition of reasoning; it's a quantitative measure of improvement.
  • Different methods for reasoning include symbolic and neural approaches.
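To make the symbolic side of this contrast concrete, here is a toy sketch of symbolic search: enumerate a small fixed pool of candidate expressions and keep the one that best fits observed data. This is a hypothetical stand-in for the LLM-guided program and expression search discussed later in the episode, not an implementation of any system mentioned.

```python
# Toy symbolic search: pick the candidate expression over x that
# minimizes squared error against (input, output) data pairs.
CANDIDATES = [
    ("x + 1", lambda x: x + 1),
    ("2 * x", lambda x: 2 * x),
    ("x * x", lambda x: x * x),
    ("x * x + x", lambda x: x * x + x),
]

def fit_error(fn, data):
    """Sum of squared errors of fn over (input, output) pairs."""
    return sum((fn(x) - y) ** 2 for x, y in data)

def best_expression(data):
    """Return the (name, fn) candidate with the lowest error."""
    return min(CANDIDATES, key=lambda cand: fit_error(cand[1], data))

data = [(0, 0), (1, 2), (2, 6), (3, 12)]  # generated by x*x + x
name, _ = best_expression(data)
print(name)  # x * x + x
```

In a neural or LLM-guided variant, the candidate pool is not fixed: a learned model proposes and abstracts expressions, steering the search toward promising regions.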

Shownotes

Professor Swarat Chaudhuri from the University of Texas at Austin, and visiting researcher at Google DeepMind, discusses breakthroughs in AI reasoning, theorem proving, and mathematical discovery. Chaudhuri explains his groundbreaking work on COPRA (a GPT-based proof agent) and shares insights on neurosymbolic approaches to AI.

Professor Swarat Chaudhuri:

https://www.cs.utexas.edu/~swarat/

SPONSOR MESSAGES:

CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments.

https://centml.ai/pricing/

Tufa AI Labs is a brand-new research lab in Zurich started by Benjamin Crouzier, focused on ARC and AGI. They just acquired MindsAI, the current winners of the ARC challenge. Are you interested in working on ARC, or getting involved in their events? Go to https://tufalabs.ai/

TOC:

[00:00:00] 0. Introduction / CentML ad, Tufa ad

  1. AI Reasoning: From Language Models to Neurosymbolic Approaches

[00:02:27] 1.1 Defining Reasoning in AI

[00:09:51] 1.2 Limitations of Current Language Models

[00:17:22] 1.3 Neuro-symbolic Approaches and Program Synthesis

[00:24:59] 1.4 COPRA and In-Context Learning for Theorem Proving

[00:34:39] 1.5 Symbolic Regression and LLM-Guided Abstraction

  2. AI in Mathematics: Theorem Proving and Concept Discovery

[00:43:37] 2.1 AI-Assisted Theorem Proving and Proof Verification

[01:01:37] 2.2 Symbolic Regression and Concept Discovery in Mathematics

[01:11:57] 2.3 Scaling and Modularizing Mathematical Proofs

[01:21:53] 2.4 COPRA: In-Context Learning for Formal Theorem-Proving

[01:28:22] 2.5 AI-Driven Theorem Proving and Mathematical Discovery

  3. Formal Methods and Challenges in AI Mathematics

[01:30:42] 3.1 Formal Proofs, Empirical Predicates, and Uncertainty in AI Mathematics

[01:34:01] 3.2 Characteristics of Good Theoretical Computer Science Research

[01:39:16] 3.3 LLMs in Theorem Generation and Proving

[01:42:21] 3.4 Addressing Contamination and Concept Learning in AI Systems

REFS:

00:04:58 The Chinese Room Argument, https://plato.stanford.edu/entries/chinese-room/

00:11:42 Software 2.0, https://medium.com/@karpathy/software-2-0-a64152b37c35

00:11:57 Solving Olympiad Geometry Without Human Demonstrations, https://www.nature.com/articles/s41586-023-06747-5

00:13:26 Lean, https://lean-lang.org/

00:15:43 A General Reinforcement Learning Algorithm That Masters Chess, Shogi, and Go Through Self-Play, https://www.science.org/doi/10.1126/science.aar6404

00:19:24 DreamCoder (Ellis et al., PLDI 2021), https://arxiv.org/abs/2006.08381

00:24:37 The Lambda Calculus, https://plato.stanford.edu/entries/lambda-calculus/

00:26:43 Neural Sketch Learning for Conditional Program Generation, https://arxiv.org/pdf/1703.05698

00:28:08 Learning Differentiable Programs With Admissible Neural Heuristics, https://arxiv.org/abs/2007.12101

00:31:03 Symbolic Regression With a Learned Concept Library (Grayeli et al., NeurIPS 2024), https://arxiv.org/abs/2409.09359

00:41:30 Formal Verification of Parallel Programs, https://dl.acm.org/doi/10.1145/360248.360251

01:00:37 Training Compute-Optimal Large Language Models, https://arxiv.org/abs/2203.15556

01:18:19 Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, https://arxiv.org/abs/2201.11903

01:18:42 Draft, Sketch, and Prove: Guiding Formal Theorem Provers With Informal Proofs, https://arxiv.org/abs/2210.12283

01:19:49 Learning Formal Mathematics From Intrinsic Motivation, https://arxiv.org/pdf/2407.00695

01:20:19 An In-Context Learning Agent for Formal Theorem-Proving (Thakur et al., CoLM 2024), https://arxiv.org/pdf/2310.04353

01:23:58 Learning to Prove Theorems via Interacting With Proof Assistants, https://arxiv.org/abs/1905.09381

01:39:58 An In-Context Learning Agent for Formal Theorem-Proving (Thakur et al., CoLM 2024), https://arxiv.org/pdf/2310.04353

01:42:24 Programmatically Interpretable Reinforcement Learning (Verma et al., ICML 2018), https://arxiv.org/abs/1804.02477