Sam Lehman: What the Reinforcement Learning Renaissance Means for Decentralized AI
Join Tommy Shaughnessy from Delphi Ventures as he hosts Sam Lehman, Principal at Symbolic Capital and AI researcher, for a deep dive into the Reinforcement Learning (RL) renaissance and its implications for decentralized AI. Sam recently authored a widely discussed post, "The World's RL Gym", exploring the evolution of AI scaling and the exciting potential of decentralized networks for training next-generation models.
The World’s RL Gym: https://www.symbolic.capital/writing/the-worlds-rl-gym
🎯 Key Highlights
The three phases of AI scaling: Pre-training, Inference Time Compute, and the RL Renaissance.
How DeepSeek's novel RL approach (using GRPO) created powerful reasoning models with minimal human data.
Understanding "reasoning traces" and how models learn to "think" longer and more effectively.
How human preference data may inhibit model creativity, drawing parallels to AlphaGo.
Exploring the "World's RL Gym" concept: Decentralizing RL through open environments, diverse tasks, and verified data.
Why open, collaborative RL environments might outperform closed-source labs in generating diverse AI strategies.
The critical role of high-quality base models for successful RL fine-tuning.
Future AI architectures: Continuous learning and the potential of modular Mixture-of-Experts (MoE) models.
Current landscape: Open-source vs. proprietary AI, the challenge of model lock-in, and the role of crypto networks.
Debunking recent claims that "RL is dead" and understanding its true impact.
💡 Want to stay updated with the latest in crypto & AI? Hit subscribe and the notification bell! 🔔
🧠 Follow the Alpha
Tommy's Twitter: @Shaughnessy119
Sam's Twitter: @SPLehman
Symbolic Capital’s Twitter: @symbolicvc
🔗 Connect with Delphi
🌐 Portal: https://delphidigital.io/
🐦 Twitter: https://twitter.com/delphi_digital
💼 LinkedIn: https://www.linkedin.com/company/delphi-digital
🎧 Listen on
Spotify: https://open.spotify.com/show/62PR1RigLG2YN5Pelq6UY9?si=18ac7ccf36ab4753
Apple Podcasts: https://podcasts.apple.com/us/podcast/the-delphi-podcast/id1438148082
YouTube: https://www.youtube.com/channel/UC9Yy99ZlQIX9-PdG_xHj43Q
Timestamps
00:00 - Introduction: Sam Lehman, Symbolic Capital & "The World's RL Gym"
01:30 - History of AI Scaling: Pre-training Era
03:30 - Phase 2: Inference Time Compute Scaling
09:30 - Phase 3: The RL Renaissance & DeepSeek Moment
14:30 - How DeepSeek Trained R1 without Human Preferences
16:30 - AlphaGo Analogy: Human Data Inhibiting Creativity?
20:30 - Generalizability of RL Training: How Far Does It Go?
22:30 - The "Aha Moment": Models Learning to Think Longer
25:30 - Concept: Decentralized RL & The World's Gym
31:30 - Why Decentralize RL? Open Collaboration vs. Closed Labs
35:00 - Understanding Reasoning Traces
39:00 - Current Decentralized RL Projects (Prime Intellect, General Reasoning)
41:30 - Future Architectures: Continuous Improvement & Modular Models
46:30 - Open Source vs. Proprietary AI: Landscape & Challenges
50:30 - The Lock-In Problem with Foundational Models
52:30 - Is AGI Here? Experiences with GPT-4o
56:30 - Investment Focus in Decentralized AI
59:00 - Modular MoE Models & Gensyn's HDEE Paper
1:03:00 - Debunking "RL is Dead" Claims
1:06:00 - Importance of Performant Base Models for RL
Disclaimer
This podcast is strictly informational and educational and is not investment advice or a solicitation to buy or sell any tokens or securities or to make any financial decisions. Do not trade or invest in any project, tokens, or securities based upon this podcast episode. The host and members at Delphi Ventures may personally own tokens or art that are mentioned on the podcast. Our current show features paid sponsorships which may be featured at the start, middle, and/or the end of the episode. These sponsorships are for informational purposes only and are not a solicitation to use any product, service or token.