Content provided by Simplify Tech. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Simplify Tech or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined at https://ppacc.player.fm/legal.

The Hidden Crisis in AI Development: No One’s Testing

37:46
 

In this episode, we sit down with Alon Bochman, a seasoned AI leader and founder of RagMetrics, whose career spans leadership roles at Google, Microsoft, and FactSet, as well as a successful startup exit to Thomson Reuters. Alon breaks down the urgent gaps in AI testing today, explains why bilateral evaluation between humans and AI is the future, and shares a chilling real-world example of autonomous agents evolving faster than expected.

We dive into why true AGI might be forever elusive (because we keep redefining it), the moral and economic need for universal basic income, and why today's "chatbot phase" is just the first flicker of a much larger AI revolution—akin to the dawn of electricity.

A must-listen for anyone thinking seriously about building with AI, investing in AI, or simply understanding where this tech tidal wave is headed.

Chapters

  • 00:00 – Introduction to Alon Bochman: Ex-Google, Ex-Microsoft, now Founder of RagMetrics
  • 00:50 – Journey from fintech to leading AI teams at major tech giants
  • 01:50 – The problem no one talks about: Why AI apps often skip real testing
  • 03:50 – The three bad choices in AI evaluation today
  • 06:12 – How bilateral feedback loops improve AI testing and model reliability
  • 08:18 – Human bias in feedback: Lessons learned from FactSet tech support automation
  • 13:43 – The importance of read/write knowledge bases to adapt with change
  • 15:43 – RagMetrics’ mission: Making AI evaluation scalable and less painful
  • 16:08 – Redefining AGI: Why we move the goalposts every time computers get better
  • 21:10 – AI’s impact on jobs: Why universal income may become a necessity
  • 25:29 – The electricity analogy: How AI will transform industries like a silent revolution
  • 27:54 – The scariest AI demo Alon ever saw: Agents that build their own tools
  • 32:36 – Using AI personally: Learning, demo generation, and accelerating workflows
  • 34:49 – Final advice to entrepreneurs: Just start, but test before you ship

Notable Quotes

  • “If you're not testing your AI today, you're building a bridge without running trucks over it.”
  • “AI will never reach AGI—because AGI keeps getting redefined.”
  • “Progress is unstoppable. The only question is whether you're the Roomba—or the person programming it.”
  • “Testing is the bridge from hobby projects to real-world value.”
  • “The scariest thing I saw? An agent learning to create its own tools in 25 lines of code.”

Links & Resources



37 episodes
