Content provided by Kyle Evans. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Kyle Evans or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ppacc.player.fm/legal.

Lessons in Data Engineering: Scaling, AI, and Open Source with Sandy Ryza

46:28
 

In this episode of Product by Design, Kyle chats with Sandy Ryza, lead engineer on the Dagster project, author, and thought leader in data engineering. Sandy shares his journey through the world of data—from building big data tools at Cloudera to working as a data scientist, product manager, and engineer—and how those experiences led him to help create Dagster, an open-source data orchestration platform.

We discuss:

  • The evolution of data engineering and the growing complexity of modern data pipelines.
  • The role of AI and unstructured data in shaping the future of data platforms.
  • How organizations should think about data platforms to avoid costly rework.
  • Best practices for managing data complexity using software engineering principles.
  • The future of open-source tools in data infrastructure and the push toward interoperability.

Sandy Ryza
Sandy is a lead engineer, author, and thought leader in the domain of data engineering. Sandy co-wrote “Advanced Analytics with PySpark” and “Advanced Analytics with Spark”. He led ML and data science teams at Cloudera, Remix, Clover Health, and KeepTruckin.

Sandy is currently the lead engineer on the Dagster project, an open-source data orchestration platform used in MLOps, data science, IoT, and analytics. Sandy is a regular speaker at data engineering and ML conferences.

Links from the Show:

Twitter: @s_RYZ

Dagster: dagster.io

Book: Advanced Analytics with Spark – O'Reilly

Podcast Recommendation: Empire (British Empire & Ottoman Empire history)

Books Sandy is Reading: The Shortest History of India, The Sun Also Rises, Werner Herzog’s Autobiography

More by Kyle:

Follow Prodity on Twitter and TikTok

Follow Kyle on Twitter and TikTok

Sign up for the Prodity Newsletter for more updates.

Kyle's writing on Medium

Prodity on Medium

Like our podcast? Consider Buying Us a Coffee or supporting us on Patreon.

114 episodes
