Artwork

Content provided by Demetrios. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Demetrios or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ppacc.player.fm/legal.
Player FM - Podcast App
Go offline with the Player FM app!

The Birth and Growth of Spark: An Open Source Success Story // Matei Zaharia // MLOps Podcast #155

58:12
 
Share
 

Manage episode 361612949 series 3241972
Content provided by Demetrios. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Demetrios or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ppacc.player.fm/legal.

MLOps Coffee Sessions #155 with Matei Zaharia, The Birth and Growth of Spark: An Open Source Success Story, co-hosted by Vishnu Rachakonda. // Abstract We dive deep into the creation of Spark, with the creator himself - Matei Zaharia Chief technologist at Databricks. This episode also explores the development of Databricks' other open source home run ML Flow and the concept of "lake house ML". As a special treat Matei talked to us about the details of the "DSP" (Demonstrate Search Predict) project, which aims to enable building applications by combining LLMs and other text-returning systems. // About the guest: Matei has the unique advantage of being able to see different perspectives, having worked in both academia and the industry. He listens carefully to people's challenges and excitement about ML and uses this to come up with new ideas. As a member of Databricks, Matei also has the advantage of applying ML to Databricks' own internal practices. He is constantly asking the question "What's a better way to do this?" // Bio Matei Zaharia is an Associate Professor of Computer Science at Stanford and Chief Technologist at Databricks. He started the Apache Spark project during his Ph.D. at UC Berkeley, and co-developed other widely used open-source projects, including MLflow and Delta Lake, at Databricks. At Stanford, he works on distributed systems, NLP, and information retrieval, building programming models that can combine language models and external services to perform complex tasks. Matei’s research work was recognized through the 2014 ACM Doctoral Dissertation Award for the best Ph.D. dissertation in computer science, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers (PECASE). // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links https://cs.stanford.edu/~matei/ https://spark.apache.org/ --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Matei on LinkedIn: https://www.linkedin.com/in/mateizaharia/ Timestamps: [00:00] Matei's preferred coffee [01:45] Takeaways [05:50] Please subscribe to our newsletters, join our Slack, and subscribe to our podcast channels! [06:52] Getting to know Matei as a person [09:10] Spark [14:18] Open and freewheeling cross-pollination [16:35] Actual formation of Spark [20:05] Spark and MLFlow Similarities and Differences [24:24] Concepts in MLFlow [27:34] DJ Khalid of the ML world [30:58] Data Lakehouse [33:35] Stanford's unique culture of the Computer Science Department [36:06] Starting a company [39:30] Unique advice to grad students [41:51] Open source project [44:35] LLMs in the New Revolution [47:57] Type of company to start with [49:56] Emergence of Corporate Research Labs [53:50] LLMs size context [54:44] Companies to respect [57:28] Wrap up

  continue reading

438 episodes

Artwork
iconShare
 
Manage episode 361612949 series 3241972
Content provided by Demetrios. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Demetrios or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ppacc.player.fm/legal.

MLOps Coffee Sessions #155 with Matei Zaharia, The Birth and Growth of Spark: An Open Source Success Story, co-hosted by Vishnu Rachakonda. // Abstract We dive deep into the creation of Spark, with the creator himself - Matei Zaharia Chief technologist at Databricks. This episode also explores the development of Databricks' other open source home run ML Flow and the concept of "lake house ML". As a special treat Matei talked to us about the details of the "DSP" (Demonstrate Search Predict) project, which aims to enable building applications by combining LLMs and other text-returning systems. // About the guest: Matei has the unique advantage of being able to see different perspectives, having worked in both academia and the industry. He listens carefully to people's challenges and excitement about ML and uses this to come up with new ideas. As a member of Databricks, Matei also has the advantage of applying ML to Databricks' own internal practices. He is constantly asking the question "What's a better way to do this?" // Bio Matei Zaharia is an Associate Professor of Computer Science at Stanford and Chief Technologist at Databricks. He started the Apache Spark project during his Ph.D. at UC Berkeley, and co-developed other widely used open-source projects, including MLflow and Delta Lake, at Databricks. At Stanford, he works on distributed systems, NLP, and information retrieval, building programming models that can combine language models and external services to perform complex tasks. Matei’s research work was recognized through the 2014 ACM Doctoral Dissertation Award for the best Ph.D. dissertation in computer science, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers (PECASE). // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links https://cs.stanford.edu/~matei/ https://spark.apache.org/ --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Matei on LinkedIn: https://www.linkedin.com/in/mateizaharia/ Timestamps: [00:00] Matei's preferred coffee [01:45] Takeaways [05:50] Please subscribe to our newsletters, join our Slack, and subscribe to our podcast channels! [06:52] Getting to know Matei as a person [09:10] Spark [14:18] Open and freewheeling cross-pollination [16:35] Actual formation of Spark [20:05] Spark and MLFlow Similarities and Differences [24:24] Concepts in MLFlow [27:34] DJ Khalid of the ML world [30:58] Data Lakehouse [33:35] Stanford's unique culture of the Computer Science Department [36:06] Starting a company [39:30] Unique advice to grad students [41:51] Open source project [44:35] LLMs in the New Revolution [47:57] Type of company to start with [49:56] Emergence of Corporate Research Labs [53:50] LLMs size context [54:44] Companies to respect [57:28] Wrap up

  continue reading

438 episodes

All episodes

×
 
Loading …

Welcome to Player FM!

Player FM is scanning the web for high-quality podcasts for you to enjoy right now. It's the best podcast app and works on Android, iPhone, and the web. Signup to sync subscriptions across devices.

 

Quick Reference Guide

Copyright 2025 | Privacy Policy | Terms of Service | | Copyright
Listen to this show while you explore
Play