Artwork

Content provided by Demetrios. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Demetrios or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ppacc.player.fm/legal.
Player FM - Podcast App
Go offline with the Player FM app!

Scalable Python for Everyone, Everywhere // Matthew Rocklin // MLOps Meetup #38

57:10
 
Share
 

Manage episode 313294507 series 3241972
Content provided by Demetrios. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Demetrios or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ppacc.player.fm/legal.

Parallel Computing with Dask and Coiled

Python makes data science and machine learning accessible to millions of people around the world. However, historically Python hasn't handled parallel computing well, which leads to issues as researchers try to tackle problems on increasingly large datasets. Dask is an open source Python library that enables the existing Python data science stack (Numpy, Pandas, Scikit-Learn, Jupyter, ...) with parallel and distributed computing. Today Dask has been broadly adopted by most major Python libraries, and is maintained by a robust open source community across the world.

This talk discusses parallel computing generally, Dask's approach to parallelizing an existing ecosystem of software, and some of the challenges we've seen in deploying distributed systems.

Finally, we also addressed the challenges of robustly deploying distributed systems, which ends up being one of the main accessibility challenges for users today. We hope that by the end of the meetup attendees will better understand parallel computing, have built intuition around how Dask works, and have the opportunity to play with their own Dask cluster on the cloud.

Matthew is an open source software developer in the numeric Python ecosystem. He maintains several PyData libraries, but today focuses mostly on Dask a library for scalable computing. Matthew worked for Anaconda Inc for several years, then built out the Dask team at NVIDIA for RAPIDS, and most recently founded Coiled Computing to improve Python's scalability with Dask for large organizations.

Matthew has given talks at a variety of technical, academic, and industry conferences. A list of talks and keynotes is available at (https://matthewrocklin.com/talks).

Matthew holds a bachelor’s degree from UC Berkeley in physics and mathematics, and a PhD in computer science from the University of Chicago.

Check out our posts here to get more context around where we're coming from:
https://medium.com/coiled-hq/coiled-dask-for-everyone-everywhere-376f5de0eff4
https://medium.com/coiled-hq/the-unbearable-challenges-of-data-science-at-scale-83d294fa67f8

----------- Connect With Us ✌️-------------

Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Matthew on LinkedIn: https://www.linkedin.com/in/matthew-rocklin-461b4323/

  continue reading

449 episodes

Artwork
iconShare
 
Manage episode 313294507 series 3241972
Content provided by Demetrios. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Demetrios or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ppacc.player.fm/legal.

Parallel Computing with Dask and Coiled

Python makes data science and machine learning accessible to millions of people around the world. However, historically Python hasn't handled parallel computing well, which leads to issues as researchers try to tackle problems on increasingly large datasets. Dask is an open source Python library that enables the existing Python data science stack (Numpy, Pandas, Scikit-Learn, Jupyter, ...) with parallel and distributed computing. Today Dask has been broadly adopted by most major Python libraries, and is maintained by a robust open source community across the world.

This talk discusses parallel computing generally, Dask's approach to parallelizing an existing ecosystem of software, and some of the challenges we've seen in deploying distributed systems.

Finally, we also addressed the challenges of robustly deploying distributed systems, which ends up being one of the main accessibility challenges for users today. We hope that by the end of the meetup attendees will better understand parallel computing, have built intuition around how Dask works, and have the opportunity to play with their own Dask cluster on the cloud.

Matthew is an open source software developer in the numeric Python ecosystem. He maintains several PyData libraries, but today focuses mostly on Dask a library for scalable computing. Matthew worked for Anaconda Inc for several years, then built out the Dask team at NVIDIA for RAPIDS, and most recently founded Coiled Computing to improve Python's scalability with Dask for large organizations.

Matthew has given talks at a variety of technical, academic, and industry conferences. A list of talks and keynotes is available at (https://matthewrocklin.com/talks).

Matthew holds a bachelor’s degree from UC Berkeley in physics and mathematics, and a PhD in computer science from the University of Chicago.

Check out our posts here to get more context around where we're coming from:
https://medium.com/coiled-hq/coiled-dask-for-everyone-everywhere-376f5de0eff4
https://medium.com/coiled-hq/the-unbearable-challenges-of-data-science-at-scale-83d294fa67f8

----------- Connect With Us ✌️-------------

Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Matthew on LinkedIn: https://www.linkedin.com/in/matthew-rocklin-461b4323/

  continue reading

449 episodes

All episodes

×
 
Loading …

Welcome to Player FM!

Player FM is scanning the web for high-quality podcasts for you to enjoy right now. It's the best podcast app and works on Android, iPhone, and the web. Signup to sync subscriptions across devices.

 

Quick Reference Guide

Copyright 2025 | Privacy Policy | Terms of Service | | Copyright
Listen to this show while you explore
Play