Shopify’s Journey to Planet-Scale Observability - OpenObservability Talks S5E09
Shopify operates at massive scale, running thousands of services and processing billions of events per second. To tackle the challenges of observability at this scale, they built Observe, an in-house observability stack based on open-source tools and specifications. In doing so, they replaced an older vendor-based system in an awe-inspiring migration project. But why build their own stack? Which open-source tools did they use? How did they shape the user experience to their needs?
Joining us to unpack Shopify’s journey is Elijah McPherson, an engineering leader with deep expertise in observability and distributed systems. Elijah led the complete rebuild of Shopify’s observability stack and now also oversees jobs, caching, search, and ClickHouse infrastructure. Tune in to hear firsthand insights from one of the most innovative purpose-built observability implementations in production today!
The episode was live-streamed on 11 February 2025 and the video is available at https://www.youtube.com/watch?v=rBfTjlXKJW0
OpenObservability Talks episodes are released monthly, on the last Thursday of each month, and are available for listening on your favorite podcast app and on YouTube.
We live-stream the episodes on Twitch and YouTube Live - tune in to see us live, and chime in with your comments and questions on the live chat.
https://www.youtube.com/@openobservabilitytalks
https://www.twitch.tv/openobservability
Show Notes:
00:46 - Episode and guest intro
03:43 - Why rebuild the observability stack in house
05:47 - Cost and vendor lock-in
07:09 - Tailoring observability for the organizational processes
10:27 - How to build a team to build in-house observability
13:37 - The importance of product sense in internal platforms
18:05 - The functionality of Shopify’s observability platform
25:15 - The Open Source stack used at Shopify observability
29:50 - Extending open source Grafana to Shopify’s needs
36:23 - Adopting open standards
42:26 - Observability into business health
45:16 - How to run a migration project for a live production platform
53:15 - Final tips and best practices
56:41 - Which organizations should develop in-house observability
Resources:
Episode: Scaling Platform Engineering: Shopify’s Blueprint: https://medium.com/p/f18e97140681
Shopify Observe - lectures: https://www.linkedin.com/posts/elijahmcpherson_observe-activity-7258195493657223168-mOGS/
Socials:
Twitter: https://twitter.com/OpenObserv
YouTube: https://www.youtube.com/@openobservabilitytalks
Dotan Horovits
============
Twitter: @horovits
LinkedIn: www.linkedin.com/in/horovits
Mastodon: @horovits@fosstodon
BlueSky: @horovits.bsky.social
Elijah McPherson
===============
Twitter: https://twitter.com/ElijahMcPherson