#006 Data Orchestration Tools, Choosing the right one for your needs
In this episode, Nicolay Gerold interviews John Wessel, the founder of Agreeable Data, about data orchestration. They discuss the evolution of data orchestration tools, the popularity of Apache Airflow, the crowded market of orchestration tools, and the key problem that orchestrators solve. They also explore the components of a data orchestrator, the role of AI in data orchestration, and how to choose the right orchestrator for a project. They touch on the challenges of managing orchestrators, the importance of monitoring and optimization, and the need for product people to be more involved in the orchestration space. They also discuss data residency considerations and the future of orchestration tools.
Sound Bites
"The modern era, definitely Airflow. Took the market share, a lot of people running it themselves."
"It's like people are launching new orchestrators every day. This is a funny one. This was like two weeks ago, somebody launched an orchestrator that was like a meta-orchestrator."
"The DAG introduced two other components. It's directed acyclic graph is what DAG means, but direct is like there's a start and there's a finish and the acyclic is there's no loops."
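To make the DAG sound bite concrete, here is a minimal sketch in Python using the standard library's `graphlib`. It is not taken from the episode: the task names (`extract`, `transform`, `load`, `report`) and the `run_order` helper are hypothetical, chosen only to show what "a start and a finish" and "no loops" mean in practice.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_order(dag):
    """Return a valid execution order; raises CycleError if the graph has a loop."""
    return list(TopologicalSorter(dag).static_order())


# "Directed": dependencies give the tasks a start and a finish.
order = run_order(pipeline)
print(order)

# "Acyclic": making the first task depend on the last one creates a loop,
# so no valid execution order exists.
looped = dict(pipeline, extract={"report"})
try:
    run_order(looped)
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

An orchestrator does much more than this (scheduling, retries, monitoring), but the ordering step is the part the DAG definition describes.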
Key Topics
- The evolution of data orchestration: From basic task scheduling to complex DAG-based solutions
- What is a data orchestrator and when do you need one? Understanding the role of orchestrators in handling complex dependencies and scaling data pipelines.
- The crowded market: A look at popular options like Airflow, Dagster, Prefect, and more.
- Best practices: Choosing the right tool, prioritizing serverless solutions when possible, and focusing on solving the use case before implementing complex tools.
- Data residency and GDPR: How regulations influence tool selection, especially in Europe.
- Future of the field: The need for consolidation and finding the right balance between features and usability.
John Wessel:
Nicolay Gerold:
Data orchestration, data movement, Apache Airflow, orchestrator selection, DAG, AI in orchestration, serverless, Kubernetes, infrastructure as code, monitoring, optimization, data residency, product involvement, generative AI.
Chapters
00:00 Introduction and Overview
00:34 The Evolution of Data Orchestration Tools
04:54 Components and Flow of Data in Orchestrators
08:24 Deployment Options: Serverless vs. Kubernetes
11:14 Considerations for Data Residency and Security
13:02 The Need for a Clear Winner in the Orchestration Space
20:47 Optimization Techniques for Memory and Time-Limited Issues
23:09 Integrating Orchestrators with Infrastructure-as-Code
24:33 Bridging the Gap Between Data and Engineering Practices
27:22 Exciting Technologies Outside of Data Orchestration
30:09 The Feature of Dagster