Scaling AI inference with open source ft. Brian Stevens
Explore the future of enterprise AI with Red Hat's SVP and AI CTO, Brian Stevens. In this episode, we delve into how AI is being practically reimagined for real-world business environments, focusing on the pivotal shift to production-quality inference at scale and the transformative power of open source.

Brian Stevens shares his expertise and unique perspective on:

• The evolution of AI from experimental stages to essential, production-ready enterprise solutions.
• Key lessons from the early days of enterprise Linux and how they apply to today's AI inference challenges.
• The critical role of projects like vLLM in optimizing AI models and creating a common, efficient inference stack across diverse hardware (see the sketch after this description).
• Innovations in GPU-based inference and distributed-systems techniques, such as KV cache management, that make AI scalable.

Tune in for a deep dive into the infrastructure and strategies making enterprise AI a reality. Whether you're a seasoned technologist, an AI practitioner, or a leader charting your company's AI journey, this discussion offers valuable insights into building an accessible, efficient, and powerful AI future with open source.
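For listeners who haven't encountered vLLM, the sketch below shows what its offline batch-inference API looks like in practice. This is a minimal illustration only, not anything demonstrated in the episode; the model name, prompts, and sampling parameters are placeholders.

```python
# Minimal sketch of offline inference with vLLM (https://github.com/vllm-project/vllm).
# Model name, prompts, and parameters are illustrative placeholders.
from vllm import LLM, SamplingParams

prompts = [
    "What does production-quality AI inference require?",
    "Why does open source matter for enterprise AI?",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# vLLM manages the GPU KV cache and batches requests automatically,
# which is central to its efficiency at scale.
llm = LLM(model="facebook/opt-125m")  # placeholder model

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"Prompt: {output.prompt!r}")
    print(f"Generated: {output.outputs[0].text!r}")
```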