Audio AI on the Edge with Ceva
Audio processing at the edge is changing quickly as deep learning expands what is possible on tiny, power-constrained devices. Daniel from Ceva walks us through the complete lifecycle of audio AI models, from initial development to real-world deployment on microcontrollers.
We explore two applications that show what audio machine learning can do on resource-limited hardware. First, Environmental Noise Cancellation (ENC) addresses the need for clear communication in noisy environments. Where traditional approaches require multiple microphones, Ceva's single-microphone solution uses deep neural networks to achieve superior noise reduction while preserving speech quality, all with a model eight times smaller than conventional alternatives.
The conversation then shifts to voice interfaces, where Text-to-Model technology is eliminating months of development time by generating keyword spotting models directly from text input. This innovation allows manufacturers to create, modify, or rebrand voice commands instantly without costly data collection and retraining cycles. Each additional keyword requires merely one kilobyte of memory, making sophisticated voice interfaces accessible even on the smallest devices.
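To make the one-kilobyte-per-keyword figure concrete, here is a minimal back-of-the-envelope sketch. It assumes a kilobyte means 1024 bytes and that keywords sit on top of a shared base model; the function name, the base model size, and the memory figures are hypothetical and do not come from the episode.

```python
KEYWORD_BYTES = 1024  # assumption: "one kilobyte" per additional keyword = 1024 bytes


def keyword_budget(free_bytes: int, base_model_bytes: int) -> int:
    """Return how many keywords fit once a (hypothetical) shared base model is loaded."""
    remaining = free_bytes - base_model_bytes
    if remaining < 0:
        return 0
    return remaining // KEYWORD_BYTES


# Hypothetical MCU: 256 KiB available, 200 KiB used by the shared model.
print(keyword_budget(256 * 1024, 200 * 1024))
```

Under these assumed numbers, the memory left after the base model is what bounds the vocabulary size, not the per-keyword cost.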
Throughout the discussion, Daniel reveals the technical challenges and breakthroughs involved in optimizing these models for production environments. From quantization-aware training and SVD compression to knowledge distillation and framework conversion strategies, we gain practical insights into making AI work effectively within severe computational constraints.
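As an illustration of one technique named above, SVD compression can be sketched as replacing a dense weight matrix with two thin factors. This is a generic NumPy sketch under assumed shapes and rank, not Ceva's implementation; the function name `svd_compress`, the 64x128 layer, and the rank of 16 are all hypothetical.

```python
import numpy as np


def svd_compress(weight: np.ndarray, rank: int):
    """Factor an (out, in) weight matrix into two thin matrices of the given rank."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (out, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, in)
    return a, b


# Hypothetical layer: random weights stand in for a trained 64x128 matrix.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 128))
a, b = svd_compress(w, rank=16)

params_before = w.size           # 64 * 128 = 8192
params_after = a.size + b.size   # 64 * 16 + 16 * 128 = 3072
rel_error = np.linalg.norm(w - a @ b) / np.linalg.norm(w)
print(params_before, params_after, float(rel_error))
```

Random weights compress poorly, so the reconstruction error here is large. Trained layers often have faster-decaying singular values, and the chosen rank trades parameter count against accuracy.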
Whether you're developing embedded systems, designing voice-enabled products, or simply curious about the future of human-machine interaction, this episode offers valuable perspective on how audio AI is becoming both more powerful and more accessible. The era of intelligent listening devices is here—and they're smaller, more efficient, and more capable than ever before.
Ready to explore audio AI for your next project? Check out Ceva's YouTube channel for demos of these technologies in action, or join the Edge AI Foundation's Audio Working Group to collaborate with industry experts on advancing this rapidly evolving field.
Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org
Chapters
1. Audio AI on the Edge with Ceva (00:00:00)
2. Introduction and Foundation Updates (00:00:36)
3. Upcoming Events and Livestreams (00:03:35)
4. Welcoming Daniel from Ceva (00:06:15)
5. Three Stages of Neural Network Development (00:12:19)
6. Environmental Noise Cancellation Applications (00:13:46)
7. D-Filter 3 Architecture and Modifications (00:19:08)
8. Model Optimization Techniques (00:29:44)
9. Deployment Process for MCUs (00:34:36)
10. Text-to-Model Solution for Voice Interfaces (00:38:34)
11. Q&A on PyTorch to TensorFlow Conversion (00:45:09)
12. Knowledge Distillation and Model Deployment (00:53:21)
13. Final Q&A Session (00:56:26)