Invisible Prompts: How Image Scaling Attacks Break AI Security

Researchers have uncovered a new form of indirect prompt injection that leverages a simple but powerful trick: image scaling. This novel attack involves hiding malicious instructions inside high-resolution images, invisible to the human eye. When AI systems automatically downscale these images during preprocessing, the hidden prompt becomes visible—not to the user, but to the AI model itself. The result? The model executes instructions the user never saw, potentially leading to data exfiltration, manipulation, or unauthorized actions.
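
For listeners who want to picture the mechanism, here is a minimal sketch of the downscaling step the attack abuses, assuming a Pillow-based preprocessing pipeline; the 512x512 target size and the bicubic filter are illustrative assumptions, not the settings of any particular product.

```python
# Minimal sketch of the preprocessing step an image scaling attack exploits.
# Assumes a Pillow-based pipeline; target size and filter are illustrative.
from PIL import Image

def preprocess_for_model(path: str, target=(512, 512)) -> Image.Image:
    """Downscale an uploaded image the way many multimodal pipelines do.

    An attacker crafts the full-resolution original so it looks benign to the
    user, but the interpolation performed here blends pixels into legible
    instruction text at the target size. The user sees the original; the
    model sees only the downscaled result.
    """
    img = Image.open(path).convert("RGB")
    return img.resize(target, resample=Image.Resampling.BICUBIC)
```

Nothing in this function is malicious on its own; the danger comes from an input image crafted so that this perfectly ordinary resize produces text the user never saw.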

In this episode, we break down how this attack works, why it’s so stealthy, and the risks it poses to enterprise and consumer AI systems alike. Researchers at Trail of Bits demonstrated the attack against multiple platforms—including Google Gemini CLI, Vertex AI Studio, Google Assistant on Android, and agentic browser tools—with successful proofs of concept such as exfiltrating calendar data. What makes this so dangerous is that users never see the malicious downscaled image, making detection nearly impossible outside of system-level safeguards.

Google has argued that the attack requires non-default configurations, such as auto-approving tool calls, but the ubiquity of image preprocessing across AI applications means the risk is far from theoretical. As AI integrates deeper into sensitive workflows, prompt injection—already listed as the top AI vulnerability by OWASP—continues to evolve in sophistication and subtlety.

We also explore the broader context:

  • Prompt Injection: Direct vs. indirect methods, and why indirect attacks are harder to spot.
  • Security Implications: From sensitive data theft to unauthorized system actions in enterprise environments.
  • Mitigation Strategies: Secure-by-design approaches such as limiting image dimensions, previewing downscaled inputs, requiring explicit user confirmation for sensitive actions, validating and filtering inputs, and deploying layered monitoring to detect unusual text inside images (a rough sketch of two of these checks follows this list).
  • Research Tools: The release of Anamorpher, an open-source framework to craft and analyze image scaling attacks, empowering the security community to study and defend against these threats.
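
The sketch below illustrates two of the mitigations above: capping input dimensions and inspecting the downscaled image, i.e. the pixels the model actually sees, for instruction-like text. pytesseract is an illustrative OCR choice, and the size limit and keyword list are assumptions rather than vendor guidance.

```python
# Hedged sketch of two mitigations: a dimension cap and an OCR check on the
# downscaled image. Thresholds, keywords, and model_size are assumptions.
from PIL import Image
import pytesseract

MAX_DIM = 2048  # reject unusually large uploads before any rescaling happens
SUSPICIOUS = ("ignore previous", "system prompt", "forward", "exfiltrate")

def screen_image(path: str, model_size=(512, 512)) -> Image.Image:
    img = Image.open(path).convert("RGB")
    if max(img.size) > MAX_DIM:
        raise ValueError(f"image exceeds {MAX_DIM}px limit: {img.size}")

    # Reproduce the model-side downscale so we inspect the same pixels the
    # model sees; this doubles as a preview the user could approve.
    scaled = img.resize(model_size, resample=Image.Resampling.BICUBIC)

    text = pytesseract.image_to_string(scaled).lower()
    if any(keyword in text for keyword in SUSPICIOUS):
        raise ValueError("instruction-like text detected in downscaled image")
    return scaled
```

Reusing the exact resampling path of the production pipeline matters here: scanning only the full-resolution original would miss content that emerges after interpolation.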

This is not just a niche research finding—it’s a glimpse into the future of AI security risks. As attackers exploit the very preprocessing steps that make AI usable, organizations must adopt defense-in-depth strategies and treat AI inputs with the same skepticism as any untrusted data.

#AI #PromptInjection #ImageScaling #Cybersecurity #TrailofBits #Anamorpher #OWASP #DataExfiltration #AIsecurity #GoogleGemini #VertexAI #GoogleAssistant #OpenSourceSecurity #IndirectPromptInjection #SecureByDesign
