Autonomous vehicles typically rely on identifying road signs to drive safely, but this core capability is becoming their fatal weakness. Recently, a study from the University of California, Santa Cruz revealed that attackers can trick an AI system into making extremely dangerous decisions—even leading the vehicle toward a crowd—by simply using a printed sign with specific text.

This attack method, called "CHAI" (Instruction Hijacking for Embodied Intelligence), exploits the over-reliance of modern unmanned systems on visual language models (VLMs). The research shows that these models misinterpret text they see in environmental images as "instructions" that must be followed. What is concerning is that this attack does not require hacking into the software system; it only needs to place an optimized piece of paper within the camera's view to achieve physical-level "remote control."
In tests targeting the autonomous driving system DriveLM, the attack success rate reached 81.8%. Experiments showed that even if the system had already detected a pedestrian crossing the road, if a printed sign with "Turn left" or "Continue straight" appeared by the roadside, the AI would ignore the collision risk and execute the wrong instruction. Additionally, this method is equally dangerous in the field of drones, capable of forcing them to land in hazardous areas filled with people, ignoring safety protocols.
Researchers emphasize that this threat is effective in real-world environments, multilingual contexts, and various lighting conditions. As AI systems accelerate into real-world deployment, building a "defense barrier" to identify malicious text instructions has become an urgent security challenge in the field of embodied intelligence.
Key points:
📄 Physical-Level "Poisoning": Researchers developed the CHAI attack method, which allows direct "control" of robots or autonomous vehicles by placing printed text signs.
⚠️ Ignoring Safety Boundaries: In experiments, the attacked autonomous driving system ignored pedestrians in 81.8% of cases and executed dangerous turning or driving actions based on false signs.
🛡️ Urgent Need for Defense Upgrades: Current visual language models cannot effectively distinguish between legitimate instructions and malicious ones. Experts urge the need for built-in safety verification mechanisms before deploying AI systems.
