Programmers may be facing an "uninvited" performance review. Anthropic recently announced a collaboration with Mozilla in which its AI model Claude Opus 4.6 conducted a security audit of the Firefox browser. Strikingly, Claude identified 22 security vulnerabilities within just two weeks.
Of those 22 vulnerabilities, 14 were classified as high severity, a figure that accounts for roughly one-fifth of all high-severity vulnerabilities Mozilla fixed in 2025. This performance not only highlights AI's ability to handle large, complex codebases, but also left many experienced security experts impressed: AI is fundamentally changing the economics of vulnerability discovery.
Unlike the hallucinated findings AI tools sometimes produce, all 22 vulnerabilities were manually verified by Mozilla's security engineers and confirmed to be real, serious security risks. Claude proved especially strong at identifying memory-safety issues on specific code paths, yielding higher-quality signals than traditional fuzzing.
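To see why reasoning about specific code paths can beat blind fuzzing on signal quality, consider a toy sketch. Everything here is hypothetical (the `parse_header` function, the `b"MOZ1"` magic, and `blind_fuzz` are illustrative, not Firefox code): a bug guarded by a 4-byte magic value is essentially unreachable by random inputs, while reading the code makes it obvious.

```python
import random

def parse_header(data: bytes) -> int:
    """Toy parser with a bug hidden on a narrow code path.

    The attacker-controlled length byte is only used when the magic
    matches, so the flaw is reachable solely via inputs starting with
    b"MOZ1". In Python the slice silently truncates; in a memory-unsafe
    language the same logic would read past the end of the buffer.
    """
    if data[:4] == b"MOZ1":
        length = data[4]                 # attacker-controlled length
        return len(data[5:5 + length])   # may be far less than `length`
    return 0

def blind_fuzz(trials: int = 10_000) -> int:
    """Count how often purely random 8-byte inputs even reach the buggy branch."""
    random.seed(0)
    hits = 0
    for _ in range(trials):
        data = bytes(random.randrange(256) for _ in range(8))
        if data[:4] == b"MOZ1":
            hits += 1
    return hits

# The magic-guarded branch has odds of ~1 in 256**4 per random input,
# so a blind fuzzer almost never exercises it -- while a reviewer (human
# or AI) reading the code spots the trusted length byte immediately.
print(blind_fuzz())                      # 0 hits in 10,000 trials
print(parse_header(b"MOZ1" + bytes([200]) + b"xx"))  # declared 200, got 2
```

Real coverage-guided fuzzers (e.g. libFuzzer or AFL++) mitigate exactly this with instrumentation and dictionaries, but the sketch shows why a model that reasons about the code's branch conditions can surface bugs that random input generation struggles to reach.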
Industry insiders note that an experienced researcher typically finds only two or three such vulnerabilities in two weeks; introducing AI has raised the efficiency of security audits nearly tenfold.
This breakthrough has also raised concerns within the community, however. As AI lowers the barrier to vulnerability discovery, a flood of low-quality, AI-generated vulnerability reports is hitting open-source projects' bug bounty programs, sharply increasing review costs. Filtering truly valuable alerts from the mass of AI-produced reports has become a new challenge for the security community.
