Not Focused on Performance, but on Integrity: xAI Releases Grok 4.20 with Industry-Low Hallucination Rate

At a time when AI giants are pouring in resources and chasing peak benchmark scores, Elon Musk's xAI has taken a different approach, targeting one of the most frustrating problems in the field: models that confidently make things up. Today, xAI officially launched Grok 4.20 Beta. Although it still trails the top tier in absolute intelligence scores, it has set a new industry record on the key metric of "truthfulness."

According to the latest evaluation by Artificial Analysis, Grok 4.20 scored 48 on the intelligence index in reasoning mode. Although it trails two rival models (both scoring 57), its factual reliability stands out:

Low hallucination rate: In the AA Omniscience test, Grok 4.20 achieved a 78% "non-hallucination rate," setting a new record.
Knowing what it doesn't know: When faced with questions it cannot answer, the model is less inclined to fabricate facts and more often admits "I don't know." This "honesty" is crucial in rigorous office and research environments.
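
To make the metric concrete, here is a minimal sketch of how a "non-hallucination rate" could be computed. The article does not give the formula, so the definition below is an assumption: among questions the model did not answer correctly, the share where it abstained instead of giving a wrong answer. The function name and the counts are hypothetical, chosen only so the arithmetic lands on 78%; they are not actual evaluation data.

```python
def non_hallucination_rate(incorrect: int, abstained: int) -> float:
    """Share of not-correctly-answered questions where the model abstained.

    Assumed definition (not confirmed by the article):
        abstained / (incorrect + abstained)
    """
    not_correct = incorrect + abstained
    if not_correct == 0:
        raise ValueError("no incorrect or abstained answers to measure")
    return abstained / not_correct


# Illustrative counts only: 22 wrong answers, 78 "I don't know" responses.
rate = non_hallucination_rate(incorrect=22, abstained=78)
print(f"Non-hallucination rate: {rate:.0%}")
```

Under this reading, a model can raise the rate without knowing more facts, simply by declining to answer when unsure, which is why the metric is separate from the intelligence score.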

Technical Architecture: A Three-in-One API Matrix

To serve different use cases, xAI has launched three API variants:

Reasoning Mode: Trades speed for deeper logical reasoning; this is the mode behind the record non-hallucination result.

Standard Mode: Focuses on fast response and routine interaction.

Multi-agent Mode: Supports multiple AI instances working together to handle complex tasks.

Market Strategy: More Content, No Extra Cost

Beyond its distinctive performance profile, Grok 4.20 also comes with an aggressive commercial strategy:

Large context: Supports a context window of up to 2 million tokens, allowing it to absorb an entire book or large codebase at once.
Price advantage: It is priced between $2 and $6 per million tokens, which is not only cheaper than the previous version, Grok 4, but also highly competitive among current mainstream Western models.
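
As a rough worked example, the two figures above can be combined to estimate what filling the entire context window would cost. This sketch assumes a flat per-token rate and simply applies both ends of the quoted $2 to $6 range; the article does not say how the range is split (for example, between input and output tokens), so the helper `cost_usd` is hypothetical and not a billing formula.

```python
def cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost in USD of `tokens` tokens at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


CONTEXT_WINDOW_TOKENS = 2_000_000  # context window size quoted in the article
LOW_PRICE, HIGH_PRICE = 2.0, 6.0   # USD per million tokens, range quoted in the article

low = cost_usd(CONTEXT_WINDOW_TOKENS, LOW_PRICE)
high = cost_usd(CONTEXT_WINDOW_TOKENS, HIGH_PRICE)
print(f"Full 2M-token window: ${low:.2f} to ${high:.2f}")
# prints "Full 2M-token window: $4.00 to $12.00"
```

In other words, a single request that uses the whole window would fall somewhere between about $4 and $12 under these assumptions.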

The release of Grok 4.20 reflects a strategic shift at xAI: rather than obsessing over the overall score race toward AGI, it is precisely targeting the pain point of "enterprise-level reliability." As the evaluation firm put it, if other models are striving to become "omniscient prophets," Grok 4.20 is striving to become "a helper who never lies."

For users with extremely high requirements for data accuracy, Grok 4.20 may become a third major option alongside OpenAI and Google.
