Part 1: Big Picture
A new leak suggests xAI's next-generation Grok 4 model could redefine what we expect from large language models. Early benchmarks, if they hold up, point to substantial gains over today's leading systems. References to Grok 4 have even appeared in the xAI console, hinting that an internal release may be close. With competitors like OpenAI, Google, and Anthropic poised to roll out updates of their own, xAI faces a narrow window to claim the innovation high ground. For businesses and developers, Grok 4's arrival could quickly unlock more powerful AI assistants, smarter code helpers, and more accurate decision-support tools.
Part 2: Under the Hood
The leaked results show Grok 4 scoring 35 percent on the Humanity's Last Exam (HLE) benchmark, and 45 percent when given additional compute, compared to o3 Pro's prior top score of 26 percent. On graduate-level science questions (GPQA) it reportedly achieves 87 to 88 percent, and on software engineering tasks (SWE-bench) it ranges from 72 to 75 percent. References in the xAI console point to builds dated June 29 and July 2, suggesting these are incremental internal milestones rather than final versions. If confirmed, Grok 4 would surpass Gemini 2.5 Pro, o3 Pro, and Claude Opus 4, forcing every AI provider to accelerate its roadmap or risk falling behind.
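To put the HLE numbers in perspective, the short Python sketch below works out how large the reported jump over o3 Pro would be, both in percentage points and in relative terms. It uses only the leaked, unverified figures quoted above; the `gain` helper and the constant names are illustrative and not part of any xAI tooling.

```python
# HLE scores (percent) as reported in the leak; these are unverified figures.
O3_PRO_HLE = 26.0
GROK4_HLE = 35.0
GROK4_HLE_EXTRA_COMPUTE = 45.0


def gain(new: float, baseline: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent) over a baseline."""
    points = new - baseline
    relative = points / baseline * 100.0
    return points, relative


standard = gain(GROK4_HLE, O3_PRO_HLE)
extra = gain(GROK4_HLE_EXTRA_COMPUTE, O3_PRO_HLE)

print(f"Standard run:  +{standard[0]:.0f} points ({standard[1]:.1f}% relative)")
print(f"Extra compute: +{extra[0]:.0f} points ({extra[1]:.1f}% relative)")
```

On the leaked figures, that is a 9-point gain (roughly 35 percent relative) for the standard run and a 19-point gain (roughly 73 percent relative) with additional compute.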