The YouTube tech channel Fully Buffered recently ran a demanding experiment: it got Meta's Llama 3.2 3B language model running on a Pentium 4 641, a processor released in 2006.

The test forced modern artificial intelligence to "collide" with hardware from 20 years ago. It showed how low the compatibility floor for LLMs really sits, and it prompted many viewers to reflect on two decades of hardware progress: a "handshake" across time between the Moore's Law era and the AI era.

Hardware "Archaeology": Running a Modern Workload on a Maxed-Out 2006 Build

For the test, the Fully Buffered team rebuilt what was close to the ceiling for a mainstream enthusiast system in 2006:

  • Processor: Intel Pentium 4 641 (3.2GHz, single-core, 2MB L2 cache).

  • Motherboard and Memory: ASUS P5WDH Deluxe motherboard with four 2GB DDR2-800 modules, 8GB in total.

  • Software Environment: The team configured a no-AVX inference build, because this older architecture supports neither AVX nor AVX2 (its SIMD support stops at SSE3).
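
Whether a machine needs a no-AVX build can be checked from the CPU feature flags that Linux exposes in /proc/cpuinfo. The following is a minimal sketch, not the team's actual tooling: the function names and the abbreviated SAMPLE text are hypothetical, and the sample only approximates what a Pentium 4-class CPU would report.

```python
def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def simd_support(flags: set[str]) -> dict[str, bool]:
    """Report which SIMD extensions relevant to LLM inference are present."""
    # Linux lists SSE3 under the name "pni" (Prescott New Instructions).
    return {
        "sse2": "sse2" in flags,
        "sse3": "pni" in flags,
        "avx": "avx" in flags,
        "avx2": "avx2" in flags,
    }


# Hypothetical, abbreviated sample for a Pentium 4-class CPU (assumption).
SAMPLE = (
    "model name : Intel(R) Pentium(R) 4 CPU 3.20GHz\n"
    "flags : fpu sse sse2 pni lm\n"
)

if __name__ == "__main__":
    # On a real Linux machine, read the text from /proc/cpuinfo instead.
    print(simd_support(parse_cpu_flags(SAMPLE)))
```

If "avx" comes back False, as it does for the sample, the inference software has to be built or configured to avoid AVX code paths.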

Slow-Motion Inference: A "Long March" at 0.21 Tokens per Second

During the test, when the model was given the prompt "What's a Pentium 4?", the 20-year-old single-core processor went straight to full load.

  • Generation Speed: The final generation speed was only 0.21 tokens per second.

  • Time Cost: To get a complete answer, the Pentium 4 ran at full load for nearly 33 minutes.
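
The two figures above can be cross-checked with simple arithmetic. This is a rough sketch that assumes the 0.21 tokens per second rate held for the whole 33-minute run, which the source does not state; the function names are illustrative.

```python
def tokens_generated(tokens_per_second: float, minutes: float) -> float:
    """Total tokens produced at a constant rate over the given duration."""
    return tokens_per_second * minutes * 60


def seconds_per_token(tokens_per_second: float) -> float:
    """Average wait for a single token at the given rate."""
    return 1 / tokens_per_second


if __name__ == "__main__":
    rate = 0.21      # tokens per second, from the test
    duration = 33    # minutes, from the test
    print(f"about {tokens_generated(rate, duration):.0f} tokens in total")
    print(f"about {seconds_per_token(rate):.1f} seconds per token")
```

Under that assumption the answer works out to roughly 400 tokens, with close to five seconds of full-load computation behind each one.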

In today's AI applications, which chase millisecond-level response times, a 33-minute wait would be an unusable experience. For a single-core processor from the NetBurst era, though, it was a marathon: modern AI inference carried out, one token at a time, on 20-year-old silicon.

Meaning Beyond Practicality: Probing the Compatibility Limits of AI

Why run AI on such an old machine? According to the test team, the point was not practicality but to probe two key boundaries:

  1. Life Without AVX: Modern LLM inference software almost always assumes AVX support, but with an inference mode that avoids those instructions, a model can still run on a CPU that lacks them.

  2. Memory as the "Foundation": The 3-billion-parameter model barely fit into 8GB of DDR2 memory, which shows that even a single-core CPU with very little computing power can run a modern LLM without relying on a top-tier GPU.
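
The memory point can be made concrete with a back-of-the-envelope estimate of how much RAM the weights alone need at different precisions. This is a sketch under stated assumptions: the source does not say which precision or quantization was used, the parameter count is rounded to 3 billion, and the estimate ignores the KV cache, activations, and the operating system.

```python
BYTES_PER_GIB = 1024 ** 3


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB."""
    return n_params * bits_per_weight / 8 / BYTES_PER_GIB


def fits_in_ram(n_params: float, bits_per_weight: float, ram_gib: float) -> bool:
    """True if the weights alone fit in RAM (ignores KV cache and OS overhead)."""
    return weights_gib(n_params, bits_per_weight) <= ram_gib


if __name__ == "__main__":
    n_params = 3e9   # rounded parameter count (assumption)
    ram_gib = 8      # installed memory in the test system
    for bits in (32, 16, 8, 4):  # illustrative precisions, not from the source
        size = weights_gib(n_params, bits)
        verdict = "fits" if fits_in_ram(n_params, bits, ram_gib) else "does not fit"
        print(f"{bits:>2}-bit weights: {size:5.2f} GiB, {verdict} in {ram_gib} GiB")
```

At 32 bits per weight the model would not fit in 8GB at all, while 16 bits or fewer leaves some headroom, so the chosen precision decides whether memory is the limit.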

Epilogue: The "Later Years" of the NetBurst Architecture

In 2006, Intel's Pentium 4 was still committed to the clock-speed race of the NetBurst architecture, which chased high frequency at the cost of efficiency. Engineers at the time may have foreseen that a new era of processors would come, but they probably never imagined that, 20 years later, the architecture they designed would be made to explain its own history so laboriously.

The test offers an extreme reference case for the AI hardware ecosystem: computing power determines response speed, but instruction set compatibility and sufficient memory decide whether a large model can run at all. When the Pentium 4 finally, slowly, typed out its own description on the screen, it was not just a successful inference but also a romantic farewell ceremony in the history of computing.