Currently, the Colossus cluster, which xAI uses to train its Grok AI models, houses 100,000 Nvidia GPUs, with plans announced back in October to double that amount.
However, an announcement from the Greater Memphis Chamber Annual Chairman’s Luncheon suggested that xAI plans to expand its Colossus site to incorporate “a minimum of one million GPUs”.
The expansion, which is already underway, would represent the largest capital investment in the region’s history, according to the Greater Memphis Chamber, dwarfing that of Ford’s $5.6 billion car manufacturing plant, Blue Oval City.
In addition to xAI’s expansion, the companies it worked with to build Colossus in just 122 days — Nvidia, Dell, and Supermicro — will all establish operations in the city.
“When we announced six months ago that xAI would make Memphis its home for Colossus, we recognised it as our defining moment,” said Ted Townsend, president and CEO of the Greater Memphis Chamber. “Memphis has provided the power and velocity necessary for not just xAI to grow and thrive, but making way for other companies as well. We’re excited to welcome Nvidia, Dell, and Supermicro to the ‘Digital Delta.’”
xAI now has its own “concierge service” from the Greater Memphis Chamber. The xAI Special Operations Team, led by Townsend and Troy Parkes, its SVP of global business development, will assist the startup “round the clock”.
Reacting to the reports, xAI founder and OpenAI co-founder turned agitator Elon Musk referenced Dr Evil from the Austin Powers films, saying: “Nope, at least 1 biiillioon GPUs!”
“Just trying to get to measly 1% completion of a puny Kardashev Type I civilisation,” Musk said, referencing the Kardashev scale, a method for measuring a civilisation's level of technological advancement based on the amount of energy it is capable of harnessing. Notably, a Kardashev Type I civilisation would be capable of controlling naturally occurring events like volcanic eruptions and earthquakes in addition to having access to a planet’s entire available energy and storing it for consumption.
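Musk’s “1%” quip can be put into rough numbers using Carl Sagan’s interpolation of the Kardashev scale, K = (log₁₀ P − 6) / 10, where P is power harnessed in watts and a Type I civilisation corresponds to roughly 10¹⁶ W. A minimal sketch, using illustrative power figures rather than anything xAI has disclosed:

```python
import math

def kardashev_rating(power_watts: float) -> float:
    """Sagan's interpolation of the Kardashev scale: K = (log10(P) - 6) / 10."""
    return (math.log10(power_watts) - 6) / 10

TYPE_I_WATTS = 1e16                 # ~Earth's total incident solar power, the usual Type I benchmark
one_percent = 0.01 * TYPE_I_WATTS   # the "measly 1%" Musk referred to, i.e. 1e14 W

print(kardashev_rating(TYPE_I_WATTS))  # → 1.0
print(kardashev_rating(one_percent))   # → 0.8
```

On this scale humanity currently sits at roughly K ≈ 0.7, which is why even “1% of Type I” remains far beyond present-day energy capture.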
xAI itself has not confirmed the number of GPUs it will be adding to the Colossus cluster, though the startup is set to snap up thousands of Nvidia's upcoming H200 GPUs.
If xAI does scale Colossus to one million GPUs, it would likely be one of the largest supercomputing systems in the world, if not the largest.
To put it into perspective, El Capitan, which recently surpassed Frontier as the world’s most powerful supercomputer, houses 43,808 AMD 4th Gen EPYC CPUs and 43,808 AMD Instinct MI300A GPUs. Frontier, meanwhile, features around 38,000 GPUs — meaning xAI’s Colossus would be over 26 times larger.
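The scale-up factor behind that comparison is simple to check, taking the reported one-million-GPU target against the approximate Frontier count cited above:

```python
colossus_target = 1_000_000  # reported minimum GPU target for the expanded Colossus
frontier_gpus = 38_000       # approximate GPU count in Frontier

# Ratio of the planned Colossus cluster to Frontier
print(colossus_target / frontier_gpus)  # ≈ 26.3
```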
Such a system would, however, require significant infrastructure and resources, particularly power, with environmental groups already up in arms about the facility in its current state.
The Southern Environmental Law Centre claimed that xAI was operating gas turbines at the site without any permits, which would violate federal law.
“Every day those turbines are operating, they are polluting the air and doing significant harm to families in South Memphis,” Amanda Garcia, senior attorney at the Southern Environmental Law Centre, said back in November.
The OpenAI rival has been working to expand its infrastructure for a while, having previously relied on servers from X and later Oracle Cloud to train its earlier Grok models.
However, Musk’s insistence on taking on OpenAI has seen his rival startup secure its own dedicated infrastructure to power training efforts for its Grok line of AI models.
The mammoth cluster has been running training workloads for xAI’s upcoming model, Grok 3, which Musk claimed in the summer would debut by the end of 2024 and would rival — or even surpass — the highly anticipated capabilities of OpenAI's GPT-5.