When OpenAI, the White House, Oracle, SoftBank and MGX announced that they would invest in the Stargate Project – a new company to drive artificial intelligence development in the U.S. – questions about the implications for data center infrastructure facilities, power consumption and generation, AI development, and more immediately ensued.
It’s “like the space race with the Soviets in the 1960s,” explained Kuba Stolarski, IDC research vice president and global research lead for the compute infrastructure and service provider trends practice, in a new report.
“America is now in an AI race with the Chinese for the remainder of the 2020s by kicking off the largest AI infrastructure project in history” where the overall GDP of the country will be affected by the AI goods and services sold, he said in a report, Stargate Project Launches a New Age of AI Development, that followed the project’s announcement on January 21.
However, novel large language models such as GPT-3 were more than five years in the making. Constrained by rate-limiting GPU memory capacity and the sheer volume of compute operations required for LLM training, their development has proved costly. Thus, the race to achieve efficiency in the kinds of artificial intelligence that match or surpass human cognitive capabilities means that big AI is likely to adjust its tack, according to Stolarski.
‘Arms race’ for GPUs
With initial efforts to build the ambitious Stargate data infrastructure – $500 billion in server market investment is expected over the next four years – already underway in Abilene, Texas, OpenAI said that the first $100 billion will secure American leadership in AI.
However, “development around large language models will require enough commercial use cases to justify the investments” in the largest AI infrastructure project in history, according to Stolarski.
Besides LLMs, competing accelerated server technologies are used across a variety of use cases, he noted in the report.
He stressed that while the scale of Stargate has not been specified from a compute infrastructure standpoint, “the demand for GPU servers from Stargate may place a significant strain on GPU supply, which had just begun to ease during 2024 after a very supply-constrained 2023.”
“While supply will continue to increase each year, we have observed that demand continues to outpace supply in this extremely hot market segment,” he added.
Also, while President Donald Trump’s involvement “could refer in general to a removal of barriers” required to carry out a project of this scale, Stolarski said that there are many uncertainties related to energy policies.
“The interesting twist though is how some elements of the project that may be essential for its long-term success, such as clean energy, will seemingly conflict with the direction of the new administration,” he wrote.
Finally, it is also debatable whether GPU farms are useful investments, and it’s likewise unclear if artificial general intelligence, or AGI, “can be achieved with raw computing power and existing LLM methods.”
“With comparable financial support, quantum computing, for example, might deliver much more revolutionary breakthroughs than LLMs,” said Stolarski.
DeepSeek and narrower models
Calling into question whether scaling up the LLM approach is a wise course of action, the Chinese firm DeepSeek claimed it was able to train a comparable AI model using roughly one-eleventh the GPU capacity.
“I think the thing where DeepSeek kind of struck a nerve was this claim that they could do it at such a fraction of the cost, but it’s not apples to apples, because the foundational models in the U.S. were actually used to build DeepSeek,” Stolarski told Healthcare IT News in a conversation last week following the overseas firm’s announcement.
“The levels of optimization we’re going to get are probably going to continue to improve,” he said.
OpenAI is partnering with tech giants like Arm, Microsoft, NVIDIA and Oracle on technology development related to Stargate. Oracle has promised to go big to solve healthcare data challenges before, as with its ambitious pledge for a national electronic health record database and its recent cloud cybersecurity initiative.
“There are all these methods that they’re trying to work on in the background in the back end to try to make these things more efficient,” said Stolarski.
For AI to proliferate across use cases, cost must come down – along with model scale.
“Smaller models that are much more fine-tuned and customized for highly specific use cases will be taking over,” he said in the Stargate position paper.
“These small language models (as opposed to large language models) do not require such huge infrastructure environments. Sparse models, narrow models, low precision models – much research is currently being done to dramatically reduce the infrastructure needs of AI model development while retaining their accuracy rates.”
As an example, Stolarski explained that IBM has been building smaller models to drive revenue for its customers.
“If we can get more efficient [fewer GPUs], then the demand will increase,” he said.
“I think everybody who follows this market expected that we would get these efficiency gains, and it’s a matter of timing, and I think even now the knee-jerk reaction seems to be subsiding.”
Andrea Fox is senior editor of Healthcare IT News.
Email: afox@himss.org
Healthcare IT News is a HIMSS Media publication.