

AI Boom Gives Rise To 'GPU-as-a-Service'
An anonymous reader quotes a report from IEEE Spectrum: The surge of interest in AI is creating a massive demand for computing power. Around the world, companies are trying to keep up with the vast number of GPUs needed to power more and more advanced AI models. While GPUs are not the only option for running an AI model, they have become the hardware of choice due to their ability to efficiently handle multiple operations simultaneously -- a critical feature when developing deep learning models. But not every AI startup has the capital to invest in the huge numbers of GPUs now required to run a cutting-edge model. For some, it's a better deal to outsource it. This has led to the rise of a new business: GPU-as-a-Service (GPUaaS). In recent years, companies like Hyperbolic, Kinesis, Runpod, and Vast.ai have sprouted up to remotely offer their clients the needed processing power.
[...] Studies have shown that more than half of the existing GPUs are not in use at any given time. Whether we're talking personal computers or colossal server farms, a lot of processing capacity is under-utilized. What Kinesis does is identify idle compute -- both for GPUs and CPUs -- in servers worldwide and compile them into a single computing source for companies to use. Kinesis partners with universities, data centers, companies, and individuals who are willing to sell their unused computing power. Through special software installed on their servers, Kinesis detects idle processing units, preps them, and offers them to their clients for temporary use. [...] The biggest advantage of GPUaaS is economic. By removing the need to purchase and maintain the physical infrastructure, it allows companies to avoid investing in servers and IT management, and to instead put their resources toward improving their own deep learning, large language, and large vision models. It also lets customers pay for only the GPU capacity they actually use, saving the costs of the inevitable idle compute that would come with their own servers. The report notes that GPUaaS is growing in profitability. "In 2023, the industry's market size was valued at US $3.23 billion; in 2024, it grew to $4.31 billion," reports IEEE. "It's expected to rise to $49.84 billion by 2032."
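To make the "detects idle processing units" step concrete, here is a minimal Python sketch of the general idea: poll nvidia-smi for utilization and report GPUs that look idle so a broker could list them. The thresholds and structure are assumptions for illustration, not Kinesis's actual (non-public) software.

import subprocess

def find_idle_gpus(util_threshold_pct=5, mem_threshold_mb=500):
    """Return indices of GPUs whose utilization and memory use fall below the
    (assumed) thresholds, i.e. candidates for renting out."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=index,utilization.gpu,memory.used",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    idle = []
    for line in out.strip().splitlines():
        index, util, mem = (int(x) for x in line.split(","))
        if util < util_threshold_pct and mem < mem_threshold_mb:
            idle.append(index)
    return idle

if __name__ == "__main__":
    print("Idle GPUs on this host:", find_idle_gpus())

A real broker would also have to poll repeatedly, verify the hardware, and schedule jobs around the owner's own usage; this only shows the detection step.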
Not unexpected (Score:2, Informative)
Given the low returns on "AI", you gotta do something with those graphic cards.
You may as well rent them out to the cryptobros for them to mine $TRUMP and $MELANIA.
Otherwise the financial goals of "AGI" aren't going to happen, eh Sam?
Re: (Score:2)
How can I earn a few bucks by renting out a spare S3 Savage 3D?
Re: (Score:2)
How can I earn a few bucks by renting out a spare S3 Savage 3D?
Find some other people like you, build together a Beowulf cluster, plaster on an "API" and you're set.
Re: (Score:2)
"Find some other people like you, build together a Beowulf cluster..."
How many would "some" have to be to make it at least break even?
100? 1000? More?
Re: (Score:2)
Send me an email and I'll send you back a business plan for your investors and an invoice.
Re: (Score:3)
Welcome. Two conditions, you have to swear fealty and to be certain in your heart that Beowulf clusters of graphics are a thing.
Re: (Score:2)
You're hired :)
Re: (Score:3)
Pretty sure you can't mine TRUMP. It's a memecoin. Most of them aren't mineable. DOGE was an exception since it has its own chain (being a fork of Litecoin).
That being said, you can rent rigs for mining. You've been able to do so for years now.
Re: (Score:2)
Could be. The point is that "AI" rigs are available because there is less need for those "AI" services than what was projected by the aibros.
Re: (Score:2)
As mentioned elsewhere in this thread, you have it precisely backwards.
Re: (Score:3)
You have it exactly backwards. These are normal people with gaming GPUs renting them out for AI.
Gaming GPUs have limited VRAM, but there's still plenty of tasks they're useful for. I've rented some before.
Hardware as a service (Score:2)
GPU-as-a-service is just specialized hardware-as-a-service, which looks convenient to corporate management the same way consultants look convenient to corporations, but in the end it means higher costs.
Someone has sold the idea of "you own nothing and you'll be valuable" to corporate management.
Re: (Score:3)
You have it precisely backwards. These are not "new datacentres being leased out because AI companies don't have enough work for them". They're everything from obsolete servers to gaming PCs being rented out, TO AI companies, because said AI companies don't have enough capacity.
A lot of them are scraped together from parts from Ebay and run in places with cheap power specifically as a moneymaking operation. An RTX 3090 gaming card may earn you ~$7/day / ~$2500/yr. An outdated server card like an A6000, due
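For rough context on that ~$7/day figure, here is a quick back-of-the-envelope sketch in Python; the card price, power draw, electricity rate, and utilization are assumed numbers for illustration, not figures from the comment above.

# Payback estimate for renting out a gaming card. Only gross_per_day comes from
# the parent comment; everything else is an assumption.
card_price_usd = 900            # assumed price for a used RTX 3090
gross_per_day = 7.00            # ~$7/day figure quoted above
power_draw_kw = 0.35            # assumed average draw while rented
electricity_usd_per_kwh = 0.10  # assumed "cheap power" rate
utilization = 0.8               # assumed fraction of the day the card is rented

power_cost_per_day = power_draw_kw * 24 * electricity_usd_per_kwh
net_per_day = gross_per_day * utilization - power_cost_per_day
print(f"net/day: ${net_per_day:.2f}, payback: {card_price_usd / net_per_day:.0f} days")
# With these numbers: net/day is about $4.76, so payback is roughly half a year.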
Article link is wrong (Score:4, Informative)
Re: (Score:1)
That's why OpenAI had to make a devil's bargain with Azure, they're desperate for cheap compute.
Re: (Score:2)
Vast.AI is both cheaper and easier to set up for AI applications.
ASIC's will come (Score:1)
ASICs will come in time.
Basically we've seen the peak already with the 3090; there has been no real gain from the 4090 or 5090 that we're seeing. All Nvidia has done is pack more into the die and push as close to the 600W limit as possible, and when the last card was also pretty close to 600W, it's kind of hard to get any more power out of a desktop GPU now.
So now is the time for fixed-logic ASICs for reusable logic, going back 20 years to when hardware T&L was a buzzword. GPUs are highly programmable
Re: (Score:3)
ASICs will come in time.
Basically we've seen the peak already with the 3090; there has been no real gain from the 4090 or 5090 that we're seeing.
Uhh, no one is doing serious AI with their gaming GPUs.
ASICs might one day encroach on Nvidia's market dominance. It's not clear when and if that will happen. Remember that the most mature ASIC is Google's TPU, which is already on its 6th iteration and at least 10th year of development. Maybe the 11th year will be the magical year, maybe the 20th year, maybe never. We'll see.
It's also not clear if a viable Nvidia competitor will appear as an ASIC or a GPU. My bet is on an AMD/Intel GPU over an ASIC, al
Re: (Score:3)
This isn't really accurate. Nobody serious is training medium or large models on gaming GPUs, or doing inference on large models on gaming GPUs, but training small models and running inference on medium and small models is perfectly fine on gaming GPUs. The advantage is that they rent out really cheaply online, so if your task doesn't need the VRAM, they're not a bad choice.
Also, re: ASICs, the problem is that the architecture keeps changing so quickly. In the past several years we've gone from running in
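To put rough numbers on "doesn't need the VRAM", here is a quick rule-of-thumb sketch; the bytes-per-parameter and overhead factor are coarse assumptions, not a precise sizing method.

def inference_vram_gb(params_billions, bytes_per_param=2, overhead=1.2):
    """Approximate VRAM to hold fp16/bf16 weights, with a rough fudge factor
    for activations and KV cache (both values are assumptions)."""
    return params_billions * bytes_per_param * overhead

for size in (3, 7, 13, 70):
    need = inference_vram_gb(size)
    fits = "fits" if need <= 24 else "does not fit"
    print(f"{size}B params -> ~{need:.1f} GB ({fits} in a 24 GB RTX 3090)")

By this estimate a 7B model fits comfortably in a 24 GB gaming card, a 13B model needs quantization, and a 70B model does not fit at all, which matches the small/medium/large split in the comment above.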
Re: (Score:2)
This isn't really accurate. Nobody serious is training medium or large models on gaming GPUs, or doing inference on large models on gaming GPUs, but training small models and running inference on medium and small models is perfectly fine on gaming GPUs. The advantage is that they rent out really cheaply online, so if your task doesn't need the VRAM, they're not a bad choice.
Yes, people do this at home. I do this at home. I don't consider this to be "serious", either in terms of work that anyone else cares about or in terms of making a material impact on data center sales.
Also, re: ASICs, the problem is that the architecture keeps changing so quickly. ... That said, yes, there are ASICs - Groq (not Grok), Cerebras, etc. They're super-fast for inference.
We're in the early stages of research into AI use cases and models. So, the models and use cases change quickly. As with almost all ASICs, there are some specialized use cases where ASICs beat general-purpose processors. For embedded systems where the design and manufacturing don't change for many years, AS
Re: (Score:2)
Not every useful model has hundreds of billions of parameters. The #2 model on Huggingface right now is hexgrad/Kokoro-82M, an (obviously) 82M-parameter text-to-speech model. There is no reason you should be renting a GB200 to train something like that - exactly how large of batches are you picturing? The
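Rough training-memory arithmetic for a model of that size backs this up; this is a sketch assuming fp32 weights with plain Adam and ignoring activation memory, which depends on batch size and architecture.

# Memory just for parameters, gradients, and Adam optimizer state at fp32.
params = 82e6                       # Kokoro-82M
bytes_weights = params * 4          # fp32 weights
bytes_grads = params * 4            # gradients
bytes_adam = params * 4 * 2         # Adam first and second moments
total_gb = (bytes_weights + bytes_grads + bytes_adam) / 1e9
print(f"~{total_gb:.2f} GB before activations")   # about 1.3 GB

About 1.3 GB before activations, so even a modest gaming GPU has room to spare; a GB200-class part would be almost entirely idle on a job like this.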
Re: (Score:3)
AI is not a single-faceted mathematical problem. We already have specific purpose-built devices for AI workloads; they are referred to as GPUs because they follow a similar kind of generalised architecture to consumer stuff, but really they are nothing alike. You can't throw ASICs at every problem. ASICs are ideal for problems with highly defined mathematical boundaries. Training AI models is not one of them; there are multiple steps in the model training that involve a variety of mathematical problems, whi
Re: (Score:2)
Also, before you point to "AI acceleration" in consumer products, note that the application of an AI model is a different problem than training a model.
Re: (Score:3)
I'd call it a multicore processor, but they call it an ASIC presumably because they designed it for a specific type of workload. So then, is an H200 a GPU or is it an ASIC? It is designed to run CUDA workloads well, isn't it? Can it run ROCm? If it isn't a GPU "but really they are nothing alike", then it sounds like an ASIC
Re: (Score:2)
Sure. I guess when you talk about an Application Specific Integrated Circuit you could consider anything an ASIC. Maybe Intel CPUs are ASICs if you define the application as running general-purpose computational tasks. But I don't know anyone who uses the term ASIC like that. That includes Meta. Can you point to where they call MTIA an ASIC? Certainly on their official AI blogs they don't use the term ASIC at any time when talking about their device.
That said, their blogs do sort of point to it being an ASIC by d
Re: ASIC's will come (Score:2)
Our solution to this challenge was to design a family of recommendation-specific Meta Training and Inference Accelerator (MTIA) ASICs. We co-designed the first-generation ASIC with next-generation recommendation model requirements in mind and...
I Wonder Which Game will Be the First (Score:2)
I wonder which game will be the first to incorporate this capability into their software.
Re: (Score:2)
I'm confused as to what you're envisioning. A game renting out your computer's GPU, at the time it's needed the most? Or a game renting out someone else's GPU, and routing all of the (huge) bandwidth needs (with incredibly low latency requirements) over the internet?