SambaNova Launches the Fastest DeepSeek-R1 671B with the Highest Efficiency

PALO ALTO, Calif., February 13, 2025 /BUSINESS WIRE/ --

SambaNova, the generative AI company delivering the most efficient AI chips and fastest models, announces that DeepSeek-R1 671B is running today on SambaNova Cloud at 198 tokens per second (t/s), achieving speeds and efficiency that no other platform can match.

This press release features multimedia. View the full release here: https://www.businesswire.com/news/home/20250213759244/en/

SambaNova shrinks the hardware required to efficiently serve DeepSeek-R1 671B to a single rack (16 chips) — delivering 3X the speed (198 t/s) and 5X the efficiency of the latest GPUs (Graphic: Business Wire)

DeepSeek-R1 has reduced AI training costs by 10X, but its widespread adoption has been hindered by high inference costs and inefficiencies — until now. SambaNova has removed this barrier, unlocking real-time, cost-effective inference at scale for developers and enterprises.

“Powered by the SN40L RDU chip, SambaNova is the fastest platform running DeepSeek at 198 tokens per second per user,” stated Rodrigo Liang, CEO and co-founder of SambaNova. “This will increase to 5X faster than the latest GPU speed on a single rack — and by year end, we will offer 100X capacity for DeepSeek-R1.”

"Being able to run the full DeepSeek-R1 671B model — not a distilled version — at SambaNova's blazingly fast speed is a game changer for developers. Reasoning models like R1 need to generate a lot of reasoning tokens to come up with a superior output, which makes them take longer than traditional LLMs. This makes speeding them up especially important," stated Dr. Andrew Ng, Founder of DeepLearning.AI, Managing General Partner at AI Fund, and an Adjunct Professor at Stanford University’s Computer Science Department.

"Artificial Analysis has independently benchmarked SambaNova’s cloud deployment of the full 671 billion parameter DeepSeek-R1 Mixture of Experts model at over 195 output tokens/s, the fastest output speed we have ever measured for DeepSeek-R1. High output speeds are particularly important for reasoning models, as these models use reasoning output tokens to improve the quality of their responses. SambaNova’s high output speeds will support the use of reasoning models in latency sensitive use cases," said George Cameron, Co-Founder, Artificial Analysis.

SambaNova Solves DeepSeek’s Biggest Challenge: Inference at Scale

DeepSeek-R1 has revolutionized AI by reducing training costs tenfold; however, widespread adoption has stalled because DeepSeek-R1’s reasoning capabilities require significantly more compute for inference, making AI production costlier. In practice, the inefficiency of GPU-based inference has kept DeepSeek-R1 out of reach for most developers.

SambaNova has solved this problem. With a proprietary dataflow architecture and three-tier memory design, SambaNova’s SN40L Reconfigurable Dataflow Unit (RDU) chips collapse the hardware requirements to run DeepSeek-R1 671B efficiently from 40 racks (320 of the latest GPUs) down to 1 rack (16 RDUs) — unlocking cost-effective inference at unmatched efficiency.
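The rack and chip counts quoted above imply concrete reduction factors, which a quick calculation makes explicit. Note that the 8-GPUs-per-rack figure below is inferred from the release's numbers (320 GPUs across 40 racks), not stated directly:

```python
# Sanity check of the hardware-reduction figures quoted in the release.
gpu_racks = 40     # racks of latest-generation GPUs (per the release)
gpu_chips = 320    # total GPUs across those racks (per the release)
rdu_racks = 1      # SambaNova racks required
rdu_chips = 16     # SN40L RDUs in that single rack

gpus_per_rack = gpu_chips // gpu_racks    # inferred: 8 GPUs per rack
rack_reduction = gpu_racks / rdu_racks    # 40x fewer racks
chip_reduction = gpu_chips / rdu_chips    # 20x fewer chips

print(f"{gpus_per_rack} GPUs/rack, "
      f"{rack_reduction:.0f}x fewer racks, "
      f"{chip_reduction:.0f}x fewer chips")
```

So while the release headlines a 40x rack reduction, the chip-count reduction is 20x, reflecting that a GPU rack holds fewer accelerators than the 16-RDU SambaNova rack.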

“DeepSeek-R1 is one of the most advanced frontier AI models available, but its full potential has been limited by the inefficiency of GPUs,” said Rodrigo Liang, CEO of SambaNova. “That changes today. We’re bringing the next major breakthrough — collapsing inference costs and reducing hardware requirements from 40 racks to just one — to offer DeepSeek-R1 at the fastest speeds, efficiently.”

“More than 10 million users and engineering teams at Fortune 500 companies rely on Blackbox AI to transform how they write code and build products. Our partnership with SambaNova plays a critical role in accelerating our autonomous coding agent workflows. SambaNova’s chip capabilities are unmatched for serving the full R1 671B model, which provides much better accuracy than any of the distilled versions. We couldn’t ask for a better partner to work with to serve millions of users,” stated Robert Rizk, CEO of Blackbox AI.

Coming Next: The Most Efficient DeepSeek API in the World, at 100X Current Global Capacity

Sumti Jairath, Chief Architect, SambaNova, explained: “DeepSeek-R1 is the perfect match for SambaNova’s three-tier memory architecture. With 671 billion parameters, R1 is the largest open-source large language model released to date, which means it needs a lot of memory to run. GPUs are memory constrained, but SambaNova’s unique dataflow architecture means we can run the model efficiently to achieve 20,000 tokens/s of total rack throughput in the near future — unprecedented efficiency when compared to GPUs due to their inherent memory and data communication bottlenecks.”
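The two speed figures in the release — roughly 20,000 tokens/s of projected aggregate rack throughput and 198 tokens/s per user — together imply how many concurrent streams one rack could serve. This is a rough back-of-the-envelope estimate, assuming throughput divides evenly across identical streams, which real batched serving only approximates:

```python
# Rough implication of the quoted numbers: aggregate rack throughput
# divided by per-user decode speed gives an order-of-magnitude estimate
# of concurrent user streams per rack.
rack_throughput_tps = 20_000   # projected total rack throughput (per the quote)
per_user_tps = 198             # per-user speed (per the release)

concurrent_streams = rack_throughput_tps // per_user_tps
print(f"~{concurrent_streams} concurrent streams per rack")
```

On these numbers, a single 16-RDU rack would serve on the order of 100 simultaneous users at full per-user speed.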

SambaNova is rapidly scaling its capacity to meet anticipated demand, and by the end of the year will offer more than 100x the current global capacity for DeepSeek-R1. This makes its RDUs the most efficient enterprise solution for reasoning models.

Get Early Access to R1 on SambaNova Cloud

The full DeepSeek-R1 671B model is available now on SambaNova Cloud for all users to experience, with API access for select users. To try it today, visit cloud.sambanova.ai.
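For developers with API access, a request would look roughly like the sketch below. The base URL and model identifier are assumptions based on SambaNova Cloud exposing an OpenAI-compatible chat-completions endpoint; consult the documentation at cloud.sambanova.ai for the exact values and to obtain an API key:

```python
# Minimal sketch of calling DeepSeek-R1 on SambaNova Cloud.
# BASE_URL and the model name are assumptions, not confirmed by this release;
# check the SambaNova Cloud docs before use.
import json
import urllib.request

API_KEY = "YOUR_SAMBANOVA_API_KEY"         # placeholder; get a real key first
BASE_URL = "https://api.sambanova.ai/v1"   # assumed OpenAI-compatible endpoint

payload = {
    "model": "DeepSeek-R1",                # assumed model identifier
    "messages": [
        {"role": "user",
         "content": "Why do reasoning models benefit from fast decoding?"}
    ],
    "stream": False,
}

request = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# Uncomment to send the request once a real API key is set:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])

print(json.dumps(payload, indent=2))
```

Because reasoning models like R1 emit many intermediate reasoning tokens before the final answer, streaming responses (`"stream": True`) is usually preferable in interactive applications.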

About SambaNova Systems

Customers turn to SambaNova to quickly deploy state-of-the-art generative AI capabilities within the enterprise. Our purpose-built enterprise-scale AI platform is the technology backbone for the next generation of AI computing.

Headquartered in Palo Alto, California, SambaNova Systems was founded in 2017 by industry luminaries, and hardware and software design experts from Sun/Oracle and Stanford University. Investors include SoftBank Vision Fund 2, funds and accounts managed by BlackRock, Intel Capital, GV, Walden International, Temasek, GIC, Redline Capital, Atlantic Bridge Ventures, Celesta, and several others. Visit us at sambanova.ai or contact us at info@sambanova.ai. Follow SambaNova Systems on LinkedIn and on X.
