
Intel–SambaNova Collaboration Is One Answer to NVIDIA’s Groq Partnership, After It Became Clear GPUs Alone Can’t Dominate Inference

8 April 2026 at 15:11

Two smiling individuals stand in front of a screen displaying the 'SambaNova Systems' logo.

Inference is the next area of focus for compute providers, and after the NVIDIA-Groq partnership, the AI industry has realized it needs far more than just GPUs. This has led to a new pair emerging: Intel and SambaNova.

Intel's Xeon 6 CPUs Will Act as the Host For Agentic Systems, Backed By SambaNova's SN50 Chip For Decode

At this year's GTC, we saw NVIDIA talking about disaggregated inference, and how it has become important for them as a manufacturer to shift from their 'GPU-only' mentality, and instead bring in a relatively newer form of compute units into the infrastructure race. […]

Read full article at https://wccftech.com/intel-sambanova-collaboration-is-one-answer-to-nvidias-groq-partnership/

NVIDIA’s True Power Lies in Its Infrastructure, but There’s an Overlooked Dimension to Its Grip: Jensen’s ‘Web of Alliances’

25 March 2026 at 16:30

A person in a dark leather jacket is smiling and giving a thumbs-up in front of a softly lit, blue-toned background.

NVIDIA's position in the AI industry stems from its robust compute portfolio, but a WSJ report delves into Jensen's Web of Alliances, which is worth discussing.

NVIDIA's Groq Agreement & Investment Moves Are All Part of a Broader Motive to Drive the AI Industry

When we talk about the biggest beneficiaries of the current AI cycle, there's no doubt that NVIDIA is leading the race, given its unique position in providing compute power for our AI models. NVIDIA sees demand not just from AI labs like OpenAI and Anthropic, but also from hyperscalers like Meta, Amazon, and Google, and from […]

Read full article at https://wccftech.com/nvidia-true-power-lies-in-its-infrastructure-but-theres-an-overlooked-dimension/

Decoding the Future of Inference At NVIDIA: Groq LPUs Join Vera Rubin Platform For Low-Latency Inference

17 March 2026 at 16:00

With its upcoming Vera Rubin rackscale architecture, NVIDIA is going to be integrating LPUs from acquihire Groq, marking a major expansion beyond using GPUs alone for AI inference.

The post Decoding the Future of Inference At NVIDIA: Groq LPUs Join Vera Rubin Platform For Low-Latency Inference appeared first on ServeTheHome.

NVIDIA Unveils Vera Rubin With Groq’s LPX to Break Into Inference, a Market Where It Has Never Been First

16 March 2026 at 19:48

A presenter on stage with three open computer servers, showcasing internal components against a black background.

NVIDIA's Groq partnership is now formalizing, as Jensen unveils a hybrid compute tray featuring Groq's third-generation LPU units in a Rubin rack.

NVIDIA's Idea With Groq Is to Target 'High-Speed' Workloads, Hoping to Crack the Inference Competition

The debate over what NVIDIA would do with Groq has been ongoing for quite some time, and we have maintained a key lead on developments. At GTC 2026, NVIDIA unveiled a new Vera Rubin hybrid compute tray, the Groq 3 LPX, which features eight of the 'unannounced' Groq3 units, which we'll discuss ahead. According to NVIDIA, LPX and Rubin together deliver unprecedented inference […]

Read full article at https://wccftech.com/nvidia-unveils-vera-rubin-with-groq-lpx-to-break-into-inference/

NVIDIA May Finally Abandon Its “One GPU Does Everything” Mantra at GTC 2026, and Here’s What to Expect

15 March 2026 at 20:52

A person is standing on stage showcasing various open server units with visible cooling systems and hardware components.

We are heading towards GTC 2026, one of the most important events within the AI world, and this year, we are expecting a massive shift in how computing is perceived. The race for AI infrastructure has evolved significantly over the past few years, as changing compute requirements have forced companies like NVIDIA and AMD to innovate in what they offer. Since 2022, we have seen training workloads gain massive popularity, which Hopper and Blackwell capitalized on. Now, moving into 2026, agentic workloads are the next area to focus on for compute providers, which is why the upcoming GTC announcements from […]

Read full article at https://wccftech.com/nvidia-may-finally-abandon-its-one-gpu-does-everything-mantra-at-gtc-2026/

OpenAI Is Set to Be the Biggest Customer for the Upcoming NVIDIA-Groq AI Chip, Allocating 3GW of Dedicated ‘Inference Capacity’

28 February 2026 at 11:39

Two individuals are depicted in a black and white image with a background of server racks, one with a contemplative

OpenAI's newest partnership with NVIDIA not only focuses on Vera Rubin but also on inference capacity, which will be provided by the upcoming NVIDIA-Groq solution.

OpenAI Now Pivots Towards NVIDIA For Inference, Likely Being Optimistic With the Upcoming Groq Solution

OpenAI is currently engaged in financing deals with infrastructure partners all across the AI industry, and the AI giant recently announced $110 billion in fresh capital, driven by the likes of NVIDIA, SoftBank, and Amazon. OpenAI calls the investments a necessity to keep the AI bandwagon up and running, and they have been one of the ways the firm has […]

Read full article at https://wccftech.com/openai-is-set-to-be-the-biggest-customer-for-the-upcoming-nvidia-groq-ai-chip/

NVIDIA Says Groq Acquisition Will Play a Role Similar to Mellanox, Extending the Architecture as an “Accelerator” For Low-Latency Decode

26 February 2026 at 14:56

A presenter in a black leather jacket holds a large computing board with visible components and circuitry.

NVIDIA's plans for Groq's LPUs are a topic of debate in the industry, and when Jensen was asked about them during the Q4 2026 earnings call, he hinted at some interesting details.

NVIDIA's Groq LPUs Will Solidify the Company's Position In Latency-Sensitive Workloads

NVIDIA's acquisition spree has been aggressive this year. Still, one of the major partnerships that the company entered into was with Groq, a non-exclusive licensing agreement worth up to $20 billion, which is Team Green's biggest investment. The announcement did slip in on Christmas Eve, and NVIDIA never really followed up on actual plans. Interestingly, NVIDIA's CEO […]

Read full article at https://wccftech.com/nvidia-says-groq-acquisition-will-play-a-role-similar-to-mellanox/

This New AI Chipmaker, Taalas, Hard-Wires AI Models Into Silicon to Make Them Faster and Cheaper; Early Results Crush Modern Solutions

20 February 2026 at 18:21

The image shows a Taalas HCI Technology Demonstrator featuring the Llama 3.1 8B model, TSMC 6nm technology, 815mm² area, 53

Well, it appears that the chip startup Taalas has found a solution to LLM response latency and performance by creating dedicated hardware that 'hardwires' AI models.

Taalas Manages to Achieve 10x Higher TPS With Meta's Llama 8B LLM, That Too With 20x Lower Production Costs

When you look at today's world of AI compute, latency is emerging as a massive constraint for modern-day compute providers, mainly because, in an agentic environment, the primary moat lies in tokens-per-second (TPS) figures and how quickly you can get a task done. One solution the industry sees is integrating SRAM into their offerings, and […]

Read full article at https://wccftech.com/this-new-ai-chipmaker-taalas-hard-wires-ai-models-into-silicon-to-make-them-faster/
