-
Wccftech
- Groq’s Inference Chips Are Beating NVIDIA’s Blackwell by 5x on Cost – And Doing It Twice as Fast
As AI computing capacity continues to grow, an expert from computing infrastructure provider Nebius sat down with AlphaSense to describe the state of the industry. While NVIDIA's leading-edge AI GPUs remain the industry's performance leaders, the expert believes alternatives are growing in popularity, particularly as the industry shifts its cost metrics. Demand for AI computing capacity also remains high, as providers can easily run at 100% utilization to drive down costs and maximize the return on their investment. Alternatives To NVIDIA Chips Grow In Popularity As Industry Shifts Towards Cost Per Million Tokens […]
Read full article at https://wccftech.com/nvidias-ai-chips-see-alternatives-emerge-amidst-pricing-model-shift-to-cost-per-million-tokens/

-
Wccftech
- Intel–SambaNova Collaboration Is One Answer to NVIDIA’s Groq Partnership, After It Became Clear GPUs Alone Can’t Dominate Inference
Inference is the next area of focus for compute providers, and after the NVIDIA-Groq partnership, the AI industry has realized it needs far more than just GPUs. This has led to a new pairing: Intel and SambaNova. Intel's Xeon 6 CPUs Will Act as the Host For Agentic Systems, Backed By SambaNova's SN50 Chip For Decode At this year's GTC, we saw NVIDIA talking about disaggregated inference, and how it has become important for the company to shift away from its 'GPU-only' mentality and instead bring a newer form of compute unit into the infrastructure race. […]
Read full article at https://wccftech.com/intel-sambanova-collaboration-is-one-answer-to-nvidias-groq-partnership/

-
Wccftech
- NVIDIA’s True Power Lies in Its Infrastructure, but There’s an Overlooked Dimension to Its Grip: Jensen’s ‘Web of Alliances’
NVIDIA's position in the AI industry stems from its robust compute portfolio, but a WSJ report delves into Jensen's Web of Alliances, which is worth discussing. NVIDIA's Groq Agreement & Investment Moves Are All Part of a Broader Motive to Drive the AI Industry When we talk about the biggest beneficiaries of the current AI cycle, there's no doubt that NVIDIA is leading the race, given its unique position in providing compute power for our AI models. NVIDIA sees demand not just from AI labs like OpenAI and Anthropic, but also from hyperscalers like Meta, Amazon, and Google, and from […]
Read full article at https://wccftech.com/nvidia-true-power-lies-in-its-infrastructure-but-theres-an-overlooked-dimension/

-
ServeTheHome
- Decoding the Future of Inference At NVIDIA: Groq LPUs Join Vera Rubin Platform For Low-Latency Inference
With its upcoming Vera Rubin rackscale architecture, NVIDIA is going to be integrating LPUs from acquihire Groq, marking a major expansion beyond using GPUs alone for AI inference
The post Decoding the Future of Inference At NVIDIA: Groq LPUs Join Vera Rubin Platform For Low-Latency Inference appeared first on ServeTheHome.
-
Wccftech
- NVIDIA Unveils Vera Rubin With Groq’s LPX to Break Into Inference, a Market Where It Has Never Been First
NVIDIA's Groq partnership is now formalizing, as Jensen unveils a hybrid compute tray featuring Groq's third-generation LPU units in a Rubin rack. NVIDIA's Idea With Groq Is to Target 'High-Speed' Workloads, Hoping to Crack the Inference Competition The debate over what NVIDIA would do with Groq has been ongoing for quite some time, and we have closely tracked the developments. At GTC 2026, NVIDIA unveiled a new Vera Rubin hybrid compute tray, the Groq 3 LPX, which features eight of the as-yet-unannounced Groq3 units, discussed below. According to NVIDIA, LPX and Rubin together deliver unprecedented inference […]
Read full article at https://wccftech.com/nvidia-unveils-vera-rubin-with-groq-lpx-to-break-into-inference/

-
Wccftech
- NVIDIA May Finally Abandon Its “One GPU Does Everything” Mantra at GTC 2026, and Here’s What to Expect
We are heading towards GTC 2026, one of the most important events in the AI world, and this year, we are expecting a massive shift in how computing is perceived. The race for AI infrastructure has evolved significantly over the past few years, as changing compute requirements have forced companies like NVIDIA and AMD to innovate in what they offer. Since 2022, we have seen training workloads gain massive popularity, which Hopper and Blackwell capitalized on. Now, moving into 2026, agentic workloads are the next focus area for compute providers, which is why the upcoming GTC announcements from […]
Read full article at https://wccftech.com/nvidia-may-finally-abandon-its-one-gpu-does-everything-mantra-at-gtc-2026/

-
Wccftech
- OpenAI Is Set to Be the Biggest Customer for the Upcoming NVIDIA-Groq AI Chip, Allocating 3GW of Dedicated ‘Inference Capacity’
OpenAI's newest partnership with NVIDIA not only focuses on Vera Rubin but also on inference capacity, which will be provided by the upcoming NVIDIA-Groq solution. OpenAI Now Pivots Towards NVIDIA For Inference, Likely Optimistic About the Upcoming Groq Solution OpenAI is currently engaged in financing deals with infrastructure partners all across the AI industry, and the AI giant recently announced $110 billion in fresh capital, driven by the likes of NVIDIA, SoftBank, and Amazon. OpenAI calls the investments a necessity to keep the AI bandwagon up and running, and they have been one of the ways the firm has […]
Read full article at https://wccftech.com/openai-is-set-to-be-the-biggest-customer-for-the-upcoming-nvidia-groq-ai-chip/

-
Wccftech
- NVIDIA Says Groq Acquisition Will Play a Role Similar to Mellanox, Extending the Architecture as an “Accelerator” For Low-Latency Decode
NVIDIA's plans for Groq's LPU units are a topic of debate in the industry, and when Jensen was asked about them during the Q4 2026 earnings call, he hinted at some interesting plans. NVIDIA's Groq LPUs Will Solidify the Company's Position In Latency-Sensitive Workloads NVIDIA's acquisition spree has been aggressive this year. Still, one of the major partnerships the company entered into was with Groq, a non-licensing agreement worth up to $20 billion, which is Team Green's biggest investment. The announcement slipped in on Christmas Eve, and NVIDIA never really followed up on actual plans. Interestingly, NVIDIA's CEO […]
Read full article at https://wccftech.com/nvidia-says-groq-acquisition-will-play-a-role-similar-to-mellanox/

-
Wccftech
- This New AI Chipmaker, Taalas, Hard-Wires AI Models Into Silicon to Make Them Faster and Cheaper; Early Results Crush Modern Solutions
Well, it appears that the chip startup Taalas has found a solution to LLM response latency by creating dedicated hardware that 'hardwires' AI models into silicon. Taalas Manages to Achieve 10x Higher TPS With Meta's Llama 8B LLM, That Too With 20x Lower Production Costs When you look at today's world of AI compute, latency is emerging as a massive constraint for compute providers, mainly because, in an agentic environment, the primary moat lies in tokens-per-second (TPS) figures and how quickly you can get a task done. One solution the industry sees is integrating SRAM into its offerings, and […]
Read full article at https://wccftech.com/this-new-ai-chipmaker-taalas-hard-wires-ai-models-into-silicon-to-make-them-faster/
