
The Dawn of Sovereign Silicon: Analyzing the G42-Cerebras 8-Exaflop Deployment in India

February 20, 2026 marks a watershed moment in the geopolitics of artificial intelligence. In a move that fundamentally alters the balance of computational power in the Global South, the UAE’s G42 has partnered with Cerebras to deploy 8 exaflops of AI compute in India. The initiative, unveiled at the AI Impact Summit in New Delhi, is not merely a hardware acquisition; it is a declaration of “AI Sovereignty” by India, brokered by UAE capital and powered by American wafer-scale engineering.

The collaboration involves G42, the Abu Dhabi-based AI holding company, Cerebras Systems, the Silicon Valley pioneer of the Wafer-Scale Engine (WSE), and Indian partners including the Centre for Development of Advanced Computing (C-DAC). Together, they are erecting a computational fortress capable of 8 exaflops (FP16/AI) of performance. To put this in perspective, this single deployment rivals the aggregate AI compute capacity of entire nations, positioning India to train foundational models on the scale of GPT-5 or Claude locally, without data ever crossing its borders.

This article provides a technical deconstruction of the architecture behind this deal, the hardware specifications of the Cerebras CS-3 clusters involved, and the strategic implications for global AI development. We will explore how this infrastructure supports the newly released NANDA 87B Hindi-English model and what it means for the future of sovereign compute.

Deconstructing the 8-Exaflop Architecture

The headline figure of “8 exaflops” warrants technical scrutiny. In the realm of High-Performance Computing (HPC), exaflops are typically measured in FP64 (double-precision). However, in the context of Generative AI, we refer to AI-FLOPS, usually FP16 or increasingly FP8 with sparsity. The infrastructure deployed by G42 and Cerebras is built upon the Condor Galaxy design pattern, a distributed AI supercomputer architecture.

The Cerebras CS-3 and WSE-3 Engine

At the heart of this deployment is the Cerebras CS-3 system, powered by the Wafer-Scale Engine 3 (WSE-3). Unlike traditional GPU clusters, which rely on thousands of discrete chips connected by complex cabling (InfiniBand or Ethernet), the WSE-3 is a single chip the size of an entire silicon wafer.

  • Core Density: The WSE-3 contains 4 trillion transistors and 900,000 AI-optimized cores on a single slab of silicon.
  • On-Chip Memory: It boasts 44GB of on-chip SRAM, a critical architectural divergence from GPUs (like the NVIDIA H100 or Blackwell B200), which rely on off-chip HBM. The WSE-3’s aggregate memory bandwidth is 21 petabytes per second, roughly 7,000x the HBM bandwidth of a single H100 (see the quick calculation below).
  • Interconnect Efficiency: In a GPU cluster, data must travel between chips, introducing latency. On the WSE-3, all cores are on the same silicon, communicating at silicon speeds. This allows the system to train models with near-linear scaling.
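
The bandwidth claim is easy to sanity-check against the vendor figures quoted above. A minimal Python sketch comparing the WSE-3’s aggregate SRAM bandwidth to a single H100’s HBM3 bandwidth:

```python
# Sanity check of the on-chip bandwidth gap, using vendor-published figures.
WSE3_SRAM_BANDWIDTH = 21e15   # 21 PB/s aggregate on-chip SRAM bandwidth (WSE-3)
H100_HBM_BANDWIDTH = 3.35e12  # 3.35 TB/s HBM3 bandwidth (single NVIDIA H100)

ratio = WSE3_SRAM_BANDWIDTH / H100_HBM_BANDWIDTH
print(f"WSE-3 aggregate bandwidth ~= {ratio:,.0f}x a single H100's HBM3")
# -> WSE-3 aggregate bandwidth ~= 6,269x a single H100's HBM3
```

Note that this pits an entire wafer against one GPU, which is the basis for the rounded ~7,000x figure.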

For a detailed comparison of wafer-scale architectures against traditional GPU setups, refer to our analysis, Wafer Scale Revolution: Benchmarking OpenAI’s GPT-5.3 Architecture.

Cluster Topology and Cooling

Deploying 8 exaflops requires a constellation of CS-3 systems. A single CS-3 delivers roughly 125 petaFLOPS of peak AI performance. Therefore, reaching 8 exaflops implies a cluster of approximately 64 interconnected CS-3 units, likely arranged in a Condor Galaxy-style topology.
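
The arithmetic behind that unit count is straightforward. A back-of-the-envelope sketch, using the per-system peak figure quoted above:

```python
import math

TARGET_EXAFLOPS = 8        # announced cluster capacity (FP16/AI)
CS3_PEAK_PETAFLOPS = 125   # peak AI performance per CS-3 system

systems_needed = math.ceil(TARGET_EXAFLOPS * 1_000 / CS3_PEAK_PETAFLOPS)
print(f"CS-3 systems required: {systems_needed}")  # -> 64
```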

This density presents unique facility challenges. While a single CS-3 replaces racks of GPUs, it requires specialized cooling and power delivery. The India deployment will rely on liquid-to-chip cooling to manage the thermal density of wafer-scale processing. This aligns with the broader industry shift toward specialized data centers, as seen in Mistral’s infrastructure pivot in the Nordics, where cooling efficiency drives architectural decisions.

The Strategic Pivot: Sovereign AI and Data Residency

The G42-Cerebras partnership to deploy 8 exaflops of compute in India is driven by the concept of “Sovereign AI.” Nations increasingly view AI compute not as a commercial service, but as strategic national infrastructure, similar to energy grids or defense systems.

Why Sovereignty Matters

For India, relying on US-based cloud providers (AWS, Azure, GCP) involves latency and, more critically, data-jurisdiction risks. By hosting the hardware physically within India (likely in data centers in Mumbai or Hyderabad, operated under Indian jurisdiction), the government ensures that sensitive citizen data used to train models like NANDA remains subject to Indian law.

This mirrors trends we have observed globally. For instance, Chile’s sovereign AI architecture and the sovereign compute initiatives in Europe highlight a move away from centralized American hyperscalers. The G42-Cerebras deal accelerates this for India, leapfrogging the lengthy process of procuring restricted GPUs.

The US-UAE-India Triangle

This deal also navigates complex export controls. G42 has recently pivoted closer to US regulatory frameworks (securing investment from Microsoft and shedding Chinese hardware dependencies). By utilizing Cerebras (a US company), the deployment satisfies US export control requirements while serving India’s non-aligned technological interests. It is a deft geopolitical maneuver: using UAE capital to bring US innovation to Indian soil.

Application Layer: NANDA 87B and Beyond

Hardware without models is just an expensive heater. The immediate beneficiary of this 8-exaflop injection is the NANDA model family. Developed by Inception (a G42 subsidiary), MBZUAI, and Cerebras, NANDA is a Hindi-English LLM designed to capture the linguistic nuances of South Asia.

NANDA 87B Specs & Training:

  • Parameter Count: 87 Billion active parameters.
  • Training Token Count: Likely exceeding 4 Trillion tokens, heavily weighted towards Hindi, Tamil, and Telugu datasets.
  • Architecture: Decoder-only transformer with rotary embeddings, optimized for the WSE-3’s large SRAM capacity.
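
Distribution details for NANDA 87B have not been published. If the weights are eventually released for self-hosting, access would likely follow a standard Hugging Face workflow; the sketch below is hypothetical, and the repository ID is a placeholder, not a confirmed path:

```python
# Hypothetical loading sketch; "inceptionai/nanda-87b" is a placeholder repo ID.
# Requires transformers, torch, and accelerate.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inceptionai/nanda-87b"  # assumption, not a confirmed release path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

prompt = "भारत में AI का भविष्य"  # "The future of AI in India" (Hindi)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```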

The 8-exaflop cluster will allow Indian researchers to fine-tune NANDA 87B and train successors (potentially NANDA-Large at 400B+ parameters) in weeks rather than months. This capacity is crucial for Enterprise AI adoption, where businesses need custom models trained on proprietary data without it leaving the country.
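
The “weeks rather than months” claim can be grounded with the standard ~6ND approximation for dense transformer training FLOPs. A rough sketch, where the 40% utilization and dense-training assumptions are ours, not published figures:

```python
# Rough training-time estimate via the common ~6*N*D FLOPs approximation.
PARAMS = 87e9          # NANDA 87B parameters
TOKENS = 4e12          # assumed ~4T-token training corpus
CLUSTER_FLOPS = 8e18   # 8 exaflops (FP16/AI), peak
UTILIZATION = 0.40     # assumed model FLOPs utilization (MFU)

total_flops = 6 * PARAMS * TOKENS                      # ~2.1e24 FLOPs
seconds = total_flops / (CLUSTER_FLOPS * UTILIZATION)
print(f"~{seconds / 86_400:.1f} days at {UTILIZATION:.0%} MFU")
# -> ~7.6 days at 40% MFU
```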

Comparative Analysis: WSE-3 vs. The Market

To understand the magnitude of the G42-Cerebras deployment, we must compare it to alternative sovereign AI strategies.

| Feature | Cerebras WSE-3 Cluster | NVIDIA H100 Cluster | Implication for India |
| --- | --- | --- | --- |
| Scaling Logic | Data Parallel (Simplified) | Model/Pipeline Parallelism | Lower engineering barrier for Indian startups |
| Memory Bandwidth | 21 PB/s (SRAM) | 3.35 TB/s (HBM3) | Faster training for large context windows |
| Supply Chain | TSMC 5nm (Managed by US) | TSMC 4N (Heavily constrained) | Faster time-to-deployment via G42 partnership |

The choice of Cerebras allows India to bypass the extreme supply shortages affecting NVIDIA GPUs. While the rest of the world waits 50 weeks for H100 delivery, the G42-Cerebras alliance is deploying now. This speed is akin to the Blackstone strategic injection into sovereign compute, where capital and availability trump pure theoretical benchmarks.

Challenges and Troubleshooting the Deployment

Deploying 8 exaflops is not without technical peril. Several friction points typically emerge in projects of this scale:

1. Power Density and Grid Stability

A cluster of this magnitude consumes megawatts of power. Ensuring stable, clean power in India’s grid infrastructure is non-trivial. The facility will likely require dedicated substations and backup generation, similar to the requirements discussed in Hardware Prerequisites for GPT-OSS Clusters.
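
Orders of magnitude are easy to estimate. A sketch assuming roughly 23 kW per system (Cerebras’ published figure for the CS-2 generation; the CS-3 figure is an assumption here) and an assumed facility PUE of 1.3:

```python
# Rough facility power estimate; per-system draw and PUE are assumptions.
SYSTEMS = 64          # CS-3 units implied by the 8-exaflop target
KW_PER_SYSTEM = 23    # assumed per-system draw (CS-2 generation figure)
PUE = 1.3             # assumed power usage effectiveness of the facility

it_load_mw = SYSTEMS * KW_PER_SYSTEM / 1_000
facility_mw = it_load_mw * PUE
print(f"IT load ~{it_load_mw:.1f} MW; facility draw ~{facility_mw:.1f} MW")
# -> IT load ~1.5 MW; facility draw ~1.9 MW
```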

2. Talent Gap

While India has a massive developer base, engineers experienced in wafer-scale optimization are rare. Standard CUDA skills do not translate directly to the Cerebras software stack (CSoft). G42 and Cerebras will need to launch extensive training programs, potentially leveraging the C-DAC partnership to upskill the workforce. This echoes the workforce architecture shifts we analyzed in IBM’s 2026 hiring strategy.

3. Model Governance

Running uncensored or sovereign models requires robust safety frameworks. As discussed in Uncensored LLM Architectures, there is a tension between performance and safety. India’s “India AI Mission” will need to define clear guardrails for how this massive compute is used, especially in public sector applications.

Future Outlook: India as an AI Superpower

This deployment positions India to move from an AI consumer to an AI producer. With 8 exaflops, India can host its own “Foundation World Models,” potentially rivaling efforts like Google’s Project Genie. We can expect to see a surge in Hindi, Bengali, and Marathi voice agents, agricultural AI advisors, and digital public infrastructure (DPI) powered by this cluster.

Furthermore, this paves the way for India to explore containerized swarm systems, where multiple specialized models run concurrently on the high-bandwidth Cerebras fabric to solve complex civic problems.

Frequently Asked Questions

Q: What does “8 exaflops” mean in this context?
In AI terms, this refers to 8 quintillion operations per second, likely at FP16 precision. It represents enough power to train a model larger than GPT-4 from scratch in a reasonable timeframe.

Q: Why did G42 choose Cerebras over NVIDIA?
Cerebras offers better supply availability and superior memory bandwidth for training massive models. Additionally, the Cerebras architecture simplifies the cluster design, reducing the complexity of networking required for a sovereign cloud.

Q: Will this computer be accessible to Indian startups?
Yes. The announcement confirms that the infrastructure will be accessible to startups, researchers, and government bodies, effectively democratizing access to high-end compute that is usually reserved for Silicon Valley giants.

Q: How does this relate to the NANDA model?
The NANDA 87B model was trained on Cerebras hardware. This new deployment serves as the permanent home for NANDA’s inference and future training runs, ensuring low-latency access for Indian users.
