Google's AI Hypercomputer Is Built for the Next Wave of AI Agents

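Google is betting heavily on what it calls the agentic AI era, and this time the pitch goes beyond smarter chatbots. At Cloud Next 26, Google announced AI Hypercomputer, a new Google Cloud infrastructure platform that aims to combine custom TPUs, Axion CPUs, NVIDIA GPUs, networking, storage, and machine learning software into a single, very large AI system.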

For readers in Malaysia and SEA, backend tech like this can sound remote at first, but it quietly shapes the apps, games, creator tools, and AI services we use every day. Faster AI training and inference usually mean faster feature rollouts and lower scaling costs on cloud platforms, and they let developers build more ambitious AI tools for games, esports analytics, virtual assistants, or content pipelines.

What Google Actually Announced

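AI Hypercomputer is Google's new pitch for large-scale AI infrastructure, one that goes beyond the traditional "supercomputer" concept. Rather than relying on a single type of chip, Google mixes several compute options into one platform: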

8th-generation Google TPUs
Google Axion Arm-based Cloud CPUs
NVIDIA Vera Rubin NVL72 GPUs
AI-focused networking
High-speed storage
Open software and ML frameworks

The headline hardware is Google's new TPUv8 family, which comes as two chips: TPU 8t for training and TPU 8i for inference.

TPU 8t Is Designed for Training Very Large Models

TPU 8t is built for training large frontier AI models. Google says a single TPU 8t superpod scales to 9,600 chips with 2 PB of shared high-bandwidth memory. The company claims up to 121 exaflops of FP4 compute per pod, roughly 2.84x higher than the previous-generation Ironwood.
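
As a rough sanity check on those pod-level figures, the quoted totals can be divided across the 9,600 chips. The sketch below is simple arithmetic on the numbers above. It assumes decimal units (1 PB = 10^15 bytes, 1 exaflop = 10^18 operations per second) and an even split across chips, neither of which Google has confirmed, and the variable names are illustrative only.

```python
# Back-of-envelope split of the quoted TPU 8t superpod figures.
# Assumptions (not from Google): decimal units and an even split per chip.
CHIPS_PER_POD = 9_600
POD_FP4_EXAFLOPS = 121
POD_MEMORY_PB = 2

per_chip_petaflops = POD_FP4_EXAFLOPS * 1_000 / CHIPS_PER_POD   # exa -> peta
per_chip_memory_gb = POD_MEMORY_PB * 1_000_000 / CHIPS_PER_POD  # PB -> GB

print(f"FP4 compute per chip: ~{per_chip_petaflops:.1f} petaflops")
print(f"Shared memory per chip: ~{per_chip_memory_gb:.0f} GB")
```

Under those assumptions, this works out to about 12.6 petaflops of FP4 compute and about 208 GB of memory per chip.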

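Google is also pushing several other improvements here, including double the interchip bandwidth of the previous generation, 10x faster storage access, and TPUDirect, which sends data directly into the TPUs. The platform also uses Google's Virgo Network, JAX, and Pathways software; Google says this supports scaling to as many as one million chips in a single logical cluster with near-linear scaling.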

TPU 8t also introduces native FP4 support, which eases memory bandwidth pressure by using fewer bits per parameter while trying to keep large-model accuracy at a usable level.
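
To make the bits-per-parameter point concrete, here is a minimal sketch that estimates how much memory a model's weights occupy at different precisions. The 1-trillion-parameter model is a hypothetical chosen for round numbers, and the estimate ignores scaling factors, activations, and KV caches that a real deployment would also store.

```python
def weight_bytes(num_params: int, bits_per_param: int) -> float:
    """Raw storage for model weights, ignoring any per-block scale metadata."""
    return num_params * bits_per_param / 8

NUM_PARAMS = 1_000_000_000_000  # hypothetical 1T-parameter model

for name, bits in [("FP16", 16), ("FP8", 8), ("FP4", 4)]:
    terabytes = weight_bytes(NUM_PARAMS, bits) / 1e12
    print(f"{name}: {terabytes:.2f} TB of weights")
```

Each halving of the bit width halves the bytes that have to move from memory for every pass over the weights, which is where the bandwidth relief comes from.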

TPU 8i Focuses on Inference

TPU 8i targets inference, the stage where trained AI models actually respond to users. This matters because once millions of people start using an AI product, inference is where real-world costs can balloon.

Google says TPU 8i comes with 288 GB of HBM and 384 MB of on-chip SRAM, a 3x capacity increase over the previous generation. The chip delivers 331.8 exaflops of FP8 compute per pod, which Google says is 6.74x higher than Ironwood.

For modern Mixture of Experts models, Google has doubled ICI bandwidth to 19.2 Tb/s. The new Boardfly architecture cuts the maximum network diameter by more than 50%, and the Collectives Acceleration Engine can reduce on-chip latency by up to 5x.

NVIDIA Rubin Is Part of the Plan Too

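Google is not relying solely on its own silicon. The company says NVIDIA GPUs remain a core part of its AI accelerator lineup, and Google Cloud will be among the first platforms to offer NVIDIA Vera Rubin NVL72 systems. These systems will sit alongside existing Hopper- and Blackwell-based instances.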

The combination is interesting because cloud customers are not forced down a single hardware path. For studios, AI startups, enterprise teams, and research groups in SEA, that flexibility matters. Some workloads may suit Google TPUs better, while others may still prefer NVIDIA's CUDA-heavy ecosystem.

Why This Matters for SEA

No, your gaming PC in Malaysia will not suddenly have a TPU 8t inside it. But cloud AI infrastructure shapes the tools around gaming and entertainment: smarter game NPCs, faster localisation, automated video editing, AI moderation, esports data analysis, AI-generated assets, and enterprise copilots.

Google also says customers using this infrastructure include the US DOE, Boston Dynamics, Citadel Securities, Thinking Machines Lab, and Axia Energy. That suggests it is aimed at serious, large-scale AI work rather than consumer gimmicks.

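The biggest takeaway: Google wants its cloud to be one of the main homes for agentic AI, using its own TPUs and CPUs while also bringing in NVIDIA Rubin for customers who need that ecosystem. For Malaysia and SEA, the real impact will most likely arrive through the apps and services built on top of it, not through the hardware itself.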

Source: Wccftech Gaming