AMD’s MI350P Brings Big AI Muscle to Standard PCIe Servers

AMD has added a new PCIe card to its MI350 family, and this one is aimed squarely at data centres that want more AI performance without tearing apart their existing server setup.

The new AMD Instinct MI350P is an enterprise AI accelerator built for standard PCIe slots. That matters because not every company wants, or can afford, to jump straight into fully custom AI server platforms. For cloud providers, local data centre operators, universities, banks, telcos, and AI startups in Malaysia and the wider Southeast Asia (SEA) region, a card like this could be the more practical upgrade path.

Instead of needing an exotic cooling setup, the MI350P is designed for existing air-cooled rack servers. It is a 10.5-inch dual-slot card with a fanless design, so it relies on the server chassis for airflow. AMD rates it at around 600W, though it can also be configured down to 450W for servers with tighter power or thermal limits.

Spec-wise, the MI350P is basically the smaller sibling of AMD’s higher-end MI350X and MI355X. It uses AMD’s CDNA4 architecture and is built using TSMC’s 3nm and 6nm FinFET processes. The card packs 8,192 stream processors (cores) across 128 compute units, along with 512 Matrix Cores, a max clock of 2.2GHz, and 128MB of last-level cache.

The biggest headline for AI workloads is memory. AMD gives the MI350P 144GB of HBM3E with 4TB/s of bandwidth. That is exactly the kind of spec that matters for large language models, retrieval-augmented generation, and inference workloads where memory capacity and bandwidth can be just as important as raw compute.
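To put the memory figures in context, here is a rough back-of-envelope sketch in Python. It only uses the 144GB and 4TB/s numbers quoted above; the 70-billion-parameter model and the 16-bit weights are hypothetical inputs chosen for illustration, and the estimate ignores KV cache, activations, and runtime overhead, so treat it as a ceiling rather than a benchmark.

```python
# Back-of-envelope sketch, not a benchmark.
# Capacity and bandwidth come from the article; the model size and
# precision used in the demo are hypothetical assumptions.

HBM_CAPACITY_GB = 144.0       # per card, from the article
HBM_BANDWIDTH_GB_S = 4000.0   # 4 TB/s from the article, in decimal GB/s


def weight_footprint_gb(params_billion: float, bytes_per_param: float) -> float:
    """Size of the weights alone (no KV cache, activations, or overhead)."""
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    return params_billion * bytes_per_param


def max_decode_tokens_per_s(footprint_gb: float) -> float:
    """Bandwidth-bound ceiling for batch-1 decoding.

    Assumes every generated token has to read all weights from HBM once.
    """
    return HBM_BANDWIDTH_GB_S / footprint_gb


if __name__ == "__main__":
    # Hypothetical 70B-parameter model stored at 2 bytes per parameter.
    size = weight_footprint_gb(70, 2.0)
    print(f"weights: {size:.0f} GB, fits on one card: {size <= HBM_CAPACITY_GB}")
    print(f"bandwidth-bound ceiling: {max_decode_tokens_per_s(size):.1f} tokens/s")
```

Under those assumptions the weights alone nearly fill a single card, which is why both capacity and bandwidth show up as limits long before raw compute does.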

AMD says up to eight MI350P cards can be used together in a single system, letting data centres scale depending on workload size. The card also supports lower-precision formats like MXFP6 and MXFP4, which are useful for speeding up LLM-related tasks. AMD claims the MI350P can hit an estimated 2,299 TFLOPS and up to 4,600 peak TFLOPS with MXFP4.
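As a rough illustration of why lower-precision formats and multi-card scaling matter, the sketch below estimates how many 144GB cards a model's weights would need at different precisions. The per-card memory and the eight-card limit come from the article. The 405-billion-parameter model is a hypothetical example, and the MX bit widths assume the OCP Microscaling layout of 32-element blocks sharing one 8-bit scale, which adds 0.25 bits per weight. It counts weights only and says nothing about how well a workload actually splits across cards.

```python
import math

CARD_MEMORY_GB = 144   # per card, from the article
MAX_CARDS = 8          # per system, from the article

# Effective bits per weight. MX values assume 32-element blocks that share
# one 8-bit scale (8 / 32 = 0.25 extra bits per element).
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "FP8": 8.0,
    "MXFP6": 6.25,
    "MXFP4": 4.25,
}


def weights_gb(params_billion: float, fmt: str) -> float:
    """Weights-only footprint in decimal GB for the given format."""
    return params_billion * BITS_PER_WEIGHT[fmt] / 8


def cards_needed(params_billion: float, fmt: str) -> int:
    """Minimum number of cards whose combined memory holds the weights."""
    return math.ceil(weights_gb(params_billion, fmt) / CARD_MEMORY_GB)


if __name__ == "__main__":
    params = 405  # hypothetical model size, in billions of parameters
    print(f"system memory with {MAX_CARDS} cards: {MAX_CARDS * CARD_MEMORY_GB} GB")
    for fmt in BITS_PER_WEIGHT:
        n = cards_needed(params, fmt)
        status = "fits" if n <= MAX_CARDS else "exceeds one system"
        print(f"{fmt:>5}: {weights_gb(params, fmt):7.1f} GB -> {n} card(s), {status}")
```

In practice the KV cache and activations need headroom too, so real deployments would use more cards than this lower bound suggests.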

The obvious rival here is Nvidia’s H200 NVL, currently one of the strongest PCIe AI accelerator options. According to the figures shared, AMD’s MI350P has stronger theoretical compute in several areas: around 20% better FP64, 43% better FP16, and 39% better FP8 compared with Nvidia’s card.

That is a big flex from AMD, especially because Nvidia has not announced a PCIe version of its latest HBM-equipped B200 Blackwell GPU. So for now, AMD gets to claim a very sharp position: a newer AI accelerator that fits into a traditional PCIe form factor.

For Malaysian readers, this is not the kind of GPU you buy for your gaming PC or home AI hobby build. This is enterprise hardware: think serious server racks, not a Shopee cart. But it still matters locally because AI infrastructure is becoming a bigger deal across SEA. More regional AI capacity can mean better local cloud services, faster enterprise AI deployments, and potentially more options beyond Nvidia-only stacks.

The big question is not just performance, though. It is software. Nvidia still has a massive advantage thanks to CUDA, which many developers and companies already rely on. AMD has been improving its ROCm software ecosystem, but adoption will depend on whether customers feel confident moving workloads over.

Still, the MI350P looks like a smart move. It gives AMD a serious PCIe AI card for companies that want more compute without moving to a fully custom platform. If ROCm keeps improving, this could become a very real option for SEA data centres looking to diversify their AI hardware.

Source: Tom's Hardware
