Oracle and AMD Partner to Deliver Cutting-Edge AI Infrastructure with New AMD GPUs

By Neelima N M
2025-06-13
Oracle and AMD team up to deploy Instinct MI355X GPUs on OCI, unlocking next-gen performance and scalability for AI training and inference at massive scale.

Oracle and AMD have announced a new collaboration to provide customers with access to AMD Instinct™ MI355X GPUs on Oracle Cloud Infrastructure (OCI), marking a significant upgrade for AI training and inference workloads.

The new offering promises a more than 2X improvement in price-performance over the previous generation, providing a scalable and cost-effective option for demanding AI applications.

OCI will feature zettascale AI clusters, accelerated by the latest AMD Instinct processors, with up to 131,072 MI355X GPUs. This will enable customers to efficiently build, train, and run inference for large-scale AI models, addressing the increasing demands of AI-driven innovations.

Superior Performance with AMD Instinct MI355X GPUs

AMD’s Instinct MI355X GPUs deliver up to 2.8X more throughput for AI workloads, with 288GB of HBM3E memory and 8TB/s of memory bandwidth for faster model training and inference.

FP4 support enables efficient large-model deployment, while a liquid-cooled design supports 64 GPUs per rack at 125kW. Paired with AMD Turin CPUs and up to 3TB of memory, the system ensures high-performance orchestration and data processing.
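As a rough illustration of why 4-bit formats matter for deployment (a back-of-the-envelope sketch, not a description of AMD's software stack), the memory needed just to hold a model's weights scales directly with bits per parameter; the 405B parameter count below is an assumed example:

```python
# Back-of-the-envelope: approximate memory needed to hold model weights alone,
# ignoring KV cache, activations, and runtime overhead. The parameter count is
# an illustrative assumption, not a measured figure.
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    return num_params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB (decimal)

for label, bits in [("FP16", 16), ("FP8", 8), ("FP4", 4)]:
    gb = weight_memory_gb(405e9, bits)  # e.g. a 405B-parameter model
    print(f"{label}: ~{gb:,.0f} GB of weights")

# FP16: ~810 GB, FP8: ~405 GB, FP4: ~202 GB -- at FP4, weights of this size
# fit within the 288GB of memory on a single MI355X-class GPU.
```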

Built for AI at Scale

The MI355X GPUs, paired with OCI's advanced networking infrastructure, provide a comprehensive solution for businesses deploying AI workloads at scale. Oracle’s infrastructure supports new agentic AI applications by offering faster time-to-first-token (TTFT) and high tokens-per-second throughput.
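For readers who want to benchmark these metrics themselves, TTFT and decode throughput can be measured against any streaming inference endpoint. A minimal, framework-agnostic sketch in Python; stream_tokens below is a hypothetical stand-in for a real streaming client:

```python
import time
from typing import Iterable, Iterator

def measure_stream(tokens: Iterable[str]) -> tuple[float, float]:
    """Return (TTFT in seconds, decode throughput in tokens/s) for a token stream."""
    start = time.perf_counter()
    first = None
    count = 0
    for _ in tokens:
        if first is None:
            first = time.perf_counter()  # timestamp of the first token
        count += 1
    end = time.perf_counter()
    ttft = (first if first is not None else end) - start
    decode_tps = (count - 1) / (end - first) if count > 1 else 0.0
    return ttft, decode_tps

def stream_tokens() -> Iterator[str]:
    """Hypothetical stand-in for a real streaming client; simulates model latency."""
    time.sleep(0.25)          # simulated prefill delay before the first token
    for _ in range(64):
        time.sleep(0.01)      # simulated per-token decode latency
        yield "tok"

ttft, tps = measure_stream(stream_tokens())
print(f"TTFT: {ttft * 1000:.0f} ms, decode throughput: {tps:.0f} tokens/s")
```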

AMD’s open-source ROCm software stack provides flexibility and portability, letting users migrate existing code without vendor lock-in. This approach gives customers greater choice in developing and deploying their AI solutions on OCI.
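In practice, this portability usually comes down to standard framework code running unchanged. A minimal sketch, assuming a ROCm build of PyTorch, where AMD GPUs are surfaced through the familiar torch.cuda interface; the model here is a trivial placeholder:

```python
import torch

# Device-agnostic PyTorch: on ROCm builds, AMD Instinct GPUs are exposed via
# the torch.cuda namespace, so the same script runs unchanged on AMD GPUs,
# NVIDIA GPUs, or CPU-only machines.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4096, 4096).to(device)   # stand-in for a real model
x = torch.randn(8, 4096, device=device)

with torch.no_grad():
    y = model(x)

name = torch.cuda.get_device_name(0) if device.type == "cuda" else "cpu"
print(f"ran on: {name}, output shape: {tuple(y.shape)}")
```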

Strategic Collaboration for the Future of AI

Forrest Norrod, EVP and General Manager of AMD’s Data Center Solutions Business Group, said, “AMD and Oracle have a shared history of providing customers with open solutions to accommodate high performance, efficiency, and greater system design flexibility.”

He added, “The latest generation of AMD Instinct GPUs and Pollara NICs on OCI will help support new use cases in inference, fine-tuning, and training, offering more choice to customers as AI adoption grows.”

In addition to the hardware advancements, the OCI supercluster architecture offers high-throughput, low-latency RDMA networks, enabling customers to scale their AI workloads effortlessly.


Innovative Networking with AMD Pollara AI NICs

As part of the partnership, Oracle will be the first to deploy AMD Pollara AI NICs, which provide advanced RoCE (RDMA over Converged Ethernet) capabilities. These NICs enable new network fabric designs that reduce latency and improve performance for large-scale AI applications. With support for open industry standards from the Ultra Ethernet Consortium (UEC), customers benefit from seamless integration and unified network solutions.
