F5 and NVIDIA Boost AI with Smarter LLM Routing & Security

By Megha Pathak
2025-06-17
F5 and NVIDIA partner to advance AI infrastructure with optimized LLM routing, GPU efficiency, and secure, scalable deployments powered by BlueField-3 DPUs. Image Credit: X | @NVIDIAEU

F5 has teamed up with NVIDIA to launch enhanced capabilities for F5 BIG-IP Next for Kubernetes, powered by NVIDIA BlueField-3 DPUs and the NVIDIA DOCA software framework and aimed at optimizing large-scale AI infrastructure. The collaboration brings improvements in performance, security, and efficiency, particularly for AI applications that require scalable, secure delivery and traffic management. The new solution has been validated by Sesterce, a European operator specializing in next-generation infrastructure and sovereign AI.

Optimized LLM Routing and GPU Utilization

BIG-IP Next for Kubernetes, running on NVIDIA BlueField-3 DPUs, provides advanced traffic management and security for large-scale AI workloads. The enhancements include a 20% improvement in GPU utilization and support for multi-tenancy, making the platform well suited to AI-driven applications such as large language models (LLMs). Integration with NVIDIA Dynamo and the KV Cache Manager reduces latency and optimizes GPU and memory resources, enabling faster LLM inference and better performance at scale.
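To make the latency benefit of KV cache reuse concrete, here is a minimal Python sketch of the underlying idea: attention state computed for a prompt prefix is cached and reused, so only the new suffix needs fresh GPU work. All names here (KVCacheStore, compute_kv) are hypothetical placeholders for illustration, not APIs from NVIDIA Dynamo, the KV Cache Manager, or F5.

```python
# Illustrative sketch of KV-cache reuse for LLM inference (hypothetical names).
from dataclasses import dataclass, field


def compute_kv(suffix: str, initial_state: str | None = None) -> str:
    # Stand-in for the real GPU attention prefill; here we just track processed text.
    return (initial_state or "") + suffix


@dataclass
class KVCacheStore:
    """Maps prompt prefixes to previously computed attention key/value state."""
    entries: dict = field(default_factory=dict)

    def lookup(self, prompt: str):
        # Return the longest cached prefix of this prompt, if any.
        best = ""
        for prefix in self.entries:
            if prompt.startswith(prefix) and len(prefix) > len(best):
                best = prefix
        return (best, self.entries[best]) if best else ("", None)

    def store(self, prompt: str, kv_state: str) -> None:
        self.entries[prompt] = kv_state


def prefill(prompt: str, cache: KVCacheStore) -> str:
    """Only the un-cached suffix of the prompt needs fresh computation."""
    prefix, kv_state = cache.lookup(prompt)
    suffix = prompt[len(prefix):]
    kv_state = compute_kv(suffix, initial_state=kv_state)
    cache.store(prompt, kv_state)
    return kv_state
```

In this toy version, repeated prompts that share a long common prefix (a system prompt or conversation history, for example) skip most of the prefill work, which is the same effect the KV Cache Manager targets on real GPU memory.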


Smart Security and Load Balancing for AI Models

F5's dynamic LLM routing intelligently directs simple AI tasks to lighter models, reserving advanced models for complex queries. This dynamic load balancing improves query response times, reduces latency, and enhances the quality of AI outputs. The combined F5 and NVIDIA solution also secures Model Context Protocol (MCP) servers by adding reverse proxy protection, enabling secure AI deployments with robust defenses against cybersecurity risks.
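A short, hedged sketch of what "routing simple tasks to lighter models" can look like in practice follows. The model names, the complexity heuristic, and the thresholds are invented for illustration; F5's actual routing logic is not public in this announcement.

```python
# Hypothetical sketch of dynamic LLM routing: cheap heuristics classify each
# query and pick a model tier. Names and thresholds are illustrative only.
LIGHT_MODEL = "small-8b-instruct"    # fast and cheap; fine for simple lookups
HEAVY_MODEL = "large-70b-instruct"   # slower and costlier; reserved for hard queries


def classify_complexity(query: str) -> str:
    """Very rough proxy for query complexity."""
    hard_markers = ("explain", "compare", "derive", "multi-step", "analyze")
    if len(query.split()) > 60 or any(m in query.lower() for m in hard_markers):
        return "complex"
    return "simple"


def route(query: str) -> str:
    """Send simple traffic to the lighter model, complex traffic to the larger one."""
    return LIGHT_MODEL if classify_complexity(query) == "simple" else HEAVY_MODEL


if __name__ == "__main__":
    print(route("What is F5 BIG-IP?"))                      # -> small-8b-instruct
    print(route("Compare and analyze DPU vs GPU offload"))  # -> large-70b-instruct
```

The payoff of this pattern is that the expensive model only sees the fraction of traffic that actually needs it, which is where the claimed gains in response time and GPU efficiency come from.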

Empowering Distributed AI Inference

With the NVIDIA Dynamo framework, F5 ensures that distributed AI inference runs seamlessly, using BlueField DPUs for efficient memory management at lower cost than keeping all inference state in GPU memory. This strategic collaboration between F5 and NVIDIA delivers faster, more reliable, and more secure AI infrastructure, paving the way for efficient and scalable AI-driven services across industries worldwide.
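To illustrate the memory-cost angle, here is a minimal tiered-cache sketch: hot KV cache entries stay in scarce GPU memory while colder ones spill to larger, cheaper memory reachable via the DPU. The class, capacities, and eviction policy are assumptions for illustration, not Dynamo's or F5's actual implementation.

```python
# Hypothetical sketch of tiered KV-cache placement with LRU spill (illustrative only).
from collections import OrderedDict


class TieredKVCache:
    """Keep recent entries in GPU memory; spill least-recently-used ones to host memory."""

    def __init__(self, gpu_capacity: int = 4):
        self.gpu_capacity = gpu_capacity
        self.gpu = OrderedDict()   # fast but scarce GPU HBM
        self.host = {}             # larger, cheaper memory reachable via the DPU

    def put(self, session_id: str, kv_state) -> None:
        self.gpu[session_id] = kv_state
        self.gpu.move_to_end(session_id)
        while len(self.gpu) > self.gpu_capacity:
            evicted_id, evicted_state = self.gpu.popitem(last=False)
            self.host[evicted_id] = evicted_state   # spill the coldest entry

    def get(self, session_id: str):
        if session_id in self.gpu:
            self.gpu.move_to_end(session_id)
            return self.gpu[session_id]
        if session_id in self.host:
            # Promote back to GPU memory on reuse.
            state = self.host.pop(session_id)
            self.put(session_id, state)
            return state
        return None
```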
