
OpenAI Introduces Budget-Friendly 'Flex Processing' API for Lower-Priority AI Tasks

By Neelima N M
2025-04-21
OpenAI launches Flex Processing API, offering budget-friendly AI solutions for non-critical tasks with o3 and o4-mini models.

In an effort to make its AI services more affordable and stay competitive with tech giants like Google, OpenAI has launched Flex processing, a new API pricing tier that offers lower rates for its advanced models in exchange for slower response times and occasional service unavailability, as reported by TechCrunch.

Tailored for Non-Critical Use Cases

Currently in beta, Flex processing is available for OpenAI's latest reasoning models, o3 and o4-mini. The option targets lower-priority use cases such as background data enrichment, model testing, and asynchronous workflows that do not require real-time responses. The new tier is designed to help developers manage costs while experimenting with or scaling large language models.
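As a rough illustration (not taken from the article), here is a minimal sketch of how a developer might opt into the new tier with OpenAI's Python SDK, assuming the beta exposes it through the SDK's service_tier parameter and that the account already has access to o4-mini:

```python
# Minimal sketch: requesting Flex processing for a non-urgent task.
# Assumes the beta exposes the tier via the `service_tier` parameter
# and that your account has access to the o3 / o4-mini models.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o4-mini",
    service_tier="flex",   # opt in to the cheaper, slower tier (assumption for this sketch)
    timeout=900,           # Flex requests can take far longer than the default timeout
    messages=[
        {"role": "user", "content": "Summarize this support ticket: ..."}
    ],
)
print(response.choices[0].message.content)
```

Because Flex is aimed at background and asynchronous workflows, a generous per-request timeout like the one above is a reasonable default for jobs that can tolerate waiting.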

Price Cuts for o3 and o4-mini

According to TechCrunch, Flex processing significantly reduces the cost of using OpenAI’s models, cutting token prices by half. For the o3 model, input tokens now cost $5 per million (down from $10), and output tokens are $20 per million (down from $40). Similarly, for the o4-mini model, input tokens are reduced to $0.55 per million (from $1.10), and output tokens to $2.20 per million (from $4.40).
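To make the savings concrete, here is a back-of-the-envelope comparison using the per-million-token rates above; the 50M-input / 10M-output workload is a hypothetical example, not a figure from the article:

```python
# Back-of-the-envelope cost comparison using the published per-million-token rates.
# The 50M-input / 10M-output workload is a hypothetical batch job.
RATES = {  # USD per 1M tokens: (standard in, standard out, flex in, flex out)
    "o3":      (10.00, 40.00, 5.00, 20.00),
    "o4-mini": (1.10,  4.40,  0.55, 2.20),
}

input_tokens, output_tokens = 50_000_000, 10_000_000

for model, (std_in, std_out, flex_in, flex_out) in RATES.items():
    standard = (input_tokens / 1e6) * std_in + (output_tokens / 1e6) * std_out
    flex = (input_tokens / 1e6) * flex_in + (output_tokens / 1e6) * flex_out
    print(f"{model}: standard ${standard:,.2f} vs flex ${flex:,.2f}")
# o3:      standard $900.00 vs flex $450.00
# o4-mini: standard $99.00  vs flex $49.50
```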

This aggressive pricing strategy reflects OpenAI’s response to mounting pressure from competitors, particularly Google, which recently launched its budget-friendly Gemini 2.5 Flash model that performs competitively at a lower cost.

Verification Required for Higher-Tier Users

Alongside the Flex announcement, OpenAI introduced a new ID verification process for users in its top three usage tiers, which are determined by the amount spent on the platform. Access to features like o3 model usage, reasoning summaries, and streaming API support will now be contingent on completing this identity verification.

OpenAI has clarified that this move aims to improve platform security and prevent misuse of its powerful tools by unauthorized or malicious users.

Also read: OpenAI Launches o3 and o4-mini

The introduction of Flex pricing comes at a time when the operational costs of cutting-edge AI are rising. By providing a budget-conscious alternative for non-urgent tasks, OpenAI aims to broaden accessibility to its models while optimizing infrastructure use for high-priority applications.

Flex processing is available immediately in beta, and developers can start integrating it via the OpenAI API platform.
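Since Flex trades price for slower responses and occasional unavailability, one practical pattern is to retry with backoff and, if capacity still isn't available, fall back to the standard tier. The sketch below is illustrative only; the retry count, backoff, and fallback behavior are assumptions, not OpenAI guidance:

```python
# Illustrative fallback pattern for Flex's occasional unavailability.
# The retry count, backoff schedule, and fallback-to-standard behavior
# are assumptions for this sketch, not documented OpenAI guidance.
import time
from openai import OpenAI, APIStatusError

client = OpenAI()

def flex_with_fallback(messages, model="o4-mini", retries=3):
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model=model,
                service_tier="flex",
                timeout=900,
                messages=messages,
            )
        except APIStatusError as err:
            if err.status_code == 429:      # capacity unavailable; wait and retry
                time.sleep(2 ** attempt)
                continue
            raise
    # Give up on Flex and pay standard rates for this request
    return client.chat.completions.create(model=model, messages=messages)
```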

Related Topics

Large Language Models (LLMs), Generative AI Models, AI Model Scaling
