OpenAI Introduces Flex Processing for Cost-Effective AI Tasks

Preface

As AI technology becomes more deeply embedded in business operations, the cost of using these services remains a central concern. Facing a competitive market landscape, particularly formidable rivals such as Google, OpenAI has responded by unveiling a new option called Flex processing. It offers a more economical choice for tasks that can tolerate slower processing times, cutting API costs in half for non-critical workloads.

TL;DR

OpenAI's Flex processing offers a budget-friendly solution for non-urgent AI tasks, reducing API costs by half. This strategic initiative is targeted at lowering expenses for tasks like model evaluations and asynchronous operations.

Main Body

OpenAI's latest innovation, Flex processing, signals a strategic shift in how AI services are priced and prioritized in business operations. Flex processing comes amidst a landscape where companies grapple with the rising costs associated with cutting-edge AI models. Key competitors, including Google, are already offering more cost-efficient models, prompting OpenAI to adopt a similar strategy.

Flex processing, now available in beta for OpenAI’s recently launched o3 and o4-mini models, targets lower-priority and non-production tasks. These include model evaluations, data enrichment, and asynchronous workloads, areas where immediate response times are not crucial. By accepting slower response times, businesses receive a 50% reduction in API costs.
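For developers, opting in is a matter of a single request parameter. The sketch below assumes OpenAI's documented `service_tier="flex"` option and the `openai` Python SDK; the model name, prompt, and timeout value are illustrative, not prescriptive:

```python
# Sketch: preparing a low-priority Flex request for a non-urgent task.
# Assumes the documented `service_tier="flex"` parameter; Flex calls may
# queue behind standard traffic, so a generous timeout is advisable.

def build_flex_request(model: str, prompt: str) -> dict:
    """Assemble request kwargs for a low-priority Flex call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "service_tier": "flex",  # opt into slower, half-price processing
        "timeout": 900.0,        # allow long waits while the job queues
    }

# Usage (requires the `openai` package and an API key):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     **build_flex_request("o3", "Evaluate these model outputs: ...")
# )
```

Separating request construction from the API call keeps batch pipelines easy to test without network access.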

For instance, the pricing for o3 under Flex processing is dramatically reduced. Input tokens are priced at $5 per million (approximately 750,000 words), while output tokens are $20 per million, compared to the standard rates of $10 per million for input and $40 per million for output tokens. Similarly, the o4-mini model sees its costs cut to $0.55 per million input tokens and $2.20 per million output tokens from the original $1.10 and $4.40 rates respectively.
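The savings are easy to quantify with the per-million-token rates quoted above. A small sketch (rates taken from the article, in USD; the example job sizes are hypothetical):

```python
# Compare standard vs. Flex pricing using the article's quoted rates (USD).

RATES = {  # tier -> (input $/M tokens, output $/M tokens)
    "o3-standard": (10.00, 40.00),
    "o3-flex": (5.00, 20.00),
    "o4-mini-standard": (1.10, 4.40),
    "o4-mini-flex": (0.55, 2.20),
}

def job_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a job at the given tier."""
    in_rate, out_rate = RATES[tier]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a batch job with 2M input and 0.5M output tokens on o3
standard = job_cost("o3-standard", 2_000_000, 500_000)  # $40.00
flex = job_cost("o3-flex", 2_000_000, 500_000)          # $20.00
```

At every job size, the Flex tiers come out to exactly half the corresponding standard tier, matching the advertised 50% reduction.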

This release also highlights OpenAI’s adaptation to market demands while preserving its competitive edge. Flex pricing underscores OpenAI’s push toward accessible AI, particularly as competitors intensify price competition with budget-friendly models such as Google's recently introduced Gemini 2.5 Flash.

Additionally, OpenAI has instituted a new ID verification requirement for accessing certain models, including o3, to promote secure usage and guard against misuse by bad actors. The requirement applies specifically to developers in usage tiers 1 through 3, tiers defined by how much a developer has spent on OpenAI services.

Key Insights Table

Aspect                    Description
Flex Processing Pricing   Offers a 50% cost reduction for non-critical AI tasks.
Application Scope         Targets lower-priority, asynchronous workloads.
ID Verification           Ensures policy compliance for model access.
Last edited at: 2025/4/17

Mr. W

ZNews full-time writer