OpenAI launches Flex processing for cheaper, slower AI tasks

OpenAI has introduced a new pricing option called Flex processing, designed to cut AI usage costs in half for jobs that don't require immediate response times.

The launch comes as part of OpenAI's strategic push to stay competitive with rivals like Google, which recently unveiled its budget-friendly Gemini 2.5 Flash model.

Flex processing is currently in beta and supports OpenAI's newer reasoning models, o3 and o4-mini. It's aimed at developers handling low-priority or non-production workloads, such as testing, data enrichment, or background batch jobs, where speed is less critical.

Flex reduces the API price for the o3 model to $5 per million input tokens (approximately 750,000 words) and $20 per million output tokens, compared to regular rates of $10 and $40. Prices for o4-mini have fallen to $0.55 per million input tokens and $2.20 per million output tokens from $1.10 and $4.40, respectively.
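At these rates, the savings are straightforward to verify. The sketch below uses the prices quoted above; the batch-job token counts are made-up numbers for illustration only.

```python
# Quick check of the savings at the Flex rates quoted above.
# Prices are USD per million tokens: (input, output).
STANDARD = {"o3": (10.00, 40.00), "o4-mini": (1.10, 4.40)}
FLEX = {"o3": (5.00, 20.00), "o4-mini": (0.55, 2.20)}

def job_cost(prices: dict, model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one job, given a price table in USD per 1M tokens."""
    inp, out = prices[model]
    return (input_tokens / 1_000_000) * inp + (output_tokens / 1_000_000) * out

# A hypothetical batch job: 2M input tokens, 500K output tokens, on o3.
standard = job_cost(STANDARD, "o3", 2_000_000, 500_000)  # 2*10 + 0.5*40 = 40.00
flex = job_cost(FLEX, "o3", 2_000_000, 500_000)          # 2*5  + 0.5*20 = 20.00
print(f"standard=${standard:.2f} flex=${flex:.2f} savings={1 - flex / standard:.0%}")
# → standard=$40.00 flex=$20.00 savings=50%
```

Since Flex halves both input and output prices, the savings are exactly 50% regardless of the input/output mix.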

These cost savings come with trade-offs. Flex processing has slower response times and may experience intermittent resource unavailability, making it unsuitable for real-time or mission-critical applications. For those prepared to compromise on performance, however, the savings could make large-scale AI experimentation far more feasible.
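In practice, opting into the cheaper tier is a per-request choice. The sketch below builds such a request as a plain dictionary rather than calling the live API; the `service_tier="flex"` parameter reflects OpenAI's beta documentation, but treat the exact field names as an assumption and check the current API reference before relying on them.

```python
# Sketch: constructing a low-priority request for Flex processing.
# Assumes OpenAI's beta `service_tier="flex"` option — verify against
# the current API reference, as beta details may change.

def build_flex_request(model: str, prompt: str) -> dict:
    """Return keyword arguments for a Flex-tier chat completion request."""
    return {
        "model": model,                  # Flex supports o3 and o4-mini
        "service_tier": "flex",          # opt in to the cheaper, slower tier
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_flex_request("o3", "Summarize this customer feedback: 'Great product!'")
# With the official SDK this would be sent roughly as:
#   from openai import OpenAI
#   client = OpenAI(timeout=900.0)  # Flex jobs can queue; raise the default timeout
#   resp = client.chat.completions.create(**req)
print(req["service_tier"])  # → flex
```

Raising the client timeout (and retrying on resource-unavailable errors) is the natural complement to the slower tier, since Flex requests may sit in a queue before being processed.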

The timing reflects broader industry trends: AI development is becoming more expensive just as demand grows for low-cost, efficient tools. Google's new Gemini 2.5 Flash model illustrates this shift, promising strong performance at a lower price.

Alongside the Flex launch, OpenAI announced a new ID verification requirement for developers in usage tiers 1-3, which are determined by spending level. To access the o3 model, these developers must first complete identity verification. According to OpenAI, the policy is designed to prevent misuse and promote responsible usage.

Flex processing offers a balanced path for developers and businesses exploring AI on a budget, combining access to powerful models with significant savings, as long as they are willing to wait a little longer for results.