Unit Economics Shift From Pricing And Cached Input
Sources: 1 • Confidence: High • Updated: 2026-04-13 03:50
Key takeaways
- The post estimates that describing 76,000 photos with GPT-5.4-nano would cost about $52.44, extrapolating from its per-photo cost example.
- OpenAI's self-reported benchmarks indicate GPT-5.4-nano can outperform the prior GPT-5 mini when run at maximum reasoning effort.
- OpenAI introduced GPT-5.4-mini and GPT-5.4-nano as additions to the GPT-5.4 model released two weeks earlier.
- The author released llm version 0.29 with support for the new GPT-5.4-mini and GPT-5.4-nano models.
- OpenAI priced GPT-5.4-nano at $0.20 per million input tokens, $0.02 per million cached input tokens, and $1.25 per million output tokens.
Sections
Unit Economics Shift From Pricing And Cached Input
- The post estimates that describing 76,000 photos with GPT-5.4-nano would cost about $52.44, extrapolating from its per-photo cost example.
- OpenAI priced GPT-5.4-nano at $0.20 per million input tokens, $0.02 per million cached input tokens, and $1.25 per million output tokens.
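The per-photo figure implied by these numbers can be checked with a short sketch. Only the per-million-token rates and the $52.44 / 76,000-photo totals come from the post; the cost function itself is a generic helper, not code from the source.

```python
# Stated GPT-5.4-nano pricing, USD per million tokens (from the post).
INPUT_RATE = 0.20         # regular input tokens
CACHED_INPUT_RATE = 0.02  # cached input tokens
OUTPUT_RATE = 1.25        # output tokens

def job_cost_usd(input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0) -> float:
    """Total cost in USD at the stated per-million-token rates."""
    return (
        input_tokens * INPUT_RATE
        + cached_input_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    ) / 1_000_000

# Per-photo cost implied by the post's 76,000-photo estimate:
per_photo = 52.44 / 76_000
print(f"${per_photo:.5f} per photo")  # $0.00069 per photo
```

Note that at these rates output tokens dominate: a caption of a few hundred output tokens costs far more than the image-prompt input it was generated from.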
Performance Tradeoffs Depend On Reasoning Effort Settings
- OpenAI's self-reported benchmarks indicate GPT-5.4-nano can outperform the prior GPT-5 mini when run at maximum reasoning effort.
- In a pelican-riding-a-bicycle SVG drawing comparison, the author preferred GPT-5.4's output at the 'xhigh' reasoning effort setting.
Product Line Expansion Into Lower Cost Tiers
- OpenAI introduced GPT-5.4-mini and GPT-5.4-nano as additions to the GPT-5.4 model released two weeks earlier.
Time To Tooling Support Is Short
- The author released llm version 0.29 with support for the new GPT-5.4-mini and GPT-5.4-nano models.
Unknowns
- Do independent evaluations reproduce the claimed performance relationship between GPT-5.4-nano (at maximum reasoning effort) and the prior GPT-5 mini across representative task suites?
- What latency, throughput, and reliability penalties (if any) accompany 'maximum' or 'xhigh' reasoning effort settings for these models?
- Under what exact conditions does cached input pricing apply, and what fraction of typical workloads can realistically benefit from it?
- What is the real token-usage distribution for large-scale photo description jobs (including prompt overhead and desired caption/detail length) and how does that translate to realized total cost at the stated rates?
- Are there any stated usage limits, quotas, or availability constraints for the new mini/nano variants that would affect high-volume workloads?
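The cached-input unknown above can at least be bounded from the stated rates alone: cached input is billed at $0.02 rather than $0.20 per million tokens, a 10x discount, so the realized saving depends entirely on what fraction of input tokens hit the cache. A minimal sketch, assuming a hypothetical workload of 1 billion input tokens (the token volume and hit rates are illustrative, not from the post):

```python
# Stated GPT-5.4-nano input pricing, USD per million tokens.
INPUT_RATE = 0.20
CACHED_INPUT_RATE = 0.02  # 10x cheaper than regular input

def input_cost_usd(total_tokens: int, cached_fraction: float) -> float:
    """Input-side cost in USD when `cached_fraction` of tokens hit the cache."""
    cached = total_tokens * cached_fraction
    fresh = total_tokens - cached
    return (fresh * INPUT_RATE + cached * CACHED_INPUT_RATE) / 1_000_000

# Hypothetical: 1B input tokens at varying cache hit rates.
for frac in (0.0, 0.5, 0.9):
    print(f"{frac:.0%} cached: ${input_cost_usd(1_000_000_000, frac):.2f}")
```

Even a 50% hit rate nearly halves the input bill, which is why the conditions under which caching applies matter so much for high-volume jobs like bulk photo description.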