Low-Cost Model Release And Unit Economics
Sources: 1 • Confidence: High • Updated: 2026-04-12 10:21
Key takeaways
- Google released Gemini 3.1 Flash-Lite as an update to its inexpensive Flash-Lite model family.
- Gemini 3.1 Flash-Lite supports four thinking levels.
- Gemini 3.1 Flash-Lite pricing is $0.25 per million input tokens and $1.50 per million output tokens.
- Gemini 3.1 Flash-Lite is described as one-eighth the price of Gemini 3.1 Pro.
- The four thinking levels shown for Gemini 3.1 Flash-Lite are minimal, low, medium, and high.
Sections
Low-Cost Model Release And Unit Economics
- Google released Gemini 3.1 Flash-Lite as an update to its inexpensive Flash-Lite model family.
- Gemini 3.1 Flash-Lite pricing is $0.25 per million input tokens and $1.50 per million output tokens.
- Gemini 3.1 Flash-Lite is described as one-eighth the price of Gemini 3.1 Pro.
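To make the unit economics concrete, the per-request cost at the stated rates can be sketched as below. This is a minimal illustration, not an official calculator: the function and constant names are hypothetical, and it assumes billing is strictly linear in token count with no caching discounts, tiering, or separate charges. The implied Pro figures are derived only from the "one-eighth" description, under the unverified assumption that the ratio applies equally to input and output tokens. They are not published prices.

```python
from decimal import Decimal

# Rates stated in this note, in USD per one million tokens.
INPUT_USD_PER_MILLION = Decimal("0.25")
OUTPUT_USD_PER_MILLION = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)

# ASSUMPTION: "one-eighth the price" applies uniformly to input and output.
# The comparison basis is not confirmed by the source.
ASSUMED_PRO_MULTIPLIER = Decimal(8)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, assuming linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


def implied_pro_rates_usd_per_million() -> tuple[Decimal, Decimal]:
    """Return (input, output) rates implied by the one-eighth claim (hypothetical)."""
    return (
        INPUT_USD_PER_MILLION * ASSUMED_PRO_MULTIPLIER,
        OUTPUT_USD_PER_MILLION * ASSUMED_PRO_MULTIPLIER,
    )
```

For example, a request with 10,000 input tokens and 2,000 output tokens comes to $0.0025 + $0.0030 = $0.0055 at these rates. Under the uniform-ratio assumption, the implied Pro rates would be $2.00 and $12.00 per million tokens, which should be checked against published pricing before use.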
Inference-Time Controllability Via Thinking Levels
- Gemini 3.1 Flash-Lite supports four thinking levels.
- The four thinking levels shown for Gemini 3.1 Flash-Lite are minimal, low, medium, and high.
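Application code that exposes this setting may want to validate the level before sending a request. The sketch below models the four labels as a local enum. The names `ThinkingLevel` and `parse_thinking_level` are hypothetical and are not part of any Gemini SDK. The actual API parameter name and accepted values are not confirmed by the source, so this only checks the label on the caller's side.

```python
from enum import Enum


class ThinkingLevel(str, Enum):
    """The four level labels listed in this note (hypothetical local type)."""

    MINIMAL = "minimal"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def parse_thinking_level(value: str) -> ThinkingLevel:
    """Normalize user input and map it to a known level, or raise ValueError."""
    normalized = value.strip().lower()
    try:
        return ThinkingLevel(normalized)
    except ValueError:
        allowed = ", ".join(level.value for level in ThinkingLevel)
        raise ValueError(
            f"unknown thinking level {value!r}; expected one of: {allowed}"
        ) from None
```

Keeping the labels in one place means that if they change in a later release, only the enum needs updating.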
Unknowns
- What are the measured quality and reliability differences between Gemini 3.1 Flash-Lite and prior Flash-Lite versions for representative tasks?
- What is the exact, published Gemini 3.1 Pro pricing that substantiates the one-eighth price relationship, and what comparison basis is used?
- How do the four thinking levels affect latency, token usage, and output quality in practice, and what are the recommended/default settings?
- What are the API details for thinking levels (parameter name, accepted values, backward compatibility), and are the labels stable over time?
- What deployment constraints apply to Gemini 3.1 Flash-Lite (regional availability, quotas, rate limits, context window limits), if any?