New Low-Cost Model Option And Unit Economics
Sources: 1 • Confidence: High • Updated: 2026-03-08 21:22
Key takeaways
- Google released Gemini 3.1 Flash-Lite as an update to its inexpensive Flash-Lite model family.
- Gemini 3.1 Flash-Lite supports four different thinking levels.
- Gemini 3.1 Flash-Lite pricing is stated as $0.25 per million input tokens and $1.50 per million output tokens.
- Gemini 3.1 Flash-Lite is described as one-eighth the price of Gemini 3.1 Pro.
- The four thinking levels shown for Gemini 3.1 Flash-Lite are minimal, low, medium, and high.
Sections
New Low-Cost Model Option And Unit Economics
- Google released Gemini 3.1 Flash-Lite as an update to its inexpensive Flash-Lite model family.
- Gemini 3.1 Flash-Lite pricing is stated as $0.25 per million input tokens and $1.50 per million output tokens.
- Gemini 3.1 Flash-Lite is described as one-eighth the price of Gemini 3.1 Pro.
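The stated prices can be turned into a rough per-workload estimate. The following is a minimal sketch, not an official calculator: the function name `estimate_cost_usd` and the constant names are hypothetical, and it assumes the two stated per-token rates are the only charges. Whether thinking tokens are billed as output, and whether other metered features apply, is not stated in the source.

```python
from decimal import Decimal

# Stated Gemini 3.1 Flash-Lite prices, in USD per million tokens.
INPUT_USD_PER_MILLION = Decimal("0.25")
OUTPUT_USD_PER_MILLION = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate cost from token counts (hypothetical helper).

    Assumption: only the two stated per-token rates apply. Any other
    billing dimension (tool use, caching, multimodal input) is ignored.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


# Example: 2M input tokens ($0.50) plus 500k output tokens ($0.75).
example_cost = estimate_cost_usd(2_000_000, 500_000)
print(f"${example_cost:.2f}")
```

At these rates an output token costs six times as much as an input token, so output-heavy workloads dominate the bill even when input volume is larger.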
Inference Controllability Via Thinking Levels
- Gemini 3.1 Flash-Lite supports four different thinking levels.
- The four thinking levels shown for Gemini 3.1 Flash-Lite are minimal, low, medium, and high.
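Application code that lets users pick a thinking level may want to validate the choice before it reaches any request. The following is a minimal sketch of such a check: `ThinkingLevel` and `parse_thinking_level` are hypothetical names, not part of any Google SDK, and the sketch deliberately does not show how the level is passed to the API, since the exact parameter is not given in the source.

```python
from enum import Enum


class ThinkingLevel(Enum):
    """The four level names shown for Gemini 3.1 Flash-Lite.

    Hypothetical application-side type; the real API parameter name
    and accepted values are not confirmed by the source.
    """

    MINIMAL = "minimal"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def parse_thinking_level(name: str) -> ThinkingLevel:
    """Normalize a user-supplied level name, rejecting unknown values."""
    try:
        return ThinkingLevel(name.strip().lower())
    except ValueError:
        allowed = ", ".join(level.value for level in ThinkingLevel)
        raise ValueError(
            f"unknown thinking level {name!r}; expected one of: {allowed}"
        ) from None


# Example: whitespace and case are normalized before lookup.
level = parse_thinking_level(" High ")
print(level.value)
```

Keeping the level as a closed enum means a typo fails early in application code rather than being sent to the model endpoint with unknown results.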
Unknowns
- How does Gemini 3.1 Flash-Lite perform on quality and task benchmarks relative to the prior Flash-Lite model and to Gemini 3.1 Pro?
- What are the latency and throughput characteristics (including any rate limits) for Gemini 3.1 Flash-Lite at each thinking level?
- How are the thinking levels invoked in practice (exact API parameters, defaults, and whether behavior at each level is stable across releases)?
- What additional billing dimensions apply (e.g., separate charges for tool use, caching, multimodal inputs, or other metered features), if any?
- What is the release timing context (exact date, regions, and availability across products) for Gemini 3.1 Flash-Lite?