Capability Consolidation and Request-Level Control
Sources: 1 • Confidence: High • Updated: 2026-03-17 15:15
Key takeaways
- Mistral states that Mistral Small 4 unifies reasoning, multimodal, and agentic coding capabilities previously associated with Magistral, Pixtral, and Devstral into one model.
- The author tested the model via the Mistral API using the llm-mistral plugin and the model identifier "mistral/mistral-small-2603".
- Mistral Small 4 is described as a 119B-parameter Mixture-of-Experts model with 6B active parameters.
- Mistral announced Leanstral, an open-weight model tuned specifically to produce Lean 4 formally verifiable code.
- The Mistral Small 4 model weights are 242GB on Hugging Face.
Sections
Capability Consolidation and Request-Level Control
- Mistral states that Mistral Small 4 unifies reasoning, multimodal, and agentic coding capabilities previously associated with Magistral, Pixtral, and Devstral into one model.
- Mistral Small 4 supports a reasoning_effort setting with values "none" or "high".
- Mistral claims that reasoning_effort="high" yields verbosity equivalent to previous Magistral models.
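Since the article notes that reasoning_effort is not yet documented for the Mistral API, the sketch below is hypothetical: the field name and values ("none"/"high") come from the article, but the payload placement and top-level position of the parameter are assumptions.

```python
import json

def build_chat_request(prompt: str, reasoning_effort: str = "none") -> dict:
    """Build a hypothetical chat-completions payload with the per-request
    effort knob. The reasoning_effort field's name and values come from the
    article; its placement in the request body is an assumption."""
    if reasoning_effort not in ("none", "high"):
        raise ValueError("reasoning_effort must be 'none' or 'high'")
    return {
        "model": "mistral-small-2603",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,  # assumed top-level field
    }

payload = build_chat_request("Explain MoE routing in one paragraph.",
                             reasoning_effort="high")
print(json.dumps(payload, indent=2))
```

Validating the value client-side keeps a typo from silently falling back to whatever default the server applies.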
Integration Path and API Surface Gaps
- The author tested the model via the Mistral API using the llm-mistral plugin and the model identifier "mistral/mistral-small-2603".
- At the time of writing, the author could not find documentation for setting reasoning effort in the Mistral API.
- The author expects that the ability to set reasoning effort may be added soon to the Mistral API.
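The integration path described above can be sketched as a short CLI session. The plugin name and model identifier come from the article; the install/key/prompt steps follow llm's standard plugin pattern and are assumptions, not a verified transcript.

```shell
# Assumed llm-mistral workflow; requires the llm CLI and a Mistral API key.
MODEL="mistral/mistral-small-2603"

llm install llm-mistral   # one-time plugin install
llm keys set mistral      # store a Mistral API key
llm -m "$MODEL" 'Describe this image in one sentence' -a photo.jpg
```

The `-a` attachment flag exercises the multimodal side of the consolidated model; a plain text prompt works without it.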
Architecture and Deployability Constraints
- Mistral Small 4 is described as a 119B-parameter Mixture-of-Experts model with 6B active parameters.
- The Mistral Small 4 model weights are 242GB on Hugging Face.
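Back-of-envelope arithmetic connects the two figures above. The 119B/6B parameter counts and the 242GB artifact come from the article; the bf16 (2 bytes per parameter) assumption is mine.

```python
# Rough serving math for a 119B-total / 6B-active MoE, assuming bf16 weights.
TOTAL_PARAMS = 119e9
ACTIVE_PARAMS = 6e9
BYTES_PER_PARAM = 2  # bf16 assumption

total_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9   # memory to hold all experts
active_gb = ACTIVE_PARAMS * BYTES_PER_PARAM / 1e9  # weights read per token

print(f"All weights resident: ~{total_gb:.0f} GB (reported artifact: 242 GB)")
print(f"Weights touched per token: ~{active_gb:.0f} GB")
```

The ~238 GB estimate lands close to the reported 242GB download, which supports the bf16 reading; the small gap would be explained by embeddings, metadata, or non-expert tensors. The MoE trade-off is visible in the two numbers: full-model memory, but per-token compute proportional to only ~12 GB of weights.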
Specialization Toward Formal Verification
- Mistral announced Leanstral, an open-weight model tuned specifically to produce Lean 4 formally verifiable code.
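For readers unfamiliar with the target output format: "Lean 4 formally verifiable code" means definitions paired with machine-checked proofs. The snippet below is a hand-written illustration of that style, not Leanstral output.

```lean
-- A definition plus a theorem the Lean 4 checker verifies mechanically;
-- a proof that fails to typecheck is rejected outright.
def double (n : Nat) : Nat := n + n

theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

The appeal of targeting this format is that correctness is decided by the proof checker, not by eyeballing model output.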
Watchlist
- At the time of writing, the author could not find documentation for setting reasoning effort in the Mistral API.
- The author expects that the ability to set reasoning effort may be added soon to the Mistral API.
Unknowns
- Do independent benchmarks confirm the claimed consolidation of reasoning, multimodal, and agentic coding capabilities into a single model at the level implied?
- Is the reasoning_effort parameter actually supported in the public Mistral API for the cited model identifier, and if so, what are the measurable impacts on latency, token usage, and output quality?
- What concrete serving requirements (GPU memory, recommended tensor-parallel/sharding setup, throughput) are implied by the stated architecture and the reported 242GB weights artifact?
- Are there smaller or alternative weight formats (e.g., sharded downloads or quantized variants) available for the reported Hugging Face release, and what constraints apply to their use?
- What are the licensing and usage constraints for the releases described (including the “open-weight” Leanstral model), and do they differ materially across the two announcements?