gentic.news — AI News Intelligence Platform

Recipe

Mistral Small 4

Mistral Small 4, developed by Mistral AI, is a 119B-parameter Mixture of Experts model that unifies reasoning, multimodal, and agentic capabilities in a single efficient package.

Techniques inside: 6
Median research → prod: 3y
Fastest adoption: 3y
Slowest adoption: 9y

Ingredient list

  1. PagedAttention · Invented by UC Berkeley · 2023-09 · Velocity 3y

    Mistral recommends vLLM for serving, which uses PagedAttention.

    Category: inference · Evidence: high
  2. YaRN · Invented by Nous Research · 2023-08 · Velocity 3y

    Mistral Small 4 uses YaRN for its 128K context length.

    Category: architecture · Evidence: high
  3. Grouped-Query Attention · Invented by Google · 2023-05 · Velocity 3y

    Mistral models use Grouped-Query Attention (GQA).

    Category: architecture · Evidence: high
  4. FlashAttention · Invented by Stanford · 2022-05 · Velocity 4y

    Mistral's inference stack supports FlashAttention.

    Category: inference · Evidence: high
  5. Rotary Position Embeddings · Invented by Zhuiyi Technology · 2021-04 · Velocity 5y

    Mistral models use Rotary Position Embeddings (RoPE).

    Category: architecture · Evidence: high
  6. Mixture of Experts · Invented by Google · 2017-01 · Velocity 9y

    Mistral Small 4 is a 119B-parameter Mixture of Experts model.

    Category: architecture · Evidence: high
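The PagedAttention idea from ingredient 1 can be illustrated with a toy block table (an assumed layout for illustration, not vLLM's actual implementation): the KV cache is carved into fixed-size physical blocks, and each sequence maps logical token positions to physical blocks on demand, so no contiguous per-sequence allocation is needed.

```python
class PagedKVCache:
    """Sketch of PagedAttention bookkeeping: fixed-size blocks plus a
    per-sequence block table, so memory grows on demand and freed blocks
    are immediately reusable by other sequences."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free = list(range(num_blocks))   # pool of physical block ids
        self.tables = {}                      # seq_id -> [physical block ids]
        self.lengths = {}                     # seq_id -> tokens written

    def append(self, seq_id, n_tokens=1):
        table = self.tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0) + n_tokens
        # Allocate a new physical block only when a boundary is crossed.
        while len(table) * self.block_size < length:
            table.append(self.free.pop())
        self.lengths[seq_id] = length

    def slot(self, seq_id, pos):
        """Physical (block, offset) slot holding token `pos` of a sequence."""
        table = self.tables[seq_id]
        return table[pos // self.block_size], pos % self.block_size

    def release(self, seq_id):
        """Return a finished sequence's blocks to the free pool."""
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)
```

The point of the indirection is that a sequence's KV blocks need not be contiguous, which is what lets vLLM pack many requests into one cache with little fragmentation.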
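YaRN's context extension (ingredient 2) can be sketched as frequency interpolation on RoPE. This is a simplified rendering of the paper's "NTK-by-parts" scheme; the scale, context, and beta values below are illustrative defaults, not Mistral Small 4's configuration. Dimensions whose frequency completes many rotations within the original context are left alone, slow dimensions are fully stretched, and a linear ramp blends the two regimes.

```python
import numpy as np

def yarn_inv_freq(dim, base=10000.0, scale=4.0, orig_ctx=32768,
                  beta_fast=32.0, beta_slow=1.0):
    """Simplified YaRN-style interpolation of RoPE inverse frequencies.

    High-frequency dims (many rotations over the original context) keep
    their frequency; low-frequency dims are divided by `scale`; a linear
    ramp between beta_slow and beta_fast rotations blends the regimes.
    """
    half = dim // 2
    inv_freq = base ** (-np.arange(half) / half)
    # Full rotations each dimension completes over the original context.
    rotations = orig_ctx * inv_freq / (2 * np.pi)
    gamma = np.clip((rotations - beta_slow) / (beta_fast - beta_slow), 0.0, 1.0)
    return inv_freq * (gamma + (1.0 - gamma) / scale)
```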
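Grouped-Query Attention (ingredient 3) fits in a few lines of numpy (an illustration of the mechanism, not Mistral's kernel): several query heads share one key/value head, shrinking the KV cache by the group factor without collapsing to a single KV head as MQA does.

```python
import numpy as np

def gqa_attention(q, k, v, n_kv_heads):
    """Grouped-Query Attention: n_q_heads query heads share n_kv_heads
    key/value heads, cutting KV cache size by n_q_heads / n_kv_heads.

    q: (n_q_heads, seq, d)    k, v: (n_kv_heads, seq, d)
    """
    n_q_heads = q.shape[0]
    group = n_q_heads // n_kv_heads
    # Broadcast each KV head to the query heads in its group.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```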
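The online-softmax trick at the heart of FlashAttention (ingredient 4) can be shown numerically for a single query row (a sketch of the math, not the fused GPU kernel): scores are processed tile by tile, and a running max and denominator let earlier partial sums be rescaled rather than recomputed, so the full score row is never materialized.

```python
import numpy as np

def flash_attention_row(q, K, V, tile=2):
    """Attention output for one query vector, computed tile-by-tile with
    the online softmax recurrence FlashAttention uses.

    q: (d,)    K, V: (seq, d)
    """
    m = -np.inf                      # running max of scores seen so far
    l = 0.0                          # running softmax denominator
    acc = np.zeros(V.shape[-1])      # running weighted sum of values
    for start in range(0, K.shape[0], tile):
        s = q @ K[start:start + tile].T / np.sqrt(q.shape[0])
        m_new = max(m, s.max())
        scale = np.exp(m - m_new)    # rescale old accumulator to new max
        p = np.exp(s - m_new)
        l = l * scale + p.sum()
        acc = acc * scale + p @ V[start:start + tile]
        m = m_new
    return acc / l
```

The result matches ordinary softmax attention exactly; the gain in the real kernel is that tiles stay in fast on-chip memory.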
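RoPE (ingredient 5) is compact enough to write out in full (a numpy sketch of the RoFormer formulation, using the common half-split channel pairing): each channel pair is rotated by an angle proportional to the token's position, so dot products between rotated queries and keys depend only on relative offset.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply Rotary Position Embeddings to x of shape (seq_len, dim).

    Channel pairs (x[i], x[i + dim/2]) are rotated by position * freq,
    making attention scores a function of relative position only.
    """
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-np.arange(half) / half)       # per-pair frequencies
    angles = np.outer(np.arange(seq_len), inv_freq)    # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

Two properties worth checking: rotation preserves vector norms, and the query-key dot product is invariant under shifting both positions by the same offset.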
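Finally, the sparse Mixture-of-Experts pattern in ingredient 6 can be sketched as top-k routing (a toy illustration; the expert count, top-k value, and plain linear experts are arbitrary choices here, not Mistral Small 4's actual configuration): a router scores every expert for each token, but only the k best-scoring experts run, so active compute per token is a fraction of total parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def moe_layer(x, gate_w, experts, k=2):
    """Sparse MoE layer with top-k routing.

    x: (tokens, d); gate_w: (d, n_experts); experts: list of (d, d) matrices.
    Only the k highest-scoring experts are evaluated per token, with gate
    weights renormalized over that subset.
    """
    logits = x @ gate_w                            # router scores
    topk = np.argsort(logits, axis=-1)[:, -k:]     # top-k expert ids per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        w = softmax(logits[t, topk[t]])            # renormalized gate weights
        for w_e, e in zip(w, topk[t]):
            out[t] += w_e * (x[t] @ experts[e])
    return out
```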

This recipe is part of the gentic.news Deployment Atlas. Every ingredient has an origin paper + evidence. Methodology is public. Dataset is CC BY 4.0.