gentic.news — AI News Intelligence Platform

Recipe · DeepSeek-V3

DeepSeek-V3, developed by DeepSeek, is a mixture-of-experts language model with 671B total parameters, of which 37B are activated per token. Its developers report training it at a fraction of the cost of comparable systems while maintaining strong performance.

Techniques inside: 5
Median research → prod: 4y
Fastest adoption: 3y
Slowest adoption: 9y

Ingredient list

  1. Invented by Nous Research · 2023-08 · Velocity 3y

    DeepSeek-V3 uses YaRN for extended context length.

    architecture · high
  2. Invented by Google · 2023-05 · Velocity 3y

    DeepSeek-V3 uses Grouped-Query Attention (GQA).

    architecture · high
  3. Invented by Stanford · 2022-05 · Velocity 4y

    DeepSeek-V3 uses FlashAttention-2 for efficient training.

    inference · high
  4. Invented by Zhuiyi Technology · 2021-04 · Velocity 5y

    DeepSeek-V3 uses Rotary Position Embedding (RoPE).

    architecture · high
  5. Invented by Google · 2017-01 · Velocity 9y

    DeepSeek-V3 uses a sparse mixture-of-experts (MoE) architecture.

    architecture · high
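
To make the last ingredient concrete, here is a minimal sketch of generic top-k mixture-of-experts routing for a single token, in plain Python. It illustrates the general idea only (score every expert, run just the k best, mix their outputs) and is not DeepSeek-V3's actual router, whose gating and load-balancing details differ. The names `moe_forward`, `gate_weights`, and `make_scale` are hypothetical, and the "experts" are toy functions rather than neural network layers.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_weights, experts, k=2):
    """Route one token vector x to its top-k experts and mix their outputs."""
    # Router logits: one score per expert (dot product of x with a gate row).
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Keep the k highest-scoring experts; the others are never evaluated,
    # which is where the compute saving of a sparse MoE comes from.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalize the selected scores so the mixing weights sum to 1.
    weights = softmax([logits[i] for i in top])
    out = [0.0] * len(x)
    for weight, idx in zip(weights, top):
        expert_out = experts[idx](x)
        out = [o + weight * y for o, y in zip(out, expert_out)]
    return out, top

def make_scale(factor):
    """Toy expert (illustrative only): scales its input by a constant."""
    return lambda v: [factor * vi for vi in v]

# Four toy experts and a hand-picked gate matrix (one row per expert).
experts = [make_scale(f) for f in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [2.0, 0.0]]

out, chosen = moe_forward([1.0, 0.0], gate_weights, experts, k=2)
print(chosen)  # [2, 3]: experts 2 and 3 have the highest router scores
```

Only two of the four experts run for this token, so per-token compute scales with k rather than with the total number of experts. That is the property that lets a model hold many parameters while activating only a fraction of them per token.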

This recipe is part of the gentic.news Deployment Atlas. Every ingredient has an origin paper + evidence. Methodology is public. Dataset is CC BY 4.0.