What Happened
A new public repository called the "LLM Architecture Gallery" has been released, compiling technical documentation for 38 distinct large language model architectures launched between 2024 and 2026. The gallery, highlighted by AI engineer Akshay Pachaar, was created by Sebastian Raschka (@rasbt) with contributions from Pachaar during his time at Lightning AI.
The repository serves as a centralized technical reference, providing three core elements for each model:
- An annotated architecture diagram visualizing the model's structure.
- A breakdown of key design choices (e.g., attention mechanisms, normalization layers, activation functions).
- A code implementation, likely in PyTorch, demonstrating the architecture.
The Models Covered
The gallery spans a wide range of models from major AI labs and companies, focusing on releases from the last two years. The list of 38 models includes:
- Open-Source Foundation Models: Llama 3 8B, OLMo 2/3 variants (7B, 32B), Gemma 3 27B, Mistral Small 3.1 24B, SmolLM3 3B, GPT-OSS (20B, 120B).
- Recent High-Performance Models: DeepSeek V3, DeepSeek V3.2, DeepSeek R1, Qwen3 series (4B to 235B-A22B), Qwen3.5 397B, GLM-4.5/4.7/5 (up to 744B).
- Proprietary & Regional Models: Grok 2.5 270B, Kimi K2, Kimi Linear 48B-A3B, Xiaomi MiMo-V2-Flash 309B, Arcee AI Trinity Large 400B, Sarvam AI models (30B, 105B).
- Other 2024-2026 Architectures: Models such as Llama 4 Maverick and Nemotron 3 Super 120B-A12B, rounding out the architectures published during this period.
The repository is hosted on GitHub, accessible via the link provided in the source: https://github.com/rasbt/LLM-architecture-gallery.
Context
This project addresses a growing pain point in the fast-moving field of LLM research and development: fragmented and inconsistent documentation. While major models are often accompanied by academic papers or blog posts, the exact architectural details, layer configurations, and implementation nuances can be difficult to find, compare, or reproduce. A standardized gallery lets engineers and researchers quickly spot design trends, compare architectural choices (such as grouped-query attention versus full multi-head attention), and work from a single code reference for experimentation or education.
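To make the grouped-query attention (GQA) versus multi-head attention (MHA) comparison concrete, the sketch below shows the core structural difference: in MHA every query head has its own key/value head, while in GQA groups of query heads share one key/value head, shrinking the KV cache. The head counts and the helper function are illustrative assumptions, not taken from any specific model in the gallery.

```python
def kv_head_for(query_head: int, num_q_heads: int, num_kv_heads: int) -> int:
    """Return the index of the KV head that serves a given query head.

    Assumes query heads are split into equal contiguous groups,
    each group sharing one KV head (the usual GQA layout).
    """
    assert num_q_heads % num_kv_heads == 0, "query heads must divide evenly"
    group_size = num_q_heads // num_kv_heads
    return query_head // group_size

# MHA: 8 query heads, 8 KV heads -> one-to-one mapping.
mha_map = [kv_head_for(h, num_q_heads=8, num_kv_heads=8) for h in range(8)]
# -> [0, 1, 2, 3, 4, 5, 6, 7]

# GQA: 8 query heads share 2 KV heads -> groups of 4 query heads each.
gqa_map = [kv_head_for(h, num_q_heads=8, num_kv_heads=2) for h in range(8)]
# -> [0, 0, 0, 0, 1, 1, 1, 1]
```

With 2 KV heads instead of 8, the KV cache for this layer is 4x smaller, which is the practical motivation behind GQA in many recent models.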
The period from 2024 to 2026 has seen rapid architectural innovation within and beyond the now-standard Transformer, with models experimenting with mixture-of-experts (MoE) configurations (e.g., DeepSeek V3), new attention variants, and alternative topologies. Having these designs cataloged in one place provides a valuable snapshot of this evolutionary phase.
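The MoE idea mentioned above can be sketched in a few lines: a gating network scores all experts, only the top-k run per token, and their outputs are mixed with renormalized gate weights. This is a toy illustration of the general technique, not the routing used by DeepSeek V3 or any other specific model; the expert functions and all numbers are made up.

```python
import math

def top_k_routing(gate_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts,
    with softmax weights renormalized over just those k experts."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    exp_scores = [math.exp(gate_logits[i]) for i in top]
    total = sum(exp_scores)
    return [(i, s / total) for i, s in zip(top, exp_scores)]

# Four toy "experts": each is just a scalar function of the token input.
experts = [lambda x, m=m: m * x for m in (1.0, 2.0, 3.0, 4.0)]

def moe_layer(x, gate_logits, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in top_k_routing(gate_logits, k))

# Experts 1 and 3 score highest, so only they are evaluated for this token.
y = moe_layer(1.0, gate_logits=[0.1, 2.0, 0.3, 1.5], k=2)
```

The appeal in large models is that total parameter count grows with the number of experts while per-token compute stays roughly constant, since only k experts execute.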