gentic.news — AI News Intelligence Platform

Methodology

How the numbers are computed.

gentic.news publishes scores: relevance, quality, sentiment, velocity, prediction accuracy. This page documents the exact algorithms — thresholds, formulas, and known limitations — so you can decide how much to trust each number.

1. Relevance score (0–100)

Applied to every incoming article and tweet. Three stages:

  • Stage 1 — Keyword filter. Regex match against ~60 AI-industry terms (model names, lab names, venue acronyms). No match → dropped. Removes ~70% of incoming items.
  • Stage 2 — DeepSeek scoring. The source text (headline + first 500 chars) is sent to DeepSeek with a scoring rubric (novelty, specificity, verifiability, AI-industry relevance). Returns integer 0–100. Items below threshold (60 for RSS, 40 for X) are dropped.
  • Stage 3 — Corroboration boost. If an item's title matches an existing article (see dedup below), +2 relevance per new source, capped at 100.

Known limitation: DeepSeek scores vary ±5 between runs on identical input. Treat scores as buckets (low / medium / high), not precise rankings.
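For concreteness, here is a minimal sketch of the three stages in Python. Function and field names are hypothetical, the regex shows only a handful of the ~60 terms, and the thresholds are the ones listed above.

    import re

    # Illustrative subset of the ~60-term keyword list (the real list is larger).
    AI_TERMS = re.compile(
        r"\b(GPT|Claude|Gemini|DeepSeek|OpenAI|Anthropic|NeurIPS|ICML)\b",
        re.IGNORECASE,
    )

    def relevance_score(item, deepseek_score, extra_sources=0):
        # Stage 1 - keyword filter: no match means the item is dropped.
        text = item["headline"] + " " + item["body"][:500]
        if not AI_TERMS.search(text):
            return None

        # Stage 2 - DeepSeek rubric score (0-100), thresholded per channel.
        threshold = 60 if item["channel"] == "rss" else 40
        if deepseek_score < threshold:
            return None

        # Stage 3 - corroboration boost: +2 per additional source, capped at 100.
        return min(100, deepseek_score + 2 * extra_sources)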

2. Title deduplication

Before generating a new article from a tweet or RSS item, we check whether it duplicates any of the most recent 200 articles. An incoming title is considered a duplicate if any of these hold:

  • Exact URL or exact title match
  • Character-level similarity (SequenceMatcher) ≥ 0.65
  • Stemmed keyword overlap ≥ 3 shared keywords and ≥ 30% of the shorter title's keywords
  • Shared proper-noun entities + matching numeric values (e.g., both mention "100,000 qubits")
  • 2+ shared proper-noun entities and 2+ shared content words

CamelCase and hyphen splitting is applied first ("PoisonedRAG" ≡ "Poisoned RAG"). When a duplicate is detected, the new source is merged into the existing article's source list and corroboration count, rather than creating a second article.
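A condensed sketch of the first three checks, using Python's difflib.SequenceMatcher. The stemming step and the entity/numeric rules are omitted for brevity, and the function names are illustrative, not the production code.

    import re
    from difflib import SequenceMatcher

    def split_terms(title):
        # Split CamelCase and hyphens so "PoisonedRAG" and "Poisoned RAG" compare equal.
        spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", title).replace("-", " ")
        return [w.lower() for w in re.findall(r"[A-Za-z0-9]+", spaced)]

    def looks_duplicate(new_title, old_title):
        a, b = split_terms(new_title), split_terms(old_title)
        if a == b:                                    # exact title match
            return True
        ratio = SequenceMatcher(None, " ".join(a), " ".join(b)).ratio()
        if ratio >= 0.65:                             # character-level similarity
            return True
        overlap = set(a) & set(b)                     # keyword overlap (unstemmed here)
        if len(overlap) >= 3 and len(overlap) >= 0.3 * min(len(set(a)), len(set(b))):
            return True
        return False                                  # entity/numeric checks not shown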

3. Quality score (0–100) — 8-dimension rubric

Computed at article-generation time, distinct from relevance. The rubric scores each draft on eight dimensions, each 0–10; the eight scores are summed and normalised to 0–100 (a sketch of the arithmetic follows the list):

  • Source reliability — manually curated weight of the originating feed (1–10).
  • Entity specificity — presence of named companies, models, people, papers (vs. generic terms).
  • Verifiable facts — numbers, dates, dollar amounts, paper IDs, GitHub commits.
  • Corroboration — count of independent sources merged into the article.
  • Recency — published-at age relative to the topic's news cycle.
  • Originality of framing — whether the draft adds context beyond the source headline.
  • Internal coherence — passes a structure check (lead, body, sources, no orphan claims).
  • Anti-listicle / anti-hype — penalises "top 10" framing, marketing language, unsourced superlatives.
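The arithmetic is simple; a minimal sketch, with hypothetical dimension keys standing in for the real rubric fields:

    # Hypothetical dimension keys; the real rubric lives in the backend repo.
    DIMENSIONS = [
        "source_reliability", "entity_specificity", "verifiable_facts",
        "corroboration", "recency", "originality", "coherence", "anti_hype",
    ]

    def quality_score(dim_scores):
        # dim_scores maps each dimension to an integer 0-10.
        raw = sum(dim_scores[d] for d in DIMENSIONS)       # 0-80
        return round(100 * raw / (10 * len(DIMENSIONS)))   # normalised to 0-100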

Models used. Article drafting and general scoring run on DeepSeek (cost-efficient at high volume); retail-luxury scoring runs on GPT-5.4-mini. Total monthly LLM spend is capped at $20.30, audited via per-call cost logs.

Regen gate. Drafts scoring below 60/100 are auto-regenerated up to 2 times with a stronger rubric. Drafts still below 60 after two regens are dropped (not published). Drafts between 60 and 70 are published but demoted off the homepage. Drafts at 70+ are eligible for the homepage; the breakthrough flag triggers above 85.
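The gate reduces to a small decision function; names and labels below are illustrative, the thresholds are the ones above.

    def regen_gate(score, regen_count):
        if score < 60:
            return "regenerate" if regen_count < 2 else "drop"
        if score < 70:
            return "publish_demoted"        # published, kept off the homepage
        if score > 85:
            return "publish_breakthrough"   # homepage plus breakthrough flag
        return "publish_homepage"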

Source-weight assignments live in backend/app/core/source_weights.py (public in the repo when released). Quality < 40 articles are demoted off the homepage.

4. Prediction accuracy

The /predictions page shows an aggregate accuracy percentage. It's computed as:

accuracy = resolved_correct / (resolved_correct + resolved_incorrect + expired_unresolved)

Expired unresolved predictions count as failures — we do not hide them. The current figure is based on a small sample (< 50 resolved predictions as of April 2026), so confidence intervals are wide. Treat it as directional, not precise.
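As a worked example with made-up counts (not the live figures):

    def prediction_accuracy(resolved_correct, resolved_incorrect, expired_unresolved):
        # Expired-but-unresolved predictions count against accuracy.
        total = resolved_correct + resolved_incorrect + expired_unresolved
        return resolved_correct / total if total else None

    # prediction_accuracy(30, 12, 6) -> 30 / 48 = 0.625, reported as 62.5%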

5. Sources

We monitor 89+ sources:

  • 54 RSS feeds — arXiv (cs.AI, cs.CL, cs.LG), TechCrunch, MIT Technology Review, Wired, The Verge, Bloomberg, Reuters, OpenAI Blog, DeepMind Blog, Anthropic, HuggingFace, Stanford AI, Berkeley AI, plus domain-specific feeds for data centers (Data Center Frontier, DCD) and retail/luxury.
  • 35 curated X/Twitter accounts — researchers (Ethan Mollick, Elvis Saravia, Simon Willison), analysts (Dylan Patel/SemiAnalysis, Rich Miller, Tim Prickett Morgan), lab accounts (AnthropicAI, DarioAmodei), and industry publications (Data Center Dynamics, Data Center Frontier). Full list published at /editorial.

Known bias: the source list is English-language and Western-tech-heavy. Chinese and Indian AI developments may be under-represented. We're actively expanding coverage.

6. What we explicitly do NOT claim

  • That scores are precise to the digit. Treat them as bucketed rankings.
  • That our predictions are investment advice. They're probabilistic forecasts with a track record still forming.
  • That AI-generated articles are error-free. Every page says so. If you find an error, please contact us — a correction mechanism is on our roadmap.
  • That we have human editorial review on every article. We don't. One human (Ala Ayadi) curates sources, sets rules, and does spot-checks.

Audit + reproducibility

The backend scoring and dedup code is available on request for academic review. The prediction resolution ledger (every prediction, its deadline, and its verdict) is public at /predictions. We welcome third-party audits of both.

Last reviewed: April 29, 2026 · Authored by Ala Ayadi · Corrections log