Methodology
How the numbers are computed.
gentic.news publishes five scores: relevance, quality, sentiment, velocity, and prediction accuracy. This page documents the exact algorithms — thresholds, formulas, and known limitations — so you can decide how much to trust each number.
1. Relevance score (0–100)
Applied to every incoming article and tweet. Three stages:
- Stage 1 — Keyword filter. Regex match against ~60 AI-industry terms (model names, lab names, venue acronyms). No match → dropped. Removes ~70% of incoming items.
- Stage 2 — DeepSeek scoring. The source text (headline + first 500 chars) is sent to DeepSeek with a scoring rubric (novelty, specificity, verifiability, AI-industry relevance). Returns integer 0–100. Items below threshold (60 for RSS, 40 for X) are dropped.
- Stage 3 — Corroboration boost. If an item's title matches an existing article (see dedup below), +2 relevance per new source, capped at 100.
Known limitation: DeepSeek scores vary ±5 between runs on identical input. Treat scores as buckets (low / medium / high), not precise rankings.
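The three stages above can be sketched as a single function. This is an illustrative sketch, not the production code: the term list, the `llm_score` input (the Stage-2 DeepSeek result), and the function name are all placeholders; only the thresholds and the +2-per-source boost come from the text.

```python
import re

# Illustrative subset of the ~60-term keyword filter (the real list is larger).
AI_TERMS = re.compile(r"\b(gpt|llm|transformer|deepmind|anthropic|neurips)\b", re.I)

# Stage-2 cutoffs: 60 for RSS items, 40 for X/Twitter items.
THRESHOLDS = {"rss": 60, "x": 40}

def relevance(text: str, source_type: str, llm_score: int, extra_sources: int):
    """Return a final 0-100 relevance score, or None if the item is dropped."""
    # Stage 1: keyword filter — no regex match, item is dropped.
    if not AI_TERMS.search(text):
        return None
    # Stage 2: LLM rubric score below the per-source threshold — dropped.
    if llm_score < THRESHOLDS[source_type]:
        return None
    # Stage 3: corroboration boost — +2 per additional source, capped at 100.
    return min(100, llm_score + 2 * extra_sources)
```

Note the asymmetric Stage-2 thresholds: a tweet scoring 45 survives, an RSS item with the same score does not.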
2. Title deduplication
Before generating a new article from a tweet or RSS item, we check whether it duplicates any of the most recent 200 articles. An incoming title is considered a duplicate if any of these hold:
- Exact URL or exact title match
- Character-level similarity (SequenceMatcher) ≥ 0.65
- Stemmed keyword overlap ≥ 3 and ≥ 30% of the shorter title
- Shared proper-noun entities + matching numeric values (e.g., both mention "100,000 qubits")
- 2+ shared proper-noun entities and 2+ shared content words
CamelCase and hyphen splitting is applied first ("PoisonedRAG" ≡ "Poisoned RAG"). When a duplicate is detected, the new source is merged into the existing article's source list and corroboration count, rather than creating a second article.
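A minimal sketch of the title-level checks, using the stdlib `difflib.SequenceMatcher` named above. It covers CamelCase/hyphen splitting, the exact-title check, the 0.65 character-similarity threshold, and the word-overlap rule; the stemming, entity-matching, and numeric-value checks are omitted, and the function name is illustrative.

```python
import re
from difflib import SequenceMatcher

def split_compounds(title: str) -> str:
    # "PoisonedRAG" -> "Poisoned RAG"; hyphens also become spaces.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", title).replace("-", " ")

def is_duplicate(new_title: str, old_title: str) -> bool:
    a = split_compounds(new_title).lower()
    b = split_compounds(old_title).lower()
    # Exact title match (the URL check is omitted in this sketch).
    if a == b:
        return True
    # Character-level similarity >= 0.65.
    if SequenceMatcher(None, a, b).ratio() >= 0.65:
        return True
    # Word overlap: >= 3 shared words covering >= 30% of the shorter title
    # (the production version stems words first; this sketch does not).
    wa, wb = set(a.split()), set(b.split())
    shared = wa & wb
    shorter = min(len(wa), len(wb))
    return len(shared) >= 3 and shorter > 0 and len(shared) / shorter >= 0.30
```

Splitting before comparison is what makes "PoisonedRAG attack" and "Poisoned RAG attack" an exact match rather than a near-miss.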
3. Quality score (0–100) — 8-dimension rubric
Computed at article-generation time, distinct from relevance. The rubric scores each draft on eight dimensions, each 0–10, summed and normalised:
- Source reliability — manually curated weight of the originating feed (1–10).
- Entity specificity — presence of named companies, models, people, papers (vs. generic terms).
- Verifiable facts — numbers, dates, dollar amounts, paper IDs, GitHub commits.
- Corroboration — count of independent sources merged into the article.
- Recency — published-at age relative to the topic's news cycle.
- Originality of framing — whether the draft adds context beyond the source headline.
- Internal coherence — passes a structure check (lead, body, sources, no orphan claims).
- Anti-listicle / anti-hype — penalises "top 10" framing, marketing language, unsourced superlatives.
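The summing and normalisation step can be sketched as follows. The linear scaling from a 0–80 raw sum to 0–100 is an assumption on my part; the text says only "summed and normalised".

```python
def quality_score(dims: dict[str, int]) -> int:
    """Eight dimensions, each scored 0-10, summed and scaled to 0-100.

    Linear scaling (raw_sum / 80 * 100) is an assumed normalisation.
    """
    assert len(dims) == 8, "rubric has exactly eight dimensions"
    assert all(0 <= v <= 10 for v in dims.values()), "each dimension is 0-10"
    return round(sum(dims.values()) / 80 * 100)
```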
Models used. Article drafting and general scoring run on DeepSeek (cost-efficient at high volume); retail-luxury scoring runs on GPT-5.4-mini. Total monthly LLM spend is capped at $20.30, audited via per-call cost logs.
Regen gate. Drafts scoring below 60/100 are auto-regenerated up to 2 times with a stronger rubric. Drafts still below 60 after two regens are dropped (not published). Drafts between 60 and 70 are published but demoted off the homepage. Drafts at 70+ are eligible for the homepage; the breakthrough flag triggers above 85.
Source-weight assignments live in backend/app/core/source_weights.py (public in the repo when released). Articles with quality scores below 40 are demoted off the homepage.
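The regen-gate rules above reduce to a small decision function. The disposition labels are illustrative; the thresholds (60, 70, 85, two regens) are from the text.

```python
def regen_gate(score: int, regen_count: int) -> str:
    """Map a draft's quality score and regen count to a disposition."""
    if score < 60:
        # Below 60: regenerate with a stronger rubric, up to 2 times, then drop.
        return "regenerate" if regen_count < 2 else "drop"
    if score < 70:
        # 60-69: published but demoted off the homepage.
        return "publish_demoted"
    # 70+: homepage-eligible; the breakthrough flag triggers above 85.
    return "homepage_breakthrough" if score > 85 else "homepage"
```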
4. Prediction accuracy
The /predictions page shows an aggregate accuracy percentage. It's computed as:
accuracy = resolved_correct / (resolved_correct + resolved_incorrect + expired_unresolved)
Expired unresolved predictions count as failures — we do not hide them. The current figure is based on a small sample (< 50 resolved predictions as of April 2026), so confidence intervals are wide. Treat it as directional, not precise.
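The formula is simple enough to state as code; the point worth noticing is that expired-unresolved predictions sit in the denominator, so they pull the figure down exactly like incorrect calls.

```python
def prediction_accuracy(correct: int, incorrect: int, expired_unresolved: int) -> float:
    """Accuracy with expired-unresolved predictions counted as failures."""
    total = correct + incorrect + expired_unresolved
    return correct / total if total else 0.0
```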
5. Sources
We monitor 89+ sources:
- 54 RSS feeds — arXiv (cs.AI, cs.CL, cs.LG), TechCrunch, MIT Technology Review, Wired, The Verge, Bloomberg, Reuters, OpenAI Blog, DeepMind Blog, Anthropic, HuggingFace, Stanford AI, Berkeley AI, plus domain-specific feeds for data centers (Data Center Frontier, DCD) and retail/luxury.
- 35 curated X/Twitter accounts — researchers (Ethan Mollick, Elvis Saravia, Simon Willison), analysts (Dylan Patel/SemiAnalysis, Rich Miller, Tim Prickett Morgan), lab accounts (AnthropicAI, DarioAmodei), and industry publications (Data Center Dynamics, Data Center Frontier). Full list published at /editorial.
Known bias: the source list is English-language and Western-tech-heavy. Chinese and Indian AI developments may be under-represented. We're actively expanding coverage.
6. What we explicitly do NOT claim
- That scores are precise to the digit. Treat them as bucketed rankings.
- That our predictions are investment advice. They're probabilistic forecasts with a track record still forming.
- That AI-generated articles are error-free. Every page says so. If you find an error, please contact us — a correction mechanism is on our roadmap.
- That we have human editorial review on every article. We don't. One human (Ala Ayadi) curates sources, sets rules, and does spot-checks.
Audit + reproducibility
The backend scoring and dedup code is available on request for academic review. The prediction resolution ledger (every prediction, its deadline, and its verdict) is public at /predictions. We welcome third-party audits of both.
Last reviewed: April 29, 2026 · Authored by Ala Ayadi · Corrections log