Methodology
How the numbers are computed.
gentic.news publishes five scores: relevance, quality, sentiment, velocity, and prediction accuracy. This page documents the exact algorithms — thresholds, formulas, and known limitations — so you can decide how much to trust each number.
1. Relevance score (0–100)
Applied to every incoming article and tweet. Three stages:
- Stage 1 — Keyword filter. Regex match against ~60 AI-industry terms (model names, lab names, venue acronyms). No match → dropped. Removes ~70% of incoming items.
- Stage 2 — DeepSeek scoring. The source text (headline + first 500 chars) is sent to DeepSeek with a scoring rubric (novelty, specificity, verifiability, AI-industry relevance). Returns integer 0–100. Items below threshold (60 for RSS, 40 for X) are dropped.
- Stage 3 — Corroboration boost. If an item's title matches an existing article (see dedup below), +2 relevance per new source, capped at 100.
Known limitation: DeepSeek scores vary ±5 between runs on identical input. Treat scores as buckets (low / medium / high), not precise rankings.
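The three stages above can be sketched as a single function. This is an illustrative sketch, not the production code: the term list, the `llm_score` input (the Stage-2 DeepSeek result), and the function name are all placeholders; only the thresholds and the +2-per-source boost come from the text.

```python
import re

# Illustrative subset of the ~60-term keyword filter (the real list is larger).
AI_TERMS = re.compile(r"\b(gpt|llm|transformer|deepmind|anthropic|neurips)\b", re.I)

# Stage-2 cutoffs: 60 for RSS items, 40 for X/Twitter items.
THRESHOLDS = {"rss": 60, "x": 40}

def relevance(text: str, source_type: str, llm_score: int, extra_sources: int):
    """Return a final 0-100 relevance score, or None if the item is dropped."""
    # Stage 1: keyword filter — no regex match, item is dropped.
    if not AI_TERMS.search(text):
        return None
    # Stage 2: LLM rubric score below the per-source threshold — dropped.
    if llm_score < THRESHOLDS[source_type]:
        return None
    # Stage 3: corroboration boost — +2 per additional source, capped at 100.
    return min(100, llm_score + 2 * extra_sources)
```

Note the asymmetric Stage-2 thresholds: a tweet scoring 45 survives, an RSS item with the same score does not.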
2. Title deduplication
Before generating a new article from a tweet or RSS item, we check whether it duplicates any of the most recent 200 articles. An incoming title is considered a duplicate if any of these hold:
- Exact URL or exact title match
- Character-level similarity (SequenceMatcher) ≥ 0.65
- Stemmed keyword overlap ≥ 3 and ≥ 30% of the shorter title
- Shared proper-noun entities + matching numeric values (e.g., both mention "100,000 qubits")
- 2+ shared proper-noun entities and 2+ shared content words
CamelCase and hyphen splitting is applied first ("PoisonedRAG" ≡ "Poisoned RAG"). When a duplicate is detected, the new source is merged into the existing article's source list and corroboration count, rather than creating a second article.
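A minimal sketch of the title-level checks, using the stdlib `difflib.SequenceMatcher` named above. It covers CamelCase/hyphen splitting, the exact-title check, the 0.65 character-similarity threshold, and the word-overlap rule; the stemming, entity-matching, and numeric-value checks are omitted, and the function name is illustrative.

```python
import re
from difflib import SequenceMatcher

def split_compounds(title: str) -> str:
    # "PoisonedRAG" -> "Poisoned RAG"; hyphens also become spaces.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", title).replace("-", " ")

def is_duplicate(new_title: str, old_title: str) -> bool:
    a = split_compounds(new_title).lower()
    b = split_compounds(old_title).lower()
    # Exact title match (the URL check is omitted in this sketch).
    if a == b:
        return True
    # Character-level similarity >= 0.65.
    if SequenceMatcher(None, a, b).ratio() >= 0.65:
        return True
    # Word overlap: >= 3 shared words covering >= 30% of the shorter title
    # (the production version stems words first; this sketch does not).
    wa, wb = set(a.split()), set(b.split())
    shared = wa & wb
    shorter = min(len(wa), len(wb))
    return len(shared) >= 3 and shorter > 0 and len(shared) / shorter >= 0.30
```

Splitting before comparison is what makes "PoisonedRAG attack" and "Poisoned RAG attack" an exact match rather than a near-miss.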
3. Quality score (0–100) — 8-dimension rubric
Computed at article-generation time, distinct from relevance. The rubric scores each draft on eight dimensions, each 0–10, summed and normalised:
- Source reliability — manually curated weight of the originating feed (1–10).
- Entity specificity — presence of named companies, models, people, papers (vs. generic terms).
- Verifiable facts — numbers, dates, dollar amounts, paper IDs, GitHub commits.
- Corroboration — count of independent sources merged into the article.
- Recency — published-at age relative to the topic's news cycle.
- Originality of framing — whether the draft adds context beyond the source headline.
- Internal coherence — passes a structure check (lead, body, sources, no orphan claims).
- Anti-listicle / anti-hype — penalises "top 10" framing, marketing language, unsourced superlatives.
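The summing and normalisation step can be sketched as follows. The linear scaling from a 0–80 raw sum to 0–100 is an assumption on my part; the text says only "summed and normalised".

```python
def quality_score(dims: dict[str, int]) -> int:
    """Eight dimensions, each scored 0-10, summed and scaled to 0-100.

    Linear scaling (raw_sum / 80 * 100) is an assumed normalisation.
    """
    assert len(dims) == 8, "rubric has exactly eight dimensions"
    assert all(0 <= v <= 10 for v in dims.values()), "each dimension is 0-10"
    return round(sum(dims.values()) / 80 * 100)
```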
Models used. Article drafting and general scoring run on DeepSeek (cost-efficient at high volume); retail-luxury scoring runs on GPT-5.4-mini. Total monthly LLM spend is capped at $20.30, audited via per-call cost logs.
Regen gate. Drafts scoring below 60/100 are auto-regenerated up to 2 times with a stronger rubric. Drafts still below 60 after two regens are dropped (not published). Drafts between 60 and 70 are published but demoted off the homepage. Drafts at 70+ are eligible for the homepage; the breakthrough flag triggers above 85.
Source-weight assignments live in backend/app/core/source_weights.py (public in the repo when released). Articles with quality scores below 40 are demoted off the homepage.
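The regen-gate rules above reduce to a small decision function. The disposition labels are illustrative; the thresholds (60, 70, 85, two regens) are from the text.

```python
def regen_gate(score: int, regen_count: int) -> str:
    """Map a draft's quality score and regen count to a disposition."""
    if score < 60:
        # Below 60: regenerate with a stronger rubric, up to 2 times, then drop.
        return "regenerate" if regen_count < 2 else "drop"
    if score < 70:
        # 60-69: published but demoted off the homepage.
        return "publish_demoted"
    # 70+: homepage-eligible; the breakthrough flag triggers above 85.
    return "homepage_breakthrough" if score > 85 else "homepage"
```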
4. Prediction accuracy
The /predictions page shows an aggregate accuracy percentage. It's computed as:
accuracy = resolved_correct / (resolved_correct + resolved_incorrect + expired_unresolved)
Expired unresolved predictions count as failures — we do not hide them. The current figure is based on a small sample (< 50 resolved predictions as of April 2026), so confidence intervals are wide. Treat it as directional, not precise.
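The formula is simple enough to state as code; the point worth noticing is that expired-unresolved predictions sit in the denominator, so they pull the figure down exactly like incorrect calls.

```python
def prediction_accuracy(correct: int, incorrect: int, expired_unresolved: int) -> float:
    """Accuracy with expired-unresolved predictions counted as failures."""
    total = correct + incorrect + expired_unresolved
    return correct / total if total else 0.0
```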
5. Sources
We monitor 89+ sources:
- 54 RSS feeds — arXiv (cs.AI, cs.CL, cs.LG), TechCrunch, MIT Technology Review, Wired, The Verge, Bloomberg, Reuters, OpenAI Blog, DeepMind Blog, Anthropic, HuggingFace, Stanford AI, Berkeley AI, plus domain-specific feeds for data centers (Data Center Frontier, DCD) and retail/luxury.
- 35 curated X/Twitter accounts — researchers (Ethan Mollick, Elvis Saravia, Simon Willison), analysts (Dylan Patel/SemiAnalysis, Rich Miller, Tim Prickett Morgan), lab accounts (AnthropicAI, DarioAmodei), and industry publications (Data Center Dynamics, Data Center Frontier). Full list published at /editorial.
Known bias: the source list is English-language and Western-tech-heavy. Chinese and Indian AI developments may be under-represented. We're actively expanding coverage.
6. What we explicitly do NOT claim
- That scores are precise to the digit. Treat them as bucketed rankings.
- That our predictions are investment advice. They're probabilistic forecasts with a track record still forming.
- That AI-generated articles are error-free. Every page says so. If you find an error, please contact us — a correction mechanism is on our roadmap.
- That we have human editorial review on every article. We don't. One human (Ala Ayadi) curates sources, sets rules, and does spot-checks.
Audit + reproducibility
The backend scoring and dedup code is available on request for academic review. The prediction resolution ledger (every prediction, its deadline, and its verdict) is public at /predictions. We welcome third-party audits of both.
Last reviewed: April 29, 2026 · Authored by Ala Ayadi · Corrections log