Rosa Del Mar

Daily Brief

Issue 92 • 2026-04-02

Pangram Detection Approach And Reported Metrics

General
Sources: 1 • Confidence: Medium • Updated: 2026-04-03 03:53

Key takeaways

  • Pangram Labs offers a paid product and a free service that returns an estimated probability of human versus AI authorship for pasted text.
  • Some books are beginning to include explicit disclaimers stating they were written solely by humans, without any AI assistance.
  • The boundary of unacceptable AI assistance is unclear because tools like spellcheck or AI copy-editing may be treated differently despite similar functional roles.
  • Max Spiro expects detector evasion could become practical by optimizing simultaneously for a detector's human score and a separate LLM-based coherence judge.
  • Pangram anticipates difficulty sourcing clean contemporary human text because online text increasingly contains AI content, and plans to rely more on pre-2023 corpora and trusted actors for newer human text.

Sections

Pangram Detection Approach And Reported Metrics

  • Pangram Labs offers a paid product and a free service that returns an estimated probability of human versus AI authorship for pasted text.
  • In Pangram’s initial baseline testing, a human evaluator could classify AI versus human text with about 90% accuracy.
  • Pangram reports a false-positive rate of about 1 in 10,000 on human writing.
  • Pangram reports roughly a 1% false-negative rate for detecting straightforward AI-generated outputs, with worse performance under adversarial prompting.
  • Pangram’s detector infers AI authorship by learning many small writing-choice patterns across a passage rather than relying on a few explicit tells.
  • Pangram trains a deep-learning classifier using millions of human texts paired with synthetic AI mirror texts matched for topic and length.
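The pairing-based training recipe above can be sketched in miniature. Everything here is illustrative: the hashed-trigram features and logistic regression stand in for Pangram's deep encoder, and the function names are hypothetical, not Pangram's actual API.

```python
import math
import zlib

def featurize(text, dims=64):
    """Toy stand-in for a deep text encoder: L2-normalized hashed trigram counts."""
    vec = [0.0] * dims
    for i in range(len(text) - 2):
        vec[zlib.crc32(text[i:i + 3].encode()) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def train_detector(pairs, epochs=300, lr=0.5, dims=64):
    """Fit a logistic-regression 'detector' on (human_text, ai_mirror_text) pairs.

    Each human text is paired with an AI mirror matched for topic and length,
    as in the setup described above; label 1 means AI-authored.
    """
    data = ([(featurize(h, dims), 0.0) for h, _ in pairs] +
            [(featurize(a, dims), 1.0) for _, a in pairs])
    w, b = [0.0] * dims, 0.0
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted P(AI)
            g = p - y                       # gradient of log-loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def ai_probability(model, text, dims=64):
    """Return the model's estimated probability that `text` is AI-written."""
    w, b = model
    z = sum(wi * xi for wi, xi in zip(w, featurize(text, dims))) + b
    return 1.0 / (1.0 + math.exp(-z))
```

In practice Pangram reportedly trains a deep classifier on millions of such pairs; this toy version only demonstrates the data layout (paired human and mirror texts) and the probability-of-AI output that the free service exposes.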

Credibility Heuristics Breakdown And Norms

  • Some books are beginning to include explicit disclaimers stating they were written solely by humans, without any AI assistance.
  • AI writing is often perceived as mechanically sound but unpleasant in feel, though it can occasionally be striking.
  • Polished grammar and spelling historically served as a heuristic for intelligence and credibility, but LLMs weaken that link by producing fluent arguments for absurd propositions.
  • AI-generated long-form text may be recognizable by a consistent 'sickly sweetness' and weak style even when specific errors are hard to articulate.
  • AI struggles to write convincingly in the style of specific writers unless the target style is extremely distinctive, even though its output remains clear enough for basic comprehension.
  • As AI produces a large share of written text, interest in reliably distinguishing human from AI authorship is expected to increase.

Workflow Costs And Decision Points For Detection

  • The boundary of unacceptable AI assistance is unclear because tools like spellcheck or AI copy-editing may be treated differently despite similar functional roles.
  • If AI-detection models falsely label human writing as AI-generated and are treated as authoritative, they can create reputational and career risk.
  • A practical motivation for detecting AI writing is deciding whether to engage with social media replies that may be bots rather than real people.
  • Trying to identify whether everyday incoming writing is AI-generated can impose a large ongoing cognitive burden on journalists.
  • AI-writing concerns are expected to be especially acute in education and legal work where authorship and accountability matter.

Adversarial Evasion And Arms Race Constraints

  • Max Spiro expects detector evasion could become practical by optimizing simultaneously for a detector's human score and a separate LLM-based coherence judge.
  • In Jill Weisenthal's initial tests, Pangram classified her writing as human and AI outputs as AI, and still flagged AI after multiple translation steps.
  • An attempt to evade Pangram by iteratively searching for prompts that score as human succeeded only by producing largely incoherent or grammatically incorrect text.
  • An adversary could iteratively generate text that appears human by jointly optimizing against a detector score and an LLM-based coherence judge.
  • As LLMs become more capable, their output distributions become more complex, requiring larger or more powerful detector models to keep pace.
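The evasion strategy described above, jointly optimizing a detector's human score and an LLM coherence judge, amounts to a constrained search loop. The sketch below is an assumption-laden illustration: `propose_rewrite`, `detector_human_score`, and `coherence_score` are hypothetical callables standing in for an LLM rewriter, a detector API, and an LLM-based judge respectively.

```python
def evade(text, propose_rewrite, detector_human_score, coherence_score,
          steps=100, min_coherence=0.7):
    """Hill-climb on human-likeness while enforcing a coherence floor.

    Candidates are rewritten at random and kept only when they improve a
    combined objective of (detector human score) x (judge coherence score).
    """
    def objective(t):
        c = coherence_score(t)
        # Reject incoherent candidates outright: evading the detector with
        # gibberish (as in the failed attempt described above) is not useful.
        return detector_human_score(t) * c if c >= min_coherence else float("-inf")

    best, best_score = text, objective(text)
    for _ in range(steps):
        candidate = propose_rewrite(best)
        score = objective(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best
```

The coherence floor is what distinguishes this from the naive prompt search that only produced incoherent text: without it, the loop happily converges on gibberish that scores as human.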

Training Data Provenance And Prevalence Claims

  • Pangram anticipates difficulty sourcing clean contemporary human text because online text increasingly contains AI content, and plans to rely more on pre-2023 corpora and trusted actors for newer human text.
  • Max Spiro estimates roughly 40% of internet pages are AI-written, driven largely by SEO-focused article production switching to AI for cost reasons.
  • Pangram’s scan of Medium found that over 50% of newly published Medium articles were AI-generated at the time of the scan.
  • Max Spiro expects AI-generated content to rise to a majority of internet content within about a year.

Watchlist

  • Some books are beginning to include explicit disclaimers stating they were written solely by humans, without any AI assistance.
  • Max Spiro expects detector evasion could become practical by optimizing simultaneously for a detector's human score and a separate LLM-based coherence judge.

Unknowns

  • What are Pangram’s independently verified precision/recall metrics across domains, genres, languages, and populations (including non-native English writers), and how do they change over time?
  • How robust are AI-authorship detectors to multi-objective evasion methods that optimize simultaneously for detector scores and coherence/style constraints?
  • What is the current share of AI-generated text on the open web and on major publishing/UGC platforms under transparent measurement methodologies?
  • Can provenance initiatives based on device-capture signatures achieve meaningful adoption across device makers and platforms, and do they remain trustworthy under tampering and relay scenarios?
  • Do emerging 'no AI' disclaimers and social norms around disclosure measurably reduce low-quality AI content production or change reader trust and engagement?

Investor overlay

Read-throughs

  • Rising demand for AI authorship detection as a recurring operational decision for moderation, grading, publishing, and trust, shifting spend toward detector vendors and related workflow tooling.
  • A data advantage may accrue to actors with access to clean human-only corpora and trusted contemporary sources, as open web contamination makes training and evaluation harder.
  • An arms race dynamic could drive ongoing compute and iteration needs for detectors, benefiting providers able to sustain continuous red-teaming and retraining and support granular scoring of AI assistance.

What would confirm

  • Public, independently verified precision and recall results across domains, languages, and non-native writers, with monitoring that shows stable or improving performance over time.
  • Evidence of adoption of disclosure norms such as 'no AI' disclaimers, or platform policies that incorporate detection into publish, grade, or moderation workflows with stated process safeguards.
  • Demonstrated robustness against multi-objective evasion that optimizes for detector human score and coherence, or documented retraining cadence that closes such gaps without large false positive increases.

What would kill

  • Independent tests show materially high false positives or unstable performance across genres, languages, or non-native writing, making detectors unusable in high-stakes workflows.
  • Multi-objective evasion becomes practical and widely available, causing detector performance to degrade despite retraining, especially without detectable quality loss.
  • Inability to source clean contemporary human text leads to degraded model training and evaluation, with increasing contamination of reference corpora and no credible mitigation via trusted sources.

Sources