Derivative-Work Dispute And Clean-Room Validity Under Prior Exposure
Sources: 1 • Confidence: High • Updated: 2026-04-12 10:23
Key takeaways
- Mark Pilgrim argues that chardet 7.0.0 cannot be relicensed to MIT because it is a modification of LGPL-licensed work and is not a valid clean-room implementation given the maintainers' prior exposure to the code.
- The chardet project was originally released in 2006 by Mark Pilgrim under the LGPL.
- AI coding agents can produce a fresh codebase from a specification and tests quickly enough to approximate a clean-room reimplementation workflow, without the overhead of a traditional multi-team clean-room process.
- Dan Blanchard reports using the JPlag tool and obtaining low similarity scores for chardet 7.0.0 versus prior versions, while older versions show high similarity with each other.
- A key ecosystem watch item is whether low-cost reimplementation from test suites will cause software to re-emerge under more permissive open-source or proprietary licenses at scale.
Sections
Derivative-Work Dispute And Clean-Room Validity Under Prior Exposure
- Mark Pilgrim argues that chardet 7.0.0 cannot be relicensed to MIT because it is a modification of LGPL-licensed work and is not a valid clean-room implementation given the maintainers' prior exposure to the code.
- Dan Blanchard asserts that chardet 7.0.0 is an independent work and therefore can be MIT licensed despite the project's LGPL history.
- Dan Blanchard acknowledges that a traditional clean-room separation did not exist because he had extensive prior knowledge of the original chardet codebase from maintaining it for over a decade.
- The dispute over whether chardet 7.0.0 can be MIT licensed is expected to be difficult to resolve definitively in the near term.
License Transition Via Rewrite With Same Name And API
- The chardet project was originally released in 2006 by Mark Pilgrim under the LGPL.
- chardet has been maintained by others since 2011, and Dan Blanchard has made every release since version 1.1 in July 2012.
- Dan Blanchard released chardet 7.0.0 and described it as a ground-up rewrite under the MIT license that keeps the same package name and public API as a drop-in replacement for 5.x/6.x.
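One way to make the "drop-in replacement" claim concrete is a small contract check on the result of the detection call. The sketch below is illustrative only: `check_detect_contract` and `stub_detect` are hypothetical names, and the expected keys reflect the dictionary returned by `chardet.detect()` in the 5.x line. That 7.0.0 returns the same shape is an assumption drawn from the drop-in claim, not something verified here.

```python
from typing import Callable

# Keys in the dict returned by chardet.detect() in the 5.x line.
# ASSUMPTION: a drop-in replacement keeps the same keys.
EXPECTED_KEYS = {"encoding", "confidence", "language"}


def check_detect_contract(detect: Callable[[bytes], dict], sample: bytes) -> list[str]:
    """Return a list of contract violations; an empty list means compatible."""
    result = detect(sample)
    if not isinstance(result, dict):
        return [f"expected dict, got {type(result).__name__}"]

    problems = []
    missing = EXPECTED_KEYS - result.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    # encoding may be None when detection fails, otherwise a string.
    encoding = result.get("encoding")
    if encoding is not None and not isinstance(encoding, str):
        problems.append("encoding is neither None nor str")

    # confidence, when present, should be a number in [0, 1].
    if "confidence" in result:
        confidence = result["confidence"]
        if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
            problems.append("confidence is not a number in [0, 1]")

    return problems


def stub_detect(data: bytes) -> dict:
    """Hypothetical stand-in for a detector, used so the sketch runs without chardet."""
    return {"encoding": "utf-8", "confidence": 0.99, "language": ""}


print(check_detect_contract(stub_detect, b"hello"))
```

With chardet installed, the same function could be called as `check_detect_contract(chardet.detect, sample)` under each version being compared. A check like this covers only the shape of the result, not whether the two versions detect the same encodings.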
AI-Assisted Reimplementation Lowers Clean-Room Friction
- AI coding agents can produce a fresh codebase from a specification and tests quickly enough to approximate a clean-room reimplementation workflow, without the overhead of a traditional multi-team clean-room process.
- Dan Blanchard reports that the rewrite process began in an empty repository, included instructing Claude not to use LGPL/GPL-licensed code, and proceeded via iterative review, testing, and refinement.
Similarity Measurement As Evidence Standard (And Its Limits)
- Dan Blanchard reports using the JPlag tool and obtaining low similarity scores for chardet 7.0.0 versus prior versions, while older versions show high similarity with each other.
- Dan Blanchard argues that non-derivation can be supported by measurement of structural independence rather than strict clean-room process separation alone.
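To illustrate what a structural similarity measurement looks at, the sketch below compares two Python sources after replacing identifiers and literals with placeholders, so that renaming alone does not lower the score. This is not JPlag and does not reproduce its algorithm or its scores; it is a simplified stand-in built on the standard library's `tokenize` and `difflib`, and the function names are hypothetical.

```python
import difflib
import io
import keyword
import tokenize

# Token types that carry no structural information for this comparison.
_SKIP = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def structural_tokens(source: str) -> list[str]:
    """Tokenize Python source, abstracting away names and literal values."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")    # any identifier
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")   # any numeric literal
        elif tok.type == tokenize.STRING:
            tokens.append("STR")   # any string literal
        else:
            tokens.append(tok.string)  # keywords, operators, punctuation
    return tokens


def structural_similarity(source_a: str, source_b: str) -> float:
    """Return a ratio in [0, 1] over the abstracted token sequences."""
    a = structural_tokens(source_a)
    b = structural_tokens(source_b)
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


original = "def add(x, y):\n    return x + y\n"
renamed = "def plus(left, right):\n    return left + right\n"
different = "class Box:\n    def __init__(self):\n        self.items = []\n"

print(structural_similarity(original, renamed))
print(structural_similarity(original, different))
```

The renamed function scores as identical to the original, which is why token-level tools are used to detect copying that survives cosmetic edits. The converse is the limit flagged in the heading above: a low score shows structural difference, but on its own it does not establish how the code was produced.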
Forward-Looking Ecosystem And Litigation Watch Items
- A key ecosystem watch item is whether low-cost reimplementation from test suites will cause software to re-emerge under more permissive open-source or proprietary licenses at scale.
- Well-funded litigation is expected to emerge around AI-assisted clean-room-like rewrites as commercial firms come to see their IP as threatened by cheap reimplementation.
Watchlist
- A key ecosystem watch item is whether low-cost reimplementation from test suites will cause software to re-emerge under more permissive open-source or proprietary licenses at scale.
Unknowns
- Is chardet 7.0.0 legally considered an independent work or a derivative of prior LGPL-licensed versions?
- Can similarity measurements such as the reported JPlag scores serve as persuasive evidence of non-derivation in legal or widely accepted community processes?
- Are the reported JPlag similarity results reproducible with independent tools, configurations, and reviewers?
- Was the model used in the rewrite trained on the original chardet repository, and if so, how should that training exposure affect clean-room and derivative-work analysis?
- What, if any, authoritative third-party positions (legal analyses, foundations, distributors) will emerge that materially change how the ecosystem treats chardet 7.0.0?