Mechanism: Binary Search via HTTP Range Requests Over Large Static Files
Sources: 1 • Confidence: High • Updated: 2026-03-02 19:33
Key takeaways
- The demo accepts either a single character or a hexadecimal Unicode codepoint and displays the steps of the binary search through the large file.
- HTTP range request techniques are incompatible with HTTP compression, because byte offsets are computed against the uncompressed file and no longer line up with positions inside a compressed response.
- The tool was deployed at tools.simonwillison.net and issues range requests against a CORS-enabled 76.6MB file hosted in S3 and fronted by Cloudflare.
- Claude generated a specification, which Claude Code for web then converted into working code as part of an asynchronous research workflow.
- A prototype was built entirely on a phone as an experiment with HTTP range requests.
Sections
Mechanism: Binary Search via HTTP Range Requests Over Large Static Files
- The demo accepts either a single character or a hexadecimal Unicode codepoint and displays the steps of the binary search through the large file.
- A prototype was built entirely on a phone as an experiment with HTTP range requests.
- The prototype performs binary search over a large file by issuing HTTP range requests.
- A proposed use case for the approach is looking up Unicode codepoint metadata that spans many megabytes of data.
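The core idea can be sketched in a few lines. This is a hedged illustration, not the deployed tool's actual code: the 32-byte record width, the `KEY;VALUE` record layout, and the `fetch_range` helper are all assumptions, and `fetch_range` stands in for a real HTTP GET carrying a `Range: bytes=start-end` header (inclusive, per RFC 9110) by slicing an in-memory file.

```python
RECORD_SIZE = 32  # hypothetical fixed record width; the real file format isn't described


def fetch_range(data: bytes, start: int, end: int) -> bytes:
    """Stand-in for an HTTP GET with 'Range: bytes={start}-{end}' (inclusive);
    here it just slices an in-memory copy of the file."""
    return data[start:end + 1]


def lookup(data: bytes, key: bytes):
    """Binary-search the sorted file, fetching one record per probe."""
    lo, hi = 0, len(data) // RECORD_SIZE - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start = mid * RECORD_SIZE
        record = fetch_range(data, start, start + RECORD_SIZE - 1)
        record_key = record.split(b";")[0]
        if record_key == key:
            return record.rstrip()  # found: return the record minus padding
        elif record_key < key:
            lo = mid + 1  # key sorts after this record
        else:
            hi = mid - 1  # key sorts before this record
    return None  # key not present
```

Because each probe transfers only one record, a lookup in a file of n records costs O(log n) small range requests rather than downloading tens of megabytes up front.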
Conditions And Correctness Constraints: Sorted Data And No Compression
- HTTP range request techniques are incompatible with HTTP compression, because byte offsets are computed against the uncompressed file and no longer line up with positions inside a compressed response.
- This range-request binary-search approach requires data that is naturally sorted.
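A minimal demonstration of why compression breaks the offset math, using Python's standard `gzip` module on a synthetic sorted file (the record layout is invented for illustration): an offset computed against the uncompressed bytes points somewhere meaningless inside the compressed stream, which is why the object must be served uncompressed for range requests to work.

```python
import gzip

# A sorted file of fixed-width records: "00000000\n", "00000001\n", ...
plain = b"".join(f"{i:08d}\n".encode() for i in range(1000))
compressed = gzip.compress(plain)

# Record 500 lives at a predictable byte offset in the plain file...
offset = 500 * 9  # 9 bytes per record
assert plain[offset:offset + 9] == b"00000500\n"

# ...but the same offset into the gzip stream is opaque, unrelated bytes.
assert compressed[offset:offset + 9] != b"00000500\n"
```

In practice that means the server or CDN must not apply transfer compression to this object; the byte positions the client computes are only valid against the identity-encoded file.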
Deployment Architecture: CORS-Enabled Large Object in S3 Behind Cloudflare
- The tool was deployed at tools.simonwillison.net and issues range requests against a CORS-enabled 76.6MB file hosted in S3 and fronted by Cloudflare.
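The source doesn't include the actual bucket settings; a hypothetical S3 CORS policy permitting cross-origin range requests from the tool's page might look like the following (key names follow AWS's S3 CORS JSON schema, but the specific origins, headers, and cache age are assumptions):

```json
[
  {
    "AllowedOrigins": ["https://tools.simonwillison.net"],
    "AllowedMethods": ["GET", "HEAD"],
    "AllowedHeaders": ["Range"],
    "ExposeHeaders": ["Accept-Ranges", "Content-Range", "Content-Length"],
    "MaxAgeSeconds": 3600
  }
]
```

Exposing `Content-Range` matters because browser JavaScript cannot otherwise read partial-response metadata; and the fronting CDN would also need compression disabled for this object so that byte offsets remain valid.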
Workflow Delta: AI-Assisted Prototyping from Spec to Code
- Claude generated a specification, which Claude Code for web then converted into working code as part of an asynchronous research workflow.
Unknowns
- What are the measured performance characteristics (latency per lookup, number of range requests per query, total bytes transferred per query) under realistic network conditions?
- What exact server/CDN configuration ensures responses are not compressed for range requests, and how is correctness validated across different clients and network paths?
- How is the large file structured (record format, fixed vs variable-length records, indexing approach if any) to support binary search on byte ranges?
- What are the operational cost implications (S3 egress, Cloudflare bandwidth, request volume) relative to alternative approaches like shipping a local index or precomputed compact data?
- What failure modes are handled (range request unsupported, CORS misconfiguration, partial content errors, inconsistent byte serving) and what are the user-visible fallbacks?