Baseline PIR Constructions And Limits
Sources: 1 • Confidence: Medium • Updated: 2026-04-12 10:30
Key takeaways
- PIR with preprocessing can remove the 1-of-2 trust requirement by having the client precompute many random-query XOR sums after an upfront full-database download, effectively acting as one of the two servers.
- Plinko reduces both client communication and server compute from O(N) to O(sqrt(N)) while using preprocessing to eliminate the trust assumption.
- For an Ethereum-like dataset of 10 billion 32-byte values arranged as a 100000x100000 grid, Plinko would require storing about 100000*128 ≈ 12.8 million hints (one 32-byte XOR sum each), approximately 390 MiB, plus additional backup hints.
- Eliminating the client-side O(N) preprocessing download is described as the hardest practical improvement target and is expected to require substantially heavier server-side computation beyond Plinko’s XOR-based workload.
- A major privacy gap not well addressed by ZK-SNARKs or traditional FHE is private reads, where users want to read from a database without the server learning what they read.
Sections
Baseline PIR Constructions And Limits
- PIR with preprocessing can remove the 1-of-2 trust requirement by having the client precompute many random-query XOR sums after an upfront full-database download, effectively acting as one of the two servers.
- Classic two-server PIR can be implemented by querying two servers with two nearly identical random subsets that differ only in whether index i is included, such that XORing the two servers’ XOR-sums reveals D[i].
- Classic two-server PIR requires trusting that at least one of the two servers is honest.
- Classic two-server PIR has client communication overhead proportional to the number of cells in the database.
- Classic two-server PIR requires the server to XOR about half the database per query.
- Single-server PIR can be built using homomorphic encryption to encrypt queries so the server can compute an encrypted answer, but it remains computationally expensive for very large datasets.
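The classic two-server construction above can be sketched in a few lines. This is a toy illustration, not a production protocol: the helper names (`server_answer`, `query`) are invented for this sketch, and a real deployment would send the subsets to two non-colluding servers rather than calling both locally.

```python
import secrets

def server_answer(D, subset):
    """Server side: XOR together the requested cells of database D."""
    acc = 0
    for j in subset:
        acc ^= D[j]
    return acc

def query(D, i):
    """Client side: build two random subsets differing only at index i,
    query each 'server', and XOR the two answers to recover D[i]."""
    S1 = {j for j in range(len(D)) if secrets.randbits(1)}
    S2 = S1 ^ {i}                 # symmetric difference: toggle membership of i
    a1 = server_answer(D, S1)     # answer from server 1
    a2 = server_answer(D, S2)     # answer from server 2
    return a1 ^ a2                # shared cells cancel, leaving D[i]

D = [secrets.randbits(256) for _ in range(64)]
assert query(D, 17) == D[17]
```

Each subset on its own is a uniformly random set, so neither server alone learns i; only collusion breaks privacy, which is exactly the 1-of-2 trust assumption the notes describe.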
Plinko Protocol Mechanics
- Plinko reduces both client communication and server compute from O(N) to O(sqrt(N)) while using preprocessing to eliminate the trust assumption.
- In Plinko setup, the database can be arranged as a sqrt(N) by sqrt(N) grid and each hint XORs one pseudorandomly chosen cell from each of roughly (sqrt(N)/2 + 1) rows, repeated for about 128*sqrt(N) hints.
- Plinko has a setup phase where the client streams through the entire dataset once and stores a small set of hints, and a query phase where the client combines a server reply with a matching hint to recover the target cell.
- Plinko prefers an invertible PRF so that given a row seed and a column, the client can find all hint indices that include that cell, avoiding costly hash grinding during queries.
- Plinko uses backup hints constructed as complementary row-subset pairs so that after a query the client can discard the used hint and promote a backup into a fresh regular hint without re-downloading data.
- In Plinko's query phase, the client sends the server, in random order, two equal-sized sets of row-indexed points: one derived from a hint with the target cell's point removed, and one random decoy (junk) set. The server returns the XOR-sum of each set, and the client combines the correct one with its stored hint to recover the answer.
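The setup and query mechanics above can be illustrated with a heavily simplified sketch. Assumptions to note: toy parameters (an 8x8 grid), every hint covers all rows rather than roughly half, hint lookup is a linear scan instead of an invertible PRF, and the decoy set and backup hints are omitted, so this shows only the XOR-cancellation core, not the privacy machinery. Names like `prf_col` and `private_read` are illustrative, not from the paper.

```python
import hashlib
import secrets

R = 8  # grid side length; the database has N = R*R cells
GRID = [[secrets.randbits(32) for _ in range(R)] for _ in range(R)]

def prf_col(seed, row):
    """Pseudorandom column choice for (seed, row), modeled with a hash."""
    return hashlib.sha256(f"{seed}:{row}".encode()).digest()[0] % R

def make_hint(seed):
    """Setup: XOR one PRF-chosen cell from every row (toy: all rows)."""
    acc = 0
    for row in range(R):
        acc ^= GRID[row][prf_col(seed, row)]
    return acc

# Client streams the database once and stores only these XOR sums.
HINTS = {seed: make_hint(seed) for seed in range(128)}

def private_read(row, col):
    """Query: find a hint covering (row, col), ask the 'server' for the
    XOR of that hint's cells in all other rows, combine with the hint."""
    for seed, hint_xor in HINTS.items():
        if prf_col(seed, row) == col:      # this hint includes the target
            server_ans = 0
            for r in range(R):
                if r != row:
                    server_ans ^= GRID[r][prf_col(seed, r)]
            return hint_xor ^ server_ans    # all other cells cancel
    return None  # no covering hint; real Plinko sizes hints to avoid this

c = prf_col(0, 3)
assert private_read(3, c) == GRID[3][c]
```

The linear scan in `private_read` is exactly the "hash grinding" the notes say an invertible PRF avoids: with invertibility, the client could jump straight from (row, col) to the covering hint indices.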
Deployment Sizing And Bottlenecks
- For an Ethereum-like dataset of 10 billion 32-byte values arranged as a 100000x100000 grid, Plinko would require storing about 100000*128 ≈ 12.8 million hints (one 32-byte XOR sum each), approximately 390 MiB, plus additional backup hints.
- In the same Ethereum-scale example, naive per-query communication is about 215 kB (row partition about 12.2 kB plus column indices about 202.7 kB).
- In the same Ethereum-scale example, server-side work involves reading and XORing 100000 cells per query (about 3.05 MB of reads), and adding Merkle-branch queries would increase this to roughly 13.4 MB in theory.
- By making the Plinko grid rectangle unbalanced, one can trade about 2x more hint storage for about 2x lower per-query communication.
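The sizing figures above follow from straightforward arithmetic, sketched below under the stated assumptions (each hint stores one 32-byte XOR value; "MB" in the notes is read as MiB, i.e. 2^20 bytes):

```python
SIDE = 100_000            # sqrt(N): grid rows = columns
CELL = 32                 # bytes per database cell
NUM_HINTS = 128 * SIDE    # 12.8 million hints

# Client hint storage: one 32-byte XOR sum per hint.
hint_storage_mib = NUM_HINTS * CELL / 2**20
print(f"hint storage ~ {hint_storage_mib:.1f} MiB")      # ~390.6 MiB

# Server work per query: read and XOR one cell from each row.
server_read_mib = SIDE * CELL / 2**20
print(f"server reads ~ {server_read_mib:.2f} MiB/query")  # ~3.05 MiB
```

Backup hints and the per-query communication (about 215 kB in the notes) come on top of these figures.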
Comparative Tradeoffs And Research Frontier
- Eliminating the client-side O(N) preprocessing download is described as the hardest practical improvement target and is expected to require substantially heavier server-side computation beyond Plinko’s XOR-based workload.
- Efficiently doing Plinko preprocessing via FHE is described as an open problem.
- TreePIR reduces communication to logarithmic size using a puncturable PRF and additional cryptographic steps, but it increases server work and forces the client to grind through many hints due to losing Plinko’s invertible-PRF lookup advantage.
Problem Reframe Private Reads
- A major privacy gap not well addressed by ZK-SNARKs or traditional FHE is private reads, where users want to read from a database without the server learning what they read.
- Private Information Retrieval (PIR) protocols let a client retrieve D[i] from a server-held database D without the server learning the index i.
Watchlist
- Eliminating the client-side O(N) preprocessing download is described as the hardest practical improvement target and is expected to require substantially heavier server-side computation beyond Plinko’s XOR-based workload.
Unknowns
- What end-to-end benchmarks exist for Plinko at the stated scales (client setup time, query latency, server throughput, and resource costs) under clearly specified hardware and network assumptions?
- What is the precise threat model assumed (server capabilities, side channels, multi-query adaptivity, client compromise) and what security guarantees are proven for Plinko under that model?
- What concrete mechanism would eliminate the client-side O(N) preprocessing download while keeping server costs acceptable, and what are the best-known performance tradeoffs for that direction?
- How many backup hints are required in practice to support a target query volume without re-downloading, and how do backup-hint parameters affect storage, query latency, and privacy risk?
- For dynamic datasets, how does a client learn the correct old/new values (or deltas) needed to update hints, and what trust or integrity assumptions are required for that update stream?