Shipping Constraint Drove MVP Encryption Scheme (LRTQ) Despite Limitations
Sources: 1 • Confidence: Medium • Updated: 2026-04-06 03:44
Key takeaways
- Oxide racks use U.2 drives for encrypted ZFS datasets and internal M.2 drives for sled-local metadata and installable software.
- The interim channel-program solution was merged and tested in simulated multi-sled environments and on racklets, including upgrades from LRTQ to Trust Quorum and sled add/remove scenarios.
- Encryption keys cannot be rotated atomically across all drives, so Oxide needed per-volume metadata recording which key epoch encrypted each volume; however, ZFS 'change key' could not set that property atomically with the key change.
- Oxide initially explored SPDM (and began implementing it in Rust) but later discarded that approach.
- Oxide’s rack control plane requires a mechanism for establishing mutual trust among sleds within a rack.
Sections
Shipping Constraint Drove MVP Encryption Scheme (LRTQ) Despite Limitations
- Oxide racks use U.2 drives for encrypted ZFS datasets and internal M.2 drives for sled-local metadata and installable software.
- Trust Quorum uses Shamir secret sharing: key shares stored on sled-local M.2 drives are exchanged among sleds to reconstruct a rack secret, from which disk encryption keys are derived.
- In LRTQ, the initial key-share distribution runs over an internal rack bootstrap network that is not exposed outside the rack; intercepting shares would require an active physical or network attack, which is outside the stated threat model.
- LRTQ pre-generated up to 255 secret shares and stored the extra shares, encrypted under the rack secret, across the initial sled set, so that sleds could be added later while the rack secret stayed fixed.
- A primary driver for Trust Quorum is encrypting storage at rest while still allowing a rack to reboot autonomously after an outage, without on-site password entry.
- To ship without full Trust Quorum, Oxide implemented a 'low-rent Trust Quorum' (LRTQ) that generates a rack secret at setup time and distributes key shares over plain TCP; it lacks remote attestation, key rotation, and sled expunging.
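
The secret-sharing idea in the bullets above can be sketched as follows. This is a minimal illustration, not Oxide's implementation: it assumes byte-wise Shamir sharing over GF(2^8) with the AES reduction polynomial, which is one common construction whose 255 nonzero x-coordinates match a 255-share cap. The source does not state which field is used, and the names `split` and `interpolate` are hypothetical.

```python
import secrets


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse computed as a^254; a must be nonzero."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def eval_poly(coeffs: list[int], x: int) -> int:
    """Horner evaluation; coeffs[0] is the constant term (the secret byte)."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf_mul(acc, x) ^ c
    return acc


def split(secret: bytes, threshold: int, n: int) -> dict[int, bytes]:
    """Return n shares keyed by x-coordinate 1..n; any `threshold` recover the secret."""
    if not (1 <= threshold <= n <= 255):
        raise ValueError("need 1 <= threshold <= n <= 255 (x = 0 is the secret)")
    # One random polynomial of degree threshold-1 per secret byte.
    polys = [[b] + [secrets.randbelow(256) for _ in range(threshold - 1)]
             for b in secret]
    return {x: bytes(eval_poly(p, x) for p in polys) for x in range(1, n + 1)}


def interpolate(shares: dict[int, bytes], at: int) -> bytes:
    """Lagrange-interpolate the shared polynomials and evaluate them at x = `at`."""
    xs = list(shares)
    out = bytearray(len(shares[xs[0]]))
    for xi in xs:
        num, den = 1, 1
        for xj in xs:
            if xj != xi:
                num = gf_mul(num, at ^ xj)   # subtraction in GF(2^8) is XOR
                den = gf_mul(den, xi ^ xj)
        weight = gf_mul(num, gf_inv(den))
        for k, byte in enumerate(shares[xi]):
            out[k] ^= gf_mul(weight, byte)
    return bytes(out)
```

Calling `interpolate(subset, 0)` on any `threshold` shares yields the secret. Evaluating the same interpolation at another participant's x-coordinate yields that participant's share, which is the mathematical basis for rebuilding a lost share from the others.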
Real Trust Quorum: Fault-Tolerant Rotation, Self-Healing Shares, And Rolling-Upgrade Compatibility
- The interim channel-program solution was merged and tested in simulated multi-sled environments and on racklets, including upgrades from LRTQ to Trust Quorum and sled add/remove scenarios.
- Oxide’s secret sharing implementation can reconstruct not only the rack secret at x=0 but also any participant’s share value at its x-coordinate, allowing reconstitution of a lost share from other shares.
- Real Trust Quorum keeps a trusted-dealer initial distribution but uses secure channels and is designed to tolerate asynchrony and partial failures during initial distribution and key rotation.
- Trust Quorum was implemented as a side-effect-free Sans-I/O state machine, enabling fuzz/property testing and a step-through simulator/debugger.
- Real Trust Quorum supports asynchronous key rotation where an offline sled can later learn its latest share and decrypt prior-epoch secrets to rotate itself forward without global atomic rotation.
- A rack upgrade requires mixed-mode operation where some sleds run new Trust Quorum while not-yet-upgraded sleds still communicate using LRTQ.
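
One way to picture the asynchronous rotation described above is a chain in which each new epoch's rack secret wraps the previous epoch's secret. The sketch below is an assumption about how such a chain could work, not a description of Oxide's protocol. The `wrap`/`unwrap` pair is a toy stand-in for a real AEAD (an XOR pad plus an HMAC tag, valid only for 32-byte secrets), and `Rack` and `catch_up` are hypothetical names.

```python
import hashlib
import hmac
import secrets


def _subkey(secret: bytes, label: bytes) -> bytes:
    """Derive a labeled 32-byte subkey from a rack secret."""
    return hmac.new(secret, label, hashlib.sha256).digest()


def wrap(newer: bytes, older: bytes) -> bytes:
    """Toy stand-in for an AEAD: encrypt `older` under `newer` (32-byte secrets only)."""
    if len(older) != 32:
        raise ValueError("this toy wrapper only handles 32-byte secrets")
    pad = _subkey(newer, b"wrap-pad")
    body = bytes(a ^ b for a, b in zip(older, pad))
    tag = hmac.new(_subkey(newer, b"wrap-mac"), body, hashlib.sha256).digest()
    return body + tag


def unwrap(newer: bytes, blob: bytes) -> bytes:
    """Verify the tag, then recover the older secret."""
    body, tag = blob[:32], blob[32:]
    expected = hmac.new(_subkey(newer, b"wrap-mac"), body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted blob")
    pad = _subkey(newer, b"wrap-pad")
    return bytes(a ^ b for a, b in zip(body, pad))


class Rack:
    """Hypothetical model of the rack-wide state that rotation produces."""

    def __init__(self) -> None:
        self.epoch = 1
        self.secret = secrets.token_bytes(32)
        self.wrapped: dict[int, bytes] = {}  # epoch -> that epoch's secret, wrapped under epoch + 1

    def rotate(self) -> None:
        new_secret = secrets.token_bytes(32)
        self.wrapped[self.epoch] = wrap(new_secret, self.secret)
        self.epoch += 1
        self.secret = new_secret


def catch_up(latest_epoch: int, latest_secret: bytes,
             wrapped: dict[int, bytes], disk_epoch: int) -> bytes:
    """Walk the chain backwards from the latest secret to the one a stale disk needs."""
    secret = latest_secret
    for epoch in range(latest_epoch - 1, disk_epoch - 1, -1):
        secret = unwrap(secret, wrapped[epoch])
    return secret
```

Under this model, a sled that was offline for several rotations needs only the latest rack secret and the stored chain: it recovers the secret for the epoch its disks still use, unlocks them, and then re-keys them to the current epoch on its own schedule.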
ZFS Key-Rotation Atomicity Was A Critical-Path Integration Bottleneck And Drove An Interim In-Kernel Lua Solution
- The interim channel-program solution was merged and tested in simulated multi-sled environments and on racklets, including upgrades from LRTQ to Trust Quorum and sled add/remove scenarios.
- Encryption keys cannot be rotated atomically across all drives, so Oxide needed per-volume metadata recording which key epoch encrypted each volume; however, ZFS 'change key' could not set that property atomically with the key change.
- Oxide implemented a fallback recovery path that trial-decrypts with each possible key and logs a loud warning if the recorded epoch turns out to be inconsistent; the cost of this fallback grows with the number of rotations.
- Oxide used ZFS channel programs (Lua in-kernel scripts) to perform key change and property updates within a single transaction group without modifying ZFS itself.
- ZFS channel programs are not automatically rolled back on script error, so the Lua implementation required an application-level rollback protocol to prevent partially committed states.
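
The two mechanisms in this section, explicit rollback around a two-step update and a trial-decrypt fallback, can be modeled in a few lines. This is a pure in-memory simulation under stated assumptions: `FakeDataset`, `rotate`, and `unlock` are hypothetical names, no ZFS or channel-program API is called, and comparing key bytes stands in for an actual decryption attempt.

```python
class FakeDataset:
    """In-memory stand-in for an encrypted dataset (not a ZFS API)."""

    def __init__(self, key: bytes, epoch: int) -> None:
        self.wrapping_key = key
        self.epoch_prop = epoch
        self.fail_next_prop_update = False  # hook to simulate a mid-script error

    def change_key(self, new_key: bytes) -> None:
        self.wrapping_key = new_key

    def set_epoch(self, epoch: int) -> None:
        if self.fail_next_prop_update:
            self.fail_next_prop_update = False
            raise RuntimeError("simulated property update failure")
        self.epoch_prop = epoch


def rotate(ds: FakeDataset, new_key: bytes, new_epoch: int) -> None:
    """Change key and epoch together; undo by hand if the second step fails."""
    old_key, old_epoch = ds.wrapping_key, ds.epoch_prop
    try:
        ds.change_key(new_key)
        ds.set_epoch(new_epoch)
    except Exception:
        # Nothing rolls back automatically, so restore the prior state explicitly.
        ds.change_key(old_key)
        ds.epoch_prop = old_epoch
        raise


def unlock(ds: FakeDataset, keys_by_epoch: dict[int, bytes],
           warnings: list[str]) -> tuple[int, int]:
    """Try the recorded epoch first, then every other epoch, newest first.

    Returns (epoch that worked, number of attempts made).
    """
    recorded = ds.epoch_prop
    others = sorted((e for e in keys_by_epoch if e != recorded), reverse=True)
    for attempts, epoch in enumerate([recorded] + others, start=1):
        if keys_by_epoch.get(epoch) == ds.wrapping_key:  # stand-in for a decrypt attempt
            if epoch != recorded:
                warnings.append(
                    f"WARNING: dataset recorded epoch {recorded} "
                    f"but unlocked with epoch {epoch}"
                )
            return epoch, attempts
    raise KeyError("no known key unlocks this dataset")
```

The attempt count returned by `unlock` makes the cost concern concrete: in the worst case the fallback tries one key per epoch, so it scales with the number of rotations performed.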
Protocol Architecture Pivot: SPDM To TLS Plus Gated Attestation
- Oxide initially explored SPDM (and began implementing it in Rust) but later discarded that approach.
- Oxide moved to using standard TLS for authenticated encrypted channels and then performed remote attestation as a second phase before releasing the channel to higher-level agents.
- Oxide’s approach shifted from standards (SPDM) to a custom protocol effort and then back to standards by using TLS plus a standardized attestation format.
- Oxide uses a CoRIM-based manifest format to represent acceptable software measurements for remote attestation verification.
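
The "attest before release" gating described above can be sketched as a wrapper that refuses to hand the channel to higher-level code until the peer's measurements match a reference set. This is a simplified model under stated assumptions: `GatedChannel` is a hypothetical name, the reference set of digests stands in for a parsed CoRIM manifest, and the TLS handshake, signature checks on the evidence, and freshness nonces are all omitted.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class GatedChannel:
    """Hypothetical wrapper around an already-established secure channel."""

    peer_id: str
    attested: bool = False

    def attest(self, measurements: list[bytes], reference: set[bytes]) -> bool:
        """Phase two: accept only if every reported measurement is in the reference set."""
        self.attested = bool(measurements) and all(m in reference for m in measurements)
        return self.attested

    def release(self) -> str:
        """Hand the channel to higher-level agents, but only after attestation passed."""
        if not self.attested:
            raise PermissionError(f"peer {self.peer_id} has not passed attestation")
        return f"channel:{self.peer_id}"


def digest(image: bytes) -> bytes:
    """Measurement stand-in: SHA-256 of a software image."""
    return hashlib.sha256(image).digest()
```

Keeping attestation as a separate phase behind a single `release` gate means higher-level agents never see a channel whose peer has not been verified, regardless of how the underlying transport was established.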
Rack-Scale Mutual Trust Anchored In Hardware Identity
- Oxide’s rack control plane requires a mechanism for establishing mutual trust among sleds within a rack.
- Each sled includes a root-of-trust chip intended to store non-extractable secrets and provide unique device identity via a PUF.
- During manufacturing, the root of trust generates an RSA-4096 keypair, exports a CSR to be signed by an Oxide intermediate signing service, and stores a certificate binding the baseboard serial number to the public key while the private key stays on-chip.
Watchlist
- A remaining security gap is that secret shares stored on sled M.2 drives are currently unencrypted, enabling reconstruction of the rack secret if an attacker steals all the M.2 drives.
Unknowns
- What is the current deployment status of Real Trust Quorum in customer racks (percentage of fleet upgraded, and whether mixed-mode operation is occurring in production)?
- What are the concrete Trust Quorum parameters (threshold t, total shares n, and how these map to sled count and availability targets) in shipping configurations?
- What remote attestation evidence is actually checked (measurements, signing roots, policy rules), and what are the failure/override behaviors when attestation fails?
- How is the 'trusted dealer' implemented operationally, and what controls prevent it from being a single point of compromise during initial distribution and rotations?
- Has the ZFS atomicity issue been resolved beyond the interim Lua channel-program mechanism, and what is the long-term maintenance plan for this integration point?