GPU-accelerated quantum entropy distillation with persistent streaming delivery for post-quantum cryptographic systems. No network round trip. No latency floor.
NIST's 2024 finalization of FIPS 203 (ML-KEM) and FIPS 204 (ML-DSA) mandates post-quantum cryptography across federal systems. ML-KEM-768 requires 96 bytes of certified entropy per session — three times the 32 bytes needed by ECDH.
At 100,000 sessions per second, a realistic load for a large financial institution, entropy demand triples overnight, from about 25.6 Mbps (32 bytes per session) to about 76.8 Mbps (96 bytes per session). Existing entropy services cannot support this migration.
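The demand figure is a product of session rate and bytes per session. A minimal sketch of that arithmetic, assuming the 32-byte and 96-byte per-session figures quoted above and 1 Mbps = 10^6 bits per second (the function and constant names are illustrative only):

```python
def entropy_mbps(sessions_per_second: int, bytes_per_session: int) -> float:
    """Required entropy throughput in megabits per second (1 Mbps = 10**6 bit/s)."""
    return sessions_per_second * bytes_per_session * 8 / 1_000_000


SESSIONS_PER_SECOND = 100_000
ECDH_BYTES = 32        # per-session figure quoted in the text
ML_KEM_768_BYTES = 96  # per-session figure quoted in the text

ecdh = entropy_mbps(SESSIONS_PER_SECOND, ECDH_BYTES)
ml_kem = entropy_mbps(SESSIONS_PER_SECOND, ML_KEM_768_BYTES)
print(f"ECDH: {ecdh:.1f} Mbps, ML-KEM-768: {ml_kem:.1f} Mbps, ratio: {ml_kem / ecdh:.1f}x")
```

Under these assumptions the ECDH baseline is 25.6 Mbps and the ML-KEM-768 load is 76.8 Mbps, a threefold increase that matches the 32-to-96-byte ratio.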
If the server continuously pushes entropy into a client-resident buffer before the application needs it, the application never waits for the network.
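The idea can be illustrated with a small prefetching buffer. This is a sketch under stated assumptions, not the project's SDK: `EntropyBuffer`, `fetch_chunk`, and `simulated_stream` are hypothetical names, and `os.urandom` with a 1 ms sleep merely stands in for a remote entropy stream. A background thread keeps the buffer topped up, so `read()` touches only local memory unless the buffer has run dry.

```python
import os
import threading
import time


class EntropyBuffer:
    """Client-resident buffer kept full by a background thread (illustrative only)."""

    def __init__(self, fetch_chunk, capacity_bytes=4096, chunk_bytes=256):
        self._fetch_chunk = fetch_chunk  # hypothetical stand-in for the streaming source
        self._capacity = capacity_bytes
        self._chunk = chunk_bytes
        self._buf = bytearray()
        self._cond = threading.Condition()
        self._stop = False
        self._thread = threading.Thread(target=self._fill, daemon=True)
        self._thread.start()

    def _fill(self):
        while True:
            with self._cond:
                # Sleep while another chunk would overflow the buffer.
                while not self._stop and len(self._buf) + self._chunk > self._capacity:
                    self._cond.wait()
                if self._stop:
                    return
            data = self._fetch_chunk(self._chunk)  # slow "network" work, off the hot path
            with self._cond:
                self._buf.extend(data)
                self._cond.notify_all()

    def read(self, n):
        """Return n bytes from local memory; blocks only if the buffer is dry."""
        if n > self._capacity - self._chunk:
            raise ValueError("request larger than the guaranteed fill level")
        with self._cond:
            while len(self._buf) < n:
                self._cond.wait()
            out = bytes(self._buf[:n])
            del self._buf[:n]  # consumed bytes are never handed out twice
            self._cond.notify_all()
            return out

    def close(self):
        with self._cond:
            self._stop = True
            self._cond.notify_all()
        self._thread.join()


def simulated_stream(n):
    """Placeholder source: 1 ms of fake network delay, then OS randomness."""
    time.sleep(0.001)
    return os.urandom(n)


buf = EntropyBuffer(simulated_stream)
seed = buf.read(96)  # e.g. the 96 bytes one ML-KEM-768 session needs
buf.close()
```

The design choice is that the slow fetch happens outside the lock and ahead of demand, so the application's cost per read is a local copy, not a round trip. A production client would also need to handle stream failure and shutdown while readers are blocked, which this sketch omits.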
All benchmarks were run on a Lambda Labs H100 80GB HBM3 instance. Code and raw results are publicly available.
| System | Application Latency | Notes |
|---|---|---|
| Qrypt REST API (cross-region) | ~50 ms | Typical enterprise, published cloud RTT |
| Qrypt REST API (same region) | ~1 ms | Best case — still 280× slower |
| ID Quantique Quantis PCIe | ~15 µs | Local PCIe driver call, hardware only |
| Thomes Quantum — streaming + buffer | 3.58 µs | Measured on H100, Python prototype |
Thomes Quantum was founded to solve the entropy infrastructure gap that post-quantum cryptography migration creates. The system is built on GPU-native distillation, validated against NIST SP 800-90B, and designed from the ground up for the latency and sovereignty requirements of the defense and financial sectors.
The full software prototype — CUDA kernels, gRPC streaming server, client SDK, and benchmark suite — is publicly available for independent verification.
github.com/rg-2006/qrng-distillation-paper
Reach out to discuss Tier 2 enterprise streaming pilots, Tier 3 sovereign deployments, research collaborations, or investment inquiries.