At Reperio, we build fully supported, fully integrated enterprise virtualization platforms that deliver staggering performance per dollar—especially when you measure what actually governs modern applications: 4K random synchronous write IOPS. Our architecture pairs Proxmox VE with QEMU/VirtIO and LVM on Xinnor’s xiRAID, a software RAID engine designed for NVMe that drives performance close to raw device speed.

In Xinnor’s recent joint post with Reperio (an official Xinnor reseller in North America), a single 1U host with 8 NVMe drives sustained ~1.4–1.45 million 4K random synchronous write IOPS with ~36–38µs latency—a profile that mirrors the toughest real-world database and virtualization workloads. This is exactly the tier of performance hyperscalers market—only here it’s predictable, repeatable, and owned by you.

Why small, random, synchronous I/O is the differentiator

Almost everything inside an enterprise virtualization stack eventually bottlenecks on small, random writes: database WAL/redo logs, key-value stores, metadata updates, message queues, etc. That’s why platforms such as CockroachDB publish TPC-C guidance and testing paths that implicitly assume tens of thousands of IOPS per node in realistic deployments. These I/Os are the most expensive to produce at low latency, and very few vendors publish credible, repeatable 4K random synchronous write numbers. Xinnor’s case study fills that gap and shows what’s achievable on a single host, maximizing the IOPS available from enterprise NVMe SSDs.
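
If you want to reproduce this kind of measurement on your own hardware, the profile is straightforward to express with fio. Below is a minimal sketch (Python driving fio and parsing its JSON output); the target path and job count are placeholders you would point at a scratch file or expendable test device and tune to your host.

```python
import json
import subprocess

# 4K random synchronous writes: the workload profile discussed above.
# TARGET is a placeholder; point it at a scratch file or a test device
# whose contents you can afford to destroy.
TARGET = "/mnt/scratch/fio-testfile"

fio_cmd = [
    "fio",
    "--name=sync-4k-randwrite",
    f"--filename={TARGET}",
    "--rw=randwrite",        # random writes
    "--bs=4k",               # 4 KiB blocks
    "--ioengine=psync",      # synchronous I/O engine
    "--sync=1",              # O_SYNC: every write reaches stable storage
    "--direct=1",            # bypass the page cache
    "--numjobs=8",           # assumption: scale to your core count
    "--size=10G",
    "--runtime=60",
    "--time_based",
    "--group_reporting",
    "--output-format=json",
]

result = subprocess.run(fio_cmd, capture_output=True, text=True, check=True)
report = json.loads(result.stdout)
write_iops = report["jobs"][0]["write"]["iops"]
write_lat_us = report["jobs"][0]["write"]["clat_ns"]["mean"] / 1000
print(f"4K random sync write: {write_iops:,.0f} IOPS, mean latency {write_lat_us:.1f} µs")
```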

The architecture: Proxmox VE + QEMU/VirtIO + LVM + xiRAID

xiRAID exposes a local block device, so Proxmox treats it like any other LVM/LVM-thin storage target. That means your familiar Proxmox workflows—backups, snapshots (with LVM-thin), replication, and live migration—work as expected, using native tooling.
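
Because xiRAID presents an ordinary block device, wiring it into Proxmox is just the standard LVM-thin procedure. The sketch below (Python wrapping the usual CLI calls) illustrates the flow; the xiRAID device path, volume group, pool, and storage ID are placeholders, and in practice you would run the equivalent commands by hand or from your provisioning tooling.

```python
import subprocess

# Placeholders: substitute the block device your xiRAID array exposes
# and whatever VG/pool/storage names fit your conventions.
XIRAID_DEV = "/dev/xiraid_array"   # hypothetical path
VG = "vg_xiraid"
THINPOOL = "thin"
STORAGE_ID = "xiraid-thin"

def run(*cmd: str) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Standard LVM-thin setup on top of the xiRAID block device.
run("pvcreate", XIRAID_DEV)
run("vgcreate", VG, XIRAID_DEV)
run("lvcreate", "--type", "thin-pool", "-l", "100%FREE", "-n", THINPOOL, VG)

# Register it with Proxmox so VM disks can be placed on it.
run("pvesm", "add", "lvmthin", STORAGE_ID,
    "--vgname", VG, "--thinpool", THINPOOL, "--content", "images,rootdir")
```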

Predictability at the VM/disk level. On Proxmox, QEMU’s IOThread Virtqueue Mapping lets you assign multiple I/O threads to a single VirtIO-blk device, scaling per-disk throughput and helping reduce “noisy neighbor” contention. Proxmox’s core doesn’t expose this mapping in the UI today, but it can be configured manually; community threads and patches track ongoing integration.
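
For readers who want to experiment before the feature lands in the UI, the mapping is set through QEMU's iothread-vq-mapping property on the virtio-blk device (available in recent QEMU releases), which Proxmox can pass via a VM's args: line. The snippet below is a sketch that only generates the extra arguments; the iothread and drive IDs are assumptions that must match the drive ID in your VM's actual generated command line, so treat it as a starting point rather than a drop-in config.

```python
import json

def vq_mapping_args(num_iothreads: int, drive_id: str = "drive-virtio0") -> str:
    """Build QEMU arguments that spread one virtio-blk device's virtqueues
    across several IOThreads (QEMU's iothread-vq-mapping property).

    drive_id is an assumption: it must match the drive ID QEMU actually
    uses for the disk in your VM's generated command line.
    """
    iothread_ids = [f"iothread{i}" for i in range(num_iothreads)]

    # One -object per IOThread.
    objects = [f"-object iothread,id={tid}" for tid in iothread_ids]

    # iothread-vq-mapping is a list property, so the device must be
    # specified in QEMU's JSON syntax.
    device = {
        "driver": "virtio-blk-pci",
        "drive": drive_id,
        "iothread-vq-mapping": [{"iothread": tid} for tid in iothread_ids],
    }
    return " ".join(objects + ["-device", f"'{json.dumps(device)}'"])

# Example: four IOThreads feeding a single guest disk.
print(vq_mapping_args(4))
```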

In our lab (Xeon Gold 6530 class), a Windows Server guest on a single VirtIO disk achieved ~43k 4K sync write IOPS; enabling IOThread-VQ mapping with multiple queues pushed a single guest disk to roughly ~110k IOPS. These results align with the scaling described by the QEMU/Oracle/Red Hat guidance.

Reality check on capacity planning. A host capable of ~1.45M sync write IOPS can be carved into, say, 14–15 virtual disks at ~100k IOPS each, with headroom for burst and overhead—precisely the kind of per-VM allocation hyperscaler architects plan around.
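
Put concretely, the carving exercise is simple arithmetic. A sketch, using the host and per-disk figures from this post as assumptions:

```python
# Capacity-planning sketch using the figures discussed above (assumptions).
host_sync_write_iops = 1_450_000   # measured host ceiling (4K sync writes)
per_vm_disk_budget = 100_000       # target allocation per virtual disk

vm_disks = host_sync_write_iops // per_vm_disk_budget
headroom = host_sync_write_iops - vm_disks * per_vm_disk_budget

print(f"Virtual disks at {per_vm_disk_budget:,} IOPS each: {vm_disks}")
print(f"Remaining headroom for burst/overhead: {headroom:,} IOPS")
```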

Hyperscaler-class performance, without hyperscaler pricing

Hyperscalers expect you to provision five- and six-figure IOPS to hit published app benchmarks. For example, AWS EBS io2 charges per IOPS-month in tiers; provisioning 100,000 IOPS on one volume tallies about $4,704/month just for IOPS—before capacity, throughput, or data transfer.

By contrast, a similarly specified 1U NVMe host delivering ~1.45M 4K sync write IOPS is a one-time CapEx. Using a representative ~$25k build as a planning anchor, that works out to roughly $0.017 per IOPS, about 1.7 cents per 4K sync write IOPS, paid once. Even before you factor in multi-tenant density, it's hard to justify paying a blended ~$0.047 per IOPS every month for IOPS you can own outright.
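
For readers who want to check the math or rerun it with current prices, here is a sketch. The io2 tier rates are the ones implied by the $4,704/month figure above, and the ~$25k build is the planning anchor mentioned in this post; substitute your own numbers and confirm against AWS's current pricing page.

```python
# Cost sketch: EBS io2 provisioned IOPS (tiered, per month) versus a
# one-time NVMe host. Tier rates are the ones implied by the $4,704/month
# figure above; confirm against AWS's current pricing before relying on them.
def io2_iops_monthly_cost(iops: int) -> float:
    tiers = [
        (32_000, 0.065),          # first 32k IOPS
        (32_000, 0.046),          # next 32k IOPS
        (float("inf"), 0.032),    # everything above 64k IOPS
    ]
    cost, remaining = 0.0, iops
    for size, rate in tiers:
        used = min(remaining, size)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost

monthly = io2_iops_monthly_cost(100_000)
print(f"io2, 100k provisioned IOPS: ${monthly:,.0f}/month "
      f"(${monthly / 100_000:.3f} per IOPS-month)")

# One-time host: ~$25k build delivering ~1.45M 4K sync write IOPS.
host_cost, host_iops = 25_000, 1_450_000
print(f"NVMe host: ${host_cost / host_iops:.3f} per IOPS, one time")
```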

“But EBS is replicated”—and why that’s OK

Many modern data platforms (CockroachDB, Elasticsearch, CouchDB, etc.) replicate at the application layer. Paying for infrastructure-level replication and application-level replication doubles your write amplification and cost. Keep local RAID for node-level resilience, and let the database do what it was designed to do. For guests that can’t replicate (e.g., legacy SQL Server), we can layer LINSTOR/DRBD atop xiRAID to supply replicated block volumes—giving you EBS-like semantics only where you need them.

Why this matters for real apps

Cockroach Labs’ guidance and test harness for TPC-C “small” runs assume substantial per-node I/O capability; in our own verification on AWS c5d.4xlarge (local NVMe), we measured ~75k 4K sync write IOPS with fio, and the corresponding TPC-C run produced a tpmC score in line with CockroachDB’s reference run. If you want to self-host those same apps on-prem or in colo, your virtualization stack needs to match or exceed that per-node IOPS, and do so predictably.
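
If you want to repeat that kind of verification against your own cluster, the harness Cockroach Labs documents is the built-in tpcc workload. A minimal sketch follows; the warehouse count, duration, and connection string are placeholders you would set to match the reference run you are comparing against.

```python
import subprocess

# Placeholders: point CONN at one of your CockroachDB nodes and pick the
# warehouse count from the reference run you are comparing against.
CONN = "postgresql://root@cockroach-node1:26257?sslmode=disable"
WAREHOUSES = "10"

# Load the TPC-C dataset, then drive it; tpmC is reported at the end.
subprocess.run(
    ["cockroach", "workload", "init", "tpcc", f"--warehouses={WAREHOUSES}", CONN],
    check=True,
)
subprocess.run(
    ["cockroach", "workload", "run", "tpcc",
     f"--warehouses={WAREHOUSES}", "--duration=30m", CONN],
    check=True,
)
```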

Fully supported, end-to-end

This is not a science-project build. Reperio delivers a turnkey, supported platform—design, installation, monitoring, training, backups, DR patterns, security hardening, networking, and hardware warranty—on your site or at the edge. Proxmox tooling remains native (VM lifecycle, backups, replication, migration), xiRAID provides the raw I/O engine, and our playbooks give you per-VM allocation you can count on.

Ready to see it with your workload?

If you’re evaluating the move from hyperscalers—or you’re scaling beyond them—let’s map your database and service mix to a predictable, measurable IOPS plan. We’ll size hosts, set per-VM I/O budgets (with IOThread-VQ mapping where appropriate), and prove it with your tests. Talk to Reperio to blueprint a cluster that delivers hyperscaler-class performance per dollar—with your data, on your terms.
