DRBD 9 over RDMA with Micron SSDs

We have been testing out some 240GB Micron M500DC SSDs with DRBD 9 and DRBD’s RDMA transport layer.  Micron, based in Boise, Idaho, is a leader in NAND flash production and storage.  We found that their M500DC SSDs are write-optimized for data center use cases and in some cases exceeded the expected performance.

For those who are just joining us, leveraging RDMA as a transport protocol is relatively new to DRBD and is only possible with DRBD 9. You can find some background on RDMA and how DRBD benefits from it in one of our past blog posts, “What is RDMA, and why should we care?”. Also, check out our technical guide on benchmarking DRBD 9 on Ultrastar SN150 NVMe SSDs if you are interested in seeing some of the numbers we were able to achieve with DRBD 9.0.1-1 and RDMA on very fast storage.

Back to the matter at hand.

In our test environment we used two 240GB Micron M500DC SSDs in RAID0 in each of our two nodes. We connected the two peers using InfiniBand ConnectX-4 10GbE adapters. We then ran a series of tests comparing the performance of DRBD disconnected (not replicating), DRBD connected using TCP over InfiniBand, and DRBD connected using RDMA over InfiniBand, all against the performance of the backing disks without DRBD.
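Switching between the two transports is a one-line change in the resource configuration. Below is a minimal sketch of what such a resource might look like; the resource name, hostnames, device paths, and addresses are placeholders, not the actual configuration used in our tests:

    resource r0 {
      net {
        # "tcp" is the default; "rdma" requires DRBD's RDMA transport module
        transport rdma;
      }
      on nodea {
        device    /dev/drbd0;
        disk      /dev/md0;        # RAID0 of the two Micron SSDs
        address   10.0.0.1:7788;
        meta-disk internal;
      }
      on nodeb {
        device    /dev/drbd0;
        disk      /dev/md0;
        address   10.0.0.2:7788;
        meta-disk internal;
      }
    }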

For testing random read/write IOPS we used fio with a 4K blocksize and 16 parallel jobs. For testing sequential writes we used dd with 4M blocks. Both tests used the appropriate flag for direct IO in order to remove any caching that might skew the results.
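We did not publish the exact job files, but invocations along these lines reproduce the parameters described above; the device path, IO engine, and runtime are assumptions for illustration:

    # Random read IOPS: 4K blocks, 16 parallel jobs, direct IO to bypass the page cache
    fio --name=randread --filename=/dev/drbd0 --rw=randread --bs=4k \
        --numjobs=16 --direct=1 --ioengine=libaio \
        --runtime=60 --time_based --group_reporting

    # Random write IOPS: same parameters, write pattern
    fio --name=randwrite --filename=/dev/drbd0 --rw=randwrite --bs=4k \
        --numjobs=16 --direct=1 --ioengine=libaio \
        --runtime=60 --time_based --group_reporting

    # Sequential writes: 4M blocks, direct IO
    dd if=/dev/zero of=/dev/drbd0 bs=4M count=1024 oflag=direct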

We also leveraged DRBD’s “when-congested-remote” read-balancing option to pull reads from the peer if the IO subsystem on the Primary node is congested. We will see that this produces a dramatic increase in random read performance, especially when combined with RDMA’s extremely low latency.
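Read-balancing is set in the disk section of the resource configuration; a minimal fragment, again using the placeholder resource name r0:

    resource r0 {
      disk {
        # Serve reads from the peer when the local IO subsystem is congested
        read-balancing when-congested-remote;
      }
    }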

Here are the results from our Random Read/Write IOPS testing:
[Chart: random read/write IOPS results]

[Graph: random read/write IOPS results]

As you can see from the numbers and graphs, we achieve huge gains in read performance when using DRBD with read-balancing: roughly a 26% increase when using TCP and a 62% increase with RDMA.

We also see that using the RDMA transport protocol results in less than 1% overhead when synchronously replicating writes to our DRBD device; that’s pretty sweet. :)

Sequential reads cannot benefit from DRBD’s read-balancing at all, and large sequential writes are going to be heavily segmented by the TCP stack, so our numbers for sequential writes better represent the impact a transport protocol has on synchronous replication.

Here are the results from our Sequential Write testing:

[Chart: sequential write results]

[Graph: sequential write results]

Looking at the graph it’s easy to see that RDMA is the transport mode of choice if your IO patterns are sequential. With TCP we see ~19.1% overhead, while RDMA results in ~1.1% overhead.

Persistent and Replicated Docker Volumes with DRBD9 and DRBD Manage

Nowadays, Docker has support for plugins; for LINBIT, volume plugins are certainly the most interesting feature. Volume plugins open the way to storing content that would normally reside in ordinary Docker volumes on DRBD-backed storage.

In this blog post we show a simple example of using our new Docker volume plugin to create a WordPress-powered blog with a MariaDB database, where both the content of the blog and the database are replicated between two cluster nodes. Continue reading
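As a taste of what the full post walks through, volumes are created and consumed through Docker’s standard volume interface; the sketch below assumes the plugin registers a volume driver named “drbdmanage” and accepts a size option (both are assumptions here, not confirmed syntax):

    # Create a replicated volume through the plugin (driver name and size option assumed)
    docker volume create -d drbdmanage --name dbdata -o size=1G

    # Use it like any other Docker volume
    docker run -d --name mariadb -v dbdata:/var/lib/mysql \
        -e MYSQL_ROOT_PASSWORD=secret mariadb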

Having Fun with the DRBD Manage Control Volume

As you might know, DRBD Manage is a tool used in the DRBD9 stack to manage (create, remove, snapshot) DRBD resources in a multi-node DRBD cluster. DRBD Manage stores the cluster information in the so-called Control Volume. The control volume is itself a DRBD9 resource, which is replicated across the whole cluster. This means that the control volume is just a block device, like all the regular DRBD resources. Continue reading
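In the meantime, you can see for yourself that it behaves like any other resource; a quick look, assuming drbdmanage’s default control volume resource name “.drbdctrl”:

    # The control volume shows up like a regular DRBD 9 resource
    drbdadm status .drbdctrl

    # drbdmanage reads the cluster configuration it stores there
    drbdmanage list-nodes
    drbdmanage list-resources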

Testing SSD Drives with DRBD: SanDisk Optimus Ascend

This week we continue our SSD testing series with the SanDisk Optimus Ascend 2.5" 800GB SAS drives.

Background:
SanDisk Corporation designs, develops, and manufactures flash memory storage solutions. LINBIT is known for developing DRBD (Distributed Replicated Block Device), the backbone of Linux High Availability software. LINBIT tested how quickly data can be synchronously replicated from a SanDisk 800 GB SSD in server A to an identical SSD located in server B. Disaster Recovery replication to an off-site server was also investigated using the same hardware.
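The two scenarios map onto DRBD’s replication protocols; a rough sketch of the relevant net settings (the resource names are placeholders, and a real long-distance link would usually add DRBD Proxy in between):

    # Local HA pair: fully synchronous, a write completes only once both nodes have it
    resource ssd-ha {
      net { protocol C; }
    }

    # Off-site Disaster Recovery link: asynchronous, writes complete locally
    # and are shipped to the remote site in the background
    resource ssd-dr {
      net { protocol A; }
    }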

Continue reading

Testing SSD Drives with DRBD: Intel DC S3700 Series

Over the next few weeks we’ll be posting results from tests we’ve run against various manufacturers’ SSD drives, including Intel, SanDisk, and Micron, to name a few.

The first post in this series goes over our findings for the Intel DC S3700 Series 800GB SATA SSD drives. Continue reading