<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>LINBIT Blogs</title>
	<atom:link href="http://blogs.linbit.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.linbit.com</link>
	<description>Tips and Tricks, Hints and Solutions around Linux High-Availability</description>
	<lastBuildDate>Tue, 04 Jun 2013 16:15:08 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.2</generator>
		<item>
		<title>&#8220;umount is too slow&#8221;</title>
		<link>http://blogs.linbit.com/p/548/umount-takes-time/</link>
		<comments>http://blogs.linbit.com/p/548/umount-takes-time/#comments</comments>
		<pubDate>Mon, 27 May 2013 07:31:35 +0000</pubDate>
		<dc:creator>flip</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[fast]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[slow]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[sync]]></category>
		<category><![CDATA[write]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=548</guid>
		<description><![CDATA[A question we see over and over again is Why is umount so slow? Why does it take so long? Part of the answer was already given in an earlier blog post; here&#8217;s some more explanation. The write() syscall typically &#8230; <a href="http://blogs.linbit.com/p/548/umount-takes-time/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A question we see over and over again is</p>
<blockquote><p>Why is <code>umount</code> so slow? Why does it take so long?</p></blockquote>
<p>Part of the answer was already given in <a title="Make the kernel start write-out earlier" href="http://blogs.linbit.com/p/33/kernel-dirty_ratio/" target="_blank">an earlier blog post</a>; here&#8217;s some more explanation.<span id="more-548"></span></p>
<p>The <code>write()</code> syscall typically writes into RAM only. In Linux we call that &#8220;<em>page cache</em>&#8221; or &#8220;<em>buffer cache</em>&#8221;, depending on what exactly the actual target of the <code>write()</code> system call was.</p>
<p>From that RAM (a cache inside the operating system, high in the IO stack) the operating system periodically writes data out, at its leisure, unless it is urged to write out particular pieces (or all of it) <em>now</em>.</p>
<p>A <code>sync</code> (or <code>fsync()</code>, or <code>fdatasync()</code>, or &#8230;) does exactly that: it urges the operating system to do the write out.<br />
A <code>umount</code> also causes a write out of all not yet written data of the affected file system.</p>
<p class="no-margin"><strong>Note:</strong></p>
<ul style="list-style-type: none;">
<li>Of course the &#8220;performance&#8221; of writes that go into volatile RAM only will be much better than anything that goes to stable, persistent storage. Everything that has only been written to cache but not yet <em>synced</em> (written out to the block layer) will be lost if you have a power outage or server crash.<br />
<strong>The Linux block layer has never seen these changes, and neither has DRBD, so they cannot possibly have been replicated anywhere.<br />
Data will be lost.</strong></li>
</ul>
<p>There are also controller caches which may or may not be volatile, and disk caches, which typically are volatile. These are <strong>below and outside the operating system</strong>, and not part of this discussion. Just make sure you disable all volatile caches on that level.</p>
<p class="no-margin">Now, for a moment, assume</p>
<ul class="no-margin">
<li>you don&#8217;t have DRBD in the stack, and</li>
<li>a moderately capable IO backend that writes, say, 300 MByte/s, and</li>
<li>around 3 GiByte of dirty data at the time you trigger the umount, and</li>
<li>you are not seek-bound, so your backend can actually reach that 300 MB/s,</li>
</ul>
<p>you get a umount time of around 10 seconds.</p>
<hr width="10%" />
<p class="no-margin">Still with me?</p>
<p>Ok. Now, introduce DRBD to your IO stack, and add a long distance replication link. For the sake of this explanation, assume that because it is long distance and you have a limited budget, you can only afford 100 MBit/s. And &#8220;long distance&#8221; implies larger round trip times, so let&#8217;s assume we have a <a title="RTT" href="http://en.wikipedia.org/wiki/Round-trip_delay_time" target="_blank">RTT</a> of 100 ms.</p>
<p>Of course that would introduce a single IO request latency of &gt; 100 ms for anything but DRBD protocol A, so you opt for protocol A. (In other words, using protocol A &#8220;masks&#8221; the RTT of the replication link from the application-visible latency.)</p>
<p>That was <em>latency</em>.</p>
<p>But the limited <em>bandwidth</em> of that replication link also limits your average sustained write throughput, in the given example to about 11 MiByte/s.<br />
The same 3 GiByte of dirty data would now drain much more slowly: <strong>that same <code>umount</code> would now take not 10 seconds, but about 5 minutes</strong>.</p>
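<p>The arithmetic behind those two numbers can be sketched as follows. This is only a back-of-the-envelope illustration: the function name <code>drain_seconds</code> is made up for this example, the 300 MByte/s are assumed to be decimal megabytes, and real write-out is never perfectly sustained.</p>

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def drain_seconds(dirty_bytes, throughput_bytes_per_s):
    """Time needed to write out all dirty data at a sustained throughput."""
    return dirty_bytes / throughput_bytes_per_s


dirty = 3 * GIB  # dirty data in the page cache when umount is triggered

# Local backend only: 300 MByte/s (decimal megabytes assumed).
local = drain_seconds(dirty, 300 * 10 ** 6)

# Behind the replication link: about 11 MiByte/s sustained.
replicated = drain_seconds(dirty, 11 * MIB)

print(f"local backend:    {local:.0f} s")
print(f"replication link: {replicated:.0f} s ({replicated / 60:.1f} min)")
```

<p>The point is the ratio, not the exact values: the same amount of dirty data takes roughly 25 times longer to drain once the replication link becomes the bottleneck.</p>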
<p>You can also take a look at a <a title="Throughput is not latency" href="http://lists.linbit.com/pipermail/drbd-user/2012-September/019080.html" target="_blank">drbd-user mailing list post</a>.</p>
<hr width="10%" />
<p>So, concluding: try to avoid having much unsaved data in RAM; it might bite you. For example, you want your cluster to do a switchover, but the umount takes too long and a timeout hits: the node (should) gets fenced, and the data not written to stable storage will be lost.</p>
<p><strong>Please follow the advice about <a title="Make the kernel start write-out earlier" href="http://blogs.linbit.com/p/33/kernel-dirty_ratio/" target="_blank">setting some sysctls</a> to start write-out earlier!</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/548/umount-takes-time/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DRBD 8.4.3: faster than ever</title>
		<link>http://blogs.linbit.com/p/469/843-random-writes-faster/</link>
		<comments>http://blogs.linbit.com/p/469/843-random-writes-faster/#comments</comments>
		<pubDate>Fri, 22 Feb 2013 08:17:25 +0000</pubDate>
		<dc:creator>flip</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[activity log]]></category>
		<category><![CDATA[fast]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[sync]]></category>
		<category><![CDATA[write]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=469</guid>
		<description><![CDATA[For the people who don&#8217;t already have DRBD 8.4.3 deployed: here&#8217;s another good reason — Performance. As you know DRBD marks the to-be-changed disk areas in the Activity Log. Until now that meant that for random-write workloads a DRBD speed &#8230; <a href="http://blogs.linbit.com/p/469/843-random-writes-faster/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>For the people who don&#8217;t already have DRBD 8.4.3 deployed: here&#8217;s another good reason — Performance.<span id="more-469"></span></p>
<p>As you know DRBD marks the to-be-changed disk areas in the <a title="Activity Log in the Users Guide" href="http://www.drbd.org/users-guide/s-activity-log.html" target="_blank">Activity Log</a>.</p>
<p>Until now, that meant a DRBD speed penalty of up to 50% for random-write workloads, i.e. each application-issued write request translated to two write requests on storage.</p>
<hr width=10% height=1>
<p>With DRBD 8.4.3 <a href="http://www.linbit.com/en/company/about/eu-team#lars" target="_blank">Lars</a> managed to reduce that overhead<sup class='footnote'><a href='http://blogs.linbit.com/p/469/843-random-writes-faster/#fn-469-1' id='fnref-469-1' onclick='return fdfootnote_show(469)'>1</a></sup>, from 1:2 down to 64:65, ie. to about 1.6%. (In sales speak &#8220;up to 64 times faster&#8221; <img src='http://blogs.linbit.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  )</p>
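<p>To see where the 1.6% comes from, here is a small sketch that treats the overhead as extra storage writes per application write. The helper name <code>al_overhead</code> is hypothetical and the ratios are the worst-case figures quoted above, not measurements.</p>

```python
from fractions import Fraction


def al_overhead(app_writes, storage_writes):
    """Extra storage writes per application write, as a fraction."""
    return Fraction(storage_writes - app_writes, app_writes)


old = al_overhead(1, 2)    # before 8.4.3, worst case: 1:2
new = al_overhead(64, 65)  # 8.4.3: 64:65

print(f"old overhead: {float(old):.1%}")
print(f"new overhead: {float(new):.1%}")
print(f"reduction factor: {old / new}")
```

<p>In the 1:2 case every application write costs one extra write, which is what halves the throughput; in the 64:65 case one extra write is shared by 64 application writes.</p>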
<p>Here are two graphics showing the difference on one of our test clusters; both using 10GigE and synchronous replication (protocol C):</p>
<p><a class="no-arrow" href="http://blogs.linbit.com/wp-content/uploads/2013/02/randwrite-128g.png"><img src="http://blogs.linbit.com/wp-content/uploads/2013/02/randwrite-128g.png" alt="Random Writes Benchmark, Spinning Disk" title="randwrite-128g" width="640" height="480" class="size-full wp-image-490" /></a><br />
The <font color="red">raw LVM</font> line shows the hardware limit of 350 IOPS; while <font color="blue">8.4.2</font> and <font color="#f0f">8.3.15</font> are quickly limited by harddisk seeks, the <font color="green">8.4.3</font> bars go up much further &#8211; in this hardware setup we get 4 times the randwrite performance!</p>
<hr width=10% height=1>
<p>When using SSDs the difference is even more visible: the 8.4.2 to 8.4.3 speedup is a factor of <strong>~16.7</strong>.</p>
<p><a class="no-arrow" href="http://blogs.linbit.com/wp-content/uploads/2013/02/randwrite-128g-ssd.png"><img src="http://blogs.linbit.com/wp-content/uploads/2013/02/randwrite-128g-ssd.png" alt="Random Writes Benchmark, SSD" title="randwrite-128g-ssd" width="640" height="480" class="size-full wp-image-490" /></a><br />
Again, the <font color="red">raw LVM</font> line shows the hardware limit of 50k IOPS; <font color="blue">8.4.2</font>  needs to wait for the synchronous writes (at 1.5k IOPS), but <font color="green">8.4.3</font> gives 25k IOPS, at least half the pure SSD speed.</p>
<hr width=10% height=1>
<p>Please note that <em>every</em> setup is different &#8212; and storage subsystems are very complex beasts, with many, non-linear, interacting parts. During our tests we found many &#8220;interesting&#8221; (but reproducible) behaviours &#8211; so you&#8217;ll have to tune your specific setup<sup class='footnote'><a href='http://blogs.linbit.com/p/469/843-random-writes-faster/#fn-469-2' id='fnref-469-2' onclick='return fdfootnote_show(469)'>2</a></sup><sup>,</sup><sup class='footnote'><a href='http://blogs.linbit.com/p/469/843-random-writes-faster/#fn-469-3' id='fnref-469-3' onclick='return fdfootnote_show(469)'>3</a></sup>.</p>
<hr width=10% height=1>
<p>Furthermore, the activity log can now be much bigger<sup class='footnote'><a href='http://blogs.linbit.com/p/469/843-random-writes-faster/#fn-469-4' id='fnref-469-4' onclick='return fdfootnote_show(469)'>4</a></sup>; but, as the impact on performance of leaving the &#8220;hot&#8221; area is now very much reduced, you may even want to <strong>lower</strong> the <code>al-extents</code> &#8211; ie. tune the AL-size to the working set, to reduce re-sync times after a failed Primary.</p>
<p>And, last but not least, the AL can be striped &#8211; this might help for some hardware setups, too.<br />
Please see the <a href="http://asadf" title="al-stripes" target="_blank">documentation</a> for more details.</p>
<p>BTW: these changes are in the DRBD 9 branch too, so you won&#8217;t lose the benefits.</p>
<hr width=40% height=1>
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/469/843-random-writes-faster/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Change the cluster distribution without downtime</title>
		<link>http://blogs.linbit.com/p/458/distribution-change-during-runtime/</link>
		<comments>http://blogs.linbit.com/p/458/distribution-change-during-runtime/#comments</comments>
		<pubDate>Mon, 11 Feb 2013 13:26:58 +0000</pubDate>
		<dc:creator>flip</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[pacemaker]]></category>
		<category><![CDATA[high availability]]></category>
		<category><![CDATA[lucid]]></category>
		<category><![CDATA[rhel]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=458</guid>
		<description><![CDATA[Recently we&#8217;ve upgraded one of our virtualization clusters (more RAM), and in the course of this did an upgrade of the virtualization hosts from Ubuntu Lucid to RHEL&#160;6.3&#160;—&#160;without any service interruption. That was not that complicated, really; as our core &#8230; <a href="http://blogs.linbit.com/p/458/distribution-change-during-runtime/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Recently we&#8217;ve upgraded one of our virtualization clusters (more RAM), and in the course of this did an upgrade of the virtualization hosts from Ubuntu Lucid to RHEL&nbsp;6.3&nbsp;—&nbsp;without any service interruption.<span id="more-458"></span></p>
<p>That was not that complicated, really; as our core product <a title="DRBD" href="http://drbd.org/" target="_blank">DRBD</a> works on (nearly) every Linux distribution, we simply</p>
<ol>
<li>live-migrated all VMs to one of the nodes;</li>
<li>reinstalled the root filesystem on the other node with RHEL 6.3<sup class='footnote'><a href='http://blogs.linbit.com/p/458/distribution-change-during-runtime/#fn-458-1' id='fnref-458-1' onclick='return fdfootnote_show(458)'>1</a></sup> and configured GRUB to boot into that one;</li>
<li>installed matching DRBD modules;</li>
<li>waited a few seconds for the resync to complete (which was <em>really</em> that fast, because we didn&#8217;t touch the existing logical volumes, and so the changed data were only a few GiB);</li>
<li>and then let Pacemaker take control over the cluster again, allowing us to migrate the VMs to the newly installed node. <strong>Without any service interruption.</strong></li>
</ol>
<p>The key to this was that DRBD and Pacemaker are available in compatible versions on most current distributions — and that&#8217;s not a big problem, because we make such packages available for our customers in <a title="LINBIT DRBD packages" href="http://packages.linbit.com/" target="_blank">our repositories</a>.</p>
<p>Upgrading DRBD from 8.3 to 8.4 at the same time is only a small, secondary change; after all, its network code can talk to different versions by design.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/458/distribution-change-during-runtime/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Raspberry Tau: a Pi cluster</title>
		<link>http://blogs.linbit.com/p/406/raspberry-tau-cluster/</link>
		<comments>http://blogs.linbit.com/p/406/raspberry-tau-cluster/#comments</comments>
		<pubDate>Wed, 28 Nov 2012 01:20:32 +0000</pubDate>
		<dc:creator>flip</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[debian]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[raspberry pi]]></category>
		<category><![CDATA[slow]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=406</guid>
		<description><![CDATA[The Raspberry PI is a small ARM computer (hardware specifications in wiki, outline and FAQs). Of course, you can build a cluster with it! As 2π is proposed to be named τ we&#8217;ve chosen the name &#8220;Raspberry Tau&#8221; for this &#8230; <a href="http://blogs.linbit.com/p/406/raspberry-tau-cluster/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The <a title="Raspberry PI" href="http://www.raspberrypi.org/" target="_blank">Raspberry PI</a> is a small ARM computer (<a title="hardware specifications in wiki" href="http://elinux.org/RPi_Hardware#Specifications" target="_blank">hardware specifications in wiki</a>, <a title="outline and FAQs" href="http://www.raspberrypi.org/faqs" target="_blank">outline and FAQs</a>). Of course, you can build a cluster with it!<span id="more-406"></span></p>
<p>As 2<span title="Pi">π</span> is <a title="Tau-Day" href="http://tauday.com/" target="_blank">proposed to be named <span title="Tau">τ</span></a> we&#8217;ve chosen the name &#8220;Raspberry Tau&#8221; for this proof-of-concept. <img class="alignright size-full wp-image-424" title="One Raspberry Pi" src="http://blogs.linbit.com/wp-content/uploads/2012/10/1pi.jpg" alt="" width="400" height="220" /></p>
<p>We&#8217;ve connected two Raspberry Pis via their on-board ethernet interfaces (via a switch, so we can simply SSH into them), booted via 2GB SD-cards with a <a title="Raspbian" href="http://www.raspbian.org/">Raspbian</a> image on them. After upgrading to a kernel that has kernel-headers available we built DRBD modules, and voilà! A Raspberry Tau cluster is born.<img class="alignright size-full wp-image-425" title="Two Raspberry Pi" src="http://blogs.linbit.com/wp-content/uploads/2012/10/2pi.jpg" alt="" width="400" height="423" /></p>
<p>We&#8217;re replicating the data on the USB-Sticks; their performance nicely matches the available network. Here&#8217;s <code>/proc/drbd</code> (shortened and line-wrapped for readability):</p>
<pre>root@raspberry-alice:~# cat /proc/version 
Linux version 3.2.0-3-rpi (Debian 3.2.21-1+rpi1) 
  (debian-kernel@lists.debian.org) (… Debian 4.6.3-1.1+rpi2)…)
root@raspberry-alice:~# cat /proc/drbd 
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by
    root@raspberry-bob, 2012-09-18 12:58:08
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:805304 nr:0 dw:348628 dr:818596 al:127 bm:70 lo:0 pe:0
      ua:0 ap:0 ep:1 wo:d oos:0</pre>
<p>As Raspbian is Debian-based, there are <a title="Pacemaker" href="http://archive.raspbian.org/raspbian/pool/main/p/pacemaker/" target="_blank">Pacemaker</a> (and Heartbeat or Corosync) packages available &#8230; so a cheap, low-power, High-Availability cluster is easily built.</p>
<p>Disclaimer: for a <em>real</em> HA-cluster you&#8217;d need a few more things:</p>
<ul>
<li>a STONITH device (if power is supplied via a Linux-PC, you could turn off the USB port by software), and</li>
<li>redundant network connectivity (USB ethernet adapter).</li>
</ul>
<p>Of course, if you&#8217;re just clustering your media library these things might not be mandatory.</p>
<p>Packages are available for everyone &#8211; just drop an email to sales@linbit.com, and we will be happy to provide them.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/406/raspberry-tau-cluster/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Backup ideas: using a double-stacked setup</title>
		<link>http://blogs.linbit.com/p/388/backup-double-stacked/</link>
		<comments>http://blogs.linbit.com/p/388/backup-double-stacked/#comments</comments>
		<pubDate>Mon, 01 Oct 2012 08:17:22 +0000</pubDate>
		<dc:creator>flip</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[backup]]></category>
		<category><![CDATA[desaster recovery]]></category>
		<category><![CDATA[high availability]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=388</guid>
		<description><![CDATA[Have you ever wanted to do a file based backup of your data without impacting your application, and without stopping your HA replication? Here is one possible method. For DRBD 8.4.2 we&#8217;ve removed the double-stacked check in the userspace tools; &#8230; <a href="http://blogs.linbit.com/p/388/backup-double-stacked/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Have you ever wanted to do a file based backup of your data without impacting<br />
your application, and without stopping your HA replication? Here is one<br />
possible method.<span id="more-388"></span></p>
<p>For DRBD 8.4.2 we&#8217;ve removed the double-stacked check in the userspace tools; now it&#8217;s easier to do a multi-stacked setup.</p>
<div>A picture says more than a thousand words; please see here:<br />
<a class="no-arrow" href="http://blogs.linbit.com/wp-content/uploads/2012/09/double-stacked.png"><img class="aligncenter size-full wp-image-389" title="double-stacked" src="http://blogs.linbit.com/wp-content/uploads/2012/09/double-stacked.png" alt="" width="600" height="315" /></a></div>
<div>
<p>Here is</p>
<ul>
<li>a HA cluster, consisting of nodes A and B;</li>
<li>a DR node, attached via DRBD proxy; and</li>
<li>a Backup node.</li>
</ul>
</div>
<p>The special thing here is that the dashed DRBD backup connection is <em>not</em> always connected; via a cronjob the backup node disconnects itself, does a (file-based) backup of the data to another storage (tape, harddisk, DVD-RW, paper tape&#8230;), and reconnects afterwards &#8211; to get the newer data, by a standard synchronization.</p>
<p>The advantage is that the IO load on the backup node has <strong>no</strong> impact on the primary node &#8211; you can even start a local database, and do some CPU- and IO-intensive evaluations. As the backup node is not connected to the HA cluster it cannot slow down the normal operation.</p>
<p>The backup itself can be taken with automatic LVM Snapshots, which prevents a split brain situation when mounting the backup drbd resource &#8211; or you need to do a <a title="Resolving a Split Brain" href="http://www.drbd.org/users-guide/s-resolve-split-brain.html" target="_blank"><code>drbdadm connect --discard-my-data</code></a> before re-connecting.</p>
<p>And: all of that is possible <em>while having a continuous connection</em> to the disaster-recovery site.</p>
<p>Credit for this also goes to Mark Olliver from <a href="http://www.thermeon.com/">Thermeon</a>, who was involved in testing and developing this setup.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/388/backup-double-stacked/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Mirrored SAN vs. DRBD</title>
		<link>http://blogs.linbit.com/p/347/san-vs-drbd/</link>
		<comments>http://blogs.linbit.com/p/347/san-vs-drbd/#comments</comments>
		<pubDate>Tue, 11 Sep 2012 12:42:46 +0000</pubDate>
		<dc:creator>flip</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[high availability]]></category>
		<category><![CDATA[online-verify]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[price]]></category>
		<category><![CDATA[san]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=347</guid>
		<description><![CDATA[Every now and then we get asked &#8220;why not simply use a mirrored SAN instead of DRBD&#8221;? This post shows some important differences. Basically, the first setup is having two servers, one of them being actively driving a DM-mirror (RAID1) &#8230; <a href="http://blogs.linbit.com/p/347/san-vs-drbd/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Every now and then we get asked &#8220;why not simply use a mirrored SAN instead of DRBD&#8221;? This post shows some important differences.<span id="more-347"></span></p>
<p>Basically, the first setup has two servers, one of them actively driving a DM-mirror (RAID1) over (e.g.) two iSCSI volumes that are exported by two SANs; the alternative is a standard DRBD setup. Please note that both setups need some kind of cluster manager (like Pacemaker).</p>
<p>Here are the two setups visualized:<br />
<a class="no-arrow" href="http://blogs.linbit.com/wp-content/uploads/2012/09/sanmirror-vs-drbd.png"><img class="aligncenter size-full wp-image-348" title="sanmirror-vs-drbd" src="http://blogs.linbit.com/wp-content/uploads/2012/09/sanmirror-vs-drbd.png" alt="" width="540" height="150" /></a>The main differences are:</p>
<table>
<tbody>
<tr>
<th width=6%>#</th>
<th width=47%>SAN</th>
<th width=47%>DRBD</th>
</tr>
<tr>
<td>1.</td>
<td>High cost, single supplier</td>
<td>Lower cost, commercial-off-the-shelf parts</td>
</tr>
<tr>
<td>2.</td>
<td>At least 4 boxes (2 application servers, 2 SANs)</td>
<td>2 servers are sufficient</td>
</tr>
<tr>
<td>3.</td>
<td>DM-Mirror has only recently gained a <code>write-intent-bitmap</code> (needed if the active node crashes), and it has at least had <a title="dm-mirror write-intent-bitmap performance loss" href="http://blog.liw.fi/posts/write-intent-bitmaps/" target="_blank">performance problems</a></td>
<td>Optimized <a title="Activity Log in the Users Guide" href="http://www.drbd.org/users-guide/s-activity-log.html" target="_blank">Activity Log</a></td>
</tr>
<tr>
<td>4.</td>
<td>Maintenance needs multiple commands</td>
<td>Single userspace command: <code>drbdadm</code></td>
</tr>
<tr>
<td>5.</td>
<td>Split-Brain not automatically handled</td>
<td>Automatic Split-Brain detection, policies via DRBD configuration</td>
</tr>
<tr>
<td>6.</td>
<td>Data Verification needs to get <em>all</em> data over the network &#8211; twice</td>
<td>Online-Verify transports (optionally) only checksums over the wire</td>
</tr>
<tr>
<td>7.</td>
<td>Asynchronous mode (via WAN) not in standard product</td>
<td>Protocol A available, optional proxy for compression and buffering</td>
</tr>
<tr>
<td>8.</td>
<td>Black Box</td>
<td>GPL solution, integrated in standard Linux Kernel since 2.6.33</td>
</tr>
</tbody>
</table>
<p>So the Open-Source solution via DRBD has some clear technical advantages &#8212; not just the price.</p>
<p>And, if that&#8217;s not enough &#8212; with LINBIT you get world-class support, too!</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/347/san-vs-drbd/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Best Practice: Use Backup with your DRBD cluster!</title>
		<link>http://blogs.linbit.com/p/277/use-backup-with-drbd/</link>
		<comments>http://blogs.linbit.com/p/277/use-backup-with-drbd/#comments</comments>
		<pubDate>Wed, 30 May 2012 20:16:37 +0000</pubDate>
		<dc:creator>devin</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[backup]]></category>
		<category><![CDATA[high availability]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=277</guid>
		<description><![CDATA[We want to take an opportunity to explain LINBIT&#8217;s best practices in regards to DRBD and backup procedures. DRBD is designed as a storage solution to provide High Availability, Disaster Recovery and Cross Site High Availability to your systems.  As &#8230; <a href="http://blogs.linbit.com/p/277/use-backup-with-drbd/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>We want to take an opportunity to explain LINBIT&#8217;s best practices in regards to DRBD and backup procedures.</p>
<p><span id="more-277"></span></p>
<p>DRBD is designed as a storage solution to provide High Availability, Disaster Recovery and Cross Site High Availability to your systems.  As developers of DRBD, we sometimes get community feedback that some folks are using DRBD as a &#8220;pseudo&#8221; backup solution, and in response to this we wanted to share some abstract guidelines on utilizing DRBD properly by following some key best practice methodologies.</p>
<p>Although DRBD is not backup software, it doesn&#8217;t mean you can&#8217;t use it in your backup procedures. Utilizing DRBD with LVM as a backing device, one can create backups with minimal to no interference to performance. This is done by utilizing LVM snapshotting as outlined in <a title="LINBIT's DRBD User's Guide" href="http://www.drbd.org/users-guide/s-lvm-snapshots.html" target="_blank">LINBIT&#8217;s DRBD User&#8217;s Guide</a>.  Although this page outlines how to do snapshots before and after a resync, these could easily be adapted to a cron job.  Essentially one would disconnect the Secondary, snapshot the backing device, mount the snapshot, perform the backups, umount the snapshot, reconnect the Secondary.  These point in time backups are great for technology such as iSCSI targets, Virtual Machine storage or Databases such as MySQL and PostgreSQL.  As you can imagine, this methodology is quite popular in the Linux HA and DRBD communities.</p>
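<p>The sequence described above could be sketched as the following dry run, meant to be run on the Secondary. It only builds and prints the commands and executes nothing. The resource name <code>r0</code>, the volume group and logical volume names, the mount point, the snapshot size and the use of <code>tar</code> as the backup tool are all assumptions made for this illustration; removing the snapshot afterwards is an extra clean-up step not mentioned in the text.</p>

```python
def backup_commands(resource, vg, lv, mountpoint, archive, snap_size="10G"):
    """Return the command sequence for one snapshot-based backup run.

    Nothing is executed here; the caller decides how to run the commands.
    """
    snap = f"{lv}-backup"  # hypothetical snapshot name
    return [
        # 1. Stop replication so the backing device no longer changes.
        ["drbdadm", "disconnect", resource],
        # 2. Snapshot the backing device.
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap, f"/dev/{vg}/{lv}"],
        # 3. Mount the snapshot and take a file-based backup (tar is an assumption).
        ["mount", f"/dev/{vg}/{snap}", mountpoint],
        ["tar", "-czf", archive, "-C", mountpoint, "."],
        # 4. Clean up: unmount and drop the snapshot.
        ["umount", mountpoint],
        ["lvremove", "-f", f"/dev/{vg}/{snap}"],
        # 5. Reconnect; DRBD resynchronizes whatever changed in the meantime.
        ["drbdadm", "connect", resource],
    ]


for cmd in backup_commands("r0", "vg0", "r0-data", "/mnt/backup", "/srv/backup/r0.tar.gz"):
    print(" ".join(cmd))
```

<p>A real cron job would also have to stop at the first failing step and make sure the final <code>drbdadm connect</code> still happens.</p>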
<h2>LINBIT advises systems administrators to:</h2>
<ol>
<li>Utilize DRBD for High Availability, Disaster Recovery and Cross Site High Availability (business continuity) purposes.</li>
<li>Plan, review and execute a full backup strategy that makes sense for your organization and data.   Be sure to keep in mind how much data you&#8217;re planning on storing, backing up and at what intervals.  It is important to choose the point in time to make your backups to minimize things such as user error.  In many cases, backing up every day is the appropriate strategy.</li>
<li>Test, test, test.  We cannot say this enough. We develop software that is designed to prevent loss from failure, so you could say we&#8217;re experts on this topic.  It&#8217;s very important that you not only test DRBD&#8217;s configuration, but the components that make up your backup system as well.  Then, on a scheduled basis, you should be reviewing your data to ensure its completeness and correctness.  As well, on an annual basis it would be wise to review your top level strategy and make updates if your requirements have changed.  In summation, it is advised to routinely test your backup procedure and also verify (checksum) your backups to ensure their completeness.</li>
</ol>
<hr />
<p>In closing, DRBD is designed to prevent loss of service as the result of equipment failure.  LINBIT strongly advises systems administrators to implement a strategy that incorporates &#8220;point in time&#8221; backups so administrators can restore, rewind and rejoice knowing that they&#8217;re not only backed by the best open source replication technology, DRBD, but also by a comprehensive backup solution designed with the organization&#8217;s needs in mind.</p>
<p>How do <strong><em>you</em></strong> back up your DRBD cluster?</p>
<p>Share your thoughts or comments below! <img src='http://blogs.linbit.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/277/use-backup-with-drbd/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>&#8220;read-balancing&#8221; with 8.4.1+</title>
		<link>http://blogs.linbit.com/p/246/read-balancing/</link>
		<comments>http://blogs.linbit.com/p/246/read-balancing/#comments</comments>
		<pubDate>Wed, 09 May 2012 06:45:18 +0000</pubDate>
		<dc:creator>flip</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[read]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=246</guid>
		<description><![CDATA[DRBD 8.4.1 introduces a new feature: read-balancing, which is configured in the disk section of the configuration file(s). This feature enables DRBD to balance read requests between the Primary/Secondary nodes. While writes occur on both sides of the cluster, by &#8230; <a href="http://blogs.linbit.com/p/246/read-balancing/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>DRBD 8.4.1 introduces a new feature: <code>read-balancing</code>, which is configured in the <code>disk</code> section of the configuration file(s). This feature enables DRBD to balance read requests between the Primary/Secondary nodes.<span id="more-246"></span></p>
<p>While writes occur on both sides of the cluster, by default reads are served locally (i.e., the default value is <code>prefer-local</code>). This might not be optimal if you&#8217;ve got a big pipe to the other node and a heavily loaded local IO subsystem.</p>
<p><code>read-balancing</code> has several options to choose from:</p>
<ul>
<li><code>32K-striping</code> up to <code>1M-striping</code> choose the node to read from based on the block address &#8211; e.g. for <code>512K-striping</code> the first half of each MiB would be read from one machine, and the second half from the other<sup class='footnote'><a href='http://blogs.linbit.com/p/246/read-balancing/#fn-246-1' id='fnref-246-1' onclick='return fdfootnote_show(246)'>1</a></sup>.<br />
This is simple, static load balancing.</li>
<li><code>round-robin</code> simply passes requests to the two nodes in alternation.<br />
This might go wrong if your application reads 4 KiB, 1 MiB, 4 KiB, 1 MiB, and so on, because every large read would then land on the same node &#8211; but this is fairly unlikely.</li>
<li><code>least-pending</code> chooses the node with the smallest number of open requests.</li>
<li><code>when-congested-remote</code> uses the remote node if there are local requests<sup class='footnote'><a href='http://blogs.linbit.com/p/246/read-balancing/#fn-246-2' id='fnref-246-2' onclick='return fdfootnote_show(246)'>2</a></sup>.</li>
<li><code>prefer-remote</code> is implemented for completeness; however, as of this writing there is no viable use case.</li>
</ul>
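<p>To make the striping and round-robin behaviour above more concrete, here is a minimal Python sketch. It is an illustration only, not DRBD&#8217;s implementation: the function names <code>striping_node</code> and <code>round_robin_bytes</code> are hypothetical, node indices 0 and 1 merely stand for &#8220;one machine&#8221; and &#8220;the other&#8221;, and the sketch assumes a two-node cluster.</p>

```python
from itertools import cycle

# Stripe sizes in bytes for the striping policies named above.
STRIPE_SIZES = {f"{k}K-striping": k * 1024 for k in (32, 64, 128, 256, 512)}
STRIPE_SIZES["1M-striping"] = 1024 * 1024


def striping_node(byte_offset, policy):
    """Return 0 or 1: which of the two nodes serves a read at byte_offset.

    Hypothetical helper: consecutive stripes alternate between the nodes,
    so the choice depends only on the block address.
    """
    stripe = STRIPE_SIZES[policy]
    return (byte_offset // stripe) % 2


def round_robin_bytes(request_sizes):
    """Return the bytes served per node when requests simply alternate."""
    totals = [0, 0]
    nodes = cycle((0, 1))
    for size in request_sizes:
        totals[next(nodes)] += size
    return totals


if __name__ == "__main__":
    half_mib = 512 * 1024
    # First half of each MiB goes to node 0, second half to node 1.
    print(striping_node(0, "512K-striping"), striping_node(half_mib, "512K-striping"))
    # The alternating 4 KiB / 1 MiB pattern loads one node far more than the other.
    print(round_robin_bytes([4096, 1024 * 1024] * 4))
```

<p>With the alternating small/large read pattern, one node ends up serving all the large reads, which is the unlikely-but-possible imbalance mentioned for <code>round-robin</code>.</p>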
<p>Please note that all this is still <strong>below</strong> the filesystem layer &#8211; so even if the secondary is used for reading, this won&#8217;t speed up a failover, as the pages read are not kept anywhere.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/246/read-balancing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LINBIT participates in the German Cloud (&#8220;Deutsche Wolke&#8221;)</title>
		<link>http://blogs.linbit.com/p/202/linbit-deutsche-wolke/</link>
		<comments>http://blogs.linbit.com/p/202/linbit-deutsche-wolke/#comments</comments>
		<pubDate>Mon, 23 Apr 2012 18:30:34 +0000</pubDate>
		<dc:creator>flip</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[deutsche wolke]]></category>
		<category><![CDATA[high availability]]></category>
		<category><![CDATA[partner]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=202</guid>
		<description><![CDATA[Deutsche Wolke (“German Cloud”) was founded to establish Federal Cloud Infrastructure in Germany. This infrastructure will provide additional legal and security protections for hosted data.  No longer will small businesses be exposed to the legal risk of losing their website presence &#8230; <a href="http://blogs.linbit.com/p/202/linbit-deutsche-wolke/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a class="no-arrow" href="http://deutsche-wolke.de/"><img class="alignright size-full wp-image-203" title="deutsche_wolke_logo" src="http://blogs.linbit.com/wp-content/uploads/2012/04/deutsche_wolke_logo.png" alt="Deutsche Wolke, Logo" width="249" height="86" /></a></p>
<div>
<div>
<p><a title="Deutsche Wolke" href="http://deutsche-wolke.de/" target="_blank">Deutsche Wolke</a> (“German Cloud”) was founded to establish Federal Cloud Infrastructure in Germany.</p>
</div>
<p>This infrastructure will provide additional legal and security protections for hosted data.  No longer will small businesses be exposed to the legal risk of losing their website presence without a trial (an unfortunate reality when doing business on transatlantic clouds).</p>
<p>The natural partner for backend storage infrastructure is <a href="http://www.linbit.com/en">LINBIT</a>; as authors and maintainers of <a href="http://drbd.org/">DRBD</a>, we are best suited to provide the technical expertise to achieve High Availability.  Also, <a href="http://www.linbit.com/en/products-services/drbd-proxy/">DRBD Proxy</a> is the obvious choice for off-site or disaster recovery replication (from the office into the cloud).</p>
<p>We at LINBIT look forward to seeing this project grow and prosper!</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/202/linbit-deutsche-wolke/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Monitoring: better safe than sorry&#8230;</title>
		<link>http://blogs.linbit.com/p/173/monitoring-strict/</link>
		<comments>http://blogs.linbit.com/p/173/monitoring-strict/#comments</comments>
		<pubDate>Tue, 03 Apr 2012 09:07:31 +0000</pubDate>
		<dc:creator>flip</dc:creator>
				<category><![CDATA[drbd]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[monitor]]></category>
		<category><![CDATA[stop]]></category>

		<guid isPermaLink="false">http://blogs.linbit.com/?p=173</guid>
		<description><![CDATA[Stumbling upon the Holy time-travellin’ DRBD, batman! blog post, there&#8217;s only one thing to be said &#8230; Be strict in what you emit, liberal in what you accept is simply not true when dealing with mission-critical systems. It&#8217;s ok to &#8230; <a href="http://blogs.linbit.com/p/173/monitoring-strict/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Stumbling upon the <a href="http://www.anchor.com.au/blog/2012/04/holy-time-travellin-drbd-batman/" target="_blank">Holy time-travellin’ DRBD, batman!</a> blog post, there&#8217;s only one thing to be said &#8230;</p>
<blockquote><p>Be strict in what you emit, liberal in what you accept<sup class='footnote'><a href='http://blogs.linbit.com/p/173/monitoring-strict/#fn-173-1' id='fnref-173-1' onclick='return fdfootnote_show(173)'>1</a></sup></p></blockquote>
<p><strong>is simply not true</strong> when dealing with mission-critical systems.</p>
<p>It&#8217;s ok to be alerted on upgrading a machine because the &#8220;old, working&#8221; RegEx that did the parsing doesn&#8217;t match anymore<sup class='footnote'><a href='http://blogs.linbit.com/p/173/monitoring-strict/#fn-173-2' id='fnref-173-2' onclick='return fdfootnote_show(173)'>2</a></sup>; it&#8217;s not a problem to get an email when someone adds the 100th DRBD resource and causes the grep to fail; and so on.<span id="more-173"></span></p>
<p>Better to have a few false positives <strong>when you&#8217;re actively changing things</strong> than to get a false negative that costs you months of data; that&#8217;s what an <code>assert</code> (and monitoring isn&#8217;t <em>that</em> different) is for, after all.</p>
<p>Keep monitoring strict, and let it fail loudly on unexpected things &#8211; after the first few occurrences they&#8217;re not unexpected anymore and can be dealt with.</p>
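<p>As a rough illustration of &#8220;strict&#8221; monitoring, the following Python sketch refuses to ignore anything it doesn&#8217;t understand. Everything here is an assumption for the sake of the example: the status-line shape is modeled on DRBD 8.x <code>/proc/drbd</code> output and may not match your version, and <code>check_status_lines</code> and <code>MonitoringError</code> are hypothetical names, not part of any DRBD tooling.</p>

```python
import re

# ASSUMPTION: per-resource status lines look roughly like DRBD 8.x /proc/drbd,
# e.g. " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----".
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+): cs:(?P<cs>\w+) ro:(?P<ro>\w+)/(?P<peer_ro>\w+) "
    r"ds:(?P<ds>\w+)/(?P<peer_ds>\w+) "
)


class MonitoringError(Exception):
    """Raised for anything unexpected, so the check fails loudly."""


def check_status_lines(lines, expected_resources):
    """Validate per-resource status lines; raise instead of skipping surprises."""
    seen = 0
    for line in lines:
        match = STATUS_RE.match(line)
        if match is None:
            # An unparseable line is an alert, not something to filter out.
            raise MonitoringError(f"unparseable status line: {line!r}")
        if match["cs"] != "Connected":
            raise MonitoringError(f"minor {match['minor']}: cs is {match['cs']}")
        if (match["ds"], match["peer_ds"]) != ("UpToDate", "UpToDate"):
            raise MonitoringError(f"minor {match['minor']}: disks not UpToDate")
        seen += 1
    if seen != expected_resources:
        # A changed resource count should be noticed by a human as well.
        raise MonitoringError(f"expected {expected_resources} resources, saw {seen}")
    return seen
```

<p>The design choice is that the check has no &#8220;skip&#8221; branch at all: a format change after an upgrade, or an additional resource, produces an alert that someone then has to acknowledge and adapt to.</p>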
]]></content:encoded>
			<wfw:commentRss>http://blogs.linbit.com/p/173/monitoring-strict/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
