<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>
<channel>
	<title>Comments on: DRBD and MySQL: Just Say Yes</title>
	<atom:link href="http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/feed/" rel="self" type="application/rss+xml" />
	<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/</link>
	<description>Linux, DRBD, and other stuff of interest</description>
	<pubDate>Tue, 06 Jan 2009 05:43:41 +0000</pubDate>
	<generator>http://wordpress.org/?v=MU</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Morgan Tocker</title>
		<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/#comment-1753</link>
		<dc:creator>Morgan Tocker</dc:creator>
		<pubDate>Wed, 30 Apr 2008 12:36:02 +0000</pubDate>
		<guid isPermaLink="false">http://fghaas.wordpress.com/?p=50#comment-1753</guid>
		<description>@Bill: FWIW, you are correct assuming it is statement based replication.  In the case of row based replication, it's possible that a bit flip could cause a syntactically correct 'event' that will corrupt data.</description>
		<content:encoded><![CDATA[<p>@Bill: FWIW, you are correct assuming it is statement based replication.  In the case of row based replication, it&#8217;s possible that a bit flip could cause a syntactically correct &#8216;event&#8217; that will corrupt data.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Florian Haas</title>
		<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/#comment-1743</link>
		<dc:creator>Florian Haas</dc:creator>
		<pubDate>Tue, 29 Apr 2008 07:57:39 +0000</pubDate>
		<guid isPermaLink="false">http://fghaas.wordpress.com/?p=50#comment-1743</guid>
		<description>Bill,

you're right, a MySQL replication statement is probably more likely to fail altogether due to network corruption, rather than propagate garbage. But "more likely" doesn't mean "certain". :-)

Just like for DRBD, without replication integrity checking, there is a chance that network corruption affects the DRBD protocol header rather than the packet payload. This would cause DRBD on the remote end to receive (and discard) a malformed packet. You just can't tell for sure. Which is why in DRBD we adopted the end-to-end approach.</description>
		<content:encoded><![CDATA[<p>Bill,</p>
<p>you&#8217;re right, a MySQL replication statement is probably more likely to fail altogether due to network corruption, rather than propagate garbage. But &#8220;more likely&#8221; doesn&#8217;t mean &#8220;certain&#8221;. <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Just like for DRBD, without replication integrity checking, there is a chance that network corruption affects the DRBD protocol header rather than the packet payload. This would cause DRBD on the remote end to receive (and discard) a malformed packet. You just can&#8217;t tell for sure. Which is why in DRBD we adopted the end-to-end approach.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Florian Haas</title>
		<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/#comment-1742</link>
		<dc:creator>Florian Haas</dc:creator>
		<pubDate>Tue, 29 Apr 2008 07:52:49 +0000</pubDate>
		<guid isPermaLink="false">http://fghaas.wordpress.com/?p=50#comment-1742</guid>
		<description>Kris Buytaert adds an interesting angle to the discussion in a post titled "DRBD and MySQL: often say NO"; see http://www.krisbuytaert.be/blog/node/657.</description>
		<content:encoded><![CDATA[<p>Kris Buytaert adds an interesting angle to the discussion in a post titled &#8220;DRBD and MySQL: often say NO&#8221;; see <a href="http://www.krisbuytaert.be/blog/node/657" rel="nofollow">http://www.krisbuytaert.be/blog/node/657</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MySQL Replication vs DRBD Battles &#124; MySQL Performance Blog</title>
		<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/#comment-1741</link>
		<dc:creator>MySQL Replication vs DRBD Battles &#124; MySQL Performance Blog</dc:creator>
		<pubDate>Tue, 29 Apr 2008 03:50:50 +0000</pubDate>
		<guid isPermaLink="false">http://fghaas.wordpress.com/?p=50#comment-1741</guid>
		<description>[...] these days we see a lot of post for and against (more, more) using of MySQL and DRBD as a high availability [...]</description>
		<content:encoded><![CDATA[<p>[...] these days we see a lot of post for and against (more, more) using of MySQL and DRBD as a high availability [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bill</title>
		<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/#comment-1740</link>
		<dc:creator>Bill</dc:creator>
		<pubDate>Mon, 28 Apr 2008 19:12:35 +0000</pubDate>
		<guid isPermaLink="false">http://fghaas.wordpress.com/?p=50#comment-1740</guid>
		<description>How is MySQL replication worse off than DRBD in the case of NIC/network corruption, if the binary logs contain full SQL statements? If it mangles a byte, changing INSERT to INSERQ, replication will break, but it won't destroy your data on the slave. Now if it hits a blob field of binary data, then yes, that might be a problem. I would think that would be less likely to happen, or at least more dependent upon the schema and usage patterns.</description>
		<content:encoded><![CDATA[<p>How is MySQL replication worse off than DRBD in the case of NIC/network corruption, if the binary logs contain full SQL statements? If it mangles a byte, changing INSERT to INSERQ, replication will break, but it won&#8217;t destroy your data on the slave. Now if it hits a blob field of binary data, then yes, that might be a problem. I would think that would be less likely to happen, or at least more dependent upon the schema and usage patterns.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Florian Haas</title>
		<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/#comment-1738</link>
		<dc:creator>Florian Haas</dc:creator>
		<pubDate>Sun, 27 Apr 2008 19:43:02 +0000</pubDate>
		<guid isPermaLink="false">http://fghaas.wordpress.com/?p=50#comment-1738</guid>
		<description>Eric,

in your post you refer broadly to "any corruption on the primary master" getting propagated over to the DRBD peer. Now as I've stated and will continue to state, you're right as far as upper I/O layers are concerned. And this won't get "fixed" unless someone redesigns the complete Linux I/O stack.

But looking at possible issues _below_ DRBD, we're chipping away at possible sources of corruption one by one (note that these are sources of corruption _outside_ DRBD that DRBD just happens to handle gracefully or rectify):

- Disk I/O errors on any node: Automatic detach, introduced pre-DRBD 8.0
- Network bit flips/network traffic corruption/NIC driver bugs: End-to-end replication integrity checks, introduced in 8.2.0
- Subtle disk I/O errors, bit flips, local-disk data corruption: Online device verification, introduced in 8.2.5

The latency concerns you mentioned have also been greatly mitigated by better CPU affinity handling introduced in 8.2.3. This has been back-ported to the 8.0 branch as well, and makes a particularly big difference on multi-core systems.</description>
		<content:encoded><![CDATA[<p>Eric,</p>
<p>in your post you refer broadly to &#8220;any corruption on the primary master&#8221; getting propagated over to the DRBD peer. Now as I&#8217;ve stated and will continue to state, you&#8217;re right as far as upper I/O layers are concerned. And this won&#8217;t get &#8220;fixed&#8221; unless someone redesigns the complete Linux I/O stack.</p>
<p>But looking at possible issues _below_ DRBD, we&#8217;re chipping away at possible sources of corruption one by one (note that these are sources of corruption _outside_ DRBD that DRBD just happens to handle gracefully or rectify):</p>
<p>- Disk I/O errors on any node: Automatic detach, introduced pre-DRBD 8.0<br />
- Network bit flips/network traffic corruption/NIC driver bugs: End-to-end replication integrity checks, introduced in 8.2.0<br />
- Subtle disk I/O errors, bit flips, local-disk data corruption: Online device verification, introduced in 8.2.5</p>
<p>The latency concerns you mentioned have also been greatly mitigated by better CPU affinity handling introduced in 8.2.3. This has been back-ported to the 8.0 branch as well, and makes a particularly big difference on multi-core systems.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Eric Bergen</title>
		<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/#comment-1736</link>
		<dc:creator>Eric Bergen</dc:creator>
		<pubDate>Sun, 27 Apr 2008 19:02:46 +0000</pubDate>
		<guid isPermaLink="false">http://fghaas.wordpress.com/?p=50#comment-1736</guid>
		<description>What specifically has changed to address my concerns, and in which versions of DRBD? I'll gladly add updated sections to that blog entry so people aren't getting outdated information.</description>
		<content:encoded><![CDATA[<p>What specifically has changed to address my concerns, and in which versions of DRBD? I&#8217;ll gladly add updated sections to that blog entry so people aren&#8217;t getting outdated information.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Phil Hildebrand</title>
		<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/#comment-1734</link>
		<dc:creator>Phil Hildebrand</dc:creator>
		<pubDate>Sun, 27 Apr 2008 16:16:20 +0000</pubDate>
		<guid isPermaLink="false">http://fghaas.wordpress.com/?p=50#comment-1734</guid>
		<description>Though we haven't run DRBD yet in production, it is our plan (we're setting up in QA currently for perf and failover testing).

We run all our databases (SQLServer and MySQL) on a SAN for multiple reasons, and for most of the reasons you talk about (snapshots, DRBD sitting below the file system, etc) we have decided on this as a great HA solution for us.  It also allows us to have each node attached to separate storage processors on the SAN, which will allow for a full storage processor failure without taking out our database system.

This is also way faster than any non-block device replication that I know of.

The issue with replication as a sole setup for HA is that it requires intelligent coding in the application should you have to switch to the slave as 'master', because there is no shared IP that gets migrated. Though there are patches from Google for auto-promoting masters, it's not really the most straightforward process.

Ideally, if you can afford both (as in our case) from a disk usage perspective, then you get the benefits from both....</description>
		<content:encoded><![CDATA[<p>Though we haven&#8217;t run DRBD yet in production, it is our plan (we&#8217;re setting up in QA currently for perf and failover testing).</p>
<p>We run all our databases (SQLServer and MySQL) on a SAN for multiple reasons, and for most of the reasons you talk about (snapshots, DRBD sitting below the file system, etc) we have decided on this as a great HA solution for us.  It also allows us to have each node attached to separate storage processors on the SAN, which will allow for a full storage processor failure without taking out our database system.</p>
<p>This is also way faster than any non-block device replication that I know of.</p>
<p>The issue with replication as a sole setup for HA is that it requires intelligent coding in the application should you have to switch to the slave as &#8216;master&#8217;, because there is no shared IP that gets migrated. Though there are patches from Google for auto-promoting masters, it&#8217;s not really the most straightforward process.</p>
<p>Ideally, if you can afford both (as in our case) from a disk usage perspective, then you get the benefits from both&#8230;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Morgan Tocker</title>
		<link>http://fghaas.wordpress.com/2008/04/27/drbd-and-mysql-just-say-yes/#comment-1733</link>
		<dc:creator>Morgan Tocker</dc:creator>
		<pubDate>Sun, 27 Apr 2008 15:53:04 +0000</pubDate>
		<guid isPermaLink="false">http://fghaas.wordpress.com/?p=50#comment-1733</guid>
		<description>&#62; NIC and network corruption is also propagated.

That one is interesting - since it was mentioned as a minus for DRBD, when it's just as true for MySQL replication ;)

The worklog for event checksums in MySQL is &lt;a href="http://forge.mysql.com/worklog/task.php?id=2540" rel="nofollow"&gt;#2540&lt;/a&gt;.  Until that's implemented, DRBD is doing *better* (not worse) in this regard.</description>
		<content:encoded><![CDATA[<p>&gt; NIC and network corruption is also propagated.</p>
<p>That one is interesting - since it was mentioned as a minus for DRBD, when it&#8217;s just as true for MySQL replication <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>The worklog for event checksums in MySQL is <a href="http://forge.mysql.com/worklog/task.php?id=2540" rel="nofollow">#2540</a>.  Until that&#8217;s implemented, DRBD is doing *better* (not worse) in this regard.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
