Best Practice: Use Backup with your DRBD cluster!

We want to take this opportunity to explain LINBIT’s best practices for DRBD and backup procedures.

DRBD is designed as a storage solution that provides High Availability, Disaster Recovery and Cross Site High Availability for your systems.  As developers of DRBD, we sometimes hear from the community that some folks are using DRBD as a “pseudo” backup solution. In response, we want to share some general guidelines on using DRBD properly alongside a real backup strategy.

Although DRBD is not backup software, that doesn’t mean you can’t use it in your backup procedures. With an LVM logical volume as DRBD’s backing device, one can create backups with minimal to no impact on the performance of the Primary. This is done with LVM snapshots, as outlined in LINBIT’s DRBD User’s Guide.  Although the guide describes how to take snapshots before and after a resync, the same steps could easily be adapted to a cron job.  Essentially one would disconnect the Secondary, snapshot the backing device, mount the snapshot, perform the backup, unmount the snapshot, remove the snapshot, and reconnect the Secondary.  These point-in-time backups work well for workloads such as iSCSI targets, virtual machine storage, or databases such as MySQL and PostgreSQL.  As you can imagine, this methodology is quite popular in the Linux HA and DRBD communities.
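The sequence above can be sketched as a small shell function. This is a minimal sketch and not LINBIT’s script: the resource name `r0`, the volume group `vg0`, the snapshot size, the mount point and the destination path are all placeholders, and the `run` helper is a hypothetical wrapper that only prints each command while `DRY_RUN=1` (the default here), so nothing is changed until you switch it off.

```shell
#!/bin/bash
# Sketch only: every name below is a placeholder, adjust to your setup.
RESOURCE="${RESOURCE:-r0}"          # DRBD resource name (assumed)
VG="${VG:-vg0}"                     # volume group holding the backing LV (assumed)
LV="${LV:-r0}"                      # logical volume used as DRBD backing device (assumed)
SNAP="${SNAP:-${LV}-snap}"          # name of the temporary snapshot
SNAP_SIZE="${SNAP_SIZE:-5G}"        # room for writes while the snapshot exists
MNT="${MNT:-/mnt/${SNAP}}"          # where the snapshot gets mounted
DEST="${DEST:-/backup/${RESOURCE}.tar.gz}"
DRY_RUN="${DRY_RUN:-1}"             # 1 = print the commands instead of running them

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on the Secondary node.
backup_from_secondary() {
    run drbdadm disconnect "$RESOURCE" || return 1
    run lvcreate --snapshot --size "$SNAP_SIZE" --name "$SNAP" "/dev/$VG/$LV" &&
        run mkdir -p "$MNT" &&
        run mount -o ro "/dev/$VG/$SNAP" "$MNT" &&
        run tar -czf "$DEST" -C "$MNT" .
    rc=$?
    # Clean up and reconnect even if the backup itself failed.
    run umount "$MNT"
    run lvremove -f "/dev/$VG/$SNAP"
    run drbdadm connect "$RESOURCE"
    return $rc
}
```

Calling `backup_from_secondary` with the defaults prints the eight commands in order. Review that output before running with `DRY_RUN=0`, and size the snapshot generously, since a snapshot that fills up becomes invalid.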

LINBIT advises systems administrators to:

  1. Utilize DRBD for High Availability, Disaster Recovery and Cross Site High Availability (business continuity) purposes.
  2. Plan, review and execute a full backup strategy that makes sense for your organization and data.   Be sure to keep in mind how much data you’re planning on storing and backing up, and at what intervals.  Choose the points in time at which you take backups carefully, so that you limit the damage from things such as user error.  In many cases, backing up every day is the appropriate strategy.
  3. Test, test, test.  We cannot say this enough. We develop software that is designed to prevent loss from failure, so you could say we’re experts on this topic.  It’s very important that you test not only DRBD’s configuration, but also the components that make up your backup system.  Then, on a scheduled basis, review your data to ensure its completeness and correctness.  In addition, it would be wise to review your top-level strategy annually and update it if your requirements have changed.  In summary, routinely test your backup procedure and verify (checksum) your backups to ensure their completeness.
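As an illustration of the checksum step, here is a minimal sketch using `sha256sum` from GNU coreutils. The function names `write_checksum` and `verify_backup` are made up for this example, and a checksum only shows that the archive has not changed since it was written; it does not replace a real restore test.

```shell
#!/bin/bash
# write_checksum FILE: store a SHA-256 checksum next to the backup file.
write_checksum() {
    (
        cd "$(dirname "$1")" &&
            sha256sum "$(basename "$1")" > "$(basename "$1").sha256"
    )
}

# verify_backup FILE: succeed only if FILE still matches its stored checksum.
verify_backup() {
    (
        cd "$(dirname "$1")" &&
            sha256sum --check --status "$(basename "$1").sha256"
    )
}
```

A backup job could call `write_checksum` right after creating the archive, and a scheduled job could later call `verify_backup` and raise an alert on a non-zero exit status.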

In closing, DRBD is designed to prevent loss of service as the result of equipment failure.  LINBIT strongly advises systems administrators to implement a strategy that incorporates “point in time” backups, so administrators can restore, rewind and rejoice knowing that they’re backed not only by the best open source replication technology, DRBD, but also by a comprehensive backup solution designed with the organization’s needs in mind.

How do you back up your DRBD cluster?

Share your thoughts or comments below! :)

6 thoughts on “Best Practice: Use Backup with your DRBD cluster!”

  1. I was trying to do backups using snapshots for some time, but it failed. I am using DRBD as a backing device for LVM, which in turn contains our cluster data (Xen, Pacemaker and so on).

    One of the quirks is that you need to take the snapshot and run the backup on the same node. Also, while the backup runs, the high throughput on your disks may cause Pacemaker to start lagging and the node to get STONITHed.

    So now I do backups inside every virtual machine. Not a nice solution, but safe and working fine.

    Please note: this is not DRBD’s fault, I am just describing my case here :).

    • With LVM on top of DRBD, snapshots can only be taken where DRBD is primary. However, if LVM sits underneath DRBD (a logical volume as DRBD’s backing device), it is possible to take the snapshot and run the backup on the secondary node. Thus you can make a backup without impeding the performance of the primary node.

      • Agreed, backing up from the secondary via snapshots of LVM under DRBD works a treat, though it will still impact the performance of writes to the DRBD device. If both of your nodes have Primary resource(s), you will see a performance impact regardless.

        It’s easy to mitigate this by prefixing your backup commands with:
        nice -n 19 ionice -c 3

        This gives your backup process the lowest CPU and I/O priority, which greatly reduces the impact on performance while the backup is running, especially with regard to the Pacemaker lag problem mentioned by the OP.
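A hedged sketch of how the prefix from this comment could be wrapped for reuse: `low_prio` is a hypothetical helper name, and the fallback to plain `nice` when `ionice` is missing or refused is an assumption added here. Note also that, as far as the `ionice` man page describes it, I/O scheduling classes are only honored by certain I/O schedulers (CFQ, for instance), so the effect depends on the scheduler in use.

```shell
#!/bin/bash
# low_prio CMD [ARGS...]: run CMD with the lowest CPU priority and, where
# available, the idle I/O scheduling class.
low_prio() {
    # Probe once whether ionice exists and the idle class can actually be set.
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        nice -n 19 ionice -c 3 "$@"
    else
        nice -n 19 "$@"
    fi
}
```

For example, `low_prio tar -czf /backup/data.tar.gz -C /mnt/snapshot .` would run the archive step at low priority; both paths in that command are placeholders.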

  2. Hello,

    For me, the best way (though not the safest) was to break the DRBD synchronisation every night and back up the raw data inside. I wrote a script to automate this and it is working fine ;-)
    The only drawback is that there is no secondary available for failover during that time.

    Maybe it will help someone ;-)

    /root/backup_raw_automated.sh
    #!/bin/bash
    # backup_raw_automated.sh
    #
    # gea@itandtel.at – 2011/2012

    ##### VARIABLES ###################################################################################

    CUSTOMER=
    HOST=`hostname | awk -F. '{ print $1 }'`

    iface=eth2
    resource=r0
    drbddev=/dev/drbd1
    mountpoint=/mysql_drbd

    localip=192.168.255.11
    remoteip=192.168.255.10

    LOG=/var/log/backup/backup_database_raw_`date +%Y-%m-%d`.log

    BKTARDESTFLD=/Backup
    BKTARDESTFILE=$BKTARDESTFLD/backup_raw_mysql_`date +%Y.%m.%d`.tar.bz2
    BKTARSOURCES=$mountpoint/mysql/

    RESYNC=
    #VARERRORS=0
    #FFERRORS=0
    LOCK=/tmp/backup_databases_raw.lock

    ROLE=`drbdadm role $resource | awk -F/ '{ print $1 }'`

    TIMEDIFFSEC=0
    TIMEDIFFMIN=0

    EMAIL1=
    EMAIL2=
    MAILSENDER="`hostname -a | tr '[:lower:]' '[:upper:]'`"

    ERRSUM=0

    #devel exit
    #exit
    ###### STARTUP ####################################################################################

    echo "##################################################################################" >> $LOG
    echo "`date +%H:%M:%S` + `date +%Y.%m.%d` - start backing up raw files" >> $LOG
    echo "----------------------------------------------------------------------------------" >> $LOG

    ##### LOCK FILE ###################################################################################
    if [ -e "$LOCK" ]; then
        # report to both the terminal and the log, then stop
        echo "lock file exists! abort.." | tee -a "$LOG"
        exit 1
    fi
    touch $LOCK

    ##### PRE CHECK + WHO I AM? PRIMARY/SECONDARY #####################################################
    echo "`date +%H:%M:%S` - PRECHECK ON" >> $LOG
    TIMESTART=`date +%s`

    [ -e "$BKTARDESTFLD" ] || mkdir "$BKTARDESTFLD"
    [ -e "$BKTARDESTFLD" ] || echo "$BKTARDESTFLD could not be created.. abort.." >> $LOG
    [ -e "$BKTARDESTFLD" ] || exit

    [ -e "$drbddev" ] || echo "no $drbddev - abort.." >> $LOG
    [ -e "$drbddev" ] || exit

    [ -e "$mountpoint" ] || mkdir "$mountpoint"
    [ -e "$mountpoint" ] || echo "$mountpoint could not be created.. abort.." >> $LOG
    [ -e "$mountpoint" ] || exit

    echo -n "`date +%H:%M:%S` - drbd role check: " >> $LOG
    if [ "$ROLE" = "Primary" ]; then
        echo "primary => abort!" >> $LOG
        rm $LOCK
        exit
    elif [ "$ROLE" = "Secondary" ]; then
        echo "secondary => ok" >> $LOG
    else
        echo "CONFUSION - i dont know which role we have! abort.." >> $LOG
        exit
    fi

    # every configuration variable must be non-empty
    for var in iface resource drbddev mountpoint localip remoteip LOG BKTARDESTFLD BKTARDESTFILE BKTARSOURCES LOCK; do
        if [ -z "${!var}" ]; then
            echo "var \$$var is empty"
            exit 1
        fi
    done

    echo "`date +%H:%M:%S` - PRECHECK DONE - all right!" >> $LOG

    ###################################################################################################
    ### DELETE OLD BACKUPS ###
    ###################################################################################################

    echo "`date +%H:%M:%S` - Clean up $BKTARDESTFLD" >> $LOG
    find $BKTARDESTFLD -name 'backup_raw*' -ctime +1 >> $LOG ## -ctime +1 => last file status change more than 1x24h ago
    find $BKTARDESTFLD -name 'backup_raw*' -ctime +1 -exec rm {} \; ## and remove them..
    echo "`date +%H:%M:%S` - Clean up done" >> $LOG

    ###################################################################################################
    ### NOW BREAK UP SYNCHRONISATION ###
    ###################################################################################################

    ##### DISCONNECT IFACE ############################################################################
    echo "`date +%H:%M:%S` - STARTING BACKUP" >> $LOG
    echo "`date +%H:%M:%S` - >> shutting down interface $iface <<" >> $LOG
    ifconfig $iface down

    # check iface is down
    link=`ethtool $iface | grep "Link detected" | awk '{ print $3 }'`
    if [ "$link" = yes ]; then
        echo " ABORT - Interface $iface is still up! abort.." >> $LOG
        ERRSUM=`expr $ERRSUM + 1`
        exit
    elif [ "$link" = no ]; then
        echo " OK - Interface $iface done" >> $LOG
    else
        echo " CONFUSION - Interface $iface - failure! abort.." >> $LOG
        ERRSUM=`expr $ERRSUM + 1`
    fi

    ##### DRBD ROLE TO PRIMARY ########################################################################
    echo "`date +%H:%M:%S` - >> drbd role to primary <<" >> $LOG
    drbdadm primary $resource
    sleep 2

    # check drbd role
    role=`drbdadm role $resource | awk -F/ '{ print $1 }'`
    if [ "$role" = "Primary" ]; then
        echo " OK - drbd state of resource $resource is now primary" >> $LOG
    elif [ "$role" = "Secondary" ]; then
        echo " ABORT - drbd state of resource $resource is still secondary! abort.." >> $LOG
        ERRSUM=`expr $ERRSUM + 1`
        exit
    else
        echo " CONFUSION - drbd state of resource $resource ... abort" >> $LOG
        ERRSUM=`expr $ERRSUM + 1`
        exit
    fi

    ###### MOUNT DRBD DEVICE ##########################################################################
    echo "`date +%H:%M:%S` - >> mounting drbd device <<" >> $LOG
    mount $drbddev $mountpoint
    rc=$?    # keep mount's exit status; a later $? would refer to echo or [
    if [ $rc = 0 ]; then
        echo " OK - mounting of $drbddev to $mountpoint done" >> $LOG
    elif [ $rc = 1 ]; then
        echo " ABORT - mounting of $drbddev to $mountpoint failed! abort.." >> $LOG
        ERRSUM=`expr $ERRSUM + $rc`
        exit
    else
        echo " CONFUSION - i have no plan whats going on.. abort.." >> $LOG
        ERRSUM=`expr $ERRSUM + $rc`
        exit
    fi

    ##### TAR OF MYSQL RAW FILES ######################################################################
    echo "`date +%H:%M:%S` - >> starting tar backup <<" >> $LOG
    tar -cjPf $BKTARDESTFILE $BKTARSOURCES
    rc=$?    # keep tar's exit status; a later $? would refer to echo or [
    if [ $rc = 0 ]; then
        echo " OK (`date +%H:%M:%S`) - tar backup from $BKTARSOURCES to $BKTARDESTFILE finished" >> $LOG
    elif [ $rc = 1 ]; then
        # no exit here, so the steps below can still restore synchronisation
        echo " ABORT (`date +%H:%M:%S`) - tar backup has problems! $BKTARSOURCES to $BKTARDESTFILE failed" >> $LOG
        ERRSUM=`expr $ERRSUM + $rc`
    else
        echo " CONFUSION (`date +%H:%M:%S`) - i have no plan what happened! abort.." >> $LOG
        ERRSUM=`expr $ERRSUM + $rc`
        exit
    fi

    ###################################################################################################
    ### NOW BACK TO SYNCHRONISATION ###
    ###################################################################################################
    echo "`date +%H:%M:%S` - BACK TO SYNC" >> $LOG

    ##### UNMOUNTING DRBD DEVICE ######################################################################
    echo "`date +%H:%M:%S` - >> unmount $mountpoint <<" >> $LOG
    if mount | grep -q "$mountpoint"; then
        umount $mountpoint
        rc=$?
        if [ $rc = 0 ]; then
            echo " OK - unmounting done" >> $LOG
        else
            echo "ABORT - unmounting failed!! abort.." >> $LOG
            ERRSUM=`expr $ERRSUM + $rc`
            exit
        fi
    else
        echo "$mountpoint is not mounted.. but.. forward.." >> $LOG
    fi

    ##### HIGH CRITICAL – DRBD ROLE SECONDARY ########################################################
    echo "`date +%H:%M:%S` - >> drbd role to secondary <<" >> $LOG

    role=`drbdadm role $resource | awk -F/ '{ print $1 }'`
    if [ "$role" = "Primary" ]; then
        echo " OK - drbd role is currently $role - now set to Secondary" >> $LOG
        drbdadm disconnect $resource
        drbdadm secondary $resource
        rc=$?
        if [ $rc = 0 ]; then
            sleep 1
            role=`drbdadm role $resource | awk -F/ '{ print $1 }'`
            if [ "$role" = "Secondary" ]; then
                echo " OK - drbd role is now $role" >> $LOG
                RESYNC=1
            else
                echo " ABORT - drbd role is not secondary - role: $role !! abort.." >> $LOG
                RESYNC=0
                ERRSUM=`expr $ERRSUM + 1`
                exit
            fi
        else
            echo " FAILED - drbd role to secondary has errors.. abort.." >> $LOG
            ERRSUM=`expr $ERRSUM + $rc`
            exit
        fi
    else
        echo " WARN - drbd role is not primary!! but.. forward.." >> $LOG
        echo "role: $role"
        RESYNC=1
    fi

    ##### CRITICAL – CONNECT IFACE ####################################################################
    echo "`date +%H:%M:%S` - >> connect iface <<" >> $LOG
    link=`ethtool $iface | grep "Link detected" | awk '{ print $3 }'`

    if [ "$link" = no ]; then
        ifconfig $iface up
        link=`ethtool $iface | grep "Link detected" | awk '{ print $3 }'`
        if [ "$link" = yes ]; then
            echo " OK - link is now connected!" >> $LOG

            # ping localip
            if ping -c 1 $localip > /dev/null; then
                echo " OK - ping $localip success" >> $LOG
            else
                echo " FAILED - ping $localip failed!" >> $LOG
                ERRSUM=`expr $ERRSUM + 1`
            fi
            # ping remoteip
            if ping -c 1 $remoteip > /dev/null; then
                echo " OK - ping $remoteip success" >> $LOG
            else
                echo " FAILED - ping $remoteip failed!" >> $LOG
                ERRSUM=`expr $ERRSUM + 1`
            fi
        else
            echo " ABORT - link is not connected!! abort.." >> $LOG
            ERRSUM=`expr $ERRSUM + 1`
            exit
        fi
    else
        echo "WARN - link IS connected - should not be - but.. forward.." >> $LOG
    fi

    ##### HIGHEST CRITICAL – DRBD DISCARD DATA ########################################################
    echo "`date +%H:%M:%S` - >> drbd resync - discard local data <<" >> $LOG
    [ -z "$RESYNC" ] && echo "`date +%H:%M:%S` - VAR RESYNC is empty - exit!!" >> $LOG
    [ -z "$RESYNC" ] && exit

    #echo "resync: $RESYNC"

    if [ "$RESYNC" = 1 ]; then
        drbdadm -- --discard-my-data connect $resource
        rc=$?
        if [ $rc = 0 ]; then
            echo " OK - drbd discarding local data" >> $LOG
        else
            echo " FAILED - drbd discarding local data" >> $LOG
            ERRSUM=`expr $ERRSUM + $rc`
        fi
    else
        echo " VAR RESYNC is 0 - no resync!" >> $LOG
        exit
    fi

    ##### SYNC STATE
    echo "`date +%H:%M:%S` - SYNC-STATE CHECK" >> $LOG
    GREPSTRING=sync
    GREPFILE=/proc/drbd

    ### give drbd time to start syncing
    sleep 5

    echo "`date +%H:%M:%S` - are we in sync?" >> $LOG
    # loop on grep's exit status: true as long as the file still mentions "sync"
    while grep -q "$GREPSTRING" "$GREPFILE"; do
        echo "Sync in progress.. wait 1 sec.." >> $LOG
        echo "---------------------------------------------" >> $LOG
        cat $GREPFILE >> $LOG
        echo "---------------------------------------------" >> $LOG
        sleep 1
    done
    echo "`date +%H:%M:%S` Sync process finished!" >> $LOG

    ##### STATE CHECK
    echo "`date +%H:%M:%S` - STATE CHECK" >> $LOG
    echo " Sleeping 60 seconds - giving drbd states chance to take over" >> $LOG
    echo "" >> $LOG
    sleep 60
    ROLELOCAL=`cat /proc/drbd | grep ro: | awk '{ print $3 }' | awk -F: '{ print $2 }' | awk -F/ '{ print $1 }'`
    ROLEREMOTE=`cat /proc/drbd | grep ro: | awk '{ print $3 }' | awk -F: '{ print $2 }' | awk -F/ '{ print $2 }'`

    STATELOCAL=`cat /proc/drbd | grep ro: | awk '{ print $4 }' | awk -F: '{ print $2 }' | awk -F/ '{ print $1 }'`
    STATEREMOTE=`cat /proc/drbd | grep ro: | awk '{ print $4 }' | awk -F: '{ print $2 }' | awk -F/ '{ print $2 }'`

    ROLESTATEERR=0

    # check_state LABEL ACTUAL EXPECTED: log the result and count mismatches
    check_state() {
        if [ "$2" = "$3" ]; then
            echo "$1: $2 - OK" >> $LOG
        else
            echo "$1: $2 - ERROR" >> $LOG
            ROLESTATEERR=`expr $ROLESTATEERR + 1`
            ERRSUM=`expr $ERRSUM + 1`
        fi
    }

    check_state "Role local" "$ROLELOCAL" Secondary
    check_state "Role remote" "$ROLEREMOTE" Primary
    check_state "State local" "$STATELOCAL" UpToDate
    check_state "State remote" "$STATEREMOTE" UpToDate

    echo "" >> $LOG
    echo "********************" >> $LOG
    echo "Error Summary: $ERRSUM" >> $LOG
    echo "********************" >> $LOG
    echo "" >> $LOG

    if [ $ROLESTATEERR -gt 0 ];then

    ###################################################################################################
    ### EMERGENCY – NO REDUNDANCY ###
    ###################################################################################################

        echo >> $LOG
        echo >> $LOG
        echo "#############################################" >> $LOG
        echo " >>>>>> EMERGENCY -- NO REDUNDANCY <<<<<<" >> $LOG
        ERRSUM=`expr $ERRSUM + 255`
        echo >> $LOG
        echo "`date +%H:%M:%S` - emergency - shutting down all relevant services" >> $LOG
        echo >> $LOG

    ### EMERG – IFACE
        echo "`date +%H:%M:%S` - >> shutting down interface $iface <<" >> $LOG
        ifconfig $iface down

        # check iface is down
        link=`ethtool $iface | grep "Link detected" | awk '{ print $3 }'`
        if [ "$link" = yes ]; then
            echo " ABORT - Interface $iface is still up! abort.." >> $LOG
            exit
        elif [ "$link" = no ]; then
            echo " OK - Interface $iface done" >> $LOG
        else
            echo " CONFUSION - Interface $iface - failure! abort.." >> $LOG
        fi

    #### EMERG – DRBD ROLE to SECONDARY
        echo "`date +%H:%M:%S` - >> drbd role to secondary <<" >> $LOG

        role=`drbdadm role $resource | awk -F/ '{ print $1 }'`
        if [ "$role" = "Primary" ]; then
            echo " OK - drbd role is currently $role - now set to Secondary" >> $LOG
            drbdadm disconnect $resource
            if drbdadm secondary $resource; then
                sleep 1
                role=`drbdadm role $resource | awk -F/ '{ print $1 }'`
                if [ "$role" = "Secondary" ]; then
                    echo " OK - drbd role is now $role" >> $LOG
                else
                    echo " ABORT - drbd role is not secondary - role: $role !! abort.." >> $LOG
                fi
            else
                echo " FAILED - drbd role to secondary has errors.. abort.." >> $LOG
            fi
        else
            echo " WARN - drbd role is not primary!! but.. forward.." >> $LOG
            echo "role: $role"
        fi

        echo >> $LOG
        echo "******************************************" >> $LOG
        echo "`date +%H:%M:%S` EMERGENCY STATE: " >> $LOG
        echo "Network Interface $iface: `ethtool $iface | grep Link` " >> $LOG
        echo "DRBD Role Local $resource: `cat /proc/drbd | grep ro: | awk '{ print $3 }' | awk -F: '{ print $2 }' | awk -F/ '{ print $1 }'` " >> $LOG
        echo "DRBD Role Remote $resource: `cat /proc/drbd | grep ro: | awk '{ print $3 }' | awk -F: '{ print $2 }' | awk -F/ '{ print $2 }'` " >> $LOG
        echo "DRBD State Local $resource: `cat /proc/drbd | grep ro: | awk '{ print $4 }' | awk -F: '{ print $2 }' | awk -F/ '{ print $1 }'`" >> $LOG
        echo "DRBD State Remote $resource: `cat /proc/drbd | grep ro: | awk '{ print $4 }' | awk -F: '{ print $2 }' | awk -F/ '{ print $2 }'`" >> $LOG
        echo "" >> $LOG
        echo "********************" >> $LOG
        echo "Error Summary: $ERRSUM" >> $LOG
        echo "********************" >> $LOG
        echo "" >> $LOG
        echo "******************************************" >> $LOG
        echo "" >> $LOG

    fi

    TIMEEND=`date +%s`
    TIMEDIFFSEC=`expr $TIMEEND - $TIMESTART`

    # split the elapsed seconds into minutes and remaining seconds
    while [ $TIMEDIFFSEC -gt 59 ]
    do
        TIMEDIFFSEC=`expr $TIMEDIFFSEC - 60`
        TIMEDIFFMIN=`expr $TIMEDIFFMIN + 1`
    done

    echo "" >> $LOG
    echo "Total Backup Duration: $TIMEDIFFMIN minutes, $TIMEDIFFSEC seconds" >> $LOG
    echo "" >> $LOG
    echo "----------------------------------------------------------------------------------" >> $LOG
    echo "`date +%H:%M:%S` + `date +%Y.%m.%d` - end backing up raw files" >> $LOG
    echo "" >> $LOG
    echo "##################################################################################" >> $LOG

    rm $LOCK

    ##### EMAIL NOTIFICATION ##########################################################################

    cat $LOG | mail -s "$CUSTOMER $HOST Backup DRBD Raw Automated" $EMAIL1 -- -F $MAILSENDER
    cat $LOG | mail -s "$CUSTOMER $HOST Backup DRBD Raw Automated" $EMAIL2 -- -F $MAILSENDER
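One note on the lock file in the script above: most of its abort paths call `exit` without removing `$LOCK`, so a failed run blocks the next one until someone deletes the file by hand. That may be intended, since a half-finished run can leave the node disconnected. If it is not, the following is a minimal sketch of an alternative, not part of the original script. The helper name `with_lock` and the default lock path are made up, and the lock is a directory because `mkdir` fails atomically when it already exists.

```shell
#!/bin/bash
# Hypothetical lock path for this sketch.
LOCK="${LOCK:-/tmp/backup_example.lock}"

# with_lock CMD [ARGS...]: run CMD while holding the lock and release the
# lock on every exit path of CMD.
with_lock() {
    # mkdir is atomic: it fails if the lock directory already exists.
    if ! mkdir "$LOCK" 2>/dev/null; then
        echo "lock exists, abort.." >&2
        return 1
    fi
    (
        # The EXIT trap fires however this subshell ends, including when a
        # shell function passed as CMD calls exit.
        trap 'rmdir "$LOCK"' EXIT
        "$@"
        exit $?
    )
}
```

With the body of the backup wrapped in a function, say `do_backup`, the job would be started as `with_lock do_backup`.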

  3. Why did you wait to reconnect the secondary until after the backup is created? The snapshot should be a view of the backing device at that point in time, so you should be able to reconnect the secondary and let it continue backing the primary while the backup of the snapshot takes place.

    So, for instance:
    – disconnect the secondary
    – snapshot the backing device on the secondary
    – reconnect the secondary to the primary (which will start syncing)
    – mount the snapshot
    – perform the backup on the snapshot
    – unmount the snapshot
    – remove the snapshot

    Am I missing something?

    • You are absolutely correct, and your method is completely valid. In fact, you need not even disconnect the secondary at all.

      However, because of the copy-on-write nature of LVM snapshots, an active snapshot often has a noticeable negative impact on write performance, and when DRBD replicates synchronously this performance hit carries over to the primary. For this reason we normally suggest disconnecting the secondary and not reconnecting it until after the snapshot has been discarded.
