Setting DRBD in Primary / Primary: common commands to sync, resync and make changes
As we have been setting up our farm with an NFS share, the DRBD primary / primary connection between servers is important.
We are setting up a group of /customcommands/ that we can run to keep track of the common status and maintenance commands we use. But for the times we have to create the setup, change its structure, sync and resync, recover, grow or move the servers, we need to document our ‘Best Practices’ and how we can recover.
From a base server install:
apt-get install gcc make flex
wget http://oss.linbit.com/drbd/8.4/drbd-8.4.1.tar.gz
tar xvfz drbd-8.4.1.tar.gz
cd drbd-8.4.1/
./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-km
make KDIR=/lib/modules/3.2.0-58-virtual/build
make install
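The KDIR above is pinned to one kernel release (3.2.0-58-virtual). Here is a minimal sketch for deriving it instead, assuming you are building for the kernel that is currently running and that its headers live in the usual /lib/modules/<release>/build location. kdir_for_kernel is a hypothetical helper name, not part of DRBD.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel build directory for a release,
# defaulting to the running kernel (uname -r).
kdir_for_kernel() {
    printf '/lib/modules/%s/build\n' "${1:-$(uname -r)}"
}

# Example: show the make invocation for the running kernel.
echo "make KDIR=$(kdir_for_kernel)"
```

Passing a release explicitly (kdir_for_kernel 3.2.0-58-virtual) gives the same path used in the commands above.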
Set up in /etc/drbd.d/disk.res
resource r0 {
protocol C;
syncer { rate 1000M; }
startup {
wfc-timeout 15;
degr-wfc-timeout 60;
become-primary-on both;
}
net {
# requires a clustered filesystem (ocfs2) for 2 primaries mounted simultaneously
allow-two-primaries;
after-sb-0pri discard-zero-changes;
after-sb-1pri discard-secondary;
after-sb-2pri disconnect;
cram-hmac-alg sha1;
shared-secret "sharedsanconfigsecret";
}
on server1 {
device /dev/drbd0;
disk /dev/xvdb;
address 192.168.100.10:7788;
meta-disk internal;
}
on server2 {
device /dev/drbd0;
disk /dev/xvdb;
address 192.168.100.11:7788;
meta-disk internal;
}
}
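drbdadm picks the "on" section whose name matches the local hostname (uname -n), so the names in the resource file have to agree with /etc/hostname on each node. A minimal sketch of a check for that, assuming one "on NAME {" per line as in the file above. res_hosts and host_in_res are hypothetical helper names.

```shell
#!/bin/sh
# Print the host name of every "on NAME {" line in a DRBD resource file.
res_hosts() {
    awk '$1 == "on" { h = $2; sub(/[{]$/, "", h); print h }' "$1"
}

# Succeed if host $2 (default: this machine) has an "on" section in file $1.
host_in_res() {
    res_hosts "$1" | grep -qxF -- "${2:-$(uname -n)}"
}

# Example: only runs the check if the resource file exists on this machine.
res=/etc/drbd.d/disk.res
if [ -f "$res" ]; then
    host_in_res "$res" || echo "no 'on $(uname -n)' section in $res" >&2
fi
```

Running it on both nodes after cloning catches a forgotten hostname change before drbd refuses to start the resource.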
Set up your /etc/hosts
192.168.100.10 server1
192.168.100.11 server2
Set up /etc/hostname with
server1
Reboot, verify your settings and SAVE A DRBDVMTEMPLATE. Then clone your VM to a new server called server2.
On the clone, set up /etc/hostname with
server2
Start drbd with /etc/init.d/drbd start. This will likely try to create the connection, but this is where we are going to ‘play’ to learn the commands and how we can sync, etc.
cat /proc/drbd #shows the status of the connections
server1> drbdadm down r0 #turns off the drbd resource and connection
server2> drbdadm down r0 #turns off the drbd resource and connection
server1> drbdadm -- --force create-md r0 #creates a new set of metadata on the drive, which erases drbd's memory of the sync status in the past
server2> drbdadm -- --force create-md r0 #creates a new set of metadata on the drive, which erases drbd's memory of the sync status in the past
server1> drbdadm up r0 #turns on the drbd resource and connection; they should connect without a problem, with no memory of a past sync history
server2> drbdadm up r0 #turns on the drbd resource and connection; they should connect without a problem, with no memory of a past sync history
server1> drbdadm -- --clear-bitmap new-current-uuid r0 #this creates a new 'disk sync image', essentially telling drbd that the servers are blank so no sync needs to be done; both servers are immediately UpToDate/UpToDate in /proc/drbd
server1> drbdadm primary r0
server2> drbdadm primary r0 #make both servers primary; now when you put a filesystem on /dev/drbd0 you will be able to read and write on both systems as though they are local
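Since /proc/drbd is the thing we keep checking, here is a minimal sketch of a /customcommands/ style helper that pulls out just the connection state, roles and disk states for one device. It assumes the 8.4 status line layout ("0: cs:... ro:... ds:..."). drbd_state and drbd_dual_primary_ok are hypothetical helper names.

```shell
#!/bin/sh
# Print "connection roles disks" for one minor, reading /proc/drbd-style text on stdin.
drbd_state() {
    awk -v want="${1:-0}:" '
        $1 == want {
            for (i = 2; i <= NF; i++)
                if (split($i, kv, ":") == 2) f[kv[1]] = kv[2]
            print f["cs"], f["ro"], f["ds"]
        }'
}

# Succeed only when the minor is connected, dual primary and fully in sync.
drbd_dual_primary_ok() {
    [ "$(drbd_state "${1:-0}")" = "Connected Primary/Primary UpToDate/UpToDate" ]
}

# Example: only runs if DRBD is loaded on this machine.
if [ -r /proc/drbd ]; then
    drbd_state 0 < /proc/drbd
fi
```

After the last two commands above, drbd_dual_primary_ok 0 < /proc/drbd should succeed on both servers.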
So, let's do some failure scenarios. Say we lose a server. It doesn’t matter which one since they are both primaries, but in this case we will say server2 failed. Create a new VM from DRBDVMTEMPLATE, which already has drbd built on it with the configuration, or create another one using the instructions above.
Open /etc/hostname and set it to
server2
Reboot. Make sure drbd is running (/etc/init.d/drbd start).
server1>watch cat /proc/drbd #watch the status of drbd; it is very useful and telling about what is happening. You will want DRBD to be Connected Primary/Unknown UpToDate/DUnknown
server2>drbdadm down r0
server2>drbdadm wipe-md r0 #this is an optional step that wipes out the metadata. I have not seen that it does anything different from creating the metadata with the command below, but it is useful to know the command in case you want to get rid of md on your disk
server2>drbdadm -- --force create-md r0 #this makes sure that there is no partial resync data left over from where you cloned it from
server2>drbdadm up r0 #this brings drbd on server2 back into the resource and connects them; it will immediately start syncing. You should see SyncSource Primary/Secondary UpToDate/Inconsistent on server1. For me it was going to take 22 hours for my test of 1 TB (10 MB / second)
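While a resync is running, /proc/drbd also carries a progress bar and an estimated finish time. A minimal sketch for extracting just those two values, assuming the 8.4 progress lines contain "sync'ed: NN.N%" and "finish: H:MM:SS" (check this against your own output). drbd_sync_progress is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<percent> <finish>" from /proc/drbd-style text on stdin,
# or nothing at all if no resync is in progress.
drbd_sync_progress() {
    awk '
        /sync.ed:/ { for (i = 1; i <= NF; i++) if ($i ~ /%$/) pct = $i }
        /finish:/  { for (i = 1; i < NF; i++) if ($i == "finish:") eta = $(i + 1) }
        END { if (pct != "") print pct, eta }'
}

# Example: only runs if DRBD is loaded on this machine.
if [ -r /proc/drbd ]; then
    drbd_sync_progress < /proc/drbd
fi
```

Wrapping the last line in watch gives a compact view of how far along the sync is, without the rest of the counters.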
Let's get funky: what happens if you stop everything in the middle of a sync?
server1>drbdadm down r0 #we shut down the drbd resource that has the most up-to-date information. On server2, /proc/drbd shows Secondary/Unknown Inconsistent/DUnknown: server2 does not know about server1 any more, but it still knows that it is inconsistent. (An insertable step here could be on server2: drbdadm down r0; drbdadm up r0, with no change to the effect.)
server1>drbdadm up r0 #this brings server1 back online; /proc/drbd on server1 shows SyncSource, server2 shows SyncTarget. server1 came back up as the UpToDate server, server2 was Inconsistent, and drbd figured it out
Where things started to go wrong and become less ‘syncable’ was when both servers were down and had to be brought back up separately, with a new uuid created on each of them separately. So let's simulate that the drbd config fell apart and we have to put it together again.
server2>drbdadm disconnect r0; drbdadm -- --force create-md r0; drbdadm connect r0 #start the sync process over
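Because recreating metadata throws away a node's sync history, it is worth being able to rehearse the commands before running them. Here is a minimal sketch of a dry-run wrapper, using the same down / create-md / up sequence used for the replacement server2 earlier. That choice is an assumption on my part: I have not verified that create-md works on a resource that is only disconnected. run and reset_node_metadata are hypothetical helper names.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN is 1 (the default here).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Discard this node's DRBD metadata for resource $1 and rejoin as a blank peer.
# Assumption: the resource is taken fully down before create-md.
reset_node_metadata() {
    run drbdadm down "$1" &&
    run drbdadm -- --force create-md "$1" &&
    run drbdadm up "$1"
}

# Prints the three commands; set DRY_RUN=0 to actually execute them.
reset_node_metadata r0
```

Run it only on the node whose data you are willing to lose; the other node stays the sync source.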