Category: Server Hardware
Mdadm – Failed disk recovery (unreadable disk)
Well, after nine more months I ran into another disk failure. (The first disk failure is written up here: https://www.matraex.com/mdadm-failed-disk-recovery/)
But this time the system was unable to read the disk at all:
#fdisk /dev/sdb
This process just hung for a few minutes. It seemed I couldn't simply run a few commands like before to remove and re-add the disk to the software RAID, so I had to replace the disk. Before I went to the datacenter I ran:
#mdadm /dev/md0 --remove /dev/sdb1
I went to our data center and found the disk that showed the failure (it was sdb, so I 'assumed' it was the center disk of the three, and I was able to verify that because it was not blinking with normal disk activity). I pulled the disk and swapped in a spare I had sitting there waiting for exactly this to happen. Then I ran a command to make sure the new disk was partitioned correctly to fit into the array:
#fdisk /dev/sdb
This command did not hang, but it responded with "cannot read disk". Darn. It looked like some error in the OS or on the backplane kept a newly added disk from being readable, so I scheduled a restart of the server. When the server came back up, fdisk could read the disk. It looked like I had used the disk for something before, but since it came out of my spare pile I knew I could wipe it, so I partitioned it with a single partition to match what md was expecting (the same as the old disk):
#fdisk /dev/sdb
>d 2 – deletes the old partition 2
>d 1 – deletes the old partition 1
>n – creates a new partition
>p – sets the new partition as primary
>1 – sets the new partition as number 1
><ENTER> – accept the default starting cylinder
><ENTER> – accept the default ending cylinder
>w – writes the partition changes to disk and exits fdisk (no need to break out with Ctrl+C)
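As a side note: if the replacement disk is the same size as the surviving disks, an alternative to stepping through fdisk by hand is to clone the partition table (including the partition types) straight from a healthy disk with sfdisk. This is just a sketch of that approach, not what I ran here:
#sfdisk -d /dev/sda > sda.layout – dump sda's partition table to a file
#sfdisk /dev/sdb < sda.layout – write the same layout onto the new sdb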
Now the partition is ready to add back to the raid array
#mdadm /dev/md0 --add /dev/sdb1
And we can immediately see the progress
#mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Wed Jul 18 00:57:18 2007
     Raid Level : raid5
     Array Size : 140632704 (134.12 GiB 144.01 GB)
    Device Size : 70316352 (67.06 GiB 72.00 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sat Feb 22 10:32:01 2014
          State : active, degraded, recovering
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 0% complete

           UUID : fe510f45:66fd464d:3035a68b:f79f8e5b
         Events : 0.537869

    Number   Major   Minor   RaidDevice   State
       0       8        1        0        active sync   /dev/sda1
       3       8       17        1        spare rebuilding   /dev/sdb1
       2       8       33        2        active sync   /dev/sdc1
And then to see the progress of rebuilding
#cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sdb1[3] sda1[0] sdc1[2]
      140632704 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]
      [==============>......]  recovery = 71.1% (50047872/70316352) finish=11.0min speed=30549K/sec
md1 : active raid1 sda2[0]
      1365440 blocks [2/1] [U_]
Wow, in the time I have been blogging this it is already 71 percent rebuilt! But wait, what is this? md1 is degraded? I check my monitoring and what do I find but another message showing that md1 failed with the reboot. I was so used to getting the notice saying md0 was down that I did not notice md1 had not come back up with the reboot! How can this be?
It turned out that sdb was in use by both md0 and md1. But even though /dev/sdb could not be read at all and /dev/sdb1 had failed out of the md0 array, somehow the RAID subsystem had not noticed, and had not degraded the md1 array, even though the entire sdb disk was not responding (perhaps sdb2 WAS still responding back then, just not sdb itself; who knows at this point). Maybe the errors on the old disk could even have been corrected by the reboot if I had tried that before replacing the disk, but that doesn't matter any more. All I know is that I have to repartition the sdb device so it can support both the md0 and md1 arrays.
So I had to wait until sdb finished rebuilding, then remove it from md0, use fdisk to destroy the partitions, build new partitions matching sda, and add the disk back to both md0 and md1.
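For reference, that sequence looks roughly like this; it's a sketch, and the layout assumes sdb1 mirrors sda1 for md0 and sdb2 mirrors sda2 for md1:
#mdadm /dev/md0 --fail /dev/sdb1 – mark the freshly rebuilt member failed so it can be removed
#mdadm /dev/md0 --remove /dev/sdb1
#sfdisk -d /dev/sda | sfdisk /dev/sdb – copy sda's partition layout (both partitions and their types) onto sdb
#mdadm /dev/md0 --add /dev/sdb1
#mdadm /dev/md1 --add /dev/sdb2
#cat /proc/mdstat – watch both arrays rebuild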
MDADM – Failed disk recovery (too many disk errors)
This only happens once every couple of years, but occasionally a SCSI disk on one of our servers racks up too many errors and is kicked out of the md array.
And… we have to rebuild it. Perhaps we should replace it since it appears to be having problems, but really, the I in RAID stands for inexpensive (or something), so I would rather lean toward being frugal with the disks and replace them only when required.
I can never remember off the top of my head the commands to recover, so this time I am going to blog it so I can easily find it.
First step, take a look at the status of the arrays on the disk
#cat /proc/mdstat
(I don't have a copy of what the failed drive looks like since I didn't start blogging until after)
Sometimes an infrequent disk error can cause md to fail a hard drive and remove it from an array, even though the disk is fine.
That is what happened in this case, and I knew the disk was at least partially good. The partition that failed was /dev/sdb1, part of a RAID 5; another partition on that same device is part of a RAID 1, and that RAID 1 is still healthy, so I knew the disk itself was fine. So I am only re-adding the disk to the array so it can rebuild. If the disk has a second problem in the next few months I will go ahead and replace it, since the issue tonight probably indicates a disk that is beginning to fail but still has plenty of life in it.
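If you want more evidence than a healthy sibling array before trusting a disk again, smartctl (from the smartmontools package; not something I ran as part of this recovery) will show whether the drive itself is logging problems:
#smartctl -H /dev/sdb – the drive's overall health self-assessment
#smartctl -l error /dev/sdb – the drive's own error log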
The simple process is
#mdadm /dev/md0 --remove /dev/sdb1
This removes the faulty disk from the array. This is the point where you would physically replace the disk in the machine; since I am only going to rebuild the existing disk, I skip that and move to the next step.
#mdadm /dev/md0 --re-add /dev/sdb1
The disk started to reload and VOILA! we are rebuilding and will be back online in a few minutes.
Now you take a look at the status of the arrays
#cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sdb1[3] sdc1[2] sda1[0]
      140632704 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]
      [=======>.............]  recovery = 35.2% (24758528/70316352) finish=26.1min speed=29020K/sec
md1 : active raid1 sda2[0] sdb2[1]
      1365440 blocks [2/2] [UU]
In case you want to do any troubleshooting on what happened, this command is useful for digging through the logs:
#grep mdadm /var/log/syslog -A10 -B10
But this command is the one I use to see the important events related to the failure and rebuild. As I am typing this the rebuild is just over 60% complete, which you can see in the log:
#grep mdadm /var/log/syslog
Jun 15 21:02:02 xxxxxx mdadm: Fail event detected on md device /dev/md0, component device /dev/sdb1
Jun 15 22:03:16 xxxxxx mdadm: RebuildStarted event detected on md device /dev/md0
Jun 15 22:11:16 xxxxxx mdadm: Rebuild20 event detected on md device /dev/md0
Jun 15 22:19:16 xxxxxx mdadm: Rebuild40 event detected on md device /dev/md0
Jun 15 22:27:16 xxxxxx mdadm: Rebuild60 event detected on md device /dev/md0
You can see from the times that it took me just over an hour to respond and start the rebuild. (I know, that seems too long if I had just done this remotely, but when I got the notice I went on site because I thought I would have to do a physical swap, I had to wait a bit while the colo security verified my ID, and I was probably moving a little slow after some nachos at Jalepeno's.) Once the rebuild started, it took about 10 minutes per 20% of the disk.
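On the subject of rebuild speed: the kernel throttles md resync, and if a rebuild is crawling you can check (and, within reason, raise) the limits. This is a general md tunable rather than anything I needed for this rebuild:
#cat /proc/sys/dev/raid/speed_limit_min – current per-device floor in K/sec
#cat /proc/sys/dev/raid/speed_limit_max – current per-device ceiling in K/sec
#echo 50000 > /proc/sys/dev/raid/speed_limit_min – example: raise the floor to roughly 50MB/sec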
————————-
Update: nine months later the disk finally gave out and I had to replace it manually. I blogged about it again:
https://www.matraex.com/mdadm-failed-d…nreadable-disk/
Linux System Discovery
Over the last couple of weeks I have been working on doing some in depth “System Discovery” work for a client.
The client came to us after a major employee restructuring, during which they lost ALL of the technical knowledge of their network.
The potentially devastating business move on their part turned into a very intriguing challenge for me.
They asked me to come in and document what services each of their 3 Linux servers provided.
As I dug in I found that their network had some very unique, intelligent solutions:
- A reliable production network
- Thin Client Linux printing stations, remotely connected via VPN
- Several Object Oriented PHP based web applications
Several open source products had been combined to create robust solutions
It has been a very rewarding experience to document the systems and give ownership of the systems, network and processes back to the owner.
The documentation I provided included:
- A high level network diagram as a quick reference overview for new administrators and developers
- An overall application and major network, server and node object description
- Detailed per-server/node descriptions with connection documentation, critical processes, important paths and files, and dependencies
- Contact Information for the people and companies that the systems rely on.
As a business owner myself, I have tried to help the client recognize that even when they use an outside consultant, it is VERY important that they maintain details of their critical business processes INSIDE their company. There might not be anything in business as rewarding as giving ownership of a "lost" system back to a client.
Debian Lenny Network Boot on Dell 2650 (Broadcom Network Card)
Neither Debian Lenny nor Etch includes the drivers for the Broadcom network cards that come with many Dell servers. I use the Dell 2650 a lot, but I have also had issues with the 1750 and the 1950, and I am sure there are others.
I hear it is a licensing problem with Debian not being able to distribute the drivers, or something along those lines.
Here is the solution I have come up with from my end.
On my PXE boot server (I refer to PXE booting, but I don't describe how to set one up here; check this howto out):
I cd into the directory that my Lenny installation is to be set up in (based on the pxelinux.cfg/default file):
#cd /tftpboot/debian/lenny/i386/
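For context, the entry in pxelinux.cfg/default that points at this directory looks something like the following; the label and paths are just an example of my layout (assuming /tftpboot is the tftp root), so adjust them to match your own setup:
LABEL lenny-i386-install
  KERNEL debian/lenny/i386/linux
  APPEND initrd=debian/lenny/i386/initrd.gz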
I am going to download all of the network installation files for Debian Lenny on i386; this should apply to 64-bit too, though.
# wget http://ftp.nl.debian.org/debian/dists/lenny/main/installer-i386/current/images/netboot/netboot.tar.gz
# wget http://ftp.nl.debian.org/debian/dists/lenny/main/installer-i386/current/images/netboot/debian-installer/i386/initrd.gz
# wget http://ftp.nl.debian.org/debian/dists/lenny/main/installer-i386/current/images/netboot/debian-installer/i386/linux
Download the Broadcom firmware package and extract it to a folder called bnx2:
# wget http://ftp.us.debian.org/debian/pool/non-free/f/firmware-nonfree/firmware-bnx2_0.14+lenny1_all.deb
# dpkg-deb -x firmware-bnx2_0.14+lenny1_all.deb bnx2
Create a temp working directory
# mkdir temp
# cd temp
Extract all of the installation files from the initrd.gz file so you can manipulate them (i.e. include the firmware):
# zcat ../initrd.gz | cpio -idv
Copy all of the firmware drivers from the extracted bnx2 directory into the root of the extracted initrd.gz kernel directory
# cp ../bnx2/lib/firmware/* ../bnx2/usr/share/initramfs-tools/hooks/firmware_bnx2 .
Since this initrd.gz is only used during installation of the OS, the fix so far hasn't addressed installing the Broadcom driver package in the OS itself after installation.
To do this you will need to have the installer select and install the package during installation, using "preseeding".
Create and edit a file called preseed.cfg in the root of the extracted kernel directory (the installer automatically loads a preseed.cfg found in the root of its initrd):
# vi preseed.cfg
Place the following contents in that file (I have also included the SSH server since I typically do a minimum install without ANY packages, but I need ssh):
#automatically select these packages when installing the server
#d-i pkgsel/include string openssh-server firmware-bnx2
base-config apt-setup/non-free boolean true
d-i preseed/late_command string apt-install firmware-bnx2; apt-install openssh-server;
As another shortcut that can shave a tiny bit of time off of your installation: if you do not use USB storage during the install, there is no need to wait through the delay and errors that occur while the system searches for USB storage devices. Deleting the USB storage drivers from the installation kernel prevents these errors:
# rm -rf lib/modules/2.6.26-2-486/kernel/drivers/usb/storage
Now it is time to put the extracted kernel directory back together in the location that the pxe boot is looking for it.
# find . -print0 | cpio -0 -H newc -ov | gzip -c > ../initrd.gz
That is it! You have customized and rebuilt your installation initrd for network boot.
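Before booting a server with it, you can sanity-check that the firmware actually made it into the rebuilt image; the grep pattern is simply what I would expect the bnx2 firmware filenames to contain:
#zcat ../initrd.gz | cpio -t | grep bnx2 – list the archive contents and look for the bnx2 firmware files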
Simply PXE boot into this installation with your Dell or other Broadcom-based server and the drivers will be included.
The concepts used above can also help you set up and customize a net boot that has packages, drivers, or other customizations already selected, and otherwise speed your install along; look into preseeding for more options here.
Adding Disk Space to an Array on a Dell PERC using AFACLI
This blog describes the commands necessary to add a disk to an existing RAID 5 array in the case where you have an empty slot available for the new disk.
The actual manual for afacli can be found here:
http://docs.us.dell.com/support/edocs/storage/57kgr/cli/en/index.htm
Accessing the CLI from the UNIX Prompt
To access the CLI from the UNIX prompt, display a window and type afacli in any directory. The system displays the FASTCMD> prompt, which indicates you can now use CLI commands. The path in the startup file (.login or .cshrc) must include the directory where the software is installed for the command to work in any directory. See your UNIX documentation for information on setting up directory paths in the .login and .cshrc files.
To view all controllers, use 'controller list'.
To connect to the controller with the command line utility, execute:
FASTCMD> open afa0
AFA0>
To show the status of all disks in all arrays and get an overview of the disks in the RAID, execute 'enclosure show status'.
AFA0> enclosure show status
Executing: enclosure show status
Enclosure
ID (B:ID:L) UpTime D:H:M PowerCycle Interval Door Alarm
----------- -------------- ---------- -------- -------- -----
0 0:06:0 0:00:00 0 10 UNLOCKED OFF
Enclosure
ID (B:ID:L) Fan Status
----------- --- -------------
Enclosure
ID (B:ID:L) Power State Status
----------- ----- ------------ -------
Enclosure
ID (B:ID:L) Slot scsiId Insert Status
----------- ---- ------ ------- ------------------------------------------
0 0:06:0 0 0:00:0 1 OK ACTIVATE
0 0:06:0 1 0:01:0 1 OK ACTIVATE
0 0:06:0 2 0:02:0 1 OK ACTIVATE
0 0:06:0 3 0:03:0 1 OK ACTIVATE
0 0:06:0 4 0:255:0 0 OK UNCONFIG EMPTY I/R READY NOTACTIVATE
Enclosure
ID (B:ID:L) Sensor Temperature Threshold Status
----------- ------ ----------- --------- --------
0 0:06:0 0 73 F 120 NORMAL
0 0:06:0 1 69 F 120 NORMAL
Above, there is no disk in slot 4. Insert the new disk and execute the command again to see it appear:
AFA0> enclosure show status
Executing: enclosure show status
Enclosure
ID (B:ID:L) UpTime D:H:M PowerCycle Interval Door Alarm
----------- -------------- ---------- -------- -------- -----
0 0:06:0 0:00:00 0 10 UNLOCKED OFF
Enclosure
ID (B:ID:L) Fan Status
----------- --- -------------
Enclosure
ID (B:ID:L) Power State Status
----------- ----- ------------ -------
Enclosure
ID (B:ID:L) Slot scsiId Insert Status
----------- ---- ------ ------- ------------------------------------------
0 0:06:0 0 0:00:0 1 OK ACTIVATE
0 0:06:0 1 0:01:0 1 OK ACTIVATE
0 0:06:0 2 0:02:0 1 OK ACTIVATE
0 0:06:0 3 0:03:0 1 OK ACTIVATE
0 0:06:0 4 0:04:0 1 OK UNCONFIG ACTIVATE
Enclosure
ID (B:ID:L) Sensor Temperature Threshold Status
----------- ------ ----------- --------- --------
0 0:06:0 0 73 F 120 NORMAL
0 0:06:0 1 73 F 120 NORMAL
You can see that the disk in slot 4 is waiting to be configured. Let's take a look at the RAID 5 container that we are going to add the new disk to; execute 'container list'.
AFA0> container list
Executing: container list
Num Total Oth Chunk Scsi Partition
Label Type Size Ctr Size Usage B:ID:L Offset:Size
----- ------ ------ --- ------ ------- ------ -------------
0 RAID-5 101GB 64KB Valid 0:00:0 64.0KB:33.8GB
/dev/sda 0:01:0 64.0KB:33.8GB
0:02:0 64.0KB:33.8GB
0:03:0 64.0KB:33.8GB
Even though it is visible in the enclosure list above, you will need to execute a “controller rescan” to find the new disk.
AFA0> controller rescan
Executing: controller rescan
Now initialize the disk so it can be used.
AFA0> disk initialize 4
Executing: disk initialize (ID=4)
Finally, you can add the disk to the container. Simply run the container reconfigure command with the container number (in our case 0) and the device number (in our case 4):
AFA0> container reconfigure 0 4
Executing: container reconfigure 0 (ID=4)
Now, wait for the disk to rebuild. You can view the rebuild process with 'task list'.
AFA0> task list
Executing: task list
Controller Tasks
TaskId Function Done% Container State Specific1 Specific2
------ -------- ------- --------- ----- --------- ---------
101 Reconfg 8.4% 0 RUN 00000000 00000000
Also, while adding a new disk to this array, I found that the existing array was only using 25.6 GB of each disk and not the full 36 GB.
I issued a 'container reconfigure' command to utilize more of the space on each disk:
AFA0> container reconfigure /partition_size=36388763000 0
Executing: container reconfigure /partition_size=36,388,763,000 0
PowerEdge 1750 RAID Array Repair
This blog describes some basic commands to repair an array after a disk failure on a Dell PowerEdge 1750 running Linux, via Dell's afacli command line utility.
The actual manual for afacli can be found here:
http://docs.us.dell.com/support/edocs/storage/57kgr/cli/en/index.htm
Accessing the CLI from the UNIX Prompt
To access the CLI from the UNIX prompt, display a window and type afacli in any directory. The system displays the FASTCMD> prompt, which indicates you can now use CLI commands. The path in the startup file (.login or .cshrc) must include the directory where the software is installed for the command to work in any directory. See your UNIX documentation for information on setting up directory paths in the .login and .cshrc files.
To view all controllers use ‘controller list’
To connect to the controller with the command line utility, execute:
FASTCMD> open afa0
AFA0>
To show the status of all disks in all arrays, execute ‘enclosure show status’.
AFA0> enclosure show status
Executing: enclosure show status
Enclosure
ID (B:ID:L) UpTime D:H:M PowerCycle Interval Door Alarm
----------- -------------- ---------- -------- -------- -----
0 0:06:0 0:00:00 0 10 UNKNOWN OFF
Enclosure
ID (B:ID:L) Fan Status
----------- --- -------------
0 0:06:0 0 OK
0 0:06:0 1 OK
0 0:06:0 2 OK
Enclosure
ID (B:ID:L) Power State Status
----------- ----- ------------ -------
Enclosure
ID (B:ID:L) Slot scsiId Insert Status
----------- ---- ------ ------- ------------------------------------------
0 0:06:0 0 0:00:0 0 OK ACTIVATE
0 0:06:0 1 0:01:0 0 OK ACTIVATE
0 0:06:0 2 0:02:0 0 ERROR FAILED CRITICAL WARNING ACTIVATE
0 0:06:0 3 0:03:0 0 ERROR FAILED CRITICAL WARNING ACTIVATE
0 0:06:0 4 0:04:0 0 ERROR FAILED CRITICAL WARNING ACTIVATE
0 0:06:0 5 0:05:0 0 ERROR FAULTY FAILED CRITICAL WARNING ACTIVATE
Enclosure
ID (B:ID:L) Sensor Temperature Threshold Status
----------- ------ ----------- --------- --------
0 0:06:0 0 82 F 120 NORMAL
0 0:06:0 1 86 F 120 NORMAL
Above, the disk in slot 5 is bad, so first deactivate the slot:
AFA0> enclosure prepare slot 0 5
Wait for the lights to go out, then remove the disk and replace it with a functional, identical disk.
Then, activate the slot:
AFA0> enclosure activate slot 0 5
Now, wait for the disk to rebuild. You can view the rebuild process with ‘task list’.
AFA0> task list
Executing: task list
Controller Tasks
TaskId Function Done% Container State Specific1 Specific2
------ -------- ------- --------- ----- --------- ---------
101 Rebuild 0.7% 1 RUN 00000000 00000000
The disk may be ready for use once this is complete; however, if it is not, try the 'disk initialize' command.
disk initialize
To initialize a SCSI disk for use with the currently opened controller, use the disk initialize command. This command writes data structures to the disk so that the controller can use the disk.
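For the failed slot in the example above, that would look something like the following; the ID of 5 is an assumption based on the enclosure listing (0:05:0), so confirm it against 'enclosure show status' first:
AFA0> disk initialize 5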
HINT: If you need to actually see which disk to pull out of the server, 'disk blink' causes that disk drive's light to blink.