
Re: Why didn't software RAID detect a faulty drive?



On Sat, 18 Jul 2009, Seth Mattinen wrote:
> It's running software raid, so why is it locking up? I managed to

Because the EH (the kernel's error handling for the port) can be quite...
annoying to other ports on the same controller, especially if the controller
is a shitty one (I don't know if that's the case).

> Huh, I think to myself, stupid thing didn't work. So I try to
> manually fault it because it didn't figure it out on its own:
> 
> # mdadm /dev/md0 --fail /dev/sdb1
> 
> But that didn't help either:
> 
> [3948838.514699] raid1: Disk failure on sdb1, disabling device.
> [3948838.514702] raid1: Operation continuing on 1 devices.
> [3948846.397781] RAID1 conf printout:
> [3948846.409726]  --- wd:1 rd:2
> [3948846.418353]  disk 0, wo:0, o:1, dev:sda1
> [3948846.430623]  disk 1, wo:1, o:0, dev:sdb1
> [3948846.452002] md: recovery of RAID array md0
> [3948846.464006] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> [3948846.482338] md: using maximum available idle IO bandwidth (but
> not more than 200000 KB/sec) for recovery.
> [3948846.511468] md: using 128k window, over a total of 78148096 blocks.
> [3948846.530732] md: resuming recovery of md0 from checkpoint.
> [3948846.547400] md: md0: recovery done.
> [3948846.568079] RAID1 conf printout:
> [3948846.576383]  --- wd:1 rd:2
> [3948846.585012]  disk 0, wo:0, o:1, dev:sda1
> [3948846.597284]  disk 1, wo:1, o:0, dev:sdb1
> [3948846.617747] md: recovery of RAID array md0
> [3948846.630524] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> [3948846.648489] md: using maximum available idle IO bandwidth (but
> not more than 200000 KB/sec) for recovery.
> [3948846.678992] md: using 128k window, over a total of 78148096 blocks.
> [3948846.696152] md: resuming recovery of md0 from checkpoint.
> [3948846.715443] md: md0: recovery done.
> 
> This kept repeating until I pulled the plug. Luckily it remembered
> it should stay faulted when it came back up.
> 
> It's running stable/lenny. What happened here?

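I have never seen anything like this.  Do you have any daemons trying to
provide "hot spare" services for MD?  THAT could be it: if something keeps
re-adding sdb1 right after it gets failed, MD would restart the recovery
every time.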

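Either way, it helps to confirm what MD itself thinks of each member before
blaming a daemon.  Below is a minimal sketch, assuming the usual /proc/mdstat
layout in which a failed member carries an "(F)" suffix.  The function names
and the sample text are made up for illustration; they are not taken from
your box.

```python
import re

# One member entry on an mdstat array line, e.g. "sdb1[1](F)" or "sda1[0]".
MEMBER_RE = re.compile(r"([\w-]+)\[(\d+)\]((?:\([A-Z]\))*)")

# Hypothetical sample in /proc/mdstat format, not captured from a real system.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      78148096 blocks [2/1] [U_]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array: [(device, slot, flags), ...]} from mdstat-style text."""
    arrays = {}
    for line in text.splitlines():
        # Array lines look like "md0 : active raid1 ..."; skip everything else.
        match = re.match(r"^(md\w+)\s*:\s*(.*)$", line)
        if not match:
            continue
        name, rest = match.groups()
        members = []
        for dev, slot, flags in MEMBER_RE.findall(rest):
            # flags is a string such as "(F)"; keep only the letters.
            members.append((dev, int(slot), re.findall(r"\(([A-Z])\)", flags)))
        arrays[name] = members
    return arrays


def faulty_members(text):
    """Return {array: [devices flagged (F)]} for every array in the text."""
    return {
        name: [dev for dev, _slot, flags in members if "F" in flags]
        for name, members in parse_mdstat(text).items()
    }


if __name__ == "__main__":
    print(faulty_members(SAMPLE))
```

On a live system you would feed it the contents of /proc/mdstat instead of
SAMPLE.  If sdb1 keeps losing its "(F)" flag without you touching it,
something else is re-adding the disk.
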
-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

