Opened 5 years ago

Closed 5 years ago

#2342 closed defect (fixed)

Change hard-drive on filter12.adblockplus.org

Reported by: matze
Assignee: matze
Priority: P2
Milestone:
Module: Infrastructure
Keywords:
Cc: fhd, trev, fred
Blocked By:
Blocking: #2343
Platform: Unknown
Ready: yes
Confidential: no
Tester:
Verified working: no
Review URL(s):

Description

The RAID on filter12.adblockplus.org has failed:

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sda2[2](S) sdb2[1]
      523968 blocks super 1.2 [2/1] [_U]
      
md2 : active raid1 sda3[2] sdb3[1]
      483532672 blocks super 1.2 [2/2] [UU]
      
md0 : active raid1 sda1[2] sdb1[1]
      4192192 blocks super 1.2 [2/2] [UU]
      
unused devices: <none>
# mount | grep md1
/dev/md1 on /boot type ext3 (rw)

Re-adding /dev/sda2 to the array triggers an error during synchronization, which then aborts prematurely. Yet the kernel log and SMART data show that it is /dev/sdb causing the issue:

# dmesg | tail
[383046.897404] end_request: I/O error, dev sdb, sector 9440270
[383046.897450] md/raid1:md1: sdb: unrecoverable I/O read error for block 1047040
[383046.897463] ata4: EH complete
[383047.213276] RAID1 conf printout:
[383047.213289]  --- wd:1 rd:2
[383047.213296]  disk 0, wo:1, o:1, dev:sda2
[383047.213300]  disk 1, wo:0, o:1, dev:sdb2
[383047.220068] RAID1 conf printout:
[383047.220076]  --- wd:1 rd:2
[383047.220082]  disk 1, wo:0, o:1, dev:sdb2
# smartctl -x /dev/sdb | grep -B1 -i raw_read_error_rate
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   100   100   051    -    165
# smartctl -x /dev/sda | grep -B1 -i raw_read_error_rate
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   100   100   051    -    0

I have already removed the host from balancing and backed up the logs (using scp -p to preserve timestamps etc.). Hetzner has been instructed to replace the HDD.
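
For reference, the degraded state visible in the /proc/mdstat output above (md1 showing [2/1] [_U]) could be detected automatically. The following is only a minimal sketch, not part of our actual monitoring: the function name degraded_arrays and the embedded sample text are hypothetical, and the parsing assumes the mdstat layout shown in this ticket (an "mdN :" header line followed by a line carrying the [n/m] [UU] status).

```python
import re

# Sample copied from the /proc/mdstat output quoted in this ticket.
# On a live host one would read the text from /proc/mdstat instead.
MDSTAT = """\
md1 : active raid1 sda2[2](S) sdb2[1]
      523968 blocks super 1.2 [2/1] [_U]

md2 : active raid1 sda3[2] sdb3[1]
      483532672 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[2] sdb1[1]
      4192192 blocks super 1.2 [2/2] [UU]
"""


def degraded_arrays(mdstat_text):
    """Return {array_name: status_string} for arrays with a missing member.

    Hypothetical helper: a '_' in the status string (e.g. '_U') marks a
    slot without an active device.
    """
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        # Header line, e.g. "md1 : active raid1 sda2[2](S) sdb2[1]"
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status line, e.g. "523968 blocks super 1.2 [2/1] [_U]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                result[current] = status.group(3)
            current = None
    return result


if __name__ == "__main__":
    for name, state in degraded_arrays(MDSTAT).items():
        print(f"{name} is degraded: [{state}]")
```

Note that such a check only reports which array is degraded. As this ticket shows, it does not tell which disk is actually faulty; that still requires looking at dmesg and the SMART data.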

Change History (4)

comment:1 Changed 5 years ago by matze

Note that bringing the box back up afterwards may be a bit tricky: /boot will most likely be corrupted, because we can only use the data on /dev/sda2, which had dropped out of the RAID earlier. Thus I will probably modify #2341 to require only a new RAID setup for server_15.adblockplus.org and create a new ticket to reinstall filter12 instead (copying the TODO list).

comment:2 Changed 5 years ago by matze

By the way, please note that Hetzner replaced this server just last week or so, except for the hard drives. So /dev/sda will be the only original component after the operation.

comment:3 Changed 5 years ago by matze

  • Blocking 2343 added

comment:4 Changed 5 years ago by matze

  • Resolution set to fixed
  • Status changed from new to closed

The hard drive has been replaced; the server must now be reinstalled (see above).
