Opened on 04/30/2015 at 02:27:05 PM
Closed on 05/04/2015 at 10:13:08 AM
#2441 closed defect (fixed)
Revive filter9.adblockplus.org
Reported by: | matze | Assignee: | matze |
---|---|---|---|
Priority: | P1 | Milestone: | |
Module: | Infrastructure | Keywords: | |
Cc: | fred, Kirill, fhd, trev | Blocked By: | |
Blocking: | | Platform: | Unknown |
Ready: | yes | Confidential: | no |
Tester: | | Verified working: | no |
Review URL(s): |
Description
***** Nagios *****
Notification Type: PROBLEM
Host: filter9.adblockplus.org
State: DOWN
Address: filter9.adblockplus.org
Info: PING CRITICAL - Packet loss = 100%
Date/Time: Thu Apr 30 11:01:11 UTC 2015
Attachments (0)
Change History (6)
comment:1 Changed on 04/30/2015 at 02:30:30 PM by matze
comment:2 Changed on 04/30/2015 at 02:40:14 PM by matze
- Cc Kirill fhd trev added
- Resolution set to fixed
- Status changed from new to closed
Sync is finished, dmesg(1) no longer shows anything out of the ordinary, and the host is back in balancing. Note, however, that most of today's logs now contain a sequence of a few NUL bytes in between.
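To see which logs are affected and where, something like the following could be used. This is only a minimal sketch: the function names `find_nul_runs` and `scan_log` are made up for illustration, and it assumes the log files are small enough to be read into memory in one go.

```python
import re
from pathlib import Path


def find_nul_runs(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, length) for every run of consecutive NUL bytes."""
    return [(m.start(), m.end() - m.start()) for m in re.finditer(rb"\x00+", data)]


def scan_log(path: str) -> list[tuple[int, int]]:
    """Hypothetical helper: read a whole log file and report its NUL runs."""
    return find_nul_runs(Path(path).read_bytes())
```

Calling `scan_log` on each of the day's log files would give the byte offset and length of every NUL run, which should make it easier to tell whether the gaps line up with the time of the outage.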
comment:3 Changed on 05/02/2015 at 03:55:38 PM by matze
- Resolution fixed deleted
- Status changed from closed to reopened
The server went down again; investigation is ongoing.
comment:4 Changed on 05/02/2015 at 04:16:53 PM by matze
There's no obvious cause for the host going down, not even a load peak. All logs seem to show nothing out of the ordinary, and no hardware check has raised any flags. There were some minor differences in the BIOS setup, though (compared to the other servers of the same type), namely in power management. I've fixed them, but that's something of a long shot.
The server is now running a stress test; let's see if that produces any interesting results.
comment:5 Changed on 05/03/2015 at 06:36:51 AM by matze
The stress test finished without incident. The host is back in balancing for now, to see whether it remains unstable after the power management changes.
comment:6 Changed on 05/04/2015 at 10:13:08 AM by matze
- Resolution set to fixed
- Status changed from reopened to closed
No further issues so far - thus I consider this ticket done.
I've unregistered the host from balancing and tried to restart it into the rescue system. The hard reset had no effect, so I've instructed Hetzner to reboot the server manually. Here are some excerpts from the response:
After that, the RAID required a re-sync, which is almost finished by now.