Opened 4 years ago
Closed 4 years ago
#3385 closed defect (fixed)
Revive filter2[34].adblockplus.org
Reported by: | matze | Assignee: | fred, matze |
---|---|---|---|
Priority: | P1 | Milestone: | |
Module: | Infrastructure | Keywords: | |
Cc: | fhd, Kirill, sporz, trev | Blocked By: | |
Blocking: | | Platform: | Unknown / Cross platform |
Ready: | no | Confidential: | no |
Tester: | Unknown | Verified working: | no |
Review URL(s): |
Description
***** Nagios *****
Notification Type: PROBLEM
Host: filter23.adblockplus.org
State: DOWN
Address: filter23.adblockplus.org
Info: PING CRITICAL - Packet loss = 100%
Date/Time: Mon Dec 7 08:18:09 UTC 2015
(Last issue tracked in #3157, though without any obvious cause either.)
Change History (7)
comment:1 Changed 4 years ago by matze
- Cc Kirill sporz added
comment:2 Changed 4 years ago by trev
- Cc trev added
comment:3 Changed 4 years ago by sporz
Connections to filter24 currently time out. Should that be a ticket of its own?
comment:4 Changed 4 years ago by matze
- Summary changed from Revive filter23.adblockplus.org to Revive filter2[34].adblockplus.org
<palant> matze: it's two servers [...], we would try it on one first of course
<matze> palant: very well, let's go with 24 and use 23 for comparison
@sporz Sorry, forgot to update the ticket yesterday.
comment:5 Changed 4 years ago by matze
Both hosts are back in balancing, filter24 with the latest (driver) packages, and monitored for significant differences in behavior.
comment:6 Changed 4 years ago by matze
No. 23 just crashed:
Notification Type: PROBLEM
Host: filter23.adblockplus.org
State: DOWN
Address: filter23.adblockplus.org
Info: PING CRITICAL - Packet loss = 100%
Date/Time: Thu Jan 21 18:50:39 UTC 2016
After 5 weeks this is the first incident with any of the servers in question. Analysis pending.
comment:7 Changed 4 years ago by matze
- Resolution set to fixed
- Status changed from new to closed
As expected, the server (or rather its uplink) went down with the exact same symptoms as before.
I consider this partial confirmation of our earlier theory, so I updated the drivers and put the host back in balancing. We'll update the remaining ones next week; @fred and I have scheduled another complete deployment of all servers anyway.
When filter23 and filter24 go down, their kern.log is full of "eth0: link up" messages. Judging by my googling, this indicates an issue with the network adapter driver.
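For comparing the updated host against the one still on the old driver, counting those link messages per host might be handy. The following is only a rough sketch: the function names are made up for illustration, and it assumes each message contains the literal text "eth0: link up" as quoted above (the exact wording depends on the driver, so the pattern may need adjusting).

```python
import re
from typing import Iterable


def count_link_up(lines: Iterable[str], iface: str = "eth0") -> int:
    """Count log lines that report '<iface>: link up'.

    Assumption: the kernel message contains the literal text
    '<iface>: link up' (matched case-insensitively). Other drivers
    may phrase it differently.
    """
    pattern = re.compile(rf"\b{re.escape(iface)}: link up", re.IGNORECASE)
    return sum(1 for line in lines if pattern.search(line))


def count_link_up_in_file(path: str, iface: str = "eth0") -> int:
    """Same as count_link_up, but reads the lines from a log file."""
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, errors="replace") as handle:
        return count_link_up(handle, iface)
```

Running count_link_up_in_file("/var/log/kern.log") on both hosts (the path is the usual Debian/Ubuntu location and is an assumption here) would give a number to compare after each driver update.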