Opened 4 years ago

Closed 4 years ago

#2889 closed defect (fixed)

Swap not recognized by the system

Reported by: matze Assignee:
Priority: P3 Milestone:
Module: Infrastructure Keywords:
Cc: fhd, fred Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

Description

The swap-space (md0) on filter27.adblockplus.org is not recognized: Although the partition itself seems perfectly valid, the kernel does not even attempt to use it in any fashion.

While this is not an issue performance-wise (so far the swap is unused on similar servers anyway), it actually breaks our Nagios checks:

Current Status:      WARNING (for 0d 1h 26m 0s)
Status Information: NRPE: Unable to read output
$ python /usr/lib/nagios/plugins/check_memory
Traceback (most recent call last):
  File "/usr/lib/nagios/plugins/check_memory", line 45, in <module>
    swappercentage = round(float(swapfree) / swaptotal * 100)
ZeroDivisionError: float division by zero

Change History (5)

comment:1 Changed 4 years ago by matze

Applied a quick fix to allow for keeping an eye on the load during the current rollout:

$ grep "if swaptotal else" /usr/lib/nagios/plugins/check_memory
  swappercentage = round(float(swapfree) / swaptotal * 100) if swaptotal else 0
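The guard above can be sketched in isolation as follows. Only line 45 of the plugin appears in this ticket, so this is not the plugin's actual code: the function name `swap_free_percentage` and the parsing of `/proc/meminfo`-style text are assumptions made for illustration.

```python
def swap_free_percentage(meminfo_text):
    """Return the percentage of free swap, or 0 if no swap is configured.

    Hypothetical helper: expects text in the format of /proc/meminfo,
    e.g. "SwapTotal:  16768892 kB".
    """
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])

    swaptotal = values.get("SwapTotal", 0)
    swapfree = values.get("SwapFree", 0)
    if swaptotal == 0:
        # Same fallback as the quick fix: avoid the division by zero.
        return 0
    return round(float(swapfree) / swaptotal * 100)
```

Like the quick fix, this reports 0 when no swap is present, so a check that alerts on low free swap may still need its thresholds adjusted for swapless hosts.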

comment:2 Changed 4 years ago by matze

  • Resolution set to fixed
  • Status changed from new to closed

The host has been re-installed in the context of #2900, now seemingly recognizing the swap space without any issues.

comment:3 Changed 4 years ago by matze

  • Resolution fixed deleted
  • Status changed from closed to reopened
  • Summary changed from Swap on filter27 is not recognized by the system to Swap not recognized by the system

The issue has now recurred, this time on filter26. After some digging, it seems to be caused by swap partitions that retain signature fragments from former file systems:

$ sudo wipefs -n /dev/md/0
offset               type
----------------------------------------------------------------
0x410                minix   [filesystem]

0xff6                swap   [other]
                     UUID:  acca00bb-8f13-485b-a969-7f90e02a839b
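A check for such leftover signatures could be scripted roughly as below. This is a minimal sketch that assumes the `wipefs -n` output has the layout shown above; other util-linux versions print a different table, and the function names are hypothetical.

```python
import re


def parse_signatures(wipefs_output):
    """Extract (offset, type) pairs from wipefs output in the layout above."""
    signatures = []
    for line in wipefs_output.splitlines():
        # Signature rows start with a hexadecimal offset followed by the type.
        match = re.match(r"^(0x[0-9a-fA-F]+)\s+(\S+)", line)
        if match:
            signatures.append((int(match.group(1), 16), match.group(2)))
    return signatures


def stale_signatures(wipefs_output, expected="swap"):
    """Return all signatures whose type differs from the expected one."""
    return [sig for sig in parse_signatures(wipefs_output) if sig[1] != expected]
```

For the output above, `stale_signatures` would return the minix entry at offset 0x410, which is the one that should not be on a swap device.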

Interestingly, the UUID configured for md0 at boot time seems to be wrong as well, and md0 does not show up in /dev/disk/by-uuid at all:

$ cat /etc/initramfs-tools/conf.d/resume
RESUME=UUID=2c4f34e2-4803-4753-98ae-ac398223c2e2
$ ls /dev/disk/by-uuid -o
total 0
lrwxrwxrwx 1 root 9 Aug 11 15:58 016625b5-6607-4864-9d70-89d127fcbda6 -> ../../md1
lrwxrwxrwx 1 root 9 Aug 11 15:58 f45fd258-f047-45e1-bc7e-a77e68ea55d8 -> ../../md2

Note that manual activation via swapon(8) works without any issues, but it does not survive a reboot unless executed manually again. For now, however, I consider it a better quick fix than the Python patch above.

comment:4 Changed 4 years ago by matze

There seems to be one person who ran into the same issue. At least every symptom seems to match, although the answer in that thread did not seem to help. The solution proposed, however, is basically the same: wiping the device before setting it up again.

comment:5 Changed 4 years ago by matze

  • Resolution set to fixed
  • Status changed from reopened to closed

Disabling the quick-fix:

# swapoff -a

Cleaning the device:

# wipefs -a /dev/md0
2 bytes were erased at offset 0x410 (minix)
they were: 8f 13
10 bytes were erased at offset 0xff6 (swap)
they were: 53 57 41 50 53 50 41 43 45 32

Creating the swap file system:

# mkswap  /dev/md0
mkswap: /dev/md0: warning: don't erase bootbits sectors
        on whole disk. Use -f to force.
Setting up swapspace version 1, size = 16768892 KiB
no label, UUID=addc5b2c-dc9e-4232-aafd-b558db1bb734

Mounting everything according to /etc/fstab:

# mount -a

Verifying the results:

# cat /proc/swaps 
Filename				Type		Size	Used	Priority
/dev/md0                                partition	16768892	0	-1
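This verification could be automated, for instance as part of a monitoring check. The following is a minimal sketch that parses text in the `/proc/swaps` format shown above; the function name is hypothetical.

```python
def active_swap_devices(proc_swaps_text):
    """Map each active swap device to its size in KiB.

    Expects text in the format of /proc/swaps: one header line, then one
    whitespace-separated row per active swap area.
    """
    devices = {}
    for line in proc_swaps_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 5:
            devices[fields[0]] = int(fields[2])
    return devices
```

An empty result would mean that no swap is active, which is the state that originally triggered the ZeroDivisionError in check_memory.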