Opened 5 years ago

Closed 5 years ago

Last modified 8 months ago

#765 closed defect (fixed)

Monitoring / Nagios: Inconsistency with documentation

Reported by: matze Assignee: matze
Priority: P3 Milestone:
Module: Infrastructure Keywords:
Cc: Blocked By:
Blocking: #760 Platform: Unknown
Ready: yes Confidential: no
Tester: Verified working: no
Review URL(s):

http://codereview.adblockplus.org/5712550654115840/
http://codereview.adblockplus.org/5189767234846720/

Description

How to reproduce

  1. Clone the infrastructure repository
  2. Check the README.md file
  3. Search for keywords like "Monitoring" or "Nagios" or similar

Observed behaviour

The information about Monitoring and Nagios, especially how to use and test it in a development setup, as documented in infrastructure/README.md, is inconsistent with the actual state:

mhennig@kali:~/AdBlockPlus/infrastructure$ grep -A 20 Monitoring README.md 
Monitoring
----------

Monitoring is fully functional in the development environment:
[https://10.8.0.98/](https://10.8.0.98/)

User name and password are both _nagiosadmin_.

The monitoring service of our production environment runs on
_monitoring.adblockplus.org_. Add yourself to _files/nagios-htpasswd_
in the _private_ module used on the server, or have someone add you if
you don't have access.

The IP mentioned above cannot be found anywhere in the repository besides the README.md file:

mhennig@kali:~/AdBlockPlus/infrastructure$ grep -R -i '10\.8\.0\.98' *
./README.md:[https://10.8.0.98/](https://10.8.0.98/)

And in fact there's nothing behind that IP, regardless of which boxes are up and running.
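To confirm that an address is referenced only by the README, search the tree while excluding that file. The following is a minimal sketch, assuming GNU grep; the helper name `ip_refs_outside_readme` is made up for illustration, and the excluded `.hg` and `.git` directories are a guess at the repository metadata that may be present.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list files other than README.md that mention an IP.
# Assumes GNU grep (for -R, --exclude and --exclude-dir).
ip_refs_outside_readme() {
  local ip="$1" root="${2:-.}"
  # -F: treat the IP as a fixed string, so the dots are not regex wildcards.
  # -l: print only the names of matching files.
  grep -R -F -l \
    --exclude=README.md --exclude-dir=.hg --exclude-dir=.git \
    -- "$ip" "$root"
}
```

Run from the repository root as `ip_refs_outside_readme 10.8.0.98`; empty output together with exit status 1 means no file other than the README mentions the address.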

Expected behaviour

Get up-to-date information from the README.md.

Side notes

Instead of just fixing the IP info, one should also provide a bit more information in the README.md file, for example a short hint that "server4" is the (or one?) monitoring server.

Change History (9)

comment:1 Changed 5 years ago by matze

I discovered yet another issue that affects both commands: the initial vagrant up server4, when the node is first created, as well as e.g. vagrant provision server4 for updating it later on:

err: /Stage[main]/Nagios::Server/Service[nagios3]: Failed to call refresh: Could not start Service[nagios3]: Execution of '/etc/init.d/nagios3 start' returned 1:  at /tmp/vagrant-puppet-2/modules-0/nagios/manifests/server.pp:41

Seems to be a minor issue:

vagrant@server4:~$ sudo /etc/init.d/nagios3 start | grep -A 999 -i error; \
>    echo STATUS: $? .. PIPESTATUS: $PIPESTATUS

Error: Could not add object property in file '/etc/nagios3/conf.d/contactgroups.cfg' on line 5.
   Error processing object config files!


***> One or more problems was encountered while processing the config files...

     Check your configuration file(s) to ensure that they contain valid
     directives and data defintions.  If you are upgrading from a previous
     version of Nagios, you should be aware that some variables/definitions
     may have been removed or modified in this version.  Make sure to read
     the HTML documentation regarding the config files, as well as the
     'Whats New' section to find out what has changed.

 * errors in config!
   ...fail!

STATUS: 0 .. PIPESTATUS: 1

If it turns out not to be that easy, I'll open a new issue; otherwise I'll fix it in the context of this one.
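The STATUS line in the transcript above shows why `$?` alone is misleading after a pipeline: it reports the exit status of `grep` (0, because a match was found), whereas `$PIPESTATUS`, the first element of Bash's `PIPESTATUS` array, reports the status of the init script (1). A minimal sketch of the same effect, using a stand-in function `failing_service` (hypothetical) in place of the real init script:

```shell
#!/usr/bin/env bash
# Stand-in for '/etc/init.d/nagios3 start': prints an error and fails.
failing_service() {
  echo "Error: Could not add object property"
  return 1
}

failing_service | grep -i error
# PIPESTATUS must be copied immediately; the next command overwrites it.
statuses=("${PIPESTATUS[@]}")

echo "service: ${statuses[0]} .. grep: ${statuses[1]}"
```

The last line prints `service: 1 .. grep: 0`. With `set -o pipefail` enabled, Bash would instead make the pipeline's own exit status non-zero whenever any stage fails.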

comment:2 Changed 5 years ago by matze

  • Blocking 760 added

comment:3 Changed 5 years ago by trev

You are correct, server4 is the monitoring server and has the IP address 10.8.0.99. The idea is to eventually rename the remaining serverNN servers to follow the new naming scheme, but I couldn't find an existing issue for it.

comment:4 Changed 5 years ago by matze

Code reviews for this issue and the caveats discovered in between have been requested.

Unfortunately, it seems like there's yet another error occurring when the server4 box is provisioned:

2014/07/10 08:48:26 [error] 14086#0: *1 connect() to unix:/tmp/php-fastcgi.sock failed (111: Connection refused) while connecting to upstream, client: 10.8.0.1, server: monitoring.adblockplus.org, request: "GET / HTTP/1.1", upstream: "fastcgi://unix:/tmp/php-fastcgi.sock:", host: "10.8.0.99"

However, since that looks a bit more complicated than the other stuff above, I've created a new ticket (#766) rather than misusing this one here any further.
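A first step when looking at a "Connection refused" error on a Unix socket is to check what actually exists at the socket path on the box. The sketch below is only an illustration; the helper name `socket_state` is hypothetical, and a result of "socket" does not by itself prove that a FastCGI process is listening.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report what, if anything, exists at a socket path.
socket_state() {
  local path="$1"
  if [ ! -e "$path" ]; then
    echo "missing"
  elif [ -S "$path" ]; then
    # The file is a socket; this alone does not prove a process is listening.
    echo "socket"
  else
    echo "not-a-socket"
  fi
}

# Path taken from the nginx error message above.
socket_state /tmp/php-fastcgi.sock
```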

comment:5 Changed 5 years ago by matze

  • Blocking 760 removed

comment:6 Changed 5 years ago by trev

  • Priority changed from Unknown to P3
  • Ready set

comment:7 Changed 5 years ago by trev

  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:8 Changed 5 years ago by matze

  • Resolution set to fixed
  • Status changed from reviewing to closed

comment:9 Changed 5 years ago by trev

  • Blocking 760 added