Opened on 07/10/2014 at 04:38:25 AM

Closed on 07/11/2014 at 05:15:05 PM

Last modified on 03/24/2019 at 04:27:12 PM

#765 closed defect (fixed)

Monitoring / Nagios: Inconsistency with documentation

Reported by: matze
Assignee: matze
Priority: P3
Milestone:
Module: Infrastructure
Keywords:
Cc:
Blocked By:
Blocking: #760
Platform: Unknown
Ready: yes
Confidential: no
Tester:
Verified working: no
Review URL(s):

http://codereview.adblockplus.org/5712550654115840/
http://codereview.adblockplus.org/5189767234846720/

Description

How to reproduce

  1. Clone the infrastructure repository
  2. Check the README.md file
  3. Search for keywords like "Monitoring" or "Nagios" or similar

Observed behaviour

The information about Monitoring and Nagios, especially how to use and test it in a development setup, as documented in infrastructure/README.md, is inconsistent with the actual state:

mhennig@kali:~/AdBlockPlus/infrastructure$ grep -A 20 Monitoring README.md 
Monitoring
----------

Monitoring is fully functional in the development environment:
[https://10.8.0.98/](https://10.8.0.98/)

User name and password are both _nagiosadmin_.

The monitoring service of our production environment runs on
_monitoring.adblockplus.org_. Add yourself to _files/nagios-htpasswd_
in the _private_ module used on the server, or have someone add you if
you don't have access.

The IP mentioned above cannot be found anywhere in the repository besides the README.md file itself:

mhennig@kali:~/AdBlockPlus/infrastructure$ grep -R -i '10\.8\.0\.98' *
./README.md:[https://10.8.0.98/](https://10.8.0.98/)

And in fact there's nothing behind that IP, regardless of which boxes are up and running.
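
The manual grep above could be generalized into a small check. The following is only a sketch under assumptions: the function name stale_readme_ips is hypothetical, GNU grep is assumed (for --exclude and --exclude-dir), and the excluded .hg/.git directories are a guess at the repository's VCS metadata. It lists every IPv4-looking string that appears in README.md but in no other file of the repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print IPv4-looking strings that occur in README.md
# but nowhere else in the repository given as the first argument.
stale_readme_ips() {
    local repo="$1" ip
    # -o prints only the matching part, one match per line; sort -u de-duplicates.
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$repo/README.md" | sort -u |
    while read -r ip; do
        # -r recurse, -q quiet, -F fixed string, -w whole-word match so that
        # 10.8.0.9 does not count as found inside 10.8.0.98.
        # Note: --exclude=README.md skips every file with that name,
        # including README.md files in subdirectories.
        if ! grep -rqFw --exclude=README.md --exclude-dir=.hg --exclude-dir=.git \
                -- "$ip" "$repo"; then
            echo "$ip"
        fi
    done
}
```

Called as stale_readme_ips . from the repository root, this would presumably have reported 10.8.0.98 for the state described in this ticket.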

Expected behaviour

Get up-to-date information from the README.md.

Side notes

Instead of just fixing the IP, the README.md file should also provide a bit more information, for example a short hint that "server4" is the (or one?) monitoring server.

Attachments (0)

Change History (9)

comment:1 Changed on 07/10/2014 at 05:05:32 AM by matze

I discovered yet another issue that affects both commands: the initial vagrant up server4, when the node is first created, as well as e.g. vagrant provision server4 for updating it later on:

err: /Stage[main]/Nagios::Server/Service[nagios3]: Failed to call refresh: Could not start Service[nagios3]: Execution of '/etc/init.d/nagios3 start' returned 1:  at /tmp/vagrant-puppet-2/modules-0/nagios/manifests/server.pp:41

Seems to be a minor issue:

vagrant@server4:~$ sudo /etc/init.d/nagios3 start | grep -A 999 -i error; \
>    echo STATUS: $? .. PIPESTATUS: $PIPESTATUS

Error: Could not add object property in file '/etc/nagios3/conf.d/contactgroups.cfg' on line 5.
   Error processing object config files!


***> One or more problems was encountered while processing the config files...

     Check your configuration file(s) to ensure that they contain valid
     directives and data defintions.  If you are upgrading from a previous
     version of Nagios, you should be aware that some variables/definitions
     may have been removed or modified in this version.  Make sure to read
     the HTML documentation regarding the config files, as well as the
     'Whats New' section to find out what has changed.

 * errors in config!
   ...fail!

STATUS: 0 .. PIPESTATUS: 1
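
For reference, the status line above reads as follows: $? holds the exit status of the last command in the pipeline (grep, which found a match and returned 0), while $PIPESTATUS without an index expands to the first element of that bash array, i.e. the exit status of the init script (1). A minimal bash sketch of this behaviour, where failing_start is a hypothetical stand-in for /etc/init.d/nagios3 start:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the init script: prints an error and fails.
failing_start() {
    echo "Error: errors in config!"
    return 1
}

# Run the pipeline and report both statuses. Both values are expanded in
# the very next command, because any later command overwrites $? and
# PIPESTATUS.
report_statuses() {
    failing_start | grep -i error > /dev/null
    echo "STATUS: $? .. PIPESTATUS: ${PIPESTATUS[0]}"
}

report_statuses   # prints: STATUS: 0 .. PIPESTATUS: 1
```

This is why the pipeline as a whole looks successful even though the service failed to start; set -o pipefail would make $? reflect the failure as well.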

If it turns out not to be that easy to fix, I'll open a new issue; otherwise I'll fix it in the context of this one.

comment:2 Changed on 07/10/2014 at 05:07:27 AM by matze

  • Blocking 760 added

comment:3 Changed on 07/10/2014 at 05:53:55 AM by trev

You are correct, server4 is the monitoring server and has the IP address 10.8.0.99. The idea is to eventually rename the remaining serverNN servers to follow the new naming scheme, but I couldn't find an existing issue for that.

comment:4 Changed on 07/10/2014 at 09:39:03 AM by matze

Code-reviews for this issue and the caveats discovered in between have been requested:

Unfortunately, it seems like there's yet another error occurring when the server4 box is provisioned:

2014/07/10 08:48:26 [error] 14086#0: *1 connect() to unix:/tmp/php-fastcgi.sock failed (111: Connection refused) while connecting to upstream, client: 10.8.0.1, server: monitoring.adblockplus.org, request: "GET / HTTP/1.1", upstream: "fastcgi://unix:/tmp/php-fastcgi.sock:", host: "10.8.0.99"

However, since that looks a bit more complicated than the other stuff above, I've created a new ticket (#766) rather than misusing this one here any further.

comment:5 Changed on 07/10/2014 at 09:48:27 AM by matze

  • Blocking 760 removed

comment:6 Changed on 07/10/2014 at 10:07:44 AM by trev

  • Priority changed from Unknown to P3
  • Ready set

comment:7 Changed on 07/10/2014 at 10:11:34 AM by trev

  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:8 Changed on 07/11/2014 at 05:15:05 PM by matze

  • Resolution set to fixed
  • Status changed from reviewing to closed

comment:9 Changed on 07/14/2014 at 06:16:33 AM by trev

  • Blocking 760 added
