server


Recently I had to set up a system that would let me know whether my servers, switches, and printers are up and running. Any additional information I could get from the monitoring system would be a huge plus: status of services, hard drive usage, memory usage, uptime, CPU usage, printer status, and so on. If anything goes wrong, the system sends me an email so I can start troubleshooting shortly after the error or issue occurs.
As always, I looked for a system that was open source, community supported, and the best at what it is supposed to do.

I found NAGIOS.

If you are looking for something free, this is the best thing you can get, as far as I am aware at this moment.
You can install and configure many different plugins that allow NAGIOS to monitor many kinds of devices.

Once it is installed, you can easily configure monitoring of your servers, printers, and switches using the templates and examples that are included.

One of the problems I had to look into was monitoring CANON printers and modifying the Nagios config files properly for that.
Once you understand a little of how NAGIOS works, you will understand what I am trying to show in this example:

In /usr/local/nagios/etc/objects/commands.cfg I added a command:

define command{
          command_name check_snmp_canon
          command_line $USER1$/check_snmp -H $HOSTADDRESS$ -l STATUS -C public $ARG1$ $ARG2$
          }
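
To see how the macros in that command_line get filled in, here is a rough Python sketch of the substitution. It is only an illustration, not how Nagios is implemented: the function name expand_command, the plugin directory used for $USER1$, and the address 192.0.2.10 are all assumptions made up for the example. The argument string is the one passed by the service definition in printer.cfg.

```python
import re

def expand_command(command_line, check_command, macros):
    """Fill Nagios-style $MACRO$ placeholders (simplified illustration)."""
    # "name!arg1!arg2" -> command name plus positional $ARG1$, $ARG2$, ...
    name, *args = check_command.split("!")
    values = dict(macros)
    for i, arg in enumerate(args, start=1):
        values[f"ARG{i}"] = arg
    # Macros without a value (here $ARG2$) are replaced by an empty string
    expanded = re.sub(r"\$(\w+)\$", lambda m: values.get(m.group(1), ""), command_line)
    return name, expanded.strip()

# Assumed values: $USER1$ normally comes from resource.cfg, $HOSTADDRESS$ from the host
macros = {"USER1": "/usr/local/nagios/libexec", "HOSTADDRESS": "192.0.2.10"}
name, cmd = expand_command(
    "$USER1$/check_snmp -H $HOSTADDRESS$ -l STATUS -C public $ARG1$ $ARG2$",
    'check_snmp_canon!-o hrDeviceStatus.1 -r "2|3"',
    macros,
)
print(name)
print(cmd)
```

Because the service passes everything after a single "!", the whole option string lands in $ARG1$ and $ARG2$ stays empty.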

Then in /usr/local/nagios/etc/objects/printer.cfg you define the host and service like this:

define host{
          use          generic-printer
          host_name    SOMECANONPRINTER
          alias        SOMECANONPRINTER Alias
          address      "IP of the printer goes here"
          hostgroups   network-printers
          }
define service{
          use                   generic-service
          host_name             SOMECANONPRINTER
          service_description   Printer Status
          check_command         check_snmp_canon!-o hrDeviceStatus.1 -r "2|3"
          normal_check_interval 10
          retry_check_interval  1
          }
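
The -r "2|3" part is what decides the printer state: check_snmp reports OK when the regular expression matches the value returned for the OID, and CRITICAL otherwise. The small Python sketch below mimics that decision for the hrDeviceStatus values defined in HOST-RESOURCES-MIB. The function name printer_state is made up for the illustration, and Python's re module stands in for the plugin's own regex matching.

```python
import re

# hrDeviceStatus values as defined in HOST-RESOURCES-MIB
HR_DEVICE_STATUS = {1: "unknown", 2: "running", 3: "warning", 4: "testing", 5: "down"}

def printer_state(snmp_value, pattern=r"2|3"):
    """Return 'OK' if the pattern matches the SNMP value, else 'CRITICAL'."""
    # The pattern is not anchored, so it matches anywhere in the value
    return "OK" if re.search(pattern, str(snmp_value)) else "CRITICAL"

for code, label in HR_DEVICE_STATUS.items():
    print(f"{label}({code}) -> {printer_state(code)}")
```

So with this pattern a printer that is running or in a warning state counts as OK, and anything else (unknown, testing, down) raises a CRITICAL alert.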

That should work. Make sure you verify the configuration files the way Nagios suggests before you attempt to restart Nagios.
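
As a sketch of that verification step, the snippet below wraps the Nagios pre-flight check (nagios -v followed by the main config file) in a small Python helper. The paths assume the same /usr/local/nagios install prefix used in the file names above, and the helper names verify_command and config_is_valid are invented for this example. Adjust them to your own install.

```python
import subprocess

def verify_command(prefix="/usr/local/nagios"):
    """Build the pre-flight check command (paths are assumptions)."""
    return [f"{prefix}/bin/nagios", "-v", f"{prefix}/etc/nagios.cfg"]

def config_is_valid(prefix="/usr/local/nagios"):
    """Run the check and return True only if Nagios exits with status 0."""
    try:
        result = subprocess.run(verify_command(prefix), capture_output=True, text=True)
    except FileNotFoundError:
        # Nagios binary not found under the assumed prefix
        return False
    print(result.stdout)  # shows any warnings and errors Nagios reports
    return result.returncode == 0
```

Call config_is_valid() after editing commands.cfg or printer.cfg, and restart Nagios only when it returns True.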

Right now I am in the process of configuring a plugin that will let me use the data that Dell OpenManage gathers on a server, so that once something is wrong with the hardware, Nagios will notify me. If you are interested in looking into it, take a look at this link.

PS:
There is one cool product out there: Microsoft System Center Operations Manager. Once I get my hands on it and have a chance to play with the system, I will probably write a few good words about it.
