The products Nagios was used to monitor were:
- Brightmail back-end 24/7 operation
- Brightmail heuristics engine and development infrastructure
The ability to write your own plugins is the most valuable feature, but it's also nice that you get rudimentary monitoring, primarily for Linux systems, right out of the box.
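For readers who haven't written one, a plugin is just a program that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). Below is a minimal sketch in Python of a hypothetical disk-usage check. The function names, the default path, and the 80/90 thresholds are illustrative assumptions, not taken from any setup described in this review.

```python
#!/usr/bin/env python3
"""Hypothetical Nagios-style plugin: report disk usage for one path."""
import shutil
import sys

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(percent_used, warn, crit):
    """Map a usage percentage onto a plugin exit code."""
    if percent_used >= crit:
        return CRITICAL
    if percent_used >= warn:
        return WARNING
    return OK


def format_line(code, percent_used, warn, crit):
    """Build the single status line; text after '|' is performance data."""
    return (f"DISK {LABELS[code]} - {percent_used:.1f}% used"
            f" | used_pct={percent_used:.1f}%;{warn};{crit}")


def main(path="/", warn=80.0, crit=90.0):
    # Assumed defaults; a real check would take these from the command line.
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - cannot read {path}: {exc}")
        return UNKNOWN
    percent_used = 100.0 * usage.used / usage.total
    code = classify(percent_used, warn, crit)
    print(format_line(code, percent_used, warn, crit))
    return code


if __name__ == "__main__":
    sys.exit(main())
```

Nagios only looks at the exit code and the printed line, so the same pattern works for any check you can script.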
Improvements to My Organization
We spent less time verifying that everything was up and running, as Nagios did that for us, leaving us time for other things. When something did break, Nagios directed us quickly to the cause. With Nagios enabled, disruptions were less frequent and were attended to more quickly.
Room for Improvement
Scaling Nagios to cover multiple regions or data centers is challenging. It requires another tool, which I never incorporated. Because of this gap, I used a dedicated Nagios server within each specific operation.
Use of Solution
I've used Nagios since 2000. Before that it was called BigBrother and NetSaint, which I hadn't used.
Implementation on Windows was painful. The use of NRPE can also be problematic, as it's generally not built into OSes.
We have had no issues with the stability.
There have been no issues scaling it.
Customer Service and Technical Support
I didn't use customer service and technical support for Nagios. Everything I did I learned online through the extended community.
Prior to using Nagios for monitoring, we had grown our own monitoring solution, which later became the company NOCpulse and was picked up by Red Hat. Before that, I used various other homegrown monitoring methods.
The initial setup is straightforward, but you need to have a good grasp of the underlying file structure. All the pieces are there, but without this understanding, where to put things is not entirely intuitive.
I always installed Nagios by myself. I never used a team. My advice is that you need management buy-in. More than a few times I implemented this solution, but without management's support it got little traction up front. Meanwhile, management pursued pricey solutions that were cumbersome and had long implementation cycles.
Pricing, Setup Cost and Licensing
I can implement Nagios in a day for a medium-sized (500 units) operation. Since the cost is zero and it can run on a fairly cheap server, the ROI is nearly immediate.
Other Solutions Considered
Currently I use New Relic and Munin to track and maintain the operation I run. New Relic, however, isn't designed to alert like Nagios, and they have told me as much.
Find a site where Nagios is working and look at their implementation. Understand the file structure, dependencies, and implementation. Start with the basics and add as needed. Identify what needs to be monitored and why.
Also, Nagios does not do trending/graphing well. I used the Nagios version of MRTG and it was pretty awful. I incorporated Cacti, which partners well with Nagios. Without both tools, you don't have a good understanding of how your operation is functioning.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Apr 30 2016