Nagios Core Overview

Nagios Core is the #16 ranked solution in our list of Infrastructure Monitoring tools. It is most often compared to Zabbix.

What is Nagios Core?

Nagios Core is the industry-standard, open-source core of IT infrastructure monitoring. It is free and comes without professional support services.

Buyer's Guide

Download the Network Monitoring Software Buyer's Guide including reviews and more. Updated: October 2021

Nagios Core Customers
Airbnb, Cisco, PayPal, FanDuel
Archived Nagios Core Reviews (more than two years old)

Manoj Nair
Tech Specialist at Select Softwares
Real User
Top 5 Leaderboard
Alerts to network element errors, but the core version is no match for the XI version

Pros and Cons

  • "The most valuable feature is the performance parameters of the system."
  • "The core version is no match for the XI version."

What is our primary use case?

The primary use for this solution is basic network monitoring of an MPLS network.

How has it helped my organization?

Nagios Core informs me when my network elements are misfiring.

What is most valuable?

The most valuable feature is the performance parameters of the system. Nagios Core will continuously monitor the endpoint for the most common parameters in an obsessive mode, which gives you better insight into the endpoint's operating conditions.

Please bear in mind that the Core edition is very limited in capabilities and features, unlike the Enterprise XI edition. In spite of this limitation, it is a fantastic product to use at zero cost.

I deployed Nagios Core to monitor my MPLS network endpoints, such as routers and switches, as well as my firewalls and printers.

It does a very good job, I must say.

An analogous product, if I may refer to one, is Spiceworks, a free tool for IT inventory that does an excellent job of monitoring inventory and reporting on the software and so on installed on the endpoint.

What needs improvement?

The core version is no match for the XI version. The OEM should consider introducing some of the features of the XI version in the core version so that potential customers are actually compelled to consider upgrading to the enterprise version.

For how long have I used the solution?

I have been using this solution for eight years.

What do I think about the stability of the solution?

Supremely stable if you right-size the system and maintain it periodically.

What do I think about the scalability of the solution?

My old instance still runs on an ESXi VM with zero support on the hypervisor or the Nagios system. It has just 2 GB of vRAM, a single vCPU, and about 40 GB of storage. It handles about 70-odd hosts and runs just fine. Apart from log file maintenance, it's easy.

How are customer service and technical support?

No vendor involvement.

Which solution did I use previously and why did I switch?

I tried many other similar products, all demo versions, so I did not take them to production.

How was the initial setup?

It is pretty straightforward to set up and configure a basic system.

What about the implementation team?

I did it myself, so no vendor was involved.

What was our ROI?

I can't say, because my only investment was personal effort and no money, so it's 100% ROI.

What's my experience with pricing, setup cost, and licensing?

The core edition costs nothing, but you need to know how to configure Linux-based applications in general. In this case, the installation and configuration guides make things a lot easier. With a bit of patience and clear thinking, you may end up doing a lot more in your setup than you expected!

Which other solutions did I evaluate?

I tried many other similar products, all demo versions, so I did not take them to production.

What other advice do I have?

Definitely try it out if you have zero budget.

Even if you don't have a budget restriction, please do give it a try.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
SS
Computer Engineer at a tech services company with 501-1,000 employees
Real User
A feature-rich solution with valuable plugins and automatically escalating alerts

Pros and Cons

  • "I like the way the solution sends alerts and how it keeps on escalating them."
  • "I would like to see more training videos."

What is our primary use case?

We use the solution to monitor our IT infrastructure, like servers, the network, and things like that.

What is most valuable?

I like the way the solution sends alerts and how it keeps on escalating them. I also find the plugins, by which you can easily add the divisions, valuable.

What needs improvement?

I am satisfied, but I think there is a little bit of improvement that can be made.

Lessening the price point would be an improvement.

I would like to see more training videos. It is a vast product and it covers so many areas and so many kinds of devices, so I do understand that it's a challenge when you want some kind of integration, or add a plugin, to always have documentation. But, yeah, as much as possible on the documentation, if it can be done better, that would be good.

If there was more application monitoring, it would be much better.

For how long have I used the solution?

I've been using the solution for one year.

What do I think about the stability of the solution?

I find the solution stable. I don't have any complaints in regards to the instability of the product. We have used this product now for quite some time and we are happy with it.

How are customer service and technical support?

For technical support, I think I would try to rate it somewhere around seven out of ten.

Which solution did I use previously and why did I switch?

No, this is the first one that we started using. There was nothing that we have to complain about here from the past experience.

How was the initial setup?

The initial setup was actually done by one of the vendors so we were one of the partners who collaborated with this for the installation. We were working together to do the installation and then they handed it over to us and then we took it over from there.

What other advice do I have?

In terms of advice, I'd say that you need to know what the plan is and try to understand from which direction you are going to monitor. And, to understand what additional things you'll probably want to do from your side, like putting in scripts and other kinds of automation. So the planning is everything. If there is a particular tool you want to integrate with those things have to be properly planned beforehand.

With the number of features that it has and the ease of integration, I would rate the solution somewhere close to nine out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
AA
Software Engineer at a transportation company with 10,001+ employees
Real User
Improves memory and disc space usage, but is not user friendly

Pros and Cons

  • "Nagios monitors our servers, so we know if anything goes wrong and can solve the problem before it happens."
  • "It's not that easy to install the product itself. Also, the UI is a bit hard for regular users to navigate through."

What is our primary use case?

We used Nagios Core to monitor our servers in other countries. Our main server is in Cairo, while we monitor other servers in Germany, which are hosting Jenkins and other web services to make sure that the infrastructure is stable and if anything goes wrong it reports it automatically.

How has it helped my organization?

Before using this solution, Jenkins sometimes went down and we didn't know the reason. We eventually discovered that the issue was disk space usage exceeding a certain percentage. Now that we have Nagios to monitor the servers, we know if anything goes wrong and can solve the problem before it happens.
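
A check of the kind described here (disk usage crossing a percentage) can be sketched as a small Python function that follows the usual Nagios plugin convention of an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one status line with optional performance data after a pipe. This is only an illustrative sketch, not the reviewer's setup: the function names and the 80/90 percent thresholds are assumptions.

```python
import shutil

# Exit codes that Nagios plugins conventionally return.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(used_pct, warn=80.0, crit=90.0):
    """Map a usage percentage to an (exit_code, label) pair.

    The default thresholds are illustrative assumptions.
    """
    if used_pct >= crit:
        return CRITICAL, "CRITICAL"
    if used_pct >= warn:
        return WARNING, "WARNING"
    return OK, "OK"


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - cannot read {path}: {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero capacity"
    used_pct = 100.0 * usage.used / usage.total
    code, label = classify(used_pct, warn, crit)
    # Text after the pipe is performance data: label=value;warn;crit
    line = (
        f"DISK {label} - {used_pct:.1f}% used on {path}"
        f" | used_pct={used_pct:.1f}%;{warn};{crit}"
    )
    return code, line
```

To use it as an actual plugin, a wrapper script would print the status line and exit with the returned code, for example by calling `sys.exit(code)` after `print(line)`.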

What is most valuable?

The most valuable features to us are the ability to improve memory usage, disc space usage, and the PDU load of each node.

What needs improvement?

It's not that easy to install the product itself. Also, the UI is a bit hard for regular users to navigate through. In addition, I would appreciate an SMTP server for sending emails, which now depends on the existing servers for Nagios Core. If it came with its own SMTP server, it would be much better. Also, if it could be installed on other operating systems, that would be awesome, but right now it only runs on Linux.

Alias excavation and configurations from the wall rather than the server itself would be great improvements. Also, general UI enhancements and better UX, user experience.

For how long have I used the solution?

We've been using Nagios Core for four months.

What do I think about the stability of the solution?

It's stable because it's a Linux based code, which is very basic. It doesn't have many big features, so it's stable. You can add a node in less than half an hour, I think.

What do I think about the scalability of the solution?

We're only currently using Nagios Core on one to ten servers. In the future, we may add more nodes.

How are customer service and technical support?

I haven't tried to contact support. I was searching on the support forums, but that was not for me. I tried many solutions from the support forums. One of them is working, but only after a long time.

How was the initial setup?

The initial setup was complex, mainly because it was in Linux and had many packages that we're not used to. I had to install them one by one on the app to configure the complication on the app that was solved to authenticate Nagios on the central app. It comes with regular users in files and in order to authenticate, you have to make a lot of confirmations, using Apache as well as Nagios. This was all very hard, and it took me a week to configure it.

I think deployment took about two weeks at the most. We did the deployment by ourselves. We have two people for deployment and maintenance.

What's my experience with pricing, setup cost, and licensing?

Nagios Core is free to use.

What other advice do I have?

I would rate Nagios Core as seven out of ten because it was hard to configure and the implementation process itself took about two weeks. Also, the UI is not friendly. Other products have features that aren't included in Nagios Core. I think that one was the easiest to restore. Also, Nagios supports only Linux, not A/UX. It can't be installed on the servers. If they supported all of these things, it would be much better.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
DS
Corporate Infrastructure Manager with 1,001-5,000 employees
Real User
This is the open source product, so it's a toolkit rather than a complete solution

What is our primary use case?

Monitoring the critical services and network environment for a large multi-site company.  It is also used for troubleshooting issues and capacity planning.

How has it helped my organization?

Nagios allows IT staff and end-users to see the status of critical services on the network. It also can alert and notify selected users if critical services fail, reducing the mean time to recover.

What is most valuable?

Availability of additional plugins like SNMP for instant alerts and PNP4Nagios for graphs make this a powerful solution.

What needs improvement?

This is the open source product, so it's a toolkit rather than a complete solution. See Nagios XI for a more complete version. 

For how long have I used the solution?

More than five years.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Senior Systems Architect at Rezgateway
Real User
It prevents disasters long before they can take place

Pros and Cons

  • "It has made the life of the network operations staff more proactive in managing the resources of the infrastructure. It prevents disasters long before they can take place."
  • "It is a bit slow due to latency."

What is our primary use case?

We use Nagios to monitor hundreds of CentOS cloud servers (and a few Legacy Windows servers). Nagios is monitoring well over 5000 service endpoints. Some plugins were handwritten in PHP, Perl, Python, Java and Bash.

How has it helped my organization?

It has made the life of the network operations staff more proactive in managing the resources of the infrastructure. It prevents disasters long before they can take place.

What is most valuable?

  • Historical Alert records/data
  • Plugins
  • Data sources (MySQL)
  • Grouping of services and servers

We use the Alerting and Graphing to minimize the downtime. The old RRD Graph module is now used by Grafana. We outgrew the old PNP4Nagios a few weeks back. 

What needs improvement?

The GUI of the Core is still a long way off, but the features are 100 percent above average. It would be great to see better UI themes which could be configured by Netadmin or instructions that help combine graphs and Nagios.

For how long have I used the solution?

More than five years.

What do I think about the stability of the solution?

I have never had stability issues. Nagios has been stable for over 10 years. That said, we never left it running for more than two weeks without uploading new services, plugins, and threshold changes, and then restarting it.

What do I think about the scalability of the solution?

No scalability issues, though it is a bit slow due to latency. However, after tweaking the Nagios and off-loading the graphing to NPCD, I was able to scale the Nagios to more than 5000 services checks with 0.5s latency.

How are customer service and technical support?

The end-users love quick alerting and Grafana dashboards.

Which solution did I use previously and why did I switch?

We did not previously use a different solution. Nagios was the first solution that we started using 10 years ago.

How was the initial setup?

Since it's Nagios, it is a bit time consuming, but worth the effort. It took a few hours to set up the entire environment, including RRD, PHP, Apache, Nagios, PNP4Nagios, Perl, Python, OpenSSL, etc.

What about the implementation team?

We did an in-house installation.

What was our ROI?

We have saved a lot of time, money, and effort by reducing disaster times, thanks to Nagios's quick alerting.

What's my experience with pricing, setup cost, and licensing?

Nagios Core (PNP4Nagios + Core) is free and can be set up by Netadmin within a few hours. The only additional cost is the cloud server.

Which other solutions did I evaluate?

10 years ago, there were not too many options.

What other advice do I have?

There are thousands upon thousands of plugins. This is a winning product. Nothing can match the plugins; I have even contributed about six plugins myself.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user873816
Strategic Staffing Solutions at a retailer with 1,001-5,000 employees
Real User
Monitors our entire production environment, alerts us to any issues that may occur

Pros and Cons

  • "Key features include the GUI interface, its notification capabilities, and the real-time reporting."
  • "Making it a little easier to configure and set up from the start would help. There are multiple layers that you have to wade through to be able to set it up, to do it the right way, and to get it to do what you want it to do."

What is our primary use case?

It's monitoring all of our production environment and alerting us to any issues that might pop up.

How has it helped my organization?

The benefits are that it's free and it allows us to monitor all of our production. So it gives us a comfort level of knowing that if there is a problem that pops up, we get notified.

What is most valuable?

  • GUI interface
  • Notification capabilities
  • Real-time reporting

What needs improvement?

In terms of any further features, that would bump us into their paid product. For what we get and what we use, and all the libraries that are available, it's pretty robust.

However, for the version we're using, making it a little easier to configure and set up from the start would help. There are multiple layers that you have to wade through to be able to set it up, to do it the right way, and to get it to do what you want it to do.

For how long have I used the solution?

Three to five years.

What do I think about the stability of the solution?

It has been rock solid.

What do I think about the scalability of the solution?

So far, it has met our needs. As far as I know, from reading blogs and the like, it can go to many more servers and even multiple data centers. It seems like it's pretty scalable.

How are customer service and technical support?

Tech support has been pretty good. There have been a couple of occasions where we've had to pick up the phone and call, and for the most part, they are very prompt, very quick, very responsive.

Which solution did I use previously and why did I switch?

I had used this solution before and there wasn't anything in place here that was any good, so it was a no-brainer for me.

My most important criteria when selecting a vendor are 

  • flexibility
  • supportability
  • scalability.

How was the initial setup?

I set the whole thing up. It wasn't complex, it was just that I had to do a lot of planning. If you follow your plan then you won't end up in trouble. If you deviate from the plan, you are going to have trouble.

What other advice do I have?

We have the ability right now to see and create reports to tell whether or not we're meeting our SLAs on our production servers, through it. That is something that we wrote and implemented as a plug-in.

I would rate this solution a nine out of 10 because it's relatively easy to implement and the cost is great, it's free.

My advice would be, save yourself a lot of time - go get it and install it.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Network and System Engineer at a tech services company with 1,001-5,000 employees
Consultant
The poller is really good, I can easily implement new stuff and it is scalable.

What is most valuable?

The poller is really good, I can easily implement new stuff and it is scalable.

How has it helped my organization?

Monitoring is the most important thing to avoid any production issue. It's important to get some alerts on the server and network devices. It's the day-to-day management to avoid any production issues.

For how long have I used the solution?

I have used Nagios with the Adagios interface for two years.

What was my experience with deployment of the solution?

I have not yet encountered any issues with deployment, stability or scalability.

Which solution did I use previously and why did I switch?

I used Centreon/Nagios for eight years before I chose Nagios with the Adagios interface to simplify day-to-day configuration.

What about the implementation team?

Implementation requires knowledge on the production.

What's my experience with pricing, setup cost, and licensing?

There is no license cost, just a cost in time.

What other advice do I have?

There is a large Nagios community for new sensors, etc.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Nishith Vyas
Sr. System Administrator at Guj Info Petro Limited
Real User
Top 5
Its code is lightweight and it has easy-to-manage plug-ins.

Valuable Features:

The most valuable features of this product are the very lightweight code and easy-to-manage plug-ins.

Apart from the main Nagios core engine, one can add several APIs and add-ons to make the Nagios engine stronger without compromising its performance.

Each config parameter can be easily tuned as per individual's need. Many proven frontends are available to get performance related output from the Nagios engine.

Also, Nagios can work without any database back end. It generates a single file for each day and maintains it in a separate directory until a Linux System Admin removes it manually. In Nagios terms, it is called archiving. When someone wants an availability report for a particular server for the last year, Nagios simply fetches all of the relevant files and outputs the data within the shortest period of time. There is no need to query any database to get historical data, which puts extra burden on CPU and memory.
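
To make the file-based approach concrete, here is a rough Python sketch of how an availability figure could be derived from archived log lines alone. It is not how Nagios itself computes its reports. The line shape `[epoch] HOST ALERT: host;STATE;STATE_TYPE;attempt;output` is an assumption based on typical Nagios logs, the function names are hypothetical, and the sample lines are invented.

```python
import re

# Assumed shape of a host alert line in a Nagios log file:
# [epoch] HOST ALERT: host;STATE;STATE_TYPE;attempt;plugin output
HOST_ALERT = re.compile(
    r"^\[(\d+)\] HOST ALERT: ([^;]+);(UP|DOWN|UNREACHABLE);(HARD|SOFT);\d+;"
)


def host_downtime(lines, host, start, end):
    """Seconds `host` spent in a hard non-UP state within [start, end).

    Assumes the host was UP at `start` and that `lines` are in time order.
    """
    down_since = None
    total = 0
    for line in lines:
        m = HOST_ALERT.match(line)
        if not m or m.group(2) != host or m.group(4) != "HARD":
            continue  # ignore other hosts, soft states, and unrelated lines
        ts = min(max(int(m.group(1)), start), end)  # clamp to the window
        if m.group(3) == "UP":
            if down_since is not None:
                total += ts - down_since
                down_since = None
        elif down_since is None:
            down_since = ts
    if down_since is not None:
        total += end - down_since  # still down when the window closes
    return total


def availability_pct(lines, host, start, end):
    """Availability of `host` over the window, as a percentage."""
    return 100.0 * (1 - host_downtime(lines, host, start, end) / (end - start))


# Invented sample lines, for illustration only.
sample = [
    "[1000] HOST ALERT: web01;DOWN;SOFT;1;CRITICAL - timeout",
    "[1060] HOST ALERT: web01;DOWN;HARD;3;CRITICAL - timeout",
    "[1360] HOST ALERT: web01;UP;HARD;1;PING OK",
    "[1500] HOST ALERT: db01;DOWN;HARD;3;CRITICAL - timeout",
]
```

In practice, `lines` would come from reading each archived daily file that overlaps the reporting window, in chronological order. The location and naming of those files depend on the installation.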

Improvements to My Organization:

Using Nagios, I'm managing more than 1000 services, which involves the following operating nodes:

  • IBM AIX
  • Red Hat Enterprise Linux
  • HP-Unix
  • Windows enterprise-grade OS
  • Cisco router/switches
  • FortiGate and WatchGuard firewalls
  • APC UPS systems
  • Many more...

The majority of the above nodes support SNMP v1/2, through which one needs to tune the monitoring plug-in as needed.

Room for Improvement:

Considering my utilization of Nagios on a daily basis, it would be really great if Nagios can concentrate on the following areas of improvement:

  • Custom availability report and export as PDF
  • Nagios SLA. I'm currently working on Nagios Digger, which has many code-level problems. In my present configuration, I've observed PHP level coding issues. I'm able to fetch all Nagios data into the Nagios Digger database (mariadb in rhel7) successfully, but found difficulties fetching and replicating it into the PHP front end. I've already contacted its author and coordination is in progress to make it available for the community.
  • SMS tool integration with Nagios

Use of Solution:

I have been using Nagios for more than five years.

Stability Issues:

In Nagios Core, I haven't had even minor problems in terms of stability. If any did arise, I never knew about it!

Customer Service:

I require less customer service because I am using an open source product. But, sincere thanks to the Nagios community for providing excellent and prompt support as and when required.

Initial Setup:

Initial setup was very straightforward. Just check the official Nagios website (www.nagios.org) for installation instructions.

Implementation Team:

If a person has basic or in-depth knowledge of all the required network/server equipment, then he/she can easily implement Nagios.

Also, it is advisable to have proper knowledge of SNMP v1/2/3 with Nagios agents for Windows and Unix-like OS.

Other Solutions Considered:

I started learning and configuring our data centre monitoring system by using Zabbix and openNMS. But I finally selected Nagios due to its very large user community and maximum tunable parameters.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
LAN/Wlan Administrator at a construction company
Vendor
It allows you to write your own plugin if you have no alternatives and you need to have things under control.

What is most valuable?

Alerting and proactive monitoring are invaluable. It also allows you to write your own plugin if you have no alternatives and you need to have things under control.

How has it helped my organization?

It sends alerts to the right people which ensures there is no delay in the correct person or team looking into issues. It's helped to reduce downtime in the production environment to almost zero. Due to this, we now spend less time on network or server administration.

Sometimes we have had downtimes because of stupid problems such as a service suddenly stopping without any reason, a SQL Server database or its logs growing too big too fast, network device failures, etc. With Nagios, this has been reduced.
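
One of the failure modes mentioned above, a service that suddenly stops, is the sort of thing a home-grown plugin can watch for. The sketch below is a minimal, hypothetical example (not the reviewer's plugin): it tries to open a TCP connection and reports the result using the conventional Nagios plugin exit codes (0 for OK, 2 for CRITICAL). The function name and the five-second default timeout are assumptions.

```python
import socket
import time

# Conventional Nagios plugin exit codes used by this sketch.
OK, CRITICAL = 0, 2


def check_tcp(host, port, timeout=5.0):
    """Return (exit_code, status_line) after trying to open a TCP connection."""
    started = time.monotonic()
    try:
        # create_connection raises OSError (refused, timed out, DNS failure).
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - started
    except OSError as exc:
        return CRITICAL, f"TCP CRITICAL - {host}:{port} not reachable: {exc}"
    # Text after the pipe is performance data that graphing add-ons can read.
    return OK, (
        f"TCP OK - {host}:{port} answered in {elapsed:.3f}s"
        f" | time={elapsed:.3f}s"
    )
```

A wrapper script would print the status line and exit with the returned code so that Nagios can pick up the state and route the alert.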

What needs improvement?

The configuration and reporting modules need to be improved. I'd like them to include a basic install package and, if you don't like the packages included, the ability to replace them with different ones from the Nagios plug-in site.

For how long have I used the solution?

I've been using it for six years, and we're currently planning to upgrade it.

What was my experience with deployment of the solution?

We have had no issues with the deployment.

What do I think about the stability of the solution?

I'm running Nagios with CentOS 5 on one server with no problems. I'm going to update the server because of plugin requirements.

What do I think about the scalability of the solution?

I've not had to scale it.

Which solution did I use previously and why did I switch?

This was our first monitoring solution. We were looking for a way to monitor IT structure as some years ago we began deploying a few servers and switches. Over the years, computers have spread everywhere into our offices and factory, so we needed a way to check for systems/network availability 24x7. Nagios Core was a good solution to start with.

How was the initial setup?

I read the manual, set up a test server, and performed some tests. After our initial setup, I added more servers and any IP device on our network that had SNMP support (switches, sensors, printers and so on). It was my first time working with a Unix environment, and it didn't take much time to set it up.

What about the implementation team?

I did it myself.

What's my experience with pricing, setup cost, and licensing?

There are no licensing costs. I feel that a management board does not see monitoring as essential software, at least until bad things happen; then it becomes invaluable.

What other advice do I have?

Take your time to understand how it works. Then start monitoring a small number of assets in your department and then add some device/server every day. It takes some time to tune all your checks. Once done you'll have almost everything under control. I even managed to adapt some plugins to suit my needs.

Nagios is a good choice for network monitoring. It's up to you to decide if you need the assistance of skilled people or want to try it by yourself. I was curious about Linux, and Nagios was a good reason to start working with it.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
BalajiDorairajan
Manager - Service Management ( Event & Capacity ) with 1,001-5,000 employees
Real User
It is stable and fit for purpose. Setup is a bit complicated due to the large set of libraries it needs.

Valuable Features:

It is stable and fit for purpose. Various plugins are available as per your need in the open source marketplace to use and customize according to the need.

Improvements to My Organization:

We use other enterprise products for most monitoring activities. However, Nagios has been the product to go to if we need a cost effective solution that can directly fit our needs. We use it in our business operations center to view dashboards (i.e. it provides a Google map view) of critical systems within stores.

Room for Improvement:

Setup is a bit complicated due to the large set of libraries it needs, but this may be because it's open source.

Use of Solution:

We use only Nagios Core.

Deployment Issues:

We have had no issues with the deployment.

Stability Issues:

Stability wise it just works without any major maintenance.

Scalability Issues:

It's highly scalable and will scale according to your needs.

Implementation Team:

It is an open source tool which provides capability to customize it according to the needs. So internally you need to have expertise to consume its servers unless you go for the paid options.

Cost and Licensing Advice:

You should be sure that you have the expertise to customize it, and I would recommend the paid-for Nagios XI for the additional support.

Other Advice:

This is a fit for purpose product which means that if you have a definite list of requirements and are not willing, or unable, to spend money on big enterprise tools, then Nagios is a tool to go to. Also, any changes to the customization means that you need to have the skill sets internally within the organisation to effectively use it. Otherwise it's a great open source product.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
System Administrator at a tech services company with 51-200 employees
Consultant
We like the automatic alerting functions. Also, there are a lot of free monitoring modules available.

What is most valuable?

We like the automatic alerting functions. Also, there are a lot of free monitoring modules available for any purpose you may need.

How has it helped my organization?

We've got a medium size distributed system with a lot of locally installed machines, and Nagios provides us with a solid and reliable monitoring solution for these machines.

What needs improvement?

They should simplify the features so it becomes easier to setup out of the box.

For how long have I used the solution?

I've been using it for about five years.

What do I think about the stability of the solution?

There have been no performance issues.

What do I think about the scalability of the solution?

It's been able to scale for our needs.

Which solution did I use previously and why did I switch?

Nagios was our first solution. At the time, we read that Nagios was the industry standard solution, so we chose it.

How was the initial setup?

It is complex software with many, many features, so the setup is not an easy task.

What about the implementation team?

We did it in-house.

What's my experience with pricing, setup cost, and licensing?

We are using the free version.

What other advice do I have?

The free version is good enough for most people, but it is somewhat hard to make it a working solution.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Site Reliability Engineer at a tech company with 10,001+ employees
Real User
The integrations with other tools have improved our monitoring.

What is most valuable?

Gathering data from various machines easily without worrying about the underlying OS or technology.

How has it helped my organization?

We can get real time statistics of our servers which improves our monitoring. The integration of Nagios with other tools makes our monitoring way better than what we previously had in place.

What needs improvement?

It would be nice if it were hosted in the cloud. Also, they need to improve the graphs.

For how long have I used the solution?

I've been using it for four years.

What was my experience with deployment of the solution?

We have had no issues with the deployment.

What do I think about the stability of the solution?

There have been no performance issues.

What do I think about the scalability of the solution?

What is most valuable?

Gathering data from various machines easily without worrying about the underlying OS or technology.

How has it helped my organization?

We can get real-time statistics of our servers, which improves our monitoring. The integration of Nagios with other tools makes our monitoring way better than what we previously had in place.

What needs improvement?

It would be nice if it were hosted in the cloud. Also, they need to improve the graphs.

For how long have I used the solution?

I've been using it for four years.

What was my experience with deployment of the solution?

We have had no issues with the deployment.

What do I think about the stability of the solution?

There have been no performance issues.

What do I think about the scalability of the solution?

It's been able to scale for our needs.

How are customer service and technical support?

I haven't had the need to use technical support.

Which solution did I use previously and why did I switch?

This is our first infrastructure monitoring tool.

How was the initial setup?

It was straightforward.

What about the implementation team?

We did it in-house. There is no extra effort needed if you just go through the regular installation instructions.

What's my experience with pricing, setup cost, and licensing?

We are using the free version.

What other advice do I have?

You should go ahead and try it.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Graduate Linux System Administrator at a tech services company with 10,001+ employees
Consultant
It has helped the companies I've worked with to achieve an acceptable level of monitoring. It does not work in a distributed fashion.

Valuable Features:

Extensibility is good, as this makes it easy to write checks on your own. Also, it's lightweight: since v4 it isn't resource heavy.
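
To give a sense of how little a home-grown check needs, here is a minimal sketch in Python. It relies only on the standard Nagios plugin conventions: exit code 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN, one line of status text, and optional performance data after a `|`. The check itself (disk usage of one path), the function names, and the 80/90 percent thresholds are assumptions picked for the example, not something taken from the reviewer's setup.

```python
"""Minimal Nagios-style check for disk usage (illustrative sketch)."""
import shutil

# Exit codes defined by the Nagios plugin conventions.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_pct, warn=80.0, crit=90.0):
    """Map a usage percentage to (exit_code, status_line).

    The default thresholds are assumptions for this example.
    """
    if used_pct >= crit:
        code = CRITICAL
    elif used_pct >= warn:
        code = WARNING
    else:
        code = OK
    # Text before '|' is for humans; text after it is performance data.
    line = (
        f"DISK {LABELS[code]} - {used_pct:.1f}% used"
        f" | used_pct={used_pct:.1f}%;{warn:g};{crit:g}"
    )
    return code, line


def run(path="/", warn=80.0, crit=90.0):
    """Check one path, print the status line, and return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - cannot read {path}: {exc}")
        return UNKNOWN
    code, line = evaluate(100.0 * usage.used / usage.total, warn, crit)
    print(line)
    return code
```

Saved as an executable script that ends with `raise SystemExit(run())`, a file like this can be referenced from a command definition in the same way as any stock plugin.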

Improvements to My Organization:

No IT company would survive without a monitoring system. Nagios helped the companies I've worked with to achieve an acceptable level of monitoring.

Room for Improvement:

Nagios was not built with scalability in mind. It does not work in a distributed fashion, and fixing this issue would probably require rewriting a big chunk of its code. There are other solutions, born as a fork of Nagios, that do this but it would be great if Nagios could do it.

Use of Solution:

I've been using it for five years.

Deployment Issues:

We have had no issues with the deployment.

Stability Issues:

There have been no performance issues.

Scalability Issues:

It does not scale horizontally. We had to write our own web interface and puppetry to manage/view all the hosts managed by various independent Nagios hosts.

Implementation Team:

It's very easy to install Nagios, as its package is provided by many Linux distributions and there is plenty of documentation online.

Other Advice:

Don't use it. If Nagios is what you already have, you can try to keep using it. If you're starting from scratch, there are products that scale better and perform better, and they use the same plugin syntax, as they were initially Nagios forks.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
IT Coordinator at a tech services company
Consultant
I can see trends over time and it gives me perspective of what needs to be improved and we are able to work proactively as opposed to reactively.

What is most valuable?

Getting the alerts is the most valuable feature. This way I know when servers are acting up or just plain hosed. It also helps me to know which things need to be recovered and when, so I do not have to bother with checking into it immediately.

How has it helped my organization?

Before we implemented Nagios, we did not know which servers were up or down until a customer told us. Now, I can see trends over time and it gives me perspective of what needs to be improved and we are able to work proactively as opposed to reactively.

What needs improvement?

Generally, it does what I need it to do, but better error reporting would be great. It's so flexible that I do not use half the capability that it has. Also, Nagios 4 does not work with NConf or Adagios so we haven't upgraded yet.

For how long have I used the solution?

I have worked with it as a monitoring and alerting solution for 10 years across two jobs.

What was my experience with deployment of the solution?

We have had no issues with the deployment.

What do I think about the stability of the solution?

There have been no performance issues.

What do I think about the scalability of the solution?

We are monitoring under 200 devices and less than 1200 services so I do not need this availability yet.

How are customer service and technical support?

I've never needed to contact the vendor as I have always found my answers via the documentation and Google searches.

Which solution did I use previously and why did I switch?

I have used Zabbix and Big Brother, but neither was as workable as Nagios.

How was the initial setup?

Setup is not for the GUI lover as it requires you to perform a lot of CLI work.

What about the implementation team?

You do not need a vendor. I have always deployed it myself.

What's my experience with pricing, setup cost, and licensing?

It's free.

Which other solutions did I evaluate?

I have looked at other solutions but none are as simple, and I would hate to have to learn another system.

What other advice do I have?

It's well worth it to ensure your uptime and to catch the bigger issues.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Labs infrastructure & technology team leader at a comms service provider with self employed
Real User
It alerts us before our customer is even aware of an issue and we can always fix it before they notice it.

What is most valuable?

Generally, it's an open source software, so it's free, and despite this, it covers all infrastructure health issues.

How has it helped my organization?

This product covers all of our infrastructure health checks, and has triggers to alert us to any unusual behaviour of the monitored systems. It alerts us before our customer is even aware of an issue and we can always fix it before they notice it. We can easily generate any type of notification that we want, and integrate it with other tools using any REST API, which is simple to do.
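
As a rough illustration of that kind of integration, the sketch below shows a notification script that forwards an alert to a REST endpoint. It assumes the script is wired in through a Nagios command definition that passes standard macros as arguments (for example `$NOTIFICATIONTYPE$`, `$HOSTNAME$`, `$SERVICEDESC$`, `$SERVICESTATE$`, `$SERVICEOUTPUT$`). The endpoint URL and the JSON field names are hypothetical and would need to match whatever the receiving tool expects.

```python
"""Forward a Nagios notification to a REST endpoint (illustrative sketch)."""
import json
import urllib.request

# Hypothetical endpoint; replace with the API of the tool being integrated.
ALERT_URL = "https://tickets.example.com/api/alerts"


def build_payload(notification_type, host, service, state, output):
    """Collect the macro values passed by Nagios into one JSON-ready dict.

    The field names are assumptions made for this example.
    """
    return {
        "type": notification_type,  # e.g. PROBLEM or RECOVERY
        "host": host,
        "service": service,
        "state": state,             # e.g. WARNING or CRITICAL
        "output": output,           # first line of the plugin output
    }


def send(payload, url=ALERT_URL, timeout=10):
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A wrapper would call `send(build_payload(*sys.argv[1:6]))`, with the five arguments supplied by the command definition in the order listed above.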

What needs improvement?

I would like to see a much better Nagios Manager GUI that can support all types of configuration items and advanced search options. They should develop a way to avoid restarting the entire application upon making any change, enable parallel checks, and improve SNMP support for SNMP traps.

For how long have I used the solution?

I've used it for around 10 years.

What was my experience with deployment of the solution?

We had no deployment issues.

What do I think about the stability of the solution?

The product is quite stable, but it needs to be able to support larger numbers of host/service checks.

What do I think about the scalability of the solution?

There have been no issues having it monitor our entire infrastructure.

How are customer service and technical support?

Customer Service:

I'm not using any customer service.

Technical Support:

I'm not in need of any technical support.

Which solution did I use previously and why did I switch?

I started with this product.

How was the initial setup?

It was quite straightforward, but I would love to see an RPM package that includes all needed package dependencies.

What about the implementation team?

We used an in-house team. I would advise you to learn from users' past experience, as it always helps.

What other advice do I have?

Use the Nagios community, and go for the basic product. Design your system configuration before installing the product.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Senior DevOps Engineer at a tech services company with 1,001-5,000 employees
MSP
It's able to automatically monitor any new server added to the organization.

Valuable Features:

The Auto Inventory is valuable.

Improvements to My Organization:

Automatic monitoring of any new server added to the organization.

Room for Improvement:

A better UI for graphing would make it better. The present graphs are not very user-friendly or pleasant to look at.

Use of Solution:

I've been using it for two years.

Deployment Issues:

I needed to add a tweak to make the monitoring work. 

Stability Issues:

There have been no issues with the stability.

Scalability Issues:

We have had no scaling issues.

Initial Setup:

It was a little complex.

Implementation Team:

I implemented it all by myself. 

Cost and Licensing Advice:

It's free. I have not used the Enterprise version.

Other Solutions Considered:

I evaluated Zabbix, but found Nagios better for my needs due to its simplicity and one dashboard for all servers.

Other Advice:

It's an awesome product to use. 100% recommended for all organizations.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Client Engineer at a tech services company with 501-1,000 employees
Consultant
The most valuable feature is the notifications, which can be customized and even received via WhatsApp.

What is most valuable?

The most valuable feature is the notifications, which can be customized and even received via WhatsApp.

Another valuable feature is the reporting. As far as I know, there's no way to cheat on the reporting, that is, there's no way to go into the system to change the results. This makes the reporting feature very reliable. The reports are also very easy to understand, which is good when I present them to my boss.

Lastly, Nagios is not a resource hog. I can set it up on a busy server and it will still function reliably. This allows sysadmins to keep server maintenance costs low.

How has it helped my organization?

I can give an example. During a seasonal festival, visitors to our e-commerce site increased several-fold, and the log partition filled up within two days. If it wasn't for Nagios' alerts every minute until we acknowledged the problem, our website would have stopped working. (I can't remember why the logrotate didn't work, though.)

What needs improvement?

I would like to have the option to configure Nagios using the web interface. Although I agree that the CLI gives a lot of customization options, I'd like to take a break from looking at lines of words. Also, configuration via a web interface would open it up to not-so-Linux-literate users.

What was my experience with deployment of the solution?

There have been no issues with the deployment.

What do I think about the stability of the solution?

I did encounter stability issues when exploring plugins, but not with Nagios itself. Other than that, I never faced any issues on the production side.

What do I think about the scalability of the solution?

There have been no issues scaling it for our needs.

How are customer service and technical support?

Since Nagios is open source, I had to rely completely on forums and web articles. However, Nagios was set up before I joined the company, so my colleagues were able to give me ample support when trying to understand how it works.

Which solution did I use previously and why did I switch?

I never used a different solution because this current position is my first. Nagios was already set up before I joined the company. Nagios was already good enough for us so we didn't allocate time to research other products.

How was the initial setup?

The initial setup is easy if you just follow the basic guide. The complexity comes when you want to customize it to suit your environment. For example, different plugins require different configurations. There's also another challenge in that Nagios was originally designed to monitor Linux servers but has since expanded to Windows servers as well.

What about the implementation team?

It was all done by us. We were given time to do our own research and through regular testing, trials and errors, we finally implemented it. My advice is to not be scared by the need to configure everything through the CLI. It's actually quite fun and rewarding when you see your monitoring system finally up and you know you can count on it to give you a heads up on alerts before something nasty happens to your server.

What was our ROI?

Nagios is able to minimize server downtime and this in turn helps to generate more revenue.

What's my experience with pricing, setup cost, and licensing?

Nagios is open source, therefore there's no need for licensing.

What other advice do I have?

The product is robust and reliable. The notifications can be customized so that I can even configure it to send the notifications via WhatsApp! Last but not least, the reporting feature is very easy to understand, which is good when presenting to my boss.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
IT Administrator at a tech services company with 51-200 employees
Consultant
When any part of the system went down, it would inform us right away with alerts. In most cases, we were able to find the problem before the client did.

What is most valuable?

The most valuable feature of Nagios is its monitoring capability. Once you configure it correctly, it will help you monitor all your servers and services.

How has it helped my organization?

When working at ISPs, we used Nagios to monitor all our servers and network switches in the entire city. When any part of the system went down, Nagios would inform us right away with alerts. In most cases, we were able to find the problem before the client did.

What needs improvement?

We use the free version of Nagios, which needs some administrative skills in order to configure correctly. It would be great to see some of the paid features in the free version, such as web-based administration.

For how long have I used the solution?

I've used Nagios since 2008, and I'm really pleased with it. It helps me a lot with my system administrator work. I used it on my local servers initially, then I started to work at an ISP where I implemented Nagios. It's still in use there.

What was my experience with deployment of the solution?

I rarely upgrade Nagios as everything works fine. I've had no issues deploying it.

What do I think about the stability of the solution?

I've had no stability issues. It's been very stable.

What do I think about the scalability of the solution?

There have been no issues scaling it.

How are customer service and technical support?

I've never used tech support and I find all my answers on Google or forums.

Which solution did I use previously and why did I switch?

I tried Zabbix and OpenNMS but I didn't like them. I use Cacti and SmokePing for detailed graphs.

How was the initial setup?

A few years ago, the initial setup was complex, but now it's not. It just has some config files where you should add your host. Everything is written in the documentation.
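
For readers who have not seen those config files, the sketch below generates the kind of host definition the reviewer is describing. The directives (`use`, `host_name`, `alias`, `address`) are standard Nagios object directives. The helper name and the `linux-server` template are assumptions: a template of that name must already exist in your configuration, as it does in the sample configuration files. The resulting file also has to be picked up through a `cfg_file` or `cfg_dir` entry in `nagios.cfg`.

```python
"""Render a Nagios host definition as text (illustrative sketch)."""


def render_host(host_name, address, alias=None, template="linux-server"):
    """Return a 'define host' block for one machine.

    'template' must name a host template that already exists in the
    configuration; 'linux-server' is an assumption for this example.
    """
    lines = [
        "define host {",
        f"    use        {template}",
        f"    host_name  {host_name}",
        f"    alias      {alias or host_name}",
        f"    address    {address}",
        "}",
    ]
    return "\n".join(lines) + "\n"


# Example: print a definition for a hypothetical web server.
print(render_host("web01", "192.0.2.10", alias="Front-end web server"))
```

Generating definitions from an inventory this way is one common approach to keeping many hosts consistent without a configuration GUI.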

What about the implementation team?

I implemented it by myself.

What other advice do I have?

You should really try Nagios. It will help a lot and I have found that it is the best buddy for system admins.


Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sid Roy
Vice President - Operations & Client Support at Scicom Infrastructure Services
Real User
Top 5Leaderboard
The dashboarding and heads up display is practical and useful. Dashboards and HUDS could use a facelift to be more in line with next generation monitoring tools.

Valuable Features:

Nagios doesn't get the respect it deserves; most likely due to the fact that it doesn't have a licensing cost. However, when implemented correctly, this is a powerful enterprise toolset. Specifically, Nagios provides massive flexibility in terms of the types of endpoints you want to monitor (infrastructure, rudimentary application, process, and storage) and a wide variety of conditions to evaluate across including binary type conditions analysis (like threshold exceeded or not) or degrees of conditions violations (such as 30% warning; 80% critical). The dashboarding and heads up display is practical and useful for enterprise/network operations center use cases. The extensibility of Nagios also allows for integration to ticketing systems further adding value for service support and production monitoring use cases. 

Improvements to My Organization:

  • Low cost approach for massive scale infrastructure monitoring
  • Rapid deployment, if you know what you are doing you can have a solid Nagios implementation up and running in short order
  • Accurate and actionable information 
  • Ability to fine tune alert and condition management engines

Room for Improvement:

Dashboards and HUDS could use a facelift to be more in line with next generation monitoring tools that really have amazing UI’s. Sadly, many people may think that Nagios itself as a tool may not be sophisticated because it lacks the typical definition of a sophisticated UI.
 
Nagios has significant capability and opportunity for customizations to really “dial-in” the implementation to suit your specific enterprise requirements. But, enabling many of these capabilities requires an SME and to sustain and support the implementation requires effort and manpower. The larger the implementation and more extensive the customizations- the more resource intensive the deployment will become.

Application level monitoring is limited.

Other Advice:

To really maximize the power of Nagios, you need an SME (but that is true of anything in IT).

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Constructor of the computer systems at a security firm with 51-200 employees
Vendor
When compared to earlier versions, it looks like 4.x has lost the statusmap.cgi module.

What is most valuable?

  • Reliability
  • Security
  • Flexibility
  • Functionality
  • Availability - controllability anywhere and with different methods

What needs improvement?

When compared to earlier versions, it looks like 4.x has lost the statusmap.cgi module.

Update April 2016:

I have fixed the problem with statusmap.cgi by upgrading to version 4.1.1. In the old version this module had not been compiled.

For how long have I used the solution?

I've used it for six years.

What was my experience with deployment of the solution?

I have had no problems deploying it.

What do I think about the stability of the solution?

I have no stability issues.

What do I think about the scalability of the solution?

I currently do not need to scale on my network.

How are customer service and technical support?

Customer Service:

I only have the free version, which does not have customer service.

Technical Support:

I only have the free version, which does not have technical support.

Which solution did I use previously and why did I switch?

We use Cisco ASA and MySQL devices alongside Nagios, as our network infrastructure needed expanding and required more serious hardware solutions.

What was our ROI?

I believe it is hard to calculate for hardware.

What's my experience with pricing, setup cost, and licensing?

I only use the free version.

Which other solutions did I evaluate?

  • Amanda
  • Cacti
  • Zabbix
  • Icinga (after installation).

What other advice do I have?

As a rule, any device is obsolete upon delivery. Pick the solution for your business based on your specific needs.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
IT Support Technician
Vendor
It works. What more did you want?

What is most valuable?

It has been a reliable source of information regarding the state of the servers within the organisation and the flexibility of some of the features including the command structure has been invaluable in tracking some recurring faults.

How has it helped my organization?

A good example more recently is where the DHCP/DNS servers kept dropping their scopes, making it difficult for users whose machines were releasing. I managed to come up with a modification to a script that could be inserted into the Nagios client (NSClient++) and checked so that an alert could be generated if the scopes were dropped to allow the administrators to immediately remedy the fault in the short term. By retaining some of the information they could also check for trending as part of their fault finding process for a longer term fix.

What needs improvement?

Some of the reporting functionality is a bit basic and configuration is a chore although by the use of NagiosQL this can be made a lot easier.

For how long have I used the solution?

5-6 years

What was my experience with deployment of the solution?

Beyond the usual learning curve when adopting a new package, not really, though I did need to brush up on some Linux skills including Apache so that the web interface could be seen.

What do I think about the stability of the solution?

None. Under Linux, Nagios is pretty stable to the point that it could stay in place and active longer than most of the servers it monitored. Since the system can self test its configuration, it is normally impossible to start Nagios with a fault present.
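
The self-test mentioned here is the `-v` (verify) option of the `nagios` binary, which checks the configuration and exits with a non-zero status if it finds errors. A minimal sketch of using it as a guard before a restart follows. The binary and config paths are assumptions (the usual locations for a source install) and will differ on packaged installs, and the function names are made up for the example.

```python
"""Verify the Nagios configuration before restarting (illustrative sketch)."""
import subprocess

# Assumed default paths of a source install; adjust for your distribution.
NAGIOS_BIN = "/usr/local/nagios/bin/nagios"
NAGIOS_CFG = "/usr/local/nagios/etc/nagios.cfg"


def verify_command(binary=NAGIOS_BIN, config=NAGIOS_CFG):
    """Build the command line for a configuration check (nagios -v <cfg>)."""
    return [binary, "-v", config]


def config_is_valid(binary=NAGIOS_BIN, config=NAGIOS_CFG):
    """Run the check and return (ok, report_text)."""
    try:
        result = subprocess.run(
            verify_command(binary, config),
            capture_output=True,
            text=True,
        )
    except OSError as exc:
        # The binary is missing or not executable.
        return False, str(exc)
    return result.returncode == 0, result.stdout
```

A deployment script would only go on to restart the service when `config_is_valid()` returns `True` as its first element, and would print the report otherwise.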

What do I think about the scalability of the solution?

No.

How are customer service and technical support?

Customer Service:

Can't comment on this as Nagios Core is supplied without support.

Technical Support:

This is one down side to Nagios Core as it is supplied without support (Nagios XI can be obtained for a price which includes support). There are some support boards, however, that are an invaluable source of help which I have both used and contributed to.

Which solution did I use previously and why did I switch?

The outgoing system was Network Eagle, which was good at monitoring but not very good at presenting its results. Nagios was certainly a step up, as we had previously needed to use a Visual Basic add-on to display results, which was limited to little more than a ping test display.

How was the initial setup?

The initial setup involved making sure that you knew what you were monitoring, where, what and how. Once this was done it was then possible to complete a default template which could be used to set up a server. As ever, the main effort in the beginning (once the product was selected) was in designing the layout. The actual setup was somewhat laborious (as I had not yet set up NagiosQL) and repetitive but once done, the housekeeping was minimal.

What about the implementation team?

This was all completed in-house.

What's my experience with pricing, setup cost, and licensing?

The only actual cost was the cost of a set of feet for the display unit that was used in the service desk area. Everything else was either end of life machinery (i.e. the server) or freeware/gnuware (openSUSE Linux, the packages themselves). There is no day-to-day cost other than the usual running cost of the server.

Which other solutions did I evaluate?

OpsManager

Zabbix

What other advice do I have?

Nagios Core is a great solution for monitoring pretty much any size of deployment but you do need to know your way around a Linux system to set it up and run it. The skills you need include knowing the Apache setup on your chosen distro, configuring and compiling GCC tarballs and some idea about configuration syntax. Adding NagiosQL makes it simpler but that also needs some fettling to get it to work reliably. It also helps to be good with Windows administration though chances are that if you are looking at this sort of thing, you may be aware of that. Nagios does not detect systems out of the box and while it can be made to use WMI, it tends to be better working with the NSClient++ service on Windows which can be made to work much like the NRPE service which does the same duties under Linux and Unix.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user68349
Head of Development at a financial services firm with 501-1,000 employees
Vendor
Zabbix vs Nagios comparison
For years, I was using Nagios for server monitoring, but now I'm in the process of switching to Zabbix. I also use a third, much simpler system to monitor the main monitoring system.

Here is a practical comparison of Nagios vs Zabbix:

Zabbix pros:

  • Zabbix monitors all main protocols (HTTP, FTP, SSH, SMTP, POP3, SMTP, SNMP, MySQL, etc)
  • Alerts in e-mail and/or SMS
  • Very good web interface
  • Native agent available on Windows, OS X, Linux, FreeBSD, etc
  • Multi-step web application monitoring (content, latency, speed)
  • Can visualize and compare any value it monitors
  • System "templates"
  • Monitoring of log files and reboots *
  • Local monitoring proxies **
  • Customizable dashboard screens
  • Real-time SLA reporting

Nagios pros:

  • Nagios monitors all main protocols (HTTP, FTP, SSH, SMTP, POP3, SMTP, SNMP, MySQL, etc)
  • Alerts in e-mail and/or SMS
  • Multiple alert levels: ERROR, WARNING, OK
  • "Flapping" detection
  • Automatic topography display
  • Completely stand-alone, no other software needed
  • Web content monitoring

Zabbix cons:

  • Zabbix is more complex to set up
  • Escalation is a bit strange ***
  • No flapping detection
  • Documentation is spotty sometimes
  • Uses a database (like MySQL)

Nagios cons:

  • Nagios needs SSH access or an addon (NRPE) to monitor remote system internals (open files, running processes, memory, etc)
  • Web interface is mostly read-only ****
  • No charting of monitored values (different systems like "Cacti" or "Nagiosgraph" can be bolted on)

* Albeit log and reboot monitoring means that one gets an "ERROR" and a "RECOVERY" message instead of one "CHANGED" or "REBOOTED" message. One gets used to it.

** For example, when there are multiple sites, each site can have its own "proxy" (local Zabbix monitor), taking load off the main Zabbix server, and collecting data even if the connection to the main server is severed.

*** It's great that higher levels of escalation get "ERROR" alerts only after some time; but in Zabbix their "RECOVERY" messages are delayed too. I don't see the point.

**** On the web admin of Nagios, one can acknowledge problems, disable alerts, and reschedule testing. But one cannot add a new host or service.

Of course, both systems have many more features than what's listed here. I only wanted to list the points that I base my decision on.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user12225
Engineer at a manufacturing company with 501-1,000 employees
Real User
Nagios vs. SolarWinds - two completely different playing fields
I have set up a Nagios server from scratch as well as worked with Solarwinds pretty extensively. From my perspective they are on two completely different playing fields. Nagios definitely has its place, it's free... and it works well in a smaller environment. Solarwinds is expensive but it is a lot more robust than Nagios. Solarwinds does require you to install "Modules" in order to have in-depth application monitoring, etc... Then again, so does Nagios... but you have to pay an arm and a leg for Solarwinds.

So depending on how big your environment is, you'll have to evaluate if the cost is worth it. With Nagios, you'll spend the money you save on the time it takes to set it up. It takes a lot of time and determination to understand its inner workings.

Solarwinds is a lot more than just a network monitoring tool. A quick example: You can develop "ghost runs" of an application and have it monitor the latency between steps. Meaning, you could configure it to load a web page, login to the webpage and run a link to gather data, all the while timing how long it takes to get from step to step. That gives you an idea of how much more Solarwinds has to it.

Nagios does have many open-source modules you can use (hell, I even used one to telnet into an old AS400 and monitor running processes).

So like I said, it depends on the environment and what you want out of the system. To answer the question about netflow, Nagios itself I don't think can do netflow but it can pair up with another module that can (and you still get to see it from a single pane of glass). Any specific questions let me know!

There's a ton of open source software out there, some of it built on Nagios and some not: Ninja (front end GUI for Nagios), Zenoss, What's Up Gold (YUCK!), etc... You could also get things like Alienvault (Nagios is built in) that has more than just monitoring in it (it's an Open Source IDS). Cacti can be paired with Nagios to provide you with graphs for bandwidth utilization... Ok now I'm starting to blab, I'll end it here.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Consultant at a tech consulting company with 51-200 employees
Consultant
Everyone ends up using Nagios or a derivative just because everyone else does
Everyone ends up using nagios or a derivative just because... well everyone else does. The size of your org really matters a lot with what you are doing here as Zabbix might fit you right or not at all.

Lately I've been setting up nagios with a graphite back end for people, then taking advantage of the fact that you can write your own plugins for nagios to send data to both systems. You can throw a lot of data at graphite and make some super pretty graphs if that is what you are after. For example, imagine having all the contents of a vmstat/iostat every X seconds... for ALL your servers, queryable with less than a minute of latency. You can do that with nagios+graphite+yourownfixins. ... and then you show Dev how easy it is to log data into carbon/graphite and become a super hero.
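To make the "send data to both systems" idea concrete, here is a minimal sketch, not the reviewer's actual code. It assumes a Carbon listener speaking the plaintext protocol (one `path value timestamp` line per metric, conventionally on port 2003); the host name `graphite.example.com` and the metric path are hypothetical placeholders.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one metric in Carbon's plaintext protocol: '<path> <value> <timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def nagios_line(label, value, uom=""):
    """Format plugin output for Nagios; the part after '|' is performance data."""
    return f"OK - {label} is {value}{uom} | {label}={value}{uom}"


def send_to_carbon(lines, host="graphite.example.com", port=2003, timeout=5):
    """Send pre-formatted lines to a Carbon plaintext listener.

    The host is a placeholder (an assumption); point it at your own Carbon instance.
    """
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


# Typical use inside a plugin (left commented out because it needs a live Carbon host):
#   value = 0.42
#   send_to_carbon([carbon_line("servers.web01.load1", value)])
#   print(nagios_line("load1", value))
```

The same measured value ends up in both places: Nagios gets a status line it can alert on, and Carbon gets a timestamped data point for graphing.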

When you start hoarding this much data you can start asking some really detailed questions about disk performance, network latencies, system resources, etc... that before were just guesstimates. Now you have the data and the graphs to back them up.

I'm also a big fan of Pandora FMS but I've never implemented it anywhere professionally and the scope it takes is pretty large.

(I should note, nagios is pretty terrible, it's no better than things we had a decade ago.)

The real truth here is that all the current monitoring systems are pretty terrible given that they are no better than what we had a decade ago. Every good sysadmin group makes them work well enough, but there is a lot of making them work. Great sysadmins go on to combine a couple of them with their own bits to make the system a bit more proactive than reactive, which is what most people expect out of monitoring.


Reactive monitoring is fine for certain companies and certain situations and it is easily obtainable with nagios, zabbix, home-brew, stupidspendmoney solution, etc... However reactive monitoring is just the base point for most, it certainly doesn't handle big problems well, or have the capacity to predict events slightly before they are happening. This level of monitoring also doesn't give you much data after an event to figure out what went wrong.


Great admins go on to add proactive systems monitoring and in some cases basic logic monitoring. This is what a lot of us do all the time, to avoid getting paged in the middle of the night, or to know what to pick up at Fry's on the way into the office. Proactive monitoring covers a lot more things than basic monitoring, and it is essentially the level where everyone works now, with nagios, etc... That's certainly fine for today and tomorrow. But it doesn't tell you anything about next quarter, and when you ask queries about events in the past they are often very basic in scope.


The other amazingly huge drawback with current monitoring is that if you want to monitor business or application logic, it is going to be something you custom fit into whatever monitoring system you have. This tends to be unwieldy, and it is only effective for answering basic questions like, "What's the impact on sales if we lose the east coast data center and everything routes through the west?" That's a fine question, but it isn't a question that will get you to the next level, better than your competitors.


So what's next? I'll tell you where I think we should be going and how I am sort of implementing it at some places.


Predictive monitoring on systems AND business logic, with lots of data, and very complex questions being answered. This can be done right now with nagios, graphite and carbon. Nagios fills the monitoring and alerting needs. Carbon stores lots of numerical data, very fast, from a lot of sources. Finally, with Graphite you can start asking really serious questions like "How did the code push affect overall page performance time, while one colo site was down? What's the business cost loss? Where were the bottlenecks in our environment? Server? Disk? Memory? Network? Code? Traffic?" Once you've constructed one of these lists of questions in graphite you can save it for the future, and not only monitor it, but, because of the legacy data kept on so many key points, use it for future predictions.


That said, how do you do all that now? Well, you throw nagios, graphite and carbon out there and then you CREATE a whole lot of stuff that is specific to your org. This is a lot of work and a lot of effort, and it takes time and a real understanding of the full application and what your end SLA goals are.


So how do we do all this?


You as an admin do this, by creating custom nagios plugins and data handlers on your systems and throwing them in to carbon. As an admin you measure everything, and I mean everything. Think all of the output from a vmstat and an iostat logged in aggregate one minute chunks on every single server you have and kept for years.
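As a rough illustration of "logged in aggregate one minute chunks", the sketch below averages raw samples into one-minute buckets before they are shipped anywhere. The input shape (a Unix timestamp plus a dict of metric values) is an assumption made for this example; the keys `us` and `wa` stand in for vmstat's user-CPU and I/O-wait columns.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_minute(samples):
    """Average raw samples into one-minute buckets.

    samples: iterable of (unix_timestamp, {metric_name: value}) pairs.
    Returns {minute_start_timestamp: {metric_name: mean_value}}.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for ts, metrics in samples:
        minute_start = int(ts) - int(ts) % 60  # floor to the start of the minute
        for name, value in metrics.items():
            buckets[minute_start][name].append(value)
    return {
        minute: {name: mean(values) for name, values in per_metric.items()}
        for minute, per_metric in sorted(buckets.items())
    }


# Hypothetical samples taken every 30 seconds or so.
samples = [
    (120, {"us": 10, "wa": 2}),
    (150, {"us": 20, "wa": 4}),
    (185, {"us": 30, "wa": 0}),
]
per_minute = aggregate_by_minute(samples)
```

Averaging is only one choice; keeping the max or a percentile per minute may be more useful for spiky metrics such as I/O wait.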


From the dev side, you get the Lead Dev to agree on some key points where the AppStack should put out some data to carbon. This can be things like time to login, some balance value, whatever metric you want to measure. The key here is to have business logic metrics AND system metrics in the same datastore within Carbon. Now you get to ask questions across both data sets, and you get to ask them frequently and fast. You are able to easily make predictions about how more load will impact the hardware, i.e. do we need more spindles, more memory, etc...


This is what I have been doing with some companies in SV right now. It's not pretty or fully blown out yet, because it is a big huge problem and our current monitoring sucks. :D But it IS doable with current stuff, and it is quite amazing to know answers to questions that were previously only dreamed about.


What's after that? The pie in the sky next level, would be having an app box in every app group running in debug mode, receiving less traffic of course through the load balancers, and loading all that debug data into carbon. Then you get to ask questions about specific bits of a code release and performance on your real production environment.


... so those are my initial thoughts. Any comments? :)


Further, once you have all this, you can write nagios plugins to poll carbon for values on questions you have created, and then alert not only on system logic and basic app metrics, but on real queries that are complex. Stuff like "How come no one has bought anything off page X in the last two hours, is it related to these other conditions? Oh. It is. Create me an alert in nagios so we can be warned when it looks like this is about to happen again." With much more data across more areas you can ask and alert on pretty much anything you can imagine. This is how you make it to the next level.
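A plugin that polls stored metrics and alerts on them could look roughly like the sketch below. It assumes the data is read back through Graphite's render API with `format=json`, which returns a list of series, each with `datapoints` as `[value, timestamp]` pairs. The base URL, the metric name `shop.pagex.purchases`, and the thresholds are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Exit codes that Nagios plugins use to report state.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def render_url(base_url, target, window="-2h"):
    """Build a Graphite render API URL that returns JSON for the given target."""
    query = urllib.parse.urlencode({"target": target, "from": window, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


def fetch_series(url, timeout=10):
    """Fetch and decode the JSON series list (needs a reachable Graphite server)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def evaluate(series, warn_below, crit_below):
    """Sum all non-null datapoints and map the total to a Nagios state and message."""
    values = [
        value
        for s in series
        for value, _ts in s.get("datapoints", [])
        if value is not None
    ]
    if not values:
        return UNKNOWN, "UNKNOWN - no datapoints returned"
    total = sum(values)
    perfdata = f"total={total}"
    if total < crit_below:
        return CRITICAL, f"CRITICAL - total {total} is below {crit_below} | {perfdata}"
    if total < warn_below:
        return WARNING, f"WARNING - total {total} is below {warn_below} | {perfdata}"
    return OK, f"OK - total {total} | {perfdata}"


# In a real plugin you would do something like (placeholders throughout):
#   url = render_url("http://graphite.example.com", "shop.pagex.purchases")
#   status, message = evaluate(fetch_series(url), warn_below=5, crit_below=1)
#   print(message); sys.exit(status)
```

Keeping `evaluate` separate from the HTTP call makes the alert logic easy to test against canned data before it is wired into Nagios.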

Disclosure: I am a real user, and this review is based on my own experience and opinions.
sandeep
Senior Manager of Network at a tech company with 1,001-5,000 employees
Real User
Top 20

Popular, cost effective, versatile Open Source NMS tool that requires some amount of exploration and effort by network admin
Nagios Overview
Nagios is a free, open-source network monitoring system. It monitors routers, switches, servers, websites, etc. for flaps and service interruptions, and monitors bandwidth via SNMP. Different color codes can be used to easily identify the link state. Nagios can be used to monitor anything from a small network (a few nodes) to a very big enterprise network. Nagios is very stable and has ample plugins available for added monitoring capability. Nagios Core is the free basic application; plugins are used to extend Core's capability. Plugins are either compiled binaries written in languages such as C or C++, or executable scripts written in Perl, shell, PHP, Python, or VBScript. Plugins are executed by Core and return their results to Core for further processing. If you require support, you can purchase Nagios XI for a fixed one-time fee with limited email support or a support contract.
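To illustrate how a plugin "returns the results to Core": a plugin is just an executable that prints one status line (optionally followed by `|` and performance data) and exits with 0 for OK, 1 for WARNING, 2 for CRITICAL, or 3 for UNKNOWN. The disk check below is a minimal sketch of that contract; the default thresholds are arbitrary assumptions.

```python
import shutil

# Exit codes Nagios Core interprets as service states.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_threshold(value, warn, crit):
    """Map a measured value to a state: at or above crit is CRITICAL, at or above warn is WARNING."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def disk_used_percent(path="/"):
    """Return the percentage of disk space in use on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def run_check(path="/", warn=80.0, crit=90.0):
    """Run the check and return (exit_code, status_line) in plugin output format."""
    try:
        used = disk_used_percent(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    status = check_threshold(used, warn, crit)
    # Performance data format: label=value[UOM];warn;crit
    line = f"DISK {LABELS[status]} - {used:.1f}% used on {path} | used={used:.1f}%;{warn};{crit}"
    return status, line


# A real plugin would end with:
#   status, line = run_check()
#   print(line); sys.exit(status)
```

Because the contract is only "print a line, exit with a code", the same check could be written in any of the languages listed above.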

Pros :
1. We selected Nagios Core because it is much more cost-effective than its competitors. (Core is free under the GNU General Public License.)
2. Highly robust, flexible, and versatile tool, like a Swiss Army knife.
3. Nagios is scalable; scalability was essential for our upcoming projects.
4. Thousands of plugins are available to extend features and functionality, e.g. checking Cisco CPU utilization, interface state and bandwidth, alerting on changes in an IOS device, sending an email alert when a certain threshold is reached on an interface, etc.
5. Email and SMS alerts are available.
6. Monitoring via SNMP.
7. Supported by a very big, active community on the internet.

Cons :
1) In Nagios Core, some features are not provided out of the box, but they can be carried out with existing plugins and a config tool, or scripted yourself.
2) For Core, usability is limited without proper tweaking or customization.
3) For Core users, the administrator needs to put in some effort; knowledge of Linux and scripting is an advantage when customizing.
4) Core does not support auto-discovery, but it can be implemented with the Nagios Discovery Tool (NDT); Nagios XI also has this.
5) Nagios XI is not free, but it has value and is cheaper than its competitors.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user1020
Head of Data Center at a university with 1,001-5,000 employees
Vendor
Nagios is a great network and equipment monitoring system. Installing and configuring it from source is not easy but there are prepackaged bundles that can get you started with Nagios in a jiffy.

Valuable Features:

We started experimenting with Nagios six years ago to get a feel for it as a recommended network monitoring system. We tested other products like OpenNMS, Zabbix and Zenoss, but we finally decided to go for Nagios as it was very extensible. At that time, only Nagios could be configured to work with our in-house developed SMS-based messaging system. This is probably the greatest advantage of Nagios: it can be customized to a degree that suits your monitoring needs. Its architecture also allows distributed monitoring, which is a really great feature for reducing network traffic.

Room for Improvement:

Nagios' great customizability is also one of its greatest drawbacks. In the early days, installation and configuration of Nagios was not for the faint of heart. You need a lot of patience and system administration skill to figure out what goes into which file. This is still true today, as some still install Nagios from source. Aside from installation, configuring Nagios from the command line is very tedious, time-consuming, and prone to user error. The core Nagios installation from source does not provide an integrated management system, and you have to install such systems separately.

Nagios is a great product. I would highly recommend it to organizations that require a great deal of flexibility in customizing their network monitoring system.

Other Advice:

Though the core Nagios system is still very challenging to install, a lot of bundled installers with very good GUIs for configuring Nagios now exist. Instead of doing everything manually from the command line, you can just grab one of these packaged forks and get started with Nagios in as little as 10 minutes.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user2061
Operations Expert at a tech services company with 51-200 employees
Consultant
Large ecosystem of tools, but default interface is clunky and slow.

Valuable Features:

Open source and very flexible. Large ecosystem of tools and custom monitors built up around Nagios. All configuration is in text files, so it is easy to keep it in version control and generate new configs from other tools.
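As an example of generating configs from other tools, the sketch below renders Nagios host object definitions from a simple inventory. The inventory contents are made up, and the `generic-host` template name is an assumption borrowed from the sample configuration; substitute whatever template your own setup defines.

```python
def host_definition(name, address, template="generic-host"):
    """Render one Nagios 'define host' object as config text."""
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def render_hosts(inventory):
    """Render all hosts in a {host_name: address} mapping, sorted for stable diffs."""
    return "\n".join(host_definition(name, addr) for name, addr in sorted(inventory.items()))


# Hypothetical inventory, e.g. exported from a CMDB or a spreadsheet.
inventory = {"web01": "192.0.2.10", "db01": "192.0.2.20"}
config_text = render_hosts(inventory)
```

Sorting the output keeps version-control diffs small when hosts are added. The generated file should still be checked with Nagios's own config verification (`nagios -v` against the main config file) before reloading.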

Room for Improvement:

Default interface is clunky and slow. There can be a steep learning curve if you haven't worked with it before. Adding devices or services requires reloading the config or restarting the service; I would like this to be more dynamic. It seems that most of the new development is going towards Nagios XI (the paid enterprise version) rather than open-source Nagios. I would like to see some of the newer ideas in Icinga and/or MK Livestatus integrated into open-source Nagios.
Disclosure: I am a real user, and this review is based on my own experience and opinions.