Nagios Core Review

Zabbix vs Nagios comparison


For years, I was using Nagios for server monitoring, but now I'm in the process of switching to Zabbix. I also use a third, much simpler system to monitor the main monitoring system.

Here is a practical comparison of Nagios vs Zabbix:

Zabbix pros:

  • Monitors all main protocols (HTTP, FTP, SSH, SMTP, POP3, SNMP, MySQL, etc.)
  • Alerts by e-mail and/or SMS
  • Very good web interface
  • Native agent available on Windows, OS X, Linux, FreeBSD, etc.
  • Multi-step web application monitoring (content, latency, speed)
  • Can visualize and compare any value it monitors
  • System "templates"
  • Monitoring of log files and reboots *
  • Local monitoring proxies **
  • Customizable dashboard screens
  • Real-time SLA reporting

Nagios pros:

  • Monitors all main protocols (HTTP, FTP, SSH, SMTP, POP3, SNMP, MySQL, etc.)
  • Alerts by e-mail and/or SMS
  • Multiple alert levels: ERROR, WARNING, OK
  • "Flapping" detection
  • Automatic topology display
  • Completely stand-alone, no other software needed
  • Web content monitoring

Zabbix cons:

  • More complex to set up
  • Escalation is a bit strange ***
  • No flapping detection
  • Documentation is sometimes spotty
  • Uses a database (like MySQL)

Nagios cons:

  • Needs SSH access or an add-on (NRPE) to monitor remote system internals (open files, running processes, memory, etc.)
  • Web interface is mostly read-only ****
  • No charting of monitored values (separate systems like "Cacti" or "Nagiosgraph" can be bolted on)
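
The "flapping" detection mentioned in the lists above can be illustrated with a short sketch. Nagios weights recent state changes more heavily over the last 21 checks; the version below is a simplified, unweighted variant, and the function names and the 20%/50% thresholds are assumptions picked for illustration, not Nagios defaults.

```python
from typing import Sequence


def percent_state_change(history: Sequence[str]) -> float:
    """Share of consecutive check pairs whose state differs, in percent."""
    if len(history) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return 100.0 * changes / (len(history) - 1)


def update_flapping(history: Sequence[str], was_flapping: bool,
                    low: float = 20.0, high: float = 50.0) -> bool:
    """Hysteresis: start flapping at or above `high`, stop only below `low`.

    The thresholds are illustrative assumptions, not Nagios defaults.
    """
    pct = percent_state_change(history)
    if not was_flapping and pct >= high:
        return True
    if was_flapping and pct < low:
        return False
    return was_flapping
```

With this hysteresis, a service that alternates between OK and ERROR on every check is marked as flapping, and the mark is cleared only once the recent history has calmed down below the lower threshold.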

* Although log and reboot monitoring means that one gets an "ERROR" and a "RECOVERY" message instead of a single "CHANGED" or "REBOOTED" message. One gets used to it.

** For example, when there are multiple sites, each site can have its own "proxy" (local Zabbix monitor), taking load off the main Zabbix server and collecting data even if the connection to the main server is severed.
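
The store-and-forward idea in this footnote can be sketched as a toy model. This is not Zabbix code: the `BufferingProxy` class and its `send` callback are hypothetical names, and a real proxy keeps its buffer in a database rather than in memory.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Sample = Tuple[str, float]  # (item name, value)


class BufferingProxy:
    """Collects samples locally and forwards them when the server is reachable."""

    def __init__(self, send: Callable[[Sample], bool]) -> None:
        # `send` is a hypothetical callback; it returns True if the server took the sample.
        self._send = send
        self._buffer: Deque[Sample] = deque()

    def collect(self, item: str, value: float) -> None:
        """Store a sample locally, regardless of the link state."""
        self._buffer.append((item, value))

    def flush(self) -> int:
        """Forward buffered samples in order; stop at the first failure."""
        sent = 0
        while self._buffer:
            if not self._send(self._buffer[0]):
                break  # link is down: keep the remaining samples for later
            self._buffer.popleft()
            sent += 1
        return sent

    def pending(self) -> int:
        """Number of samples still waiting to be forwarded."""
        return len(self._buffer)
```

Samples collected while the link is down stay in the buffer, and the next successful `flush()` delivers them in their original order.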

*** It's great that higher levels of escalation get "ERROR" alerts only after some time; but in Zabbix their "RECOVERY" messages are delayed too. I don't see the point.
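
A minimal sketch of the alternative this footnote implies: higher levels receive the "ERROR" only after the problem has lasted a while, but the "RECOVERY" goes out at once to every level that was alerted. The level names, the delays, and both function names are assumptions for illustration and do not reflect how either tool is configured.

```python
from typing import Dict, List

# Hypothetical escalation levels: seconds a problem must last before each is alerted.
ESCALATION_DELAYS: Dict[str, int] = {"on-call": 0, "team-lead": 900, "manager": 3600}


def levels_to_alert(problem_age: int,
                    delays: Dict[str, int] = ESCALATION_DELAYS) -> List[str]:
    """Levels that should have received an ERROR alert after `problem_age` seconds."""
    return [level for level, delay in delays.items() if problem_age >= delay]


def levels_to_recover(problem_age_at_recovery: int,
                      delays: Dict[str, int] = ESCALATION_DELAYS) -> List[str]:
    """RECOVERY is sent immediately, to exactly the levels that saw the ERROR."""
    return levels_to_alert(problem_age_at_recovery, delays)
```

For a problem that clears after 1000 seconds, only the first two levels saw the alert, so only those two would get the recovery message, with no additional delay.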

**** In the Nagios web admin, one can acknowledge problems, disable alerts, and reschedule checks. But one cannot add a new host or service.

Of course, both systems have many more features than what's listed here. I only wanted to list the points that I based my decision on.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
8 Comments
Director of IT with 1,001-5,000 employees, Vendor

Could you please add the size of the test/PoC environment?

That would be helpful to read, and it would put your comparison into better context.

Thank you.

26 December 13
Patrik, Consultant

Nice comparison, but the point where you mark a database as a con is a bit weird to me. Yes, your DB needs to be fast and it probably adds some extra cost to your setup, but on the other hand looking up data is quick and you don't throw away information gathered from your systems. So the use of a database has certain advantages and drawbacks. Also, "Completely stand-alone, no other software needed" for Nagios seems weird to me, as Nagios relies heavily on 3rd-party tools. Already at installation a lot of extra packages are needed, while Zabbix installs almost out of the box with a minimal set of extra packages.

27 March 14
Henry Huang, Real User

That's why you need to look into OMD (Open Monitoring Distribution). It aggregates the most powerful 3rd-party plug-ins for Nagios into one easy-to-install package. No individual installation is required, and historical data is kept for reporting. Please check out the blog post I wrote for more detail on why OMD is a better choice. http://blog.unicsolution.com/2013/11/best-monitoring-solution-omd-nagios.html

22 July 14
Patrik, Consultant

It's a package, but you still rely on 3rd parties.
The link you point to talks about failing clients in Zabbix 1.8.
Zabbix is now at 2.2 and almost 2.4; even in 2.0 I never had a client crash.
And slow graphs in Zabbix?? My experience is just the other way around. I come from Nagios forks, and they were painfully slow compared to Zabbix. Also, the number of values per second that you can monitor with Zabbix is much, much higher than with Nagios.
Yes, files are easier to manage with Puppet, Ansible, Chef ... Puppet has modules for it, I think. It does make things more complicated, but on the other hand the API makes it easy to communicate with other software.
The CMDB is limited, but who cares? I use a real CMDB program and just let Zabbix feed it through the API.

22 July 14
Henry Huang, Real User

Did you even read the performance part of the article? What's the performance of Zabbix? https://mathias-kettner.de/checkmk_checkmk_benchmarks.html
And OMD is not just about files; it has an API too, and I am using its API for automation. A CMDB is a good idea. Which one do you use?

22 July 14
Patrik, Consultant

Of course I read it. 3M values per minute, that's less than 900 vps, still far below what Zabbix is capable of.
But with "slow" I was pointing at the graphs part.
People just fail with Zabbix on the DB part. You need decent hardware and good database tuning.
Too many people want to virtualise these days, and Zabbix should not be virtualised; it needs decent hardware.
Like I wrote before, the API makes it more complicated, but that should not be the reason for not choosing it.
I do agree that files would make life easier. I just don't agree with the arguments in the article, such as that more contributors would make it a better alternative. Zabbix keeps development more in-house: each contribution needs to follow strict coding rules, and it's written in C, which makes it harder to contribute.
What people don't know is that many new features in Zabbix, like VMware monitoring, are implementations fully sponsored by companies. Don't get me wrong, I'm not saying that OMD is bad. It's probably the best solution if you already have a monitoring solution based on Nagios or if you have Nagios knowledge in-house. But for a new setup I would really compare them both in more depth than the article did, as the reasons it gave for not choosing Zabbix are not really valid.
ATM I use GLPI + OCS.

22 July 14
Henry Huang, Real User

Thank you, Patrik.

You are right, it is not a fair comparison for Zabbix. But I just wanted to add that I never liked Nagios in the first place and was never an expert in it, because of the complex configuration and all the different file relationships you have to manage. What makes OMD stand out from Nagios is that it abstracts away all that complexity for you, so you don't even need to know what's running in the backend. Everything is nicely integrated and presented in one UI, just like Zabbix. No effort is required for integration and there is no need to understand how Nagios works; you just need to learn how OMD works.

22 July 14
Senior Network Engineer with 1,001-5,000 employees, Real User

So what do you guys suggest if a company has 25,000+ employees and thousands of network devices to monitor worldwide? Currently we are using SolarWinds, and we need to follow a distributed environment. We are looking for a centralized setup where all nodes can be managed and monitored from one location, including configuration backup and reporting. Any suggestions?

11 August 15