Co-Founder at Nobius IT
Reseller
Top 5
Fabulous flexibility - supporting a vast variety of data sources including IoT protocols. Very stable and well supported. Scalable to many thousands of values per second.
Pros and Cons
  • "The flexibility of this solution is amazing."
  • "Documentation terminology could be improved."

What is our primary use case?

I'm currently dealing with three implementations - two are in the cloud and one is on-premises. Our clients use the solution to monitor IT infrastructures, IT networks, applications, and cloud as well as containers.

I have a very exciting additional use case as well that we're working on, using Zabbix 5.x for Internet of Things (IoT) monitoring. It supports MQTT and Modbus, which allows us to monitor IoT devices.

The flexibility of taking monitoring data from such a wide variety of platforms - beyond traditional IT - makes Zabbix a highly flexible solution.

I'm a consultant for this solution as well as a user. I'm a Zabbix certified professional and director of a Zabbix authorized reseller (Nobius) in the UK.

How has it helped my organization?

As our business has grown, we use Zabbix to monitor our Zabbix implementations - a 'manager of managers' setup. This has given us better resource utilisation and improved our service delivery.

What is most valuable?

The ability to take data from multiple sources. The Zabbix agent is probably the most lightweight monitoring agent available. The agent itself is extensible, providing simple expansion capabilities to support new use cases. Alongside the built-in agentless monitoring via SNMP, SSH, WMI and others, this means we have a solution that has no limits as to the data that can be ingested and alerted on.
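
As a rough illustration of that extensibility (this is not the reviewer's own setup): the agent can run an external command for a custom item key through its UserParameter setting, so a small script is often all a new check needs. The sketch below is a hypothetical check that counts the files in a directory. The script name, the item key custom.spool.count and the install path are all assumptions made for the example.

```python
#!/usr/bin/env python3
"""Hypothetical custom check: count regular files in a directory.

It could be wired to the agent with a configuration line such as the
following (the key name and script path are assumptions):

    UserParameter=custom.spool.count[*],/usr/local/bin/spool_count.py "$1"
"""
import os
import sys


def count_files(directory: str) -> int:
    """Return the number of regular files directly inside `directory`."""
    with os.scandir(directory) as entries:
        return sum(1 for entry in entries if entry.is_file())


# When run as a script with one argument, print the count. The agent
# takes the command's standard output as the item value.
if __name__ == "__main__" and len(sys.argv) == 2:
    print(count_files(sys.argv[1]))
```

Keeping the logic in a plain function makes the check easy to test outside the agent before it is deployed.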

What needs improvement?

If anything could be improved, it would be some of the terminology that is used in the documentation. The documentation is good, but it's been translated into English and occasionally suffers from terminology issues. There are additional features that commercial software has that Zabbix doesn't. Full AIOps isn't cheap: things like machine learning and artificial intelligence attract a massive price premium and are rarely implemented properly. But they are major, major features.

Buyer's Guide
Zabbix
April 2024
Learn what your peers think about Zabbix. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
770,141 professionals have used our research since 2012.

For how long have I used the solution?

I've been working with Zabbix for over four years, following 25 years as an IT monitoring specialist within user organisations, consultancies and major global software vendors.

What do I think about the stability of the solution?

The stability is unparalleled. Processes are all well-behaved and logging is clear and succinct. Support for separating the front end, server and database allows resources to be load-balanced and clustered.

Zabbix runs equally well in cloud and on-premises configurations.

What do I think about the scalability of the solution?

The solution can take hundreds and thousands of values per second, so scalability is excellent. The ability to add proxy servers to distribute the data handling load is impressive and they are very straightforward to set up. This also adds to reliability in distributed environments.
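
To make "values per second" concrete, a common sizing exercise is to estimate the rate of new values as the number of monitored items divided by their update interval, summed over all groups of hosts. The sketch below does that arithmetic. The host counts, items per host and intervals are made-up figures for illustration, not numbers from this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemGroup:
    """A set of hosts that share the same item count and update interval."""
    hosts: int
    items_per_host: int
    interval_seconds: float


def new_values_per_second(groups) -> float:
    """Estimate incoming values per second across all groups."""
    return sum(g.hosts * g.items_per_host / g.interval_seconds for g in groups)


# Illustrative figures only (assumptions, not measurements).
groups = [
    ItemGroup(hosts=500, items_per_host=60, interval_seconds=60),
    ItemGroup(hosts=2000, items_per_host=20, interval_seconds=30),
]

print(f"{new_values_per_second(groups):.0f} values per second")
```

An estimate like this is one way to judge when a single server is enough and when spreading collection across proxies starts to make sense.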

How are customer service and support?

If you consider that Zabbix is open source, their technical support is fantastic because the people, the developers of the application, are the guys who do the technical support as well. You don't have to wait to get through different levels of technical support. You get a very, very knowledgeable person on the phone straight away which is a big plus.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

Yes. I previously used and consulted on HPE SiteScope, HPE Operations Manager, Micro Focus Operations Bridge and others.

SiteScope is useful, but it doesn't have multi-tenancy or proxy facilities and is agentless only, making it unjustifiable in an MSP environment.

Operations Manager was too complex to maintain and became obsolete.

Operations Bridge is resource hungry, complex to install and configure and extremely expensive.

How was the initial setup?

I've deployed so many times that the initial setup is straightforward, but I would say that for someone who is totally inexperienced in Linux, it can be a little time consuming. If you understand a little about Linux, then it's no problem. A full system can easily be configured in two hours but it took two days the first time I did it. If you're not a technical person you can still install it but it will likely take some time.

As an example, configuring SNMP trapping into Zabbix needs configuration outside of Zabbix itself. This is not complex, but can slow down the process for inexperienced installers.

What's my experience with pricing, setup cost, and licensing?

The software itself is open source; it can be easily downloaded and used with no limitations.

Be very careful about using the "appliance" configuration in a production environment. It is only suitable for evaluation or very small environments.

Invest in support, training and consultancy from Zabbix or from third parties. Architecting a robust, resilient and secure monitoring platform from day 1 will save time and money at a later stage.

Zabbix and 3rd parties offer far more than a traditional support contract. No other organisation in my experience includes pro-active and on-site support as a core part of their offerings. 

Which other solutions did I evaluate?

Nagios - too much development effort to maintain and configure.

What other advice do I have?

I believe it's crucial to plan the implementation. Just because the software is free, you shouldn't just install it and let it run. Plan your implementation carefully and you'll get more out of it than you ever thought possible.

I would rate this solution a nine out of 10. 

Which deployment model are you using for this solution?

Public Cloud

Disclosure: My company has a business relationship with this vendor other than being a customer: UK Based Certified Partner. Author is Zabbix Accredited Certified Professional
PeerSpot user
Regional Head at a tech services company with 51-200 employees
Real User
Open source, with great technical support and good scalability
Pros and Cons
  • "The product is very stable."
  • "There's a small module of APM, however, it is not an enhanced version. People usually ask for a full-fledged APM solution."

What is our primary use case?

The solution is a network monitoring tool. If you have any type of IT infrastructure, it can help you monitor that IT infrastructure and get the logs collected. You can control and manage your IT, et cetera.

What is most valuable?

The product is very stable.

It is a scalable tool. 

The solution is open source also. There's no cost for the license.

Technical support is very good.

What needs improvement?

They need to improve the APM solution, the Application Management solution. There's a small module of APM, however, it is not an enhanced version. People usually ask for a full-fledged APM solution. 

The initial setup could be a bit simpler. 

For how long have I used the solution?

I've been using the solution for five years. 

What do I think about the stability of the solution?

The stability is good. There are no bugs or glitches. It's reliable. It doesn't crash or freeze. 

What do I think about the scalability of the solution?

If a company needs to expand the product, it can do so with Zabbix. It's scalable. 

We have 21 to 22 clients using this solution currently.

How are customer service and support?

Technical support is very good. They are helpful and responsive and we are happy with their level of support. 

How was the initial setup?

I'd rate the solution's initial setup as a seven out of ten. 

The installation will take time according to the IT infrastructure involved. It depends on whether there are multiple locations or one location, and whether it is in the cloud or on-premises. Even so, deployment will take at least two to three weeks.

We have eight to ten people on our team that can handle deployment and maintenance. They are admins and engineers. 

What's my experience with pricing, setup cost, and licensing?

The solution is completely open-source and free to use. 

What other advice do I have?

I'm a Zabbix partner.

The solution is excellent. I would rate it at a ten out of ten. 

I would recommend the solution to other users and other organizations. 

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Sr. Linux Analyst at an energy/utilities company with 1,001-5,000 employees
Real User
We are able to do problem determination on runaway processes
Pros and Cons
  • "It can send messages to our ticketing system."
  • "It has good graphs of what is going on within the operating system."
  • "We are able to do problem determination on runaway processes."
  • "I would like to better be able to monitor Oracle processes."

What is our primary use case?

We use it to monitor Linux systems. It has performed well.

How has it helped my organization?

We are able to do problem determination on runaway processes. 

What is most valuable?

Graphing processes. It can send messages to our ticketing system. It has good graphs of what is going on within the operating system.

What needs improvement?

I would like to better be able to monitor Oracle processes.

What do I think about the stability of the solution?

The stability is good.

What do I think about the scalability of the solution?

The scalability is good.

How are customer service and technical support?

I have not used technical support.

Which solution did I use previously and why did I switch?

We did not previously use a different solution, so I asked my manager to look into it.

How was the initial setup?

I was involved in the initial setup. The initial setup was straightforward.

Which other solutions did I evaluate?

We evaluated SCOM.

What other advice do I have?

Most important criteria when selecting a vendor: 

  1. Features
  2. Price.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Founder at Art World Web Solutions
Real User
A great product with good alerts and fair price, but it needs more features and better development model, UI, and scalability
Pros and Cons
  • "It is a great product. The SNMP protocol tracking feature is good. I really like how it tracks SNMP. The alerts are also great."
  • "Its UI needs to be improved a little bit more so that an end-user is also able to handle it. I can handle it, but others should also be able to handle it in a better way. It becomes complex when we are growing and need to add proxies. We need more scalability features and documentation for different use cases. A lot of articles are available, but they need to be in proper documentation. For example, when you have thousands of servers that have to be monitored in different regions of the world, there should be some kind of documentation to describe how you can create proxies and add them. Sometimes, when you are using the database, it can get overloaded. When the network is growing, the number of transactions becomes very high, and the database gets overloaded. There should be information about how to reduce the load on the MySQL database, which is what Zabbix is using. The market is growing a lot, and it should be enhanced for a lot more things. We are currently bringing enhancements at our end for different use cases. For example, when dockerization is going on, how can we check the logs inside the Dockers. We should also be able to monitor and check the number of logins and add features such as SSO login and two-factor authentication as a protocol. These are the security features and concerns that we have to deal with. Currently, we are developing modules to add features to Zabbix, but they should also work on these features."

What is our primary use case?

We are using Zabbix in our project for demo purposes. One of our clients is also using Zabbix. They have a data center, and they use it for internal monitoring. They are on a cloud system.

What is most valuable?

It is a great product. The SNMP protocol tracking feature is good. I really like how it tracks SNMP. The alerts are also great. 

What needs improvement?

Its UI needs to be improved a little bit more so that an end-user is also able to handle it. I can handle it, but others should also be able to handle it in a better way.

It becomes complex when we are growing and need to add proxies. We need more scalability features and documentation for different use cases. A lot of articles are available, but they need to be in proper documentation. For example, when you have thousands of servers that have to be monitored in different regions of the world, there should be some kind of documentation to describe how you can create proxies and add them. Sometimes, when you are using the database, it can get overloaded. When the network is growing, the number of transactions becomes very high, and the database gets overloaded. There should be information about how to reduce the load on the MySQL database, which is what Zabbix is using. 

The market is growing a lot, and it should be enhanced for a lot more things. We are currently bringing enhancements at our end for different use cases. For example, when dockerization is going on, how can we check the logs inside the Dockers. We should also be able to monitor and check the number of logins and add features such as SSO login and two-factor authentication as a protocol. These are the security features and concerns that we have to deal with. Currently, we are developing modules to add features to Zabbix, but they should also work on these features.

For how long have I used the solution?

I have been using this solution for three to four years.

What do I think about the stability of the solution?

It is stable up to a level. 

What do I think about the scalability of the solution?

When you are growing and need to add proxies and other things, it becomes complex. To deal with this kind of complexity, more scalability features and documentation for different use cases are required.

How are customer service and technical support?

I did not connect with Zabbix support, but the client's team connected with them. I worked with the client initially, and after that, I gave them access and everything else. They directly sync up with Zabbix's support team.

How was the initial setup?

The initial setup is great. However, later on, when you are scaling it, it becomes complex.

What's my experience with pricing, setup cost, and licensing?

Its licensing is fair. It seems to be much cheaper than others.

What other advice do I have?

Zabbix is a good product. It just requires a better development model, better UI, and better scalability. It also needs more features.

I would rate Zabbix a seven out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
SOC Expert at a computer software company with 1,001-5,000 employees
Real User
Easy to use with good support and a fairly simple setup
Pros and Cons
  • "It meets my organizational needs. It's pretty easy to use."
  • "The product could be more secure and more stable."

What is our primary use case?

In my company, I have a lot of web services on the internet and use it for monitoring. For example, for concurrent sessions, I can count the HTTP requests, or I can use it to monitor the CPU and RAM in my devices, web application devices. 

What is most valuable?

It works well for my business. It meets my organizational needs. It's pretty easy to use.

It's very stable. It's got good reliability.

The pricing is okay.

For us, the support has been fine.

We have found the initial installation not that difficult.

What needs improvement?

The product could be more secure and more stable.

For how long have I used the solution?

I've been using the solution for about four years at this point. It's been a while. 

What do I think about the stability of the solution?

The solution is stable and reliable. The performance is good. That said, it could always be more stable. 

What do I think about the scalability of the solution?

We have more than 200 people on the product and it seems to work well for us. We've never had an issue with scaling. It's good for us and fits our needs.

How are customer service and support?

We haven't had an issue with technical support. We fill out a form when we run into issues. They are largely quite helpful. 

Which solution did I use previously and why did I switch?

I also am familiar with Datadog.

I'm not sure if we used anything before Zabbix.

How was the initial setup?

In terms of initial setup, it's pretty straightforward. At the Linux stage, you can introduce the Linux commands. In the environment in Linux, you can install Zabbix pretty easily.

The deployment doesn't take too long. It might take only two weeks.

We do have a team that can manage the deployment and maintenance of Zabbix as needed. Usually, we have one or two managers that are able to handle anything that comes up. The rest of the team is a bit more technical. 

What's my experience with pricing, setup cost, and licensing?

We have about 100 active licenses, however, I don't have many other details beyond that in terms of licensing and costs. My understanding is that the pricing is okay.

What other advice do I have?

The product is a standalone in my data center, my local data center.

I would recommend the solution to others. It's been good to us so far and we do have experts in our country.

I'd rate the product at a nine out of ten. We've been pretty happy with it in general.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Deputy Manager - Infrastructure at a retailer with 1,001-5,000 employees
Real User
Good for monitoring but complicated to configure and it needs to be more customizable
Pros and Cons
  • "The most valuable feature is monitoring."
  • "Having a more customizable interface and dashboard would be an improvement."

What is our primary use case?

We primarily use Zabbix for monitoring our infrastructure.

What is most valuable?

The most valuable feature is monitoring.

What needs improvement?

Having a more customizable interface and dashboard would be an improvement.

The interface could be more user-friendly because it can be really complicated if an end-user has to configure it. The administrator usually has to take care of that.

I would like to see more SNMP and storage support.

Application monitoring should be included in the future. I would like to see voice telephony monitoring and database monitoring.

The reporting functionality is limited.

For how long have I used the solution?

I have been working with Zabbix for the past four years.

What do I think about the stability of the solution?

This is a stable product.

What do I think about the scalability of the solution?

We have had no problem with scalability. The only people who use it are the IT staff, which is between 10 and 12 people.

How are customer service and technical support?

As we are using the free version, we do not have a support contract.

What's my experience with pricing, setup cost, and licensing?

We are using the free, open-source version.

What other advice do I have?

This is not a product that I recommend. Instead, I recommend using SolarWinds or ManageEngine for monitoring, because Zabbix has many features but limited usability. Reports are also limited. Basically, you get more features in SolarWinds or ManageEngine.

I would rate this solution a five out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Senior Manager of Engineering with 501-1,000 employees
Real User
Extremely powerful and flexible but the auto-discovery function has room for improvement
On a scale from 1-5 (1=worst, 5=best), how would you rate this product overall compared to similar products?
- In my experience there are two classes of network management systems: open-source systems that are generally free, and commercial systems complete with support as well as some advanced self-configuring features. Zabbix actually fits into both classes, but in my view it has more in common with the open-source systems.
- Compared to SIMILAR products, I would give it a 5. This means as compared to similar open-source tools that do not have a strong network auto-discovery feature.
- By network auto-discovery, I mean features found in tools like SolarWinds, NetMRI and other commercial products that have part of the configuration work done in advance. For example, when I plug in NetMRI and give it a list of login/password combinations and an IP range, it is able to self-configure, finding most of my network gear automatically. Network discovery is a useful feature that can reduce the amount of time it takes to integrate a system.
- Zabbix does have a discovery feature, but it is configured by the user. Zabbix is extremely powerful, and I got the network discovery tool working in just a couple hours after my first installation. The advantage is that it can be used to detect and configure non-standard devices.
- The self-configuring systems like SolarWinds and NetMRI seem like they have an advantage; however, there is a cost associated with that advanced function. The largest is that they only support a small set of big names, like Cisco. Not everything on a network is a Cisco, so that advantage quickly becomes less important.

For how long have you used this product?
- Over 6 years.

Which features of this product are most valuable to you?
- It is the flexibility of the system that I enjoy the most. I can make it do things that are unique to me, such as deep analysis of a custom device that I built, or of non-standard hardware that requires unique test methods. Of course it also does the standard stuff very well. I have Zabbix monitoring servers, network components, air conditioners, etc. I have it alerting field installers for an ISP, to let them know that they have made an error in configuring an end-user router.
- It can do anything I can imagine doing. I even keep an eye on my BBQ smoker at home with Zabbix.

Can you give an example of how this product has improved the way your organization functions?
- The best examples are in an ISP and in a large network of Hospitals.
- In an ISP, it allows the network operators to track the performance for each customer, and know about outages before the customers do. It allows the operators to track network quality so that problem trends are detected before customers are impacted. It also watches for new devices being connected to the network, and tracks environmental conditions in field. If we discover a new condition to watch for, it takes only seconds to add new tests to thousands of devices.
- In a hospital network where there are many mission-critical systems, I can use it to track and report on SLAs as well as monitor unique medical devices that you are not going to find supported by a system like SolarWinds. It allows me to create dashboards for executives, giving each management user a front page view that is specific to their needs. So each user sees what they need, and nothing that they don't need. With the discovery engine, I can take common network components, and create a template for the desired configuration. Then I can have the system scan the entire network and automatically identify and add each different type of equipment to the system.

What areas of this product have room for improvement?
- The auto-discovery function could be improved to include more hands-off automation. The current system is great for experts, but it could be improved so that a novice could use it as well.

Did you encounter any issues with deployment, stability or scalability?
- In the early versions, there were some scaling issues, but there have been several large improvements in that area, and in general the system is much more scalable than most systems, such as SolarWinds.

Did you previously use a different solution and if so, why did you switch?
- I have used many different systems over the years. As time passed, each system was replaced by a competing system. Each new system was better than the ones before it, with improvements in ease of use, scalability, depth of function, and flexibility as I progressed from one system to another.

Before choosing this product, did you evaluate other options? If so, which ones?
- HP OpenView, Ipswitch WhatsUp, Big Brother, Nagios (was NetSaint), MRTG, RRD, Cacti, Zenoss, GLPI, SolarWinds, NetMRI, LiveAction... and I'm sure there have been others that I left out, as well as many home-grown systems.

How would you rate the level of customer service and technical support?
- I have never used the official technical support channel for Zabbix, however I have engaged the community by using the support forums. And in the forums I was able to get help directly from one of the Zabbix developers when I found specific issues I needed help with.

Was the initial setup straightforward or complex? In what ways?
- There was a steep learning curve. I have found nearly all systems to have steep curves. The easiest systems were the expensive commercial systems, although even those had some difficulties when you wanted to do something non-standard. Zabbix was not the worst system, and was far from the easiest. However, the need to learn something complex is rewarded by the capabilities gained. I'm an expert at implementing monitoring systems, but someone with fewer years of experience will probably find it even more challenging, and may feel the need for training, which is available.

Did you implement through a vendor team or an in-house one? If through a vendor team, how would you rate their level of expertise?
- I am an army of one!

What is your ROI on this product?
- Because I focused on an unsupported free version, my main investment is time. Because of my experience level, and the automation features I used on day one, I found an immediate ROI half-way through the first day of use. I was able to get done in 4 hours on Zabbix what was going to take many months on the system I had been using before (a combination of Nagios and Cacti).

What was your original setup cost for this product and what is your day-to-day cost of using this product?
- The original set-up cost was an open-source OS deployed in a virtual environment... so about 1/4th the price of one server, and about half a day of labor.

What advice would you give to others looking into implementing this product?
- This is a system designed for professionals, and is most advantageous when used by someone with some training or a lot of experience. A novice can learn to use the system, but be prepared to work hard to learn a fairly complex system.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user4329, Senior Manager of Engineering with 501-1,000 employees
Real User

I purchased a copy of that book myself. I can't say that I read very much of it, but I keep it around for others, and I like the idea of supporting the author of my favorite management system.

As for the SNMP traps, when I refer to the Zabbix documentation, I incorrectly lump the official and community documentation together. I'm a googler, so they tend to transparently intermingle under my fingertips. But yes, the SNMP traps are documented on the community wiki with four different recipes,

it_user3579 - PeerSpot reviewer
Consultant at a consultancy with 51-200 employees
Consultant
Nagios vs Zabbix

Everyone is familiar with the product Nagios, which is often considered the de-facto standard for monitoring. The other tools in that general category are OpenNMS, Zenoss, GroundWork, Hyperic and others. I am only talking here about tools that would qualify in the NMS category: something that really tracks different systems and devices across the entire infrastructure.

A couple of years ago, I was so tired of Nagios that I was ready to try something new. A couple of tools didn’t make the list, simply because of the “freemium” model. The basics are there, but anything more typically carries a hefty price tag.

I decided to try Zabbix and I have pretty much been a fan ever since. One caveat here is that I am talking about version 1.8.x. Version 2.0 just came out and offers a few notable improvements, which I haven’t tried out yet. A couple of things that look very promising are: direct JMX support, multi-homed hosts, and mounted filesystem discovery. Full list of changes is here

As an overview, Zabbix offers the following benefits:

  • Relatively quick & simple install on a variety of platforms
  • Agent-based, but with agentless options available
  • A fairly vibrant community
  • A large number of templates covering most popular software
  • Integrated graphs
  • Escalation management

More specifically:

Graphs

There are a lot of graphic front ends for Nagios. In general, they are bolt-ons of varying quality. On the other hand, graphs are probably one of the stronger features of Zabbix. Typically, templates will have a few graphs predefined, but more can be added fairly easily. Any item that’s being collected can also be graphed on-demand. The one small drawback is the inability to save pics on the fly, which is sometimes useful for distribution. A workaround for that is described in this thread.

Graphing performance is decent if not spectacular. That will largely depend on data volume, your hardware and range of time. What I found especially valuable is something Zabbix refers to as “screens”. Generally, the entire point of graphing or visualizing something is to be able to easily identify trends and correlations. “Screens” allow you to group disparate items together. For example, if you wanted to see the correlation between your requests per second, queries per second, response time, network traffic and read/write percentage, it’s fairly trivial to put it together. Besides that, I’ve tended to use screens almost as targeted dashboards. Something like putting all the MySQL relevant information on the same screen (disk IO, queries per second, replication lag, cpu/mem, cache hits, etc) can let you know the health of your MySQL infrastructure almost immediately. Same can be done on the web side and other areas.

Performance

Performance will vary quite a bit. I’ve run Zabbix on a large instance at EC2, backed by a 4-volume EBS RAID set, and was able to receive 600-800 values/second without much of a problem. However, with that setup, the screens (particularly the ones with a lot of metrics) would load in 2-5 seconds and the lag was noticeable. One key tweak that is absolutely necessary is the polling frequency. Most of the default (and 3rd party) templates will have the polling frequency too high. You generally don’t need to poll for free space every 5 seconds and there are plenty of examples like this. The data retention period also needs to be adjusted in a lot of cases. Bringing those settings down to something more reasonable is going to give a significant performance boost. It will behave better because you’ll reduce the volume of incoming values, but it will also reduce the amount of data you store and query against in the database. You likely don’t need precise-to-the-second numbers for every metric you collect going back a year. Historical data is still available, though in a somewhat less detailed form, which is generally sufficient for trend information. If the data volume gets too large, the clean up process might start failing. I’ve noticed that around 150GB of data it would start having trouble. At that point there aren’t very many good options and they tend to be quite hairy. It’s best to avoid getting into the situation in the first place.
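
To see why polling frequency and retention matter so much, a back-of-the-envelope estimate helps. The sketch below multiplies values per second by the retention period and an assumed storage cost per value. The figure of 90 bytes per stored value is an assumption for illustration (the real cost depends on the database and data type), as are the item counts and intervals.

```python
SECONDS_PER_DAY = 24 * 3600


def history_bytes(items: int, interval_seconds: float, retention_days: float,
                  bytes_per_value: float = 90.0) -> float:
    """Rough size of detailed history kept for `retention_days`.

    bytes_per_value is an assumed average, not a measured figure.
    """
    values_per_second = items / interval_seconds
    return values_per_second * SECONDS_PER_DAY * retention_days * bytes_per_value


# Same 10,000 items, two hypothetical configurations.
aggressive = history_bytes(items=10_000, interval_seconds=5, retention_days=365)
relaxed = history_bytes(items=10_000, interval_seconds=60, retention_days=30)

for label, size in (("5 s polling, 365 days", aggressive),
                    ("60 s polling, 30 days", relaxed)):
    print(f"{label}: {size / 1e9:,.1f} GB")
```

Under these assumptions the relaxed configuration stores roughly 146 times less detailed history. The estimate covers only detailed history, not the coarser long-term data mentioned above.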

There are also a couple of options for distributed monitoring, if the performance requirements exceed the capability of a single node. There is a lot of documentation about it on their site, but it generally boils down to a choice between a proxy and a node. I tend to prefer a proxy because of easier setup and maintenance. As a more specific example, I’d use proxies in an AWS environment that is spread across different regions. Another good use case in AWS is a mix of a VPC and regular EC2, where you’d place your proxy in the VPC. This method can allow for significant scaling capabilities, though you would still need a very capable central master. The one significant benefit of the node approach is that nodes can be queried independently and support a hierarchical setup. So, despite my general preference for proxies, in an environment with thousands of devices that support different applications, nodes are likely a better approach.

Monitoring

It’s a fairly standard feature set that is generally similar across other NMS systems. A couple of things worth noting:

  • Web Monitoring – it has built-in web transaction monitoring. It’s decent if not spectacular and doesn’t really compare against the sophisticated transaction monitoring systems that are out there. It does support multiple steps and it’s based on curl, though it doesn’t expose all of curl’s functionality. That will present a problem if you need to do extensive cookie manipulation and/or variables. It’s also useless for heavily AJAXed pages and the ones that use Flash. Still, it’s decent for basic monitoring and more than most other systems offer.
  • IPMI support is worth noting, but I’ve personally never used it.
  • Log Monitoring – this isn’t going to work well for high traffic web logs, but it does a pretty solid job at picking up exceptions and errors in various files. It does support a full regex engine for pattern matching. I’ve had it monitoring files that received ~500 lines per second and it had no issues with that.
  • Templates – this is the core approach to monitoring in Zabbix. All your monitoring definitions are ideally grouped in templates. When a new server/instance shows up, you simply apply the template to it or add it to a group to which this template is assigned. There are a few templates of varying quality that come out of the box, and there are a lot of user-generated templates for a variety of applications. A lot of them will have a script (PHP/Perl/Python) that polls the application and sends the data back. Typically you’ll have to make a few tweaks that are specific to your environment. Some of the ones that I found useful and better than others are:
      • This is the “default” MySQL template for Zabbix and it’s based on a PHP script. The description says it wasn’t tested on 5.1, but I didn’t seem to notice any issues. There is a range of values that have to be tuned in order to avoid false alerts.
      • If you’re used to the Cacti templates for MySQL and the data those provide, this is a port to Zabbix. If I remember correctly, this template required a few tweaks to the PHP script in order to get it working.
      • This is another decent template for MySQL, but you don’t get InnoDB information out of the box. It is good for monitoring multiple MySQL instances on the same box though. The other templates would require modifications in their polling scripts.
      • For HAProxy, I’ve used this template. It’s better than others, since it allows you to look at and compare statistics of individual servers behind HAProxy. The downside is that it won’t automatically discover changes. That can be scripted, but it might get a little hairy.
      • For Nginx, this is more than sufficient for most needs.
      • Another one that is useful for Nginx, though the site is in Russian. Google Translate does a pretty good job there. There are a few other templates on that site, but I’ve never tried them.
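
The polling scripts that ship with these templates generally follow the same pattern: fetch some status output from the application, pull out the numbers, and hand them to Zabbix. The sketch below illustrates that pattern for Nginx's stub_status output. It is not taken from any of the templates above; the host name `web01` and the `nginx.*` item keys are hypothetical, and the sample text stands in for a real HTTP fetch.

```python
# Sketch of a template-style polling script: parse application stats and
# format them for zabbix_sender. Host name and item keys are hypothetical.
import re

SAMPLE_STUB_STATUS = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Extract counters from the output of Nginx's stub_status page."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    totals = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text)
    if not (active and totals and states):
        raise ValueError("unrecognised stub_status output")
    return {
        "nginx.active": int(active.group(1)),
        "nginx.accepts": int(totals.group(1)),
        "nginx.handled": int(totals.group(2)),
        "nginx.requests": int(totals.group(3)),
        "nginx.reading": int(states.group(1)),
        "nginx.writing": int(states.group(2)),
        "nginx.waiting": int(states.group(3)),
    }


def to_sender_lines(host, metrics):
    """One '<host> <key> <value>' line per metric, as zabbix_sender reads from an input file."""
    return [f"{host} {key} {value}" for key, value in sorted(metrics.items())]


if __name__ == "__main__":
    for line in to_sender_lines("web01", parse_stub_status(SAMPLE_STUB_STATUS)):
        print(line)
```

Lines in this form can be written to a file and passed to zabbix_sender with its input-file option (`-i`), assuming matching trapper items exist on the host. The environment-specific tweaks mentioned above usually end up in the parsing step and the key names.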

Misc

  • It does have an API for automation. I think it was improved in 2.0, but in 1.8 it was already solid. There is a decent CLI tool written in Ruby, called zabcon, that will interface with the API.
  • There isn’t a great way to control alert floods. You can control trigger dependencies, but if something really goes haywire you might be manually clearing SQL tables afterwards.
  • Alert escalations are a little wonky, but they work reasonably well.
  • It is pretty trivial to port existing Nagios plugins or other scripts into Zabbix.
  • JMX monitoring was done via zapcat. It wasn’t great, but for lack of better options this was the only thing to work with. Version 2.0 does it natively and, if they did it right, that’s probably one of the biggest improvements.
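
As a rough illustration of what talking to the automation API involves, the sketch below builds a JSON-RPC 2.0 request body for a `host.get` call. Nothing is sent over the network; the token and group id are placeholders, and passing the session token in an `auth` field is an assumption that matches older API versions (newer releases may expect it in an HTTP header instead).

```python
# Sketch of building a Zabbix API call (JSON-RPC 2.0). Nothing is sent here;
# the token and group id are placeholders.
import json
from itertools import count

_request_ids = count(1)


def build_request(method, params, auth_token=None):
    """Return the JSON-RPC 2.0 envelope for one API call as a dict."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_request_ids),
    }
    if auth_token is not None:
        # Assumption: the session token travels in the request body, as in
        # older API versions; newer releases may expect an HTTP header instead.
        payload["auth"] = auth_token
    return payload


# Hypothetical call: list the hosts in one host group.
request = build_request(
    "host.get",
    {"output": ["hostid", "host"], "groupids": ["2"]},
    auth_token="PLACEHOLDER_TOKEN",
)
print(json.dumps(request, indent=2))
```

A body like this would be POSTed as JSON to the frontend's API endpoint. Keeping the envelope construction separate from the transport makes it easy to reuse for other methods when scripting host or template changes.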

In summary, from what I’ve seen, Zabbix is easily one of the top NMS systems out there, though it’s probably somewhat less popular than others. If you’re fed up with Nagios or doing a brand new deployment, taking a serious look at Zabbix will be worth your while.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user4329, Senior Manager of Engineering with 501-1,000 employees
Real User

The old-school systems produced graphs every time data was gathered. This resulted in a fast user experience displaying graphs, but it caused the number of values per second to be limited by the number of graphs per second you can produce.

Zabbix dynamically creates the graphs on demand. This reduces the number of times it must produce a graph, pushing up the number of values per second you can capture. But as the reviewer noted above, screens and individual graphs can display slowly if they contain too many data points.

I agree with the reviewer that many or most of the default templates poll far too frequently. In fact, the rates are so high that they have an impact on the machine you are polling if you are pulling very many values. Sometimes I think that the people who create the templates only have one machine they are monitoring, and they set the poll frequency high just to have graphs appear more quickly when setting up a new Zabbix server. Nothing is more boring than spending a couple of hours setting up a monitoring system, only to have a bunch of graphs with single dots on them because your polling cycle for disk space is every 15 minutes. But regardless of the reason for it, I think it is irresponsible to release templates with inappropriate polling cycles.

But back to the graphs, if you have too much data, an otherwise simple graph will take a long time to display. On a screen this gets worse because you are displaying multiple graphs. So to get the best screen display performance, reduce the polling frequency to the lowest value that still produces good graphs.

I have been known to create two items for the same metric, with different polling cycles: a long polling cycle for graphs that appear on screens and publicly viewable pages, and a faster polling cycle for detailed data collection to be used in debugging.
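
To see why the two-item approach helps, it is enough to count how many samples a graph has to plot for each polling cycle. The sketch below does that; the item keys and intervals are hypothetical, and it ignores any aggregation the frontend may apply to older data.

```python
# Sketch: how many data points a graph must plot for a given polling interval.
# Item keys and intervals are hypothetical.

def points_in_range(interval_seconds, range_seconds):
    """Number of samples collected over the displayed time range."""
    return range_seconds // interval_seconds


DAY = 24 * 3600

# Two items tracking the same metric at different polling cycles.
items = {
    "disk.free.graph": 900,  # every 15 minutes, used on screens
    "disk.free.debug": 30,   # every 30 seconds, used when troubleshooting
}

for key, interval in items.items():
    print(f"{key}: {points_in_range(interval, 7 * DAY)} points in a 7-day graph")
```

The slow item keeps screens light, while the fast one is only pulled up when someone is actually debugging.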

I've used nearly all of the network monitoring systems in the 30+ years I have been monitoring networks. Zabbix is my favorite for most applications. I do use more advanced commercial systems such as NetMRI, as the commercial systems can do things like discover all of your systems, and self configure. Commercial systems like NetMRI also do deep inspection, such as VOIP quality analysis, that Zabbix simply isn't designed to do.

I can do anything with Zabbix, anything that I have time to configure. But to be fair, systems like NetMRI can be configured for very large environments in 5 or 10 minutes, out of the box. But when I want to do something special, that I create code for myself, I don't use systems like NetMRI, I use Zabbix. Zabbix is my favorite general purpose network monitoring system. And to be fair, Zabbix is a commercial system too, when you need it to be.

Tools like NetMRI have a lot more power to self-configure, but that power is not free... The NetMRI quote for the hospital I worked for was $300,000!! The commercial version of Zabbix was much lower. And with some careful work with discovery templates, you could still get some self-configuration out of Zabbix.

SolarWinds is another commercial tool in the same space as NetMRI. SolarWinds is nice, but its performance is impacted by the fact that it runs on Windows, so it takes more hardware to monitor large enterprises. It is comfortable for the Windows geeks, though. I'm not a Windows geek...;)

George

Buyer's Guide
Download our free Zabbix Report and get advice and tips from experienced pros sharing their opinions.
Updated: April 2024