
Enigma NMS Overview

What is Enigma NMS?

Enigma NMS is the cost-effective, comprehensive, scalable, and maintenance-free solution for all your network needs.

Deploying Enigma NMS improves the quality and scope of network management and monitoring service delivery while significantly reducing cost.

Pricing Advice

What users are saying about Enigma NMS pricing:
  • "I don’t know what the pricing is currently, but there are different price levels. We got an unlimited license, and considering what we get for the amount we paid, it’s a good deal. The other licenses are limited on number of devices (not monitors/metrics)."

Enigma NMS Reviews

Matt Davis
Monitoring Systems Technician
Notifications, custom OIDs, good graphs are key for us, but needs more flexible alerting

Pros and Cons

  • "Enigma still does not have all the features we want, but it has enough of the most desired features: scalability, notifications, custom OIDs, good graphs, small polling interval, long term data retention, single poller."
  • "Enigma has a lot of specific detail coded into it for discovering and monitoring Cisco devices. If you are a Cisco shop this might be a major consideration."
  • "So far, as mentioned earlier, we have been able to provide graphs on at least 80,000 metrics spread across more than 10,000 devices. Enigma automatically begins monitoring interfaces that are up and active. There are a few different factors involved. This is not really configurable, but it works well enough by itself that it requires little to no maintenance."
  • "We would like to see more flexibility with alerting. Since our adoption of Enigma, it has improved greatly in this area, but there are still alerting threshold configurability limitations that we would like to see improvements on."
  • "I would like to see dynamic grouping. For example, in our case, we have Subscriber Units that are connected to APs. These are constantly being re-homed, or pointed to a different AP on the same tower or a different tower altogether. There is an SNMP value corresponding to the AP name, and one for the AP MAC address. I would like Enigma to be able to form groups based on the AP name (or the MAC address, either one), which are dynamically changed."
  • "I would also like to see Enigma move away from CentOS 6.5 and onto a more current platform, as well as away from MyISAM tables to InnoDB. No improvements are being made to MyISAM, and this has been the case for several years, from what I understand. So, this code is going to become more and more outdated by that virtue alone. The developers say that MyISAM tables are faster. I did some reading up on it, and it seems that at one point, when InnoDB was new, it was slower than MyISAM, but InnoDB has made major improvements since then."

What is our primary use case?

We are a wireless internet service provider (WISP). We use Enigma to monitor an average of about 10 metrics on each of 27,000+ devices (currently), polling every two minutes. The devices are customer-premises equipment (CPE).

I should mention what we are not using Enigma for, and that is server monitoring. We don’t monitor any VMs, Windows, or Linux systems (besides Enigma itself). So I can’t really speak to aspects in that realm that don’t cross over into the network device realm. Having said that, Enigma has a lot of specific detail coded into it for discovering and monitoring Cisco devices. If you are a Cisco shop this might be a major consideration.

How has it helped my organization?

Before Enigma, we did not have a monitoring system capable of holding data from all of the devices we wanted to monitor. If customers called complaining of an intermittent problem, our techs would have to tell them "We will put you on monitor and look at the data collected after a week or so." Now, we can simply look at the past month of data that has been collected for any customer that calls in with a problem. Each device is polled every 2 minutes. This is a huge improvement and very valuable to our support teams.

What is most valuable?

It was difficult to find a product with all the features we want and that would not cost a fortune. I have to say that the decision to purchase Enigma came not because of a particular set of features, but because the developers are very fast to develop features if they agree that the feature is desirable. 

Enigma still does not have all the features we want, but it has enough of the most desired features: 

  • scalability
  • notifications
  • custom OIDs
  • good graphs
  • small polling interval
  • long term data retention
  • single poller
  • REST API

What needs improvement?

We would like to see more flexibility with alerting. Since our adoption of Enigma, it has improved greatly in this area, but there are still alerting threshold configurability limitations that we would like to see improvements on.

I would also like to see dynamic grouping. For example, in our case, we have Subscriber Units that are connected to APs. These are constantly being re-homed, or pointed to a different AP on the same tower or a different tower altogether. There is an SNMP value corresponding to the AP name, and one for the AP MAC address. I would like Enigma to be able to form groups based on the AP name (or the MAC address, either one), which are dynamically changed. That said, we have been able to code some things to automate grouping to a degree, using the REST API, which is a growing feature in Enigma as well as many other software products in general.
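The kind of grouping automation described above can be sketched roughly as follows. This is a minimal illustration, not the reviewer's code and not Enigma's API: it assumes the AP name for each Subscriber Unit has already been polled into a plain dictionary, and the function names (`group_by_ap`, `membership_changes`) and device names are hypothetical. The calls that would push the resulting changes to the NMS are deliberately left out.

```python
from collections import defaultdict


def group_by_ap(ap_name_by_device: dict[str, str]) -> dict[str, set[str]]:
    """Build a mapping of AP name -> set of Subscriber Units from polled values."""
    groups: dict[str, set[str]] = defaultdict(set)
    for device, ap_name in ap_name_by_device.items():
        groups[ap_name].add(device)
    return dict(groups)


def membership_changes(
    current: dict[str, set[str]], desired: dict[str, set[str]]
) -> tuple[list[tuple[str, str]], list[tuple[str, str]]]:
    """Return (to_add, to_remove) as sorted lists of (group, device) pairs."""
    to_add = sorted(
        (group, dev)
        for group, devs in desired.items()
        for dev in devs - current.get(group, set())
    )
    to_remove = sorted(
        (group, dev)
        for group, devs in current.items()
        for dev in devs - desired.get(group, set())
    )
    return to_add, to_remove


# Hypothetical data: su-002 has been re-homed from AP-South-2 to AP-North-1.
polled = {"su-001": "AP-North-1", "su-002": "AP-North-1", "su-003": "AP-South-2"}
current_groups = {"AP-North-1": {"su-001"}, "AP-South-2": {"su-002", "su-003"}}

to_add, to_remove = membership_changes(current_groups, group_by_ap(polled))
print(to_add)     # [('AP-North-1', 'su-002')]
print(to_remove)  # [('AP-South-2', 'su-002')]
```

Computing a difference rather than rebuilding every group on each run keeps the number of API calls proportional to the number of re-homed units, which matters when only a few units move between polls.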

I would also like to see Enigma move away from CentOS 6.5 and onto a more current platform, as well as away from MyISAM tables to InnoDB. No improvements are being made to MyISAM, and this has been the case for several years, from what I understand. So, this code is going to become more and more outdated by that virtue alone. The developers say that MyISAM tables are faster. I did some reading up on it, and it seems that at one point, when InnoDB was new, it was slower than MyISAM, but InnoDB has made major improvements since then. If you are not looking for large-scale monitoring, then this might not be an issue at all.

Over the years, it seems Enigma's web interface has slowed down for us. We have more than tripled the number of monitors since we started using it. I am wondering if the interface is falling behind in relation to current web browser code. I honestly have not looked into why it is slow, but in some places it seems it should not be slow, because it's not loading a lot of data, and if the data is indexed, it should return queries for, say, one node record pretty fast, I would think. But I am not an expert on this subject.

Perhaps a solution for the slowness would be more granularity in the ability to only monitor certain things at a global level. I realize the difficulty of implementing this, but right now, Enigma does not seem to offer a way to slim down en masse. For example, for each port there is packets per second monitoring. This really is not a very useful metric for us. If I could turn off pps system-wide, that would probably free up a lot of resources.

For how long have I used the solution?

About 6 years

What do I think about the stability of the solution?

I did encounter issues with stability, but I believe it was a limitation of disk speed. In a monitoring system which is potentially performing multiple thousands of write and read operations every second, it will very quickly bottleneck in the disk I/O system. You need a fast disk system. We caused a hard crash that was unrecoverable because of this. In another incident, we lost a large amount of data. In this process, we decided to limit the data retention to 30 days, so the table sizes are limited, and read-write operations would suffer from less latency. The bigger your tables are, the longer it takes to seek through them to find the correct read or write position. This is an exponential factor, to my understanding.
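As a rough illustration of the load described above, the figures quoted in this review (27,000 devices, about 10 metrics each, a two-minute polling interval, 30 days of retention) can be turned into a back-of-the-envelope estimate. The sketch assumes every sample is stored as one row and ignores indexes, reads, and any roll-up, so it is an order-of-magnitude guide only and says nothing about how Enigma actually lays out its tables.

```python
# Back-of-the-envelope polling load, using the figures quoted in the review.
DEVICES = 27_000          # "27000+ devices"
METRICS_PER_DEVICE = 10   # "an average of about 10 metrics"
POLL_INTERVAL_S = 120     # "polling every two minutes"
RETENTION_DAYS = 30       # retention limit chosen after the incidents


def samples_per_second(devices: int, metrics: int, interval_s: int) -> float:
    """Average number of new samples produced per second."""
    return devices * metrics / interval_s


def retained_rows(devices: int, metrics: int, interval_s: int, days: int) -> int:
    """Rows held if each sample is kept as one row for the whole retention window.

    One row per sample is an assumption made for this estimate.
    """
    polls_per_metric = days * 24 * 3600 // interval_s
    return devices * metrics * polls_per_metric


if __name__ == "__main__":
    rate = samples_per_second(DEVICES, METRICS_PER_DEVICE, POLL_INTERVAL_S)
    rows = retained_rows(DEVICES, METRICS_PER_DEVICE, POLL_INTERVAL_S, RETENTION_DAYS)
    # Prints: 2,250 samples/s, 5,832,000,000 rows over 30 days
    print(f"{rate:,.0f} samples/s, {rows:,} rows over {RETENTION_DAYS} days")
```

Under these assumptions the system sustains a couple of thousand writes per second before any reads are counted, which is consistent with the advice to put the database on fast storage.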

What do I think about the scalability of the solution?

So far, as mentioned earlier, we have been able to provide graphs on at least 292,000 operational ports spread across more than 27,000 devices. Each port has IN traffic, OUT traffic, ping packet drops, and round-trip latency. Enigma automatically begins monitoring interfaces that are up and active. There are a few different factors involved. This is not really very configurable (you can enable them manually, but disabling them once they are enabled was still kind of sketchy last time I checked), but it works well enough by itself that it requires little to no maintenance.

Environment monitors are handled by different code than the interface monitors (in CentOS, you can see all the different scripts that are run through a Linux cron job), which means that they are logically separated to a certain extent, but are pretty well integrated into the GUI. 

For example, I gather that NETSAS (Enigma's developer) has a difficult time relating database objects in the direction of Environment Monitor-TO-Device, which is, I believe, why we still don’t have the dynamic grouping feature I mentioned earlier. But when viewing an individual device, you can easily view all the Environment Monitors related to it.

If you are going to have a lot of devices being added and removed to/from a monitoring system daily, you need one with an API so that you don't have to manually do all of it. I did it manually for a while and then later handed it off to our IP provisioning team, but now Enigma has a REST API, and it works fairly well and has decent instructions. One difficulty I had while working with it was that both the node name and the node IP must be unique in the entire database. This makes it difficult sometimes to correct the name of one node if its IP has been reassigned to another device. I had to write a lot of code to work around this, and in the process I ended up switching from a real-time one-off solution to a scheduled audit solution.
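A scheduled audit of the kind mentioned above might look roughly like the sketch below. It is an illustration under stated assumptions, not the reviewer's code: the provisioning system is treated as the source of truth, both inventories are plain name-to-IP dictionaries, and `audit_plan` is a hypothetical helper. The actual REST calls to Enigma are omitted because their details are not given here. Applying all removals before any additions is one simple way to respect the rule that both the node name and the node IP must be unique.

```python
def audit_plan(
    current: dict[str, str], desired: dict[str, str]
) -> tuple[list[tuple[str, str]], list[tuple[str, str]]]:
    """Return (removals, additions) as sorted lists of (name, ip) pairs.

    Both arguments map node name -> IP address. Names are unique by
    construction (they are dictionary keys); IPs are checked explicitly.
    Applying every removal before any addition means no step ever creates
    a duplicate name or a duplicate IP.
    """
    if len(set(desired.values())) != len(desired):
        raise ValueError("desired inventory reuses an IP address")
    current_pairs = set(current.items())
    desired_pairs = set(desired.items())
    removals = sorted(current_pairs - desired_pairs)
    additions = sorted(desired_pairs - current_pairs)
    return removals, additions


# Hypothetical data: 10.0.0.5 was reassigned from cust-100 to cust-300,
# and cust-100 moved to 10.0.0.7. cust-200 is unchanged.
nms_nodes = {"cust-100": "10.0.0.5", "cust-200": "10.0.0.6"}
provisioned = {"cust-100": "10.0.0.7", "cust-200": "10.0.0.6", "cust-300": "10.0.0.5"}

removals, additions = audit_plan(nms_nodes, provisioned)
print(removals)   # [('cust-100', '10.0.0.5')]
print(additions)  # [('cust-100', '10.0.0.7'), ('cust-300', '10.0.0.5')]
```

The trade-off in this sketch is that a changed node is removed and re-added rather than edited in place, which may discard its collected history. A real tool would probably try an in-place rename or re-address first, wherever that does not collide with an existing node.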

How are customer service and support?

The level of tech support is top-notch. They are in Australia, so we have to wait until afternoon for a response, but they take care of us, and most problems are resolved within the same day.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We previously used WhatsUp Gold (WUG) for monitoring CPE, but now all CPE monitoring is done by Enigma.

We used to use WUG, because it has very flexible alerting configurability which is needed for our access points on towers, but it just cannot scale. It was not able to handle more than several hundred nodes before its performance would suffer significantly. And it was even thinning out the data tables (rolling up data) starting at an age of 12 hours (by default). WUG may have improved since the decision was made to move away from them, but we felt that the Linux/MySQL platform would be better able to handle the larger-scale demands without costing us a lot in licensing. 

For example, with WUG being Microsoft SQL only, one must spend quite a bit of money to get the SQL version that will use more than 64GB of RAM. Currently, we have Enigma on a VM in vSphere with 100GB RAM and 12 CPU cores (Cisco UCS Mini with B200 blades, and NetApp all-flash storage). To increase the processing capacity, all we have to do is give it more resources (RAM, CPU, storage).

How was the initial setup?

It’s hard to generalize and be objective in this assessment because I was new to network monitoring, and new to ISP operations, and new to SNMP, and new to Linux. Just about everything was new to me. Someone with better background knowledge would probably have been much faster than I was. 

If you are a company that is on a private network behind a firewall, I would say just let Enigma go to town auto-discovering the entire subnet. Looking back, I would recommend deciding on IP numbering conventions, and device and interface naming conventions, and implement those conventions before doing any discovery, but we didn’t do that. The benefit didn’t outweigh the cost. I would say if you decide beforehand on naming, and what you want to monitor, and what thresholds you want to alert on, and who needs to be notified about what, it would help significantly.

The help I received from NETSAS in initial setup was plenty sufficient. If I ran into any issues, it was usually fixed by the end of the day. 

What about the implementation team?

In-house.

What was our ROI?

We currently don't have the capacity to calculate ROI on something like this. The ROI is more of a customer satisfaction level or quality of service that we wanted to achieve, not necessarily related to quantity of time spent. 

What's my experience with pricing, setup cost, and licensing?

I don’t know what the pricing is currently, but there are different price levels. We got an unlimited license, and considering what we get for the amount we paid, it’s a good deal. The other licenses are limited on number of devices (not monitors/metrics).

Which other solutions did I evaluate?

Nagios XI, Statseeker, WhatsUp Gold, PRTG, InterMapper, Cacti.

Some of these fell out of the running pretty quickly just on the basis of the feature list or the pricing model (i.e., licensing per interface, or worse, per metric), so I didn’t necessarily play with all of them. A close second was Statseeker, which was fast, but didn’t allow custom OIDs at the time, which was a big deal-breaker for us.

What other advice do I have?

It is always improving. Take a look at their release notes and you will see the pace. 

I like the philosophy of the developers, which is to listen to customer feedback and develop whichever features they think are desired most. Since they are a small company, they can do this with quite an impressive turnaround time. There have been multiple features that we have requested, and received immediate feedback on, in the form of a feature addition in the next release. This is beneficial but has drawbacks as well. Sometimes, the new code has not been tested thoroughly enough and thus does not work as expected right away, but these are quickly resolved if you pipe up about them.

Enigma has lots of features out of the box. You don’t have to be super technical to get it going, though every bit of general understanding you can get about Linux, monitoring, and databases will help. Again, as with any monitoring system, if you are going to be polling more than a few thousand metrics, make sure you have a disk system that can handle the load (all-flash would be best). I hope this info helps you make an objective decision.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.