Everbridge IT Alerting Review

Gets the right parties to the table at the right time - our mean time to restore has diminished, saving us money


What is our primary use case?

The primary use is to notify and engage IT team members when an outage is underway. We do use it for proactive notifications, but our primary use is to communicate with support-group team members when we need to get their attention and fix a problem.

Some of the features we were looking to achieve included proactive notifications, for a situation where we might have a database server that has 50 databases on it. That means if I shut down that database server for patching, all 50 of those databases are offline, and 50 or more applications are also going to be taken offline. We use this Everbridge IT Alerting tool as a "polling" product to reach out to the stakeholders of those 50 databases and give them five options of day of week and time of day, and say, "We have to shut the server down. You get to have a voice in when we do so to minimize the impact on you as a business stakeholder." We've been leveraging the product to do some of those proactive outage notifications and polling capabilities as well.
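To make the polling idea concrete, here is a minimal sketch of how the stakeholder votes could be tallied once the replies come back. This is not the Everbridge API; the function name, the option labels, and the stakeholder names are all made up for illustration, and the tie-break rule (earliest listed option wins) is an assumption.

```python
from collections import Counter


def pick_maintenance_window(options, votes):
    """Return the offered option with the most votes.

    options: list of offered windows, in the order they were presented.
    votes:   dict mapping stakeholder -> chosen window.
    Votes for anything that was not offered are ignored.
    Ties go to the earliest-listed option (an assumed rule).
    """
    tally = Counter(choice for choice in votes.values() if choice in options)
    # max() returns the first maximal element, so list order breaks ties.
    return max(options, key=lambda opt: tally[opt])


# Hypothetical poll: five offered windows, four replies.
options = ["Sat 02:00", "Sat 22:00", "Sun 02:00", "Sun 14:00", "Mon 01:00"]
votes = {
    "app-owner-a": "Sun 02:00",
    "app-owner-b": "Sat 02:00",
    "app-owner-c": "Sun 02:00",
    "app-owner-d": "Tue 09:00",  # not an offered option, ignored
}
chosen = pick_maintenance_window(options, votes)
print(chosen)
```

With these made-up replies, the Sunday 02:00 window wins with two votes.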

We are also striving to integrate it with other parts of our IT operating ecosystem. We already use it to communicate when a monitoring alert triggers one of the reactive notifications, and we are seeking to implement more of a full loop between that event and an incident being opened in the service management system. We're not quite there yet, but we're walking in that direction.

How has it helped my organization?

What we were looking at was: "How do you shorten the time to restoration when a crisis is occurring?" That's really the key benefit of the out-of-the-box Everbridge IT Alerting functionality for us.

In terms of improvements to our organization, we're still on that journey. I've used the terminology with our friends at Everbridge a few times, where I associate this with the traditional "crawl, walk, and run" metaphor. One year ago when we launched, we were barely crawling. Then we started crawling fairly quickly. I would say we're now in the "toddling" stage where we walk, but we don't walk all that well yet. For us, it is a continual improvement journey. 

We are anticipating that over the next 12 to 36 months we're going to go from toddling to walking very upright and then into running.

Organizationally, we have gained some benefits already. Even in the first few months, we recognized or realized some of those benefits that I described above around shortening the time to resolution. 

What we envision getting as an additional organizational benefit is system consolidation. For example, we've got four different systems today that contain some of the data and capabilities that Everbridge can very naturally accommodate. We just haven't moved there yet. Over time, we'll see some reduced cost in infrastructure, reduced cost in application maintenance and complexity, some improved consistency across these procedures as a result of using one system versus many. This should contribute to further reducing the time to restore service. In the end, we get benefits adding up over time, where time to restore gets better and better, and our ability to leverage the platform in multiple ways gets better and better.

What is most valuable?

The engagement component is the most valuable, and what I mean by that is, if I were to send out an alert notification to a half-dozen people when a major IT crisis occurs, what I want to be able to do is remediate the issue as fast as I possibly can. For the sake of the business, I want to minimize downtime. What we were seeing in our prior systems, in our prior procedures and capabilities was that it would take quite a long time to get the right people to the table, making the right decisions to restore service.

One of the key drivers for us, and this is still one of the key benefits for us, is that Everbridge IT Alerting helps to pull those right people in very quickly through a collection of utilities where you can say, "I want to notify more than one person at a time. I want to escalate at my discretion and via rules within the system." It enables you to pull all the people into these bridge calls.

Let's say for example you have somebody in a group who is not online, but they are the on-call primary. The first iteration of a notification might go to them, but I can - depending on the nature of the issue - send a communication to the entire group under the anticipation that the primary on-call might not respond first. 
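The escalation pattern described above can be sketched roughly as follows. This is only an illustration of the logic, not how Everbridge implements it: `escalate`, `notify`, `is_acknowledged`, and the `broadcast_first` flag are hypothetical names, and a real system would wait for a timeout between waves rather than checking acknowledgements immediately.

```python
def escalate(group, primary, is_acknowledged, notify, broadcast_first=False):
    """Notify the on-call primary, then the rest of the group if needed.

    group:            list of group members.
    primary:          the on-call primary (a member of group).
    is_acknowledged:  callable(person) -> bool, stands in for "did they respond?".
    notify:           callable(person) that sends the notification.
    broadcast_first:  if True, skip the primary-only wave and notify everyone.
    Returns the list of people notified, in order.
    """
    if broadcast_first:
        waves = [list(group)]
    else:
        rest = [member for member in group if member != primary]
        waves = [[primary], rest]

    notified = []
    for wave in waves:
        for person in wave:
            notify(person)
            notified.append(person)
        # Stop escalating as soon as someone in this wave has responded.
        if any(is_acknowledged(person) for person in wave):
            break
    return notified


# Hypothetical group where the primary ("ana") is offline and "ben" responds.
group = ["ana", "ben", "chen"]
responders = {"ben"}
sent = []
result = escalate(group, "ana", lambda p: p in responders, sent.append)
print(result)
```

In this example the primary does not respond, so the notification goes on to the remaining group members. Setting `broadcast_first=True` models the case where, depending on the nature of the issue, the whole group is notified at once.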

What needs improvement?

In recent weeks we've been talking to Everbridge about leveraging some new functionality that they're deploying right now around orchestration. Imagine a full, closed-loop event remediation: auto-remediation. A server throws an alert. We catch it in our monitoring tool. We page or send an SMS text using Everbridge IT Alerting. A group member receives that text and responds to the text with "Option One." Option one can say, "I want to go ahead and execute an orchestration that will automatically stop and restart the services on that box or even reboot the box." That would, again, further reduce service restoration time and significantly reduce the manual engagement of logging a ticket, logging onto the box, and restarting the box or the services manually. All of that can be done through automation. We're not there yet, but that's what we're talking about right now, as a part of our next wave of moving along the crawl, walk, run journey.
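As a rough sketch of that closed loop, the flow could look something like the code below. None of this is Everbridge's orchestration API, which we have not deployed yet; `handle_alert`, `send_text`, the action functions, and the host name are hypothetical stand-ins, and the real actions would call out to whatever automation tooling is in place.

```python
def handle_alert(alert, send_text, actions):
    """Text the on-call responder and run the action matching their reply.

    alert:     dict with "host" and "summary" keys (assumed shape).
    send_text: callable(message) -> the responder's reply as a string.
    actions:   dict mapping a reply such as "1" to a callable taking the host.
    Returns a short status string describing what happened.
    """
    menu = ", ".join(f"{key}={fn.__name__}" for key, fn in sorted(actions.items()))
    reply = send_text(f"{alert['host']}: {alert['summary']}. Reply with {menu}")
    action = actions.get(reply.strip())
    if action is None:
        return "no automated action; manual follow-up needed"
    action(alert["host"])
    return f"ran {action.__name__} on {alert['host']}"


# Stand-in remediation actions that only record what they would do.
log = []


def restart_services(host):
    log.append(("restart_services", host))


def reboot(host):
    log.append(("reboot", host))


alert = {"host": "db-01", "summary": "service not responding"}
choices = {"1": restart_services, "2": reboot}
# The lambda simulates a responder replying "1" to the text message.
status = handle_alert(alert, lambda message: "1", choices)
print(status)
```

Any reply that does not match an option falls through to manual handling, which is the behavior we would want while still in the "walk" stage.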

In terms of what could be improved, almost always, there is something that could be improved. I've been in this industry long enough to know that there is no perfect system. All the good ones still offer opportunities for getting better. I think if you were to look from their point of view, they would also see themselves in a crawl, walk, run journey. They may be further along in their walk, but they're probably not in the "Olympic sprint" or "Olympic marathon" stage yet. They've got lots of potential, room for feature enhancements, improvements.

A couple of key ones might include - and I think they are working towards these things - analytics. If I want to do sophisticated reporting and analysis of the data that's being captured in IT Alerting, at the moment, the reporting interface is immature. They're very helpful. They get it. They're listening to us, but it's weak. It's growing. It's getting better. Reporting and analytics would be one space. 

Their integration capabilities are still progressing, but not quite where we'd like to see them yet. They're moving there with that orchestration capability where they're seeing the potential of an API-first mentality. So instead of trying to build custom connections into everything, you open up APIs to allow other systems to talk to IT Alerting and allow IT Alerting to talk to other systems. There is room for improvement, but they get it. They're listening in that space, too.

Sure, there are things they can be doing better, but as one of their customers working in partnership with them, I think we've got their ear, and they're being very proactive about listening.

For how long have I used the solution?

One to three years.

What do I think about the stability of the solution?

We have encountered some issues with stability. The shorter answer is, "Yes, it's stable." 

The longer answer would be, we've had a couple of outages, and we had some very deep discussions with Everbridge on the fact that I can't alert people of an outage in my environment if I'm having an outage in their environment. That's bad, and they know it. They recognize it. They acknowledge that. 

We did have one problem within the first 30 to 60 days of going live where we had a day-and-a-half outage of the platform, and frankly, that's unacceptable. They heard that from us very directly. Since then, they've mitigated that by expanding their architecture and changing the method of their architecture to be more highly available and robust on their side. 

Since then, the stability has been top-drawer. We've had a few minor issues around things like messages not being delivered. Part of it is our expectation that they deliver every message, 100 percent, 24/7, but I also absolutely recognize that we are literally all over the globe. Cargill's footprint is everywhere in the world today. That means we're trying to deliver messages in near real-time, 20,000 miles away under infrastructure circumstances that could be very poor. It might be in a third-world nation. It might be in a place where there is no cellular signal or their cellular partnerships are not as well-built or professionally associated as in some other parts of the world. So sometimes, messages don't get delivered, but I would say that is a very rare challenge for us. Everbridge, along with any other service provider in their space, will have to face those once in a while, and I think they're very good at running interference with those "edge" connection points that are difficult to navigate. Occasionally, we see a message dropped or a message not delivered, but it is rare, and I think they are doing everything they can to handshake with the providers around the world in a way that continues to minimize and, maybe someday, eliminate those one-offs.

What do I think about the scalability of the solution?

I think scalability was part of that architectural review we did about a year ago, when they encountered that outage. One of our challenges to them was, "If you're a cloud-first solution yourself, how do you not build your platform to be highly scalable?" Literally, spin up, spin down, any time you want or any time the demand suggests it.

Initially, the scalability was good but not great. Since then, I think it now borders on great. They've learned some lessons. They've restructured their platform a bit, and it is highly scalable. I've never seen a performance problem.

We have about 155,000 workers around the globe at Cargill, and there are maybe 5,000 who log in with some regularity to the platform to do message queuing or message sending or message response or self-service profile updates; I can log in and change my cell phone number, or specify that I want to use my cell phone as my primary and my work phone as my secondary. That capability has never been met with any comments from our community saying, "It doesn't perform well."

How is customer service and technical support?

Their first-line tech support is good, but I think their method of providing support deserves some very real consideration. What I mean is, when I spend X dollars buying a product, our expectation for support is very high. I want you, as a vendor, to support your product 24/7 and give me appropriate response windows. If it's not urgent, I'm okay with you not being imminent, but if it is urgent, I want you on the phone right away.

They've pushed a Professional Services model where, for us to get this kind of attention or support for either "How do I" questions or "What could we do a little differently?" or those kinds of things, they're suggesting we buy a bucket of Professional Services hours. I've resisted that from day one, and I have not yet given in to that request because my perspective is, I already paid you for that. I bought the licensing and I bought support as a percentage, if you will, of the licensing price. That's what maintenance is for.

To me, Professional Services is more an act of deeper consulting where I might say, "I want to actually go build an integration that's not leveraging your API strategy or methodology, so it's going to need some custom development work," or something like that. I get that. That's a pretty classic Professional Services engagement. But to hear, when I call you and ask a question like, "Well, how do I do this?" an answer like, "This is why you should buy a bucket of Professional Services hours," it feels a little "game-y" to me. I don't really like that. I'm working with Everbridge on that, too. I think that they're still wrestling with what their support model looks like internally and what their Professional Services business strategy is. I think they're trying to work their way through those growing pains themselves, but my gut reaction is, it's not a great start to say, "In order to support you, you have to pay me more."

Their technical skill on the support side is good. Their model is a little bit shaky.

I realized this, sadly, after the sale. I think it's partly because those same growing pains were part of what they were going through as a part of our normal sales cycle discussions. So they never put on the table that to get really top-level support, it will cost you more, until after everything was already deployed. We were probably well into our first quarter of deployment when the suggestion was, "I think you should buy a bucket of hours." It caught me, quite frankly, by surprise because I felt that we should have been talking about that during the sales cycle.

They're going to find us really reluctant to write another check for what we would consider standard practice for product support. We have a very good relationship with Everbridge, so I would not want to send the wrong signals. I think they'll be very open-minded to hearing that kind of feedback. I don't know if they'll back down completely from their business position on Professional Services and support, but it's certainly going to be a conversation I'll continue having with them.

Which solutions did we use previously?

We had an incumbent solution that had been in place for about seven years. The principal reason for switching was that the incumbent was losing momentum in the marketplace for traditional IT communications and engagement, to get people to the table and fix problems. The incumbent was slipping in the market. They were not putting money into R&D. They were not developing their platform at the same pace that some of the natural competitors were.

We did look at them as a part of our solution-selection activity. We absolutely kept the incumbent in the ring and had great conversations with them about what's missing and what they were going to do next. In fact, they were acquired by another company during those solution-selection discussions, and we were very uncertain about whether or not the acquiring company would invest or ingest. Would they swallow this thing up and sort of bury it under the rug, or would they invest in making it be a more competitive product? 

I think, in hindsight - it's been over a year since we made this selection and about a year since we deployed IT Alerting - I'd say that the casual observation would be that the incumbent did not gain any ground. If anything, they may have continued to lose some ground. For us, it was, "You don't have the feature functionality that we really want, and you're not really making progress towards that in your own market space." Whereas Everbridge and a couple of others were providing some good indicators that they were stepping up their game as opposed to backing off their game.

How was the initial setup?

I wasn't actually doing the install; I was leading the program and working very closely with the folks who were administrators of the tool. The feedback I got was that it was actually very intuitive until you'd get a little bit into the weeds. Some of the complications of the environment resulted in a few challenging topics. They weren't showstoppers. We never felt like we couldn't keep the ball rolling.

The setup was a little bit of both: straightforward and complex. The initial response felt very reasonable, very intuitive to the extent it's possible, but it's a sophisticated enough system that there were parts of it where you scratch your head and you say, "Well, where do I go for this? How do I log in and change the administrative configuration of group names?" That sort of thing.

That's where some of our initial Professional Services help came in. We did pay for the implementation Professional Services. That was worthwhile, it was appropriate to do that, and they helped a lot. Wherever we did find some of those points of confusion, those were good learning experiences for us. They were good usability conversations with them.

They continue to develop, and they're very good at taking feedback from their customers and figuring out how, or if, to include that feedback in future releases. And their release cycles have gotten faster. When we first signed up with them, they were probably doing two a year, and now I think they're closer to four a year. And some of what we fed into them is already making its appearance in their code base.

What was our ROI?

One of the things we were attempting to measure when we established the program is time to restore service. One of the things that IT Alerting helps us do is bring an IT service back online faster than we did before. One of the ways it does that is by getting those right parties to the table at the right time. Our mean time to restore, or mean time to repair, has diminished by a couple of percentage points, saving the company upwards of hundreds of thousands of dollars a year. That was one of our key measures going in, and it's been demonstrable so far.

What's my experience with pricing, setup cost, and licensing?

For us, the pricing is a good value. I can't say whether or not their list pricing looks favorable to everyone who's checking, but I can say that the process of sourcing and procurement with them was very professional, comfortable, and friendly. The negotiations were done well on both sides, and in the end, I'd say the price was very effective.

My suggestion would be, do your homework. If you know what the marketplace will support, I think it is fairly traditional. Not every market or every product fits this, but it's pretty normal that list prices are designed to be discounted. Very few, especially on the enterprise scale, are going to pay full sticker price for a software product. So do your homework, know where the discounting can get you, and know what you're willing to pay. Because if you say, "This has a value of X for me as an organization," if you articulate your position well, you have some very real opportunity to get either close to or at what you perceive to be the real value of the product in your negotiations. It's never an easy step but, done well, I think that people will find that Everbridge is a great listener and is willing to meet in the middle.

Which other solutions did I evaluate?

We also looked at TelAlert and xMatters.

We went through a pretty traditional solution-selection activity where we prepared and documented our requirements for the market leaders and included our incumbent, an existing solution that was doing some of what Everbridge does. In the end, one of our key selection criteria was relationship, and Cargill and Everbridge already had an agreement in place for their business continuity product, non-IT, which is used to do things like notify employees when there is a weather event, a security concern, or a risk event in a particular region of the world. We were already using that product, and it was an Everbridge relationship that was already in place. One of our deciding factors was, "How strong is that relationship?"

What other advice do I have?

Scope the project well. What I mean by that is, don't bite off more than you can chew, but don't do less than you need to do. Scoping it well means that you've identified the happy medium of, "I'm going to get great value to start, but I'm going to get more value as we continue to grow into the solution." That's the approach we took. We said, "Hey, if I can get the 80/20 rule applied, where 80 percent of what we're expecting to get out of the gate is achievable in our first deployment, that's pretty solid." If the other 20 percent isn't crucial - figure out how to prioritize what you do need and what you don't need - it's okay to let it go. 

Part of what we saw with our own project was the danger of scope-creep, where we said, "If our first objective is a like-for-like replacement of the incumbent, then be prepared to sacrifice some golden opportunities if those golden opportunities will cost us time and money that we don't have right now."

If we said, "Implementation date is an important milestone and cost of implementing is an important measurement," then I need to measure inside of those scoping guardrails. Don't do more than you can handle, but don't do less than what you need. I think we accomplished that pretty well. I think we sacrificed a couple things that several of our stakeholders would have loved to see out of the gate, but it would have cost us time and money that we weren't really prepared to spend.

I would start out with rating this product at eight out of 10 because there is always room to improve. I'm not sure I'd rate anybody a 10. I've been in this for a long, long time. I don't know that I've ever seen a true knock-your-socks-off 10. But this solution is a solid eight in that they provide the core functionality we were always interested in obtaining, and they are very engaged at the table in discussing how they get better and how their getting better can help us get better.

Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.