Any experience with Event & Incident Analytic engines like Moogsoft?

Looking for any comparative details for Event & Incident Analysis engines, such as Moogsoft's solution.

De-duplication of Event messages and automated isolation to upstream incidents/events seems promising.

Guest

14 Answers


Hi Kevin, My team is set to begin a pilot of Moogsoft's solution within the next couple of weeks, and NOI will be stood up in parallel. With any learning algorithm, it seems time & data are key ingredients. We should have some idea of how these compare in the coming months. Thanks for checking in! --> R

Like (0)22 July 16

Randall - just wondering how your analysis is going?

Like (0)22 July 16

** Altug, Your note is very helpful; thanks very much! The outline of capabilities and requirements is insightful and echoes my personal experience. I can see that, even without product names, you've almost certainly worked with these tools and hit your share of tooling challenges. The products in this space need to meet the bar you describe.

** Omar/Manish/Philippe, CA SOI/TESM & CA UIM are capable in that they will deliver Service Modeling and Event Mgmt, but they are both expensive and labor intensive to implement and support for their core functionality. Moreover, a tool that merely presents or produces events should NOT be considered an Event Mgmt solution or an Event Analysis engine.

** Dan, I haven't taken the time to read up on BigPanda. Agreed on the importance of Altug's point. Care & feeding can get out of hand quickly....

** Philippe, You hit a point which started my question. Netcool Omnibus was an acquired product, originally from Micromuse, whose founders have now created Moogsoft. How to compare NOI and Moog, when they are so similar... Real world implementation experience... better yet, a bake off, side by side implementations...?
I have tested Netuitive, Prelert, CA ABA, Tivoli Predictive Insights (PI), and BMC BPPM for Predictive capabilities, and no vendor product has been able to pass muster. Both Moog & NOI have predictive'ish functions. Moog's is built in as an 'extension' of Incident Analysis, but I fear it may only be predictive'ish. NOI is a collection of Tivoli tools that require a rather large Tivoli Framework to build on for full visibility. PI is one of those add-ons but will only analyze Event data as part of NOI. Unless additional PI metric feeds are licensed, NOI is not advertised as competing as a Predictive tool.

What I want to achieve... Ideally?... Efficiency and focus for my staff, who are manually handling over 1,000 events each in a single shift (trending in source, correlating across time and CI relation, and isolating business data flows to the probable break point). The Holy Grail would be a tool that accurately isolates to the earliest possible Event(s) and a specific Incident, as far upstream as possible for a given issue or impact type, that is the likely break point.
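
To make the "as far upstream as possible" goal concrete, here is a minimal Python sketch. It assumes a hand-built dependency map and the time of the first alert per CI are already available; the CI names, the `depends_on` structure, and both function names are hypothetical and do not reflect how Moogsoft, NOI, or any other product works. It also assumes the dependency map has no cycles.

```python
def upstream_of(ci, depends_on, seen=None):
    """Return every CI that `ci` depends on, directly or transitively."""
    seen = set() if seen is None else seen
    for parent in depends_on.get(ci, ()):
        if parent not in seen:
            seen.add(parent)
            upstream_of(parent, depends_on, seen)
    return seen


def probable_break_points(first_alert_at, depends_on):
    """Alerting CIs that have no alerting CI upstream of them, earliest first."""
    alerting = set(first_alert_at)
    roots = [ci for ci in alerting
             if not (upstream_of(ci, depends_on) & alerting)]
    return sorted(roots, key=lambda ci: first_alert_at[ci])


# Hypothetical service model: each CI maps to the CIs it depends on.
depends_on = {
    "checkout-app": ["order-api"],
    "order-api": ["orders-db"],
    "orders-db": ["san-01"],
    "reports-app": ["orders-db"],
}
# Time (in seconds) of the first alert seen on each alerting CI.
first_alert_at = {
    "checkout-app": 120.0,
    "order-api": 95.0,
    "orders-db": 60.0,
    "reports-app": 130.0,
}

break_points = probable_break_points(first_alert_at, depends_on)
```

With this sample data the only candidate is `orders-db`: every other alerting CI sits downstream of it, and `san-01` never alerted. A real engine would also have to cope with missing or stale relationship data, which is where most of the difficulty lies.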

Like (1)30 May 16

Hi, I have used CA-Unicenter, CA-SOI and now TESM (OpsDirector). People are misguided in thinking that SOI is an event management product. Similarly, it would be wrong to think of Splunk as that too. Unicenter is obsolete and was very onerous in rules. TESM only works with ServiceNow.
I have exposure to CA-UIM, but it is not open enough to be seen as an event management platform. I have an understanding of how Moogsoft (a spin-off of Netcool) goes about its business, but I have never used it. There is also Netuitive, which is worth looking into. What exactly are you looking to achieve?

Like (0)30 May 16

Hi Randall, also have a look at BigPanda (my company). We automate event correlation and have pre-integrations with all leading monitoring tools. BigPanda automatically generates high-level incidents from monitoring events and automatically shares them with external ticketing solutions like ServiceNow and JIRA or collaboration tools like Slack or HipChat. Correlation occurs in the cloud and event collection is typically agentless via secure APIs or webhooks.

Service Health Analytics dashboards provide visibility into key metrics like MTTR, top alerting hosts, and top alerting checks. Most enterprise customers using BigPanda benefit from 99% noise suppression. Configuration takes hours and is code-free. We offer a free trial if you're interested. As Altug mentioned, stay away from solutions that require you to manually maintain rules. Feel free to reply with any questions about BigPanda capabilities or configuration. Hope it's a good fit...
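
For readers comparing vendor figures such as "99% noise suppression" or MTTR, the arithmetic can be sketched as below. The definitions are assumptions for illustration (suppression as the share of raw events that did not become separate incidents, MTTR as the mean of resolved minus opened); vendors may measure these differently, and the function names and sample numbers are hypothetical.

```python
from datetime import datetime, timedelta


def noise_suppression(raw_event_count, incident_count):
    """Fraction of raw events that did not become separate incidents."""
    if raw_event_count == 0:
        return 0.0
    return 1 - incident_count / raw_event_count


def mean_time_to_resolve(incidents):
    """Average of (resolved - opened) over (opened, resolved) datetime pairs."""
    durations = [resolved - opened for opened, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)


# Hypothetical shift: 10,000 raw events rolled up into 100 incidents.
suppression = noise_suppression(10_000, 100)

# Two hypothetical incidents, resolved in 30 and 90 minutes.
mttr = mean_time_to_resolve([
    (datetime(2016, 5, 27, 9, 0), datetime(2016, 5, 27, 9, 30)),
    (datetime(2016, 5, 27, 10, 0), datetime(2016, 5, 27, 11, 30)),
])
```

Under these assumed definitions, a 99% figure means roughly one incident per hundred raw events, so it is worth asking any vendor which event and incident counts sit behind their number.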

Like (0)27 May 16
Omar Sánchez (Mr.Tech), Consultant

The question should be: monitoring or logging?
Here are the basics:

Log != event
Logs can contain many non-event-based data points that are useful now or may become useful in the future.

Engineering your own log collection and analysis system covers the top 0.5% of users who need that technology. Most clients I speak with cannot engineer their own systems, hence they rely on log analysis products which are purchased rather than developed. You are also assuming that users have developers writing the apps which are logging, and that’s very often not the case.

The reason why monitoring and logging are separate in most cases is that the monitoring tools don’t do the type of log analysis people want today; they do the log/event analysis people wanted in 1995.

Like (0)27 May 16
Manish Parikh

Sorry, I don’t have any experience with Moogsoft, but take a look at CA Service Operations Insight (SOI). It will give you the same capability, plus many more features.

Like (0)26 May 16
Dan Hobbs, Real User

I have never looked at Moogsoft. We probably want to wait until UIM 84.1 is released, since it is supposed to add many incident management features.

Like (0)26 May 16

Thanks for sharing, Mike! I've seen BMC's approach as well as CA's, IBM Tivoli's, and, most recently, Moogsoft's.
Event de-dup is indeed a common feature when it comes to the same alert firing repeatedly on a single host. What these other vendors 'promise' is de-dup of the same or similar alert events across multiple hosts within an app's infra, and even across multiple apps with the same or similar tiers. The idea is to group Events if they correlate in time and/or CI relationship.
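
As an illustration only, grouping by time and CI relationship could be sketched in Python as follows. This is not any vendor's algorithm: the `Event` and `Group` classes, the `relations` map, and the 300-second window are all assumptions, and this sketch requires both conditions (close in time and related CI) before it merges events.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    timestamp: float   # seconds since an arbitrary start
    ci: str            # configuration item (host, app tier, ...)
    text: str


@dataclass
class Group:
    events: list = field(default_factory=list)

    @property
    def last_seen(self):
        return self.events[-1].timestamp

    @property
    def cis(self):
        return {e.ci for e in self.events}


def related(ci, group_cis, relations):
    """True if ci is already in the group or directly linked to a CI in it."""
    if ci in group_cis:
        return True
    neighbours = relations.get(ci, set())
    return any(other in neighbours or ci in relations.get(other, set())
               for other in group_cis)


def correlate(events, relations, window=300.0):
    """Group events that are close in time AND on the same or related CIs."""
    groups = []
    for event in sorted(events, key=lambda e: e.timestamp):
        for group in groups:
            close_in_time = event.timestamp - group.last_seen <= window
            if close_in_time and related(event.ci, group.cis, relations):
                group.events.append(event)
                break
        else:
            groups.append(Group([event]))   # nothing matched: start a new group
    return groups


# Hypothetical CI relationships: both web servers talk to db01.
relations = {"web01": {"db01"}, "web02": {"db01"}}
events = [
    Event(0, "db01", "slow queries"),
    Event(30, "web01", "HTTP 500"),
    Event(45, "web02", "HTTP 500"),
    Event(60, "mail01", "queue backlog"),
    Event(1000, "web01", "HTTP 500"),
]
groups = correlate(events, relations)
```

Here the first three events land in one group, the unrelated `mail01` event stands alone, and the late `web01` event starts a new group because it falls outside the time window.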
The Incident Analysis functions promised are much as you describe but with a twist, and I couldn't agree more with the challenges you describe. This approach is taking only Event messages (from any/all tool sources) & actual Incident Record details (Ex: ServiceNow) and comparing to Business rules, Service Models, and Knowledge on past occurrences to find a current ticket as far upstream as possible. I've seen many vendors with Triage/Isolation functions which are valuable, but they usually drill down into Host/App/Code/etc. This approach seems promising and worth testing.

** MemberSH/SaleMan, Nothing personal, but I am discounting your Vendor comments for a couple of reasons: 1.) I am looking for comparative details from experience working with multiple vendors. 2.) I have to think twice about vendors with anonymous profile names.

Like (0)26 May 16

Hello,

I would think just about any Enterprise Monitoring Solution allows for de-duplication of events out of the box… and just update the Event Count. At least all of the solutions I’ve employed provide this feature.
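
A minimal Python sketch of that basic behaviour, collapsing repeats into one alert and updating an event count, might look like the following. The `Alert` class, the `(host, check)` key, and the sample events are assumptions for illustration and are not taken from any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    host: str
    check: str
    message: str
    count: int = 1


def deduplicate(events):
    """Collapse repeats of the same (host, check) pair into one counted alert."""
    alerts = {}
    for host, check, message in events:
        key = (host, check)
        if key in alerts:
            alerts[key].count += 1          # repeat: bump the event count
            alerts[key].message = message   # keep the most recent message text
        else:
            alerts[key] = Alert(host, check, message)
    return list(alerts.values())


raw_events = [
    ("web01", "cpu", "CPU at 95%"),
    ("web01", "cpu", "CPU at 97%"),
    ("db01", "disk", "Disk 90% full"),
    ("web01", "cpu", "CPU at 99%"),
]
alerts = deduplicate(raw_events)
```

The four raw events become two alerts, with the `web01` CPU alert carrying a count of 3. This only handles the same alert on the same host; similar alerts across hosts or apps need more than a simple key match.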

If I can surmise what Incident Analysis refers to: Probable (Root) Cause Analysis? Most solutions employ something like this as well. However there is always a challenge with event correlation to understand what is impacted, and whether any underlying alerts actually contributed to the problem. This is always dependent upon customer requirements as not all platforms and applications are architected in the same fashion.

I recently attended a good BMC webinar which covers Service Impact Modeling, which may apply here in some way – or at least provide the many things to consider when employing a similar strategy: (You may need to create an account in BMC Communities to view…)

https://communities.bmc.com/servlet/JiveServlet/download/33539-2-31518616/BPPM%209.5%20Webinar%20Series%20Configuring%20Simple%20Service%20Modeling%20%26%20PCA-20141028-recording.mp4

Online Documentation: https://docs.bmc.com/docs/display/public/proactivenet95/About+service+modeling

Hope this provides a good start in navigating down this rabbit-hole… 

Like (0)26 May 16
seniorso11238, Real User

Each vendor has a different take on this aspect, based on their historical
development and the capabilities of the tools they offer.

Some only perform monitoring on a particular infrastructure layer (network,
systems, storage, etc.) and forward them to event analysis engines, some do
a very good job of isolating root cause of each issue and forward only the
pertinent details to upper level processing solutions.

Let me say one thing: if the solutions you consider have a detailed rules
based engine that requires you to enter and update individual rules for
monitoring, please STAY AWAY! It is a very high maintenance solution and
will either suck your resources dry or become obsolete too fast too soon.

Make sure that the solution you are considering can resolve relationships
between infrastructure components and update them automatically (either as
soon as they happen, periodically or through manual triggering).

Make sure that root cause determination takes place at each infrastructure
layer monitoring solution (automated resolution of issues is a plus
wherever applicable) and only this information is sent to higher level
incident monitoring/tracking solutions.

A good solution set at a minimum should consist of solutions that are
capable of:
* network monitoring/management
* systems monitoring/management
* storage infrastructure monitoring/management
* business application performance management/monitoring (if possible)
* higher level incident analysis engine that is fed from each of the above
solutions and has a point and click interface to configure rather than
endless keyboard typing
* service desk solution that is fed from all of the above solutions to be
able to implement ITIL guidelines

But the main hurdle is to engage the business side of the company/institution
to be able to gather information to understand what is important for them
and what is not. Remember, IT is there to support business. If you're
monitoring each and everything left and right without understanding the
business, you're just burning resources for a war that's already lost. This
may sound hard for the average IT department but it is an evolutionary step
that is required in today's corporate environment to become a part of
business that adds value, rather than being perceived as a bottomless pit
into which the organization throws money for no apparent benefit.

Please do not hesitate to contact me for further details.

--

Altug Gur

Like (0)26 May 16

Hi,

I don’t have experience with the tools you mentioned, but I have expertise in infrastructure monitoring with other tools, and I know that most of these tools work along the same lines. I have one question: is IT Central Station the right place to ask questions? I also have some questions about AppDynamics, the APM tool.

Thanks
Rohit

Like (0)26 May 16
JoseMolina, Consultant

Hi!

I have experience with some monitoring tools like:

- Microsoft System Center Operations Manager

- Riverbed Application Performance Management

- Riverbed Network Performance Management

I have experience with incident management (and additional ITIL work items) tools like:

- Microsoft System Center Service Manager

- ProactivaNet

Studying event management best practices can help in selecting the right tool.

Regards,

José A. Molina

Like (0)26 May 16
Membersh178113

Try Operations Manager i (OMi) from Hewlett Packard Enterprise. It is a differentiated product that scales from SMB to large Enterprise/xSP networks. It comes in a solution bundle with options to include industry-leading ITOA (big data analytics capability). There are documented reference customers with more than 70% event consolidation/suppression.
http://www8.hp.com/us/en/software-solutions/operations-bridge-event-correlation/index.html

Like (1)26 May 16