Advice From The Community

Read answers to top Event Monitoring questions. 430,223 professionals have gotten help from our community of experts.
Rony_Sklar
Are event correlation and aggregation both needed for effective event monitoring and SIEM? 
David Collier
User

Both are techniques aimed at reducing the number of active alerts an operator receives from the monitoring tool.

I don't fully agree with the previous descriptions of correlation and aggregation, welcome though they are.

Let's take a typical scenario. Assume a network interface on a large switch fails, resulting in many systems experiencing failures. In the 'raw' state, i.e. with no correlation or aggregation, the monitoring system would receive potentially thousands of events - possibly multiple SNMP traps from other network devices or servers, event log records from Windows servers, Syslog entries from Linux, errors from the database management system, errors from web servers relying on that database, and probably lots of incidents raised by users on the help desk. Good correlation algorithms will be able to distinguish between "cause" alarms and "symptom" alarms. In this scenario, the "cause" is the failed network switch port and the symptoms are the database failures and log file entries. Simplistically, fixing the cause will also address the symptoms.
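The cause/symptom distinction above can be sketched with a simple dependency walk. This is a minimal, hypothetical illustration: the topology map and component names are invented, and real correlation engines use far richer models.

```python
# Hypothetical sketch of cause/symptom classification over a dependency
# map. Component names and the topology are invented for illustration.

# Each component lists the upstream components it depends on.
DEPENDS_ON = {
    "web-server": ["database"],
    "database": ["switch-port-7"],
    "switch-port-7": [],
}

def upstream(component, topology):
    """Return all transitive upstream dependencies of a component."""
    seen = set()
    stack = list(topology.get(component, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(topology.get(node, []))
    return seen

def classify(alarms, topology):
    """Label each alarming component as a 'cause' or a 'symptom'.

    An alarm is a symptom if any of its upstream dependencies is also
    alarming; otherwise it is treated as a probable cause.
    """
    alarmed = set(alarms)
    return {
        a: "symptom" if upstream(a, topology) & alarmed else "cause"
        for a in alarms
    }

print(classify(["web-server", "database", "switch-port-7"], DEPENDS_ON))
# switch-port-7 comes back as the cause; database and web-server as symptoms.
```

The operator then sees one actionable "cause" alarm instead of three.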

Typically, aggregation is used to "combine" events into a single alarm. Again, there are multiple methods to do this. A simple one would be - as previously described - duplicate reduction. In a poorly configured monitoring environment, every check that breaches its threshold results in an alarm. If monitoring is granular - say, CPU utilization is measured every 30 seconds and an alarm raised if it exceeds 80% - then very quickly the operator would be overwhelmed by many meaningless alarms, especially if the CPU is doing work where high usage is expected. Handling such 'duplicates' helps operators identify real issues; it may be enough to update the original alarm with the duration of the threshold breach.
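That duplicate-reduction idea can be sketched in a few lines. This is an assumed, simplified model using the 30-second/80% CPU scenario above: consecutive breaches extend one open alarm's duration instead of raising new alarms.

```python
# Minimal sketch of duplicate aggregation: repeated CPU-threshold
# breaches update a single open alarm rather than raising new ones.
# The samples, interval, and threshold follow the scenario above.

THRESHOLD = 80  # percent CPU

def aggregate(samples, interval=30):
    """Collapse consecutive threshold breaches into single alarms.

    Returns a list of (start_index, duration_seconds) tuples, one per
    continuous breach, instead of one alarm per breaching sample.
    """
    alarms = []
    open_alarm = None
    for i, cpu in enumerate(samples):
        if cpu > THRESHOLD:
            if open_alarm is None:
                open_alarm = [i, interval]   # a new breach starts
            else:
                open_alarm[1] += interval    # just extend the duration
        elif open_alarm is not None:
            alarms.append(tuple(open_alarm))
            open_alarm = None
    if open_alarm is not None:
        alarms.append(tuple(open_alarm))
    return alarms

# Six samples containing two separate breaches: two alarms, not four.
print(aggregate([85, 90, 70, 95, 99, 60]))  # [(0, 60), (3, 60)]
```

The operator gets one alarm per breach episode, annotated with how long it lasted.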

There are many techniques for aggregation and correlation beyond identifying cause and symptom events or ignoring duplicates. For instance, time-based event handling: consider a scenario where an event is only considered relevant if another event hasn't happened in a given timeframe before or after the focus event, or a scenario where event aggregation occurs based on reset thresholds rather than alarm thresholds.
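The time-based case can be illustrated with a link-flap filter. This is a hedged sketch under assumed event shapes (timestamped 'down'/'up' tuples) and an invented grace window: a "down" event is suppressed if a matching "up" arrives within the window, since the condition cleared itself.

```python
# Hypothetical sketch of time-based event handling: a 'down' event is
# only relevant if no clearing 'up' event follows within a grace
# window. Event shapes and the window length are invented.

GRACE_SECONDS = 60

def filter_flaps(events):
    """Keep only 'down' events with no 'up' inside the grace window.

    `events` is a time-ordered list of (timestamp, kind) tuples,
    where kind is 'down' or 'up'.
    """
    relevant = []
    for i, (ts, kind) in enumerate(events):
        if kind != "down":
            continue
        cleared = any(
            k == "up" and 0 <= t - ts <= GRACE_SECONDS
            for t, k in events[i + 1:]
        )
        if not cleared:
            relevant.append((ts, kind))
    return relevant

# The outage at t=10 clears at t=40 and is dropped; t=200 persists.
print(filter_flaps([(10, "down"), (40, "up"), (200, "down")]))
# [(200, 'down')]
```

Short-lived flaps never reach the operator; only sustained outages do.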

There are also some solutions that purport to intelligently correlate events using AI. Although, speaking personally, this seems more marketing speak than a one-click feature. In reality, these advanced (i.e. $$$$$$) solutions need to maintain a dynamic infrastructure topology in near-real-time and map events to service components in order to assess root cause correlation. In the days of rapidly flexing and shrinking infrastructures, cloud services, and containerization, it is extremely difficult to maintain an accurate, near-real-time view of an entire IT infrastructure from users through to lines of application code. A degree of machine learning has helped, but the cost-benefit simply isn't there yet for these topology-based event correlation features.

Randall Hinds
Real User

Agree with all the answers posted here, and I especially like Dave's explanation of the more advanced solutions available on the market. Excellent call-outs on the need for deep and well-maintained relationship mapping to enable an AI's algorithm to connect the dots between aggregated alerts firing from multiple separate source tools. Having a mature ITSM implementation with CI discovery, automated dependency mapping, and full integration between your correlation engine and CMDB will help too.

Daniel Sichel
Real User

Other answers pretty much sum this up, but there is one important point to make. With some technologies it's important to take into account the number of events that are aggregated, and for your SIEM device to be able to treat them as individual events for the purpose of correlation.

James Meeks
User

As previously mentioned, correlation is the comparing of events of the same type. In my experience, alerts are created to notify when a series of these occurs and reaches the prescribed threshold.

Aggregation, based on my experience, is the means of clumping/combining objects of a similar nature together and providing a record of the "collection": deriving group and subgroup data by analysis of a set of individual data entries. Alerts for this are usually created for prognostication and forecasting. Often the "grouping" lacks detailed information, so there is a requirement for digging into the substantiating data to determine how it was summarized.

Alerts/Alarms can be set for both, but usually only for the former and not the latter.

Information security at a financial services firm with 1-10 employees
Real User

You cannot process and generate advanced correlated alerts without aggregation: limiting your correlation to a single set of sources will leave your SIEM blind and unaware of the global context.
So yes, to get 'EFFECTIVE' event monitoring with the goal of correlating events, you need to aggregate many different sources.

Andreas Rühl
User

"Aggregation is a mechanism that allows two or more events that are raised to be merged into one for more efficient processing" from https://www.ibm.com/support/knowledgecenter/SSRH32_3.2.0/fxhdesigntutorevtaggreg.html
"Event correlation takes data from either application logs or host logs and then analyzes the data to identify relationships. " from https://digitalguardian.com/blog/what-event-correlation-examples-benefits-and-more

So yes, you need both for SIEM. For simple monitoring you don't. There's a big difference between what a SIEM does and what simple event monitoring does.

Rishan-Ahmed
Reseller

Simply: correlation is the process of tracking relationships between events based on defined conditions. Aggregation is nothing but combining similar events. Both are required for effective monitoring.

Sales Engineer at a tech services company with 201-500 employees
Reseller

Aggregation is taking several events and turning them into one single event, while Correlation enables you to find relationships between seemingly unrelated events in data from multiple sources and to understand which events are most relevant.

Both aggregation and correlation are needed for effective event monitoring and SIEM. In Splunk Enterprise Security (ES), correlation searches can search many types of data sources, including events from any security domain (access, identity, endpoint, network), asset lists, identity lists, threat intelligence, and other data in the Splunk platform. The searches then aggregate the results of an initial search with functions in SPL and take action in response to events that match the search conditions with an adaptive response action.

Aggregation example - Splunk Stream lets you apply aggregation to network data at capture-time on the collection endpoint before data is sent to indexers. You can use aggregation to enhance your data with a variety of statistics that provide additional insight into activities on your network. When you apply aggregation to a Stream, only the aggregated data is sent to indexers. Using aggregation can help you decrease both storage requirements and license usage. Splunk Stream supports a subset of the aggregate functions provided by the SPL (Splunk Processing Language) stats command to calculate statistics based on fields in your network event data. You can apply aggregate functions to your data when you configure a stream in the Configure Streams UI.

Correlation example - Identify a pattern of high numbers of authentication failures on a single host, followed by a successful authentication by correlating a list of identities and attempts to authenticate into a host or device. Then, apply a threshold in the search to count the number of authentication attempts.
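That correlation pattern can be sketched outside of SPL as plain code. This is an illustrative approximation, not actual Splunk search syntax: the log records, field names, and failure threshold are all invented for the example.

```python
# Illustrative sketch (not SPL) of the correlation described above:
# flag a host when N authentication failures are followed by a
# success for the same identity. Records and threshold are invented.

from collections import defaultdict

FAILURE_THRESHOLD = 3

def detect_bruteforce(logs):
    """Return hosts where >= FAILURE_THRESHOLD failures precede a success.

    `logs` is a time-ordered list of (host, user, outcome) tuples,
    where outcome is 'fail' or 'success'.
    """
    failures = defaultdict(int)
    flagged = set()
    for host, user, outcome in logs:
        if outcome == "fail":
            failures[(host, user)] += 1
        else:
            if failures[(host, user)] >= FAILURE_THRESHOLD:
                flagged.add(host)
            failures[(host, user)] = 0  # reset the count on success
    return flagged

logs = [
    ("srv1", "admin", "fail"),
    ("srv1", "admin", "fail"),
    ("srv1", "admin", "fail"),
    ("srv1", "admin", "success"),  # 3 failures then success: flagged
    ("srv2", "bob", "fail"),
    ("srv2", "bob", "success"),    # only 1 failure: not flagged
]
print(detect_bruteforce(logs))  # {'srv1'}
```

The same logic in a SIEM would typically also apply a time window, so that failures spread over weeks don't trigger the alert.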


What is Event Monitoring?

The users of IT Central Station evaluated event monitoring software to determine the most important aspects of a product. The consensus was that the tool must provide strong data collection with an intuitive filtering system, so as to supply enough, but not too much, information that can be drilled down into. Users were also concerned with the software's ability to customize displays to user requirements. Other key features included accuracy, a dynamic but simple user interface, and alerts.

Find out what your peers are saying about Microsoft, BMC, Zenoss and others in Event Monitoring. Updated: July 2020.