Background#Effective event correlation should be an integral part of modern infrastructure administration. Reducing alarm volume while improving information content is a key determinant of a successfully managed infrastructure.
Why it is a Problem#Suppose monitoring is done at the following levels:
- Network
- Operating System
- Application
If an outage happens at the Network level, then obviously the OS and Application levels will generate events too, because they are no longer reachable. But we do not want trouble tickets to be generated for erroneous or redundant reasons.
An event correlation engine would be able to determine that the "root" cause is the Network and would not send a ticket to the application owner.
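The root-cause decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular correlation engine works: the `DEPENDS_ON` map, the `Event` class, and the `find_root_causes` function are all hypothetical names, and the sketch assumes every event carries a host and one of the three monitoring levels.

```python
from dataclasses import dataclass

# Hypothetical dependency map (an assumption for this sketch):
# each monitoring level depends on the level it names.
DEPENDS_ON = {
    "application": "os",
    "os": "network",
    "network": None,
}


@dataclass(frozen=True)
class Event:
    host: str
    level: str    # "network", "os", or "application"
    message: str


def find_root_causes(events):
    """Return only the events that have no failing lower level on the same host."""
    failing = {(e.host, e.level) for e in events}
    roots = []
    for e in events:
        parent = DEPENDS_ON.get(e.level)
        # Walk down the dependency chain until a failing lower level is found
        # or the chain runs out.
        while parent is not None and (e.host, parent) not in failing:
            parent = DEPENDS_ON.get(parent)
        if parent is None:
            # Nothing underneath this level is failing, so ticket this event.
            roots.append(e)
    return roots


events = [
    Event("web01", "network", "link down"),
    Event("web01", "os", "host unreachable"),
    Event("web01", "application", "service not responding"),
    Event("db01", "application", "service not responding"),
]
roots = find_root_causes(events)
```

With these sample events, only the network event on `web01` and the application event on `db01` would be ticketed. The OS and application events on `web01` are suppressed because a lower level on the same host is already failing.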
Correlation is really the final piece in a network management environment. You get so many events out there, whether polled events or trap events. Taking all that information, correlating it, and efficiently figuring out what's important and what's not, which events are false positives, and so on, is critical.
If I get xyz it might mean one thing; if I get xyz and you add abc to it, then it's something else entirely. So the correlation engine really drives root cause analysis as well.
Without correlation, you're stuck with additional manual processes. If I can identify the event quickly with root cause analysis and I can figure out what's causing that event through automated correlation, then I've effectively driven down response time and troubleshooting time.
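One way to picture the "xyz alone versus xyz plus abc" idea is a small rule table in which the most specific matching rule wins. The sketch below is a hypothetical illustration: the `RULES` table, the diagnosis labels, and the `diagnose` function are invented for the example, and it assumes the observed event codes have already been grouped into one time window.

```python
# Hypothetical rules (assumptions for this sketch): each rule pairs a set of
# required event codes with the diagnosis it implies.
RULES = [
    (frozenset({"xyz"}), "diagnosis A"),
    (frozenset({"xyz", "abc"}), "diagnosis B"),
]


def diagnose(observed_codes):
    """Return the diagnosis of the most specific rule whose codes are all present."""
    observed = set(observed_codes)
    # A rule matches when all of its required codes were observed.
    matches = [(codes, label) for codes, label in RULES if codes <= observed]
    if not matches:
        return None
    # Prefer the rule that requires the most codes, so "xyz + abc" beats "xyz".
    return max(matches, key=lambda match: len(match[0]))[1]


only_xyz = diagnose(["xyz"])
xyz_and_abc = diagnose(["xyz", "abc"])
```

Here `only_xyz` resolves to the single-event rule, while `xyz_and_abc` resolves to the combined rule, so the same event leads to a different conclusion once its companion event is seen.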