False positives costing businesses ‘millions’, say CIOs

CIOs say their teams are flooded with alerts, with the resources consumed amounting to potentially millions in lost revenue.
23 January 2020

Constant alerts are slowing down IT operations. Source: Shutterstock

IT and cloud operations teams are being bombarded by nearly 3,000 alerts from monitoring and management tools each day, according to new Vanson Bourne research commissioned by Dynatrace.

The 2020 Global CIO Report describes these teams as “drowning in a data deluge.”

Of the hundreds or thousands of alerts received each day, many are false positives, duplicates or low priority, said the respondents, who comprised 800 CIOs in enterprises of more than 1,000 employees.

Just 26 percent of those daily alerts need actioning, they said. 

Traditional monitoring tools weren’t designed to handle the volume, velocity and variety of data generated by applications running in dynamic, web-scale enterprise clouds. Tools are often siloed and lack the broader context of events taking place.

As a result, IT teams are pinged with a surge of alerts and must spend an average of 15 percent of their time identifying which ones need addressing. The time allocated to this task makes it hard to automate enterprise cloud operations and, for the majority, leads to problems that could have been prevented.

Not only does this cut into time spent supporting the business and its customers; the time swallowed up manually filtering out false positives and duplicates is also estimated to come at a massive cost to organizations.

In companies that spend an average of US$10.2 million on IT staff annually, for example, that 15 percent of time spent ‘asking questions’ would cost about US$1.5 million each year.
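The arithmetic behind that estimate can be sketched in a few lines. The figures come from the report as quoted above. The function and variable names are illustrative, and the calculation assumes staff cost scales directly with the share of time spent on triage, which is a simplification.

```python
# Figures quoted in the article; all names here are illustrative.
ANNUAL_IT_STAFF_SPEND_USD = 10_200_000  # average annual IT staff spend
TRIAGE_TIME_PERCENT = 15                # share of time spent sorting alerts
ALERTS_PER_DAY = 3_000                  # "nearly 3,000", so an upper bound
ACTIONABLE_PERCENT = 26                 # share of alerts that need action


def triage_cost(annual_spend_usd, time_percent):
    """Staff cost attributable to alert triage, assuming cost scales with time."""
    return annual_spend_usd * time_percent / 100


def actionable_alerts(alerts_per_day, actionable_percent):
    """Approximate number of daily alerts that actually need attention."""
    return round(alerts_per_day * actionable_percent / 100)


cost = triage_cost(ANNUAL_IT_STAFF_SPEND_USD, TRIAGE_TIME_PERCENT)
needed = actionable_alerts(ALERTS_PER_DAY, ACTIONABLE_PERCENT)

print(f"Estimated annual triage cost: US${cost:,.0f}")
print(f"Actionable alerts per day: about {needed} of {ALERTS_PER_DAY:,}")
```

On these inputs the triage cost works out to US$1.53 million, which the report rounds to US$1.5 million.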

Organizations require a “radically different” approach, and autonomous cloud operations represent the next stage, beginning with automating continuous delivery and operational tasks, to enable self-healing applications and auto-remediation.

“Several years ago, we saw that the scale and complexity of enterprise cloud environments was set to soar beyond the capabilities of today’s IT and cloud operations teams,” said Bernd Greifeneder, CTO and founder, Dynatrace. 

“We realized traditional monitoring tools and approaches wouldn’t come close to understanding the volume, velocity and variety of alerts that are generated today.”

The rise of AIOps solutions

In answer to these challenges, which are in no way unique to the Dynatrace study, artificial intelligence (AI), machine learning and big data analytics technologies are swiftly finding their place in IT operations, under the AIOps umbrella.

The promise behind these tools is that organizations will be able to fully automate many of their IT Ops teams’ traditional tasks and free resources for higher-value, more complex challenges. 

Instead of relying on human engineers to figure out why a service or application has stopped working or slowed down, AIOps tools can use data to interpret the problem and fix it automatically. They can identify and preempt possible incidents on the horizon by analyzing behavioral and historical data with sophisticated algorithms and machine learning techniques. 
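As a rough illustration of what "analyzing historical data" can mean at its simplest, the sketch below flags a new measurement only when it deviates sharply from a historical baseline. This is not Dynatrace's method or any vendor's algorithm. It is a basic z-score check, far simpler than commercial AIOps tooling, and the function name, threshold and sample data are all assumptions made for the example.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Return True if `latest` lies more than `threshold` standard
    deviations from the mean of `history` (a simple z-score test)."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return latest != baseline
    return abs(latest - baseline) / spread > threshold


# Hypothetical response times in milliseconds for one service.
response_times_ms = [120, 118, 125, 122, 119, 121, 124, 120]

print(is_anomalous(response_times_ms, 123))  # normal fluctuation
print(is_anomalous(response_times_ms, 480))  # sharp slowdown
```

A check like this raises an alert only for the second reading, which is the kind of filtering that reduces low-priority noise. Real tools add context such as topology and dependencies, which a single-metric test cannot capture.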

The intended benefits of these tools are pretty self-explanatory: by being able to detect and react to events in real time, they boost efficiency, hand businesses more control over their applications and infrastructure, and give IT and cloud operations teams their time back to focus on building seamless experiences. 

These tools are effective and, given the scale of the challenges exposed in Dynatrace’s survey, it’s unsurprising that there is now a swelling market of solutions.

But activating AIOps is certainly not as easy as “flipping a switch”. AIOps tools still need to be steered and “trained” to serve their purposes. New technologies and functions, such as cloud-native infrastructure and DevOps, produce exponentially more data than legacy technologies.

Human engineers must continually optimize tools and tweak workflows to accommodate changes in data sets, workloads and demands. Critical evaluation is required of where data is coming from, how it’s processed and whether it’s complete in the information it provides.

Finally, as with the introduction of automation initiatives across other areas of industry, implementing AIOps can be met with internal friction and misunderstanding. Not all teams will be eager to embrace technology that’s poised to make redundant, to some degree, areas of their skill sets which have been developed over time.

Business leaders must consider the potential cultural disruption caused by the introduction of automation technologies, ensuring transparency as to wider motivations. 

Training the teams most affected on the technology’s value, addressing fears and gaining buy-in are vital, as is running several small-scale trials in a ‘real-world’ environment.