The Fault in Our Metrics: Rethinking How We Measure Detection & Response
2024-06-08, 14:30–15:15, Track 3 (Moody Rm 102)

Your metrics are boring and dangerous. Recycled slides with meaningless counts of alerts, incidents, true and false positives… SNOOZE. Even worse, they’re motivating your team to distort the truth and subvert progress. This talk is your wake-up call to rethink your detection and response metrics. You’ll get a practical framework for developing your own metrics, a new maturity model for measuring capabilities, and lots of visual examples of metrics that won’t put your audience to sleep.


Description

Metrics tell a story. But before we can describe the effectiveness of our capabilities, our audience first needs to grasp what modern detection and response is and its value. So, how do we tell that story, especially to leadership with a limited amount of time?

Measurements help us get results. But if you’re advocating for faster response times, you might be encouraging your team to make hasty decisions that lead to increased risk. So, how do we find a set of measurements, both qualitative and quantitative, that incentivizes progress and serves as a north star for modern detection and response?

Metrics help shape decisions. But legacy methods of evaluating and reporting are preventing you from getting the support and funding you need to succeed. At the end of this talk, you’ll walk away with a practical framework for developing your own metrics, a new maturity model for measuring detection and response capabilities, data gathering techniques that tell a convincing story using micro-purple testing, and lots of visual examples of metrics that won’t put your audience to sleep.

What’s new in this talk?

This talk presents a new approach to detection and response metrics. I propose moving away from the typical approach of measuring effectiveness solely through quantitative indicators, such as event counts, which are often used by security operations centers and legacy detection and response programs. I introduce a new maturity model for measuring detection and response capabilities. I provide a methodology for using micro-purple testing – tests that validate detection logic as well as analysis and response processes – to measure overall visibility into threats. Finally, I walk the audience through a practical framework that will help them develop their own metrics.

Key takeaways

  1. A new maturity model that helps tell the story of modern detection and response, the value it provides, and how your current capabilities measure up against your goal state.

  2. Visual examples of metrics you can use today to present across teams and leadership, along with a framework for developing your own detection and response metrics and practical advice on how to strategically move to these modern metrics when change is hard and leadership hates surprises.

  3. Methods to measure and prioritize threat coverage with micro-purple testing – tests that validate detection logic as well as analysis and response processes.

Who will enjoy this talk?

  • CISOs who want to better understand what modern detection and response metrics should look like and how to include them in their overall program metrics.
  • Managers and directors who present detection and response metrics to leadership and the rest of their organization.
  • Engineers and analysts who are tired of their work being misrepresented with sad, unmotivating metrics.
  • Anyone interested in learning more about detection and response.

Outline

1. Introduction

This will present the key takeaways: a new maturity model to describe and measure detection and response capabilities, a framework for developing and moving to modern metrics, methods to measure and prioritize threat coverage with micro-purple testing, and visual examples of metrics that can be used today to present across teams and leadership. I will share a personal story of how metrics motivated me to subvert progress, and I’ll give an example of legacy detection and response metrics I’ve presented in the past and why they fell short.

2. Background and terminology

This will provide the context for why I believe blue teams have been doing detection and response metrics all wrong and how it’s prevented them from getting the support and funding needed to succeed. I will discuss how relying solely on quantitative indicators like event counts not only fails to tell the story of the value a detection and response program can bring to an organization, but also incorrectly incentivizes people to focus on specific detections and events instead of thinking about the overall effectiveness of the program. I will also provide the required terminology and background research regarding metrics and measurements.

3. Measuring and describing what we do

We will begin our journey into modern metrics by telling the story of what we do and why it’s important. I will introduce a maturity model I created to measure detection and response capabilities, covering the pillars Observability, Proactive Threat Detection, and Rapid Response. I will provide examples of how to use the maturity model to visualize the current and target states.
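To make the visualization concrete, here is a minimal sketch in Python of showing current versus target state per pillar. The pillar names come from the model above; the 0–5 level scale and the scores are illustrative placeholders, not the model’s actual definitions:

    # Minimal sketch: current vs. target maturity per pillar.
    # Pillar names come from the talk's model; the 0-5 scale and the
    # scores below are illustrative assumptions, not real definitions.
    PILLARS = {
        "Observability": {"current": 2, "target": 4},
        "Proactive Threat Detection": {"current": 1, "target": 3},
        "Rapid Response": {"current": 3, "target": 4},
    }
    MAX_LEVEL = 5

    def render(pillars: dict) -> None:
        """Print a simple text view of current vs. target level for each pillar."""
        for name, levels in pillars.items():
            bar = ""
            for level in range(1, MAX_LEVEL + 1):
                if level <= levels["current"]:
                    bar += "#"   # achieved
                elif level <= levels["target"]:
                    bar += "-"   # gap to close
                else:
                    bar += "."   # beyond target
            print(f"{name:<28} [{bar}] current={levels['current']} target={levels['target']}")

    if __name__ == "__main__":
        render(PILLARS)

In the talk, the same current-versus-target view is rendered as slide-ready visuals rather than terminal output, but the underlying data shape is this simple.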

4. Measuring and prioritizing threat coverage

Next, I will address the issue of reporting on what threats can be detected without any context of what can’t be detected. I will introduce the concept of measuring and prioritizing threat coverage with micro-purple testing – tests that validate detection logic as well as analysis and response processes. I will visualize how to implement this testing and how to use the results to measure what types of threats can (and cannot) be detected.
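As a rough illustration of the approach (not the talk’s actual implementation), the sketch below pairs each simulated technique with the detection expected to fire and tallies which cases were caught. The technique IDs and the detect() stub are hypothetical placeholders; in practice the stub would run the simulation and query your alert pipeline:

    # Minimal micro-purple harness sketch: each test case pairs a simulated
    # technique with the detection expected to fire. Technique IDs and the
    # detect() stub are hypothetical placeholders.
    from dataclasses import dataclass

    @dataclass
    class MicroPurpleTest:
        technique_id: str        # e.g., a MITRE ATT&CK technique
        description: str
        expected_detection: str

    def detect(test: MicroPurpleTest) -> bool:
        """Placeholder: run the simulation, then check whether the expected alert fired."""
        simulated_results = {"T1059.001": True, "T1003.001": False, "T1566.002": True}
        return simulated_results.get(test.technique_id, False)

    TESTS = [
        MicroPurpleTest("T1059.001", "PowerShell execution", "suspicious_powershell"),
        MicroPurpleTest("T1003.001", "LSASS memory dump", "credential_dumping"),
        MicroPurpleTest("T1566.002", "Phishing link click", "malicious_url_click"),
    ]

    if __name__ == "__main__":
        detected = [t for t in TESTS if detect(t)]
        missed = [t for t in TESTS if not detect(t)]
        print(f"Detected {len(detected)} of {len(TESTS)} test cases")
        for t in missed:
            print(f"  GAP: {t.technique_id} ({t.description}) - expected '{t.expected_detection}'")

The value is in the misses: the gaps become a prioritized backlog and a defensible statement of what currently cannot be detected.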

5. Developing your own metrics

Continuing, I will introduce a framework for developing metrics. The result is measurements that lead to better decisions, qualitative context that enriches quantitative indicators, and a balanced set of measurements that incentivizes the team toward progress. I will introduce the SAVER framework, which I created to measure the effectiveness of threat detection and response. I will walk the audience through its five categories:

1) Streamlined: Efficiency, accuracy, and automation of the SOC. For example, how much time is spent on manual versus automated triage, and how often the automation leads to incorrect conclusions.
2) Awareness: Context and intelligence about existing and emerging threats, vulnerabilities, and risks. For example, how complete the threat model is for the environments being protected and how threat intelligence sourcing is increasing or decreasing the associated risks.
3) Vigilance: Visibility and detection coverage for known threats. For example, the percentage of MITRE ATT&CK techniques that can be investigated, detected, and responded to.
4) Exploration: Proactive investigations that expand our awareness and vigilance. For example, the discovery of gaps in current protections and illumination of new threats to the organization using threat hunting.
5) Readiness: How prepared are we for the next big incident? For example, the speed, accuracy, and completeness of runbooks across the organization.

For each of these categories, I’ll discuss the signals needed and how to collect them. We’ll discuss how to avoid the common pitfalls of metric selection and instead use metrics that can be measured today but impact future outcomes, find metrics that reward risk, ensure metrics can be affected by the team, and balance metrics to align with the overall goal. And finally, we’ll go through many examples of data visualizations, showing how to present these metrics.
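To ground the signal collection in something runnable, here is a minimal sketch of computing two SAVER-style signals – manual versus automated triage time (Streamlined) and validated technique coverage (Vigilance). The record fields and numbers are illustrative assumptions; real signals would be pulled from your case management and detection validation systems:

    # Minimal sketch of two SAVER-style signals computed from sample data.
    # Field names and values are illustrative assumptions only.
    alerts = [
        {"id": 1, "triage": "automated", "minutes": 2,  "correct": True},
        {"id": 2, "triage": "manual",    "minutes": 35, "correct": True},
        {"id": 3, "triage": "automated", "minutes": 3,  "correct": False},
        {"id": 4, "triage": "manual",    "minutes": 50, "correct": True},
    ]

    # Streamlined: share of triage time spent on manual work, and automation accuracy.
    manual_minutes = sum(a["minutes"] for a in alerts if a["triage"] == "manual")
    total_minutes = sum(a["minutes"] for a in alerts)
    automated = [a for a in alerts if a["triage"] == "automated"]
    automation_accuracy = sum(a["correct"] for a in automated) / len(automated)

    # Vigilance: percentage of prioritized ATT&CK techniques with a validated
    # detection, e.g., validated via the micro-purple tests described earlier.
    prioritized = {"T1059", "T1003", "T1566", "T1047", "T1021"}
    validated = {"T1059", "T1566", "T1021"}
    coverage = len(prioritized & validated) / len(prioritized)

    print(f"Manual triage time: {manual_minutes / total_minutes:.0%} of total")
    print(f"Automation accuracy: {automation_accuracy:.0%}")
    print(f"Prioritized technique coverage: {coverage:.0%}")

Each signal is cheap to compute once the underlying data is flowing; the harder and more valuable work, covered in this section, is choosing the set of signals that rewards the behavior you actually want.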

6. Shifting to modern metrics

Before closing, I will discuss how to strategically move to these new types of metrics considering that change is hard and leadership hates surprises. I will give examples of transitional metrics and how to describe the shift to these more modern metrics.

7. Closing remarks

Finally, this section will provide a moment of bliss where we reflect on how we used to present metrics, and how going forward the audience is empowered to tell the story of their detection and response program with a maturity model, describe threat coverage and priorities using micro-purple testing, and shape decisions that reduce risk using alert and incident data.

Allyn Stott is a senior staff engineer at Airbnb. He currently works on the information security technology leadership team where he spends most of his time working on threat detection and incident response. Over the past decade, he has built and run detection and response programs at companies including Delta Dental of California, MZ, and Palantir. Red team tears are his testimonials.

In the late evenings, after his toddler ceases all antics for the day, Allyn writes a semi-regular, exclusive security newsletter. This morning espresso shot can be served directly to your inbox by subscribing at meoward.co.

Allyn has previously presented at Black Hat, Kernelcon, The Diana Initiative, Texas Cyber Summit, and BSides Berlin, Singapore, Toronto, Seattle, Orlando, St Pete, San Antonio, Charleston, and Atlanta. He received his Master’s in High Tech Crime Investigation from The George Washington University as part of the Department of Defense Information Assurance Scholarship Program.