The Measurable Benefits of Agentic-First Cybersecurity Operations
The global cybersecurity workforce cannot scale to meet demand. This research synthesises evidence from government bodies, independent academic institutions, and professional associations to evaluate the measurable operational benefits of adopting an agentic-first approach to cybersecurity operations: one in which AI agents handle high-volume, defined tasks while human analysts supervise by exception. Vendor-conflicted research has been excluded from primary claims; the small number of vendor-run surveys cited are treated as Tier 3 and flagged as such. Evidence is classified by confidence level throughout.
Executive Summary
The global cybersecurity landscape is characterised by a structural workforce deficit, escalating attack volumes, and mounting pressure on security teams operating with constrained budgets. Evidence from independent sources — including ISC2, ISACA, ENISA, the World Economic Forum, and peer-reviewed academic research — converges on a consistent finding: the current model of human-centric, reactive security operations is insufficient to meet modern threat volumes and is causing measurable harm through analyst burnout, workforce attrition, missed detections, and extended breach dwell times.
This paper evaluates the potential measurable benefits of adopting an agentic-first approach: one in which AI agents autonomously execute defined security tasks — detection, triage, investigation, containment, compliance evidence gathering — while human analysts supervise by exception.
The global cybersecurity workforce gap reached approximately 4.8 million unfilled roles in 2024 (ISC2 2024, n=15,852 respondents), a 19.1% year-on-year increase. Ninety percent of practitioners report skills gaps on their teams. This gap cannot be closed by hiring alone — automation of analyst tasks is the only scalable near-term mechanism to expand effective security operations capacity.
Sixty-six percent of cybersecurity professionals report their role is more stressful than five years ago (ISACA 2024). Nearly half report current burnout (ISC2 2024). Burned-out analysts miss detections, respond slowly, and leave — creating both a security and a human capital problem. Automating high-volume, low-complexity alert triage is the most tractable near-term intervention.
Academic research indicates that false positive rates in SOC environments can be extremely high. An Oxford University study (USENIX Security 2022) found practitioners characterised some tool-category rates as approaching 99%. Automated triage systems have demonstrated the ability to reduce alerts shown to analysts by up to 61% with a 1.36% false negative rate over millions of alerts (arXiv 2505.09843).
A study by Brynjolfsson, Li & Raymond (NBER Working Paper, 2023) found that generative AI assistance boosted knowledge worker productivity by 14% on average, with a 34% improvement for less-experienced workers. This finding comes from a different domain (customer support); it provides directional evidence applicable to security triage and investigation tasks, but requires domain-specific validation before direct transposition.
This paper is based on secondary research using publicly available sources. Vendor-sponsored research has been excluded from primary claims. Several benefit areas — particularly MSP-specific efficiency gains — lack independent peer-reviewed evidence and are treated as inference rather than established findings. All claims are classified by confidence level throughout.
Research Scope and Methodology
Research Question
What measurable operational, economic, and security outcome benefits can organisations reasonably expect from adopting an agentic-first approach to cybersecurity and IT operations, and what is the quality of the evidence supporting those expectations?
Source Hierarchy
Evidence was collected through systematic searches targeting non-vendor-conflicted primary and secondary sources. The following hierarchy was applied:
| Tier | Category | Sources | Weight |
|---|---|---|---|
| Tier 1 | Government, regulatory & professional bodies | NIST, CISA, ENISA, ISC2, ISACA, WEF, OECD, US BLS, GAO, UK DCMS, Jobs and Skills Australia | Highest |
| Tier 2 | Independent academic institutions | NBER (MIT/Stanford), USENIX (Oxford), ACM Computing Surveys, arXiv (methodology reviewed) | High |
| Tier 3 | Industry surveys with disclosed methodology | Kaseya MSP Benchmark, Sophos Active Adversary, Tines | Medium |
| Excluded | Vendor-conflicted research | IBM, Microsoft, CrowdStrike, Palo Alto Networks, SentinelOne, Fortinet, Splunk, Accenture, Deloitte, PwC, McKinsey, Gartner | Not used |
Evidence Classification
Throughout this paper, claims are explicitly classified as:
- Evidence — Supported by two or more credible, non-conflicted sources
- Single-source directional — Supported by one credible source but not independently confirmed
- Inference — Reasoned logical extension from established evidence, not directly evidenced
- Hypothesis — Plausible but not evidenced; requires empirical validation
32 documented evidence sources were reviewed across ten research topics. 15 sources were rejected for vendor conflict, methodology weakness, or non-independence.
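The four labels above amount to a small decision rule. The following is a minimal sketch of that rule, assuming each claim can be reduced to a count of credible, non-conflicted sources plus a flag for whether it is reasoned from established evidence; the names `Confidence` and `classify_claim` are illustrative and do not describe any tooling used in this research.

```python
from enum import Enum


class Confidence(Enum):
    EVIDENCE = "Evidence"
    SINGLE_SOURCE = "Single-source directional"
    INFERENCE = "Inference"
    HYPOTHESIS = "Hypothesis"


def classify_claim(independent_sources: int,
                   reasoned_from_evidence: bool = False) -> Confidence:
    """Map a claim to one of the paper's four confidence labels.

    independent_sources: credible, non-conflicted sources that directly
        support the claim.
    reasoned_from_evidence: True when the claim is a logical extension of
        established evidence rather than directly evidenced.
    """
    if independent_sources >= 2:
        return Confidence.EVIDENCE
    if independent_sources == 1:
        return Confidence.SINGLE_SOURCE
    if reasoned_from_evidence:
        return Confidence.INFERENCE
    return Confidence.HYPOTHESIS


# A claim backed by two independent surveys is labelled "Evidence"
label = classify_claim(independent_sources=2)
```

Ordering the checks from strongest to weakest means a claim always receives the highest label it qualifies for.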
What “Agentic-First” Means in Cybersecurity Operations
Definitional Context
Traditional security operations rely on human analysts to receive alerts, manually triage them, investigate suspicious activity, and escalate for action. “AI-assisted” approaches layer machine learning outputs onto this fundamentally human workflow — analysts still perform the core cognitive and operational tasks.
An agentic-first approach inverts this model. AI agents autonomously execute defined security tasks within prescribed authority boundaries, while human analysts supervise outcomes, set policy, and make judgments at exception points. The human moves from being “in the loop” (required for every action) to “on the loop” (monitoring and correcting, intervening when genuinely needed).
Agentic vs. AI Co-pilot: A Structural Comparison
| Dimension | AI Co-pilot / Assistant | Agentic-First |
|---|---|---|
| Primary executor | Human analyst | AI agent |
| Human role | Primary responder | Supervisor / exception handler |
| Alert throughput | Limited by analyst capacity | Scales with compute |
| Coverage hours | Limited to staffed hours | 24/7 continuous |
| Consistency | Variable (human fatigue) | Consistent within policy |
| Audit trail | Partial (analyst notes) | Complete (every agent step) |
| Skill requirement | Senior expertise to operate | Senior expertise to govern |
Regulatory Alignment
NIST SP 800-207 (Zero Trust Architecture) and the CISA Zero Trust Maturity Model v2.0 (April 2023) both explicitly include “Automation and Orchestration” as a cross-cutting capability. CISA describes this as capabilities that “leverage insights to support robust and streamlined operations to handle security incidents and respond to events.”
Evidence NIST AI RMF 1.0 (January 2023) provides a governance structure for AI systems through four functions: Govern, Map, Measure, and Manage. Agentic security platforms operating under explicit authority matrices with full audit logging are architecturally aligned with AI RMF governance requirements.
Market Context
4.1 The Cybersecurity Workforce Crisis
The cybersecurity workforce deficit is the most robustly evidenced structural problem in this research domain.
Evidence The ISC2 2024 Cybersecurity Workforce Study (n=15,852 practitioners) found the global workforce gap at approximately 4,763,963 people — 47% of total global need unmet. The gap grew 19.1% year on year. 58% state skills gaps put their organisation at significant risk. 37% faced budget cuts in 2024.
Evidence ENISA’s 2024 NIS Investments report, drawing on 1,080 professionals across EU 27 Member States, estimates an EU workforce shortage of approximately 300,000. The UK Government’s Cyber Security Skills in the UK Labour Market 2023 report found 50% of all UK businesses have a basic cybersecurity skills gap.
Evidence The US Bureau of Labor Statistics (May 2024) projects 29% employment growth for information security analysts over 2024–2034, yet the ISC2 data shows the gap itself grew 19.1% in a single year, indicating that supply is not keeping pace with demand.
Inference Given that the workforce gap cannot be closed through hiring alone within any operational planning horizon, automation of analyst tasks represents the only scalable near-term mechanism to expand effective security operations capacity.
4.2 Analyst Burnout and Its Operational Consequences
Evidence ISACA’s 2024 State of Cybersecurity survey (n=1,800+): 66% of professionals say their role is more stressful than five years ago; 81% cite an increasingly complex threat landscape as the primary stressor; 55% report difficulties retaining qualified candidates; 46% cite high work stress as a reason practitioners leave.
Evidence ISC2 2024 Workforce Study found that nearly half of cybersecurity professionals at all levels currently report burnout, with teams expected to do more with fewer resources in increasingly complex environments.
Inference Burnout-induced performance degradation — including slower triage, higher missed detection rates, and analyst turnover — creates a compounding security risk distinct from headcount shortage. Automating high-volume, low-complexity alert triage is the most operationally tractable intervention for this specific problem.
4.3 Alert Volume and False Positive Rates
Evidence (academic) Alahmadi, Axon & Martinovic (USENIX Security ’22, University of Oxford), in a qualitative study of SOC practitioners, found that practitioners perceive false positive rates as extremely high; for some tool categories they described their experience as “99% false positives”. The researchers found that most false positive alerts are explained by benign triggers rather than malicious activity, suggesting that detection rule calibration and contextual enrichment could substantially reduce analyst burden.
Single-source directional A research synthesis in ACM Computing Surveys (Tariq et al., 2025) indicates approximately 46% of all alerts across typical SOC environments are false positives.
Single-source directional An Automated Alert Classification and Triage system (AACT, arXiv 2505.09843, 2025) evaluated in a real production SOC over six months demonstrated a 61% reduction in alerts requiring human analyst attention, with a 1.36% false negative rate over millions of alerts. This represents a genuine measured outcome in a live environment, not a vendor claim or demo scenario.
Important caveat: Alert volume and false positive rates vary substantially by environment, tool set, and industry sector. The AACT study result derives from a single SOC environment and should not be assumed as a universal or guaranteed outcome.
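To make the scale of the reported reduction concrete, the sketch below applies it to a hypothetical alert volume. The 6,000 alerts per month figure is an illustrative assumption, not a value from the AACT study, and the 40% alternative is included only to show how sensitive the result is to the reduction achieved in a given environment.

```python
def alerts_after_triage(total_alerts: int, reduction: float = 0.61) -> int:
    """Alerts still shown to analysts after automated triage.

    reduction is the fraction of alerts removed from the analyst queue
    (0.61 is the single-SOC result reported for AACT).
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(total_alerts * (1.0 - reduction))


# Hypothetical monthly volume for one analyst (assumption, not study data)
monthly_alerts = 6_000

at_reported_rate = alerts_after_triage(monthly_alerts)            # 2340
at_lower_rate = alerts_after_triage(monthly_alerts, reduction=0.40)  # 3600
```

Because the result scales linearly with the reduction fraction, an environment that achieves only two thirds of the reported reduction leaves roughly half again as many alerts in the analyst queue.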
4.4 AI Governance and Regulatory Context
Evidence NIST AI RMF 1.0 (January 2023) and the Generative AI Profile (NIST-AI-600-1, July 2024) provide the primary US governance framework for AI systems. The four functions — Govern, Map, Measure, and Manage — require organisations to maintain audit trails, assess risks, and implement controls.
Evidence CISA Zero Trust Maturity Model v2.0 (April 2023) includes Automation and Orchestration as a required cross-cutting capability for federal agencies pursuing zero trust architecture, with a FY2024 implementation deadline.
Evidence The WEF Global Cybersecurity Outlook 2025 found that 66% of organisations expect AI to have major impact on cybersecurity in 2025, yet only 37% have processes in place to assess AI tools before deployment — indicating a significant governance gap.
4.5 Economic Context of Cyber Incidents
Evidence WEF Global Cybersecurity Outlook 2025: 72% of respondents report an increase in organisational cyber risks; ransomware is the top concern for 45% of respondents.
Evidence ENISA Threat Landscape 2024 documented approximately 2,580 incidents across EU member states in its reporting period, with availability attacks, ransomware, and data attacks as the top three categories.
Single-source directional Average ransomware recovery costs (excluding ransom payment) were approximately $1.53 million in 2025 (Sophos Active Adversary data), down from $2.73 million in 2024, with improved recovery speed cited as a contributing factor. Note: Sophos is a security vendor; IR case selection may not represent all organisations.
Note: The most widely cited breach cost figures (IBM Cost of a Data Breach Report) derive from a vendor-commissioned study and have been excluded from this research as a primary evidence source.
Evidence-Based Benefit Categories
5.1 Operational Efficiency
5.1.1 Alert Triage and Processing Capacity
Single-source directional The most tractable and best-evidenced efficiency gain from agentic automation is in alert triage. The AACT academic study (arXiv 2505.09843) demonstrated a 61% reduction in alerts shown to analysts with a 1.36% false negative rate over millions of alerts in a real SOC environment.
Inference If analysts spend a material portion of their time on alert triage (estimates range from 27%–33% from available industry surveys, though these are not peer-reviewed), a 60% reduction in triage volume would release significant analyst capacity — equivalent to one to two additional analyst-equivalents for a team of eight.
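The arithmetic behind this inference can be written out directly. This is a sketch under the paragraph's own assumptions (a team of eight, 27%–33% of time spent on triage, a 60% reduction in triage volume) plus one further simplifying assumption that triage time falls in proportion to triage volume; the function name is illustrative.

```python
def freed_analyst_equivalents(team_size: int,
                              triage_share: float,
                              triage_reduction: float) -> float:
    """Analyst-equivalents released when triage volume falls.

    Assumes triage time scales linearly with triage volume.
    """
    return team_size * triage_share * triage_reduction


low = freed_analyst_equivalents(8, 0.27, 0.60)   # about 1.3
high = freed_analyst_equivalents(8, 0.33, 0.60)  # about 1.6
```

Both ends of the range fall between one and two analyst-equivalents, which is the basis for the figure quoted above.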
Hypothesis Whether similar triage reduction rates are achievable in agentic-first platforms operating across heterogeneous enterprise environments at scale remains to be demonstrated in peer-reviewed studies.
5.1.2 Incident Response Speed
Single-source directional The Sophos Active Adversary Report (2023) indicates that median attacker dwell time was eight days for all attacks and five days for ransomware in H1 2023, the lowest recorded since systematic tracking began. This improvement is attributed to better detection tooling, but the methodology does not isolate the specific contribution of automated versus human detection.
Inference Machine-speed detection and automated containment (operating in seconds rather than hours or days) would logically reduce dwell time relative to human-only SOC operations. The relationship between reduced dwell time and reduced breach cost is established in principle; quantification requires organisation-specific baseline data.
5.1.3 24/7 Coverage Without Linear Staffing Cost
Inference Maintaining 24/7 human security coverage requires approximately four to five analysts per coverage position (accounting for shift rotation, leave, training, and turnover). Agentic automation that operates continuously without shift constraints provides 24/7 coverage at platform cost rather than a multiple of human staffing. For MSPs and MSSPs, this changes the economic model from linear (more customers requires proportionally more analysts) toward non-linear (agent capacity scales faster than analyst headcount).
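The four-to-five figure follows from simple shift arithmetic. The sketch below assumes a 40-hour contracted week and an availability factor to account for leave, training, and turnover; the 85% availability value is an illustrative assumption rather than a sourced benchmark.

```python
HOURS_PER_WEEK = 24 * 7  # 168 hours of coverage per seat, per week


def fte_per_seat(contracted_hours: float = 40.0,
                 availability: float = 1.0) -> float:
    """Analysts needed to keep one seat staffed around the clock.

    availability is the fraction of contracted hours actually spent on
    shift after leave, training, and vacancy gaps.
    """
    productive_hours = contracted_hours * availability
    return HOURS_PER_WEEK / productive_hours


ideal = fte_per_seat()                       # 4.2 with no absence at all
realistic = fte_per_seat(availability=0.85)  # about 4.9
```

Even with perfect attendance a single seat needs more than four analysts, so any realistic absence rate pushes the requirement towards five.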
5.2 Economic Impact
5.2.1 Analyst Capacity and Cost
Evidence US Bureau of Labor Statistics (May 2024) reports the median annual wage for information security analysts at $124,910, with 29% employment growth projected 2024–2034. Fully loaded cost (salary plus benefits and overhead) is typically 1.35–1.5× base salary.
Inference If agentic automation doubles an MSP analyst’s effective coverage (an illustrative assumption with no specific peer-reviewed evidence), a $124,910 median-salary analyst effectively becomes the economic equivalent of two analysts for monitoring and triage functions.
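The cost figures in this subsection can be reproduced as follows. The sketch uses the BLS median salary and the 1.35–1.5× loading range stated above; the coverage multiplier of 2 is the unevidenced doubling scenario, included purely for illustration, and the function names are hypothetical.

```python
BLS_MEDIAN_SALARY = 124_910  # USD per year, BLS May 2024


def fully_loaded_cost(base_salary: float, multiplier: float) -> float:
    """Salary plus benefits and overhead."""
    return base_salary * multiplier


def cost_per_coverage_unit(loaded_cost: float,
                           coverage_multiplier: float) -> float:
    """Loaded cost spread over the coverage one analyst delivers."""
    return loaded_cost / coverage_multiplier


loaded_low = fully_loaded_cost(BLS_MEDIAN_SALARY, 1.35)   # 168,628.50
loaded_high = fully_loaded_cost(BLS_MEDIAN_SALARY, 1.50)  # 187,365.00

# Doubling scenario (assumption): cost per unit of coverage halves
unit_low = cost_per_coverage_unit(loaded_low, 2.0)
unit_high = cost_per_coverage_unit(loaded_high, 2.0)
```

Keeping the coverage multiplier as an explicit parameter makes it straightforward to test less optimistic scenarios, such as a multiplier of 1.25 or 1.5.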
5.2.2 Skills Gaps and Breach Probability
Evidence ISACA 2024 found that 22% of organisations with critical or significant skills gaps experienced material breaches, versus 17% of organisations with no skills gaps — a 5 percentage point differential. This establishes an association (though not fully demonstrated causality) between skills gaps and breach probability.
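A percentage point differential and a relative difference are easily confused when this statistic is quoted. The short sketch below computes both from the two ISACA rates; the function name is illustrative, and neither figure implies causality.

```python
def compare_breach_rates(rate_with_gap: float,
                         rate_without_gap: float) -> tuple[float, float]:
    """Return (percentage point difference, relative difference in %)."""
    points = (rate_with_gap - rate_without_gap) * 100
    relative = (rate_with_gap / rate_without_gap - 1) * 100
    return round(points, 1), round(relative, 1)


points, relative = compare_breach_rates(0.22, 0.17)
# points is 5.0 (percentage points); relative is 29.4 (% higher)
```

The same data therefore supports either "5 percentage points higher" or "about 29% higher in relative terms"; external communications should state which is meant.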
Independent quantification of per-breach average costs by sector and organisation size is limited in non-vendor peer-reviewed literature. The IBM Cost of a Data Breach Report, the most widely cited source, has been excluded from this research due to vendor conflict of interest. This is a genuine evidence gap for economic impact modelling.
5.3 Security Outcome Impact
5.3.1 Detection Coverage
Evidence (academic) The USENIX Security ’22 study (Oxford University) indicates that high false positive rates paradoxically reduce effective detection: when analysts experience alert fatigue, genuine threats may be dismissed or deprioritised. By inference, automated triage that filters false positives with high accuracy should improve genuine threat detection rates.
5.3.2 Consistency of Response
Inference Human analysts vary in capability, attention, and performance — particularly across shift boundaries, after overnight periods, and when fatigued. AI agents executing defined response playbooks are consistent by design. Whether this consistency translates to materially better security outcomes in practice depends on playbook quality and coverage.
5.3.3 AI-Assisted Knowledge Work Productivity
Evidence (different domain) A study by Brynjolfsson, Li & Raymond (NBER Working Paper No. w31161, 2023; MIT and Stanford researchers) evaluated the staggered introduction of a generative AI assistant across 5,179 customer support agents at a Fortune 500 enterprise. AI assistance improved productivity by 14% on average, with a 34% improvement for less-experienced workers, reduced customer escalations, and improved retention.
This represents the most rigorous non-conflicted productivity evidence available. It supports a directional expectation that AI assistance improves knowledge worker performance, with the largest benefits for less-experienced workers — who are disproportionately represented in analyst teams experiencing skills shortages. Direct transposition to cybersecurity requires explicit domain disclosure and further validation.
5.4 Enterprise and Government Impact
Evidence GAO reports (2023–2024) identify consistent gaps in federal agency cybersecurity performance measurement and zero trust implementation. As of 2024, four DOD programs had not developed plans to implement zero trust architecture by the 2027 deadline.
Evidence CISA Zero Trust Maturity Model v2.0 requires Automation and Orchestration capabilities, creating a regulatory driver for agentic approaches in US federal contexts.
Evidence More than 50% of Australian government agencies experience critical cybersecurity skills shortages (State of the Service Report 2023–24).
5.5 MSP / MSSP Business Model Impact
Evidence (directional) Kaseya 2023 MSP Benchmark Survey (n=1,091) indicates that only 8% of MSP executives report technicians managing more than 750 endpoints, with the gold standard cited at approximately 350 managed endpoints per technician.
Inference Agentic automation handling first-line monitoring, alert triage, and routine response could allow MSP/MSSP analysts to oversee higher customer ratios. Increasing managed endpoints per analyst from 350 to 700 would double effective revenue capacity per analyst. Whether this ratio is achievable in practice depends on the proportion of analyst time currently consumed by automatable tasks.
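The capacity effect of the endpoint ratio can be sketched as below. The 350 figure is the Kaseya benchmark cited above and 700 is the doubling scenario; the team of 15 analysts is a hypothetical example, and the function names are illustrative.

```python
import math


def endpoint_capacity(analysts: int, endpoints_per_analyst: int) -> int:
    """Total endpoints a team can manage at a given ratio."""
    return analysts * endpoints_per_analyst


def analysts_needed(endpoints: int, endpoints_per_analyst: int) -> int:
    """Whole analysts required to cover a given endpoint estate."""
    return math.ceil(endpoints / endpoints_per_analyst)


team = 15  # hypothetical team size

baseline = endpoint_capacity(team, 350)   # 5,250 endpoints
doubled = endpoint_capacity(team, 700)    # 10,500 endpoints

# Covering the baseline estate at the higher ratio
reduced_team = analysts_needed(baseline, 700)  # 8 analysts
```

The same ratio change can be read either as more endpoints per team or as fewer analysts per estate; which applies depends on whether the provider is growing or consolidating.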
MSP / MSSP Impact Model
The scenarios below are modelling exercises based on inference and partial evidence. They should not be presented as guaranteed outcomes. Actual results depend on: the proportion of analyst time currently spent on automatable tasks (not independently measured), automation reliability in the specific environment, playbook quality, and customer environment complexity.
Representative MSSP Baseline
For a representative MSSP with 15 analysts managing approximately 30–50 customers, using evidence-based benchmarks:
- Median US analyst fully loaded cost: approximately $169,000–$187,000 annually (BLS median salary $124,910 × 1.35–1.5 for benefits and overhead)
- Managing approximately 350 endpoints per technician (Kaseya 2023 gold standard benchmark)
- Average alert volume: estimated 5,000–8,000 alerts per analyst per month (industry survey; varies widely)
Estimated Analyst Time Distribution
| Task Category | Estimated % Analyst Time | Evidence Basis |
|---|---|---|
| Alert triage and first-line investigation | 27–33% | Industry surveys (partially disclosed methodology) |
| Report generation and documentation | 10–15% | Inference from knowledge work automation research |
| Compliance evidence collection | 10–20% | GRC automation literature (directional) |
| Ticket management and escalation routing | 5–10% | MSP benchmark data (indirect) |
| Genuine investigation and containment | 35–50% | Residual estimation |
These percentages are estimates based on partial evidence. No peer-reviewed study specifically quantifies MSSP analyst time distribution across task types.
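Because each row is a range, it is worth checking that the table can describe a full working week. The sketch below confirms that the ranges can sum to 100% and derives the share of time left for the four non-residual categories once the residual row is respected. Treating those four categories as the candidates for automation is an assumption of this sketch, not a finding.

```python
TIME_SHARES = {  # (low %, high %) per task category, from the table above
    "Alert triage and first-line investigation": (27, 33),
    "Report generation and documentation": (10, 15),
    "Compliance evidence collection": (10, 20),
    "Ticket management and escalation routing": (5, 10),
    "Genuine investigation and containment": (35, 50),
}
RESIDUAL = "Genuine investigation and containment"

low_total = sum(low for low, _ in TIME_SHARES.values())     # 87
high_total = sum(high for _, high in TIME_SHARES.values())  # 128
ranges_can_sum_to_100 = low_total <= 100 <= high_total

# Range for the four non-residual categories taken on their own
other_low = low_total - TIME_SHARES[RESIDUAL][0]    # 52
other_high = high_total - TIME_SHARES[RESIDUAL][1]  # 78

# Range implied by the residual row (100% minus 35-50%)
implied_low = 100 - TIME_SHARES[RESIDUAL][1]   # 50
implied_high = 100 - TIME_SHARES[RESIDUAL][0]  # 65

# Both constraints must hold, so take the overlap
candidate_share = (max(other_low, implied_low), min(other_high, implied_high))
```

Under these estimates the non-residual categories account for roughly 52%–65% of analyst time, which bounds any efficiency scenario built on the table.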
Efficiency Scenarios
Enterprise Impact Model
Representative Enterprise Baseline
For a mid-market enterprise with 5,000 endpoints, 8 security analysts, mixed on-premises/cloud infrastructure, and compliance obligations across 2–3 frameworks (NIST CSF, SOC 2, ISO 27001):
Analyst Capacity Uplift Scenarios
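- Conservative (20–25% of analyst time released): Equivalent to roughly 2 additional analyst-equivalents from the existing 8-person team (8 × 0.20–0.25 = 1.6–2.0). Capacity value: approximately $250,000–$375,000 in total, i.e. two analyst-equivalents valued between the BLS median salary ($124,910) and the fully loaded cost at 1.5× (about $187,000).
- Expected (35–40% of analyst time released): Equivalent to roughly 3 additional analyst-equivalents (8 × 0.35–0.40 = 2.8–3.2), approximately $375,000–$560,000 in freed capacity value on the same per-analyst basis.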
Strategic Outcomes Beyond Cost
- Board-level visibility: Persona-tuned dashboards enable CISOs to present real-time security posture without manual aggregation
- AI governance compliance: Shadow AI discovery addresses the WEF-evidenced gap (only 37% have AI assessment processes)
- Regulatory readiness: Automated compliance evidence supports NIS2, SOC 2, ISO 27001, HIPAA, and other framework obligations
- Skills gap bridging: NBER research (directional) suggests AI assistance provides the greatest uplift — up to 34% — for less-experienced workers
Inference Compliance automation may reduce compliance-related labour costs by 20–25% (directional, GRC automation literature). For an enterprise spending $300,000–$500,000 annually on compliance activities, this suggests $60,000–$125,000 in potential annual savings. These figures require organisation-specific validation.
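The savings range pairs the lowest spend with the lowest reduction and the highest spend with the highest reduction. A minimal sketch of that calculation follows; the function name is illustrative and the inputs are the directional figures quoted above.

```python
def compliance_savings_range(spend_low: float, spend_high: float,
                             reduction_low: float = 0.20,
                             reduction_high: float = 0.25) -> tuple[int, int]:
    """Lowest and highest plausible annual saving, in whole dollars."""
    return round(spend_low * reduction_low), round(spend_high * reduction_high)


savings = compliance_savings_range(300_000, 500_000)  # (60000, 125000)
```

An organisation with a known compliance budget can pass the same value for both spend arguments to obtain a range driven only by the reduction assumption.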
Government and Public Sector Impact Model
Evidence More than 50% of Australian government agencies experience critical cybersecurity skills shortages (State of the Service Report 2023–24).
Evidence US GAO 2024 identified that four major DOD programs had not developed zero trust architecture implementation plans by their 2027 deadline, and that the National Cybersecurity Strategy implementation plan lacked outcome-oriented performance measures.
Evidence ENISA 2024 documented that 220 of approximately 2,580 EU cyber incidents specifically targeted two or more EU member states simultaneously — indicating cross-border threat actors targeting public sector infrastructure.
Regulatory Drivers
- US Federal: CISA Zero Trust Maturity Model v2.0 (April 2023) requires Automation and Orchestration as a cross-cutting capability for all federal agencies
- EU: NIS2 Directive creates mandatory incident reporting and security measure requirements for essential and important entities, including public sector
- NIST AI RMF: Governance framework for AI system deployment that government agencies are increasingly expected to follow
- Australia: Jobs and Skills Australia (2024) documents sustained demand for cybersecurity skills against constrained public sector hiring capacity
Government-Specific Adoption Considerations
- Procurement processes typically require independent evaluation and certification before deployment
- Sovereignty and data residency requirements may constrain deployment architecture
- Authority matrices must reflect legal constraints on automated action, particularly in defence contexts
- AI governance requirements (NIST AI RMF compliance) should be verified and documented in procurement
Risk, Governance, and Responsible AI
Any responsible evaluation of agentic security automation must acknowledge the real risks. These are not hypothetical concerns — they are documented governance requirements and operational realities.
Risks of Agentic Automation
- Over-reliance risk: Organisations that reduce analyst headcount based on automation capability projections take on risk if the automation underperforms, is unavailable, or is circumvented by adversaries. NIST AI RMF explicitly addresses this through the “Measure” function, requiring ongoing performance monitoring.
- Adversarial risk: AI systems can be manipulated. Adversaries may probe agentic systems to identify policy boundaries, exploit automation gaps, or deliberately trigger containment actions to create operational disruption. This is an open research question.
- False negative risk: No automated triage system achieves zero false negative rates. The AACT study reported 1.36% false negatives over millions of alerts — a low rate that nevertheless means genuine threats are occasionally missed. System design must account for this through escalation and periodic human review.
- Governance and accountability: When an autonomous agent takes a containment action that turns out to be incorrect, clear accountability frameworks are needed. Reversibility, audit logging, and defined escalation paths are minimum requirements.
- Regulatory uncertainty: Autonomous security agents are not yet specifically regulated in most jurisdictions, but this may change. EU AI Act classifications may apply to high-impact automated decision systems.
- Skills erosion: If human analysts rely heavily on agents for triage and investigation, the skills needed to supervise, govern, and if necessary replace automated systems may atrophy over time unless they are actively maintained.
Responsible Deployment Framework
Evidence suggests (NIST AI RMF 1.0; CISA Zero Trust Maturity Model; WEF 2025 guidance) that responsible deployment of agentic AI in security operations requires:
- Explicit authority governance: Every autonomous action capability must be explicitly configured, constrained, and documented
- Continuous performance monitoring: Automated system outcomes must be regularly reviewed against human-verified ground truth
- Reversibility by design: Every agent action must have a defined undo path
- Human oversight at exception points: Genuine decision uncertainty must trigger human escalation, not autonomous default action
- Skills preservation: Human analysts must maintain sufficient skills to supervise, govern, and when necessary replace automated systems
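Several of these requirements can be expressed as a simple authority check. The following is an illustrative sketch only, covering explicit authority, reversibility, escalation under uncertainty, and audit logging. `ActionPolicy`, `Agent`, the action name `isolate_host`, and the 0.90 threshold are hypothetical and are not drawn from NIST, CISA, WEF, or any product.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class ActionPolicy:
    """One row of a hypothetical authority matrix."""
    autonomous: bool                       # explicit authority to act unaided
    min_confidence: float                  # below this, a human decides
    undo: Optional[Callable[[str], None]]  # defined undo path, or None


@dataclass
class Agent:
    policies: dict
    audit_log: list = field(default_factory=list)

    def decide(self, action: str, target: str, confidence: float) -> str:
        """Return "execute" or "escalate" and record the decision."""
        policy = self.policies.get(action)
        if policy is None:
            outcome, reason = "escalate", "action not explicitly configured"
        elif not policy.autonomous:
            outcome, reason = "escalate", "human approval required by policy"
        elif policy.undo is None:
            outcome, reason = "escalate", "no undo path defined"
        elif confidence < policy.min_confidence:
            outcome, reason = "escalate", "confidence below threshold"
        else:
            outcome, reason = "execute", "within configured authority"
        # Every decision is logged so outcomes can be reviewed later.
        self.audit_log.append({
            "action": action, "target": target, "confidence": confidence,
            "outcome": outcome, "reason": reason,
        })
        return outcome


agent = Agent(policies={
    "isolate_host": ActionPolicy(autonomous=True, min_confidence=0.90,
                                 undo=lambda host: None),  # placeholder undo
})
```

The default in every uncertain branch is escalation rather than action, so an unconfigured capability or a missing undo path can never result in an autonomous change. Continuous performance monitoring and skills preservation are organisational practices and are not represented in the sketch.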
AI Governance as a Product Feature
Evidence WEF Global Cybersecurity Outlook 2025: 66% of organisations expect AI to have major cybersecurity impact, but only 37% have processes to assess AI tools before deployment. Shadow AI (unauthorised use of AI tools within organisations) is identified as an emerging risk by both ENISA and WEF.
Agentic security platforms that include shadow AI discovery and governance features address a documented enterprise need — the ability to identify and govern AI usage within the organisation — that is distinct from the security operations efficiency benefits.
Metrics Approved for Public Use
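The following metrics are supported by evidence of sufficient quality for use in external communications, provided they include appropriate attribution and context. All derive from Tier 1 or Tier 2 non-vendor-conflicted sources except the Sophos dwell-time figure, which is Tier 3 incident response data from a vendor and is marked accordingly: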
| Metric | Value | Source | Confidence |
|---|---|---|---|
| Global cybersecurity workforce gap | 4.8 million unfilled roles | ISC2 Workforce Study 2024 | High |
| Security teams reporting skills gaps | 90% | ISC2 2024 | High |
| Professionals reporting increased stress | 66% | ISACA 2024 | High |
| Analysts reporting burnout | ~50% | ISC2 2024 | High |
| Organisations with increased cyber risk | 72% | WEF Global Cybersecurity Outlook 2025 | High |
| EU cybersecurity workforce shortage | 300,000 | ENISA 2024 | High |
| Organisations ranking ransomware #1 | 45% | WEF 2025 | High |
| US cybersecurity employment growth (projected) | 29% by 2034 | US Bureau of Labor Statistics, May 2024 | High |
| Alert volume reduction with automated triage | 61% | arXiv 2505.09843 (2025) | Medium |
| AI knowledge worker productivity uplift | 14% average; 34% novice | Brynjolfsson, Li & Raymond, NBER 2023 | Medium (different domain) |
| Organisations with AI assessment processes | 37% | WEF Global Cybersecurity Outlook 2025 | Medium |
| Median attacker dwell time (H1 2023) | 8 days | Sophos Active Adversary Report 2023 | Medium (IR firm data) |
Any use of these metrics in marketing materials must include source attribution. Metrics must not be combined into derived calculations without explicit disclosure of the calculation methodology. Medium-confidence metrics must note their limitation when cited in formal contexts (investor briefings, government procurement).
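The attribution rules above can be enforced mechanically when metrics are pulled into marketing or briefing copy. The sketch below is one hypothetical way to do so; `cite_metric` and its output format are illustrative and do not describe an existing tool.

```python
from typing import Optional


def cite_metric(metric: str, value: str, source: str, confidence: str,
                limitation: Optional[str] = None) -> str:
    """Format a metric for external use, enforcing the usage rules.

    Raises ValueError if the source is missing, or if a medium-confidence
    metric is cited without its limitation.
    """
    if not source.strip():
        raise ValueError("source attribution is required")
    text = f"{metric}: {value} (Source: {source})"
    if confidence.lower().startswith("medium"):
        if not limitation:
            raise ValueError("medium-confidence metrics must state a limitation")
        text += f" Note: {limitation}."
    return text


citation = cite_metric(
    "Alert volume reduction with automated triage", "61%",
    "arXiv 2505.09843 (2025)", "Medium",
    limitation="result from a single SOC environment",
)
```

Failing loudly on a missing source or limitation is a deliberate choice: it stops an unattributed figure from reaching external copy by default.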
ROI Calculator Default Inputs
The following parameters may be used as evidence-based defaults in an ROI modelling tool, provided the calculator is transparent about the basis for each default and allows user override with organisation-specific data:
| Parameter | Default Value | Source | Notes |
|---|---|---|---|
| US cybersecurity analyst median salary | $124,910/year | BLS May 2024 | Adjust by geography; fully loaded = 1.35–1.5× |
| False positive alert rate | 40–60% | USENIX ’22; ACM 2025 | Environment-specific; use as range |
| % analyst time on triage | 27–33% | Industry surveys (methodology partially disclosed) | Conservative end recommended for modelling |
| Automated triage reduction potential | 40–60% | AACT study (arXiv 2505.09843) | Use conservative end (40%) pending broader validation |
| Analyst productivity uplift (AI-assisted) | 14% average | NBER w31161, Brynjolfsson 2023 | Different domain; directional only; include caveat |
| MSP managed endpoints per technician | 350 | Kaseya 2023 MSP Benchmark Survey | Industry benchmark; vendor-sponsored survey; user-adjustable |
| % analysts reporting burnout | ~50% | ISC2 2024 | Turnover risk input |
| 24/7 FTE per coverage position | 4–5 | Shift planning arithmetic | Inference; not peer-reviewed; standard HR planning |
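A calculator built on these defaults might hold them in a single immutable structure and let users override individual values. The sketch below assumes Python dataclasses and uses the conservative end of each range; `RoiInputs`, `annual_capacity_value`, and the formula itself are illustrative assumptions, and team size has no default because the table does not supply one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RoiInputs:
    team_size: int                        # user-supplied; no evidence-based default
    median_salary_usd: float = 124_910.0  # BLS May 2024
    loaded_multiplier: float = 1.35       # conservative end of 1.35-1.5
    triage_time_share: float = 0.27       # conservative end of 27-33%
    triage_reduction: float = 0.40        # conservative end of 40-60%


def annual_capacity_value(inputs: RoiInputs) -> float:
    """Value of analyst time released from triage, in USD per year.

    Assumed method: team size x share of time on triage x triage
    reduction x fully loaded analyst cost.
    """
    freed_analysts = (inputs.team_size * inputs.triage_time_share
                      * inputs.triage_reduction)
    loaded_cost = inputs.median_salary_usd * inputs.loaded_multiplier
    return freed_analysts * loaded_cost


defaults = RoiInputs(team_size=8)
baseline_value = annual_capacity_value(defaults)

# User override with organisation-specific data (hypothetical values)
custom = replace(defaults, median_salary_usd=95_000.0, triage_time_share=0.30)
custom_value = annual_capacity_value(custom)
```

Using `replace` on a frozen dataclass keeps the published defaults intact while producing a separate, auditable set of overridden inputs, and the docstring discloses the calculation method alongside the result.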
Metrics Excluded from External Claims
The following metrics are not suitable for use in external marketing, investor, or procurement communications without independent third-party validation:
| Metric | Reason for Exclusion | Required Action |
|---|---|---|
| Average data breach cost ($4.44M) | Sourced to IBM Cost of a Data Breach Report; vendor-commissioned research with material conflict of interest | Find independent source or remove |
| AI automation savings ($1.9M per breach) | Derived from excluded IBM source | Remove; replace with non-conflicted evidence or model explicitly |
| Adversary breakout time (29 minutes) | Sourced to CrowdStrike 2026 Global Threat Report; excluded vendor source | Replace with Sophos Active Adversary IR data (8-day median dwell time), cited with its Tier 3 vendor caveat |
| “96% auto-resolution rate” | Vendor-claimed; no independent verification; methodology undisclosed | Independent third-party audit required before external use |
| “3.4-second containment time” | Demo scenario; may not represent production performance | Real-world measurement across customer environments required |
| “1,284 actions per day” | Vendor capability claim; context and methodology unclear | Define conditions precisely; independent validation recommended |
Research Limitations
This research is subject to the following limitations, which must be acknowledged in any derivative materials:
- No primary data collection. All findings are based on secondary research. No surveys, interviews, or empirical measurement of Athena Agentic customer outcomes were conducted.
- Domain transfer problem. The most robust AI productivity evidence (Brynjolfsson et al., 2023) comes from customer support, not cybersecurity. Direct transposition of 14% productivity gains to security operations is inference, not evidence.
- Alert triage research specificity. The AACT study (61% alert reduction) was conducted in a specific SOC environment with specific tool sets. Generalisability to diverse enterprise or MSSP environments is not established.
- Vendor conflict in key metrics. Average breach costs and adversary speed statistics (IBM, CrowdStrike) are the most commonly cited industry metrics but come from excluded sources. This creates a genuine gap in economic impact quantification that is difficult to fill with fully independent data.
- MSSP economics gap. No independent peer-reviewed research specifically quantifies MSSP analyst time distribution across task types, making efficiency models heavily dependent on inference and industry survey data with limited methodology disclosure.
- Temporal relevance. Some evidence pre-dates significant AI capability advances (post-2022 GPT-era). Some older studies on alert fatigue and dwell time may not reflect current SOC environments.
- Geographic concentration. Much of the strongest evidence is US-centric (BLS data, ISC2 US subset, CISA guidance). Application to UK, EU, Australian, or government contexts requires geographic adjustment.
- Agentic AI is nascent. As of mid-2026, agentic security AI at production scale is a recent development. Long-term outcome data, independent audits of production performance, and peer-reviewed case studies are not yet available in the literature.
- Counterfactual gap. We cannot directly observe what security outcomes would have been achieved without agentic automation in specific environments. All benefit claims are necessarily comparative or inferential.
Conclusion
The evidence base for adopting agentic-first cybersecurity platforms is strongest in the foundational problem areas — workforce shortage, analyst burnout, alert volume, and the need for continuous automated coverage — where government, academic, and professional association sources converge on a consistent picture of structural insufficiency in the current human-centric SOC model.
The evidence for specific operational and financial benefits of agentic automation is directional and requires customer-specific validation. The most robust transfer of evidence comes from: (a) academic automated triage research demonstrating genuine reduction in analyst burden; (b) AI productivity research demonstrating measurable uplift in knowledge work; and (c) economic modelling based on verified labour cost data and transparent assumptions.
For MSPs and MSSPs, the structural economic argument is compelling: the workforce shortage makes proportional headcount scaling economically and practically unviable, and agentic automation is architecturally suited to the multi-tenant, high-volume, continuous monitoring use case. For enterprises, the compliance automation and 24/7 coverage arguments are strongest. For government, the zero trust automation mandate and compliance burden reduction are the most robustly policy-supported benefits.
The research gap most worth closing is production performance data: deploying instruments to measure analyst time before and after automation deployment, conducting independent third-party performance audits, publishing case studies with disclosed methodology, and engaging academic institutions for peer-reviewed evaluation. These activities would produce some of the first independent, peer-reviewed evidence on agentic security operations at production scale and would materially strengthen credibility with industry.
Full References
All sources below are non-vendor-conflicted primary and secondary sources. Vendor-commissioned research has been excluded; see Section 12 for the excluded metrics register.