Methodology

When Growth Meets Gravity: Registries, Models, and the New Software Infrastructure Burden

This chapter is based on Sonatype’s analysis of registry consumption and infrastructure load signals drawn from aggregated telemetry across major open source ecosystems (with Maven Central used as a primary lens where noted). The study examined download and re-download behavior over the report’s stated reporting windows, focusing on how automated software delivery systems (CI/CD pipelines, ephemeral build fleets, and dependency managers) amplify demand on shared registry infrastructure.

The Sonatype Security Research Team evaluated registry load and sustainability pressure using four primary measures:

  • Growth and concentration: overall request volume trends and the degree to which traffic is dominated by a small set of high-volume consumers.
  • Re-download intensity: repeat-fetch behavior for the same artifacts, used as a proxy for cache inefficiency and rebuild amplification.
  • Burstiness and hotspots: peak download behavior (e.g., 95th percentile patterns) to distinguish steady consumption from spiky traffic that strains shared infrastructure.
  • Source footprint signals: directional indicators such as distinct IP counts and distribution patterns to infer automation characteristics (shared egress/NAT, centralized runners), without treating IPs as definitive identity.
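
As a rough illustration of the first three measures, the sketch below computes a top-consumer traffic share, a re-download ratio, and a nearest-rank 95th percentile from a toy request log. The record shape (consumer, artifact) and all function names are assumptions made for this example; they are not Sonatype's telemetry schema or its actual calculations.

```python
import math
from collections import Counter


def top_share(requests, k=1):
    """Share of all requests made by the k highest-volume consumers.

    `requests` is a list of (consumer, artifact) pairs (hypothetical shape).
    """
    per_consumer = Counter(consumer for consumer, _ in requests)
    total = sum(per_consumer.values())
    top = sum(count for _, count in per_consumer.most_common(k))
    return top / total if total else 0.0


def redownload_ratio(requests):
    """Share of requests that repeat an already-seen (consumer, artifact) pair."""
    counts = Counter(requests)
    total = sum(counts.values())
    repeats = total - len(counts)  # every fetch beyond the first per pair
    return repeats / total if total else 0.0


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile, e.g. pct=95 for peak download behavior."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Toy log: consumer "a" fetches artifact "x" twice.
log = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")]
concentration = top_share(log, k=1)
repeat_share = redownload_ratio(log)
p95 = percentile_nearest_rank(list(range(1, 21)), 95)
```

A high repeat share with a low distinct-consumer count would point toward cache inefficiency rather than organic growth, which is the distinction the measures above are meant to draw.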

While the chapter focuses on open source registry dynamics, the patterns identified (automation-driven amplification, concentrated demand, and cache fragility) reflect broader structural pressures affecting modern software supply chains. All quantitative results reflect a point-in-time snapshot as of the report’s stated verification date, and are reported in aggregate to avoid attribution to specific organizations or users.

Malware at the Gate: The Evolving Software Supply Chain Attack Surface

This chapter is based on Sonatype’s analysis of malicious open source packages identified through a mix of automated detection and expert review, using publicly observable package metadata and Sonatype threat intelligence. We evaluated packages observed within the report’s stated window using a consistent, multi-label threat taxonomy (one package may map to multiple behaviors), normalized duplicates/variants to avoid inflating counts, and used clustering signals (payload and code reuse, naming patterns, publisher behavior, dependency relationships, and shared infrastructure) to identify coordinated campaigns. Findings are reported in aggregate as a point-in-time snapshot as of the report’s verification date.
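
The counting approach described above (multi-label taxonomy plus duplicate normalization) can be sketched as follows. This is a minimal illustration that assumes duplicates and variants can be recognized by an identical payload hash; the report does not state the actual normalization rule, and the field names and label names here are hypothetical.

```python
from collections import Counter


def label_counts(packages):
    """Collapse variants that share a payload hash, then count behaviors.

    `packages` is a list of dicts with hypothetical keys
    'payload_sha256' and 'labels' (a set of behavior labels).
    Returns (number of unique payloads, Counter of labels).
    """
    merged = {}
    for pkg in packages:
        # Variants with the same payload are counted once; their labels are merged.
        merged.setdefault(pkg["payload_sha256"], set()).update(pkg["labels"])

    counts = Counter()
    for labels in merged.values():
        counts.update(labels)  # one package may contribute to several labels
    return len(merged), counts


sample = [
    {"payload_sha256": "aa11", "labels": {"data_exfiltration"}},
    {"payload_sha256": "aa11", "labels": {"data_exfiltration", "dropper"}},
    {"payload_sha256": "bb22", "labels": {"dropper"}},
]
unique_payloads, counts = label_counts(sample)
```

Because the taxonomy is multi-label, the per-label counts can sum to more than the number of unique packages.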

The Three Layers of Failure in Modern Vulnerability Management

1. The Data Layer

This analysis evaluates the quality and usefulness of vulnerability records for open source by comparing public advisory data with Sonatype’s enriched vulnerability intelligence. We assembled a study set of 1,718 open source–relevant CVE records disclosed within the report’s defined window (January 1, 2025 to December 31, 2025), drawing from publicly available sources (including NVD/CVE metadata and CVSS where present) and Sonatype Security Research. For each CVE, the Sonatype Security Research Team assessed five core dimensions that directly affect whether teams can make consistent remediation decisions:

  • Coverage: whether NVD provides usable CVSS/severity and how often that aligns with Sonatype.
  • Scoring consistency: magnitude and direction of CVSS score drift between NVD and Sonatype, plus resulting severity-category shifts.
  • False positives: records or affected-version claims that would trigger remediation for non-impacted software.
  • False negatives: missing, incomplete, or delayed records/metadata that would cause impacted software to be missed.
  • Timeliness: time between public CVE disclosure and availability of NVD analysis/scoring.

Results are reported at the CVE level using consistent matching rules across sources, with percentages rounded for readability; all findings reflect a point-in-time snapshot verified as of the report’s stated “as of” date.
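
To make the scoring-consistency dimension concrete, the sketch below computes the magnitude and direction of drift between two CVSS base scores and flags any resulting severity-category shift. It assumes CVSS v3.x scores and uses the standard v3.x qualitative rating bands; the function names and the returned structure are illustrative, not Sonatype's implementation.

```python
def severity(score):
    """CVSS v3.x qualitative severity rating for a base score."""
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"


def score_drift(nvd_score, sonatype_score):
    """Magnitude, direction, and category shift between two base scores."""
    delta = round(sonatype_score - nvd_score, 1)  # CVSS scores have one decimal
    if delta > 0:
        direction = "higher"
    elif delta < 0:
        direction = "lower"
    else:
        direction = "same"
    before, after = severity(nvd_score), severity(sonatype_score)
    shift = (before, after) if before != after else None
    return {"delta": delta, "direction": direction, "category_shift": shift}


drift = score_drift(7.5, 9.8)
unchanged = score_drift(5.3, 5.3)
```

A category shift matters more than the raw delta for remediation policy, since many policies trigger on the severity band rather than the numeric score.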

2. The Consumption Layer

This section is based on Sonatype’s analysis of Maven Central download telemetry to measure real-world consumption of known vulnerable vs. fixed component versions. We constructed a dataset of components with publicly disclosed vulnerabilities and an available remediated (fixed) release, then measured how frequently vulnerable versions continued to be downloaded relative to their fixed counterparts over the report’s stated time windows. Downloads are treated as a consumption signal (what build systems actually pull), not as a proxy for unique users, and results are reported in aggregate to quantify avoidable risk—cases where vulnerable versions remain in active use even though safer versions exist.
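
One simple way to express the comparison described above is the share of downloads that went to vulnerable versions out of all downloads of vulnerable and fixed versions. The sketch below assumes per-version download counts for a single component; the data shape and function name are hypothetical, and the report may define the ratio differently.

```python
def vulnerable_share(downloads, vulnerable, fixed):
    """Fraction of (vulnerable + fixed) downloads that hit vulnerable versions.

    `downloads` maps version string -> download count (hypothetical shape);
    `vulnerable` and `fixed` are sets of version strings.
    """
    vuln_total = sum(n for v, n in downloads.items() if v in vulnerable)
    fixed_total = sum(n for v, n in downloads.items() if v in fixed)
    denominator = vuln_total + fixed_total
    return vuln_total / denominator if denominator else 0.0


# Toy counts for one component with a fix available in 1.2.
downloads = {"1.0": 300, "1.1": 100, "1.2": 600}
share = vulnerable_share(downloads, vulnerable={"1.0", "1.1"}, fixed={"1.2"})
```

Because downloads count build-system pulls rather than users, a value like this describes avoidable consumption, not the number of affected organizations.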

3. The Ecosystem Layer

Prevalence of EOL components: We analyzed a representative sample of more than 3,000 enterprise SBOMs. For each SBOM, we examined the fully resolved dependency graph, including all transitive dependencies, and identified the number of package versions that were end-of-life. We calculated the percentage of EOL components per SBOM and then aggregated these results across all enterprises to measure overall EOL prevalence.
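
The per-SBOM calculation can be sketched as below. The example assumes each SBOM has already been resolved to a set of (package, version) pairs and that aggregation is a simple mean across SBOMs; the text does not specify the aggregation statistic, so the mean is an assumption, as are the names and sample data.

```python
from statistics import mean


def eol_percentage(components, eol_versions):
    """Percentage of resolved package versions in one SBOM that are EOL.

    `components` and `eol_versions` are sets of (package, version) tuples.
    """
    if not components:
        return 0.0
    return 100 * len(components & eol_versions) / len(components)


# Hypothetical EOL list and two tiny resolved dependency graphs.
eol = {("libfoo", "1.0"), ("libbar", "2.3")}
sbom_a = {("libfoo", "1.0"), ("libbaz", "4.1"), ("libqux", "0.9"), ("libzed", "3.0")}
sbom_b = {("libbar", "2.3"), ("libbaz", "4.1")}

per_sbom = [eol_percentage(sbom_a, eol), eol_percentage(sbom_b, eol)]
overall = mean(per_sbom)  # assumed aggregation: unweighted mean across SBOMs
```

An unweighted mean treats every SBOM equally regardless of size; weighting by component count would instead emphasize large applications.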

Number of EOL components with unpatched CVEs: We analyzed a database of over 11 million package versions with known end-of-life status and known, unpatched CVEs. This analysis identified approximately 81,000 EOL package versions with unpatched vulnerabilities. To estimate ecosystem-wide impact, we weighted this dataset against the broader population of open-source package versions, normalizing for selection bias introduced by database coverage and sourcing constraints. This produced an estimated total of more than 400,000 end-of-life package versions with unpatched CVEs across open-source ecosystems.

Breakdown of EOL Components by Registry: We analyzed a database of over 11 million package versions with known end-of-life status and grouped them by package registry. Within each ecosystem, we calculated the percentage of package versions that are end-of-life versus those that are currently supported. This resulted in a per-ecosystem end-of-life rate, as shown in the chart.

From Guesswork to Governance: Grounding AI Agents in Real-World Intelligence

We analyzed a sample of enterprise applications scanned over a three-month window (June–August 2025), filtering to valid scans (those with >10 components) to remove setup/test/incomplete results. For apps with multiple stages, we selected the most operationally mature snapshot using the hierarchy compliance > operate > release > build > develop > proxy, and then took each app’s first valid scan within the period. Analysis focused on four ecosystems (Maven, npm, PyPI, NuGet) and used direct dependencies identified by Sonatype’s component recognition as upgrade candidates; apps that migrated into/out of an ecosystem during the window were kept to reflect real-world complexity. 
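
The scan-selection rules above can be read as: drop scans with 10 or fewer components, pick each app's most mature stage among the remaining scans, then take the earliest scan at that stage. The sketch below follows that reading, which is one interpretation of the text; the record fields are hypothetical.

```python
# Stage hierarchy from most to least operationally mature.
STAGE_ORDER = ["compliance", "operate", "release", "build", "develop", "proxy"]
STAGE_RANK = {stage: rank for rank, stage in enumerate(STAGE_ORDER)}


def select_scans(scans):
    """Return one scan per app: earliest valid scan at the most mature stage.

    Each scan is a dict with hypothetical keys 'app', 'stage',
    'date' (ISO 8601 string), and 'components' (int).
    """
    valid_by_app = {}
    for scan in scans:
        if scan["components"] > 10:  # validity filter: more than 10 components
            valid_by_app.setdefault(scan["app"], []).append(scan)

    selected = {}
    for app, app_scans in valid_by_app.items():
        best_rank = min(STAGE_RANK[s["stage"]] for s in app_scans)
        candidates = [s for s in app_scans if STAGE_RANK[s["stage"]] == best_rank]
        # ISO dates sort chronologically as strings.
        selected[app] = min(candidates, key=lambda s: s["date"])
    return selected


scans = [
    {"app": "a", "stage": "build", "date": "2025-06-01", "components": 50},
    {"app": "a", "stage": "release", "date": "2025-07-10", "components": 60},
    {"app": "a", "stage": "release", "date": "2025-06-20", "components": 5},
    {"app": "a", "stage": "release", "date": "2025-08-01", "components": 60},
    {"app": "b", "stage": "build", "date": "2025-06-15", "components": 3},
]
chosen = select_scans(scans)
```

In this toy input, app "b" drops out entirely because its only scan fails the validity filter.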

We compared five upgrade strategies: No Breaking Changes (highest version score without breaking changes), Latest (most recent by publication date), Sonatype Best (highest version score regardless of breaking changes), Sonatype Security Hybrid (use No Breaking Changes only if it achieves a perfect security score of 100, otherwise fall back to Best), and an LLM strategy where GPT-5 (reasoning_effort=medium) returned a JSON recommendation (version, confidence, short rationale) per dependency (≈37,000 components, processed asynchronously). Breaking-change effort was modeled using four buckets (L1: 0–5, L2: 6–20, L3: 21–100, L4: 101+ changes) mapped to estimated hours and cost at $94/hr (conservative lower bound), with SemVer fallbacks when telemetry was unavailable (patch→L1, minor→L2, major→L3; L4 requires explicit data).
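
The effort model and the Security Hybrid rule can be sketched as follows. The sketch assumes the four buckets correspond to L1 through L4 in ascending order, which is inferred from the SemVer fallback mapping. The hours-per-bucket values are placeholders invented for illustration, since the report does not list them; only the $94/hr rate and the bucket boundaries come from the text.

```python
HOURLY_RATE = 94  # USD per hour, from the methodology

# Upper bound of breaking changes for each bucket; anything above 100 is L4.
BUCKET_BOUNDS = [(5, "L1"), (20, "L2"), (100, "L3")]
SEMVER_FALLBACK = {"patch": "L1", "minor": "L2", "major": "L3"}

# Placeholder hours per bucket: NOT from the report, illustration only.
HYPOTHETICAL_HOURS = {"L1": 1, "L2": 4, "L3": 16, "L4": 40}


def effort_bucket(breaking_changes=None, semver_bump=None):
    """Bucket from observed breaking changes, else from the SemVer bump type."""
    if breaking_changes is not None:
        for upper, label in BUCKET_BOUNDS:
            if breaking_changes <= upper:
                return label
        return "L4"  # reachable only with explicit breaking-change data
    return SEMVER_FALLBACK[semver_bump]


def estimated_cost(bucket):
    """Estimated cost in USD using the placeholder hours table."""
    return HYPOTHETICAL_HOURS[bucket] * HOURLY_RATE


def security_hybrid(no_breaking, best):
    """Prefer the no-breaking-changes candidate only at a perfect score."""
    return no_breaking if no_breaking["security_score"] == 100 else best


pick = security_hybrid(
    {"version": "1.2.9", "security_score": 80},
    {"version": "2.0.0", "security_score": 100},
)
```

Here the hybrid rule falls back to the Best candidate because the non-breaking option does not reach a security score of 100.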

Security outcomes were measured via a 0–100 security score derived from Sonatype vulnerability intelligence, combining the worst-severity issue with the count of distinct vulnerability types (log-transformed to reflect diminishing marginal impact). Strategy comparisons used Welch’s t-tests across primary outcomes (security score change and breaking-change count) at α=0.05.
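
For reference, Welch's t-test compares two group means without assuming equal variances. The sketch below computes the test statistic and the Welch–Satterthwaite degrees of freedom using only the standard library; the sample values are made up, and this is an illustration of the general technique rather than the study's analysis code.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # sample variance divided by group size
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up outcome values for two strategies.
strategy_a = [1, 2, 3, 4, 5]
strategy_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(strategy_a, strategy_b)
```

Turning the statistic into a p-value for comparison against α=0.05 requires the t-distribution; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.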
