Open Source Vulnerabilities: Persistent Risks and Insights

Introduction

Log4Shell was supposed to be the wake-up call that changed everything. Four years later, the data says otherwise. In 2025, roughly 13% of Log4j downloads still went to vulnerable versions, even though fixed releases have been available for nearly four years.

Sonatype’s research shows this isn’t just a lingering Log4j problem. When we zoom out across the broader ecosystem, the pattern repeats: about 95% of vulnerable components downloaded already had a safer version available, while only around 0.5% of components truly lack a fix — the rare “no path forward” edge cases.

In fact, the Log4j vulnerability doesn’t even crack the top few anymore. Sonatype Security Research examined some of the most frequently downloaded avoidable vulnerabilities — they have collectively been downloaded more than 2.94 billion times this year or since their patches were released (whichever is more recent). Every one of those downloads represents unnecessary risk: teams pulling vulnerable versions when fixed ones already exist, and have for years.

This isn’t fundamentally an open source vulnerability problem; it’s a consumption and dependency-management problem. With new critical vulnerabilities surfacing regularly, such as the recently disclosed React2Shell set, and with AI development taking over, the problem is poised to escalate.

To see how big that problem still is — and where it’s worst — we looked at 2025 Log4j download patterns around the world and compared them to some of the most overused vulnerable Java components. The infographic below walks through what we found.

From "Internet on Fire" to Enduring Log4j Risk

When Log4Shell landed in December 2021, it was the nightmare scenario security teams had warned about for years: a critical flaw in a ubiquitous Java library. Exploitation was immediate and global. Governments and enterprises scrambled to identify where Log4j was hiding in their applications, ship emergency patches, and stand up crisis war rooms. For a few weeks, headlines about the “internet on fire” were not an exaggeration.

The fallout reshaped the conversation about software risk. Log4Shell kicked off a new era of software supply chain scrutiny: SBOM mandates in EO 14028, tighter oversight in Europe through NIS2 and the CRA, and a wave of investment in open source security. It quickly became a reference point for how a single vulnerable component could ripple through the entire digital ecosystem.

But the fire never fully went out. Despite all the awareness, tooling, and regulation, our 2025 download data shows a stubborn baseline of vulnerable Log4j downloads that just won’t disappear. And Log4j is only part of the picture: we also see massive ongoing use of other components with open source vulnerabilities that already have fixes available.

That’s what makes Log4Shell such a powerful example of persistent risk. On one side, there’s unfixed risk: vulnerabilities that never get patched upstream. On the other, there’s corrosive risk: vulnerabilities that do have fixes, but continue to spread because consumers don’t move. The Log4j vulnerability — and the heavily used commons packages sitting alongside it — are now textbook examples of corrosive risk at scale.

Log4j in 2025: Evidence of Persistent Risk

In 2025, we took a fresh look at how often vulnerable Log4j versions are still being pulled into new builds. Using download data from Maven Central, we analyzed all Log4j activity for the year, classified each download as either vulnerable or fixed, and then broke the results down by country. That gave us a global view of demand for Log4j — and a way to see which parts of the world are actually moving to safe versions and which aren’t.

Globally, the picture is better than it was in the immediate aftermath of Log4Shell, but nowhere near where it should be. In 2025 alone, there were nearly 300 million total Log4j downloads. Of those, about 13% — roughly 40 million downloads — were still vulnerable versions. Given that safe alternatives have been available for nearly four years, every one of those vulnerable downloads represents risk that could have been avoided.

To understand where that risk is concentrated, we zoomed in on the ten countries with the largest developer populations according to recent counts: China, the United States, India, Japan, Brazil, Germany, the United Kingdom, Canada, South Korea, and France. Together, they account for more than 14 million developers in our dataset — and not one of them is close to zero. In 2025, anywhere from 8% to 29% of Log4j downloads in these countries still contained Log4Shell. India (29%), China (28%), and Japan (22%) sit at the high end, but even comparatively “better” performers like the United States (9%), Brazil (8%), and France (8%) are still shipping millions of avoidable vulnerable downloads.

The chart below shows how each of these major software-producing nations stacks up — and how much of this risk is still self-inflicted.

Java Vulnerabilities We Still Won't Let Go

In the graphic above, the bottom panel shows some of the most frequently downloaded components that include open source vulnerabilities and the percentage of their 2025 downloads that use vulnerable versions despite safe versions existing.

Each of these examples has three things in common: at least one CVE, an available fix, and a low rate of fix adoption. Taking a closer look at some of these packages:

commons-compress (1.21) is vulnerable to a slew of bugs, including CVE-2012-2098, CVE-2024-26308, CVE-2020-1945, CVE-2024-25710, and CVE-2021-36374. It is used in applications that handle ZIP, TAR, 7z, and other archive formats, which are very common in backup tools, artifact processors, CI/CD systems, and packaging workflows. The listed vulnerabilities range from ZIP bomb bypasses and DoS attacks to faulty parsing that can lead to resource exhaustion or exposure of arbitrary files, depending on how archives are processed.

CVE-2020-1945 in particular stores data in an easily identified temporary directory, which could result in that data being leaked to other local users. Additionally, the FixCRLF and ReplaceRegExp tasks will grab files from that location to insert into the build process itself, meaning anyone with access to the directory could slip nefarious files into that build process.

commons-lang (2.6), containing CVE-2025-48924, is part of a legacy major version that was replaced by commons-lang3 over a decade ago. In just the few months since the fix was published in July 2025, we have already observed more than 155 million vulnerable downloads — that represents 99.88% of the total component downloads.

Upgrading from 2.6 to 3.x is not a drop-in change and requires extensive code refactoring. Because many large, older enterprise systems still depend on the 2.x line and cannot practically migrate to the incompatible 3.x API, they continue downloading the vulnerable version even though a fix exists. CVE-2025-48924 is an Uncontrolled Recursion vulnerability in the Apache Commons Lang library. Providing a very long input to the ClassUtils.getClass(...) methods causes a StackOverflowError, which is typically unhandled and results in an application crash or Denial of Service (DoS) attack.

snappy (0.4), which includes CVE-2024-36124, is widely embedded into large distributed systems (Hadoop, HBase, Presto, Spark) where dependencies are pinned for stability. These platforms rarely update low-level compression libraries due to performance risks. As a result, most real-world deployments (99.58% of total component downloads) continue using snappy 0.4, while adoption of the fixed version 0.5 remains extremely low. CVE-2024-36124 is a memory-handling flaw that can lead to out-of-bounds reads, resulting in DoS.

jdom2 (2.0.6) contains CVE-2021-33813, which was fixed in 2021 in version 2.0.6.1, yet the vulnerable version was still unnecessarily downloaded more than 371 million times in 2025 (representing 57.73% of the total package downloads). The package itself is a popular Java library used to easily create, read, manipulate, and output XML documents. CVE-2021-33813 enables remote attackers to conduct XML External Entity (XXE) attacks, in which they may craft malicious XML HTTP requests that would recursively expand entities, leading to a DoS condition, also known as a ‘Billion Laughs’ attack.

If Log4j was the alarm bell everyone heard, these are the quieter, ongoing sources of risk that keep getting pulled into new builds every day. Together, they show that Log4Shell wasn’t a one-time failure — it’s a symptom of how we consume open source.

Why Do We Keep Downloading Vulnerable Versions?

The Log4j maintainers released a fixed version just days after Log4Shell was discovered, yet millions of vulnerable downloads persist. The packages above tell the same story, with some fixed versions having been available for nearly a decade. If so much risk is avoidable, why does it keep showing up in the data?

The reasons are rarely dramatic. They’re mostly habits, defaults, and incentives that quietly stack up over time.

Set-and-forget dependencies

Once a library is wired in and everything compiles, it tends to stay that way. Versions get pinned, build files get copied from one service to the next, and no one revisits those choices unless something forces the issue — a breach, a compliance audit, or a production outage.

Without someone explicitly owning ongoing dependency maintenance, those “temporary” choices turn into long-lived tech debt. Vulnerable versions of Log4j and other libraries stick around not because anyone chose them recently, but because no one chose to replace them.

Transitive dependency blind spots

A lot of the risky usage in our data isn’t even direct. Many vulnerable components — including Log4j in some stacks — are pulled in transitively by other libraries and frameworks. Developers may never add them to a pom.xml or build.gradle themselves, but they arrive as part of the dependency tree.

That creates an ownership vacuum. When a vulnerability is disclosed, it’s not always clear which team “owns” fixing it, or even which service is actually using it. Remediation suddenly requires cross-team coordination and digging through dependency graphs, which lengthens mean time to remediate and makes it far more likely that vulnerable versions linger.
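
The graph digging described above can be partly mechanized. What follows is a minimal Python sketch, not any particular tool's implementation. It assumes the dependency graph has already been exported as a plain dictionary, and the service names, library names, and the `paths_to_component` helper are all hypothetical. It walks the graph breadth-first and reports the shortest chain from each service to a given vulnerable artifact, which hints at which direct dependency actually needs to move.

```python
from collections import deque


def paths_to_component(graph, roots, target):
    """Return the shortest dependency path from each root to target, if one exists."""
    found = {}
    for root in roots:
        queue = deque([[root]])
        seen = {root}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node == target:
                found[root] = path
                break
            for dep in graph.get(node, []):
                if dep not in seen:
                    seen.add(dep)
                    queue.append(path + [dep])
    return found


# Hypothetical graph: each key maps to the dependencies it declares.
graph = {
    "billing-service": ["web-framework:5.1", "json-lib:2.0"],
    "web-framework:5.1": ["log4j-core:2.14.1"],
    "reports-service": ["json-lib:2.0"],
    "json-lib:2.0": [],
}

paths = paths_to_component(
    graph, ["billing-service", "reports-service"], "log4j-core:2.14.1"
)
for service, path in paths.items():
    print(service, "->", " -> ".join(path[1:]))
```

In this made-up graph only `billing-service` reaches the vulnerable artifact, and it does so through `web-framework:5.1`, so that framework is where the upgrade conversation would start.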

Flawed selection signals

When teams do make component choices, they typically optimize for speed and familiarity. Libraries are chosen because they’re popular on GitHub, recommended in a blog post, or featured in a Stack Overflow answer. Stars, downloads, and tutorial snippets become the de facto procurement process.

What almost never gets considered is the component’s time-to-fix history, its security posture, or the quality of its governance and maintenance. As a result, the organization’s implicit procurement process favors convenience over safety and keeps reseeding risky components into new projects, even as older ones are being cleaned up.

Tools that shriek but don’t steer

Security tooling doesn’t always help. Many SCA tools are excellent at generating long lists of CVEs and far less helpful at telling teams what to do next. Alerts arrive with little prioritization beyond CVSS score, and without clear, actionable guidance like “upgrade to version X.Y.Z, which is compatible with your current stack.”

The predictable outcome is alert fatigue. Developers learn to treat these tools as noisy background radiation rather than trusted navigational aids. Risk gets ignored, remediation velocity slows, and vulnerable versions continue flowing through the pipeline simply because no one has the time or clarity to address them.

Incentives that undervalue hygiene

Finally, there’s the human layer. Product managers are rewarded for shipping features and hitting roadmap dates, not for spending a sprint cleaning up logging libraries. Security teams are often measured on the number of vulnerabilities closed, not on reductions in unnecessary risk.

In that environment, no one gets credit for quietly removing a vulnerable dependency before it becomes a headline. Cleanup work, template hygiene, and dependency modernization get pushed to “later,” which rarely arrives. The incentives are misaligned with the outcome everyone says they want: fewer vulnerable downloads of components that already have fixes.

How to Drive Unnecessary Risk Toward Zero

The good news in all of this: unnecessary risk is one of the few parts of your attack surface that you actually control. You can’t stop new open source vulnerabilities from being discovered, but you can stop pulling known-bad versions once a fix exists.

1. Start by measuring your own numbers

Before changing anything, get a baseline. Use your internal SCA tools and artifact repositories to answer a few blunt questions:

What percentage of our Log4j downloads in 2025 were vulnerable?
Which of the vulnerable components from this post are showing up in our builds, and at what versions?
Which teams, apps, or business units are responsible for most of those downloads?
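
As a rough illustration of how those questions might be answered, here is a small Python sketch. It assumes you can export download or resolution records from your repository manager as simple dictionaries. The field names, team names, counts, and the `risk_by` helper are hypothetical, and the set of vulnerable versions would come from your own SCA data rather than being hard-coded.

```python
from collections import defaultdict


def risk_by(records, key, vulnerable_with_fix):
    """Share of downloads per `key` that hit a vulnerable version with a known fix."""
    total = defaultdict(int)
    risky = defaultdict(int)
    for r in records:
        total[r[key]] += r["count"]
        if (r["component"], r["version"]) in vulnerable_with_fix:
            risky[r[key]] += r["count"]
    # Skip groups with zero downloads to avoid dividing by zero.
    return {k: risky[k] / total[k] for k in total if total[k]}


# Hypothetical export; counts are invented for illustration only.
records = [
    {"team": "payments", "component": "log4j-core", "version": "2.14.1", "count": 30},
    {"team": "payments", "component": "log4j-core", "version": "2.17.1", "count": 70},
    {"team": "search", "component": "log4j-core", "version": "2.17.1", "count": 50},
    {"team": "search", "component": "jdom2", "version": "2.0.6", "count": 50},
]
# (component, version) pairs that are vulnerable AND have a fixed release available.
vulnerable_with_fix = {("log4j-core", "2.14.1"), ("jdom2", "2.0.6")}

by_team = risk_by(records, "team", vulnerable_with_fix)
by_component = risk_by(records, "component", vulnerable_with_fix)
print(by_team)
print(by_component)
```

Grouping the same records by `team` and by `component` answers the third and second questions respectively; filtering to Log4j alone answers the first.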

If your internal numbers look worse than the map and chart above, that’s your first priority: close the gap with the broader ecosystem, then push beyond it.

2. Upgrade your component selection criteria

Next, change how you pick components in the first place. Move beyond “does it work?” to “is it safe and well-run?” by adding:

Security track record – time-to-fix for past issues, number and severity of vulnerabilities.
Maintenance signals – active maintainers, recent commits, regular releases.
Governance and transparency – backing foundation, clear ownership, SBOM availability.
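
One way to make these criteria harder to skip is to encode them as a lightweight review checklist. The Python sketch below is only an illustration: the `ComponentSignals` fields, the `review_flags` helper, and every threshold are assumptions chosen for the example, not recommended values, and the component shown is invented.

```python
from dataclasses import dataclass


@dataclass
class ComponentSignals:
    name: str
    median_days_to_fix: float      # security track record
    days_since_last_release: int   # maintenance signal
    active_maintainers: int        # maintenance signal
    has_foundation_backing: bool   # governance
    publishes_sbom: bool           # transparency


def review_flags(c, max_days_to_fix=30, max_release_age_days=365, min_maintainers=2):
    """Return human-readable concerns; thresholds are placeholders to tune locally."""
    flags = []
    if c.median_days_to_fix > max_days_to_fix:
        flags.append("slow time-to-fix")
    if c.days_since_last_release > max_release_age_days:
        flags.append("stale releases")
    if c.active_maintainers < min_maintainers:
        flags.append("few maintainers")
    if not (c.has_foundation_backing or c.publishes_sbom):
        flags.append("weak governance signals")
    return flags


# Hypothetical candidate library under evaluation.
candidate = ComponentSignals(
    name="example-parser",
    median_days_to_fix=120,
    days_since_last_release=700,
    active_maintainers=1,
    has_foundation_backing=False,
    publishes_sbom=False,
)
flags = review_flags(candidate)
print(candidate.name, flags)
```

The point is less the specific numbers than having the questions asked every time a new component is proposed.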

3. Automate the healthy path

Don’t rely on goodwill and spare time to keep dependencies fresh. Automate the easy parts:

Use tools or bots to open upgrade PRs to safe versions for Log4j and similar libraries.
Batch non-breaking upgrades on a regular cadence so they’re part of normal sprint work, not a one-off “patch week.”
Make “safe by default” the path of least resistance by autocompleting to safe versions in internal repos and warning (or blocking) when someone tries to pull in a known vulnerable version.
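
The core of an upgrade bot is picking a target version that is both safe and unlikely to break anything. The sketch below shows one possible selection rule in Python, under stated assumptions: versions are purely numeric and dot-separated (qualifiers such as `-beta9` are not handled), the available and vulnerable version lists come from your own tooling, and `suggest_upgrade` is a hypothetical helper rather than an existing API. It prefers the smallest non-vulnerable version above the current one within the same major line.

```python
def parse(version):
    """Turn '2.0.6.1' into (2, 0, 6, 1). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def suggest_upgrade(current, available, vulnerable):
    """Suggest the nearest safe version above `current`, preferring the same major."""
    if current not in vulnerable:
        return None  # nothing to do
    cur = parse(current)
    candidates = [v for v in available if v not in vulnerable and parse(v) > cur]
    same_major = [v for v in candidates if parse(v)[0] == cur[0]]
    pool = same_major or candidates
    return min(pool, key=parse) if pool else None


# Illustrative version lists; real ones would come from the repository and SCA data.
available = ["2.0.5", "2.0.6", "2.0.6.1"]
vulnerable = {"2.0.5", "2.0.6"}

target = suggest_upgrade("2.0.6", available, vulnerable)
print(target)
```

Preferring the nearest safe version in the same major line is a deliberate trade-off: it keeps the resulting pull request small enough to batch with other non-breaking upgrades.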

4. Put guardrails where downloads happen

Then, move from suggestions to guardrails at the points where components actually enter your ecosystem:

Artifact repositories:

Block or at least warn on known vulnerable versions when a fix exists.
Start with high-impact rules like “no new downloads of vulnerable Log4j versions.”

CI/CD pipelines:

Fail builds that introduce new uses of banned versions.
Allow temporary exceptions only with an explicit owner and sunset date.

This keeps the worst offenders from creeping back in, even as teams change and projects move.
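
A pipeline check along these lines can be quite small. The following Python sketch assumes the build's resolved dependencies are available as (coordinate, version) pairs. The banned list, the exception record format, the team name, and the `check_build` helper are all hypothetical. An exception only suppresses a violation while it has an owner and its sunset date has not passed.

```python
from datetime import date


def check_build(dependencies, banned, exceptions, today):
    """Return banned dependencies that are not covered by a live exception."""
    violations = []
    for dep in dependencies:
        if dep not in banned:
            continue
        exc = exceptions.get(dep)
        if exc and exc.get("owner") and exc.get("sunset") and exc["sunset"] >= today:
            continue  # temporary exception with an owner and a sunset date
        violations.append(dep)
    return violations


# Hypothetical policy data.
banned = {
    ("org.apache.logging.log4j:log4j-core", "2.14.1"),
    ("org.jdom:jdom2", "2.0.6"),
}
exceptions = {
    ("org.jdom:jdom2", "2.0.6"): {"owner": "team-reports", "sunset": date(2026, 3, 31)},
}
deps = [
    ("org.apache.logging.log4j:log4j-core", "2.14.1"),
    ("org.jdom:jdom2", "2.0.6"),
    ("org.jdom:jdom2", "2.0.6.1"),
]

before_sunset = check_build(deps, banned, exceptions, date(2026, 1, 15))
after_sunset = check_build(deps, banned, exceptions, date(2026, 4, 1))
print(before_sunset)
print(after_sunset)
```

In a real pipeline the calling script would exit with a non-zero status whenever the returned list is non-empty, so that the exception expiring is what eventually fails the build.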

5. Change what “good” looks like (and measure it)

Finally, you need a new definition of success — and a way to prove you’re getting there. Instead of counting raw CVEs closed, measure whether you’re actually shrinking unnecessary risk over time. A few concrete metrics that help:

What "good" looks like	What to track	How to use it
Unnecessary risk rate	% of component downloads that are vulnerable when a fixed version exists (globally and for key components like Log4j).	Set a target (e.g., “cut this percentage in half over 12 months”) and review it quarterly with engineering leadership.
Fix adoption time	Median time from fix release → adoption in our critical apps for Log4j and other top components.	Treat long tails as a signal that ownership is unclear or automation is missing.
Policy effectiveness	Number/percentage of builds blocked for using banned versions; trend in vulnerable downloads from internal repos after guardrails are introduced.	If blocks stay high, you have a communication or tooling problem; if they drop while shipping continues, your controls are working.
Recognition and incentives	Highlight teams that reduce their unnecessary risk rate the fastest or fully eliminate Log4Shell and other common vulnerabilities from their dependency graphs.	People respond to what gets discussed in reviews and celebrated in all-hands; make dependency hygiene part of that story.
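
As an example of how one of these metrics could be computed, the Python sketch below calculates fix adoption time. It assumes you can record, per application, the date a fixed version was first adopted (or `None` if it has not been adopted yet). The application names, dates, and the `fix_adoption_days` helper are invented for illustration.

```python
from datetime import date
from statistics import median


def fix_adoption_days(fix_released, adopted_on):
    """Median days from fix release to adoption, plus apps that have not adopted yet."""
    days = [(d - fix_released).days for d in adopted_on.values() if d is not None]
    pending = sorted(app for app, d in adopted_on.items() if d is None)
    return (median(days) if days else None), pending


# Hypothetical dates for a single component's fix.
fix_released = date(2025, 7, 1)
adopted_on = {
    "app-a": date(2025, 7, 11),
    "app-b": date(2025, 7, 31),
    "app-c": date(2025, 9, 29),
    "app-d": None,  # still on the vulnerable version
}

median_days, pending = fix_adoption_days(fix_released, adopted_on)
print(median_days, pending)
```

Reporting the not-yet-adopted applications alongside the median matters, because a median computed only over finished upgrades hides the long tail the table warns about.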

Before the Fifth Anniversary

Four years after Log4Shell, the problem isn’t that open source is inherently unsafe. The problem is that we keep downloading unsafe versions even when safe ones exist. The data in this post makes that hard to ignore: Log4j has fixes, the other heavily used components have fixes, and yet vulnerable versions continue to flow into new builds.

The map and chart tell the same story. They show a world that’s undeniably better than it was in late 2021 — vulnerable download rates are down, some regions have pushed them into single digits — but they also show how much avoidable risk we’re still willing to live with. Every shaded region and every bar in the above panel represents choices we could make differently.

The real question is what your picture looks like by the fifth anniversary of Log4Shell. By then, your organization could move its dot on that map and erase these dependencies from your software supply chain entirely if you decide that unnecessary risk is no longer acceptable background noise.

If you’re not sure where you stand today, start by getting the numbers. Run a scan of your applications to find Log4j and other frequently downloaded vulnerable components, calculate what share of that usage involves vulnerable versions, and benchmark your own “unnecessary risk rate” against the trends in this post. That’s the first step toward making sure Log4Shell is remembered as a turning point, not just an anniversary.

Methodology

This analysis looks at Java components downloaded from Maven Central in 2025, based on raw download telemetry (component, version, timestamp, IP). Downloads are events, not installs or unique apps.

Using Sonatype vulnerability intelligence, we labeled each version of Log4j and a small set of Java components (e.g., commons-compress, commons-lang, snappy, jdom2) as vulnerable or fixed, then counted 2025 downloads per version to calculate totals and the % of downloads that were still vulnerable. For safe versions released this year, we looked at the % of vulnerable downloads since the fix.

For the country view, we mapped download IPs to countries via GeoIP and focused on the 10 countries with the largest developer populations, computing the share of Log4j downloads that still contained Log4Shell.

Throughout, “unnecessary risk” means downloads of a vulnerable version when a fixed version already exists. Cases with no available fix are not counted in that category.
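
To make the "since the fix" calculation concrete, here is a simplified Python sketch of how such a share could be computed. It is not Sonatype's actual pipeline. The event tuples, version labels, fix date, and the `vulnerable_share_since_fix` helper are hypothetical stand-ins for the telemetry described above.

```python
from datetime import date


def vulnerable_share_since_fix(events, vulnerable_versions, fix_date):
    """Share of downloads on or after fix_date that went to vulnerable versions."""
    total = 0
    vulnerable = 0
    for version, day, count in events:
        if day < fix_date:
            continue  # downloads before a fix existed are not "unnecessary risk"
        total += count
        if version in vulnerable_versions:
            vulnerable += count
    return vulnerable / total if total else None


# Invented events for a made-up component: (version, day, download count).
events = [
    ("1.0", date(2025, 3, 1), 500),   # before the fix, excluded
    ("1.0", date(2025, 8, 1), 900),   # vulnerable, after the fix
    ("1.1", date(2025, 8, 2), 100),   # fixed version
]

share = vulnerable_share_since_fix(events, {"1.0"}, date(2025, 7, 1))
print(share)
```

Excluding downloads that predate the fix keeps the measure aligned with the definition of unnecessary risk used throughout: a vulnerable download only counts once a safe alternative exists.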
