2021

State of the Software Supply Chain

Now in its seventh year, Sonatype’s 2021 State of the Software Supply Chain Report blends a broad set of public and proprietary data to reveal important findings about open source and its increasingly important role in digital innovation.

1

Open Source Supply, Demand, and Security

Open source supply is growing exponentially.

Currently, the top four open source ecosystems contain a combined 37,451,682 components and packages. These same communities released a combined 6,302,733 new versions of components / packages over the past year and have introduced 723,570 brand new projects in support of 27 million developers worldwide.

Available Supply of Open Source

Java

[Chart: projects and versions, new in 2021 vs. available prior to 2021. Projects: 430,995. Versions: 7.3 million.]

JavaScript

[Chart: projects and versions, new in 2021 vs. available prior to 2021. Projects: 1.8 million. Versions: 21 million.]

Python

[Chart: projects and versions, new in 2021 vs. available prior to 2021. Projects: 336,402. Versions: 3 million.]

.NET

[Chart: projects and versions, new in 2021 vs. available prior to 2021. Projects: 338,423. Versions: 5.6 million.]
Open source demand continues to explode.
Increase in Downloads, Year Over Year 2020-2021
  • Java: 71% increase (267 to 457 billion)
  • JavaScript: 50% increase (1 to 1.5 trillion)
  • Python: 92% increase (66 to 127 billion)
  • .NET: 78% increase (44 to 78 billion)

In 2021, developers around the world will request more than 2.2 trillion open source packages, representing a 73% YoY growth in developer downloads of open source components. Despite the growing volume of downloads, the percentage of available components utilized in production applications is shockingly low.
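
As a quick check on the growth figures, year-over-year growth is (current - previous) / previous. The sketch below applies that formula to the rounded per-ecosystem download totals quoted in this section. The function name `yoy_growth` is illustrative, and because the inputs are rounded, the results can differ from the published percentages by about a point.

```python
def yoy_growth(previous: float, current: float) -> float:
    """Return year-over-year growth as a percentage."""
    return (current - previous) / previous * 100


# Rounded 2020 -> 2021 download requests, in billions, as quoted in this section.
downloads = {
    "Java": (267, 457),
    "JavaScript": (1000, 1500),
    "Python": (66, 127),
    ".NET": (44, 78),
}

growth = {
    name: round(yoy_growth(prev, cur), 1)
    for name, (prev, cur) in downloads.items()
}

for name, pct in growth.items():
    print(f"{name}: {pct}% increase")
```

With these rounded inputs, .NET comes out near 77% rather than the published 78%, which is consistent with the totals having been rounded to whole billions.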

Vulnerabilities are more common in popular projects.

The top 10% of most popular OSS project versions are 29% likely on average to contain known vulnerabilities. Conversely, the remaining 90% of project versions are only 6.5% likely to contain known vulnerabilities. In combination, these statistics indicate that the vast majority of security research (whitehat and blackhat) is focused on finding and fixing (or exploiting) vulnerabilities in projects that are more commonly utilized.

Vulnerability Release Density Vs. Popularity

Java (Maven)

[Chart: percent of releases vulnerable vs. not vulnerable by popularity group (Top 10% to Bottom 10%). Data labels: 7%, 7%, 3%, 4%, 6%, 2%, 3%, 6%, 3%, 26%.]

JavaScript (npm)

[Chart: percent of releases vulnerable vs. not vulnerable by popularity group (Top 10% to Bottom 10%). Data labels: 39%, 17%, 9%, 8%, 6%, 6%, 7%, 7%, 7%, 4%.]

Python (pypi)

[Chart: percent of releases vulnerable vs. not vulnerable by popularity group (Top 10% to Bottom 10%). Data labels: 38%, 14%, 12%, 3%, 6%, 5%, 11%, 9%, 7%, 5%.]

.NET (Nuget)

[Chart: percent of releases vulnerable vs. not vulnerable by popularity group (Top 10% to Bottom 10%). Data labels: 16%, 6%, 6%, 6%, 6%, 6%, 8%, 7%, 5%, 4%.]

2021 Software Supply Chain Statistics

Ecosystem | Total Projects | Total Project Versions | Download Requests | Year Over Year Download Growth | Ecosystem Project Utilization | Vuln Density for Utilized Versions, 10% Most Popular | Vuln Density for Utilized Versions, 90% Least Popular
Java | 431K | 7.3M | 457B | 71% | 15% | 23% | 4%
JavaScript | 1.9M | 21M | 1.5T | 50% | 2% | 39% | 8%
Python | 336K | 3M | 127B | 92% | 4% | 38% | 8%
.NET | 338K | 5.6M | 78B | 78% | 2% | 15% | 6%
Totals/Averages | 3M | 37M | 2.2T | 73% | 6% | 29% | 6.5%

High-Profile Software Supply Chain Attacks
Dec 2020-July 2021
DECEMBER 2020
SolarWinds

Threat actors gained access to SolarWinds dev infrastructure, and injected malicious code into Orion update binaries. 18,000 customers automatically pulled trojanized updates, planting backdoors into their systems and allowing bad actors to exploit private networks at will.

FEBRUARY 2021
Namespace Confusion

Three days after news broke of an ethical researcher hacking over 35 big tech firms in a novel supply-chain attack, more than 300 malicious copycat attacks were recorded. Within one month, more than 10,000 namespace confusion copycats had infiltrated npm and other ecosystems.

APRIL 2021
Codecov

An attacker gained access to a credential via a mistake in how Codecov was building Docker images. This credential then let them modify Codecov’s bash uploader script, which customers used either directly or via Codecov’s other uploaders, such as its GitHub Action. The attacker used the modified script to steal credentials from the CI environments of customers using it.

MAY 2021
Microsoft's WinGet

The weekend after launching, WinGet's software registry was flooded with pull requests for apps that were either duplicates or malformed. Some newly added duplicate packages were corrupted and ended up overwriting the existing packages, raising serious concerns about the integrity of the WinGet ecosystem.

JULY 2021
Kaseya

A ransomware group discovered and exploited a zero-day vulnerability in a remote monitoring and management software platform used by dozens of managed service providers (MSPs). Because these MSPs service thousands of downstream customers, the hackers were able to conduct a ransomware attack against 1,500 victims.

Software Supply Chain Attacks Increase 650%

Members of the world’s open source community are facing a novel and rapidly expanding threat that has nothing to do with passive adversaries exploiting known vulnerabilities in the wild — and everything to do with aggressive attackers intentionally tampering with open source projects to infiltrate the commercial software supply chain.

From February 2015 to June 2019, 216 software supply chain attacks were recorded. From July 2019 to May 2020, that number rose to 929. In the past year, such attacks numbered more than 12,000, a 650% year-over-year increase.

Next Generation Software Supply Chain Attacks (2015–2020)

Dependency Confusion, Typosquatting, and Malicious Code Injection

[Chart: number of attacks per year, 2015 to 2021, on a scale of 0 to 12,000. Annotation: 650% year over year increase.]

Practical Recommendation

To accelerate the pace of digital innovation without sacrificing quality or security, engineering and risk management leaders should understand supply, demand, and risk dynamics associated with third-party open source ecosystems. Furthermore, they should carefully define and automatically enforce open source policies across every phase of the software supply chain.

2

Understanding Exemplary Open Source Projects

Some open source projects are definitely better than others. But how do you know? This year we examined three different methods for identifying exemplary open source projects: Sonatype Mean Time to Update (MTTU), OpenSSF Criticality, and Libraries.io Sourcerank. We found that MTTU combined with OpenSSF Criticality is strongly associated with exemplary project outcomes in the areas of security and dev productivity.

Metrics to Use to Assess Relative Quality of an OSS Project
  • Sonatype MTTU
  • OpenSSF Criticality
  • Libraries.io Sourcerank

Sonatype MTTU provides a measure of project quality that is based on how quickly the project moves to update dependencies. Lower (faster) is better. Components that consistently react quickly to dependency upgrades will have lower MTTU. Components that react slowly or have high variance in their update times will have higher MTTU.

OpenSSF Criticality measures a project’s community, usage, and activity. This is distilled into a score that is intended to measure how critical the project is in the open source ecosystem.

Libraries.io Sourcerank aims to measure the quality of software, mostly focusing on project documentation, maturity, and community. It is computed by evaluating a number of yes/no responses such as “Is the project more than six months old?” and a set of numerical questions, such as “How many ‘stars’ does the project have?” These are distilled into a single score, with yes/no questions adding or subtracting a fixed number of “points” and numerical questions being converted into points using a formula, e.g. “log(num_stars)/2.” The current maximum number of points is approximately 30.
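
To make the scoring mechanics concrete, here is a minimal sketch of a points-based score in the style described above. It is not the Libraries.io implementation: the question names and point values are invented for illustration, and the use of the natural logarithm is an assumption, since the text gives only the form "log(num_stars)/2".

```python
import math

# Hypothetical yes/no questions and point values (not the real Sourcerank weights).
YES_NO_POINTS = {
    "older_than_six_months": 1,
    "has_readme": 1,
    "is_deprecated": -5,
}


def stars_points(num_stars: int) -> float:
    """Convert a star count to points via log(num_stars) / 2 (natural log assumed)."""
    return math.log(num_stars) / 2 if num_stars > 0 else 0.0


def score(answers: dict[str, bool], num_stars: int) -> float:
    """Sum the fixed points for each 'yes' answer plus the formula-based star points."""
    fixed = sum(YES_NO_POINTS[question] for question, yes in answers.items() if yes)
    return fixed + stars_points(num_stars)


example = score(
    {"older_than_six_months": True, "has_readme": True, "is_deprecated": False},
    num_stars=1000,
)
print(round(example, 2))
```

The logarithm means that each additional point from stars requires a multiplicative jump in star count, so very popular projects cannot dominate the score on popularity alone.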

Lower MTTU is better.

Components that consistently react quickly to dependency upgrades will have low MTTU. Components that either consistently react slowly or have high variance in their reaction time will have higher MTTU.

Suppose we have a component A with dependencies B and C, both at version 1.2. Suppose B and C each release a new version (1.3), and some time later A releases a new version that bumps B and C to 1.3. The time between the release of B version 1.3 and the release of the version of A that adopts it is the Time To Upgrade (TTU) for A’s migration to B version 1.3 (and similarly for A’s adoption of C version 1.3). The average of all these upgrade times is then the MTTU.
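
The example above can be sketched in a few lines. The release dates below are hypothetical, chosen only to illustrate the calculation, and `time_to_update` is an illustrative helper name rather than anything from Sonatype's tooling.

```python
from datetime import date
from statistics import mean


def time_to_update(dependency_released: date, adopted_in_release: date) -> int:
    """Days between a dependency's new version and the release of A that adopts it."""
    return (adopted_in_release - dependency_released).days


# Hypothetical dates: B 1.3 and C 1.3 are released, then A ships a release using both.
b_released = date(2021, 3, 1)
c_released = date(2021, 3, 15)
a_released = date(2021, 4, 1)

ttus = [
    time_to_update(b_released, a_released),  # A's migration to B 1.3
    time_to_update(c_released, a_released),  # A's migration to C 1.3
]
mttu = mean(ttus)
print(ttus, mttu)
```

Here the two upgrade times are 31 and 17 days, so the MTTU for A is 24 days.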

Aggregate MTTUs are improving over time.

In addition to the number of projects growing over the years, there has been a clear trend toward faster MTTUs. The average MTTU across projects in 2011 was 371 days. In 2014 it was 302 days, and by 2018 it was 158 days. In 2021, as of August 1, average MTTU was 28 days, less than half of the 73 days the average project took in 2020.

[Chart: density of MTTU in days (logarithmic scale), one curve per year from 2011 to 2021 (projected).]
MTTU is highly correlated to MTTR.

Suppose a project A includes a dependency B, and B has a vulnerability disclosed at date D1. Then A updates the version of B it’s using on date D2. Time to Remediate (TTR) is then the time between D1 and D2 measured in days, and MTTR is the average TTR for a project across all disclosed security vulnerabilities.
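
The same definition can be written as a short function. This is a sketch with made-up disclosure and update dates; `mean_time_to_remediate` is an illustrative name, not an API from any tool mentioned in this report.

```python
from datetime import date
from statistics import mean


def mean_time_to_remediate(events: list[tuple[date, date]]) -> float:
    """Average days between disclosure (D1) and the dependency update (D2)."""
    return mean((d2 - d1).days for d1, d2 in events)


# Hypothetical (D1, D2) pairs for one project.
events = [
    (date(2021, 1, 10), date(2021, 1, 20)),  # TTR = 10 days
    (date(2021, 5, 1), date(2021, 5, 31)),   # TTR = 30 days
]
mttr = mean_time_to_remediate(events)
print(mttr)
```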

While MTTU does not directly measure the speed at which projects fix publicly disclosed vulnerabilities, it does correlate to a project’s mean time to remediate (MTTR), which is the time required to update dependencies that have published vulnerabilities. Thus, we consider MTTU to be the best metric available to determine the impact a component will have on the security of projects that incorporate it.

Practical Recommendations

Choosing high quality open source projects should be considered an important strategic decision for enterprise software development organizations.

To avoid stale dependencies and minimize security risks associated with third party open source, software engineering teams should actively embrace projects that consistently demonstrate low mean time to update (MTTU) values and high OpenSSF Criticality scores.

3

How Your Peers Are Managing Open Source Dependencies

For this year’s report, we examined 4 million real-world dependency management decisions spread across 100,000 applications. The key findings are highlighted below.

Despite the growing volume of downloads, the percentage of available components observed in production applications is shockingly low.

On average, production enterprise Java applications utilize 10% of available open source components, and commercial engineering teams actively update only 25% of those components that are utilized.
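
The two percentages can be reproduced from the rounded Maven Central counts reported in this section (430,000 projects, 40,000 present in applications, 10,000 actively updated). The variable names are illustrative.

```python
maven_projects = 430_000   # projects in Maven Central
projects_in_use = 40_000   # projects present across 100,000 applications
actively_updated = 10_000  # in-use projects actively updated during the past year

utilization = projects_in_use / maven_projects * 100
update_rate = actively_updated / projects_in_use * 100

print(f"utilized: {utilization:.1f}%, actively updated: {update_rate:.0f}%")
```

The rounded counts give roughly 9% utilization, in line with the 10% figure quoted above, and exactly 25% for the share of utilized projects that are actively updated.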

Active Projects in the Maven Central Repository
  • 430,000 projects in Maven Central
  • 40,000 projects present in 100,000 applications
  • 10,000 projects across 100,000 applications were actively being updated during the past year
69% of dependency management decisions are suboptimal.
5 Groups of Migration Decisions
  • Optimal Decisions: optimal version chosen (31%)
  • Imperfect Decisions: subjectively suboptimal version chosen (64%)
  • Dangerous Decisions: non-subjective suboptimal version chosen (3%)
  • Risky Decisions: pre-release version chosen (1%)
  • Dead End Decisions: no good choice available (1%)

The average modern application contains 128 open source dependencies, and the average open source project releases 10 times per year. This reality, combined with the fact that a few hyperactive projects release more than 8,000 times per year, creates a situation in which developers must constantly decide when (and when not) to update third-party dependencies inside of their applications. In light of these circumstances, Sonatype researchers set out to answer the question: are developers making efficient dependency management decisions? We studied 100,000 applications and analyzed more than 4,000,000 component migrations (upgrades) and found that 69% of such decisions were suboptimal.
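
The 69% figure is the sum of every decision group other than "Optimal". The short sketch below adds up the group shares reported in this section and also multiplies out the averages quoted above to give a rough sense of how many new dependency releases a typical application could consider each year. That product is a back-of-the-envelope estimate, not a figure from the report.

```python
# Share of migration decisions in each group, in percent, as reported in this section.
decision_share = {
    "Optimal": 31,
    "Imperfect": 64,
    "Dangerous": 3,
    "Risky": 1,
    "Dead End": 1,
}

suboptimal = sum(pct for group, pct in decision_share.items() if group != "Optimal")

# Rough scale: average dependencies per application x average releases per project per year.
releases_to_consider = 128 * 10

print(suboptimal, releases_to_consider)
```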

Despite unstructured decision making, there is evidence of wisdom in the crowd.

The chart below provides a visual summary of herd migration behavior over the past year associated with spring-core, a single component within the highly popular spring-framework. The y-axis shows the past 52 weeks of upgrade activity, with the top row representing herd migration decisions made one year ago, and the bottom row representing herd migration decisions made during the most recent week. The x-axis represents the 150 most recent versions with older versions to the left, and newer versions to the right.

Herd Migration Behavior of org.springframework:spring-core, August 9, 2020–August 1, 2021
1

The most recent release line (5.3.x) of spring-core ships a new version approximately every 4 weeks.

2

The project is actively maintaining these 2 releases. Darker shading indicates the majority of the community is using these releases.

3

The project is no longer actively supporting these releases. Teams should migrate away from these stale versions.

4

Laggards continue to update to older, unsupported, and even vulnerable versions.

5

Older versions are vulnerable, and older non-vulnerable versions (4.3.15+) will inevitably be subject to new vulnerability disclosures.

6

The community generally avoids .0 releases and pre-releases.

8 Rules for Upgrading to the Optimal Version
Avoid Objectively Bad Choices
Don’t choose an alpha, beta, milestone, release candidate, etc. version.

Don’t upgrade to a vulnerable version.

Upgrade to a lower risk severity if your current version is vulnerable.

When a component is published twice in close succession, choose the later version.

Avoid Subjectively Bad Choices
Choose a migration path (from version to version) others have chosen.

Choose a version that minimizes breaking code changes.

Choose a version that the majority of the population is using.

If all else is tied, choose the newest version.

Passing these rules results in optimal upgrades.
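
One way to read the eight rules is as a filter followed by a ranking. The sketch below is an illustration under stated assumptions, not Sonatype's algorithm: the `Candidate` fields are hypothetical inputs, the handling of the two vulnerability rules (prefer non-vulnerable versions, otherwise accept a lower severity than the current one) is one interpretation, and treating the four subjective rules as successive tie-breakers is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    version: tuple[int, int, int]  # hypothetical version number, e.g. (1, 4, 3)
    prerelease: bool               # alpha, beta, milestone, release candidate, etc.
    severity: int                  # 0 = no known vulnerability; higher is worse
    superseded_quickly: bool       # a later version followed in close succession
    path_popularity: float         # share of peers who took this migration path
    breaking_changes: int          # estimated breaking code changes
    usage_share: float             # share of the population using this version


def choose_upgrade(current_severity: int, candidates: list[Candidate]) -> Optional[Candidate]:
    """Filter out objectively bad choices, then rank what is left."""
    # No pre-releases, and no version that was quickly superseded by a later one.
    eligible = [c for c in candidates if not c.prerelease and not c.superseded_quickly]

    # Prefer non-vulnerable versions; failing that, accept a lower severity than today's.
    clean = [c for c in eligible if c.severity == 0]
    viable = clean or [c for c in eligible if c.severity < current_severity]
    if not viable:
        return None  # a "dead end": no good choice available

    # Subjective rules, applied in order as tie-breakers (assumed ordering):
    # popular migration path, fewest breaking changes, widest usage, newest version.
    return max(
        viable,
        key=lambda c: (c.path_popularity, -c.breaking_changes, c.usage_share, c.version),
    )


# Hypothetical candidates for a project currently on a version with severity 7.
candidates = [
    Candidate((1, 4, 2), False, 0, False, 0.40, 0, 0.30),
    Candidate((1, 4, 3), False, 0, False, 0.40, 0, 0.30),
    Candidate((2, 0, 0), True, 0, False, 0.05, 12, 0.01),
    Candidate((1, 3, 0), False, 7, False, 0.50, 0, 0.20),
]
best = choose_upgrade(current_severity=7, candidates=candidates)
print(best.version)
```

In this example the pre-release and the vulnerable version are filtered out, the two remaining versions tie on every subjective measure, and the newest of the two is chosen.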
Save time and money.

Intelligent automation that standardizes engineering teams on exemplary open source projects could remove 1.6M hours and $240M of real world waste spread across our sample of 100,000 production applications. Extrapolated out to the entire software industry, the associated savings would be billions.
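
To put the totals in per-application terms, the sketch below divides the quoted figures through. The results are simple averages derived from the numbers above, not values the report states directly.

```python
wasted_hours = 1_600_000      # hours of waste across the sample
wasted_dollars = 240_000_000  # dollars of waste across the sample
applications = 100_000        # production applications in the sample

hours_per_app = wasted_hours / applications
dollars_per_app = wasted_dollars / applications
implied_hourly_cost = wasted_dollars / wasted_hours

print(hours_per_app, dollars_per_app, implied_hourly_cost)
```

On these figures, the average application accounts for 16 hours and $2,400 of avoidable effort, which implies a cost of $150 per hour.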

The Benefit of Intelligent Automation to Dev Teams
Strategies for optimal dependency management: near the edge is best.

The bleeding edge is dangerous. The near edge is optimal. When analyzing herd migration behavior around dependency management practices, we observed three distinct patterns of team behavior: Teams living in disarray, teams living on the edge, and teams living close to the edge.

Strategies for Dependency Management
Teams living in disarray

Developers working on these teams lack automated guidance. They update dependencies infrequently. When they do update dependencies, they utilize gut instincts and commonly make suboptimal decisions. This approach to dependency management is highly reactive, not scalable, and leads to stale software and increased security risk.

Teams living close to the edge

Developers working on these teams have the benefit of intelligent and contextual automation. Dependencies are automatically recommended for updating, but only when optimal. This type of intelligent automation keeps software fresh without inadvertently introducing wasted effort or increased security risks. This approach is proactive, scalable, and optimal in terms of cost efficiency and quality outcomes.

Teams living on the edge

Developers working on these teams have the benefit of simplistic, but non-contextual, automation. Dependencies are automatically updated to the latest version, whether optimal or not. Such automation helps to keep software fresh, but it can inadvertently lead to increased security risks and higher costs associated with unnecessary updates and broken builds. This approach is proactive and scalable, but not optimal in terms of costs or outcomes.

Practical Recommendations

Software engineering teams should strive to standardize dependency management decisions.

Engineering leaders should maximize information available to developers to save time and money.

Engineering leaders should embrace tools to automate intelligent dependency management decisions.

4

Software Supply Chain Maturity Survey

For this year’s report, we surveyed 702 engineering professionals about software supply chain management practices, including approaches and philosophies to utilizing open source components, organizational design, governance, approval processes, and tooling.

Disconnect Between Perception vs Reality on Software Supply Chain Maturity

Subjectively, survey respondents report they are doing a good job remediating defective components and indicate that they understand where supply chain risk resides. Objectively, research shows development teams lack structured guidance and frequently make suboptimal decisions with respect to software supply chain management.

We plotted all survey responses against the five different stages of software supply chain maturity and found that the majority of respondents were graded below the “Control” level, which is deemed the point at which an organization transitions from “figuring it out” to a minimal level of maturity that will enable high quality outcomes.

Software Supply Chain Maturity Score by Theme
5th, 50th, and 95th Percentile
1
The majority of respondents demonstrate an “Ad Hoc” approach to software supply chain management.
2
The only two themes where the respondents demonstrated a high level of maturity were for Inventory and Remediation.
3
Comparing survey responses to the objective analysis done, we see a disconnect between what is actually happening, and what people think is happening: 70% of remediations are actually suboptimal.
Practical Recommendation

The survey suggests that respondents have talked themselves into believing they’re doing a good job in areas where we see objectively that they are not. This is a reminder to be mindful of what you think your organization is doing versus what's actually happening, and to continuously measure your workflow and systems against desired outcomes.

5

Emergence of Software Supply Chain Regulation and Standards

Following several attacks in 2020 aimed at critical infrastructure, governments around the world began to pursue regulations and standards aimed at improving software supply chain security and hygiene.

The United States

In May 2021, President Biden signed the Executive Order on Improving the Nation’s Cybersecurity, which has been heralded as a milestone for the U.S. government at a time when cyber espionage and nation-state attacks on critical infrastructure are reaching crisis proportions.

The United Kingdom

The U.K. government announced that it was seeking advice on defending against digital supply chain attacks, both from organizations that consume IT services and from MSPs that provide software and services.

Germany

Germany passed the Information Technology Security Act 2.0 as an update to the First Act to “increase cyber and information security against the backdrop of increasingly frequent and complex cyber-attacks and the continued digitalisation of everyday life.”

European Union

The European Union Agency for Cybersecurity (ENISA) released a July 2021 report titled “Understanding the increase in Supply Chain Security Attacks.” The report reviewed 24 different software supply chain attacks and shared recommendations that organizations should put in place to protect themselves against attacks.

Practical Recommendation

As governments finally recognize the risks associated with unmanaged software supply chains, they are aggressively pursuing mandates that align the software industry with other manufacturing sectors. Pay attention to what's happening legislatively in your market, get involved in the public conversations, and be prepared to make changes to your development practices accordingly.

Dig Deeper and Download the Full Report

Engineers are making a wide variety of digital decisions at every phase of the DevSecOps value stream that they didn't have to think about just a year ago. Understanding how to optimize those decisions and how they affect the greater software supply chain is paramount to a company's success.

Dig into the full report for more insights, analysis and guidance around developing optimal software supply chains.