Software Supply Chain Risks | 2026 Software Supply Chain Report

A Turning Point for Open Source Malware

FIGURE 2.1: Annual Open Source Malware Growth

Source: Sonatype

What stands out most about 2025 is not just the scale of the threat, but also the sophistication. Where 2024’s XZ Utils incident was groundbreaking, demonstrating how a single compromised maintainer could imperil global infrastructure, 2025 saw software supply chain risk evolve dramatically.

This year, over 99% of open source malware occurred on npm. State-linked entities such as the Lazarus Group advanced from simple droppers and crypto miners to five-stage payload chains that combined droppers, credential theft, and persistent remote access inside developer environments. The first-ever self-replicating npm malware (Shai-Hulud, quickly followed by Sha1-Hulud) proved that open source malware can now propagate autonomously through open source ecosystems. IndonesianFoods created more than 150,000 malicious packages in just a couple of days. And a series of offensive hijackings of trusted packages like chalk and debug showed that established maintainers of high-profile packages are being targeted as entry points for mass distribution.

Taken together, these developments mark 2025 as a grim year for open source malware: the moment when isolated incidents became an integrated campaign, and bad actors proved software supply chain attacks are now their most reliable weapon.

The Threat Taxonomy: What Open Source Malware Does Today

Open source malware is best understood less as a set of isolated “bad packages” and more as a set of repeatable behaviors that exploit how modern software is built and shipped. Public registries provide a low-friction distribution channel, while developer machines and CI/CD pipelines provide an execution environment that often sits close to sensitive data and production access. As a result, the malicious package is increasingly not the whole attack, but the first step in a larger supply chain intrusion.

FIGURE 2.2: 2025 Landscape - Open Source Malware by Threat Type

Source: Sonatype

Registries Are Being Used as Distribution Platforms

In 2025, the dominant pattern is operational scale through ecosystem mechanics. Repository abuse shows up in 55.9% of all logged malicious packages, indicating actors are treating registries like platforms: automating publication and iterating quickly to maximize reach. Repository abuse packages have been observed harvesting TEA tokens or seeking clicks on spam links. Alongside that, Potentially Unwanted Applications (PUAs) appear in 27.5% of packages; these include empty packages, demos with hardcoded credentials, and messaging app spam bot orchestration frameworks. These packages don’t necessarily compromise the developer who installs them or the application they are bundled into, but they are still unwanted in developer environments.

Developer and Build Environments Are the Prize

A consistent objective is harvesting valuable data from where software gets built. Host information exfiltration appears in 5.7% of packages, and secrets exfiltration in 3.9%. These aren’t the largest categories by volume, but they’re high-leverage: packages run inside developer machines and CI/CD environments where tokens, API keys, and CI credentials are commonly present and reusable.

Attacks Are Engineered as Chains, Not Single Payloads

Sonatype observed clear signs of staged delivery and follow-on capability. Droppers/loaders appear in 2.7% of packages, and backdoors in 2.1%, with obfuscated code in 1.6% acting as a force multiplier that helps these chains persist and evade inspection. Even lower-volume disruption behaviors matter for impact: data corruption appears in 0.62% and targets build outputs and release workflows where compromise can propagate downstream.

Developers Are the Attack Vector

Software supply chain attackers are perfecting social and technical mimicry to exploit developers who must make dependency decisions quickly and with incomplete information:

Typosquatting and Namespace Confusion

Typosquatting and namespace confusion remain staple techniques, but they operate differently. Typosquatting relies on minor spelling variations of legitimate package names, counting on human error during installation. Namespace confusion exploits how package managers resolve dependencies across public and private scopes. This allows attackers to publish public packages with the same name as internal or expected dependencies, so they are inadvertently pulled into builds.

Toolchain Masquerading

Toolchain masquerading is accelerating. Rather than posing as generic utilities, malicious packages increasingly impersonate the everyday tools developers install reflexively: framework add-ons, build plugins, linters, scaffolding utilities, and migration helpers. These packages are designed to look like routine workflow dependencies, making them more likely to be installed without close inspection.

Front-end Workflow Lures

Front-end workflow lures are especially common. Attackers cluster package names around high-velocity ecosystems and popular tooling where dependency decisions are frequent, repetitive, and time-boxed. In these environments, developers often add or swap dependencies rapidly, creating ideal conditions for malicious lookalikes to blend in.
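To make the typosquatting idea concrete, the following is a minimal sketch of a lookalike check, not a description of how any registry or scanner actually works. The package list, the `normalize` and `lookalikes` helpers, and the 0.85 similarity threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical set of packages a team actually intends to depend on.
KNOWN_PACKAGES = {"react", "vite", "tailwindcss", "chalk", "debug"}


def normalize(name: str) -> str:
    """Lowercase the name and drop separators that lookalikes often vary."""
    return name.lower().replace("-", "").replace("_", "").replace(".", "")


def lookalikes(candidate: str, known=KNOWN_PACKAGES, threshold: float = 0.85):
    """Return known packages that `candidate` resembles without matching exactly."""
    if candidate in known:
        return []  # exact match: this is the package the team meant
    hits = []
    for name in sorted(known):
        ratio = SequenceMatcher(None, normalize(candidate), normalize(name)).ratio()
        if ratio >= threshold:
            hits.append(name)
    return hits
```

A call such as `lookalikes("reactt")` would flag the name as close to `react`, while an unrelated name returns an empty list. A similarity threshold is a blunt instrument: it misses namespace confusion entirely, since a confused package carries exactly the expected name, so a check like this is at most one signal among several.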

Attackers increasingly rely less on individual mistakes and more on scale, momentum, and volume. They know developers under deadline pressure are unlikely to pay close attention to every dependency. If a package “looks right,” with mostly comprehensible code, a legitimate-seeming README.md, and a reasonable number of downloads, it is likely to get installed.

How North Korea Weaponizes Open Source

The Lazarus Group, or APT38, epitomizes the 2025 malware shift from opportunistic to industrialized. Building on earlier research, Sonatype identified more than 800 Lazarus-associated packages this year, concentrated overwhelmingly in npm (97%). In practical terms, npm provides the fastest path from package publication to developer workstation because it does not require namespace validation and tooling prefers the latest versions. By concentrating activity there, Lazarus maximizes the likelihood that poisoned dependencies will be installed quickly, propagate through transitive dependency chains, and spill into build pipelines, CI/CD systems, and downstream production environments with minimal friction.

This level of sustained activity aligns with broader public reporting that cyber operations, including theft, espionage, and cryptocurrency-related crime, are now a significant source of revenue for the North Korean government. As a result, Lazarus is now one of the most prolific and successful state-sponsored cybercriminal enterprises operating today. Lazarus is investing in ecosystems where speed, scale, and reuse combine to maximize the downstream impact of each compromised dependency.

Hybrid Open Source Malware Dominates the Lazarus Playbook

Lazarus packages are distinguished by how they integrate multiple threat behaviors into a single component. These aren’t single-purpose nuisances; they’re multi-function packages designed to support a staged intrusion chain. In the dataset, Sonatype Security Research observed that most Lazarus packages carry multiple threat behaviors: roughly 77% include two or more threat types, and nearly 9% include four or more. In plain terms, the “package” is often just stage zero.

FIGURE 2.3: Lazarus Group Packages by Number of Threat Types

Source: Sonatype

Behaviorally, the profile is dropper-led and credential-first: droppers appear in ~98% of packages, secrets exfiltration in ~64%, and backdoor functionality in ~29%. That combination matters. Droppers keep the published artifact small and less obviously malicious; exfiltration turns a single install into stolen tokens and credentials; and backdoor capability reflects investment in persistence and post-compromise control. The Lazarus pattern demonstrates repeatable intrusion tooling that is built to land quietly, harvest access, and remain useful after the initial foothold.

FIGURE 2.4: Lazarus Group Campaign Threat Types

Source: Sonatype

Targeting is Optimized to Exploit Muscle Memory

FIGURE 2.5: Top Lazarus Group Developer Lures

Source: Sonatype

The naming patterns show deliberate clustering around high-velocity toolchains, such as Tailwind, Vite, and React. Zooming out, nearly 43% of Lazarus-linked packages reference common developer framework or tool keywords. This is an intentional distribution strategy. These ecosystems have high dependency churn, many “one more plugin” installs, and constant troubleshooting under deadlines. That’s the ideal environment for lookalike packages to blend in and get pulled into both workstations and CI. Sonatype’s prior research showed that modern applications routinely contain hundreds of dependencies — averaging around 180 — making it unrealistic for developers to closely scrutinize every package they consume.

Execution is Modular and Repeatable

One of the most important operational signals in Sonatype’s analysis is how scalable the campaign was. The data shows strong indicators of templated reuse and rapid variant generation as opposed to one-off, bespoke malware. The distribution is sharply concentrated: Sonatype Security Research mapped 341 packages to a set of just 32 anchor packages, and the largest anchor clusters fan out into dozens of related variants.
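For a sense of scale, 341 packages mapped to 32 anchors averages out to roughly 10.7 packages per anchor. The sketch below shows one simple way that near-neighbor names could be grouped under an anchor using string similarity. It is an illustrative assumption, not Sonatype Security Research's method, and the `group_by_anchor` helper, the 0.75 cutoff, and any package names passed to it are hypothetical.

```python
from collections import defaultdict
from difflib import get_close_matches


def group_by_anchor(names, anchors, cutoff=0.75):
    """Assign each package name to its most similar anchor, if any is close enough.

    Returns (clusters, unassigned): a dict of anchor -> variant names, and a
    list of names that did not resemble any anchor.
    """
    clusters = defaultdict(list)
    unassigned = []
    for name in names:
        # get_close_matches returns the best matches at or above the cutoff.
        match = get_close_matches(name, anchors, n=1, cutoff=cutoff)
        if match:
            clusters[match[0]].append(name)
        else:
            unassigned.append(name)
    return dict(clusters), unassigned
```

Name similarity alone would not be enough to attribute packages to a campaign. In practice, shared code templates, infrastructure, and publishing behavior would carry far more weight than spelling.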

That concentration is a direct indicator of manufacturing capacity: Lazarus can iterate quickly, generate families of near-neighbors, and keep publishing even as specific packages are identified and removed. In other words, this is not a handful of malicious uploads. It’s a production line.

The Shai-Hulud Software Supply Chain Attacks: A New Era of Self-Replicating Malware

Hidden deep within duplicate files and nested directories, Shai-Hulud evaded superficial scans and leveraged maintainer credential theft to publish poisoned updates. The worm compromised more than 500 packages in days, spreading autonomously across registries and developer machines.

The result was a rapidly self-propagating software supply chain worm, capable of infecting projects downstream without any manual publication step. This was quickly followed by another self-replicating npm malware in November, named “Sha1-Hulud: The Second Coming.” These campaigns illustrate the next phase of open source malware — one that behaves more like network worms than passive implants.

In contrast to traditionally understood malware, which had to be downloaded and installed before it would execute, open source malware can execute before installation completes (for example, from a pre-install script), meaning developers only need to pull in a package to become a victim.
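In npm, this kind of early execution typically relies on lifecycle scripts declared in a package's `package.json`. The following is a minimal sketch that lists the install-time hooks a manifest declares without running anything. The `install_time_scripts` helper is hypothetical, and the sketch assumes the manifest text has already been obtained safely.

```python
import json

# npm lifecycle scripts that run automatically as part of `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(package_json_text: str) -> dict[str, str]:
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    # Keep only the hooks that execute during installation.
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}
```

A non-empty result is not proof of malice, since many legitimate packages compile native code at install time, but it does mark where code will run before a developer has imported anything. Teams that want to avoid this execution path altogether can install with npm's `--ignore-scripts` option, at the cost of breaking packages that genuinely need their install step.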

Self-Replicating Malware Attacks in 2025

September 16, 2025: Shai-Hulud
npm, 500+ packages
The first documented self-replicating open source malware; demonstrated innovative use of automation by attackers to hijack accounts and publish new, malicious versions of legitimate packages.

October 17, 2025: Glassworm
OpenVSX and Microsoft VSCode, 12 packages
Impersonated popular developer tools to steal credentials, drain cryptocurrency wallets, and use the Solana blockchain for command-and-control communication.

November 9, 2025: Glassworm
OpenVSX and Microsoft VSCode, 3 packages
New malicious packages with 10,000 downloads were uncovered; they used new extensions and publisher accounts to bypass cleanup efforts.

November 11, 2025: IndonesianFoods
npm, 169,538 packages
This campaign was designed to self-replicate every seven seconds. While some packages abused the TEA protocol, most appeared designed to overwhelm detection and exploit ecosystem trust at scale.

November 24, 2025: Sha1-Hulud: The Second Coming
npm, 49 packages
The hijacking campaign surged a second time with a new name and slight tweaks to evade detection; the attackers also introduced the use of Bun to deploy the payload.

December 1, 2025: Glassworm
OpenVSX and Microsoft VSCode, 24 packages
In this third wave, the threat actors artificially inflated download counts of the packages to increase discoverability.

The Open Source Malware Supply Chain

Modern open source malware is modular, resilient, and designed to bypass both static and human inspection.

Multi-stage payloads: Droppers download encrypted payloads from C2 servers or embed secondary stages locally.
Obfuscation layers: Increasing use of eval(), encoded scripts, or disguised binaries within legitimate file trees.
Legitimate infrastructure for C2: Slack, GitHub, Dropbox, and even logging services (like Better Stack) are co-opted for command-and-control traffic.
Local project propagation: Recent attacks weaponize developer machines to infect all other projects they find and push infected versions upstream.
Multi-process behavior: Telemetry from Sonatype’s behavioral analysis indicates a rise in “multi-process modular malware,” particularly in npm and PyPI.
Install-time execution: The latest malicious packages run during installation, dropping payloads before builds.
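Some of the obfuscation traits listed above leave textual traces that a simple heuristic can surface. The sketch below scans JavaScript source text for three such signals. The patterns, the length thresholds, and the `obfuscation_signals` helper are illustrative assumptions, and a scan like this is easy to evade and prone to false positives.

```python
import re

# Dynamic code evaluation: eval(...) or the Function(...) constructor.
EVAL_CALL = re.compile(r"\b(?:eval|Function)\s*\(")
# A long unbroken run of base64-alphabet characters (threshold is arbitrary).
LONG_ENCODED = re.compile(r"[A-Za-z0-9+/]{200,}={0,2}")
# Many consecutive \xNN escapes, a common way to hide strings.
HEX_ESCAPES = re.compile(r"(?:\\x[0-9a-fA-F]{2}){16,}")


def obfuscation_signals(source: str) -> list[str]:
    """Return the names of the obfuscation heuristics that match `source`."""
    signals = []
    if EVAL_CALL.search(source):
        signals.append("dynamic-eval")
    if LONG_ENCODED.search(source):
        signals.append("long-encoded-string")
    if HEX_ESCAPES.search(source):
        signals.append("hex-escaped-string")
    return signals
```

Static text matching of this kind sees only what is in the published file. Multi-stage droppers that fetch their payload later are one reason the behaviors above are described as bypassing static inspection.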

The throughline is that malware is adopting the same modular architecture that makes open source so powerful. In 2025, software supply chain attacks mirrored the software supply chain itself. The risk is not theoretical. It’s structural.

HOW SHADOW DOWNLOADS COMPOUND OPEN SOURCE MALWARE RISK

This phenomenon is especially visible in ML and DevOps contexts. MLOps is still a newer, less mature discipline, and it has not yet absorbed many of the supply chain lessons that became standard practice in traditional software development. Combined with intense pressure for rapid experimentation and deployment, teams often default to convenience-driven workflows that bypass normal governance.

In practice, that shows up as ungoverned “shadow downloads” that pull artifacts directly from wherever they are easiest to access. Examples include precompiled Python wheels and CUDA libraries fetched from unofficial sources, Hugging Face models loaded directly through package installs or runtime calls, and internal scripts or agents that silently retrieve dependencies from places like GitHub or Pastebin.
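One way to picture a shadow download is as any fetch that points somewhere other than an approved source. The sketch below scans requirements-style lines for URLs whose host is not on an allowlist. The `APPROVED_HOSTS` policy, the `find_shadow_downloads` helper, and the sample hosts in the usage note are assumptions for illustration, not a recommended policy.

```python
import re
from urllib.parse import urlparse

# Hypothetical policy: hosts that dependencies are allowed to come from.
APPROVED_HOSTS = {"pypi.org", "files.pythonhosted.org"}

URL_PATTERN = re.compile(r"https?://[^\s'\"]+")


def find_shadow_downloads(lines, approved_hosts=APPROVED_HOSTS):
    """Return (line_number, host) for every URL pointing at an unapproved host."""
    findings = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        for url in URL_PATTERN.findall(stripped):
            host = urlparse(url).hostname or ""
            if host not in approved_hosts:
                findings.append((number, host))
    return findings
```

Given a line such as `torch @ https://example.invalid/wheels/torch.whl`, the function reports the line number and the host `example.invalid`. The same idea extends to shell scripts and notebooks, though runtime calls that build URLs dynamically would not be caught by a text scan.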

This mirrors the “Complacency and Contamination” model from the 10th Annual State of the Software Supply Chain report. Shadow downloads are the modern form of contamination, created when enforcement gaps intersect with developer convenience and automation.

Emerging Software Supply Chain Risks

As AI becomes core to modern pipelines, attackers are following the trend, embedding malicious payloads into container images, AI models, and helper binaries distributed through trusted platforms.

Malicious AI Models in Hugging Face

Although many quarantined models observed to date are not overtly nefarious, the underlying pattern reveals a structural weakness in model registries: model artifacts are being treated like data and scanned as single items, but in reality, most behave more like code and should be treated much the same way.

Sonatype’s research into picklescan vulnerabilities underscored why this is uniquely dangerous in ML: widely used serialization formats can execute code during deserialization, turning a routine “load model” step into an execution path.
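Python's pickle format is a well-known example of this behavior: a pickle can name any importable callable and have it invoked during loading. The sketch below uses the standard library's `pickletools` to list what a pickle would import, without ever loading it. It is a simplified heuristic and not how picklescan is implemented. The `RISKY_MODULES` list, the helper names, and the benign `Demo` payload are assumptions for illustration.

```python
import pickle
import pickletools

# Illustrative denylist of modules whose callables can run arbitrary code.
RISKY_MODULES = {"os", "posix", "nt", "subprocess", "socket", "builtins"}


def find_imports(data: bytes) -> list[tuple[str, str]]:
    """List (module, name) pairs a pickle references, without unpickling it."""
    found = []
    strings = []  # string constants seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Heuristic: the two most recent strings are the module and name.
            found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


def risky_imports(data: bytes) -> list[tuple[str, str]]:
    """Return only the references whose top-level module is on the denylist."""
    return [(m, n) for m, n in find_imports(data) if m.split(".")[0] in RISKY_MODULES]


class Demo:
    """Benign stand-in: loading this pickle would call print(), nothing more."""

    def __reduce__(self):
        return (print, ("code ran during load",))


def build_demo_payload() -> bytes:
    # Serializing does not execute the callable; only loading would.
    return pickle.dumps(Demo(), protocol=4)
```

Running `risky_imports(build_demo_payload())` reports the reference to `builtins.print`, showing that the artifact carries a call instruction rather than plain data. The string-tracking shortcut used for `STACK_GLOBAL` can be defeated by a crafted pickle, which is part of why scanner bypasses in this area are a recurring concern.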

It’s important to note the shape of the malicious activity observed on Hugging Face: many of these repositories appear consistent with security research or proof-of-concept demonstration uploads rather than fully operational criminal campaigns. Some are transparently labeled as unsafe, and several show low download counts. That doesn’t reduce the underlying software supply chain risk, but rather highlights it. In a model registry, even a “demo” artifact can be copied, repackaged, or pulled into the wrong environment, and the consequences play out at runtime.

Two examples illustrate why this matters:

Backdoored Model Artifacts Enabling Remote Access

A cluster of models published under the same account exhibited behavior consistent with establishing a reverse shell to an external host, granting an attacker interactive access to any machine that loads the model. Even when download counts are low, the risk is disproportionate: models are frequently pulled into shared environments (developer workstations, notebooks, CI runners, GPU boxes) where credentials and tokens are plentiful.

Embedded Malicious Code in Serialized Model Files

In another case, a model artifact (a serialized file) contained embedded malicious logic that invoked common system tooling to exfiltrate local files (for example, transmitting /etc/passwd to a remote endpoint). The key point isn’t the specific file targeted — it’s the mechanism: a “model download” can become code execution at load time if organizations treat model artifacts as inherently safe.

AI Agents as Software Supply Chain Attack Multipliers

In the From Guesswork to Governance chapter, we will show that AI code assistants like Claude or ChatGPT can fetch and install malicious code automatically when prompted to fix dependency errors or install missing libraries. The developer’s intent may be harmless, but the result can be catastrophic.

Attackers are increasingly preying on this. Sonatype’s 2025 malware research continues to document deceptive naming patterns — including typosquatting and new evasion tactics that mimic legitimate dependencies to trick developers into installing malware. As organizations integrate AI coding assistants into production workflows, they must recognize that these systems are not neutral intermediaries. They are potential infection vectors.
