Rule over your dependencies and scan at your own open source risk
High-impact vulnerabilities generate headlines you cannot ignore. But mere widespread awareness of cases such as Spring4Shell, Log4Shell, and Struts2 will not reduce your relative open source risk.
How do you direct your attention so your organization's open source vulnerabilities don't go unnoticed? A good first step: know and understand your dependencies by conducting regular scans of the open source you use in your environments.
But what do differences in dependency scanning methodologies mean for your software supply chain?
Scrutinizing scanning
After developers downloaded more than 2.3 trillion open source packages from top ecosystems in 2021, recent efforts seek to help OSS maintainers better manage their projects. With so many open source components in every application, risks can come from anywhere in your codebase.
To manage that risk, you need to baseline your existing component usage. You can accomplish this in a linear sequence:
- Identify the components in your application.
- Identify the risk in your existing components.
- Decide if remediation is necessary.
Dependency scanning, as part of your software composition analysis (SCA) toolbox, covers the first two parts of this sequence. A scan captures the components you are using in a list, such as a software bill of materials (SBOM), which you then check against vulnerability databases for detailed information on the potential risks from your dependencies.
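The first two steps of that sequence can be sketched in a few lines of Python. This is a minimal illustration, not how any particular SCA tool works: the `Component` type, the `KNOWN_VULNERABILITIES` table, the package names, and the advisory ID are all hypothetical stand-ins for a real SBOM and a real vulnerability database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One entry in a (hypothetical) component inventory such as an SBOM."""
    name: str
    version: str


# Hypothetical stand-in for a vulnerability database lookup.
# Keys are (name, version); values are made-up advisory identifiers.
KNOWN_VULNERABILITIES = {
    ("example-logger", "2.14.0"): ["EXAMPLE-2021-0001"],
}


def find_risks(inventory):
    """Map each component with at least one known advisory to its advisories."""
    risks = {}
    for component in inventory:
        key = (component.name, component.version)
        advisories = KNOWN_VULNERABILITIES.get(key, [])
        if advisories:
            risks[component] = advisories
    return risks


# Step 1: identify the components. Step 2: identify the risk in them.
inventory = [
    Component("example-logger", "2.14.0"),
    Component("example-http", "1.0.3"),
]
risks = find_risks(inventory)
for component, advisories in risks.items():
    print(f"{component.name} {component.version}: {', '.join(advisories)}")
```

The third step, deciding whether remediation is necessary, stays a human or policy decision; the sketch only surfaces the data that decision needs.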
Let's consider two dependency scanning methodologies:
- As-declared
- As-observed (a.k.a. as-deployed)
If you understand the differences between these scanning methodologies, you can take a more holistic approach to managing your open source risk. And if you account for direct versus transitive dependencies in your scanning, you can more proactively identify and, if necessary, remediate.
As-declared
An as-declared scan looks at what is declared in a package's manifest file — for example, a package.json, setup.py, or even a simple requirements.txt — and checks for reported vulnerabilities against any and all dependencies explicitly listed in these files.
This type of scan takes just the list of dependencies — known as direct dependencies — provided by a package at face value without verifying what dependencies are actually present in the package.
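The resultant report typically takes the form of a simple list of the declared dependencies and any vulnerabilities reported against them.
That output rests on a blind assumption: that a package contains exactly what its manifest file declares, with the manifest treated as the canonical list of dependencies.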
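To make the "face value" point concrete, here is a minimal sketch of the input side of an as-declared scan for a `requirements.txt`-style manifest. The `read_declared` helper is hypothetical and heavily simplified: it handles only bare names and `name==version` pins, and it never looks at what is actually installed.

```python
def read_declared(requirements_text):
    """Return {name: pinned_version_or_None} for a requirements.txt-style manifest.

    Simplified on purpose: handles only `name==version` pins and bare names,
    and skips comments, blank lines, and pip options such as `-r other.txt`.
    """
    declared = {}
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        name, _, version = line.partition("==")
        declared[name.strip().lower()] = version.strip() or None
    return declared


manifest = """\
# direct dependencies only
requests==2.31.0
flask
"""
declared = read_declared(manifest)
print(declared)
```

Everything the scan knows comes from the text of the manifest. Nothing in this code can notice a file that was copied in by hand or a dependency pulled in by `requests` or `flask` themselves.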
This type of scan misses many scenarios, including:
- manually copied dependency files
- transitive dependencies that resolve differently across environments and machines
- components imported as the dependencies of direct dependencies
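The last item in that list is easy to see with a toy dependency graph. The sketch below walks a hypothetical graph (all package names are invented) and compares what the manifest declares directly with everything that would actually be pulled in.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it depends on directly.
DEPENDENCY_GRAPH = {
    "my-app": ["web-framework", "http-client"],
    "web-framework": ["template-engine"],
    "http-client": ["tls-lib"],
    "template-engine": [],
    "tls-lib": [],
}


def transitive_dependencies(package, graph):
    """Return every package reachable from `package` by a breadth-first walk."""
    seen = set()
    queue = deque(graph.get(package, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(graph.get(current, []))
    return seen


direct = set(DEPENDENCY_GRAPH["my-app"])
everything = transitive_dependencies("my-app", DEPENDENCY_GRAPH)
missed_by_manifest = everything - direct
print(sorted(missed_by_manifest))
```

In this toy graph, `template-engine` and `tls-lib` ship with the application even though the application's own manifest never names them.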
As-observed
An as-observed (a.k.a. as-deployed) scan checks for reported vulnerabilities and malware present in any of the dependencies of dependencies (i.e., transitive dependencies) that will end up getting pulled (deployed) by the package/component.
In this scan, the package manager resolves the dependencies of the packages and downloads the entire dependency chain. The scan then examines:
- post-build artifacts, including binaries
- every component in the image, even if it is not defined in a manifest
- structural similarity, derived coordinates, and file names
It looks at the physical files and can tell you exactly what has been put there manually or by way of the package manager, thereby providing data that accounts for all embedded dependencies.
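As a rough illustration of looking at the physical files, the sketch below inventories a Python install directory by reading the `<name>-<version>.dist-info` folders that installers leave on disk, then reports anything the manifest never mentioned. The helper names are hypothetical, and this covers only one narrow case: a real as-observed scanner would also inspect binaries and files that carry no packaging metadata at all.

```python
import tempfile
from pathlib import Path


def observe_installed(site_packages):
    """Return {name: version} for every `*.dist-info` entry found on disk."""
    observed = {}
    for entry in Path(site_packages).glob("*.dist-info"):
        # Directory names follow the pattern `<name>-<version>.dist-info`.
        base = entry.name[: -len(".dist-info")]
        name, _, version = base.rpartition("-")
        if name:
            observed[name.lower().replace("_", "-")] = version
    return observed


def undeclared(declared_names, observed):
    """Components present on disk that the manifest never mentioned."""
    return {name: ver for name, ver in observed.items() if name not in declared_names}


# Demo against a throwaway directory that mimics an install location.
with tempfile.TemporaryDirectory() as tmp:
    for folder in ("requests-2.31.0.dist-info", "urllib3-2.0.7.dist-info"):
        (Path(tmp) / folder).mkdir()
    observed = observe_installed(tmp)

extra = undeclared({"requests"}, observed)
print(extra)
```

Here the manifest declared only `requests`, yet `urllib3` is physically present, which is exactly the kind of gap an as-declared scan cannot see.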
The output typically takes the form of a detailed build report.
With this methodology, you get a complete list of dependencies that is true to what is physically present, fewer false positives, and often the ability to discover files that have been tampered with after deployment. This allows you to identify dependencies more accurately and manage risk.
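One way such tampering could be detected, assuming digests of each artifact were recorded at build time, is to re-hash the deployed files and compare. The article does not say how any particular scanner does this; the function names and the sample file below are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_tampered(expected_digests):
    """Return paths whose current digest differs from the recorded one."""
    return [
        path
        for path, recorded in expected_digests.items()
        if sha256_of(path) != recorded
    ]


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "library.bin"
    artifact.write_bytes(b"original contents")
    # Assumption: this digest was captured when the artifact was built.
    recorded = {artifact: sha256_of(artifact)}

    clean_result = find_tampered(recorded)
    artifact.write_bytes(b"modified after deployment")
    tampered_result = find_tampered(recorded)

print(clean_result, [p.name for p in tampered_result])
```

A manifest-only scan has no equivalent check, because it never opens the deployed file.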
As an analogy
Consider as-declared versus as-observed methodologies in a hypothetical scenario regarding an incident in a high-security building.
Let's say the building requires biometric scanning at checkpoints for access to everything onsite — access into the building, onto a specific floor, into a specific room, as well as for use of any hardware or devices and any software system or application therein.
Let's say an authorized person entered the building but perpetrated unauthorized activity onsite. You know the location, device, and software utilized in the hour-long timeframe of the activity, but you need to identify the specific perpetrator.
If you follow an as-declared methodology, you would just check a directory of employees who reportedly work in that building. You might then be able to narrow down a list of suspects by cross-referencing with on-file descriptions of a given person's role. But the directory might be outdated. How do you know who worked from home or was out-of-office that day? And you could only guess at who might have used the hardware or software of note. There are a lot of unknowns and assumptions in this course of action.
If you follow an as-observed methodology, you would pull identity-specific, time-stamped, location-based metadata from all of the biometric scanners and cross-reference with the location, device, and software tied to the unauthorized activity. This course of action provides more in-depth visibility for more precise identification in a much more efficient way.
A better way to scan below the surface
While not all scans are created equal, you can choose a method that yields more precise data and takes your OSS governance to a higher level. Doing so as part of more secure coding practices can also help you push back on the trend of attacks against your application builds.
In order to manage your risks, you need the most precise inventory of software dependencies used in your applications.
Just taking the word of an as-declared manifest can expose you to vulnerabilities or even malware, as happened with Octopus Scanner.
With an as-observed scan, you account for your transitive-dependency chain with a full evaluation of the binaries themselves. You have the power to scan smarter, catch vulnerabilities earlier, and manage risk more efficiently and precisely.
Whether or not you leverage an SCA tool that provides such precision at an organizational level, an as-observed scan gives you more coverage in identifying vulnerabilities and more precise data to automate your OSS governance and free up more time for you to innovate.
Written by Aaron Linskens
Aaron is a technical writer on Sonatype's Marketing team. He works at a crossroads of technical writing, developer advocacy, software development, and open source. He aims to get developers and non-technical collaborators to work well together via experimentation, feedback, and iteration so they can build the right software.