New Year, New CVE: A Deep Dive Into the 'node-forge' (CVE-2022-0122)

January 25, 2022 By Juan Aguirre

5 minute read time

With over 16 million weekly downloads, the widely used "node-forge" package on npm implements key security functionality in native JavaScript, including the Transport Layer Security (TLS) protocol, cryptographic functions, and development tools for web apps.

Projects from industry giants such as Cisco, Microsoft, and Alexa (yes, the beloved smart speaker in many people's homes) use the package, according to grep.app:

grep.app results showing the use of node-forge in Alexa projects

CVE-2022-0122 refers to an Open Redirect vulnerability in the parseUrl function from the util.js file within the package. This function takes a supplied string, validates it, and returns an object that is easier to manipulate and work with. To do this, it first uses a regular expression (regex) to analyze the string. Regex is supported by hundreds of applications for text and string handling, and in this case the regex is the root cause of the vulnerability.

Irregular Regex

At the beginning of the function, we can even see a comment that says "FIXME: this regex looks a bit broken." This suggests the developers knew the regex was a problem but were unaware of its potential for abuse.

Initial portion of the vulnerable function with a “FIXME” comment.

To analyze the code, we'll use one of my favorite regex analysis tools: regex101. It breaks the expression down with an easy-to-follow explanation, and it lets you play around with and even debug the expression to see potential matches.

The first part of this regex tells us that the string must start with either `http` or `https`. We know this because it has the ^ symbol, which asserts the start of a line, followed by a capturing group for `https?`. The question mark, ?, tells us that the character before it, meaning the `s`, can occur 0 or 1 times. This leaves us with either `http` or `https` as valid matches for the first part.

Next, it matches the literal characters ://, which complete the scheme portion of the URL. Then, it matches any single character not present in the list (^:&^/) between zero and unlimited times, as denoted by the *. This leaves room for almost anything, which makes sense since this portion is meant to match the host. Finally, it matches a colon, :, to make way for the port number, followed by an unlimited number of characters with no restrictions. This last part matches the path portion and the end of the URL.
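
The breakdown above can be tried out in a few lines of JavaScript. This is only a sketch: the pattern is reconstructed from the walkthrough in this post, and treating the colon as optional and the port as a digits-only group are assumptions rather than a verbatim copy of the library source. `describeUrl` is a hypothetical helper name.

```javascript
// Pattern reconstructed from the walkthrough above (an assumption, not a
// verbatim copy of the node-forge source).
const urlPattern = /^(https?):\/\/([^:&^\/]*):?(\d*)(.*)$/;

// Hypothetical helper: split a string into the groups the regex captures.
function describeUrl(str) {
  const m = urlPattern.exec(str);
  if (m === null) {
    return null; // the string does not start with http:// or https://
  }
  return { scheme: m[1], host: m[2], port: m[3], path: m[4] };
}

// A well-formed URL splits cleanly:
// scheme 'https', host 'example.com', port '8443', path '/docs?page=1'
console.log(describeUrl('https://example.com:8443/docs?page=1'));

// Anything that does not start with http:// or https:// yields null.
console.log(describeUrl('ftp://example.com'));
```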

Regex with ideal input

The main issue here is that this regex is very specific to a certain type of well-formed URL, and real input simply isn't always well-formed, especially when it is controlled by a user. This is something that standard unit tests can miss, and we don't even have to go to the worst case of a malicious user. It can also be an issue with less-than-ideal, misbehaving, or merely curious users.

Also, think of a complex web app that wants to use this library to parse URLs. Not all URLs are as pretty as the ones we are used to seeing in our browser's address bar. Some URLs, which are completely RFC-compliant, can be funky and odd.

Untrusted Redirect

After playing with this regex a bit, I noticed it accepts anything that starts with http[s]://. Because of this tolerance, the groups aren't always properly split. This leads to the URL response object being incorrectly put together, likely with an empty or insecure host, and everything else thrown in the path portion. Within this insecure host and path, everything is fair game, including all variations of slashes (pictured below).

Regex with undesired input results in an insecure path.

This is what allows an Open Redirect.
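
The incorrect split can be observed directly. The sketch below uses a pattern reconstructed from this post's walkthrough (an assumption, not the verbatim library source) and compares the host it reports with the one produced by the standard `URL` class built into Node.js and browsers. `naiveHost` and the `.example` hostnames are hypothetical.

```javascript
// Pattern reconstructed from this post's walkthrough (an assumption, not
// the verbatim library source).
const naivePattern = /^(https?):\/\/([^:&^\/]*):?(\d*)(.*)$/;

// Hypothetical helper returning only the host group the regex captures.
function naiveHost(str) {
  const m = naivePattern.exec(str);
  return m === null ? null : m[2];
}

// Inputs that all start with https:// but are not equally "pretty".
const samples = [
  'https://example.com/home',    // well-formed
  'https:///evil.example/home',  // extra slash after the scheme
  'https://\\evil.example/home'  // backslash after the scheme
];

for (const input of samples) {
  // The standard WHATWG parser, for comparison.
  const standardHost = new URL(input).hostname;
  console.log(
    JSON.stringify(input),
    '| regex host:', JSON.stringify(naiveHost(input)),
    '| standard host:', JSON.stringify(standardHost)
  );
}
```

For the second and third inputs, the regex reports an empty or backslash-prefixed host, while the standard parser resolves both to `evil.example`. A check that trusts the regex's view of the host can therefore disagree with where the request actually ends up.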

So how does an attacker leverage this? It depends greatly on how the vulnerable application uses the library, but ultimately it gives an attacker the ability to bypass the URL parser.

Within the node-forge library itself, the parseUrl function is used in two other functions, `createClient` and `withinCookieDomain`.

With createClient, an attacker could entice a victim into being connected to a malicious client by setting the host to an attacker-controlled URL. This can be considered an Open Redirect, since the attacker can redirect a victim to an undesired, arbitrary location. Once the victim is connected to the rogue client, much more can happen, including access to sensitive information.

For withinCookieDomain, depending on how the application uses it, the flaw could plausibly be abused to bypass authorization checks. The possibilities and potential impacts grow further if a developer uses the parseUrl function directly.

The Fix

Fortunately, the developers released a fix for this in v1.0.0.

Changelog update for the fix

The update removes the insecure function and replaces it with the WHATWG URL Standard. This underlines one of the greatest things about open source: you can easily find many stable, reliable libraries to help you build something awesome.

Implemented fix. Replacing parseUrl with WHATWG URL Standard.
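
For readers who haven't used it, the WHATWG URL Standard is available in Node.js and modern browsers as the global `URL` class. The sketch below is not the node-forge patch itself. `parseOrNull` is a hypothetical wrapper that shows how the standard parser rejects strings it cannot parse instead of returning a half-filled object.

```javascript
// Hypothetical wrapper around the standard URL class (not node-forge code).
function parseOrNull(input) {
  try {
    const url = new URL(input);
    return {
      scheme: url.protocol, // includes the trailing colon, e.g. 'https:'
      host: url.hostname,
      port: url.port,       // empty string when the scheme's default port is used
      path: url.pathname
    };
  } catch (err) {
    return null; // new URL() throws a TypeError on input it cannot parse
  }
}

console.log(parseOrNull('https://example.com:8443/docs'));
console.log(parseOrNull('https://example.com:443/docs')); // default port is dropped
console.log(parseOrNull('not a url'));                    // null
```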

Another good practice is to always sanitize user input. We can never trust a user because, whether due to clumsy fingers or malicious intent, user input can be dangerous, especially when we trust it blindly. Even if we're going to pass that input into another function that implements some sort of filtering, it's never a bad idea to do your own sanitizing in-house. It's more secure to check the input as it's received, before processing it further.
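
As one illustration of checking input as it's received, here is a minimal sketch of a redirect-target check built on the standard `URL` class. Everything in it is an assumption made for the example: the `isSafeRedirect` name, the allowlist contents, and the HTTPS-only policy are hypothetical, and a real application would choose its own rules.

```javascript
// Hypothetical allowlist; a real application would supply its own hosts.
const allowedHosts = new Set(['example.com', 'www.example.com']);

// Hypothetical check, run on a user-supplied redirect target as soon as it
// is received and before it is handed to any other code.
function isSafeRedirect(input) {
  if (typeof input !== 'string') {
    return false;
  }
  let url;
  try {
    url = new URL(input);
  } catch (err) {
    return false; // not parseable as an absolute URL
  }
  if (url.protocol !== 'https:') {
    return false; // this sketch only allows HTTPS targets
  }
  if (url.username !== '' || url.password !== '') {
    return false; // reject embedded credentials such as user@host
  }
  return allowedHosts.has(url.hostname);
}

console.log(isSafeRedirect('https://example.com/account'));        // true
console.log(isSafeRedirect('https:///evil.example/account'));      // false
console.log(isSafeRedirect('https://example.com@evil.example/'));  // false
```

The check compares the host that the standard parser resolves, not a host extracted by a hand-written pattern, so the decision is made on the same value the eventual request would use.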

That's not to say you should write your own implementations of known-reliable functionality. Open source has it all, so don't reinvent the wheel if it's not necessary – just make sure it rolls the way it's supposed to.

Written by Juan Aguirre

Juan is a security researcher at Sonatype and part of the team who has helped Sonatype catalog more than 100 million open source components.
