Can Nexus Scale?

Customers often ask us to prove that Nexus can scale to meet the demands of thousands, and sometimes tens of thousands, of developers. Fortunately, we don’t have to stand up an expensive set of machines for a proof of concept: the world’s largest collection of active open source projects is hosted on a single instance of Nexus Professional running at http://oss.sonatype.org. This instance isn’t just proof that Nexus Professional can scale; it is also a public instance that you can use as a model for your own.
If you are looking for an estimate of the hardware required to support your instance of Nexus, this post will detail the configuration and specifications of the Nexus OSS repository instance. This instance is the largest known deployment of a repository manager in active use.
Performance of Nexus OSSRH
Nexus OSSRH serves on the order of 1,400 to 2,500 requests per minute. What drives this level of activity? First, the instance serves as a snapshot repository for many open source projects, and the list of projects hosted on OSSRH is a long one. When we examine the logs for oss.sonatype.org, we regularly see thousands of unique IP addresses every day, and oss.sonatype.org is involved in the CI builds of a number of OSS projects. This means that at any given time OSSRH is supporting any number of simultaneous CI builds, and over the course of a given day we’re serving artifacts to thousands of developers.
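To put those numbers in more familiar units, here is a small back-of-the-envelope sketch in Python. The function name is ours, and it assumes the per-minute rate is sustained around the clock, which real traffic will not do exactly, so treat the daily totals as rough upper-level estimates rather than measurements.

```python
def per_minute_to_rates(requests_per_minute):
    """Convert a requests-per-minute figure into per-second and per-day rates.

    Assumption: the rate is constant for a full 24 hours.
    """
    per_second = requests_per_minute / 60
    per_day = requests_per_minute * 60 * 24
    return per_second, per_day


# The low and high ends of the range quoted for OSSRH.
for rpm in (1400, 2500):
    per_second, per_day = per_minute_to_rates(rpm)
    print(f"{rpm} req/min is about {per_second:.1f} req/s or {per_day:,} req/day")
```

Under that assumption, the quoted range works out to roughly 23 to 42 requests per second, or about 2 to 3.6 million requests per day.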
OSSRH approximates the performance characteristics required for the largest development efforts in the world: multiple geographic locations, 24/7 uptime requirements, and very high performance standards. This service has to stay up. If OSSRH were to become unavailable, you would hear an immediate outcry from every affected OSS developer. Just choose a day and search Twitter for projects announcing that they’ve pushed artifacts to oss.sonatype.org, and you’ll see that every day has several critical releases.
When a customer asks us to prove that Nexus Professional scales, we don’t have to stop and set up a contrived performance test. We support this level of activity every single day. All we need to do is point them at OSSRH.
Nexus OSSRH Specifications
We’ve established that OSSRH is at the center of a large amount of active OSS development. It serves between 1,400 and 2,500 requests per minute, and it is a mission-critical resource. It would be reasonable to expect that this service runs on a cluster of machines distributed throughout the world to minimize latency. Think again: this is a single VM with modest specifications, running at Contegix and constantly monitored by New Relic.
Our standard setup for all managed forges is:
- 2 CPUs
- 3GB RAM
- 400GB disk (this is completely dependent on your repository contents)
- RHEL 5.6 x64 (Contegix, our managed hosting service, recommends using this OS)
- Java 1.6 x64 with 1GB Heap* (see correction below)
- The virtual disk is located on a SAN connected with iSCSI over 1GbE
If you are supporting a global-scale network of thousands of developers, the hardware cost for this Nexus instance is a “drop in the bucket”. One instance of Nexus Professional would easily fit on a very modest VM, or on an m1.large instance with space to grow if you run on a service like Amazon EC2. (The only thing you might need to spend on is disk. For OSSRH, we use the six-disk RAID 50 approach described below.)
Scaling Nexus: I/O Requirements, Network, and Disk
Under heavy load, increasing the number of CPUs and amount of RAM may help, but often the gating factor is either disk I/O or network. We do not recommend using NFS to mount a virtual disk for the working folder, as many customers have had trouble with locking and corrupted indexes. iSCSI is working very well for us on oss.sonatype.org, and it also works for many of our flagship customers.
Over the course of a day, the system typically needs to scale up in terms of network and I/O. Nexus “sings” under heavy load because we have made numerous code-level optimizations to ensure that we’re making effective use of caching to reduce round trips to disk. For I/O performance, we recommend a redundant solution that maximizes disk spindles while maintaining fault tolerance. We use RAID 50 in our SAN. RAID 50 combines the straight block-level striping of RAID 0 with the distributed parity of RAID 5: it is a RAID 0 array striped across RAID 5 elements. This approach emphasizes both performance and extreme reliability, and it requires at least 6 drives.
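To make the six-drive minimum concrete, here is a minimal Python sketch of the usable-capacity arithmetic for RAID 50. It assumes equal-sized drives split evenly across the RAID 5 groups; the function name and the 500 GB drive size are hypothetical and are not the actual OSSRH configuration.

```python
def raid50_usable_capacity(num_drives, drive_size_gb, num_groups=2):
    """Usable capacity of a RAID 50 array built from equal-sized drives.

    RAID 50 stripes (RAID 0) across at least two RAID 5 groups, each RAID 5
    group needs at least three drives, and each group gives up one drive's
    worth of space to parity.
    """
    if num_groups < 2:
        raise ValueError("RAID 50 needs at least two RAID 5 groups")
    if num_drives % num_groups != 0:
        raise ValueError("drives must divide evenly across the groups")
    drives_per_group = num_drives // num_groups
    if drives_per_group < 3:
        raise ValueError("each RAID 5 group needs at least three drives")
    return num_groups * (drives_per_group - 1) * drive_size_gb


# Hypothetical example: six 500 GB drives as two RAID 5 groups of three.
print(raid50_usable_capacity(6, 500))
```

With six drives in two groups of three, two drives’ worth of space goes to parity and four drives’ worth is usable. The array survives one drive failure in each RAID 5 group, and anything smaller than six drives fails the checks above.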
If You Need Scale, Try Nexus Pro
Sonatype designed Nexus to meet the demands of the OSS community from the beginning. We’ve been supporting global-scale OSS communities for years, and we’ve integrated the lessons learned from supporting active OSS development into Nexus Professional. If you need to scale, try Nexus Professional today.
Correction from Mike Hansen: With 2.0 we upped that to 2GB, at least on OSSRH. But that pretty much just provides some extra headroom… Actually, IIRC, the reason we went to 2GB was because we were battling memory consumption with some repository indexes that had not been optimized (i.e. the index optimization task had not been run for a very long time).
An Emerging Role in IT Governance: The ALM Architect

Whenever I’m at a client I tend to ask, “Who decides what open source packages are acceptable?” Nine times out of 10, people will say something about an “Architecture” group. Maybe there’s a single architecture group that sets standards across the entire department, or, more often, there are several groups that offer a set of services that may overlap. The more moving parts in an IT department, the less clear people are about who will be responsible for running Nexus.
Take one example: I encountered a company that had a central architecture group alongside another team that managed deployments. There was some confusion about who would manage Nexus. Eventually they decided that Nexus fell under the auspices of the Architecture group, because this group was setting license policy and that was why they were bringing Nexus in to begin with. I had two takeaways from this:
- This is an entirely new problem that most organizations haven’t adapted to yet – having a comprehensive view of everything in your Development Infrastructure stack.
- While everyone was cordial, there was some tension: an unknown set of overlapping responsibilities that was neither entirely development nor entirely ops. I do think that while DevOps preaches harmony, it may also encourage a lack of clarity about who is responsible for running something like a repository manager.
Is Analyzing Open Source Projects by Contributors a Valid Metric?
ReadWriteWeb’s Joe Brockmeier has an interesting piece analyzing OpenStack Essex. While this isn’t an exact overlap with the kind of analysis we’re working on for Insight and Nexus, it’s a view into the social and open source dynamics of a project.
Brockmeier’s article is a summary of some analysis that OpenStack contributor Mark McLoughlin assembled from commits and Gerrit code reviews. It’s a breakdown of activity by organization. As with many open source projects that have corporate involvement, one or two companies tend to dominate the commit breakdown.
Where the article is a little off-base is in the assessment of community health. You can’t judge the “health” of an open source project by the mix of companies represented in a commit breakdown alone. It’s an interesting statistic, but there’s so much more to open source than code commits, including documentation efforts, marketing spend by companies invested in a project, and financial support for essential efforts not directly related to code (legal, infrastructure, etc.). Open source isn’t about code alone, and while it is an ideal for open source projects with corporate involvement to have balance, this balance can shift over time.
Oracle Issues Critical Security Bug Fixes for Databases, Glassfish, and More

If you are watching our security feed, you may have noticed this IDG News Service story reporting on a critical security patch from Oracle. Since many of our customers are directly affected by this vulnerability, we thought this announcement was important enough to feature. From the story:
“The upcoming patch batch includes six fixes for Oracle’s database, three of which can be exploited remotely without a username and password. Common Vulnerability Scoring System (CVSS) base score for the database bugs is 9 on the system’s 10-point scale. Another 11 patches cover Oracle Fusion Middleware, with 9 being remotely exploitable without authentication.”
Three important takeaways from this announcement:
- This patch contains some fixes for bugs that score 9 on the CVSS. A score of 9 is a “big deal”. If you are not convinced, try playing around with this CVSS calculator from NIST or read this Complete Guide to the Common Vulnerability Scoring System Version 2.0.
- Many of the vulnerabilities are exploitable without credentials. Attackers don’t need to compromise your database or application server credentials; if someone finds a way into your network, you may be vulnerable. Couple this with the fact that almost everyone is running either MySQL or Oracle, and you have factors that bump up that CVSS score.
- Glassfish, a very popular OSS application server, and MySQL, a ubiquitous OSS database, are also affected.
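To see how a score near the top of the scale comes about, here is a minimal Python sketch of the CVSS version 2.0 base score equations. The function name and the example vectors are ours and are for illustration only; they are not the vectors Oracle published for these particular bugs.

```python
# Metric weights from the CVSS v2.0 base equations.
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}     # Local, Adjacent, Network
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}  # High, Medium, Low
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}    # Multiple, Single, None
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}            # None, Partial, Complete


def cvss2_base_score(av, ac, au, c, i, a):
    """Compute a CVSS v2.0 base score from the six base metric letters."""
    impact = 10.41 * (1 - (1 - IMPACT[c]) * (1 - IMPACT[i]) * (1 - IMPACT[a]))
    exploitability = (
        20 * ACCESS_VECTOR[av] * ACCESS_COMPLEXITY[ac] * AUTHENTICATION[au]
    )
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)


# Illustrative vectors: network access, low complexity, complete impact,
# first with no authentication required and then with a single login.
print(cvss2_base_score("N", "L", "N", "C", "C", "C"))
print(cvss2_base_score("N", "L", "S", "C", "C", "C"))
```

In this sketch, a remotely reachable, low-complexity flaw with complete impact on confidentiality, integrity, and availability scores 10.0 when no authentication is needed and 9.0 when a single login is required, which shows how little room there is above a 9.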
Here’s a quote from the Oracle Critical Security Patch:
Due to the threat posed by a successful attack, Oracle strongly recommends that customers apply CPU fixes as soon as possible. This Critical Patch Update contains 88 new security fixes across the product families listed below.
If you are affected by this vulnerability, go get this Critical Security Patch Update from Oracle today.
Note: This post references our Security Feed. We maintain a feed of security stories relevant to developers which is isolated from our main blog feed. If you are interested in getting the full feed, read it here.
Is your phone possessed? Or is it Android Malware?
Hackers aren’t content to infect your laptop; they want your phone too. There’s an article over on SecurityNewsDaily that talks about some new Android malware that can take over your phone. Here’s the fun quote:
“The new Android malware disguises itself in fully functional copies of apps, including ‘Angry Birds Space,’ and hides its malicious payload in the string of code at the end of an otherwise genuine JPEG file, Lookout said. This rogue code exploits the GingerBreak vulnerability, a flaw that enables it to gain control of the phone and trick the victim into purchasing apps from illegitimate app stores.”
It looks like Android developers need to start paying more attention to security in general now that Android has exceeded 50% market share in the US. While this vulnerability isn’t something that is directly addressable with Insight at the moment, it reminds us that we need to start focusing more on mobile. Since Android development is Java-based, you can immediately benefit from downloading Nexus Professional 2.0 today and making sure that all of your application dependencies are free of known vulnerabilities.