Contrasting Nexus and Artifactory


Today's Maven users have two solid choices when it comes to repository managers: Sonatype Nexus Repository and JFrog Artifactory. While we believe Nexus is the better choice, we're excited to see that there is competition in this market. Competition leads to more efficient software markets and more "accountability" to the end-user. Nexus and Artifactory have much more in common than not, but we think the differences are important to understand, as they have dramatic impacts on performance and scalability. In this post, I contrast some design decisions made in the construction of Artifactory with the design decisions we made when developing Nexus.

Contrast #1: Network: WebDAV vs. REST

The first major difference is that Artifactory uses Jackrabbit as a WebDAV implementation for artifact uploads. Nexus implements a simple, lightweight HTTP PUT via Restlet instead. We had a WebDAV implementation in early Alpha releases, but found it far too heavy and slow. Switching to a simple REST call improved our performance and significantly decreased the memory footprint. A profile of the memory use in Artifactory that I ran on previous versions showed that most object creation and memory allocation are related to Jackrabbit. Sure, you can't mount a Nexus repository with a webfolder using WebDAV, but is that what you need a repository manager to do, or would you rather it be blazingly fast doing Maven builds? It is possible to use the lightweight wagon against Artifactory (http vs dav:http), but the choice of Jackrabbit is overhead that isn't needed.
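To make "a simple, lightweight HTTP PUT" concrete, here is a minimal sketch of what such an upload looks like from the client side. It is only an illustration, not Nexus or Maven wagon code: the host name, repository path, and the helper name build_put_request are assumptions, and real deployments would also need authentication.

```python
import urllib.request


def build_put_request(repo_url: str, artifact_path: str, payload: bytes) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT request that uploads one artifact."""
    # Join the repository base URL and the artifact path with exactly one slash.
    url = repo_url.rstrip("/") + "/" + artifact_path.lstrip("/")
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


if __name__ == "__main__":
    # Hypothetical host and repository path, for illustration only.
    request = build_put_request(
        "http://repo.example.com/repository/releases",
        "com/example/demo/1.0/demo-1.0.jar",
        b"jar bytes go here",
    )
    print(request.get_method(), request.full_url)
    # Actually sending it would be: urllib.request.urlopen(request)
```

The whole exchange is one request with the file as its body, which is why there is so little for the server to allocate compared with a full WebDAV stack.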

Contrast #2: Storage: Relational Database vs. Filesystem

The second major difference is that Nexus deliberately chooses to use a regular Maven 2 repository layout to store the data on disk. Doing this effectively isn't always easy, and we've had many discussions with the team, but I hold fast to this approach for several reasons:

  • It makes importing and exporting the repositories a no-brainer. Simply copy the data into the correct folder in the Nexus work folder, and you're done, import finished. Copy it out, export done.

  • The incremental nature of the file changes in a Maven 2 repository layout makes it extremely well suited for incremental backups to tape or other archiving medium.

  • Nexus also keeps its metadata (not to be confused with the maven-metadata) separate from the artifacts, and the data is rebuilt on the fly if it's missing. If you are unlucky and have some hardware or disk error, you will likely only get one file corrupted, not the entire repository.

  • Having the metadata separate means Nexus upgrades don't have to touch any data in the repository folder. Upgrades and rollbacks of the system can happen as fast as you can stop one instance and start the next.
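As a rough sketch of why a plain Maven 2 layout makes import and export so simple, the snippet below maps release coordinates to their on-disk path and treats "import" as a directory copy. The function names and the example paths are hypothetical, snapshot file naming is ignored, and this is not Nexus code; it only assumes the standard Maven 2 layout convention.

```python
import shutil
from pathlib import Path


def maven2_path(group_id, artifact_id, version, extension="jar", classifier=None) -> str:
    """Return the relative path of a release artifact in a Maven 2 layout."""
    # Dots in the groupId become directory separators.
    group_dir = group_id.replace(".", "/")
    suffix = f"-{classifier}" if classifier else ""
    file_name = f"{artifact_id}-{version}{suffix}.{extension}"
    return f"{group_dir}/{artifact_id}/{version}/{file_name}"


def import_repository(source: Path, storage: Path) -> None:
    """'Import' an existing repository by copying its tree into the storage folder."""
    # dirs_exist_ok (Python 3.8+) lets the copy merge into an existing folder.
    shutil.copytree(source, storage, dirs_exist_ok=True)


if __name__ == "__main__":
    print(maven2_path("org.apache.maven", "maven-core", "2.0.9"))
    print(maven2_path("com.example", "demo", "1.0", classifier="sources"))
```

Export is the same copy in the other direction, and because each artifact is its own file, a damaged file affects only that one path.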

Artifactory takes the polar opposite approach and stores the metadata and artifacts themselves in a huge database. The reason they claim it's needed is transactional behavior. Using a database doesn't guarantee transactionality, and it's certainly not the only way to get transactional behavior.

Because it uses a database, Artifactory needs import and export tools. The imports and exports of this data are reported to take significant time. Some upgrades require a full dump and re-import of the database, taking out large systems for a significant amount of time. Also, what happens if you need to tweak or repair a file in the system? Break out your DBA books and go to town. How about incremental backups? Would you be happy if a single disk error made your entire repository garbage?

We feel strongly that introducing a repository manager into your system shouldn't require a DBA to manage the data. Quite reliable backups can be performed with Nexus using tools such as robocopy or rsync and a simple script, and transactions can be obtained with much less overhead. In fact, with the Staging plugin, Nexus can turn an entire multi-module build into a single transaction. There are ways to implement "transactional" interactions in a piece of software without having to throw *everything* into a database. We think loading the entire contents of a repository into Jackrabbit and modeling the repository in a relational database is much more complex than necessary.
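To show how little a "simple script" needs to do when the repository is just files, here is a sketch of an incremental backup in the spirit of rsync or robocopy. It is a simplified stand-in, not a replacement for those tools: the function name incremental_backup is hypothetical, and the change check (size plus modification time, compared at one-second resolution) is an assumption I chose for brevity.

```python
import shutil
from pathlib import Path


def incremental_backup(repo_root: Path, backup_root: Path) -> list:
    """Copy only new or changed files; return the relative paths that were copied."""
    copied = []
    for source in sorted(repo_root.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(repo_root)
        target = backup_root / relative
        if target.exists():
            src_stat, dst_stat = source.stat(), target.stat()
            # Skip files whose size matches and whose backup is at least as new.
            same_size = src_stat.st_size == dst_stat.st_size
            not_newer = int(src_stat.st_mtime) <= int(dst_stat.st_mtime)
            if same_size and not_newer:
                continue
        target.parent.mkdir(parents=True, exist_ok=True)
        # copy2 preserves the modification time, so unchanged files are skipped next run.
        shutil.copy2(source, target)
        copied.append(relative.as_posix())
    return copied
```

Since released artifacts are written once and rarely change, each run after the first typically touches only the files deployed since the last backup.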

Contrast #3: Storage Size

It has also been reported that the indexes and metadata introduced by Artifactory can double or triple the size of a repo. See this thread for real examples. Perhaps that's manageable on a 1 GB repo, but how about something like Central at 60+ GB? Nexus uses the Nexus-indexer (Artifactory also uses it to provide search capability), which is just a Lucene index. We can provide indexes of Central that are only 30 MB, not double the size of the repo itself. Note that the Nexus indexes also include cross references of the Java classes contained in the jars. Once again, we think that bringing a relational database into this problem is an unnecessarily complicating factor.

A preliminary test import of a 116 MB release repo took 5 minutes, and the resulting data size was 323 MB (2.78x the original size). Extrapolating linearly to a 4 GB repository gives about 3 hours of import time and a total data size of about 11 GB. Sure, disks are cheap these days, but tripling the size of your data still has many long-term ramifications when you consider backups, replication, etc.
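The extrapolation above is plain proportional scaling from the one sample import. The short calculation below reproduces it, assuming that import time and data growth scale linearly with repository size and that 4 GB means 4 x 1024 MB; both are simplifying assumptions, and the function name is mine.

```python
def extrapolate_import(sample_mb, sample_minutes, sample_result_mb, target_mb):
    """Scale one measured import linearly to a larger repository."""
    growth = sample_result_mb / sample_mb           # stored size / original size
    minutes = sample_minutes * target_mb / sample_mb
    return growth, minutes / 60, target_mb * growth


growth, hours, result_mb = extrapolate_import(116, 5, 323, 4 * 1024)
print(f"growth {growth:.2f}x, about {hours:.1f} hours, about {result_mb / 1024:.1f} GB")
```

With the measured numbers this works out to roughly 2.9 hours and 11.1 GB, in line with the rounded figures quoted above.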

Note: The actual import I ran failed 3 times on data from Central because the checking was too strict, and I had to prune or repair the files just to get it to import. Fortunately, that didn't happen midway through a 3-hour import... which leads me to the next point.

Contrast #4: Nexus Doesn't Interfere

We believe Nexus shouldn't interfere with your builds. We all know that the data in remote repos like Central can be incomplete. However, if Maven can use it, we ensure Nexus won't get in the way. As a feature, Artifactory proactively blocks any data that isn't parseable as it comes through. This means you may have a build that works without Artifactory and breaks with it, because it refuses to proxy (and apparently import) any files it doesn't like. Nexus will report the problem to the admins, but won't cause Maven to blow up for the developer. Nexus favors stability over correctness for proxy repositories.
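The difference between the two policies can be sketched in a few lines. This is a hypothetical illustration of "block" versus "warn and pass through", not code from either product: the function names, the strict flag, and the choice of an XML well-formedness check as the validation step are all assumptions.

```python
import xml.etree.ElementTree as ET


def is_parseable_xml(content: bytes) -> bool:
    """Return True if the bytes are well-formed XML."""
    try:
        ET.fromstring(content)
        return True
    except ET.ParseError:
        return False


def serve_proxied_file(path: str, content: bytes, strict: bool, admin_warnings: list):
    """Return the bytes to hand to the build, or None if the file is blocked."""
    if path.endswith((".pom", ".xml")) and not is_parseable_xml(content):
        if strict:
            # Strict policy: refuse to proxy the file, so the build fails.
            return None
        # Lenient policy: record the problem for admins, serve the file unchanged.
        admin_warnings.append(f"unparseable metadata proxied as-is: {path}")
    return content
```

Under the lenient policy the developer's build sees exactly what the remote repository served, while the warning list gives admins something to act on.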

Download Nexus Today

Nexus is available for free as an open source project. There is also a Pro version that includes additional commercial functionality and professional support for only $2,995 per server. While the Open Source version is capable and popular, Nexus Professional (now known as Sonatype Nexus Repository) adds some new features targeted at Enterprise Users: staging, procurement, and LDAP integration.

Written by Brian Fox

Brian Fox, CTO and co-founder of Sonatype, is a Governing Board Member for the Open Source Security Foundation (OpenSSF), a Governing Board Member for the Fintech Open Source Foundation (FINOS), a member of the Monetary Authority of Singapore Cyber and Technology Resilience Experts (CTREX) Panel, a ...
