Maven Continuous Integration Best Practices
Continuous Integration is a development best practice that you need to be using in your process; it is an essential part of an efficient Software Development Lifecycle (SDLC). If you aren’t using it already, you should start now. The main benefit of Continuous Integration is the ability to flag errors as they are introduced into a system instead of waiting multiple days for test failures and critical errors to be identified during the QA cycle. This post isn’t about the virtues of using CI; it’s about how to set up an optimal environment in a Maven shop. Here are seven tips for running Maven builds in a CI system such as Hudson.
#1 Automate Snapshot Deployment
In my experience, it is best to let your CI system deploy your snapshots. This is the most reliable way to guarantee that the contents of your repository are kept in sync with your source control system. To do this in a practical way, you need to couple CI with a repository manager like Nexus that can automatically purge snapshots. I’ve managed projects that produced more than 300 GB of snapshots in less than a week. Using a repository manager will save your sanity.
#2 Isolate Local Repositories
Another critical component of a good CI setup is local repository isolation. The local repository in Maven is the temporary holding spot for all artifacts downloaded and produced by Maven, and it is not currently set up to be multi-process safe. The possibility of a conflict between concurrent builds is remote, but it does exist.
The main reason I like to have a local repository per project is that it’s the only way to test that your project is buildable against the artifacts in the corporate repository. If you don’t have separate local repos, then the product of one build will be seen by another build on CI, even if it’s not in the corporate repository. This is important since one function of CI should be to validate that the code is buildable by a real developer.
Tip: use -Dmaven.repo.local=xxxx to define the unique local repositories for each build.
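As a rough sketch of the tip above, the script below derives one local repository path per CI job and prints the Maven command that would use it. The /opt/repos root, the example job name, and the "clean deploy" goals are assumptions for illustration; Hudson exposes the job name as JOB_NAME, but other CI systems may use a different variable.

```shell
#!/bin/sh
# Assumed layout: one local repository per CI job under a shared root.
REPO_ROOT="${REPO_ROOT:-/opt/repos}"

# Print the local repository path for a given job name.
local_repo_for() {
    # Replace characters that are awkward in directory names with underscores.
    safe_name=$(printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_')
    printf '%s/%s\n' "$REPO_ROOT" "$safe_name"
}

# Print (without running) the Maven command line for a given job name.
# "clean deploy" is only an example goal list.
maven_command_for() {
    printf 'mvn -Dmaven.repo.local=%s clean deploy\n' "$(local_repo_for "$1")"
}

# "example-job" is a placeholder used when JOB_NAME is not set.
maven_command_for "${JOB_NAME:-example-job}"
```

Printing the command rather than running it keeps the sketch safe to try on a machine without Maven; in a real job you would execute the printed command instead.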
#3 Regularly Purge Local Repositories
To further validate the contents of the repository, and to manage the disk space, I purge the local repositories every night. This way, if the repository changes or artifacts are removed, the CI system will detect it. To keep it easy to purge all the local repositories, I tend to structure them under a single common folder such as /opt/repos/*.
Obviously, having many local repositories requires more disk space than a single monolithic one due to dependency duplication, but even on our large grid the repos are less than 10 GB total. Local repos get giant when you don’t control the snapshots, and purging them nightly keeps this under control.
Tip: use your CI system itself to schedule the local repo cleanup. This way anyone can clean the repos manually right from the UI if Maven gets confused.
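A minimal sketch of such a cleanup job, assuming every per-job local repository lives directly under one common root such as /opt/repos. The function name and the safety checks are illustrative choices, not part of Maven or Hudson.

```shell
#!/bin/sh
# Delete every per-job local repository under a common root, keeping the root.
purge_local_repos() {
    root="$1"
    # Refuse to run on an empty path, the filesystem root, or a missing folder.
    if [ -z "$root" ] || [ "$root" = "/" ] || [ ! -d "$root" ]; then
        echo "refusing to purge: '$root'" >&2
        return 1
    fi
    # Remove each visible top-level entry (one per job); the root itself stays.
    for repo in "$root"/*; do
        rm -rf -- "$repo"
    done
}
```

A scheduled CI job could then call purge_local_repos /opt/repos once a night, and anyone could trigger the same job by hand from the UI.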
Over time, I’ve also picked up a few more simple tricks:
#4 Enable Batch Mode
Tip: Enable -B (batch) mode on the build. This will make the logs shorter since it avoids the dependency download progress logging. It also ensures that the build won’t hang while waiting for user input. (To enable globally in settings.xml: <interactiveMode>false</interactiveMode>)
#5 Enable Full Stack Traces
Tip: Enable -e to cause Maven to produce the full stack trace if there’s a build exception. This will make it easier to comprehend any problems in the resulting build failure log/email without having to build it again.
#6 Print Test Failures to Standard Output
Tip: Enable -Dsurefire.useFile=false. This is a favorite of mine since it causes surefire to print test failures to standard out, where they will get included in the build failure log and email. This saves you from having to dig back into the machine to find the surefire report just to see a simple stack trace. (To enable globally in settings.xml: <properties><surefire.useFile>false</surefire.useFile></properties> in an active profile.)
#7 Always Check for Snapshots
Tip: Enable -U to cause Maven to always check for new snapshots. (To enable globally in settings.xml: <updatePolicy>always</updatePolicy>, which goes on a repository definition.)
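Tips #4 through #7 can be combined on a single command line. The sketch below only assembles and prints that command; the function name and the "clean deploy" goals are assumptions for illustration.

```shell
#!/bin/sh
# Flags from tips #4 to #7:
#   -B  batch mode           -e  full stack traces on build errors
#   -U  force snapshot update checks
#   -Dsurefire.useFile=false  print test failures to standard output
CI_MAVEN_FLAGS="-B -e -U -Dsurefire.useFile=false"

# Print (without running) the Maven command for the given goals.
ci_maven_command() {
    # "$*" joins all goal arguments with single spaces.
    printf 'mvn %s %s\n' "$CI_MAVEN_FLAGS" "$*"
}

# "clean deploy" is only an example goal list.
ci_maven_command clean deploy
```

Keeping the flags in one variable makes it easy to apply the same options to every job, whichever CI server runs them.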
Summary
Using the above settings and process causes every build to push the artifacts to the repository. The next downstream build will have its own clean repo and check the repository manager for the latest snapshots. Then at least once a day, everything is dumped locally and all dependencies are pulled out of the repository manager.
Naturally, doing all of this updating and purging puts some network load between the CI and repository manager. This works best if they share a high-speed network. If your repository manager isn’t close to your CI system, then you should put one there, if only to proxy the artifacts and reduce the impact of the daily local repo purge.
Note: If you are going to follow these tips, it is essential that you download a copy of Nexus. Purging the contents of your local repository and downloading everything from the Central Maven repository once a day (per project) is exactly the sort of behavior that causes traffic problems on the Central Maven repo.

I must be a little dense today, but can you provide an example of how you would configure items 4, 6, and 7 in the settings.xml file?
Sure:
4: <interactiveMode>false</interactiveMode>
6: <properties><surefire.useFile>false</surefire.useFile></properties> (this goes in an active profile in your settings)
7: <updatePolicy>always</updatePolicy> (this goes on a repository definition in your settings)
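To show where those three fragments sit relative to each other, here is a sketch that writes a complete settings.xml to a path of your choice. The profile id "ci", the repository id, and the repo.example.com URL are placeholders, not values from this post; substitute your own repository manager.

```shell
#!/bin/sh
# Write an example settings.xml covering items 4, 6, and 7 to the file in $1.
# The profile id, repository id, and URL below are hypothetical placeholders.
write_ci_settings() {
    cat > "$1" <<'EOF'
<settings>
  <!-- Item 4: same effect as -B -->
  <interactiveMode>false</interactiveMode>
  <profiles>
    <profile>
      <id>ci</id>
      <!-- Item 6: same effect as -Dsurefire.useFile=false -->
      <properties>
        <surefire.useFile>false</surefire.useFile>
      </properties>
      <repositories>
        <repository>
          <id>example-snapshots</id>
          <url>https://repo.example.com/snapshots</url>
          <snapshots>
            <enabled>true</enabled>
            <!-- Item 7: check for new snapshots on every build -->
            <updatePolicy>always</updatePolicy>
          </snapshots>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>ci</activeProfile>
  </activeProfiles>
</settings>
EOF
}
```

Calling write_ci_settings ci-settings.xml and passing the result to Maven with -s ci-settings.xml would let you try it without touching your own ~/.m2/settings.xml.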
Thanks Jason, many useful hints.
Jason?
I was wondering that myself
Are you using any kind of CI server (Continuum, CruiseControl, etc.)? I assume most of these recommendations are for the build server, and isn't Continuum, for instance, doing a lot of this?
Also, how do you set the purging of snapshots in Nexus? I have looked for the setting for this in the web UI, but didn't find any.
Is it possible to set purging of deployed releases too? By that I mean releases that are old and not in use any more?
Hi Kent,
Yes we do use Hudson and have it in a large grid setup for the Maven community (http://grid.sonatype.com/ci). All of these tips are applicable regardless of how you execute the builds since it controls Maven itself. I prefer this over customized plugins for each tool.
Regarding Nexus, go to the Scheduled Tasks screen, where you can configure snapshot purging based on age, number of artifacts, and whether the snapshot has been released. We do not currently allow purging of deployed releases in an automatic fashion, as we believe this should be done intentionally and rarely. If you find you are deploying many releases that don’t correspond to an actual public release, perhaps you should check out the Staging & Promotion feature in Nexus Pro, as this is what it was designed for.
Nexus does allow you to purge things from proxied repos based on the length of time since the last request.
–Brian
We use Continuum and I have to say that so far we have had no reason to regret that choice.
OK, I see we use a somewhat outdated Nexus version (1.0.0-beta-3.1), so I guess I should update it? That is how it is when a tool just runs.
We had no reason to consider an update until now. I guess that will be a good thing to do when we get our new build servers soon.
The fact you’re still running beta-3.1 is a testament to the stability of Nexus. However you’re missing out on lots of features such as scheduled tasks, security and I’m sure several more. I would definitely recommend an update.
What are your thoughts on the CI server's use of one-to-many repo mirror settings to gain build speed efficiency?
The use of mirrors (especially wildcards), it seems, could distort the validity of the build, considering that "one function of CI should be to validate that the code is buildable by a real developer."
I was thinking of the example where a developer added a dependency to the pom but forgot to add the new repo it comes from. When it gets built on the CI box, it redirects "*"-style to something the Nexus server happened to already have proxied.
The CI server should use the exact same setup as the developers with respect to the repository or repository manager configuration. I think the only way to do CI as suggested in this blog is with a repo manager, given the heavy remote repo use caused by purging the local repo and the constant updates. Being able to delegate the lookups to a group in the repo manager is critical for build speed.
I also feel that repository definitions should be left out of the poms and handled in your settings instead (in almost all circumstances). The reasons why are material for another blog post.
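For reference, the wildcard mirror setup discussed in this exchange typically looks like the sketch below, which writes a minimal settings.xml that routes every repository lookup to a single group in the repository manager. The mirror id and the group URL passed in are placeholders; the same file would be shared by developers and the CI server.

```shell
#!/bin/sh
# Write a settings.xml that mirrors all repositories to one group URL.
#   $1 = output file
#   $2 = URL of the repository manager group (a placeholder you supply)
write_mirror_settings() {
    cat > "$1" <<EOF
<settings>
  <mirrors>
    <mirror>
      <id>repo-manager</id>
      <!-- "*" sends every repository lookup to the URL below -->
      <mirrorOf>*</mirrorOf>
      <url>$2</url>
    </mirror>
  </mirrors>
</settings>
EOF
}
```

With this in place, whether a forgotten repository definition is noticed depends on what the group proxies, which is the trade-off raised in the question above.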