How not to download the Internet
By Tim OBrien
4 minute read time
A criticism I hear often about Maven is, "every time you run Maven, it downloads the internet." I understand the criticism, as the first time you run Maven it has to populate your local repository. Maven downloads plugins and artifacts that your project depends on. Maven does in fact download artifacts from remote repositories, but it downloads the artifact once and keeps a local cache.
Maven only downloads most of these dependencies because you've added them to your project. If you are unhappy that Maven is "downloading the internet" then stop developing software that depends on external libraries. Easy, right? Stop using Spring and Hibernate, stop referencing the commons libraries, and do everything yourself. This would be one way to avoid downloading any artifacts from a remote repository. Stop using Maven to build your software and write your own build tool that has all of the capabilities of Maven and every imaginable Maven plugin baked into it.
Not a workable solution, right? The fact of the matter is that your software has dependencies on external libraries. If you find yourself constantly "downloading the internet" there's a reason. You are depending on projects that depend on "the internet" or, your projects have a very wide set of dependencies that may need to be trimmed.
How can we avoid creating projects and POMs that "download the internet"? The simple answer is that everyone needs to start focusing on dependencies. Library developers need to be smarter about creating leaner, meaner dependency lists, and you need to start evaluating your own dependencies with an eye on efficiency.
Library Developers Need to Modularize
Take a project like Spring as an example. Spring's libraries provide interoperability across a number of core enterprise APIs: JMS, JDBC, JTA. Spring also allows people to plug-in different implementations for various feature: Hibernate, ehcache, MyBatis, log4j, slf4j, etc. Picking on Spring in particular, artifacts from Spring tend to rely on the world. If you use some of the core Spring libraries you'll soon realize that that one simple dependency XML snippet actually translates to 30 or 40 dependencies.
If you are creating a library (Spring, Guice, or Hibernate) you need to start thinking about the dependencies that you are selecting. Instead of just blindly adding in a dependency to ten artifacts, split up your projects so you don't create that one, gigantic library which depends on the world. I'll bring the conversation back to Spring. Spring is moving in the right direction, the most recent version of spring-core version 3 has five dependencies, while spring-core 2.5.6 has 13 dependencies. If you've watched the progression of the Spring libraries over time, you'll notice a trend toward modularization. Take the Spring AWS as an example - there isn't just one spring-aws library, there is a spring-aws-ant, spring-aws-ivy, spring-aws-maven library. As more people within the Springsource use Maven as a build tool, more people are starting to realize the value of having more projects with lighter POMs.
This matters because low-level, almost universal libraries like Spring, Hibernate, log4j, Guice, Commons libraries. These projects end up putting dependencies into everyone's classpath. If developers of popular libraries get the message and move toward more modular project scopes then you shouldn't see so much bloat in your own project's classpaths.
Don't just let any dependency into your project
What can you do? You need to have some standards. Don't just let anybody put some new dependency into your POM. Have some process to evaluate and assess exactly what a new dependency is going to do to your classpath. Is that new-fangled database library going to drop a dependency bomb and pull in 20 other libraries, some of which have incompatible licenses?
One tool you can use to make this process easier is Nexus Professional. Nexus Professional has a new Maven Dependency report. It is very easy to use, find an artifact in Nexus Professional, and then select the Maven Dependencies tab. This report will allow you to see just how many dependencies a particular artifact is going to introduce into your project. It will also list artifacts that may be missing from the public repositories, giving you a chance to assess the quality of an artifact's dependencies.
Written by Tim OBrien
Tim is a Software Architect with experience in all aspects of software development from project inception to developing scaleable production architectures for large-scale systems during critical, high-risk events such as Black Friday. He has helped many organizations ranging from small startups to Fortune 100 companies take a more strategic approach to adopting and evaluating technology and managing the risks associated with change.
Explore All Posts by Tim OBrien