Chapter 12. Maven Assemblies

12.1. Introduction

Favicon

Maven provides plugins that are used to create the most common archive types, most of which are consumable as dependencies of other projects. Some examples include the JAR, WAR, EJB, and EAR plugins. As discussed in Chapter 10, The Build Lifecycle these plugins correspond to different project packaging types each with a slightly different build processes. While Maven has plugins and customized lifecycles to support standard packaging types, there are times when you'll need to create an archive or directory with a custom layout. Such custom archives are called Maven Assemblies.

There are any number of reasons why you may want to build custom archives for your project. Perhaps the most common is the project distribution. The word ‘distribution’ means many different things to different people (and projects), depending on how the project is meant to be used. Essentially, these are archives that provide a convenient way for users to install or otherwise make use of the project’s releases. In some cases, this may mean bundling a web application with an application server like Jetty. In others, it could mean bundling API documentation alongside source and compiled binaries like jar files. Assemblies usually come in handy when you are building the final distribution of a product. For example, products like Nexus introduced in Chapter 16, Repository Manager, are the product of large multi-module Maven products, and the final archive you download from Sonatype was created using a Maven Assembly.

In most cases, the Assembly plugin is ideally suited to the process of building project distributions. However, assemblies don’t have to be distribution archives; assemblies are intended to provide Maven users with the flexibility they need to produce customized archives of all kinds. Essentially, assemblies are intended to fill the gaps between the standard archive formats provided by project package types. Of course, you could write an entire Maven plugin simply to generate your own custom archive format, along with a new lifecycle mapping and artifact-handling configuration to tell Maven how to deploy it. But the Assembly plugin makes this unnecessary in most cases, by providing generalized support for creating your own archive recipe without spending so much time writing Maven code.

12.2. Assembly Basics

Favicon

Before we go any further, it’s best to take a minute and talk about the two main goals in the Assembly plugin: assembly:assembly, and the single mojo. I list these two goals in different ways, because it reflects the difference in how they’re used. The assembly:assembly goal is designed to be invoked directly from the command line, and should never be bound to a build lifecycle phase. In contrast, the single mojo is designed to be a part of your everyday build, and should be bound to a phase in your project’s build lifecycle.

The main reason for this difference is that the assembly:assembly goal is what Maven terms an aggregator mojo; that is, a mojo which is designed to run at most once in a build, regardless of how many projects are being built. It draws its configuration from the root project - usually the top-level POM or the command line. When bound to a lifecycle, an aggregator mojo can have some nasty side-effects. It can force the execution of the package lifecycle phase to execute ahead of time, and can result in builds which end up executing the package phase twice.

Because the assembly:assembly goal is an aggregator mojo, it raises some issues in multi-module Maven builds, and it should only be called as a stand-alone mojo from the command-line. Never bind an assembly:assembly execution to a lifecycle phase. assembly:assembly was the original goal in the Assembly plugin, and was never designed to be part of the standard build process for a project. As it became clear that assembly archives were a legitimate requirement for projects to produce, the single mojo was developed. This mojo assumes that it has been bound to the correct part of the build process, so that it will have access to the project files and artifacts it needs to execute within the lifecycle of a large multi-module Maven project. In a multi-module environment, it will execute as many times as it is bound to the different module POMs. Unlike assembly:assembly, single will never force the execution of another lifecycle phase ahead of itself.

Aside from these two goals, the Assembly plugin provides several others; however, discussion of these mojos is beyond the scope of this chapter, largely because they serve exotic or obsolete use cases, and because they are almost never needed. Whenever possible, you should definitely stick to using assembly:assembly for assemblies generated from the command-line, and single for assemblies bound to lifecycle phases.

12.2.1. Predefined Assembly Descriptors

Favicon

While many people opt to create their own archive recipes - called assembly descriptors - this isn’t strictly necessary. The Assembly plugin provides built-in descriptors for several common archive types that you can use immediately without writing a line of configuration. The following assembly descriptors are predefined in the Maven Assembly plugin:

bin

The bin descriptor is used to bundle project LICENSE, README, and NOTICE files with the project’s main artifact, assuming this project builds a jar as its main artifact. Think of this as the smallest possible binary distribution for completely self-contained projects.

jar-with-dependencies

The jar-with-dependencies descriptor builds a JAR archive with the contents of the main project jar along with the unpacked contents of all the project’s runtime dependencies. Coupled with an appropriate Main-Class Manifest entry (discussed in “Plugin Configuration” below), this descriptor can produce a self-contained, executable jar for your project, even if the project has dependencies.

project

The project descriptor simply archives the project directory structure as it exists in your file-system and, most likely, in your version control system. Of course, the target directory is omitted, as are any version-control metadata files like the CVS and .svn directories we’re all used to seeing. Basically, the point of this descriptor is to create a project archive that, when unpacked, can be built using Maven.

src

The src descriptor produces an archive of your project source and pom.xml files, along with any LICENSE, README, and NOTICE files that are in the project’s root directory. This precursor to the project descriptor produces an archive that can be built by Maven in most cases. However, because of its assumption that all source files and resources reside in the standard src directory, it has the potential to leave out non-standard directories and files that are nonetheless critical to some builds.

12.2.2. Building an Assembly

Favicon

The Assembly plugin can be executed in two ways: you can invoke it directly from the command line, or you can configure it as part of your standard build process by binding it to a phase of your project’s build lifecycle. Direct invocation has its uses, particularly for one-off assemblies that are not considered part of your project’s core deliverables. In most cases, you’ll probably want to generate the assemblies for your project as part of its standard build process. Doing this has the effect of including your custom assemblies whenever the project is installed or deployed into Maven’s repositories, so they are always available to your users.

As an example of the direct invocation of the Assembly plugin, imagine that you wanted to ship off a copy of your project which people could build from source. Instead of just deploying the end-product of the build, you wanted to include the source as well. You won’t need to do this often, so it doesn’t make sense to add the configuration to your POM. Instead, you can use the following command:

$ mvn -DdescriptorId=project assembly:single 
...
[INFO] [assembly:single] 
[INFO] Building tar : /Users/~/mvn-examples-1.0/assemblies/direct-invocation/\
                      target/direct-invocation-1.0-SNAPSHOT-project.tar.gz 
[INFO] Building tar : /Users/~/mvn-examples-1.0/assemblies/direct-invocation/\
                      target/direct-invocation-1.0-SNAPSHOT-project.tar.bz2
[INFO] Building zip: /Users/~/mvn-examples-1.0/assemblies/direct-invocation/\
                      target/direct-invocation-1.0-SNAPSHOT-project.zip
...

Imagine you want to produce an executable JAR from your project. If your project is totally self-contained with no dependencies, this can be achieved with the main project artifact using the archive configuration of the JAR plugin. However, most projects have dependencies, and those dependencies must be incorporated in any executable JAR. In this case, you want to make sure that every time the main project JAR is installed or deployed, your executable JAR goes along with it.

Assuming the main class for the project is org.sonatype.mavenbook.App, the following POM configuration will create an executable JAR:

Example 12.1. Assembly Descriptor for Executable JAR

<project xmlns="http://maven.apache.org/POM/4.0.0" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
                      http://maven.apache.org/maven-v4_0_0.xsd">

  <modelVersion>4.0.0</modelVersion>
  <groupId>org.sonatype.mavenbook.assemblies</groupId>
  <artifactId>executable-jar</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>Assemblies Executable Jar Example</name>
  <url>http://sonatype.com/book</url>
  <dependencies>
    <dependency>
      <groupId>commons-lang</groupId>
      <artifactId>commons-lang</artifactId>
      <version>2.4</version>
    </dependency>
  </dependencies>
 <build>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.2-beta-2</version>
        <executions>
          <execution>
            <id>create-executable-jar</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
            <configuration>
              <descriptorRefs>
                <descriptorRef>
                  jar-with-dependencies
                </descriptorRef>
              </descriptorRefs>
              <archive>
                <manifest>
                  <mainClass>org.sonatype.mavenbook.App</mainClass>
                </manifest>
              </archive>
           </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

There are two things to notice about the configuration above. First, we’re using the descriptorRefs configuration section instead of the descriptorId parameter we used last time. This allows multiple assembly types to be built from the same Assembly plugin execution, while still supporting our use case with relatively little extra configuration. Second, the archive element under configuration sets the Main-Class manifest attribute in the generated JAR. This section is commonly available in plugins that create JAR files, such as the JAR plugin used for the default project package type.

Now, you can produce the executable JAR simply by executing mvn package. Afterward, we’ll also get a directory listing for the target directory, just to verify that the executable JAR was generated. Finally, just to prove that we actually do have an executable JAR, we’ll try executing it:

$ mvn package
... (output omitted) ...
[INFO] [jar:jar]
[INFO] Building jar: ~/mvn-examples-1.0/assemblies/executable-jar/target/\
                     executable-jar-1.0-SNAPSHOT.jar
[INFO] [assembly:single {execution: create-executable-jar}]
[INFO] Processing DependencySet (output=)
[INFO] Building jar: ~/mvn-examples-1.0/assemblies/executable-jar/target/\
                     executable-jar-1.0-SNAPSHOT-jar-with-dependencies.jar
... (output omitted) ...
$ ls -1 target
... (output omitted) ...
executable-jar-1.0-SNAPSHOT-jar-with-dependencies.jar
executable-jar-1.0-SNAPSHOT.jar
... (output omitted) ...
$ java -jar \
        target/executable-jar-1.0-SNAPSHOT-jar-with-dependencies.jar 
Hello, World!

From the output shown above, you can see that the normal project build now produces a new artifact in addition to the main JAR file. The new one has a classifier of jar-with-dependencies. Finally, we verified that the new JAR actually is executable, and that executing the JAR produced the desired output of “Hello, World!”

12.2.3. Assemblies as Dependencies

Favicon

When you generate assemblies as part of your normal build process, those assembly archives will be attached to your main project’s artifact. This means they will be installed and deployed alongside the main artifact, and are then resolvable in much the same way. Each assembly artifact is given the same basic coordinate (groupId, artifactId, and version) as the main project. However, these artifacts are attachments, which in Maven means they are derivative works based on some aspect of the main project build. To provide a couple of examples, source assemblies contain the raw inputs for the project build, and jar-with-dependencies assemblies contain the project’s classes plus its dependencies. Attached artifacts are allowed to circumvent the Maven requirement of one project, one artifact precisely because of this derivative quality.

Since assemblies are (normally) attached artifacts, each must have a classifier to distinguish it from the main artifact, in addition to the normal artifact coordinate. By default, the classifier is the same as the assembly descriptor’s identifier. When using the built-in assembly descriptors, as above, the assembly descriptor’s identifier is generally also the same as the identifier used in the descriptorRef for that type of assembly.

Once you’ve deployed an assembly alongside your main project artifact, how can you use that assembly as a dependency in another project? The answer is fairly straightforward. Recall the discussions in Section 3.5.3, “Maven Coordinates” and Section 9.5.1, “More on Coordinates” about project dependencies in Maven, projects depend on other projects using a combination of four basic elements, referred to as a project’s coordinates: groupId, artifactId, version, and packaging. In Section 11.7.3, “Platform Classifiers”, multiple platform-specific variants of a project’s artifact and available, and the project specifies a classifier element with a value of either win or linux to select the appropriate dependency artifact for the target platform. Assembly artifacts can be used as dependencies using the required coordinates of a project plus the classifier under which the assembly was installed or deployed. If the assembly is not a JAR archive, we also need to declare its type.

12.2.4. Assembling Assemblies via Assembly Dependencies

Favicon

How's that for a confusing section title? Let's try to set up a scenario which would explain the idea of assembling assemblies. Imagine you want to create an archive which itself contains some project assemblies. Assume that you have a multi-module build and you want to deploy an assembly which contains a set of related project assemblies. In this section's example, we create a bundle of "buildable" project directories for a set of projects that are commonly used together. For simplicity, we’ll reuse the two built-in assembly descriptors discussed above - project and jar-with-dependencies. This this particular example, it is assumed that each project creates the project assembly in addition to its main JAR artifact. Assume that every project in a multi-module build binds the single goal to the package phase and uses the project descriptorRef. Every project in a multi-module will inherit the configuration from a top-level pom.xml whose pluginManagement element is shown in Example 12.2, “Configuring the project assembly in top-level POM”.

Example 12.2. Configuring the project assembly in top-level POM

<project>
  ...
  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <artifactId>maven-assembly-plugin</artifactId>
          <version>2.2-beta-2</version>
          <executions>
            <execution>
              <id>create-project-bundle</id>
              <phase>package</phase>
              <goals>
                <goal>single</goal>
              </goals>
              <configuration>
                <descriptorRefs>
                  <descriptorRef>project</descriptorRef>
                </descriptorRefs>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
  ...
</project>

Each project POM references the managed plugin configuration from Example 12.2, “Configuring the project assembly in top-level POM” using a minimal plugin declaration in its build section shown in Example 12.3, “Activating the Assembly Plugin Configuration in Child Projects”.

Example 12.3. Activating the Assembly Plugin Configuration in Child Projects

<build>
  <plugins>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
    </plugin>
  </plugins>
</build>

To produce the set of project assemblies, run mvn install from the top-level directory. You should see Maven installing artifacts with classifiers in your local repository.

$ mvn install
...
Installing ~/mvn-examples-1.0/assemblies/as-dependencies/project-parent/\
           second-project/target/second-project-1.0-SNAPSHOT-project.tar.gz to 
  ~/.m2/repository/org/sonatype/mavenbook/assemblies/second-project/1.0-SNAPSHOT/\
           second-project-1.0-SNAPSHOT-project.tar.gz
...
Installing ~/mvn-examples-1.0/assemblies/as-dependencies/project-parent/\
           second-project/target/second-project-1.0-SNAPSHOT-project.tar.bz2 to 
  ~/.m2/repository/org/sonatype/mavenbook/assemblies/second-project/1.0-SNAPSHOT/\
           second-project-1.0-SNAPSHOT-project.tar.bz2
...
Installing ~/mvn-examples-1.0/assemblies/as-dependencies/project-parent/\
           second-project/target/second-project-1.0-SNAPSHOT-project.zip to 
  ~/.m2/repository/org/sonatype/mavenbook/assemblies/second-project/1.0-SNAPSHOT/\\
           second-project-1.0-SNAPSHOT-project.zip
...

When you run install, Maven will copy the each project's main artifact and each assembly to your local Maven repository. All of these artifacts are now available for reference as dependencies in other projects locally. If your ultimate goal is to create a bundle which includes assemblies from multiple project, you can do so by creating another project which will include other project's assemblies as dependencies. This bundling project (aptly named project-bundle) is responsible for creating the bundled assembly. The POM for the bundling project would resemble the XML document listed in Example 12.4, “POM for the Assembly Bundling Project”.

Example 12.4. POM for the Assembly Bundling Project

<project xmlns="http://maven.apache.org/POM/4.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
                      http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.sonatype.mavenbook.assemblies</groupId>
  <artifactId>project-bundle</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <name>Assemblies-as-Dependencies Example Project Bundle</name>
  <url>http://sonatype.com/book</url>
  <dependencies>
    <dependency>
      <groupId>org.sonatype.mavenbook.assemblies</groupId>
      <artifactId>first-project</artifactId>
      <version>1.0-SNAPSHOT</version>
      <classifier>project</classifier>
      <type>zip</type>
    </dependency>
    <dependency>
      <groupId>org.sonatype.mavenbook.assemblies</groupId>
      <artifactId>second-project</artifactId>
      <version>1.0-SNAPSHOT</version>
      <classifier>project</classifier>
      <type>zip</type>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.2-beta-2</version>
        <executions>
          <execution>
            <id>bundle-project-sources</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
            <configuration>
              <descriptorRefs>
                <descriptorRef>
                  jar-with-dependencies
                </descriptorRef>
              </descriptorRefs>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

This bundling project's POM references the two assemblies from first-project and second-project. Instead of referencing the main artifact of each project, the bundling project's POM specifies a classifier of project and a type of zip. This tells Maven to resolve the ZIP archive which was created by the project assembly. Note that the bundling project generates a jar-with-dependencies assembly. jar-with-dependencies does not create a particularly elegant bundle, it simply creates a JAR file with the unpacked contents of all of the dependencies. jar-with-dependencies is really just telling Maven to take all of the dependencies, unpack them, and then create a single archive which includes the output of the current project. In this project, it has the effect of creating a single JAR file that puts the two project assemblies from first-project and second-project side-by-side.

This example illustrates how the basic capabilities of the Maven Assembly plugin can be combined without the need for a custom assembly descriptor. It achieves the purpose of creating a single archive that contains the project directories for multiple projects side-by-side. This time, the jar-with-dependencies is just a storage format, so we don’t need to specify a Main-Class manifest attribute. To build the bundle, we just build the project-bundle project normally:

$ mvn package
...
[INFO] [assembly:single {execution: bundle-project-sources}]
[INFO] Processing DependencySet (output=)
[INFO] Building jar: ~/downloads/mvn-examples-1.0/assemblies/as-dependencies/\
  project-bundle/target/project-bundle-1.0-SNAPSHOT-jar-with-dependencies.jar

To verify that the project-bundle assembly contains the unpacked contents of the assembly dependencies, run jar tf:

$ java tf \
  target/project-bundle-1.0-SNAPSHOT-jar-with-dependencies.jar
...
first-project-1.0-SNAPSHOT/pom.xml
first-project-1.0-SNAPSHOT/src/main/java/org/sonatype/mavenbook/App.java
first-project-1.0-SNAPSHOT/src/test/java/org/sonatype/mavenbook/AppTest.java
...
second-project-1.0-SNAPSHOT/pom.xml
second-project-1.0-SNAPSHOT/src/main/java/org/sonatype/mavenbook/App.java
second-project-1.0-SNAPSHOT/src/test/java/org/sonatype/mavenbook/AppTest.java

After reading this section, the title should make more sense. You've assembled assemblies from two projects into an assembly using a bundling project which has a dependency on each of the assemblies.

12.3. Overview of the Assembly Descriptor

Favicon

When the standard assembly descriptors introduced in Section 12.2, “Assembly Basics” are not adequate, you will need to define you own assembly descriptor. The assembly descriptor is an XML document which defines the structure and contents of an assembly.

Assembly Descriptor Picture

Figure 12.1. Assembly Descriptor Picture


The assembly descriptor contains five main configuration sections, plus two additional sections: one for specifying standard assembly-descriptor fragments, called component descriptors, and another for specifying custom file processor classes to help manage the assembly-production process.

Base Configuration

This section contains the information required by all assemblies, plus some additional configuration options related to the format of the entire archive, such as the base path to use for all archive entries. For the assembly descriptor to be valid, you must at least specify the assembly id, at least one format, and at least one of the other sections shown above.

File Information

The configurations in this segment of the assembly descriptor apply to specific files on the file system within the project’s directory structure. This segment contains two main sections: files and fileSets. You use files and fileSets to control the permissions of files in an assembly and to include or exclude files from an assembly.

Dependency Information

Almost all projects of any size depend on other projects. When creating distribution archives, project dependencies are usually included in the end-product of an assembly. This section manages the way dependencies are included in the resulting archive. This section allows you to specify whether dependencies are unpacked, added directly to the lib/ directory, or mapped to new file names. This section also allows you to control the permissions of dependencies in the assembly, and which dependencies are included in an assembly.

Repository Information

At times, it’s useful to isolate the sum total of all artifacts necessary to build a project, whether they’re dependency artifacts, POMs of dependency artifacts, or even a project’s own POM ancestry (your parent POM, its parent, and so on). This section allows you to include one or more artifact-repository directory structures inside your assembly, with various configuration options. The Assembly plugin does not have the ability to include plugin artifacts in these repositories yet.

Module Information

This section of the assembly descriptor allows you to take advantage of these parent-child relationships when assembling your custom archive, to include source files, artifacts, and dependencies from your project’s modules. This is the most complex section of the assembly descriptor, because it allows you to work with modules and sub-modules in two ways: as a series of fileSets (via the sources section) or as a series of dependencySets (via the binaries section).

12.4. The Assembly Descriptor

Favicon

This section is a tour of the assembly descriptor which contains some guidelines for developing a custom assembly descriptor. The Assembly plugin is one of the largest plugins in the Maven ensemble, and one of the most flexible.

12.4.1. Property References in Assembly Descriptors

Favicon

Any property discussed in Section 13.2, “Maven Properties” can be referenced in an assembly descriptor. Before any assembly descriptor is used by Maven, it is interpolated using information from the POM and the current build environment. All properties supported for interpolation within the POM itself are valid for use in assembly descriptors, including POM properties, POM element values, system properties, user-defined properties, and operating-system environment variables.

The only exceptions to this interpolation step are elements in various sections of the descriptor named outputDirectory, outputDirectoryMapping, or outputFileNameMapping. The reason these are held back in their raw form is to allow artifact- or module-specific information to be applied when resolving expressions in these values, on a per-item basis.

12.4.2. Required Assembly Information

Favicon

There are two essential pieces of information that are required for every assembly: the id, and the list of archive formats to produce. In practice, at least one other section of the descriptor is required - since most archive format components will choke if they don’t have at least one file to include - but without at least one format and an id, there is no archive to create. The id is used both in the archive’s file name, and as part of the archive’s artifact classifier in the Maven repository. The format string also controls the archiver-component instance that will create the final assembly archive. All assembly descriptors must contain an id and at least one format:

Example 12.5. Required Assembly Descriptor Elements

<assembly>
  <id>bundle</id> 
  <formats>
    <format>zip</format>
  </formats>
  ...
</assembly>

The assembly id can be any string that does not contain spaces. The standard practice is to use dashes when you must separate words within the assembly id. If you were creating an assembly to create an interesting unique package structure, you would give your an id of something like interesting-unique-package. It also supports multiple formats within a single assembly descriptor, allowing you to create the familiar .zip, .tar.gz, and .tar.bz2 distribution archive set with ease. If you don't find the archive format you need, you can also create a custom format. Custom formats are discussed in Section 12.5.8, “componentDescriptors and containerDescriptorHandlers. The Assembly plugin supports several archive formats natively, including:

  • jar

  • zip

  • tar

  • bzip2

  • gzip

  • tar.gz

  • tar.bz2

  • rar

  • war

  • ear

  • sar

  • dir

The id and format are essential because they will become a part of the coordinates for the assembled archive. The example from Example 12.5, “Required Assembly Descriptor Elements” will create an assembly artifact of type zip with a classifier of bundle.

12.5. Controlling the Contents of an Assembly

Favicon

In theory, id and format are the only absolute requirements for a valid assembly descriptor; however, many assembly archivers will fail if they do not have at least one file to include in the output archive. The task of defining the files to be included in the assembly is handled by the five main sections of the assembly descriptor: files, fileSets, dependencySets, repositories, and moduleSets. To explore these sections most effectively, we’ll start by discussing the most elemental section: files. Then, we’ll move on to the two most commonly used sections, fileSets and dependencySets. Once you understand the workings of fileSets and dependencySets, it’s easier to understand repositories and moduleSets.

12.5.1. Files Section

Favicon

The files section is the simplest part of the assembly descriptor, it is designed for files that have a definite location relative to your project’s directory. Using this section, you have absolute control over the exact set of files that are included in your assembly, exactly what they are named, and where they will reside in the archive.

Example 12.6. Including a JAR file in an Assembly using files

<assembly>
  ...
  <files>
    <file>
      <source>target/my-app-1.0.jar</source>
      <outputDirectory>lib</outputDirectory>
      <destName>my-app.jar</destName>
      <fileMode>0644</fileMode>
    </file>
  </files>
  ...
</assembly>

Assuming you were building a project called my-app with a version of 1.0, Example 12.6, “Including a JAR file in an Assembly using files would include your project's JAR in the assembly’s lib/ directory, trimming the version from the file name in the process so the final file name is simply my-app.jar. It would then make the JAR readable by everyone and writable by the user that owns it (this is what the mode 0644 means for files, using Unix four-digit Octal permission notation). For more information about the format of the value in fileMode, please see the Wikipedia's explanation of four-digit Octal notation.

You could build a very complex assembly using file entries, if you knew the full list of files to be included. Even if you didn’t know the full list before the build started, you could probably use a custom Maven plugin to discover that list and generate the assembly descriptor using references like the one above. While the files section gives you fine-grained control over the permission, location, and name of each file in the assembly archive, listing a file element for every file in a large archive would be a tedious exercise. For the most part, you will be operating on groups of files and dependencies using fileSets. The remaining four file-inclusion sections are designed to help you include entire sets of files that match a particular criteria.

12.5.2. FileSets Section

Favicon

Similar to the files section, fileSets are intended for files that have a definite location relative to your project’s directory structure. However, unlike the files section, fileSets describe sets of files, defined by file and path patterns they match (or don’t match), and the general directory structure in which they are located. The simplest fileSet just specifies the directory where the files are located:

<assembly>
  ...
  <fileSets>
    <fileSet>
      <directory>src/main/java</directory>
    </fileSet>
  </fileSets>
  ...
</assembly>

This file set simply includes the contents of the src/main/java directory from our project. It takes advantage of many default settings in the section, so let’s discuss those briefly.

First, you’ll notice that we haven’t told the file set where within the assembly matching files should be located. By default, the destination directory (specified with outputDirectory) is the same as the source directory (in our case, src/main/java). Additionally, we haven’t specified any inclusion or exclusion file patterns. When these are empty, the file set assumes that all files within the source directory are included, with some important exceptions. The exceptions to this rule pertain mainly to source-control metadata files and directories, and are controlled by the useDefaultExcludes flag, which is defaulted to true. When active, useDefaultExcludes will keep directories like .svn/ and CVS/ from being added to the assembly archive. Section 12.5.3, “Default Exclusion Patterns for fileSets provides a detailed list of the default exclusion patterns.

If we want more control over this file set, we can specify it more explicitly. Example 12.7, “Including Files with fileSet shows a fileSet element with all of the default elements specified.

Example 12.7. Including Files with fileSet

<assembly>
  ...
  <fileSets>
    <fileSet>
      <directory>src/main/java</directory>
      <outputDirectory>src/main/java</outputDirectory>
      <includes>
        <include>**</include>
      </include>
      <useDefaultExcludes>true</useDefaultExcludes>
      <fileMode>0644</fileMode>
      <directoryMode>0755</directoryMode>
    </fileSet>
  </fileSets>
  ...
</assembly>

The includes section uses a list of include elements, which contain path patterns. These patterns may contain wildcards such as ‘**’ which matches one or more directories or ‘*’ which matches part of a file name, and ‘?’ which matches a single character in a file name. Example 12.7, “Including Files with fileSet uses a fileMode entry to specify that files in this set should be readable by all, but only writable by the owner. Since the fileSet includes directories, we also have the option of specifying a directoryMode that works in much the same way as the fileMode. Since a directories’ execute permission is what allows users to list their contents, we want to make sure directories are executable in addition to being readable. Like files, only the owner can write to directories in this set.

The fileSet entry offers some other options as well. First, it allows for an excludes section with a form identical to the includes section. These exclusion patterns allow you to exclude specific file patterns from a fileSet. Include patterns take precedence over exclude patterns. Additionally, you can set the filtering flag to true if you want to substitute property values for expressions within the included files. Expressions can be delimited either by ${ and } (standard Maven expressions like ${project.groupId}) or by @ and @ (standard Ant expressions like @project.groupId@). You can adjust the line ending of your files using the lineEnding element; valid values for lineEnding are:

keep

Preserve line endings from original files. (This is the default value.)

unix

Unix-style line endings

lf

Only a Line Feed Character

dos

MS-DOS-style line endings

crlf

Carriage-return followed by a Line Feed

Finally, if you want to ensure that all file-matching patterns are used, you can use the useStrictFiltering element with a value of true (the default is false). This can be especially useful if unused patterns may signal missing files in an intermediary output directory. When useStrictFiltering is set to true, the Assembly plugin will fail if an include pattern is not satisfied. In other words, if you have an include pattern which includes a file from a build, and that file is not present, setting useStrictFiltering to true will cause a failure if Maven cannot find the file to be included.

12.5.3. Default Exclusion Patterns for fileSets

Favicon

When you use the default exclusion patterns, the Maven Assembly plugin is going to be ignoring more than just SVN and CVS information. By default the exclusion patterns are defined by the DirectoryScanner class in the plexus-utils project hosted at Codehaus. The array of exclude patterns is defined as a static, final String array named DEFAULTEXCLUDES in DirectoryScanner. The contents of this variable are shown in Example 12.8, “Definition of Default Exclusion Patterns from Plexus Utils”.

Example 12.8. Definition of Default Exclusion Patterns from Plexus Utils

   public static final String[] DEFAULTEXCLUDES = {
        // Miscellaneous typical temporary files
        "**/*~",
        "**/#*#",
        "**/.#*",
        "**/%*%",
        "**/._*",

        // CVS
        "**/CVS",
        "**/CVS/**",
        "**/.cvsignore",

        // SCCS
        "**/SCCS",
        "**/SCCS/**",

        // Visual SourceSafe
        "**/vssver.scc",

        // Subversion
        "**/.svn",
        "**/.svn/**",

        // Arch
        "**/.arch-ids",
        "**/.arch-ids/**",

        //Bazaar
        "**/.bzr",
        "**/.bzr/**",

        //SurroundSCM
        "**/.MySCMServerInfo",

        // Mac
        "**/.DS_Store"
    };

This default array of patterns excludes temporary files from editors like GNU Emacs, and other common temporary files from Macs and a few common source control systems (although Visual SourceSafe is more of a curse than a source control system). If you need to override these default exclusion patterns you set useDefaultExcludes to false and then define a set of exclusion patterns in your own assembly descriptor.

12.5.4. dependencySets Section

Favicon

One of the most common requirements for assemblies is the inclusion of a project’s dependencies in an assembly archive. Where files and fileSets deal with files in your project, dependency files don't have a location in your project. The artifacts your project depends on have to be resolved by Maven during the build. Dependency artifacts are abstract, they lack a definite location, and are resolved using a symbolic set of Maven coordinates. While Since file and fileSet specifications require a concrete source path, dependencies are included or excluded from an assembly using a combination of Maven coordinates and dependency scopes.

The simplest dependencySet is an empty element:

<assembly>
  ...
  <dependencySets>
    <dependencySet/>
  </dependencySets>
  ...
</assembly>

The dependencySet above will match all runtime dependencies of your project (runtime scope includes the compile scope implicitly), and it will add these dependencies to the root directory of your assembly archive. It will also copy the current project’s main artifact into the root of the assembly archive, if it exists.

Note

Wait? I thought dependencySet was about including my project's dependencies, not my project's main archive? This counterintuitive side-effect was a widely-used bug in the 2.1 version of the Assembly plugin, and, because Maven puts an emphasis on backward compatibility, this counterintuitive and incorrect behavior needed to be preserved between a 2.1 and 2.2 release. You can control this behavior by changing the useProjectArtifact flag to false.

While the default dependency set can be quite useful with no configuration whatsoever, this section of the assembly descriptor also supports a wide array of configuration options, allowing your to tailor its behavior to your specific requirements. For example, the first thing you might do to the dependency set above is exclude the current project artifact, by setting the useProjectArtifact flag to false (again, its default value is true for legacy reasons). This will allow you to manage the current project’s build output separately from its dependency files. Alternatively, you might choose to unpack the dependency artifacts using by setting the unpack flag to true (this is false by default). When unpack is set to true, the Assembly plugin will combine the unpacked contents of all matching dependencies inside the archive’s root directory.

From this point, there are several things you might choose to do with this dependency set. The next sections discuss how to define the output location for dependency sets and how include and exclude dependencies by scope. Finally, we’ll expand on the unpacking functionality of the dependency set by exploring some advanced options for unpacking dependencies.

12.5.4.1. Customizing Dependency Output Location

Favicon

There are two configuration options that are used in concert to define the location for a dependency file within the assembly archive: outputDirectory and outputFileNameMapping. You may want to customize the location of dependencies in your assembly using properties of the dependency artifacts themselves. Let's say you want to put all the dependencies in directories that match the dependency artifact's groupId. In this case, you would use the outputDirectory element of the dependencySet, and you would supply something like:

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <outputDirectory>${artifact.groupId}</outputDirectory>
    </dependencySet>
  </dependencySets>
  ...
</assembly>

This would have the effect of placing every single dependency in a subdirectory that matched the name of each dependency artifact's groupId.

If you wanted to perform a further customization and remove the version numbers from all dependencies. You could customize the the output file name for each dependency using the outputFileNameMapping element as follows:

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <outputDirectory>${artifact.groupId}</outputDirectory>
      <outputFileNameMapping>
        ${artifact.artifactId}.${artifact.extension} 
      </outputFileNameMapping>
    </dependencySet>
  </dependencySets>
  ...
</assembly>

In the previous example, a dependency on commons:commons-codec version 1.3, would end up in the file commons/commons-codec.jar.

12.5.4.2. Interpolation of Properties in Dependency Output Location

Favicon

As mentioned in the Assembly Interpolation section above, neither of these elements are interpolated with the rest of the assembly descriptor, because their raw values have to be interpreted using additional, artifact-specific expression resolvers.

The artifact expressions available for these two elements vary only slightly. In both cases, all of the ${project.*}, ${pom.*}, and ${*} expressions that are available in the POM and the rest of the assembly descriptor are also available here. For the outputFileNameMapping element, the following process is applied to resolve expressions:

  1. If the expression matches the pattern ${artifact.*}:

    1. Match against the dependency’s Artifact instance (resolves: groupId, artifactId, version, baseVersion, scope, classifier, and file.*)

    2. Match against the dependency’s ArtifactHandler instance (resolves: expression)

    3. Match against the project instance associated with the dependency’s Artifact (resolves: mainly POM properties)

  2. If the expression matches the patterns ${pom.*} or ${project.*}:

    1. Match against the project instance (MavenProject) of the current build.

  3. If the expression matches the pattern ${dashClassifier?} and the Artifact instance contains a non-null classifier, resolve to the classifier preceded by a dash (-classifier). Otherwise, resolve to an empty string.

  4. Attempt to resolve the expression against the project instance of the current build.

  5. Attempt to resolve the expression against the POM properties of the current build.

  6. Attempt to resolve the expression against the available system properties.

  7. Attempt to resolve the expression against the available operating-system environment variables.

The outputDirectory value is interpolated in much the same way, with the difference being that there is no available ${artifact.*} information, only the ${project.*} instance for the particular artifact. Therefore, the expressions listed above associated with those classes (1a, 1b, and 3 in the process listing above) are unavailable.

How do you know when to use outputDirectory and outputFileNameMapping? When dependencies are unpacked only the outputDirectory is used to calculate the output location. When dependencies are managed as whole files (not unpacked), both outputDirectory and outputFileNameMapping can be used together. When used together, the result is the equivalent of:

<archive-root-dir>/<outputDirectory>/<outputFileNameMapping>

When outputDirectory is missing, it is not used. When outputFileNameMapping is missing, its default value is: ${artifact.artifactId}-${artifact.version}${dashClassifier?}.${artifact.extension}

12.5.4.3. Including and Excluding Dependencies by Scope

Favicon

In Chapter 9, The Project Object Model, it was noted that all project dependencies have one scope or another. Scope determines when in the build process that dependency normally would be used. For instance, test-scoped dependencies are not included in the classpath during compilation of the main project sources; but they are included in the classpath when compiling unit test sources. This is because your project’s main source code should not contain any code specific to testing, since testing is not a function of the project (it’s a function of the project’s build process). Similarly, provided-scoped dependencies are assumed to be present in the environment of any eventual deployment. However, if a project depends on a particular provided dependency, it is likely to require that dependency in order to compile. Therefore, provided-scoped dependencies