Writing a Book With Maven: Part 1

DISCLAIMER: In this post, I express my own, somewhat controversial views about Doxia and APT. These are solely my own views, and you should not assume they represent an official statement from Sonatype.

This Maven Book is created using Maven. Everything you see is produced using Springer's plugin. I don't pass in all the configuration variables via the pom.xml. We have two style sheets, html_chunk.xsl and xslfo.xsl, which help us print out a nice-looking PDF and web site from the book. In the next part of this series, next week, I'm going to start blogging about the Maven project we use to manage the book. By next week, I'm going to try to have a Maven archetype ready for people who want to produce a book with Maven. I might even put a chapter in the book about using Maven to create a book (recursion). Ultimately, I'd like to help start a few projects that will make it easier for people to write books using the same technologies we use to publish this book. We need to get more developers writing good content. There are too many technical books written by people who haven't had a day of real coding experience.

DocBook, WTF?

We use DocBook. The original idea was to use APT, but once I started working on the book, I insisted on DocBook to the surprise of many people involved with the effort. In this post, I explain why I think DocBook is the best choice for writing a book.

I'm going to spoil the party for APT lovers. APT is impossible. I'm convinced that APT is the reason why most Maven documentation and many Maven sites are terrible things to try to read. I'd encourage anyone trying to use the Maven Site Plugin to dump it and start using the XSite plugin. Don't be afraid of HTML and markup; Maven sites would look more natural and simpler if you didn't have to suffer through all the canned copy and left navigation menus. It is impossible to use APT to write a book with styled cross-references, a good index, and appropriate inline styles. The ability to differentiate between a listing of source code and a numbered example, variable lists, the difference between a chapter, a part, and a preface: all of these things come with DocBook.

That being said, DocBook tools are a terrible curse. The editor I use is XMLMind. Not only is XMLMind not free, it is also as usable as Emacs on a keyboard with a broken Meta key. But, you get used to it, and you learn to be productive. It takes a year, you initially swear it off, but then you return to it and admit defeat by purchasing it and learning how to convince it to cooperate. In two years, you'll start to respect XMLMind, and you might even start to customize some key bindings. In other words, it's not easy. But, this brings me to my next point.

Writing a Book Is Not Easy

When a bunch of developers (I still consider myself a developer, not a writer) decide to write a book, there's this underlying tension. A developer's job is tough enough; they don't want the writing process to start siphoning already scarce time off of the development cycle. The initial reaction is to choose some technology like APT, because it is easier to write simple things with simple markup. This works for a while: you'll write a few chapters, and you might even start to develop innovative little plugins, including source code, etc. But, as the content grows in size, and you start getting ready for print production, you'll start to think about things like:

Cross-References

A large book without cross-references is about as useless as it gets. If I'm in Chapter 5 Section 3 and want to reference Chapter 1 Section 1.2, and I don't have a way to specify an element in a document, what happens when I move a chapter around or when I want to insert a section before Section 3? Sure, I can develop a facility within some Doxia engine to allow me to reference a section of a document, but then I'll want to do things like customize the text of the reference. Maybe half the time, I'll want to say "See Section 15.1 for more info", but just as often I'll want to say "See Section 1.5 Aggregating Stuff for more info". The point here is that cross-references are important for the PDF, HTML, and print output alike. The only way to equal what comes out of the box with DocBook is to add more hacks to APT and customize the engine that reads it.
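To make the point about ID-based references concrete, here is a minimal Python sketch. It is not how the DocBook stylesheets are implemented; it only illustrates the idea that a reference points at an id and the visible text is computed at render time. The fragment, the ids, and the helper names `number_sections` and `render_xref` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook-style fragment; ids and titles are invented for this sketch.
BOOK = """
<book>
  <chapter id="intro">
    <title>Introduction</title>
    <section id="aggregating"><title>Aggregating Stuff</title></section>
  </chapter>
  <chapter id="plugins">
    <title>Plugins</title>
    <section id="writing-plugins">
      <title>Writing Plugins</title>
      <para>See <xref linkend="aggregating"/> for more info.</para>
    </section>
  </chapter>
</book>
"""

def number_sections(book):
    """Map each section id to (number, title), derived from document order."""
    labels = {}
    for c, chapter in enumerate(book.findall("chapter"), start=1):
        for s, section in enumerate(chapter.findall("section"), start=1):
            labels[section.get("id")] = (f"{c}.{s}", section.findtext("title"))
    return labels

def render_xref(labels, linkend, with_title=False):
    """Produce the reference text in one of two styles."""
    number, title = labels[linkend]
    return f"Section {number} {title}" if with_title else f"Section {number}"

book = ET.fromstring(BOOK)
labels = number_sections(book)
print(render_xref(labels, "aggregating"))                   # Section 1.1
print(render_xref(labels, "aggregating", with_title=True))  # Section 1.1 Aggregating Stuff

# Move the "plugins" chapter to the front. The markup of the xref is
# untouched, but the rendered number follows the section to its new place.
chapters = book.findall("chapter")
book.remove(chapters[1])
book.insert(0, chapters[1])
print(render_xref(number_sections(book), "aggregating"))    # Section 2.1
```

The source only ever says `linkend="aggregating"`; the number and the choice between the short and the titled form are decided when the output is generated, which is why reordering chapters does not break anything.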

Inline Styles

This is probably the one thing that throws most developers-turned-writers into a tailspin: the idea that every command, classname, code reference, and variable reference must have a different inline style. This takes most people a few weeks to get the hang of, but once you start doing this, you'll realize that it is essential to making readable technical content. Pick up any O'Reilly book, and you'll notice it contains a heavy amount of inline styling - classnames are in a fixed font, differentiated from commands on the command-line. We don't just do this because we like to be fancy; we do this because it is a subtle hint to the reader that eases comprehension. It is also something that requires different markup elements in the book's source. There are classname, methodname, varname, and code elements in DocBook to handle this. Not so in APT: because APT is solely focused on presentation, you can't embed semantic meaning within it. You can't say, "this is a classname"; instead, in APT you say, "make this italic" or "make this bold."
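As a rough sketch of why semantic markup matters, the Python below renders one DocBook-style paragraph under two different presentation mappings. The paragraph, the two style tables, and the `render` helper are assumptions made up for this example; real DocBook output is produced by XSL stylesheets, not by code like this.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical paragraph using DocBook's semantic inline elements.
PARA = ('<para>Run <command>mvn install</command> to build '
        '<classname>MyMojo</classname>, which reads the '
        '<varname>basedir</varname> variable.</para>')

# Two made-up house styles: the source says WHAT each span is,
# and each output format decides HOW it looks.
WEB_STYLE = {"command": "kbd", "classname": "code", "varname": "var"}
PRINT_STYLE = {"command": "b", "classname": "tt", "varname": "i"}

def render(para_xml, style):
    """Render a paragraph, mapping each semantic element to a presentation tag."""
    para = ET.fromstring(para_xml)
    parts = [escape(para.text or "")]
    for child in para:
        tag = style.get(child.tag, "span")  # unknown elements fall back to <span>
        parts.append(f"<{tag}>{escape(child.text or '')}</{tag}>")
        parts.append(escape(child.tail or ""))
    return "<p>" + "".join(parts) + "</p>"

print(render(PARA, WEB_STYLE))
print(render(PARA, PRINT_STYLE))
```

If the source had only said "make this bold", there would be nothing to remap: changing how commands look would mean editing every command in the book by hand.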

Print Production

The publisher I've worked with formats the book in DocBook before they send it off to the presses. There's a lengthy production process during which the book is converted to DocBook (if it isn't already in DocBook), and someone will go through and make sure the book has all the right inline styles. Then someone will mark up all the index terms (indexing is an arduous and mind-melting experience BTW). I prefer to produce a product that doesn't require too much manual futzing after I deliver it. I understand the production dudes need to tweak the content a bit, but I prefer the idea that my stuff doesn't have to go through some sort of filter before it gets to the real content. More on this in later parts of this series.

Formatting for Print/Web/PDF

Sure, I understand that I can get some APT stuff to spit out a PDF and a web page. But, can I tell it what section level I want it to descend to when computing the contents of a table of contents? Can I put a watermark on the output and put a disclaimer in the header of the preface to signify the output is an alpha release? (Something I need to do.) Can I generate endnotes? How about footnotes? I could go on and on about things DocBook can do that APT can't. It all boils down to tools, and the fact that with DocBook, I'm capturing more than just syntax. DocBook is semantic, and there are many tools out there that let me convert that source into good-looking output. I'm sure someone will comment that all this is possible with some sort of customized Doxia plugin (see previous, I think Doxia should be thrown overboard).
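The table-of-contents question can be sketched in a few lines of Python. The DocBook XSL stylesheets expose this kind of setting as a stylesheet parameter (`toc.section.depth`); the code below is not that implementation, just an illustration of what "descend to section level N" means once the source has real nested sections. The sample chapter and the `toc` function are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical chapter with three levels of nested sections.
DOC = """
<chapter>
  <title>Building a Book</title>
  <section>
    <title>Tools</title>
    <section>
      <title>Editors</title>
      <section><title>Key Bindings</title></section>
    </section>
  </section>
  <section><title>Output Formats</title></section>
</chapter>
"""

def toc(node, max_depth, prefix="", depth=1):
    """Return numbered TOC entries, descending at most max_depth section levels."""
    entries = []
    if depth > max_depth:
        return entries
    for i, section in enumerate(node.findall("section"), start=1):
        number = f"{prefix}{i}"
        entries.append(f"{number} {section.findtext('title')}")
        # Recurse into subsections, extending the numbering prefix.
        entries.extend(toc(section, max_depth, prefix=f"{number}.", depth=depth + 1))
    return entries

chapter = ET.fromstring(DOC)
print(toc(chapter, max_depth=1))  # ['1 Tools', '2 Output Formats']
print(toc(chapter, max_depth=2))  # ['1 Tools', '1.1 Editors', '2 Output Formats']
```

The depth is a knob on the output side; the book's source does not change when the web site wants a deeper table of contents than the PDF.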

You could hack up APT so much that it closely approximates DocBook. You could muck around with the various Maven plugins involved in the process to make it easy to include code samples and snippets... Or, you could use the tools and technologies that already exist. I'm no big fan of reinvention, so for me, the solution was to use DocBook. Furthermore (ugh), hacking APT to the point where it supported a feature set similar to DocBook would've meant making APT more like DocBook. I don't think you can write a book which requires this much semantic stuff in a wiki-like format without making the wiki-like format more trouble than it is worth.

...stop trying to make it easy...

Even when I wrote Jakarta Commons Cookbook in Word, it was far from easy. There was an ultra-nifty (but very complex and unstable) set of VB macros, which were used to manage cross references and inline styles. There were many keyboard shortcuts, etc. For a 400-page book, I had to split the document into chapter DOC files and have every document open in Word to properly render cross references. It wasn't uncommon for Word to just blow up and refuse to respond. That was about four years ago. In the intervening years, there have been various efforts to simplify the process and move to different tool platforms.

Books have been written in OpenOffice. (And, yes, there are books written in APT.) Some people have tried to write books using collaborative web applications. There has been this persistent idea that people could collaborate on a Wiki and produce a great book, etc...

For me, the most difficult part about writing a book isn't the technology used to write it. From a wiki-like markup to etching every word onto a stone tablet, the most difficult part of writing is the process itself. I use a difficult tool to write with, XMLMind, but I spend most of my time writing, and rewriting, and rewriting, and proofreading, and rewriting, and rewriting.

And, writing about technology for a tech-audience isn't easy. I guess what I'm trying to say is stop using tool selection as an excuse to procrastinate and get down to the business of writing. Writing isn't easy; in fact, it is just as difficult as (maybe a little more difficult than) writing code. Don't shy away from using professional writing tools, even if they are not easy. Writing a book isn't easy; it'll drive you crazy. I promise.

Written by Tim OBrien

Tim is a Software Architect with experience in all aspects of software development from project inception to developing scaleable production architectures for large-scale systems during critical, high-risk events such as Black Friday. He has helped many organizations ranging from small startups to ...
