My name is Keheliya Gallaba and during this Google Summer of Code I am working on the Automated build system for Gephi. The goal of this project is to add Maven build support to Gephi and set up a continuous integration system to fasten the release process. The Netbeans Platform, which Gephi is built upon, natively uses Apache Ant to compile, build and package the application. But now there is also a variant of NetBeans which uses Apache Maven as the build system. There are several reasons that make moving into a Maven based system worthwhile.
Maven vs Ant
The existing Ant build system for building NetBeans Platform-based applications which is called Ant Build Harness is very intuitive, and needs almost no initial setup. The set of standard Ant scripts and tasks can be easily triggered by the IDE or by the command line. But there are reasons that Ant might not suite a rapidly growing, multi-module project like Gephi. The Gephi project consists of a team of developers who work on dependent modules and plugins. These modules have to be composed to the application regularly. With a large number of modules, with many small packages, and with multiple projects with many inter-dependencies and external dependencies, its essential to manage different versions and branches with their dependencies. And reusing modules with the Ant build harness is not that intuitive.
But Apache Maven is introduced as a standard, well defined build system that can be customized. It uses a construct known as a Project Object Model (POM) to describe the software project being built, its dependencies on other external modules and components, and the build order. It comes with pre-defined targets for performing certain well-defined tasks such as compilation of code and its packaging. It makes dependency management very easy and efficient with the concept of repositories. Most importantly in maven unique coordinates: groupId, artifactId, version, packaging, classifier identifies an artifact which can be uploaded or retrieved from a repository. This helps to easily build modules which depend on other modules.
Work completed so far
This project involves digging deeper in to the Gephi’s architecture and understanding dependencies, building and packaging. Gephi includes 100+ submodules categorized into Core, UI, Libraries and Plugins sections. NBM, which stands for “NetBeans module”, is the deployment format of modules in NetBeans. It is a ZIP archive, with the extension .nbm, containing the JARs in the module, and their configuration files. NBM files can be manually installed using the Update Center and choosing the option for installing manually downloaded modules, or they can be downloaded and installed directly from netbeans.org or another update server.
I’m happy to say that I was able to successfully mavenize 75 modules and continuing to complete the rest. I primarily used the NetBeans Module Maven Plugin for this, which now comes built in with NetBeans 6.9 and 7.0 IDEs. Currently NBM handles the tasks like defining the ‘nbm’ packaging by registering a new packaging type “nbm” so that any project with this packaging will be automatically turned into a netbeans module project, creating nbm artifacts and managing branding. It is also capable of populating the local maven repository with module jars and NBM files from a given NetBeans installation.
Some third party libraries used in Gephi are not maintained in any public Maven Repositories. So I had set up a local Sonatype Nexus Repository to store and serve these dependencies. Basic functionalities of a repository manager like Sonatype are:
- managing project dependencies,
- artifacts & metadata,
- proxying external repositories
- and deployment of packaged binaries and JARs to share those artifacts with other developers and end-users.
We are in the process of setting up a Sonatype Nexus Repository in official Gephi server as well, so not only these third party jars, but the Gephi releases such as the Gephi Toolkit can be served as a maven dependency to maven-based projects all over the world.
Challenges faced during the process
- Researching on existing large scale applications using NetBeans RCP and Maven
- Finding documentation on handling Netbeans specific ant tasks, now in Maven
- Managing transitive dependencies and versioning (specially with slight defferences of Maven and NetBeans difinitions)
- Compilation and Test Failures.
While Maven migration is going on, I also looked in to the other aspect of the project, setting up of a continuous integration server. Main benefits of such a system are:
- checking out source from source control,
- running clean build,
- deploying the artifacts in a repository
- and running unit tests.
Furthermore it can notify developers via Email, IM or IRC on Success, Failure, Error and Warning in a build or simply a Source Code Management Failure. What this means is that when a project gets updated during development, the continuous integration system will try to build the project and will notify the developers if it ran into any issues. This is very useful when working on a multi-module project with many developers, like Gephi since a developer may unintentionally, by accident break the build since they are working concurrently on code and they may have unique configurations to their development environment that isn’t shared by other developers. I looked at the options of Apache Continuum, Hudson and Jenkins (A fork of Hudson) considering the criteria, being open source, supporting Ant & Maven and better integration with Java based projects.
Hudson is an extensible Continuous Integration Server built by Sun Microsystem’s Kohsuke Kawaguchi. Since the design of Hudson includes well thought-out extension points, developers have written plugins to support all of the major version control systems and many different notifiers, and many other options to customize the build process for example the Amazon EC2 plugin to use the Amazon “cloud” as the build cluster.
Continuum is described as a fast, lightweight, and undemanding continuous integration system built by Apache Maven team. It is built on the Plexus component framework, and comes bundled with its own Jetty application server. Like Maven, it is built on the Plexus component framework, and comes bundled with its own Jetty application server. It uses Apache Derby, a 100% Java, fully embedded database for its persistence needs. All these reasons make Continuum self-reliant, and also particularly easy to install in almost any environment.
After considering all of these reasons I settled on Apache Continuum because of the ease of setting it up, configuration and out-of-the-box support for Bazaar. Bazaar is the distributed version control system used in Launchpad for managing the source code, when lot of developers work together on software projects like Gephi. I have set up a local instance of Apache Continuum to check out and build the ant-based Gephi hourly. In the future we can host this in the Gephi server to notify the developers and administrators.
Since the initial foundation has been laid out, it will be quite convenient to complete the rest of the planned work. These will include completion of mavenizing rest of the modules, creating .zip distribution, properly running the final project being developed and setting up the infrastructure at the Gephi server.
I would like to thank my mentors Julian Bilcke, Mathieu Bastian and Sébastien Heymann for providing all the guidelines and support for making this project a success. You can find my ongoing work at this repository: https://code.launchpad.net/~keheliya-gallaba/Gephi/maven-build