Algebraic – in Java?

java

Published: 2021-01 (January 2021)
Relevant Java Version: Java 15.

Algebraic – a term that is often associated with school mathematics. Algebra is commonly understood as the branch of mathematics that deals with symbols and the rules for manipulating those symbols. The symbols usually represent quantities without fixed values (referred to as variables).

In programming languages, it carries a similar meaning. An algebraic data type is a composite type containing variables. A composite can further contain other types as variables. A recursive type can contain another instance of itself as a variable. The term algebraic refers to the property that an algebraic data type is created by algebraic operations. The algebra discussed here covers sums, products and patterns.

Algebraic data types were introduced in Hope, a small functional programming language developed in the 1970s at the University of Edinburgh.

Delving into Algebra

Sum Types

A sum type:

  • represents alternation (for three values A, B, C → A or B or C but not any combination or other subset).
  • defines variants
  • behaves like a logical OR in which exactly one of the variants is present at a time.

Product Types

A product type:

  • represents combination (for three values A, B, C → A and B and C together, although one or more of them may be empty).
  • holds values
  • is a logical AND operator

Pattern matching

Pattern matching checks a given sequence of tokens for the presence of the constituents of some pattern. The match has to be exact and unambiguous, so it evaluates to either a match or not a match.

What has algebra got to do with Java?

Java has primitive as well as non-primitive data types.

Primitive: boolean, byte, char, short, int, long, float and double.
Non-Primitive: String, array, Object, composite objects etc.

This blog will cover some samples of algebraic or composite non-primitive data types in Java.

Algebra in Java

Sum types

Enum

Enumerations (enum) are a special sum type. Enums cannot have additional data associated with them once instantiated.

Enums can have final attributes that can be set via constructors and can have methods defined that can access such final attributes.

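An enum can declare abstract methods, which must then be implemented by each variant. Similarly, an enum can implement an interface; the interface methods are then implemented either once in the enum body or separately by each variant.

An enum instance can be obtained via the static method valueOf(String) that the compiler generates for every enum type (or via the generic Enum.valueOf(Class, String)). The method accepts a String and matches it against the names of the declared enum variants. A name without a matching variant results in an IllegalArgumentException.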
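A minimal sketch of the points above: final attributes set via a constructor, an abstract method implemented by each variant, and a lookup by name. The Operation enum and its variants are hypothetical names chosen only for this illustration.

```java
// Hypothetical enum used only for illustration.
enum Operation {
    ADD("+") {
        @Override
        int apply(int left, int right) { return left + right; }
    },
    MULTIPLY("*") {
        @Override
        int apply(int left, int right) { return left * right; }
    };

    // A final attribute, set once through the constructor.
    private final String symbol;

    Operation(String symbol) { this.symbol = symbol; }

    String symbol() { return symbol; }

    // Abstract method: every variant must supply its own implementation.
    abstract int apply(int left, int right);
}

class EnumDemo {
    public static void main(String[] args) {
        // valueOf(String) matches the text against the declared constant names.
        Operation op = Operation.valueOf("ADD");
        System.out.println(op.symbol() + " " + op.apply(2, 3)); // prints "+ 5"

        // A name that matches no variant is rejected.
        try {
            Operation.valueOf("DIVIDE");
        } catch (IllegalArgumentException e) {
            System.out.println("No such variant: DIVIDE");
        }
    }
}
```

The set of variants is closed: nothing outside the enum declaration can add a third Operation, which is what makes an enum a sum type.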

Pattern matching for enum

The following checks can be used to identify enum instances.
object.getClass().isEnum() – Using Class.isEnum(). Works only for enum constants declared without a constant-specific body. A constant with a body is an instance of an anonymous subclass, for which isEnum() returns false.
object instanceof Enum – A regular instanceof check against java.lang.Enum.
Enum.class.isAssignableFrom(object.getClass()) – Using the Class.isAssignableFrom() matching.
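The difference between these checks shows up when an enum constant declares its own body. The sketch below uses a hypothetical Signal enum; the getDeclaringClass() call is an addition not mentioned above, shown as one way to get back to the enum type itself.

```java
// Hypothetical enum used only for illustration.
enum Signal {
    GREEN,          // plain constant, no body
    RED {           // constant with a body: an instance of an anonymous subclass of Signal
        @Override
        boolean mustStop() { return true; }
    };

    boolean mustStop() { return false; }
}

class EnumMatchDemo {
    // True for any enum constant, with or without a constant-specific body.
    static boolean isEnumConstant(Object object) {
        return object instanceof Enum;
    }

    public static void main(String[] args) {
        System.out.println(Signal.GREEN.getClass().isEnum());                   // true
        System.out.println(Signal.RED.getClass().isEnum());                     // false
        System.out.println(Signal.RED.getDeclaringClass().isEnum());            // true
        System.out.println(isEnumConstant(Signal.RED));                         // true
        System.out.println(Enum.class.isAssignableFrom(Signal.RED.getClass())); // true
    }
}
```

In practice, instanceof Enum is the least surprising of the three, since it does not depend on how the constant was declared.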

Optional

A java.util.Optional allows two variants. An Optional either contains a value of the specified generic type or is empty.

Used correctly, Optional helps avoid the dreaded NullPointerException.

Additionally, an Optional guarantees that the consumer of the object will always receive an object and can either act upon the contained value, if present, or handle the lack of a value.
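As a small sketch of the two variants, the hypothetical lookup below returns an Optional rather than null, and the caller handles both the present and the empty case. The CAPITALS table and the method names are assumptions made for this example.

```java
import java.util.Map;
import java.util.Optional;

class OptionalDemo {
    // Hypothetical lookup table used only for this illustration.
    private static final Map<String, String> CAPITALS =
            Map.of("France", "Paris", "Japan", "Tokyo");

    // Returns an empty Optional instead of null when the country is unknown.
    static Optional<String> capitalOf(String country) {
        return Optional.ofNullable(CAPITALS.get(country));
    }

    // The consumer always receives an object and handles both variants.
    static String describe(String country) {
        return capitalOf(country)
                .map(capital -> country + " -> " + capital)
                .orElse(country + " -> unknown");
    }

    public static void main(String[] args) {
        System.out.println(describe("France"));   // France -> Paris
        System.out.println(describe("Atlantis")); // Atlantis -> unknown
    }
}
```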

Read more about Optional at my Java Optional blog.

Pattern matching for Optional

The following checks can be used to identify Optional instances.
object instanceof Optional – A regular instanceof check against java.util.Optional.
Optional.class.isAssignableFrom(object.getClass()) – Using the Class.isAssignableFrom() matching.

Sealed types

Java has had final classes and non-final (open, abstract) classes forever. These were two extremes in terms of inheritance. Limiting inheritance and extension to a finite set of subtypes was not very easy prior to the recent introduction of sealed types. Sealed types were introduced in Java 15 as a preview feature.

A sealed type (class or interface) permits finite extensions or implementations while preventing any others not listed in the permits clause.

Each permitted type must itself be declared final, sealed or non-sealed (or can be a record, which is implicitly final; more on this later in the blog). A non-sealed type is open to extension.
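A minimal sketch of a sealed hierarchy, with hypothetical Shape types. It assumes a JDK where sealed types and records are final features (Java 17 or later); on Java 15 the same code would need the --enable-preview flag.

```java
// Hypothetical hierarchy used only for illustration.
sealed interface Shape permits Circle, Square, Polygon {}

// A final permitted type: no further extension.
final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
}

// A record is implicitly final, so it can be permitted directly.
record Square(double side) implements Shape {}

// A non-sealed permitted type re-opens the hierarchy.
non-sealed class Polygon implements Shape {}

// Allowed only because Polygon is non-sealed.
class Triangle extends Polygon {}

class SealedDemo {
    static String describe(Shape shape) {
        if (shape instanceof Circle c) return "circle of radius " + c.radius;
        if (shape instanceof Square s) return "square of side " + s.side();
        return "polygon"; // Polygon or any of its subclasses
    }

    public static void main(String[] args) {
        System.out.println(describe(new Circle(2.0)));   // circle of radius 2.0
        System.out.println(describe(new Triangle()));    // polygon
        System.out.println(Shape.class.isSealed());      // true
        System.out.println(Polygon.class.isSealed());    // false
    }
}
```

Declaring a fourth class that implements Shape without listing it in the permits clause fails at compile time, which is the finite-extension guarantee described above.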

Pattern matching for sealed types

The following checks can be used to identify instances of a sealed type.
object instanceof <Class> – A regular instanceof check against the class ancestry for the instance.
<Class>.class.isAssignableFrom(object.getClass()) – Using the Class.isAssignableFrom() matching.

Product Types

Class

A regular Java POJO (Plain Old Java Object) class is considered to be a product type. It is a composite which allows for attributes that are grouped together.

Pattern matching for POJO types

The following checks can be used to identify POJO instances.
object instanceof <Class> – A regular instanceof check against the class ancestry for the instance.
<Class>.class.isAssignableFrom(object.getClass()) – Using the Class.isAssignableFrom() matching.

Tuple

A tuple is a special POJO class extending what a Class offers. Tuples are currently not included in Java, except perhaps for the ephemeral Map.Entry (API) that is available while iterating over Map instances. Tuples can include Unit, Pair, Twin, Triple etc. Tuples are useful when working with collections.
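Since the JDK has no general tuple type, a Pair usually ends up hand-written. The sketch below is one hypothetical shape for such a class, next to the Map.Entry that the JDK does provide (Map.entry() is available from Java 9).

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical Pair type; not part of the JDK.
final class Pair<A, B> {
    private final A first;
    private final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    A first() { return first; }
    B second() { return second; }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Pair)) return false;
        Pair<?, ?> that = (Pair<?, ?>) other;
        return Objects.equals(first, that.first) && Objects.equals(second, that.second);
    }

    @Override
    public int hashCode() { return Objects.hash(first, second); }

    @Override
    public String toString() { return "(" + first + ", " + second + ")"; }
}

class TupleDemo {
    public static void main(String[] args) {
        Pair<String, Integer> pair = new Pair<>("answer", 42);
        System.out.println(pair); // (answer, 42)

        // The closest built-in equivalent: an immutable key/value entry.
        Map.Entry<String, Integer> entry = Map.entry("answer", 42);
        System.out.println(entry.getKey() + " / " + entry.getValue()); // answer / 42
    }
}
```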

Pattern matching for Tuple types

The following checks can be used to identify Tuple instances.
object instanceof <Class> – A regular instanceof check against the class ancestry for the instance.
<Class>.class.isAssignableFrom(object.getClass()) – Using the Class.isAssignableFrom() matching.

Record

A record is a (shallowly) immutable data object introduced in Java 14 as a preview feature.

A record can be declared with just its attributes. An all-attribute constructor and accessors (getters) for each attribute are synthetically generated. The accessors do not have a get prefix normally used in POJOs. The name of the attribute is the same as the name of the accessor for the attribute. Mutator (setter) methods are not allowed on a record.

A record also allows for a Compact Constructor. This constructor behaves the same as an all-attribute constructor, without having to list the attributes in the constructor signature. A record can use the compact constructor to validate / enforce rules during instantiation.

A record can implement an interface. A record works well with sealed types. A sealed type can permit records.

Starting with Java 15, a local record can be created within a method. This limits the scope of the record to that method.
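A short sketch pulling these record features together: a compact constructor that validates, an accessor without a get prefix, and a local record scoped to one method. Temperature and Candidate are hypothetical names, and the code assumes a JDK where records are a final feature (Java 16 or later); on Java 15 it would need --enable-preview.

```java
// Hypothetical record used only for illustration.
record Temperature(double celsius) {
    // Compact constructor: no parameter list; the field is assigned after this body runs.
    Temperature {
        if (celsius < -273.15) {
            throw new IllegalArgumentException("below absolute zero: " + celsius);
        }
    }

    // Records can declare additional methods, but no setters.
    double fahrenheit() { return celsius * 9 / 5 + 32; }
}

class RecordDemo {
    // Returns the warmest of the given readings.
    static Temperature warmest(double... celsiusReadings) {
        // Local record: visible only inside this method.
        record Candidate(int position, Temperature temperature) {}

        Candidate best = null;
        for (int i = 0; i < celsiusReadings.length; i++) {
            Temperature current = new Temperature(celsiusReadings[i]);
            if (best == null || current.celsius() > best.temperature().celsius()) {
                best = new Candidate(i, current);
            }
        }
        if (best == null) throw new IllegalArgumentException("no readings");
        return best.temperature();
    }

    public static void main(String[] args) {
        Temperature boiling = new Temperature(100);
        System.out.println(boiling.celsius());    // accessor is celsius(), not getCelsius()
        System.out.println(boiling.fahrenheit()); // 212.0
        System.out.println(warmest(1.0, 7.5, 3.0).celsius()); // 7.5
    }
}
```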

Pattern matching for record types

The following checks can be used to identify record instances.
object.getClass().isRecord() – Using the Class.isRecord().
object instanceof Record – A regular instanceof check against java.lang.Record.
Record.class.isAssignableFrom(object.getClass()) – Using the Class.isAssignableFrom() matching.
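The three checks can be tried side by side. Point is a hypothetical record, and the sketch assumes Java 16 or later, where java.lang.Record and Class.isRecord() are available without preview flags.

```java
// Hypothetical record used only for illustration.
record Point(int x, int y) {}

class RecordMatchDemo {
    // True when the object is an instance of any record type.
    static boolean isRecordInstance(Object object) {
        return object instanceof Record;
    }

    public static void main(String[] args) {
        Object point = new Point(1, 2);
        Object text = "not a record";

        System.out.println(point.getClass().isRecord());                    // true
        System.out.println(isRecordInstance(point));                        // true
        System.out.println(Record.class.isAssignableFrom(point.getClass())); // true
        System.out.println(text.getClass().isRecord());                     // false
    }
}
```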

Summary

We touched upon a few Algebraic Data Types in Java. There is a lot more to discuss, including Catamorphism, Homomorphism and Anamorphism. The next blog will include these. Additionally, we will look into the Visitor Pattern and Variance (Invariance, Covariance and Contravariance).

Further, there is excellent reading material available online. Here are a few links to read more:

Thanks for reading !!!

#CommunityFIRST #SharingIsCaring #OpenSourceFUN

Understanding Apache Maven – The Series

java, maven

This is a series of blogs about Apache Maven – a build management tool. This series is intended to be an introductory set of blogs to introduce, familiarize or brush up on maven.

The blogs in this series:

Part 1 – Apache Maven basics
Bare Link: https://cguntur.me/2020/05/23/understanding-apache-maven-part-1/

Part 2 – The Project Object Model (POM) and Effective POMs
Bare Link: https://cguntur.me/2020/05/24/understanding-apache-maven-part-2/

Part 3 – Dependency coordinates and POM hierarchies
Bare Link: https://cguntur.me/2020/05/26/understanding-apache-maven-part-3/

Part 4 – Maven Lifecycles, Phases, Plugins and Goals
Bare Link: https://cguntur.me/2020/05/29/understanding-apache-maven-part-4/

Part 5 – Dependencies in Maven
Bare Link: https://cguntur.me/2020/06/03/understanding-apache-maven-part-5/

Part 6 – Maven Project Object Model (POM) Reference
Bare Link: https://cguntur.me/2020/06/20/understanding-apache-maven-part-6/

Part 7 – Configuring Apache Maven
Bare Link: https://cguntur.me/2020/06/27/understanding-apache-maven-part-7/

Part 8 – Maven Plugins
Bare Link: https://cguntur.me/2020/07/04/understanding-apache-maven-part-8/

Part 9 – Versions in Maven
Bare Link: https://cguntur.me/2020/07/05/understanding-apache-maven-part-9/

Understanding Apache Maven – Part 9 – Versions in Maven

java, maven

Published: 2020-07 (July 2020)
Verified with: Apache Maven 3.6.3

Link to an index, to find other blogs in this series.

In Part 9 of the series, versions in maven are covered.

In Part 3, version schemes in Maven were briefly covered. A quick recap: a version is a string that uniquely identifies a set of changes made to the project. Maven does not understand version schemes and cannot compare versions using numeric (or alphanumeric) comparisons. Some plugins may build additional intelligence for such comparisons.

Maven uses the version as a coordinate in identifying an artifact.

Common Version naming conventions

Development cycle

During a development cycle, code is often altered. Changing the version number for each change in code would produce too many throw-away versions. Each deployment where the artifact gets tested would need to update its version number. In addition, if other projects depend on such an artifact, updating the version number with each build can cause ripple effects. As a solution, maven provides what is known as a SNAPSHOT version. A SNAPSHOT version is usually the same as the targeted final version with a suffix of -SNAPSHOT.

The project POM declares its coordinates with the SNAPSHOT suffix and iteratively builds newer artifacts. Metadata for the produced artifact gets updated, but the version remains constant throughout the development cycle. Deployments can rely on this SNAPSHOT version as a coordinate to get the latest artifact. Other projects that depend on this project can also pick up the latest changes with a re-build and no other version changes.

A version with a SNAPSHOT suffix is a mutable artifact that eases the development cycle. Mutability is both a benefit and a liability. More on this later.

Release cycle

Once the desired group of changes is verified/tested, a project reaches a release cycle. The artifact produced from building the project can be marked immutable. Any further change would have to be scoped into another future build, with a different version. Such builds are commonly called release builds. The artifacts produced from such a build are immutable. Version schemes for such artifacts may have no suffix or may have an alternate suffix such as -FINAL or -RELEASE or -GA etc.

Transitioning from SNAPSHOT to a release

There are several ways to convert a SNAPSHOT version to a release when the time comes. Versions can be manually altered prior to a build. However, the build sanctity is potentially violated with any manual change, hence using an automated process to update the version is preferred. Two sample mechanisms commonly used are detailed below.

Using a versions-maven-plugin

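The versions-maven-plugin (group and artifact: org.codehaus.mojo:versions-maven-plugin) has a goal use-releases that replaces -SNAPSHOT dependency versions with their corresponding release versions, where such releases exist. The project's own version can be changed with the set goal. The plugin has several other goals that are quite useful.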

Link to the goals page for the versions-maven-plugin: https://www.mojohaus.org/versions-maven-plugin/plugin-info.html

Using a maven-release-plugin

The maven-release-plugin (group and artifact: org.apache.maven.plugins:maven-release-plugin) is another means of controlling versions. A goal update-versions in the plugin allows for setting versions.

Link to the goals page for the maven-release-plugin: https://maven.apache.org/maven-release/maven-release-plugin/plugin-info.html

What type of version to use in a POM?

Ideally, since a POM is modified during development, it is best to use the -SNAPSHOT suffix in the POM. A project during its release can shed the -SNAPSHOT through many means, two of which were covered above. In addition, the above mentioned plugins also support bumping the version number to the next SNAPSHOT for a future development cycle right after the build for the release completes.

Typically, projects start with a 0.0.1-SNAPSHOT or a 1.0.0-SNAPSHOT. Following Semver 2.0 rules is heavily recommended for the numeric portion of the version, since it provides visual cues for a developer’s understanding.

Common Version Strategy in Maven. Development cycles re-use the SNAPSHOT version, Release produces immutable artifact version.

Controlling Versions in Maven

Version Ranges in Maven

Maven supports version ranges. At times it is possible for a POM to be a bit flexible in accepting a range of versions of a dependency. This flexibility can stem from some underlying assumptions such as backward compatibility or a minimal dependence on the said dependency. It is also a means of restricting an allowed version to be within a specified set of versions.

Hard versus Soft Requirements

Maven version values can either be a soft requirement or a hard requirement.

As the names suggest, a soft requirement is a replaceable version with the current value being the preferred version to use. If the dependency graph contains a different version with alternate requirements, it can be picked over the current version value. This is more often a single value than a range. Most POMs use a soft requirement. Specifying a version without any range or restrictions (for example 1.3.8 or 2.0.0-alpha) implies it is a soft requirement.

A hard requirement is a pattern that restricts the version to be selected. A hard requirement uses square brackets (inclusion) and parentheses (exclusion) to determine allowed version values. A few examples of hard requirements are tabulated below.

Range            Notes
[1.0]            Use exactly version 1.0.
(,1.0]           Use any version <= 1.0. Flexible on versions before and including 1.0, restricts any values above 1.0. Note the initial comma.
[1.0,1.3]        Use any version from 1.0 (inclusive) up to and including 1.3.
[1.0,1.2)        Use any version from 1.0 (inclusive) and above, up to and excluding 1.2.
[1.2,)           Use any version from 1.2 (inclusive) and above.
(,1.0],[1.2,)    Use any version <= 1.0 or any version >= 1.2.
(,1.1),(1.1,)    Use any version except 1.1.
Version Range patterns
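To make the bracket notation concrete, here is a simplified sketch in Java that checks a version against a range pattern. It is not Maven's own implementation: the VersionRangeSketch class is hypothetical, it handles only dotted numeric versions (no qualifiers such as -SNAPSHOT or -alpha), and it assumes a well-formed pattern without whitespace.

```java
class VersionRangeSketch {

    // Compares dotted numeric versions such as "1.2" and "1.10"; missing parts count as 0.
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) return Integer.compare(xi, yi);
        }
        return 0;
    }

    // True when the version satisfies at least one interval of the range pattern.
    static boolean matches(String spec, String version) {
        int pos = 0;
        while (pos < spec.length()) {
            char open = spec.charAt(pos);
            if (open == ',') { pos++; continue; } // separator between two intervals
            int close = pos + 1;
            while (spec.charAt(close) != ']' && spec.charAt(close) != ')') close++;
            String body = spec.substring(pos + 1, close);
            boolean lowerInclusive = open == '[';
            boolean upperInclusive = spec.charAt(close) == ']';
            if (inInterval(body, lowerInclusive, upperInclusive, version)) return true;
            pos = close + 1;
        }
        return false;
    }

    private static boolean inInterval(String body, boolean lowerInclusive,
                                      boolean upperInclusive, String version) {
        int comma = body.indexOf(',');
        if (comma < 0) return compare(version, body) == 0; // "[1.0]" means exactly 1.0
        String lower = body.substring(0, comma);
        String upper = body.substring(comma + 1);
        if (!lower.isEmpty()) {
            int c = compare(version, lower);
            if (c < 0 || (c == 0 && !lowerInclusive)) return false;
        }
        if (!upper.isEmpty()) {
            int c = compare(version, upper);
            if (c > 0 || (c == 0 && !upperInclusive)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("[1.0,1.2)", "1.1"));     // true
        System.out.println(matches("[1.0,1.2)", "1.2"));     // false
        System.out.println(matches("(,1.0],[1.2,)", "1.1")); // false
        System.out.println(matches("(,1.1),(1.1,)", "1.1")); // false
    }
}
```

An empty bound on either side of the comma means "unbounded", which is why the initial or trailing comma in patterns such as (,1.0] and [1.2,) matters.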

Version ranges are a powerful option. One of the major drawbacks of using ranges, in general, is a lack of build reproducibility. There is no guarantee of the version that will be chosen if a range is provided. There are better options using dependencyManagement to control versions. This blog will also cover the maven-enforcer-plugin, which can do more with version ranges.

Using dependencyManagement and pluginManagement

As was covered in Part 5, versions for a dependency can be controlled using a dependencyManagement section.

A quick recap: the dependencyManagement element serves as a lookup reference where dependencies are listed via their location coordinates. Each entry must include the GAV coordinates (groupId, artifactId, version), may include distinguishers (classifier, type) and may include exclusions, scope or an optional flag.

Dependencies then declared in the POM directly can be listed with just the groupId and artifactId, the rest of the information can be fetched from the lookup to the dependencyManagement block.

Not everything listed in the dependencyManagement block need be used in the actual POM, since none of the dependencies listed in the block are directly used in generating an effective POM.

As was covered in Part 8, versions for a plugin can be controlled using a pluginManagement section.

A similar recap applies for pluginManagement, where plugins can be listed via their GAV coordinates, and may optionally include dependencies, configuration, and flags for extensions and inherited.

Similar to the pattern followed by dependencyManagement, plugins configured in pluginManagement are not directly used in the build, but are looked up, when encountered in the POM’s build section with a groupId and artifactId.

Using a maven-enforcer-plugin

Maven provides a very versatile tooling plugin to allow centralized control over the build environment from a single POM (inherited to child POMs) and allows for greater flexibility in version specifications by supporting version ranges.

The actual name of the plugin: Maven Enforcer Plugin – The Loving Iron Fist of Maven TM. Link to the goals of the plugin: http://maven.apache.org/enforcer/maven-enforcer-plugin/plugin-info.html

The enforcer plugin deserves a blog unto itself, but as a preview, the enforcer provides built-in rules and a capability to extend the same with custom rules via a rich Maven Enforcer Rule API. Custom rules require creating a custom enforcer rule by implementing the EnforcerRule interface. The custom rule is packaged as a jar and included as a dependency to the plugin. The rule can be invoked by including the fully-qualified class name under the configuration for the plugin.

Link to the enforcer API: http://maven.apache.org/enforcer/enforcer-api/writing-a-custom-rule.html

Enforcer provides built-in rules such as bannedDependencies, dependencyConvergence, requirePluginVersions, requireReleaseVersion, requireSameVersions etc. and allows for custom rules to be defined. These rules can help control the versions and fail the build if a rule is not satisfied.

Link to built-in rules: http://maven.apache.org/enforcer/enforcer-rules/index.html

Handling Backward Compatibility

Projects, especially libraries, strive not to break backward compatibility. Backward compatibility is a guarantee that upgrading to the newer version of the library WILL NOT break existing usage. While this goal is utopian, there is a necessity to, at times, break backward compatibility, due to either security constraints or to allow enhancements that are not possible without such a breaking change.

In such situations, it is best to rename the groupId or artifactId so the dependency resolution conflicts are more visible and appropriate exclusions or changes in the project code can be made. Changing the groupId or artifactId provides the necessary separation to indicate a breaking change. It also allows for a continued support of the prior functionality, if deemed necessary.

That’s a wrap on this blog. Have fun !


Understanding Apache Maven – Part 8 – Maven Plugins

java, maven

Published: 2020-07 (July 2020)
Verified with: Apache Maven 3.6.3

Link to an index, to find other blogs in this series.

In Part 8 of the series, a deeper dive in Maven Plugins is covered.

In Part 4, Maven Lifecycles and Phases were introduced. A brief overview of Plugins and Goals was also included. A quick recap:

  • Apache Maven provides standard lifecycles.
  • Lifecycles have phases that execute sequentially.
  • Execution of maven is done via plugins which define goals.
  • Goals can be associated with phases (or may simply be run independent of phases).

What are plugins?

Maven is a plugin-execution framework. A plugin is an assembly of goals, with the code written as MOJOs (Maven plain Old Java Objects; modern MOJOs are not restricted to being written in Java). Goals have names and can be bound to phases. A MOJO declares its goal name and, optionally, a phase association, which binds the class to a part of a lifecycle.

Image displays a plugin-lifecycle relationship. Plugins define goals. Goals can be bound to phases. Goals from multiple plugins can be bound to a single phase.

A plugin is typically a .jar file which contains the MOJO classes and a plugin descriptor at META-INF/maven/plugin.xml. This plugin.xml is generated as a part of the maven build of the plugin code.

Types of plugins

Broadly, plugins are of two types:

build plugins – configured in a project POM under the <build> element. Such plugins are executed as a part of the default (build) lifecycle.

reporting plugins – configured in a project POM under the <reporting> element. Such plugins are executed as a part of the site lifecycle.

Furthermore, plugins can be classified as:

core plugins – Plugins where goals are bound to core phases (clean, compile, install, resources etc.)

packaging plugins – Plugins related to output artifact packaging. Examples include plugins for ear, jar, war etc.

reporting plugins – Plugins related to the site lifecycle and used to generate reports. Examples include checkstyle, javadoc, project-info-reports etc.

tooling plugins – Plugins related to general tooling during the maven execution. Examples include assembly, help, enforcer, toolchains etc.

Plugins from Maven – versus – custom plugins

Official Maven plugins developed as a part of Apache Maven have a standard naming convention: maven-<plugin shortName>-plugin. This naming convention is reserved and SHOULD NOT be used by plugins which do not have a groupId of org.apache.maven.plugins and are not found on a maven repository under the org/apache/maven/plugins directory.

Plugins developed with other groupIds typically have a name of <plugin shortName>-maven-plugin.

How to learn about a plugin

Standard plugins from Apache Maven have a consistent site structure under a common site: https://maven.apache.org/plugins/index.html.

Each plugin landing page has a few menu items under its navigation panel. An Introduction page provides an overview of the plugin. A Goals page lists all goals defined by the plugin, with a deeper explanation of each goal’s intent. A Usage page provides configuration options and any relevant information regarding constraints for the plugin. An FAQ page holds responses to frequently asked questions. There are additional pages for License and Download of the plugin as well.

In addition to the standard menu items, a plugin landing page can provide links to Examples and Project Documentation.

The plugin site is the best way to start understanding a plugin provided by Apache Maven. Plugins developed external to Apache Maven should attempt to follow similar conventions to ensure easier comprehension by the users.

How to use plugins in a project POM

Plugins are configured in a POM under either the <build> or the <reporting> or the <profiles> -> <profile> element. A plugin can be located using the standard maven G-A-V (GroupId-ArtifactId-Version) coordinates. In addition to the location coordinates, a plugin has a few other elements that are optional.

Extensions

The extensions element is a flag that determines whether Maven extensions from the plugin should be loaded. The value is either true or false; however, the current datatype in the schema is a String (for some technical reasons). The default value is false and it is rarely enabled. A typical use case for enabling it is defining a custom lifecycle or packaging type.

Inherited

Inheritance is covered in a section below. In short, inherited is a boolean flag that is true by default. As with extensions, the datatype in the schema is a String. Setting inherited to false prevents propagation of the configuration to any child POM of the current one.

Executions

The executions element is a complex element that contains a set of execution elements. At its core, maven executes such definitions. An execution specifies the set of goals to execute during the lifecycle. An execution is a complex element that has a unique id, a phase to bind one or more goals to, an inherited flag (similar to the one defined for a plugin, also set as a String datatype), a goals element, which is a set of String goal elements that are bound to the phase, and a generic configuration element.

Dependencies

The dependencies is a complex element and contains a set of dependency elements. The dependency definitions listed here are used by the plugin and loaded by the plugin classloader.

Configuration

The configuration is a complex element which allows for a free-form DOM configuration used by the plugin. The configuration specifics are typically listed (and recommended, in case of custom plugins) in the Usage page for a given plugin.

A visual of the plugin element

Plugin Inheritance

Plugins have an inheritance logic similar to dependencies. A plugin declared in a parent POM is inherited into the child POM unless the parent sets the inherited flag to false. Setting the inherited flag to false breaks the inheritance.

In addition, a pluginManagement section in a POM functions the same way a dependencyManagement works for a dependency.

There is currently no equivalent of a bill-of-materials for plugins, although there is an open issue for introducing such a facility. Link: https://issues.apache.org/jira/browse/MNG-5588

Mixin Maven Plugin

Not officially an Apache Maven plugin, but the mixin-maven-plugin deserves a special mention. The plugin allows for including multiple pluginManagement sections without needing to inherit the plugins from a single parent. Using this plugin allows for a build behavior to be made more modular.

More about the mixin-maven-plugin: https://github.com/odavid/maven-plugins/tree/master/mixin-maven-plugin

Links to learn more

Introduction to Plugins: https://maven.apache.org/guides/introduction/introduction-to-plugins.html

Maven Plugins: https://maven.apache.org/plugins/index.html

Guide to configuring plugins: https://maven.apache.org/guides/mini/guide-configuring-plugins.html

Guide to developing plugins: https://maven.apache.org/guides/plugin/guide-java-plugin-development.html

Maven MOJO API: https://maven.apache.org/developers/mojo-api-specification.html

Plugin descriptor (Apache Maven 3.6.3): https://maven.apache.org/ref/3.6.3/maven-plugin-api/plugin.html

That’s a wrap on this blog. Have fun !


Understanding Apache Maven – Part 7 – Configuring Apache Maven

java, maven

Published: 2020-06 (June 2020)
Verified with: Apache Maven 3.6.3

Link to an index, to find other blogs in this series.

In Part 7 of the series, various means of configuring Apache Maven are covered.

Why Configure Apache Maven?

Over several parts in this series, Maven was touted to be convention-over-configuration. Maven assumes defaults and allows for overrides where possible.

However, Maven can depend on constraints external to what is packaged. Examples include the JDK to use (assuming there are several JDKs on the computing device), the flags to set (memory, Garbage Collection flags for tests etc.), system / environment variables, repository information, extensions to maven and so on.

The examples mentioned above are external to both the maven executable/distribution and the POM that is authored. Hence the need for a means of configuration.

Options to configure Maven

  • Using environment variables
  • Using config files under a .mvn directory in the project
  • Using XML configurations

Configuring Maven – Environment Variables

There are a few environment variables (abbreviated as env. var. in the blog) useful for maven execution of a POM. Listing a few common ones here:

JAVA_HOME

The JAVA_HOME env. var. points to the location of the JDK. This is useful to set, especially if there is more than one JDK installation on the computing device. The value is an absolute path to a directory which contains the JDK executable binaries and libraries.

M2_HOME

The M2_HOME env. var. points to the location of the Apache Maven installation. This is useful especially if there is more than one version of Apache Maven installed on the computing device. Apache Maven may also rely on this env. var. for the maven location. The value is an absolute path to a directory which contains the maven binaries and libraries.

PATH

The PATH env. var. includes locations where the operating system looks for executable binaries and scripts. Including a path to the JAVA_HOME/bin (or JAVA_HOME\bin for Windows) and a path to M2_HOME/bin (or M2_HOME\bin for Windows) allows for java, javac, mvn and other executables therein to be accessed from any directory.

MAVEN_OPTS

The MAVEN_OPTS env. var. is useful for setting JVM options to be used during the maven execution of the POM. Common use cases include setting of memory and garbage collection options.

Configuring Maven – Config files

Maven allows for customization on a per-project basis via config files. These files are located in a .mvn directory under the project root directory. The directory can contain a few config files.

The jvm.config file

The project root directory in maven is referred to as projectBaseDir.

The ${maven.projectBaseDir}/.mvn/jvm.config file is a more modern take on specifying JVM arguments and can be used in lieu of the MAVEN_OPTS env. var. covered earlier. It also replaces the older .mavenrc file (which had to be located in the logged-in user’s HOME directory). The newer jvm.config allows for customizing JDK/JVM options on a per-project basis, and the file can be checked into a source control system to be shared with other developers on the same project.

The maven.config file

A ${maven.projectBaseDir}/.mvn/maven.config file is useful for setting maven command line options that need to be set for normal execution.

Most developers simply memorize the standard commands to run maven:

mvn clean install
mvn clean verify
etc.

Forgetting to set other command line interface (CLI) options that a project requires can result in unexpected or unwanted outcomes from the execution.

An example could be that the project POM relies on SNAPSHOT versions, but requires force updates of SNAPSHOTs every build. This is done via a -U CLI option. Similarly, a project may wish to fail the execution of maven if the checksums for artifacts fail. This can be done via a --strict-checksums CLI option. Knowing to pass such options would usually require reading some documentation.

The maven.config file allows such options to be set and checked into version control, so that they are also used by other developers on the project.

Configuring Maven – XML files

In Part 2 of the series, the global settings XML file and a user-home settings.xml were covered. There are additional configuration files that maven provides.

Global settings.xml

Located under the maven installation conf directory is a settings.xml file that is applicable to any user who uses the installed version of maven on the computing device. Typical usage of this settings file is for corporate settings, proxies within the network, approved mirror sites etc. It is not recommended to heavily customize/personalize this file since it will equally impact all users on the said computing device.

User Home settings.xml

Located at ~/.m2/settings.xml (or ${USER_HOME}\.m2\settings.xml), this settings file allows for personalization on a per-user basis. Typical usage is to set up any usernames and passwords, default mirror, default profiles, repository and pluginRepository settings. Note that any configurations defined here apply to ALL maven projects executed by the current user.

Maven extensions.xml

In Part 6 of the series, extensions were mentioned as additional enhancements to maven behavior. The extensions are artifacts that get added to maven’s own classpath (not the project classpath) during execution.

Prior to maven extensions.xml (up to Apache Maven 3.2.5), such artifacts had to be compiled into shaded jar files which would need to be copied into the maven installation’s lib/ext directory so they could be picked up.

Starting with Apache Maven 3.3.1, an easier solution is to treat such artifacts similar to dependency resolutions via a file located under the project root’s .mvn directory.

The path to maven extensions is ${maven.projectBasedir}/.mvn/extensions.xml. The file contains a root extensions element which can contain a set of extension elements, each of which provides the groupId, artifactId and version coordinates for the extension.
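A minimal sketch of such a file is shown below. The coordinates are hypothetical placeholders and do not refer to a real extension; the optional schema declaration on the root element is left out for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- .mvn/extensions.xml : the coordinates below are hypothetical placeholders -->
<extensions>
  <extension>
    <groupId>com.example</groupId>
    <artifactId>example-maven-extension</artifactId>
    <version>1.0.0</version>
  </extension>
</extensions>
```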

A full schema for extensions: http://maven.apache.org/ref/3.6.3/maven-embedder/core-extensions.html

Maven toolchains.xml

There can be use cases where the JDK used by maven to launch is different from the JDK used to build the project. Also there could be a use case to build the same project with different JDKs via profiles. Such use cases can be accomplished by using toolchains.

Plugins that are toolchain-aware can benefit from a defined toolchain and switch to use a JDK that matches conditions specified in the toolchains.xml. The toolchains.xml is usually located at the project root and is invoked via a flag on the maven command line. Using toolchains requires including the maven-toolchains-plugin in the POM.

Toolchains can either be local to the project or global (across all projects on a given computing device). The recommended location for global toolchains is ~/.m2/toolchains.xml (or ${USER_HOME}\.m2\toolchains.xml). Global toolchains are invoked with a maven command line:

mvn clean verify --global-toolchains ~/.m2/toolchains.xml
mvn clean verify -gt ~/.m2/toolchains.xml

Local toolchains are similarly invoked using a maven command line:

mvn clean verify --toolchains toolchains.xml
mvn clean verify -t toolchains.xml

The toolchains.xml file has a root toolchains element which contains a set of toolchain elements. The toolchain element contains a few child elements.

  • A type element (standard value is jdk, creating custom toolchains allows for other values).
  • A provides element is a collection of properties (<key>value</key>). These properties can be used as conditions when configuring the maven-toolchains-plugin in the POM. Matching the conditions results in the specific toolchain being selected.
  • A configuration element is another properties element that typically is used to provide the location of the JDK in the standard offering but can be customized when creating bespoke toolchains.
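The structure described above can be sketched as follows. The version and vendor values and the JDK path are assumptions made for this example; the working sample linked further down shows a real configuration.

```xml
<!-- toolchains.xml : illustrative sketch, the vendor value and jdkHome path are placeholders -->
<toolchains>
  <toolchain>
    <type>jdk</type>
    <!-- properties that the POM can match against -->
    <provides>
      <version>11</version>
      <vendor>example-vendor</vendor>
    </provides>
    <!-- where the matching JDK is installed -->
    <configuration>
      <jdkHome>/path/to/jdk-11</jdkHome>
    </configuration>
  </toolchain>
</toolchains>
```

On the POM side, the maven-toolchains-plugin states the conditions to match. The plugin version below is an illustrative choice.

```xml
<!-- pom.xml excerpt : selects the toolchain whose provides section matches these conditions -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-toolchains-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <goals>
        <goal>toolchain</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <toolchains>
      <jdk>
        <version>11</version>
        <vendor>example-vendor</vendor>
      </jdk>
    </toolchains>
  </configuration>
</plugin>
```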

Excellent documentation on toolchains is available on the Apache Maven site:
Link: https://maven.apache.org/ref/3.6.3/maven-core/toolchains.html
Link: https://maven.apache.org/guides/mini/guide-using-toolchains.html
Link: https://maven.apache.org/plugins/maven-toolchains-plugin/

A working sample for toolchains

toolchains.xml: https://github.com/c-guntur/jvms-compare/blob/master/toolchains.xml

Usage of a specific toolchain in a pom.xml:
toolchains.xml excerpt: https://github.com/c-guntur/jvms-compare/blob/master/toolchains.xml#L44-L54
pom.xml excerpt: https://github.com/c-guntur/jvms-compare/blob/master/pom.xml#L401-L420
(The configuration in the pom maven-toolchains-plugin looks for an AdoptOpenJDK Hotspot Java 11)
Three different means of configuring Maven: Environment variables, .mvn Config files and XML configurations

That’s a wrap on configuring Maven. Have fun!


Understanding Apache Maven – Part 6 – POM Reference

java, maven

Published: 2020-06 (June 2020)
Verified with: Apache Maven 3.6.3

Link to an index, to find other blogs in this series.

In Part 6 of the series, a walkthrough of POM content (XML) is covered.

This blog is not meant to be a full dissection of a pom.xml, since the Apache Maven doc does the job so well that anything else written would at best be a copy of that content. It is strongly recommended to peruse the linked doc below.

Excellent documentation of a pom.xml on the Apache Maven site: https://maven.apache.org/ref/3.6.3/maven-model/maven.html.

With the assumption that the above linked content has been read and bookmarked for future use, this blog will go through some of the common portions of the pom.xml.

Reminder: Apache Maven is polyglot. XML was the first and most commonly used format for describing a POM. This blog assumes XML format but other formats share the same logic.

The project

POM contents. The dark background elements have complex structures while the light background is for simple elements. Build and Profiles have additional diagrams

This is the root element of a POM (Project Object Model). All convention overrides of a maven project are listed under the project element in the XML. A parent is identified by its coordinates (groupId, artifactId and version) and an optional relativePath. The relativePath by convention expects a parent to exist one directory above. This relativePath value can be overridden to point to relative alternate locations (such as same directory) or an empty value, to ignore searching locally and only search in configured repositories.

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_project

Project Model modelVersion

A project is required to conform to an XML Schema Definition (XSD) version. Apache Maven 3.6.3 depends on Model 4.0.0. Ongoing discussions propose a conformity in future releases (Apache Maven 5 potentially being the next, possibly skipping 4, and the model version for such to be 5.0.0). A POM requires a modelVersion element that is set to 4.0.0 for use with Apache Maven 3.x.x.

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_project

Project Coordinates

A project will need to specify its GAV (Group-Artifact-Version) coordinates. A POM can also specify a parent from which a groupId and version can be inherited. While a project can share the groupId and version with its parent, it will require a unique artifactId under that group to distinguish itself from other projects under the same group. Also, the groupId and version of the current POM can be specified in the pom.xml, in which case they override whatever values the parent provides. Maven works on convention and overrides.
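As a small sketch of the above, a child POM that inherits its groupId and version from a parent could look like this. All coordinates are hypothetical.

```xml
<!-- child pom.xml : illustrative sketch with hypothetical coordinates -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.example</groupId>
    <artifactId>example-parent</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <!-- an empty relativePath skips the local lookup and resolves the parent from repositories;
         omitting the element keeps the convention of ../pom.xml -->
    <relativePath/>
  </parent>
  <!-- groupId and version are inherited; only a unique artifactId is required -->
  <artifactId>example-child</artifactId>
</project>
```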

When inheriting from a parent POM, Apache Maven inherits the following:

  • any coordinates (typically a groupId and a version)
  • properties element
  • url, inceptionYear, organization, developers, contributors, mailingLists, scm elements
  • issueManagement, ciManagement elements
  • dependencies and dependencyManagement elements
  • repositories and pluginRepositories elements
  • plugins element along with any plugin executions and plugin configurations
  • reporting element
  • profiles element

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_project

Output packaging

This element (defaults to a jar) defines the output artifact type when the POM is executed. Typical values may include (but are not limited to): jar, war, ear, pom etc.

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_project

Project name, description and url

The name can be used to provide a short, human-readable title for the project. If not included, maven falls back to the artifactId of the project. The name is displayed in the output when executing the POM.

The description can be used to include a more verbose description of the project’s intent. It is optional to include a description, but generally considered a good practice to include one.

The url can be used to provide a link to a webpage or site relevant to the project. It is optional to include a URL.

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_project

Relationship modules

If the current project itself is a parent or an aggregator POM (see Part 2 for definitions), then the optional modules element can be used. Each listed module refers to a relative path to the child project’s directory. It is considered a best-practice to name the artifactId of the child the same as its base directory.
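A minimal sketch of an aggregator POM follows. The coordinates and module names are hypothetical; each module value is the relative path to a child project directory, here named the same as the child artifactId.

```xml
<!-- aggregator pom.xml : illustrative sketch with hypothetical coordinates -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>example-aggregator</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <modules>
    <!-- directories relative to this pom.xml -->
    <module>example-api</module>
    <module>example-impl</module>
  </modules>
</project>
```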

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_project

Source Control Management scm

The scm element allows specifying the connection information to the source control system for the current project. This information is valuable to the release process for tagging the source code. IDEs too can benefit from determining the source control location.

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_scm

Dependency & Plugin Artifact Search repositories and pluginRepositories

When maven executes a POM and builds an artifact, it also builds in some metadata. This metadata includes a lot of content from the POM. Adding repository and pluginRepository sections in the POM means that potential consumers of the current project artifacts will need to resolve from the same location as was used in the current POM. This may cause potential problems if the current repositories are private or under some limited access. It is best to include these elements in a settings.xml instead. More on this later, but a link to a detailed reference to the settings.xml is included here: https://maven.apache.org/ref/3.6.3/maven-settings/settings.html. It describes what content is valid in a settings.xml.

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_repository
Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_pluginRepository

Artifact Publishing distributionManagement

A distributionManagement section is used to provide locations for publishing the build outputs (artifacts as well as site content). It allows for specifying repository locations where either a SNAPSHOT version or a release version of the artifact can be pushed. Additionally, the location to deploy the site content can be included. In most commercial and large workplaces, a parent POM for the organization provides a generic set of distribution management which the current project can inherit from.
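A sketch of such a section is shown below. The ids and URLs are placeholders assumed for illustration.

```xml
<!-- pom.xml excerpt : illustrative sketch, ids and URLs are placeholders -->
<distributionManagement>
  <!-- release versions are published here -->
  <repository>
    <id>example-releases</id>
    <url>https://repo.example.com/releases</url>
  </repository>
  <!-- SNAPSHOT versions are published here -->
  <snapshotRepository>
    <id>example-snapshots</id>
    <url>https://repo.example.com/snapshots</url>
  </snapshotRepository>
  <!-- generated site content is deployed here -->
  <site>
    <id>example-site</id>
    <url>scp://site.example.com/www/docs/project</url>
  </site>
</distributionManagement>
```

The id values would normally match server entries in a settings.xml, so that credentials stay out of the POM.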

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_distributionManagement

Issue Tracking & Continuous Integration issueManagement and ciManagement

It is good discipline for a project to have issue and bug trackers. These are used to identify the rationale for changes being made to the source code, either for maintaining a history of changes or for auditing why a change was made. The issueManagement section is where the location of the tracking system is specified. It is commonly used in site generation.

Similar to tracking issues, it is considered good discipline for a project to have a continuous integration build. Builds could be triggered either on a change in the project or manually or on a periodic basis. Similar to configuring issue management, maven POM has a ciManagement section which allows specifying the location as well as notification configuration for success and failures of builds.
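Both sections can be sketched as below. The system names, URLs and the notification address are hypothetical.

```xml
<!-- pom.xml excerpt : illustrative sketch, systems, URLs and address are placeholders -->
<project>
  <issueManagement>
    <system>example-tracker</system>
    <url>https://issues.example.com/browse/PROJECT</url>
  </issueManagement>
  <ciManagement>
    <system>example-ci</system>
    <url>https://ci.example.com/job/project</url>
    <notifiers>
      <!-- send mail on errors and failures only -->
      <notifier>
        <type>mail</type>
        <sendOnError>true</sendOnError>
        <sendOnFailure>true</sendOnFailure>
        <sendOnSuccess>false</sendOnSuccess>
        <sendOnWarning>false</sendOnWarning>
        <configuration>
          <address>builds@example.com</address>
        </configuration>
      </notifier>
    </notifiers>
  </ciManagement>
</project>
```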

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_issueManagement
Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_ciManagement

POM Properties properties

If a project has values which are re-used in multiple locations and all require update when this value has to change, then it is ideal to take advantage of using properties. Common usages include version numbers of dependencies, re-used configuration values and replacements of variables (templates, filters etc.) during the maven execution. The properties are declared as <name>value</name> pairs in XML and can be used later in the POM as dollar-substitutions ${name}.
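For instance, a dependency version can be declared once as a property and substituted where needed. The property name and coordinates below are hypothetical.

```xml
<!-- pom.xml excerpt : illustrative sketch with a hypothetical property and dependency -->
<project>
  <properties>
    <example-lib.version>2.1.0</example-lib.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>example-lib</artifactId>
      <!-- resolved from the property above -->
      <version>${example-lib.version}</version>
    </dependency>
  </dependencies>
</project>
```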

Dependency Management dependencyManagement

In Part 5, dependencyManagement was covered as a lookup reference to coerce maven to resolve to a desired version of a dependency. The dependencyManagement element contains a dependencies element which is a set of dependency elements that may (or may not) be used while generating the effective POM. Dependencies declared under this element can include GAV coordinates as well as scope, optional and exclusions elements.

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_dependencyManagement

Project Dependencies dependencies

The actual set of dependencies used in a project is declared in a dependencies element. Each dependency element declared within is considered to be of the nearest depth when maven generates an effective POM for the current project. A dependency can be declared with its GAV coordinates as well as scope, optional and exclusions elements.

If a dependency was already added as a lookup reference in the dependencyManagement section, then such a dependency here can skip inclusion of a version (so the version specified in the dependencyManagement can be used). All scope, optional and exclusions declared in the dependencyManagement section are incorporated when just a groupId and artifactId are specified in this section, for any looked up dependency.
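The interplay between the two sections can be sketched as follows, using hypothetical coordinates. The version and scope come from the lookup entry; the actual declaration only names the groupId and artifactId.

```xml
<!-- pom.xml excerpt : illustrative sketch with hypothetical coordinates -->
<project>
  <dependencyManagement>
    <dependencies>
      <!-- lookup reference only, not added to the dependency graph by itself -->
      <dependency>
        <groupId>com.example</groupId>
        <artifactId>example-lib</artifactId>
        <version>2.1.0</version>
        <scope>runtime</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <!-- version and scope are taken from the dependencyManagement entry -->
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>example-lib</artifactId>
    </dependency>
  </dependencies>
</project>
```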

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_dependency
Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_exclusion

The build instructions

Build element contents. The dark background elements have complex structures while the light background is for simple elements.

Most of the instructions that chain the build configuration together are defined under a build element. A few elements of note are listed below.

Configuring directories for source, script and test files

While it is heavily recommended to not change the convention of sources, scripts and test file locations, there is, at times, a need to customize or alter such. It may also be rarely required to alter the output location of a build. In such cases it is possible to point to the directories by setting their relative paths (to the pom.xml) via the following:

  • sourceDirectory
  • scriptSourceDirectory
  • testSourceDirectory
  • outputDirectory
  • testOutputDirectory
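The directory elements listed above can be sketched as below. The values shown simply mirror the conventional locations, so this block changes nothing until a path is edited.

```xml
<!-- pom.xml excerpt : illustrative sketch, values mirror the conventional defaults -->
<build>
  <sourceDirectory>src/main/java</sourceDirectory>
  <scriptSourceDirectory>src/main/scripts</scriptSourceDirectory>
  <testSourceDirectory>src/test/java</testSourceDirectory>
  <outputDirectory>target/classes</outputDirectory>
  <testOutputDirectory>target/test-classes</testOutputDirectory>
</build>
```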

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_build

Maven extensions

Extensions provide the ability to extend maven functionality to perform other tasks. These extensions are declared with GAV coordinates. Many such extensions are Wagon providers, which customize how artifacts are transported (file providers, FTP providers, SSH providers, HTTP providers etc.). Newer extensions support the formats of a polyglot maven (Ruby, XML, YAML, JSON etc.).

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_extension

Customizing resources

Resources are additional content useful in running the project or its tests. Content could include properties, configurations, images and other assets that do not necessarily need compilation. Some resources may also need values added (or replaced) during the maven execution of the project POM. The resources element (and its testResources complement) allows defining a set of resource (or testResource) elements, which can customize the target location in the final artifact, the replacement of values, and the location of the resources (overriding the convention of src/main/resources or src/test/resources). It is possible to filter via includes and excludes sections based on filenames and wildcard patterns.
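A sketch of a customized resource follows. The directory, target path and patterns are assumptions chosen for illustration.

```xml
<!-- pom.xml excerpt : illustrative sketch, directory names and patterns are placeholders -->
<build>
  <resources>
    <resource>
      <!-- overrides the conventional src/main/resources location -->
      <directory>src/main/config</directory>
      <!-- location inside the final artifact -->
      <targetPath>META-INF/config</targetPath>
      <!-- replace ${...} placeholders in the files during the build -->
      <filtering>true</filtering>
      <includes>
        <include>**/*.properties</include>
      </includes>
      <excludes>
        <exclude>**/*-local.properties</exclude>
      </excludes>
    </resource>
  </resources>
</build>
```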

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_resource
Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_testResource

Configuring plugins and pluginManagement

Plugins are maven’s means of executing goals. Plugin goals can bind to maven lifecycle phases as was discussed in Part 4. Plugins implement behavior that executes in the lifecycle phase/goal sequence. A plugins element can contain several plugin definitions which can further be configured, if needed.

A pluginManagement section is to a plugin what a dependencyManagement is to a dependency: a lookup table for configured plugins to be re-used across many modules in the project. Plugins declared in a pluginManagement are not loaded, but specify a version and configuration that are re-used by the actual build plugins, if listed there. Similar to dependencyManagement, a plugin declared in an accessible pluginManagement section can skip re-configuration and skip the version in the build -> plugins section.
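A sketch of this lookup relationship is shown below, using the maven-source-plugin as an example. The plugin choice and its version are illustrative. In a multi-module project the pluginManagement block would typically sit in the parent POM and the short plugins entry in a child.

```xml
<!-- pom.xml excerpt : illustrative sketch, plugin choice and version are examples -->
<build>
  <pluginManagement>
    <plugins>
      <!-- lookup entry: version and executions are defined once -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-source-plugin</artifactId>
        <version>3.2.1</version>
        <executions>
          <execution>
            <id>attach-sources</id>
            <goals>
              <goal>jar-no-fork</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </pluginManagement>
  <plugins>
    <!-- actual usage: version and executions come from pluginManagement -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-source-plugin</artifactId>
    </plugin>
  </plugins>
</build>
```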

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_plugin
Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_pluginManagement

Customizing the entire POM based on profiles

Profiles/profile content. The dark background elements have complex structures while the light background is for simple elements.

Profiles will require an entire blog post by themselves. Maven offers build profiles which can be activated either by default, or when certain conditions are met or by flagging them in a command line to maven execution. Build profiles contain many of the sections already present under the project element but are only executed when the profile is activated.

A profiles element contains a set of profile elements which can be activated:

  • by default (activeByDefault)
  • based on the version of the JDK that is running maven (jdk)
  • based on operating system (name, family, architecture and/or version)
  • based on a property existence or a specific value
  • based on a file either existing or missing
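A few of these activation styles can be sketched as below. The profile ids, property name and file name are hypothetical.

```xml
<!-- pom.xml excerpt : illustrative sketch, ids, property and file names are placeholders -->
<profiles>
  <profile>
    <id>example-default</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
  </profile>
  <profile>
    <id>example-ci</id>
    <!-- active when maven is run with -Denv=ci -->
    <activation>
      <property>
        <name>env</name>
        <value>ci</value>
      </property>
    </activation>
    <properties>
      <example.flag>true</example.flag>
    </properties>
  </profile>
  <profile>
    <id>example-local</id>
    <!-- active when the file exists next to the pom.xml -->
    <activation>
      <file>
        <exists>local.properties</exists>
      </file>
    </activation>
  </profile>
</profiles>
```

A profile can also be flagged explicitly on the command line, for example mvn clean verify -P example-ci.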

A profile can contain several other elements including: build (resources, testResources, pluginManagement, plugins), modules, distributionManagement, properties, dependencyManagement, dependencies, repositories, pluginRepositories, reporting etc., all covered earlier in the blog.

Link: https://maven.apache.org/ref/3.6.3/maven-model/maven.html#class_profile

Additional links

Link: https://maven.apache.org/pom.html
Link: https://maven.apache.org/guides/introduction/introduction-to-the-pom.html

That’s a wrap on this blog. Next up is Configuring Apache Maven. Have fun !


Understanding Apache Maven – Part 5 – Dependencies in Maven

java, maven

Published: 2020-06 (June 2020)
Verified with: Apache Maven 3.6.3

Link to an index, to find other blogs in this series.

In Part 5 of the series, a walkthrough of maven dependencies is covered.

What are dependencies

Dependencies are the basic building blocks of a maven project. Imagine writing some code that requires logging some outputs or using some string utilities or parsing JSON text. The logic can be coded into the project, or a library can be used. Most of the time, it makes sense to harness an existing library to minimize the amount of code needed. This also encourages reuse.

The libraries, required to compile, run, test the project in a maven ecosystem, are referred to as dependencies.

The project in question could potentially be a library used as a dependency in some other consumer POM.

How are dependencies located?

In Part 3 of the series, the dependency coordinates and distinguishers were covered. As a recap, a dependency location can be reached via its groupId, artifactId and version (G-A-V or GAV) coordinates and furthermore the type and classifier can be specified to pinpoint the exact dependency needed in the project. Together these can be referred to as location coordinates.

A future blog is dedicated to a walkthrough of the POM file, but this blog is focused on a deep dive on dependencies.

A sample of a dependency block in an XML format POM file is listed below.

<project>
  ...
  <dependencies>
    <dependency>
      <groupId>a.group-id</groupId>
      <artifactId>an-artifact</artifactId>
      <version>1.0</version>
      <exclusions>
        <exclusion>
          <groupId>transitive.group-id</groupId>
          <artifactId>excluded-artifact</artifactId>
        </exclusion>
      </exclusions>
      <optional>true</optional>
    </dependency>
    <dependency>
      <groupId>another.group.id</groupId>
      <artifactId>another-artifact</artifactId>
      <version>1.0.0-SNAPSHOT</version>
      <type>zip</type>
      <scope>runtime</scope>
    </dependency>
  </dependencies>
</project>

This excerpt is not exhaustive in what a dependency declaration can look like. Time to dig in.

A dependency in a POM

Dependencies for a project are declared in a dependencies element. This element represents a set of unique dependency elements. As exemplified above and described in earlier blogs, a dependency can contain the G-A-V coordinates and additional optional distinguishers as needed. In addition to the location coordinates, a dependency can contain exclusions, a scope and an optional tag.

Transitive dependencies

As mentioned earlier, a POM has dependencies. The project itself can be a dependency for some other consumer project. The current project’s dependencies are then considered transitive dependencies for the other project. When maven pulls in a dependency from the location coordinates, it also attempts to pull in the transitive dependencies for it. Put in different words, if project A depends on dependency (another project) B and this B depends on dependencies C and D, then maven attempts to resolve and pull in B, C and D when creating an effective POM for project A. More on this in a bit.

The depth of transitive dependencies is not limited. Traversal continues until the level where there are no further transitives for each dependency listed. This entire structure of a dependency and its complete transitive graph is known as its dependency tree.

Exclusions

In some cases, it may not be necessary to pull one or more transitive dependencies (and their entire further depth). A means to instruct maven to ignore certain “branches” of the tree is via an exclusion. As the excerpt suggests, exclusions are a set of rejection criteria. An exclusion requires a groupId and artifactId (more on this in a bit). It is possible to use a wildcard (*) in the exclusion elements (functional since Apache Maven 3.2.1).

A dependency element can have one or more exclusion elements nested within an exclusions element.
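As a sketch of the wildcard form, the exclusion below rejects every transitive dependency of the declared dependency. The coordinates reuse the dummy values from the earlier excerpt.

```xml
<!-- pom.xml excerpt : illustrative sketch, excludes all transitives of an-artifact -->
<dependency>
  <groupId>a.group-id</groupId>
  <artifactId>an-artifact</artifactId>
  <version>1.0</version>
  <exclusions>
    <exclusion>
      <groupId>*</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```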

Scope

A dependency may be required to compile a project or to run a project or to only run the project’s tests. A scope instructs maven on how the said dependency is used in the project lifecycle. There are a few scopes enumerated for usage in dependencies. A tabulated summary:

| scope | notes |
|---|---|
| compile | The default scope. These dependencies will be available on the classpath of the project. Also, any project that identifies this project as a dependency will find compile scope dependencies propagated in the dependency tree. |
| provided | A scope that determines that the dependency will be made available for use external to the project’s build artifacts. For instance, a container or server will furnish the dependency at runtime. The dependency is available on the classpath during compilation and tests. This dependency is not propagated as a transitive. |
| runtime | A scope that determines that a dependency is only required at runtime and not at compile time. Typical use cases are when an API and its implementation are produced as separate dependencies. The compilation may only need the API dependency while the execution at runtime will require an actual implementation as well. The dependency is propagated as a runtime transitive dependency when the project artifact itself becomes another project’s dependency. |
| test | A scope that determines that a dependency is only required for compiling and running tests, and not during normal compilation or execution of the project. The dependency is not propagated as a transitive. |
| system | A scope that stops maven from resolving a dependency from a repository. The scope requires an additional systemPath element which specifies the location of the dependency. The dependency is available on the classpath, but is not propagated as a transitive. |
| import | A special scope used exclusively in a dependencyManagement section that will be covered later in this blog. As a preview, the dependencies are an instruction for replacement and are not propagated as transitive. |
Tabulated scope values with notes on each
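A few of the scopes in use can be sketched as below. The libraries and versions are illustrative choices for this example only.

```xml
<!-- pom.xml excerpt : illustrative sketch, libraries and versions are examples -->
<dependencies>
  <!-- furnished by the container at runtime, needed only to compile against -->
  <dependency>
    <groupId>javax.servlet</groupId>
    <artifactId>javax.servlet-api</artifactId>
    <version>4.0.1</version>
    <scope>provided</scope>
  </dependency>
  <!-- an implementation that is needed only when the project runs -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-simple</artifactId>
    <version>1.7.30</version>
    <scope>runtime</scope>
  </dependency>
  <!-- needed only to compile and run tests -->
  <dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter</artifactId>
    <version>5.6.2</version>
    <scope>test</scope>
  </dependency>
</dependencies>
```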

Optional

The project may need some dependencies that need not be passed on to any other projects that use the current project as a dependency. Such dependencies can be of any scope. The optional element in the dependency structure marks the said dependency as only needed for the current project’s maven executions.

An anecdotal example of depending on a metrics library: The current project may need a metrics library for execution and testing, however when the project is used as a dependency, there may be no need for the consumer project to rely on this metrics library. Such a dependency can be tagged as optional.

A graphical representation

A graphical representation of a dependency tree showing different depths of transitive dependencies as well as possible exclusions and non-inclusion via an optional attribute on a sample transitive.
Basic dependency graph example

How to view the dependency tree

It is possible to view the dependency tree of the project POM via a command line as well as via most modern IDEs. Command line options for viewing the dependency tree:

View full dependency tree of the POM

mvn dependency:tree

View a verbose dependency tree of the POM

mvn dependency:tree -Dverbose=true
OR
mvn dependency:tree -Dverbose

NOTE: The verbose flag is true if the option is mentioned, so an “=true” can be removed.
PERSONAL OPINION: Prefer the usage of -D<option>=<value> over -D<option>.
CAUTION: This produces a lot of output !

View a verbose dependency tree of the POM for a specific dependency

mvn dependency:tree -Dverbose=true -Dincludes=<groupId>
OR
mvn dependency:tree -Dverbose=true -Dincludes=<groupId>:<artifactId>

How maven resolves transitive dependency versions

A project POM can include several dependencies, which may further have varying depths of transitive dependencies. It is very possible that a few dependencies share transitive dependencies but depend on different versions. Maven is thus tasked with electing the right transitive dependency to use for its effective POM, to avoid duplication. Since maven cannot sort version strings (versions are arbitrary strings and may not follow a strict semantic sequence), maven takes the approach of nearest transitive dependency in the tree depth. This is very similar to how Java picks up the first jar in the class path when looking for a fully qualified class name.

To illustrate with an example, let us look at the transitive dependency Dx in the example below.

POM P1 has a few dependencies listed below (with dummy Group, Artifact and Version numbers) with transitive dependencies shown as ->.

  • Dependency D1 (G1:A1:V1) -> D11 (G11:A11:V11) -> Dx (Gx:Ax:V1.0.0).
  • Dependency D2 (G2:A2:V2) -> Dx (Gx:Ax:V1.2.0).
  • Dependency D3 (G3:A3:V3) -> D33 (G33:A33:V33) -> D34 (G34:A34:V34) -> Dx (Gx:Ax:V1.3.0).
  • Dependency D4 (G4:A4:V4) -> Dx (Gx:Ax:V1.5.0).

Maven creates a dependency tree during its effective POM generation that is illustrated below:

Graphical representation of determining a transitive dependency to be nearest in depth and first in resolution.

The above example shows V1.2.0 of Dx as the transitive dependency of choice since it is nearest in depth and first in resolution in this dependency tree.

Helping maven pick a different version

Add a direct dependency

Adding the desired transitive dependency version as a direct dependency in the project POM will result in such a dependency being the nearest in depth, and thus the dependency version to be selected. In the above example, if the desired version to be used was V1.3.0, then adding a dependency D5 (Gx:Ax:V1.3.0) would ensure its selection.
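Using the dummy coordinates from the example, such a direct dependency in POM P1 could be sketched as below. Gx and Ax stand in for a real groupId and artifactId.

```xml
<!-- pom.xml excerpt for P1 : illustrative sketch, Gx and Ax are the dummy coordinates of Dx -->
<dependencies>
  <!-- declared directly, so it sits at the nearest depth and wins the version selection -->
  <dependency>
    <groupId>Gx</groupId>
    <artifactId>Ax</artifactId>
    <version>1.3.0</version>
  </dependency>
</dependencies>
```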

Use dependencyManagement

A project may contain several modules as was highlighted in Part 3 of this series. Oftentimes, both for compatibility enforcement and POM hygiene, it is necessary to ensure the same version of the dependency is used across all child modules. In addition, the ability to override the nearest depth selection by selecting a specific version requires a lookup section in the POM. A dependencyManagement section in a POM is such a lookup.

Adding dependencies in a dependencyManagement does not include them in the dependency graph, rather provides a lookup table for maven to help determine the selected or chosen version of the transitive dependency that is listed.

A dependencyManagement section contains a dependencies element. Each dependency listed under it is a lookup reference used either in the current POM or in any POM that inherits it (either any POM that identifies the current POM as a parent or any POM that imports the project POM as a bill-of-materials).

Inheriting a dependencyManagement implies a few items:

  1. Once a dependency is listed in the section, any inheriting POMs can skip the version attribute when declaring the dependency. A version is no longer required, since the dependencyManagement provides one. Deliberately adding a version will override what the managed section defines, so the standard nearest-depth version resolution kicks in.
  2. A project POM can acquire a managed dependency version by either declaring parentage or by importing a bill-of-materials.
  3. Maven uses the dependencyManagement during the effective POM generation phase.
  4. Declaring a dependency in the dependencyManagement structure is just for a lookup reference.
  5. If a dependency defined in the dependencyManagement is never encountered in the actual dependency tree for the current project, it is ignored when generating the effective POM.
  6. A bill-of-materials POM is typically a large dependencyManagement block of compatible versions of several potential transitive dependencies that may (or may not) be required in the current project.
  7. A bill-of-materials (BOM) POM is a special POM with a packaging type of pom. The BOM POM is imported into the dependencyManagement section of the project POM as a dependency with a type of pom and a scope of import.
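Importing a bill-of-materials, as described in the list above, can be sketched as follows. The BOM coordinates are hypothetical.

```xml
<!-- pom.xml excerpt : illustrative sketch, the BOM coordinates are placeholders -->
<dependencyManagement>
  <dependencies>
    <!-- pulls the managed versions of the BOM into this POM's lookup table -->
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>example-bom</artifactId>
      <version>1.0.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```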

An amazing resource to find out more about best practices for maven can be found at: https://jlbp.dev/.

That is a wrap on this blog. There is a lot more to cover on this topic including version ranges and enforcing version rules. These topics will be covered as separate blogs in the series.

Have fun!


Understanding Apache Maven – Part 4 – Maven Lifecycle

java, maven

Published: 2020-05 (May 2020)
Verified with: Apache Maven 3.6.3

Link to an index, to find other blogs in this series.

In Part 4 of the series, a walkthrough of maven lifecycles and executions is covered.

Apache Maven executions are tied to lifecycles. A lifecycle groups a sequence of activities. Maven provides three standard lifecycles for its build management. More lifecycles can be created as needed, though that is a rare need.

Standard lifecycles

  • clean – Intended for clean-up of any prior build-managed outputs and artifacts.
  • default (build) – Intended for project build, test and deployment of artifacts.
  • site – Intended for project site documentation.

What’s in a lifecycle?

A lifecycle is a collection of related activities pertaining to a specific type of build-management. Standard lifecycles include generation of build output artifacts, creation of a documentation site as well as cleaning any artifacts produced from a prior execution.

Lifecycles in Maven are composed of phases. Each standard lifecycle is made up of a few phases. Maven commands are typically executions of phases. More on this in a little bit.

Phases are sequentially executed.

Invoking a phase implies all prior phases in that lifecycle are executed.

A tree structure of the standard lifecycles and phases in each.
Maven standard lifecycles and their respective phases

Exploring phases

Phases are executable blocks. Phases follow an ordered sequence within a given lifecycle. Reiterating what was already mentioned, invoking a phase implies invoking all phases before it in the lifecycle.
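To make the ordering rule concrete, here is a minimal Java sketch. It is only an illustration of the rule, not how maven implements it internally. The class and method names are made up for this post; the phase names are those of the site lifecycle.

```java
import java.util.List;

/** Illustrative sketch: which phases run when one phase of a lifecycle is invoked. */
class PhaseOrderSketch {

    // Phases of the site lifecycle, in execution order.
    static final List<String> SITE_LIFECYCLE =
            List.of("pre-site", "site", "post-site", "site-deploy");

    /** Returns the requested phase preceded by every phase that comes before it. */
    static List<String> phasesToRun(List<String> lifecycle, String phase) {
        int index = lifecycle.indexOf(phase);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown phase: " + phase);
        }
        // Everything from the first phase up to and including the requested one.
        return lifecycle.subList(0, index + 1);
    }

    public static void main(String[] args) {
        System.out.println(phasesToRun(SITE_LIFECYCLE, "site"));        // [pre-site, site]
        System.out.println(phasesToRun(SITE_LIFECYCLE, "site-deploy")); // [pre-site, site, post-site, site-deploy]
    }
}
```

Asking for site never reaches post-site or site-deploy, since only the phases up to the requested one are part of the execution.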

Goals are bound to phases.

Goals – units of work

Goals are units of work (tasks). Goals are attached to a phase and this is called a binding. A goal performs a task that is considered relevant for the given lifecycle and phase. Maven provides some built in goals. Goals are defined in plugins.

Binding Goals

Brief digression: Maven outputs are controlled by packaging. Packaging defines the type of output expected when executing a maven POM. A POM includes the packaging as a top level element. Default packaging is a jar.

As mentioned earlier, goals are bound to phases. Every packaging comes with a predefined set of goals bound to phases of the default lifecycle.

However, some goals correspond clearly to a single phase, no matter what the project produces. Such goals do not need to be linked to a packaging.

Some goals may not be bound to anything and can be invoked directly without a phase-binding.

Plugins – definers of goals

Plugins are maven’s way of defining goals and providing connectors for the goals to phases. Plugins are built from Mojos (Maven plain Old Java Objects), where each Mojo implements one goal. The plugins define goals and supply logic to deliver the goals. Goals are usually bound to phases in either the maven built-in configurations or the project POM file.

Binding Goals to Phases

An example of binding for the Clean lifecycle (very confusing nomenclature, but worth reading)

Lifecycle clean : Phase clean
Plugin maven-clean-plugin (prefix clean) : Goal clean

The clean:clean (plugin:goal) is bound to the clean phase. Thus, executing the clean phase will trigger the run of the maven-clean-plugin which will look for and execute the clean goal.

Another example of binding, for the Site Lifecycle (site-deploy phase)

Lifecycle site : Phase site-deploy
Plugin maven-site-plugin (prefix site) : Goal deploy

The site:deploy (plugin:goal) is bound to the site-deploy phase. Thus, executing the site-deploy phase will trigger the run of the maven-site-plugin which will look for and execute the deploy goal.

Maven offers a set of bundled plugins and these plugins have a defined set of goals. Plugins can also be external to the maven distribution and can be built by other users of maven. The standard clean and site lifecycles have default phases and bindings defined. This is done in the maven-core META-INF/plexus/components.xml, if you’re curious to dig through the details. Link to GitHub source: https://github.com/apache/maven/blob/maven-3.6.3/maven-core/src/main/resources/META-INF/plexus/components.xml.

It is usually rare to bind goals, by default, to phases prefixed with pre and post, but those are valid phases and goals can be bound to such.

Lifecycle clean
Default Phase: clean

Associated plugin (as of Maven 3.6.3): org.apache.maven.plugins:maven-clean-plugin:2.5 : clean

The default phase implies that if the lifecycle is invoked without a phase, the default phase (and any phases prior to it) will be executed. In this case, invoking the lifecycle will invoke both the pre-clean and clean phases.

The plugin coordinates follow the standard maven GAV coordinate system that was covered in Part 3 of this series. The token after the final colon separator names the goal in the plugin.

Command line: mvn clean

The site lifecycle has two default phases, site and site-deploy. A dissection of the site-deploy default phase is shared below.

Lifecycle site
Default Phase(s) site and site-deploy

Associated plugin (for site-deploy and as of Maven 3.6.3): org.apache.maven.plugins:maven-site-plugin:3.3 : deploy

Invoking the default phase of site-deploy will cause all phases (pre-site, site, post-site and site-deploy) to be executed.

The plugin coordinates follow the standard maven GAV coordinate system that was covered in Part 3 of this series. The token after the final colon separator names the goal in the plugin.

Command line: mvn site-deploy

The default lifecycle is unique. The overarching focus of the default lifecycle is to verify, compile, generate sources/resources, test, install and deploy the project code. It is thus not practical to set up default goal bindings for the phases of this lifecycle. The bindings are set at a different level called packaging.

A good listing for site and clean phases’ bindings can be found at: https://maven.apache.org/ref/3.6.3/maven-core/lifecycles.html.

Binding Goals Using Packaging

Packages – descriptors of outputs

Packages are a core element in the Maven POM and define the type of output produced by building the POM. It is possible to tie goals to the package element. Valid package names include: jar, war, ear, pom etc. Each package has some unique goals and thus goals are bound to these package names.

Binding Goals to Packages

Maven allows binding goals to packages. The default lifecycle is a great example and the bindings can be located at the maven-core META-INF/plexus/default-bindings.xml. Link to GitHub source: https://github.com/apache/maven/blob/maven-3.6.3/maven-core/src/main/resources/META-INF/plexus/default-bindings.xml

A few examples of bindings:

Package bindings for ejb, ejb3, jar, war, par and rar packages:

jar, war, rar, ejb, ejb3, par packages

Phase                     plugin:goal
process-resources         resources:resources
compile                   compiler:compile
process-test-resources    resources:testResources
test-compile              compiler:testCompile
test                      surefire:test
package                   ejb:ejb or ejb3:ejb3 or jar:jar or par:par or rar:rar or war:war
install                   install:install
deploy                    deploy:deploy
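The bindings listed above for the jar packaging can be pictured as an ordered lookup from phase to plugin:goal. The Java sketch below is purely illustrative: the class and method names are made up, and it only knows about the phases that have a goal bound, whereas the default lifecycle has other phases as well.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch: the jar packaging bindings as an ordered phase-to-goal map. */
class JarBindingsSketch {

    /** Phase to plugin:goal, in lifecycle order, for the jar packaging. */
    static Map<String, String> jarBindings() {
        Map<String, String> bindings = new LinkedHashMap<>(); // keeps insertion order
        bindings.put("process-resources", "resources:resources");
        bindings.put("compile", "compiler:compile");
        bindings.put("process-test-resources", "resources:testResources");
        bindings.put("test-compile", "compiler:testCompile");
        bindings.put("test", "surefire:test");
        bindings.put("package", "jar:jar");
        bindings.put("install", "install:install");
        bindings.put("deploy", "deploy:deploy");
        return bindings;
    }

    /** Goals executed, in order, when the given phase is invoked. */
    static List<String> goalsUpTo(Map<String, String> bindings, String phase) {
        if (!bindings.containsKey(phase)) {
            throw new IllegalArgumentException("No binding known for phase: " + phase);
        }
        List<String> goals = new ArrayList<>();
        for (Map.Entry<String, String> binding : bindings.entrySet()) {
            goals.add(binding.getValue());
            if (binding.getKey().equals(phase)) {
                break; // later phases are not part of this execution
            }
        }
        return goals;
    }

    public static void main(String[] args) {
        // [resources:resources, compiler:compile]
        System.out.println(goalsUpTo(jarBindings(), "compile"));
    }
}
```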

Package bindings for pom package:

pom package

Phase      plugin:goal
package    (no goal bound)
install    install:install
deploy     deploy:deploy

A good listing for the default lifecycle bindings can be found at: https://maven.apache.org/ref/3.6.3/maven-core/default-bindings.html.

Summary

This blog was lengthy!

A lot of material was covered. In summary:

  1. Maven provides standard lifecycles.
  2. Lifecycles are a collection of phases.
  3. Logic to execute specific actions is developed in plugins.
  4. Plugins are built from Mojos (Maven plain Old Java Objects).
  5. Plugins define and carve out tasks called goals.
  6. A plugin can define several goals.
  7. Plugin goals are bound to parts of a lifecycle and executed.
  8. The goals can be bound either to a phase or based on the packaging type of the POM.
  9. Execution of a maven lifecycle implies execution of goals bound to parts of the lifecycle.
  10. Maven commands invoke a phase or a goal.

A deeper dive into plugin-prefixes can be found at: https://maven.apache.org/guides/introduction/introduction-to-plugin-prefix-mapping.html.

Convention standards for plugin prefixes:

  • maven-${prefix}-plugin – for official plugins maintained by the Apache Maven team itself (you must not use this naming pattern for your plugin, more on this in a future blog on plugin development)
  • ${prefix}-maven-plugin – for plugins from other sources
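As a small illustration of the two naming patterns, the Java sketch below pulls the prefix out of a plugin's artifactId. It only mirrors the naming convention; how maven actually resolves prefixes is described in the guide linked above. The class and method names, and the artifactId example-maven-plugin, are made up for this post.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch: derive a plugin prefix from the artifactId naming convention. */
class PluginPrefixSketch {

    // maven-${prefix}-plugin: plugins maintained by the Apache Maven team
    private static final Pattern OFFICIAL = Pattern.compile("maven-(.+)-plugin");
    // ${prefix}-maven-plugin: plugins from other sources
    private static final Pattern OTHER = Pattern.compile("(.+)-maven-plugin");

    /** Returns the prefix if the artifactId follows one of the two conventions. */
    static Optional<String> prefixOf(String artifactId) {
        Matcher official = OFFICIAL.matcher(artifactId);
        if (official.matches()) {
            return Optional.of(official.group(1));
        }
        Matcher other = OTHER.matcher(artifactId);
        if (other.matches()) {
            return Optional.of(other.group(1));
        }
        return Optional.empty(); // does not follow either convention
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("maven-clean-plugin"));   // Optional[clean]
        System.out.println(prefixOf("example-maven-plugin")); // Optional[example]
    }
}
```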

Something Something – Personal Learning

Here is a very crude and unscientific pictorial of my understanding of lifecycles, phases, goals and plugins. This is not meant to be accurate in terms of either human lifecycles or in explaining maven’s lifecycles. This picture is absolutely a personal means of illustrating how I went about learning these concepts.

Possibly inaccurate analogy of a young human lifecycle with phases such as terrible twos and adolescence, with goals associated with each and some external influences as plugins.
Highly unscientific, possibly inaccurate lifecycle of a young human from Age 0 to Age 18. Time ranges also not distributed proportionally.

That’s a wrap on this post. Have fun!


Understanding Apache Maven – Part 3 – Maven Coordinates & POM Inheritance

java, maven

Published: 2020-05 (May 2020)
Verified with: Apache Maven 3.6.3

Link to an index, to find other blogs in this series.

Part 3 of the series explains dependency coordinates and ‘distinguishers’, and takes a more detailed look at POM hierarchies.

What are dependency coordinates?

There are hundreds or thousands of projects that produce artifacts. Some such artifacts can potentially be used in a current project as libraries. For instance, a project may depend on a logging framework or a JSON library. It is possible to host many such dependency artifacts on a central repository. Maven’s primary such repository is called Maven Central. Many other such repositories also exist. More on this later.

With the availability of such repositories, the next question is, how are exact artifacts needed for a project identified and downloaded? A proper way to identify the artifact is via its maven coordinates.

What are maven coordinates?

A way to uniquely identify an artifact. There are three primary coordinates that are used to identify an artifact.

groupId

A grouping classification, typically referring to an organization, a company and may include a basic theme for one or more projects. A groupId typically follows a dot notation similar to a Java package name. Each token in the dot notation corresponds to a directory in a tree structure on the repository. For instance, a groupId of org.apache.commons corresponds to $REPO/org/apache/commons.

While it is strongly recommended, it is not a requirement to have the dot notation. Several projects have forgone the dot notation and simply carry a single name as the groupId. This single-name practice is discouraged. All projects are recommended to maintain a fully-qualified dot-notated groupId.

artifactId

A proper name for the project. Among the many projects that exist in the group, the artifactId can uniquely identify the artifact. An artifactId follows a simple name nomenclature with hyphenation recommended for multi-word names. Names should ideally be short. An artifact manifests itself as a sub-directory under the directory tree that represents the groupId. For instance, an artifactId of commons-lang3 under a groupId of org.apache.commons would determine that the artifact can be found under: $REPO/org/apache/commons/commons-lang3/.

version

An identifier that tracks unique builds of an artifact. A version is a string that is constructed by the project’s development team to identify a set of changes from a previous creation of an artifact of the same project. It is strongly recommended to follow semantic versioning schemes for versions, although it is not mandated. A version manifests itself as a sub-directory under the directory tree that represents the groupId and artifactId. For instance, a version of 3.1.0 for an artifactId commons-lang3 under the groupId of org.apache.commons would determine that the artifact would be located under: $REPO/org/apache/commons/commons-lang3/3.1.0/.

A little bit of a deeper dive on versions

Understanding version number schemes

Following standard Software Development Lifecycles, a project can undergo phases of development with each phase consisting of a few attempts at building the software, testing, fixing bugs, making further required changes etc. Once the cycle completes, the product is marked for release for the given phase. These lifecycles and phases have nothing to do with maven.

Development cycle

During development a product can produce a few artifacts that can be tested, verified etc. This development cycle typically produces artifacts with the same version as prior builds in the same phase. The basis for this statement is that the release is still in an experimental, beta or alpha state, and may undergo more changes. However, it is hard to get all consumer projects to constantly update their own POMs for each such build. This grief is mitigated by the use of SNAPSHOT builds.

As an example, for a project A, version 1.0.0, a development team can build several SNAPSHOT versions with version 1.0.0-SNAPSHOT over a period of time. This suffix implies that the version number is intended to be 1.0.0 at the end, however the build will produce newer artifacts each build that replace the prior SNAPSHOT. The newer SNAPSHOT can replace an existing SNAPSHOT of the artifact in a maven repository, so consumer projects can safely procure the latest SNAPSHOT version. A SNAPSHOT is a mutable version of the project that can be declared as a dependency in consumer projects.

Production-Ready

Once all testing is complete, the POM no longer needs the -SNAPSHOT suffix and can build a production-ready immutable version of the project. Once the immutable version of the project has been pushed to a repository, no further changes should be made to it (maven cannot enforce this; it relies on discipline). Other production-ready suffixes can commonly be found: 1.0.0-GA (general availability) or 1.0.0-RELEASE (released artifact) or 1.0.0-FINAL etc. These suffixes are naming conventions chosen by the project team; maven does not require any of them to mark a release.
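The relationship between a SNAPSHOT version and its intended release version can be sketched in a few lines of Java. This is a simplified illustration with made-up class and method names; it only looks at the -SNAPSHOT suffix and says nothing about how maven stores or resolves snapshots in a repository.

```java
/** Illustrative sketch: telling SNAPSHOT versions apart from release versions. */
class SnapshotVersionSketch {

    static final String SNAPSHOT_SUFFIX = "-SNAPSHOT";

    /** True when the version string marks a mutable development build. */
    static boolean isSnapshot(String version) {
        return version.endsWith(SNAPSHOT_SUFFIX);
    }

    /** The version the project is intended to carry once it is released. */
    static String intendedReleaseVersion(String version) {
        if (!isSnapshot(version)) {
            return version; // already a release version
        }
        return version.substring(0, version.length() - SNAPSHOT_SUFFIX.length());
    }

    public static void main(String[] args) {
        System.out.println(isSnapshot("1.0.0-SNAPSHOT"));             // true
        System.out.println(intendedReleaseVersion("1.0.0-SNAPSHOT")); // 1.0.0
    }
}
```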

The GAV coordinate system

A common way of communicating an artifact’s coordinates is with a colon separation. Together the coordinates are referred to as Group-Artifact-Version or GAV coordinates. The GAV coordinates for commons-lang3 version 3.1.0 will be: org.apache.commons:commons-lang3:3.1.0.
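To tie the GAV notation back to the directory layout described earlier, here is a minimal Java sketch that turns a GAV string into the directory, relative to $REPO, where the artifact would be found. The class and method names are made up for this post, and the sketch assumes exactly three colon-separated coordinates.

```java
/** Illustrative sketch: map GAV coordinates to a repository-relative directory. */
class GavPathSketch {

    /** Converts "groupId:artifactId:version" into a directory path below the repository root. */
    static String repositoryPath(String gav) {
        String[] parts = gav.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected groupId:artifactId:version but got: " + gav);
        }
        // Each dot-separated token of the groupId becomes a directory.
        String groupDirectories = parts[0].replace('.', '/');
        return groupDirectories + "/" + parts[1] + "/" + parts[2] + "/";
    }

    public static void main(String[] args) {
        // prints org/apache/commons/commons-lang3/3.1.0/
        System.out.println(repositoryPath("org.apache.commons:commons-lang3:3.1.0"));
    }
}
```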

Additional distinguishers

Oftentimes, a project’s build may include more than one format of artifacts. A maven execution on a project could emit a jar file, a zip file, a tarball and many other artifacts. An execution could also emit different outputs such as a binary, a zip file of sources, a zip file of javadoc files etc. Apart from the above mentioned GAV coordinates, distinguishers are thus needed to identify such diverse outputs.

classifier

A classifier is used to distinguish an alternate output emitted by executing maven on the project POM. Common examples include sources as a .jar / .zip file and javadoc as a .jar / .zip file. The classifier manifests itself as a part of the artifact name. For our above example of commons-lang3, the artifact to look for is: commons-lang3-3.1.0-javadoc.jar or commons-lang3-3.1.0-sources.jar under $REPO/org/apache/commons/commons-lang3/3.1.0/.

type

A type is used to distinguish the artifact format. Artifacts emitted from a maven execution can be of various types as already discussed: .jar, .war, .zip etc. It may, at times be beneficial for a project to produce artifacts in different formats. These formats are specified under a type distinguisher.

Putting it all together

A combination of GAV coordinates and distinguishers can be used to locate the exact artifact needed for the project.
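As a rough illustration of how the pieces combine, the Java sketch below assembles an artifact's file name from its artifactId, version, optional classifier and type. The class and method names are made up, and the sketch makes the simplifying assumption that the type is used directly as the file extension.

```java
/** Illustrative sketch: build an artifact file name from coordinates and distinguishers. */
class ArtifactFileNameSketch {

    /**
     * Builds artifactId-version[-classifier].extension.
     * A null or empty classifier means the main artifact.
     */
    static String fileName(String artifactId, String version, String classifier, String extension) {
        StringBuilder name = new StringBuilder(artifactId).append('-').append(version);
        if (classifier != null && !classifier.isEmpty()) {
            name.append('-').append(classifier);
        }
        return name.append('.').append(extension).toString();
    }

    public static void main(String[] args) {
        System.out.println(fileName("commons-lang3", "3.1.0", null, "jar"));      // commons-lang3-3.1.0.jar
        System.out.println(fileName("commons-lang3", "3.1.0", "javadoc", "jar")); // commons-lang3-3.1.0-javadoc.jar
    }
}
```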

POM Hierarchies

This section describes the hierarchy in maven POMs.

Parent POM

A parent POM is a POM from which the current project POM can inherit content. The project POM can depend on exactly one parent POM. This single-parent inheritance is one-way. The parent POM is unaware of the POM that inherits from it. The child POM declares the parentage in its own pom.xml (standard file to hold the POM, can be customized to a different name). Any number of POMs can declare another POM as their parent.
USAGE: While we will delve into the contents in a future blog, the parent POM can be used to declare re-usable portions of the POM that individual child POMs can then inherit. This helps in both maintenance and to reduce clutter.

Aggregator POM

An aggregator POM (also known as a reactor POM) is a POM that can sequence the builds of many projects. An aggregator POM specifies all the projects that can be build-managed together. The child POM(s) remain unaware of the aggregator POM that invokes them. A child POM can be listed in more than one aggregator POM. The aggregator POM lists the child POM by name in its own pom.xml as a module. As the declared module suggests, this pattern is for modular builds of projects. There is no inheritance of any content from the aggregator POM.
USAGE: While we will dig deeper in a future blog, the aggregator POM can be used to ensure the sequence of builds and maintain a list of projects that should be build-managed together.

Bill-Of-Materials POM

A bill-of-materials POM is a POM that can declare bundles of dependencies that have been tested to work well together. The abundance of artifacts and versions of each can, at times, lead to confusion and needs for trial-and-error mechanisms to determine compatibility and/or right functionality. A bill-of-materials POM reduces that overhead. A bill-of-materials POM is a means of multiple inheritance too, since a project POM can import multiple bill-of-material POMs. The bill-of-materials POM is unaware of the child POMs that import it. The child POMs declare the bill-of-materials in the project POM.
USAGE: While this will be covered in detail in a future blog, the bill-of-materials POM can be used to bundle well-working dependencies (with right versions) together to avoid repetition for every project.

A POM can be both a parent as well as an aggregator. This allows for a two-way relationship in a tightly knit set of modules that depend on a common set of inherited content.

That’s a wrap on this blog. Have fun!


Understanding Apache Maven – Part 2 – POM Hierarchy & Effective POM

java, maven

Published: 2020-05 (May 2020)
Verified with: Apache Maven 3.6.3

Link to an index, to find other blogs in this series

In this blog, the Project Object Model (POM) is explored.

What is the Project Object Model?

First, a maven POM is not a popular pomegranate juice nor is it related to the colorful pom-poms. A maven POM is definitely wonderful and brings as much joy to a developer as does a pom-pom to a kid.

A POM describes build management needs of a project:

  • project coordinates – uniquely identifiable set of properties by which the project artifacts can be consumed elsewhere.
  • dependencies – libraries and code needed to execute the project build management
  • plugins – helper tools that execute build and build management aspects
  • properties – common and extracted values used in the project
  • inheritance details – the ability to create a hierarchy of re-usable POM components
  • profiles – alternate execution pathways that can be activated on a per-execution basis
  • . . .

How does maven interact with a POM?

Maven utilizes content in the POM for its build management. However, maven also has convention-based defaults. Maven thus has the onus of amalgamating defaults and applying overrides and additions discovered in the project’s pom file (typically a pom.xml).

This amalgamation of defaults and applying overrides and additions results in an effective POM.

What is an effective POM?

An effective POM is:

  • an assembly of execution steps, properties and profiles
  • the content that maven can execute for the project
  • an exhaustive set of dependencies and plugins needed for such an execution
  • determination of any transitive dependencies (dependencies of dependencies, full depth imaginable)
  • any conflict resolution in terms of dependency versions

How does maven assemble the effective POM?

A set of boxes representing the maven internal defaults -> maven super pom -> maven global settings -> maven user settings -> Parent/bill-of-material poms -> project pom that finally results in an effective POM.

Maven assembles its effective POM by traversing the layers that act as building blocks. Each layer used has the ability to override or enrich the content of what will become an effective POM. Maven internal defaults and the super POM are built into the maven installation, so ideally not subject to customization. The layers below, the global settings and user settings are, as their name suggests, inclined towards hosting and overriding any settings for maven. The parent, bill-of-material and project POM files are where maven instructions can be customized. Default values from the layers above are utilized if no customization is made.
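The layering idea can be pictured with a tiny Java sketch in which each layer is a flat map and later layers override earlier ones. This is a deliberate simplification: a real POM is a structured model and maven's merge rules are more nuanced. The class name, the keys and the values below are all hypothetical and only serve the illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch: later layers override or enrich what earlier layers provide. */
class EffectiveLayersSketch {

    /** Applies the layers in order; a later layer wins when the same key appears twice. */
    static Map<String, String> merge(List<Map<String, String>> layers) {
        Map<String, String> effective = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            effective.putAll(layer);
        }
        return effective;
    }

    public static void main(String[] args) {
        // Hypothetical, flattened stand-ins for three of the layers.
        Map<String, String> defaults = Map.of("packaging", "jar", "sourceDirectory", "src/main/java");
        Map<String, String> parentPom = Map.of("organization", "Example Org");
        Map<String, String> projectPom = Map.of("packaging", "war");

        Map<String, String> effective = merge(List.of(defaults, parentPom, projectPom));
        System.out.println(effective.get("packaging"));       // war (overridden by the project)
        System.out.println(effective.get("sourceDirectory")); // src/main/java (default kept)
    }
}
```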

The layers explained

Going through the layers:

Maven Internal Defaults

This layer is within maven’s own code. Unless the intent is to modify maven’s source code, it is safe to assume that these defaults are not modifiable.
Location: Internal to the maven installation.

Maven Super POM

This Super POM exists in maven’s own code. Once again, unless the intent is to modify maven’s own source code, it is safe to assume that the Super POM is not modifiable.
Location: Internal to the maven installation

Maven Global Settings

Once Maven is installed/unarchived on a computing device, it creates a directory structure. One of the base directories, (right below the maven directory) is conf. This directory contains a settings.xml referred to as a global settings file. Since this file exists on the local computing device, it is editable. Typically, any settings that apply to all projects being managed on the computing device are managed in this global settings.xml. Examples may include proxy settings, corporate server URLs, mirrors etc. It should not often be modified, in any case.
Location:
Unix/MacOS: <maven installation>/conf/settings.xml
Windows: <maven installation>\conf\settings.xml

Maven User Settings

Similar to the global settings, but at a different location, it is possible to create a user settings file. This file is also named settings.xml but its location is under the user home in an .m2 directory. The purpose of the user settings.xml is to setup any settings that apply to projects managed by the specific user (there could be multiple users on a given computing device). Examples include usernames and passwords to connect to the network, repository ordering etc.
Location:
Unix/MacOS: <user.home>/.m2/settings.xml
Windows: <user.home>\.m2\settings.xml

Parent / Bill-of-Material POMs

Maven has an inheritance and version-trait model. Maven can inherit content from a parent pom and version-traits from a bill-of-materials pom. Typically parent POMs contain re-usable dependencies, plugins and properties used by a project POM. The Bill-of-Material (BOM) POM is a specialized POM that groups together versions of dependencies that are known to be valid and tested to work together. Using a BOM POM reduces the developer grief from having to test compatibility of different dependencies. Modification of either is possible if there is a need and the entitlement to change. Since changes to either can impact several other projects, appropriate version increments and reviews are recommended for changes.
Location: Various locations, either on device or elsewhere in some repository.

Project POM

This is the Project Object Model for the project. Project specific maven instructions are detailed in this location. Typical contents include: a unique set of coordinates used to identify the project, name and description of the project, a set of developers associated with the project, any source control management details specific to the project, all project-specific dependencies and plugins, any profiles that allow for alternate executions of maven on this project and so on. All these will be covered in a blog in this series. The file containing this POM is, by convention, named pom.xml, but other names can be used. If an alternate name is used, then the maven executable will need to be pointed to the filename for execution.
Location:
UNIX/MacOS: $PROJECT/pom.xml
Windows: $PROJECT\pom.xml

The next blog will cover POM hierarchies. Have fun!


Understanding Apache Maven – Part 1 – The basics

java, maven

Published: 2020-05 (May 2020)
Verified with: Apache Maven 3.6.3

Link to an index, to find other blogs in this series.

This blog (a part of the series) is an introduction to Apache Maven basics.

Apache Maven (commonly referred to as “maven”) is a Build Management tool. Maven is primarily used to build Java projects. Projects in other languages can also be built using maven. Apache Maven is written in Java, for the most part. It is an open source project.

Maven follows a convention-over-configuration philosophy. More on this follows.

Why is Maven a Build Management tool?

Maven aims at achieving the following for a project:

  • build tooling
  • version management
  • re-usability
  • maintainability
  • comprehensibility
  • inheritance

In addition, maven provides capabilities for a project to:

  • produce different build targets and build results via profiles
  • test code
  • generate documentation
  • furnish metrics and reports
  • deploy built artifacts

What is Convention-Over-Configuration?

Convention-over-configuration is a software paradigm. The main intent of such a paradigm is to reduce the number of superfluous decisions required by a developer to build her/his project. The paradigm aims to meet and satisfy the “principle of least astonishment“.

Apache Maven – convention over configuration

Apache Maven provides sensible defaults for a project’s build management. A developer can then choose to override any preset defaults.

An example of such is clear in the conventional directory structure of a maven project.

project
|
|____src
|   |
|   |____main
|   |   |
|   |   |____java
|   |   |
|   |   |____resources
|   |
|   |____test
|       |
|       |____java
|       |
|       |____resources
| 
|____pom.xml

Printed using: alias tree="find . -print | sed -e 's;[^/]*/;|____;g;s;____|; |;g'"

In this directory structure, the source code is, by convention located under the project root directory in a src directory. Under the src directory there exists a main directory that is, by convention expected to contain production code, that is, code that is expected to be a part of the final executable. Parallel to the main directory, there is a test directory, where test code is expected, by convention. Java code, once again, by convention (I think by now the point is made, will stop referring to it) exists in a java directory either in the production or in the test location. Similarly resources needed for production or test outputs, reside respectively in the resources directory.
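For readers who prefer code to prose, the same convention can be written as a small Java helper that resolves the conventional directories from a project root. The class and method names are made up for this post; the directory names are the ones from the tree above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch: resolve the conventional maven directories from a project root. */
class ConventionalLayoutSketch {

    /**
     * scope is "main" (production) or "test";
     * kind is "java" (code) or "resources".
     */
    static Path conventionalDirectory(Path projectRoot, String scope, String kind) {
        return projectRoot.resolve("src").resolve(scope).resolve(kind);
    }

    /** The POM sits directly under the project root. */
    static Path pomFile(Path projectRoot) {
        return projectRoot.resolve("pom.xml");
    }

    public static void main(String[] args) {
        Path root = Paths.get("project");
        // The separator in the printed paths depends on the operating system.
        System.out.println(conventionalDirectory(root, "main", "java"));
        System.out.println(conventionalDirectory(root, "test", "resources"));
        System.out.println(pomFile(root));
    }
}
```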

This is just one example. Maven uses the convention-over-configuration philosophy in many other areas.

What are some Maven capabilities?

A few features that maven offers (this is not a comprehensive list):

  • Validate the project structure
  • Auto generate any code/resources needed by the project
  • Generate any documentation
  • Compile source code, display errors / warnings
  • Test the project based on existing tests
  • Package compiled code into artifacts (examples include .jar, .war, .ear, .zip archives and many more)
  • Package source code into downloadable archives / artifacts
  • Install packaged artifacts on to a server for deployment or into a repository for distribution
  • Generate site reports and test evidence
  • Report a build as success or failure
  • . . .

How does Maven work?

Maven uses a Project Object Model (POM) to manage a project. Maven commands execute parts of its Project Object Model. A Project Object Model is usually described as an XML document. A POM description is NOT limited to XML. Other formats can be used to describe the Project Object Model, however, XML was the first format used.

A picture to illustrate a typical maven execution:

Figure: Maven – A pictorial overview of how maven interacts with a project's Project Object Model. Includes assembling, download of dependencies and plugins, execution of build lifecycles and an upload of build artifacts to either a local repository or to a maven repository on a network.

The next blog in this series will dig into details of a Project Object Model (POM). Have fun!
