Java Module Repository Wishlist

This page collects ideas for a Java component (jar or other) repository.  Some of the ideas are based on experience with the Maven repository structure.

 

1. Hierarchical structure

A completely flat directory structure can be more difficult to browse and manage, so the directory structure should be hierarchical with a basic correlation to the Java packages contained in the jars. 

 

The Maven 2 repository format is hierarchical in nature, and meets this requirement.

 

2. Component co-location

Components that are tagged, built, and released together should be located together in the repository.  This makes it easier to see all the components that belong to a specific release, to stage and publish a release, to see what has changed between releases, and to clean up old releases.

 

Maven Layout - /groupId/artifactId/version/artifacts

Proposed Layout - /packageId/projectId/version/<subprojectId>/artifacts

 

In the proposed layout, packageId and projectId roughly correspond to Maven's groupId and artifactId.  The difference is that in a multi-module project, all modules would share the same packageId and projectId, and the subprojectId would distinguish the artifacts of each module.  So all the modules from version 1.0 of a project would be under a single directory structure.
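The difference between the two layouts can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, and it assumes that a dotted packageId is mapped to directories the same way Maven maps a dotted groupId.

```java
import java.util.List;

// Hypothetical sketch: class and method names are illustrative, not part of any real tool.
final class RepositoryLayoutSketch {

    // Maven 2 layout: dots in the groupId become directory separators.
    static String mavenPath(String groupId, String artifactId, String version) {
        return "/" + groupId.replace('.', '/') + "/" + artifactId + "/" + version;
    }

    // Proposed layout: every module of a release shares the packageId/projectId/version prefix.
    // Assumption: packageId is mapped to directories the same way a Maven groupId is.
    static String proposedPath(String packageId, String projectId, String version, String subprojectId) {
        String releaseDir = "/" + packageId.replace('.', '/') + "/" + projectId + "/" + version;
        // A single-module project has no subprojectId, so its artifacts sit in the release directory.
        return (subprojectId == null || subprojectId.isEmpty()) ? releaseDir : releaseDir + "/" + subprojectId;
    }

    public static void main(String[] args) {
        for (String module : List.of("core", "api")) {
            // Maven: one directory tree per module.
            System.out.println(mavenPath("org.example", "widget-" + module, "1.0"));
            // Proposed: both modules end up under the same release directory.
            System.out.println(proposedPath("org.example", "widget", "1.0", module));
        }
    }
}
```

With this mapping, cleaning up or staging a release means operating on one directory rather than one directory per module.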

 

The Maven 2 repository format does not enforce that artifacts released together end up in the same repository location.  A multi-module release tends to spread out across multiple locations in the repository due to the different artifactIds and occasionally different groupIds.

 

3. Artifact Metadata Limited by Relevancy

The metadata that describes an artifact in the repository should contain only information that is relevant to using the artifact from the repository.  This includes general information about the project, such as project name, description, license, etc.  It also includes any information that is required to use the artifact correctly, such as unique identifiers and dependency information.
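As a rough illustration of how small such metadata could be, the following Java sketch models only the fields named above. The record and field names are assumptions made for this example and do not correspond to an existing format.

```java
import java.util.List;

// Hypothetical metadata model; type and field names are illustrative assumptions.
record DependencySketch(String packageId, String projectId, String version) {}

record ArtifactMetadataSketch(
        String packageId, String projectId, String version,   // unique identifiers
        String name, String description, String license,      // general project information
        List<DependencySketch> dependencies) {                 // what a consumer must also fetch

    ArtifactMetadataSketch {
        // Defensive copy so published metadata cannot be changed through the caller's list.
        dependencies = List.copyOf(dependencies);
    }

    // Identifier a consumer would use to request the artifact.
    String coordinate() {
        return packageId + ":" + projectId + ":" + version;
    }

    public static void main(String[] args) {
        ArtifactMetadataSketch widget = new ArtifactMetadataSketch(
                "org.example", "widget", "1.0",
                "Widget", "An example component", "Apache-2.0",
                List.of(new DependencySketch("org.example", "util", "2.1")));
        System.out.println(widget.coordinate() + " depends on " + widget.dependencies());
    }
}
```

Nothing in this model describes how the artifact was built; that information would stay with the project's build files.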

 

The Maven 2 POM format does not meet this requirement because it contains details, in the "<build>" section, about how the build works that are not useful for an artifact in the repository.

 

4. Self Contained Artifact Metadata

The metadata for a component should be internally complete, meaning that the metadata itself does not require any external information/metadata before it can be used to describe the module. 

 

Maven 2 does not meet this requirement because in order to determine the full set of information about an artifact, it is often necessary to resolve parent POMs and/or resolve dependency information from external imported POMs.
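One way to satisfy this requirement would be to flatten inherited values into the metadata when it is published. The Java sketch below shows the idea under simplifying assumptions: the Descriptor type and its fields are hypothetical, inheritance is reduced to "the child's value wins", and it does not reproduce Maven's actual inheritance rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of flattening inherited metadata at publish time.
final class MetadataFlattenerSketch {

    // A descriptor as authored: its own fields plus an optional parent id (null when there is none).
    record Descriptor(String id, String parentId, Map<String, String> fields) {}

    // Returns the fields of 'id' with every inherited value copied in,
    // so that the result can be read without looking up any parent.
    static Map<String, String> flatten(String id, Map<String, Descriptor> byId) {
        Map<String, String> merged = new LinkedHashMap<>();
        int visited = 0;
        Descriptor current = byId.get(id);
        while (current != null) {
            // putIfAbsent keeps the value closest to the child, so children override parents.
            current.fields().forEach(merged::putIfAbsent);
            if (++visited > byId.size()) {
                throw new IllegalStateException("cycle in parent chain of " + id);
            }
            current = current.parentId() == null ? null : byId.get(current.parentId());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Descriptor> all = Map.of(
                "parent", new Descriptor("parent", null, Map.of("license", "Apache-2.0", "name", "Parent")),
                "child", new Descriptor("child", "parent", Map.of("name", "Child")));
        System.out.println(flatten("child", all));
    }
}
```

The flattened map is what would be stored in the repository; the parent reference is only needed at publish time.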

 

5. Indexing and Searching

There should be a standard index format which build tools could use to look up artifacts and available versions, metadata, etc.  There should be a standard or recommended process for sharing index information between multiple repositories to locate conflicting releases.
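To make the two use cases concrete (looking up available versions, and comparing indexes from different repositories), here is a minimal in-memory sketch in Java. The class, the "packageId:projectId:version" coordinate string, and the use of a checksum to detect conflicts are all assumptions made for this example, not a proposed index format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory index; a real format would be a file that tools download.
final class RepositoryIndexSketch {

    // coordinate ("packageId:projectId:version") -> checksum of the published artifact
    private final Map<String, String> checksums = new TreeMap<>();

    void add(String coordinate, String checksum) {
        checksums.put(coordinate, checksum);
    }

    // Versions available for a project, in plain string order (not semantic version order).
    List<String> versionsOf(String packageId, String projectId) {
        String prefix = packageId + ":" + projectId + ":";
        List<String> versions = new ArrayList<>();
        for (String coordinate : checksums.keySet()) {
            if (coordinate.startsWith(prefix)) {
                versions.add(coordinate.substring(prefix.length()));
            }
        }
        return versions;
    }

    // Coordinates present in both indexes whose checksums differ, i.e. conflicting releases.
    List<String> conflictsWith(RepositoryIndexSketch other) {
        List<String> conflicts = new ArrayList<>();
        checksums.forEach((coordinate, checksum) -> {
            String theirs = other.checksums.get(coordinate);
            if (theirs != null && !theirs.equals(checksum)) {
                conflicts.add(coordinate);
            }
        });
        return conflicts;
    }
}
```

A build tool could call versionsOf to offer upgrades, and a repository manager could call conflictsWith before mirroring another repository.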

 

The Nexus index format is a de facto standard for the Maven 2 repository format; however, there is no official Maven 2 standard.

 

6. Standard for Repository Cleanup

There should be a standard process, and standard metadata, for recording changes made over time to old releases in the repository.

 

Bad/Broken artifacts and metadata - In some instances bad builds are released to the repository.  This could be due to a manual mistake, or to something lacking in the release process.  An additional use case is that, over time, security issues are sometimes found in old releases.  A subset of the repository metadata should be changeable post-release to allow the release to be marked as a bad release.

 

Deprecating/Archiving releases - Over time, releases become stale and unsupported.  A subset of the repository metadata should allow the release to be marked as deprecated, meaning that its use is no longer recommended.  In addition, after a period of deprecation, there should be a standard process for archiving these releases to a separate storage location.  The advantage of this is a cleaner and smaller primary repository which can be more easily mirrored and indexed.
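The mutable subset of metadata described in the two cases above could be as small as a single status field with restricted transitions. The Java sketch below is one possible reading: the status names and the allowed transitions are assumptions made for this example, chosen so that archiving only follows deprecation.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical release lifecycle; status names and transitions are assumptions.
enum ReleaseStatusSketch {
    ACTIVE, BAD, DEPRECATED, ARCHIVED;

    // Statuses this release may move to next.
    Set<ReleaseStatusSketch> allowedNext() {
        switch (this) {
            case ACTIVE:
                // A live release can be flagged as broken or retired gracefully.
                return EnumSet.of(BAD, DEPRECATED);
            case DEPRECATED:
                // A deprecated release can still turn out to be bad, or be archived later.
                return EnumSet.of(BAD, ARCHIVED);
            default:
                // BAD and ARCHIVED are treated as final in this sketch.
                return EnumSet.noneOf(ReleaseStatusSketch.class);
        }
    }

    boolean canMoveTo(ReleaseStatusSketch next) {
        return allowedNext().contains(next);
    }
}
```

Only this field would change after publication; the artifact itself and the rest of its metadata would stay as released.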

 

The Maven 2 repository format assumes that all releases to the repository will remain completely unchanged (both the artifact and its metadata) indefinitely.

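This is not optimal given the use cases above: a bad or insecure release cannot be marked as such after publication, and stale releases cannot be deprecated or archived.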

 

 
