Federation design (for 3.1 and later)

Version 4

Created by rhauch on Oct 25, 2012 10:11 AM. Last modified by rhauch on Nov 9, 2012 2:31 PM.

ModeShape 2.x had a "federation" feature that made it possible for some of the content in a repository to be accessed from external systems. The feature used the same connector SPI for storing content and for accessing external systems, but using a single SPI for two incongruous capabilities caused quite a few issues. Consequently, when ModeShape 3.0 was designed to use Infinispan for all storage, the ability to access external systems was not implemented (due largely to time constraints). However, when designing ModeShape 3.0 we did take into account the need to access external systems, and put into place a number of implementation details that might also be useful for federation. But there still are a lot of specifics and decisions that need to be addressed.

Federation terminology

Some proposed terms:

External node - A node that is obtained from an external system.
Internal node - A node that is owned and managed entirely by ModeShape.
Federated node - An internal node that has one or more external nodes as children. (Can anyone think of a better name for this?)
Connector - The software that implements our SPI and talks to a specific kind of external system. This is analogous to a JDBC Driver.
Source - A particular instance of an external system that is accessed via a Connector with particular connection parameters. This is analogous to a JDBC DataSource.
Connection - An open channel to a particular source. The duration of a connection is not known, but connections are likely to be short-lived. (An alternative might be "Transaction", but that might imply semantics we don't want to accept.)

Existing design

The current design of ModeShape 3.0 includes several features that may be important as we implement federation.

NodeKey

Every CachedNode (which is wrapped by the JCR objects) is identified by a NodeKey object that has three parts: a source part, a workspace part, and an identifier part. The workspace part uniquely identifies a workspace within the repository; the identifier part uniquely identifies the node within that workspace; and the source part is intended to identify the source where the node is persisted. In 3.0, the source part is the same for all nodes (which are known as "internal nodes"). In 3.1 (or later), each external node will use a source part that identifies its external system.
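
The three-part structure can be illustrated with a minimal sketch. The class name SketchNodeKey and the fixed 7-character widths for the source and workspace parts are assumptions made for illustration (the widths are inferred from keys such as "source1works1-cafebabe-cafe-babe-cafe-babecafebabe"); this is not ModeShape's actual NodeKey class.

```java
// Hypothetical sketch; not ModeShape's actual NodeKey class.
final class SketchNodeKey {
    // Assumption: the source and workspace parts are fixed-width strings.
    static final int SOURCE_LENGTH = 7;
    static final int WORKSPACE_LENGTH = 7;
    private static final int PREFIX_LENGTH = SOURCE_LENGTH + WORKSPACE_LENGTH;

    private final String key;

    private SketchNodeKey(String key) {
        this.key = key;
    }

    // Build a key from its three parts, checking the fixed-width assumption.
    static SketchNodeKey of(String sourceKey, String workspaceKey, String identifier) {
        if (sourceKey.length() != SOURCE_LENGTH || workspaceKey.length() != WORKSPACE_LENGTH) {
            throw new IllegalArgumentException("source and workspace parts must be fixed-width");
        }
        if (identifier.isEmpty()) {
            throw new IllegalArgumentException("identifier is required");
        }
        return new SketchNodeKey(sourceKey + workspaceKey + identifier);
    }

    // Parse the string form of a key.
    static SketchNodeKey parse(String key) {
        if (key.length() <= PREFIX_LENGTH) {
            throw new IllegalArgumentException("key too short: " + key);
        }
        return new SketchNodeKey(key);
    }

    String sourceKey() { return key.substring(0, SOURCE_LENGTH); }

    String workspaceKey() { return key.substring(SOURCE_LENGTH, PREFIX_LENGTH); }

    String identifier() { return key.substring(PREFIX_LENGTH); }

    // A node is external when its source part differs from the internal source part.
    boolean isExternal(String internalSourceKey) {
        return !sourceKey().equals(internalSourceKey);
    }

    @Override
    public String toString() { return key; }
}
```

With this layout, deciding whether a node is internal or external requires only a prefix comparison on the key, without loading the node.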

Node documents

Every node is mapped to a JSON/BSON Document representation and stored in Infinispan (or an external source), and each node's Document contains the node's exposed JCR properties, the keys to the parent(s), hidden properties, and child references (the key and name; SNS indexes are calculated rather than stored).

Here's an example of a node document. The "source1" is the source key, "works1-" is the workspace key, and the node's identifier is a UUID.

{
  "parent" : "source1works1-cafebabe-cafe-babe-cafe-babecafebabe",
  "properties" : {
    "http://www.jcp.org/jcr/1.0" : {
      "primaryType" : {
        "$name" : "{http://www.jcp.org/jcr/nt/1.0}unstructured"
      },
      "mixinTypes" : [
        {
          "$name" : "{http://www.jcp.org/jcr/mix/1.0}referenceable"
        }
      ],
      "uuid" : "2ef58009-28ab-4348-8445-b92b629e97e1"
    },
    "http://www.modeshape.org/1.0/test" : {
      "description" : "This is the description of the 'childB' node.",
      "intValue" : 20,
      "booleanValue" : false,
      "doubleValue" : 2.3533
    }
  },
  "childrenInfo" : {
    "count" : 12,
    "lastBlock" : "source1works1-childB"
  },
  "children" : [
    {
      "key" : "source1works1-childC",
      "name" : "childC"
    },
    {
      "key" : "source1works1-childD",
      "name" : "childD"
    },
    {
      "key" : "source1works1-13860492-a42c-4506-912c-63c91ddbbbcc",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-cc929d32-6477-40cc-9a00-205b82dbe0ad",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-8668fbd0-43e3-41e2-9fdf-fc28789633fa",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-6aaaa1a5-a210-42c7-8fd7-426ad085ecc7",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-fd7860d2-bfd7-4efb-9efc-178e1dd08f79",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-0f79a231-2471-4426-a1a6-1e3f74cc0de1",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-689b6ce2-695b-4ebc-ba4f-d64263fe0e5c",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-dcd4d299-1c51-4370-85f6-b4900003e8a0",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-8fef05d3-aa32-4255-ab62-a5972d87c829",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-39008642-aaf0-4a7b-8fa9-dc80e61151e3",
      "name" : "newChild"
    }
  ]
}

Note how the "properties" field contains a nested document with one field for each namespace URI (not prefixes, which can change). Each of those fields is itself a nested document with a field for each property name.
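
The benefit of keying by namespace URI can be shown with a small sketch. The class SketchProperties and its methods are hypothetical, and plain nested maps stand in for the "properties" document; the point is only that a lookup keeps working when a prefix is remapped.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; nested maps stand in for the "properties" document.
final class SketchProperties {
    // Outer key: namespace URI. Inner key: local property name.
    private final Map<String, Map<String, Object>> byNamespaceUri = new HashMap<>();

    void set(String namespaceUri, String localName, Object value) {
        byNamespaceUri.computeIfAbsent(namespaceUri, uri -> new HashMap<>()).put(localName, value);
    }

    // Returns null when either the namespace or the property is absent.
    Object get(String namespaceUri, String localName) {
        Map<String, Object> properties = byNamespaceUri.get(namespaceUri);
        return properties == null ? null : properties.get(localName);
    }

    // Resolve a prefixed name such as "jcr:uuid" using the caller's current
    // prefix-to-URI mappings; the stored document never sees the prefix.
    Object getByPrefixedName(String prefixedName, Map<String, String> prefixToUri) {
        int colon = prefixedName.indexOf(':');
        String prefix = colon < 0 ? "" : prefixedName.substring(0, colon);
        String localName = prefixedName.substring(colon + 1);
        String uri = prefixToUri.get(prefix);
        return uri == null ? null : get(uri, localName);
    }
}
```

Because only the URI is stored, remapping "jcr" to some other prefix in a session changes how the name is written but not which stored field it resolves to.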

Also note that two separate fields are used to store information about the children. The "childrenInfo" stores summary information, including the total number of children (in "count") and the last block of child references (in "lastBlock", which points to this document, meaning it's the last block and there are no other blocks). We'll discuss how these segments/blocks work a bit later.

Child References

Every CachedNode can return the ChildReferences object that contains (and owns) the set of references to child nodes (i.e., ChildReference objects). Note that the key, name, and same-name-sibling (SNS) index are all accessed on the ChildReference. In other words, a node does not know its own name, but rather each parent knows the names of its children. (This allows a node to be linked under multiple parents yet have different names under each. It is also more efficient, as we often need to know the name of a child node before we actually materialize the child node.)

Conceptually the ChildReferences instance is an Iterable<ChildReference> that can access nodes by name (and optionally SNS index) and can also create a variety of iterators over subsets of the ChildReference objects.
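
As a rough illustration, the sketch below computes SNS indexes from the ordered (key, name) pairs rather than storing them, and supports lookup by name and SNS index. The types SketchChildRef and SketchChildRefs are hypothetical stand-ins, not ModeShape's ChildReference and ChildReferences; the 1-based index follows JCR path conventions such as newChild[2].

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a child reference.
final class SketchChildRef {
    final String key;
    final String name;
    final int snsIndex; // 1-based, as in JCR paths such as newChild[2]

    SketchChildRef(String key, String name, int snsIndex) {
        this.key = key;
        this.name = name;
        this.snsIndex = snsIndex;
    }
}

// Hypothetical stand-in for the collection of child references owned by a parent.
final class SketchChildRefs implements Iterable<SketchChildRef> {
    private final List<SketchChildRef> refs = new ArrayList<>();

    // Takes {key, name} pairs in document order and calculates each SNS index
    // by counting how many earlier siblings share the same name.
    SketchChildRefs(List<String[]> keyNamePairs) {
        Map<String, Integer> seenByName = new HashMap<>();
        for (String[] pair : keyNamePairs) {
            int index = seenByName.merge(pair[1], 1, Integer::sum);
            refs.add(new SketchChildRef(pair[0], pair[1], index));
        }
    }

    // Find a child by name and SNS index; returns null when there is no match.
    SketchChildRef get(String name, int snsIndex) {
        for (SketchChildRef ref : refs) {
            if (ref.name.equals(name) && ref.snsIndex == snsIndex) {
                return ref;
            }
        }
        return null;
    }

    @Override
    public Iterator<SketchChildRef> iterator() {
        return refs.iterator();
    }
}
```

A linear scan is used for lookup to keep the sketch short; a real implementation would presumably index by name.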

Segments/Blocks

Within each node's document representation, the node's child references can be represented as an array of nested child reference mini-documents; this is equivalent to storing all child references in the node's document. However, the node document becomes larger (and more costly to work with) as the number of children goes up.

To account for this, ModeShape can segment the child references, which involves leaving the first N child references in the node's document but placing all remaining child references in one or more separate documents. For example, consider a node that was created with 1000 child references. Initially, all 1000 child references might be put into the node's document. At a later time (perhaps during an asynchronous optimization process), this document is changed so that the first 100 child references remain in the node's document, but the next 100 are placed into a separate "segment" document that is referenced in the node's document. The next 100 would be placed in a separate segment document referenced by the first segment document. This process continues until all of the remaining child references are placed into other segment documents. (This process is done in a way that doesn't create conflicts with concurrent reads. See the DocumentTranslator.optimizeChildrenBlocks and DocumentTranslator.splitChildren methods for details.) Note that the reference to the next segment(s) are stored in a "

Here's an example of a node's document that contains some child references while others are stored in a separate segment/block:

{
  "parent" : "source1works1-cafebabe-cafe-babe-cafe-babecafebabe",
  "properties" : {
    "http://www.jcp.org/jcr/1.0" : {
      "primaryType" : {
        "$name" : "{http://www.jcp.org/jcr/nt/1.0}unstructured"
      },
      "mixinTypes" : [
        {
          "$name" : "{http://www.jcp.org/jcr/mix/1.0}referenceable"
        }
      ],
      "uuid" : "2ef58009-28ab-4348-8445-b92b629e97e1"
    },
    "http://www.modeshape.org/1.0/test" : {
      "description" : "This is the description of the 'childB' node.",
      "intValue" : 20,
      "booleanValue" : false,
      "doubleValue" : 2.3533
    }
  },
  "childrenInfo" : {
    "count" : 12,
    "lastBlock" : "source1works1-609cb28f-244c-4a2c-98c5-8506c78c92b8",
    "blockSize" : 5,
    "nextBlock" : "source1works1-609cb28f-244c-4a2c-98c5-8506c78c92b8"
  },
  "children" : [
    {
      "key" : "source1works1-childC",
      "name" : "childC"
    },
    {
      "key" : "source1works1-childD",
      "name" : "childD"
    },
    {
      "key" : "source1works1-92cdd956-5ba0-4ccf-9feb-d5e683a941a2",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-c76e39cf-ae9b-4614-adae-5b70abef962e",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-4c5651e4-117d-45ea-bbcf-7c12bd38b15b",
      "name" : "newChild"
    }
  ]
}

Here, the "childrenInfo" nested document tells us:

the node contains 12 child references (via the "count" field),
only 5 child references are contained in this document/block (via the "blockSize" field),
the next block of children is the "source1works1-609cb28f-244c-4a2c-98c5-8506c78c92b8" document (via the "nextBlock" field); this is used when iterating or finding children by name
the last block of children is also the "source1works1-609cb28f-244c-4a2c-98c5-8506c78c92b8" document (via the "lastBlock" field); this is used when appending new child references

The next block document is a bit simpler than a node document. Here's the "source1works1-609cb28f-244c-4a2c-98c5-8506c78c92b8" block document:

{
  "childrenInfo" : {
    "blockSize" : 7
  },
  "children" : [
    {
      "key" : "source1works1-a3f372d7-1e6e-405d-a96b-13e7a998e12e",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-c0cacc85-578a-414f-b45c-c11cbdfa0e54",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-74f6fabf-b8a4-42bc-8578-e1a6de9795e9",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-8dac5f39-b923-4d77-b922-ae4de4bdb612",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-c6f0a1a0-fac6-44da-a544-d9db576ac4db",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-805ec9a7-5575-40d7-bd72-a3ac8b51f23d",
      "name" : "newChild"
    },
    {
      "key" : "source1works1-31258860-6868-4432-b8f3-7255f8672fac",
      "name" : "newChild"
    }
  ]
}
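
The block chain can be sketched as a simple linked structure. The code below is only an illustration under stated assumptions: SketchBlock and SketchBlocks are hypothetical names, documents live in an in-memory map, extra blocks get generated keys of the form nodeKey + "/block" + n, and blocks are split to a uniform size. It ignores the concurrency concerns handled by the real DocumentTranslator methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a node document or a block document.
final class SketchBlock {
    final List<String> childKeys = new ArrayList<>();
    String nextBlock; // key of the next block document, or null for the last block
}

final class SketchBlocks {
    // Split child keys into blocks of at most blockSize, linking each block to
    // the next via nextBlock. The first block is stored under nodeKey; later
    // blocks get generated keys (an assumption made for this sketch).
    static void store(Map<String, SketchBlock> docs, String nodeKey,
                      List<String> childKeys, int blockSize) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        SketchBlock current = new SketchBlock();
        docs.put(nodeKey, current);
        int blockNumber = 0;
        for (String childKey : childKeys) {
            if (current.childKeys.size() == blockSize) {
                String nextKey = nodeKey + "/block" + (++blockNumber);
                current.nextBlock = nextKey;
                current = new SketchBlock();
                docs.put(nextKey, current);
            }
            current.childKeys.add(childKey);
        }
    }

    // Collect all child keys in order by following the nextBlock references.
    static List<String> allChildKeys(Map<String, SketchBlock> docs, String nodeKey) {
        List<String> result = new ArrayList<>();
        SketchBlock block = docs.get(nodeKey);
        while (block != null) {
            result.addAll(block.childKeys);
            block = block.nextBlock == null ? null : docs.get(block.nextBlock);
        }
        return result;
    }
}
```

Iteration and find-by-name only ever need the current block plus its "nextBlock" reference, which is why a node with many children does not have to be loaded as one large document.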

A federated node's document could refer to the external nodes with a block, which would allow the federated node to contain its own child references while also containing (an unknown number of) external nodes. We can achieve this in one of two ways:

Option 1 is to simply create a special kind of block document that points to the external source (by identifier) and contains enough information to know which external nodes should be considered as child nodes. The federated node would simply refer to this block just like any other block; only upon getting the block document and looking at its content would we know that it's an "external block" document. Note that the "external block" might appear in the middle of other regular blocks. This is both a benefit and a detriment, as the optimization logic and the child reference append/reorder logic need to address this.
Option 2 is to always treat "external" blocks a bit differently. For example, we could add an optional "externalBlocks" array field that references zero or more external block documents whose child references are always placed at the end or beginning of the child references. We likely will still have to address appending/reordering nodes, but the optimization logic might be left untouched.
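
To make Option 2 a little more concrete, here is a rough sketch in which external children are always appended after the internal ones. Everything in it is hypothetical: the class name, the idea that each entry of "externalBlocks" is a key, and the resolver callback that stands in for whatever a connector would provide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of Option 2; none of these names exist in ModeShape.
final class SketchFederatedChildren {
    // internalChildren: child keys held in the federated node's regular blocks.
    // externalBlocks:   keys taken from the proposed "externalBlocks" array field.
    // resolver:         stand-in for a connector call that maps an external block
    //                   key to the keys of the external child nodes it exposes.
    static List<String> children(List<String> internalChildren,
                                 List<String> externalBlocks,
                                 Function<String, List<String>> resolver) {
        // Internal references come first and are untouched by federation.
        List<String> result = new ArrayList<>(internalChildren);
        // External references are always placed at the end, in block order.
        for (String blockKey : externalBlocks) {
            result.addAll(resolver.apply(blockKey));
        }
        return result;
    }
}
```

Since the regular blocks are never interleaved with external ones in this arrangement, the existing block optimization could continue to operate on the internal children alone.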

SessionCache

The SessionCache implementations manage accessing all persistent node state and applying all changes. For federation, we might need an abstraction (e.g., DocumentStore) for getting and putting the documents, where one implementation would use the SchematicDB the way we're currently doing it. The source should be able to hold onto a shareable instance of this interface, and the AbstractSessionCache (used for both ReadOnlySessionCache and WritableSessionCache) could delegate to those objects. However, it's important that this is done as efficiently as possible, to avoid adding overhead to the current functionality. Consider that we probably don't want WritableSessionCache to have to check each node's source to look up the DocumentStore object; it may be better to store the transient changes keyed by source id.
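
One possible shape for this is sketched below. The interface and class names, the use of plain maps as documents, and the fixed 7-character source prefix are all assumptions for illustration; the sketch only shows transient changes grouped by source key so that each store is looked up once per source at save time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction, named after the "DocumentStore" idea above.
interface SketchDocumentStore {
    Map<String, Object> get(String key);
    void put(String key, Map<String, Object> document);
}

// Trivial store used in place of SchematicDB or a connector.
final class SketchInMemoryStore implements SketchDocumentStore {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();
    public Map<String, Object> get(String key) { return docs.get(key); }
    public void put(String key, Map<String, Object> document) { docs.put(key, document); }
}

// Buffers transient changes keyed by source, then flushes them per source.
final class SketchWritableCache {
    private static final int SOURCE_LENGTH = 7; // assumption: fixed-width source part
    private final Map<String, SketchDocumentStore> storesBySource;
    private final Map<String, Map<String, Map<String, Object>>> changesBySource = new HashMap<>();

    SketchWritableCache(Map<String, SketchDocumentStore> storesBySource) {
        this.storesBySource = storesBySource;
    }

    void change(String nodeKey, Map<String, Object> document) {
        String source = nodeKey.substring(0, SOURCE_LENGTH);
        if (!storesBySource.containsKey(source)) {
            throw new IllegalArgumentException("unknown source: " + source);
        }
        changesBySource.computeIfAbsent(source, s -> new HashMap<>()).put(nodeKey, document);
    }

    // Transient changes shadow persisted documents until save() is called.
    Map<String, Object> get(String nodeKey) {
        String source = nodeKey.substring(0, SOURCE_LENGTH);
        Map<String, Map<String, Object>> changes = changesBySource.get(source);
        if (changes != null && changes.containsKey(nodeKey)) {
            return changes.get(nodeKey);
        }
        SketchDocumentStore store = storesBySource.get(source);
        return store == null ? null : store.get(nodeKey);
    }

    // One store lookup per source, not per node.
    void save() {
        for (Map.Entry<String, Map<String, Map<String, Object>>> bySource : changesBySource.entrySet()) {
            SketchDocumentStore store = storesBySource.get(bySource.getKey());
            for (Map.Entry<String, Map<String, Object>> change : bySource.getValue().entrySet()) {
                store.put(change.getKey(), change.getValue());
            }
        }
        changesBySource.clear();
    }
}
```

This sketch leaves out transactions and failure handling across sources, both of which a real implementation would need to address.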

Configuration

ModeShape's JSON configuration file and the RepositoryConfiguration class will need to change to allow defining external sources. The AS7 subsystem will need to be changed to support defining them via the AS7 toolset. Should we also store the connector and source information within the repository, perhaps only in the hidden repository information or perhaps within the "/jcr:system" area?

How do we configure the federated nodes?

Binary storage

ModeShape stores all binary content (above a configurable size) in a BinaryStore. Since clients can use their Session to create new Binary values and then use those Binary values in properties on specific nodes, we'll have to create all new Binary values in ModeShape's BinaryStore. Only when a new Binary value is used in an external node's property would we know that the Binary should be sent to the connector.

One option is to have the connectors expose a BinaryStore implementation, which ModeShape uses to push binary values down to the connector and which can be referenced by StoredBinaryValue objects created for an external node.

Another option is for the connector to provide only a handful of methods, and for ModeShape to wrap that. This would be simpler for connector implementations. For example, many connectors might not be able to store extracted text; should connectors just store the binary values and ModeShape manage the rest of the binary information?

NodeTypes

A connector should be able to expose the set of node types that it uses, and we can perhaps follow a pattern similar to the Sequencer framework: the connector is "initialized" upon startup and is given interfaces to register node types and namespaces.

The connector may also need to list the node types that are allowed on its content, and we need to determine where we perform this validation (either in ModeShape or in the connector). Is it sufficient that the ModeShape JCR layer will validate the properties and child node definitions, and can limit the primary types based upon the child node definitions? ModeShape doesn't currently limit which mixins can be added to a node, but we might need to add that capability. Perhaps connectors should optionally provide a Validator object that JcrSession can use during PreSave to validate nodes (based upon the nodes' source).
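
As one way to picture the optional Validator idea, the sketch below dispatches validation by the source part of a node's key. The SketchValidator interface, its method signature, and the fixed 7-character source prefix are all hypothetical; nodes whose source has no registered validator are simply accepted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical SPI that a connector could optionally implement.
interface SketchValidator {
    // Returns a list of problems; an empty list means the node is acceptable.
    List<String> validate(String nodeKey, String primaryType, List<String> mixinTypes);
}

// Hypothetical pre-save hook that picks a validator based upon the node's source.
final class SketchPreSave {
    private static final int SOURCE_LENGTH = 7; // assumption: fixed-width source part
    private final Map<String, SketchValidator> validatorsBySource = new HashMap<>();

    void register(String sourceKey, SketchValidator validator) {
        validatorsBySource.put(sourceKey, validator);
    }

    List<String> validate(String nodeKey, String primaryType, List<String> mixinTypes) {
        SketchValidator validator = validatorsBySource.get(nodeKey.substring(0, SOURCE_LENGTH));
        if (validator == null) {
            // No connector-specific rules for this source (e.g., internal nodes).
            return new ArrayList<>();
        }
        return validator.validate(nodeKey, primaryType, mixinTypes);
    }
}
```

A connector that cannot store mixins, for example, could register a validator that reports a problem whenever the mixin list is non-empty, while internal nodes remain unaffected.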

What about registering new node types? Should ModeShape do this automatically when it determines that an external node uses a node type (via primary type or mixin type) that the connector doesn't know about? Perhaps the connector Validator can do this as it validates the content.

Versioning

Some external systems (like Git, SVN, and even other JCR repositories) already support a notion of versioning, so it's likely that we want to allow connectors to control the versioning of their own content. However, when we ported the versioning logic in 2.x to the 3.0 codebase, merely changing the logic to use the new cached nodes was a big enough change, so we didn't really consider this as part of the 3.0 effort.

To support versioning, we'll likely want to extract the parts of the JcrVersionManager logic that use the cached nodes into a separate component, so that what remains in JcrVersionManager can either version internal nodes or delegate to the connector for versioning of external nodes.

Query

After 3.1, we'll likely want to be able to push queries down to connectors so that ModeShape doesn't have to index the external content. In the meantime, we'll need connectors to be able to coordinate with ModeShape when "new" content becomes available so that it is properly indexed.

Implementation steps

There are a lot of things that need to be done to complete the federation capability, but we definitely want to proceed in a step-wise fashion. Here is an initial set of goals:

1. Determine hooks for reading/writing in current codebase (to identify what operations we perform)
2. Create initial but basic connector SPI (e.g., "DocumentStore")
3. Modify codebase to use any abstractions (e.g., "DocumentStore") for internal content
4. Create hard-coded connector implementation (to simplify testing)
5. Enable read-only federation (with node types already in ModeShape, no binary values, no queries, hacked configuration)
6. Initial (minimal) configuration of sources
7. Add ability to read binary values
8. Add node type support to connectors
9. Implement file system connector (read-only)
10. Enable writing (changing/adding/removing) content to the external system, perhaps without validation
11. Add ability to write binary values to external system
12. Add write ability to file system connector
13. Add validation
14. Start implementing additional connectors (e.g., JCR, Git, SVN, JDBC metadata)
15. Support querying external content (requires indexing external content)
16. Support connectors versioning their own external content

UPDATE 1: I've created a new "federation" branch in the upstream repository off of which all federation-related topic branches should be based. We'll continue to use 'master' for non-federation changes, and we'll periodically merge (not cherry-pick!) "master" into "federation" branch so that it's easier to merge "federation" back into "master".

UPDATE 2: As of Nov 9 2012, most of the above steps (1-12) have been completed on a "federation" branch. See MODE-1513 for other status information.
