9 Replies Latest reply on Sep 7, 2010 11:01 PM by penkween

    How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit

    penkween

               My company's project requires a hierarchical datastore. The project is a photo album like Flickr that lets users upload photos, which are stored on a conventional file system on Windows (or Linux) and later used by the web presentation layer. We chose to try ModeShape (JCR) to store the metadata of the photos kept in the file system, which should make indexing and searching easier later. I tried ModeShape 2.2.0 with the following configuration, based on ModeShape's UFOs example:

       

      <mode:source jcr:name="UFOs" mode:classname="org.modeshape.connector.filesystem.FileSystemSource"
                  mode:workspaceRootPath="C:/test/modeshape/ufoSource"
                  mode:defaultWorkspaceName="workspace1"
                  mode:creatingWorkspacesAllowed="false"
                  mode:updatesAllowed="true"/>

       

      1. Reading Repository  -> is OK

      #########################################################################################################

       

      SecurityContextCredentials credential = new SecurityContextCredentials(securityContext);
      session  = repository.login(credential);
      nodeRoot = session.getRootNode();

      dump(nodeRoot);   // At this point I can see my file system contents; dump() is a custom function that prints the nodes

      session.save();

      session.logout();

       

       

       

      2. Creating Nodes-> OK (using graph api)  but Fail (using Node.addNode)

      #########################################################################################################

       

      Fails when using Node.addNode()   ** during session.save()

      ==============================================

      *** if I try to add it at root "/" ***

      javax.jcr.RepositoryException: org.modeshape.graph.connector.RepositorySourceException: Primary type "UFOs" for path "nt:unstructured" in workspace "/" in workspace1 is not valid for the file system connector.  Valid primary types are nt:file, nt:folder, nt:resource, and dna:resouce.
              at org.modeshape.jcr.SessionCache.save(SessionCache.java:412)
              at org.modeshape.jcr.JcrSession.save(JcrSession.java:1346)

       

      *** if I try to add it at a folder named "FolderA" under root "/" ***

      javax.jcr.nodetype.ConstraintViolationException: Unable to determine a valid node definition for the node "/{}FolderA/{}newfile.txt" in workspace "workspace1" of "JCR UFOs"
              at org.modeshape.jcr.SessionCache$NodeEditor.createChild(SessionCache.java:1565)
              at org.modeshape.jcr.AbstractJcrNode.addNode(AbstractJcrNode.java:1468)
              at org.modeshape.jcr.AbstractJcrNode.addNode(AbstractJcrNode.java:1344)

       

       

      OK when using the graph API

      ==============================================

      Using the ModeShape graph API (learned from http://community.jboss.org/message/528581), I managed to create and remove files and folders with the following code:

       

      [Create File]

      String file = "/newfile.txt";

      Graph graph = jcrEngine.getGraph("UFOs");
      graph.create(file).with("jcr:primaryType","nt:file").and();
      graph.create(file+"/jcr:content").with("jcr:primaryType","nt:resource").and("jcr:data",data).orReplace().and();

       

      [Create Folder]

      String folder = "/newfolder";

      Graph graph = jcrEngine.getGraph("UFOs");
      graph.create(folder).with("jcr:primaryType","nt:folder").and();

       

       

       

      3. Creating Properties -> Fail

      #########################################################################################################

      My problem is how to add a property such as "Author" to newfile.txt using Node.setProperty("Author","David") and later fetch it using Node.getProperty(). I tried the setProperty and getProperty calls above, and they failed with the following exception:

       

      ======================================================================

      javax.jcr.nodetype.ConstraintViolationException: Cannot find a definition for the property named 'author' on the node at '/newfile.txt' with primary type 'nt:file' and mixin types: []
              at org.modeshape.jcr.SessionCache$NodeEditor.setProperty(SessionCache.java:1046)
              at org.modeshape.jcr.SessionCache$NodeEditor.setProperty(SessionCache.java:971)
              at org.modeshape.jcr.AbstractJcrNode.setProperty(AbstractJcrNode.java:1667)

      =======================================================================

       

       

       

         Can anybody help me work out how to create new properties to attach (as metadata) to the file (nt:file)? I strongly believe that ModeShape is a fantastic technology, but only if we know how to use it properly. Thanks.

       

      Rgds,

      Danny

        • 1. Re: How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit
          rhauch

          First, the graph API is a low-level API, and apart from the fact that it consists of very different interfaces and methods, the major difference between the graph API and the JCR API is that the graph API does not know about node types and therefore does no validation. It will happily set a "jcr:primaryType" to a date value. So, when using the graph API to create or update content that will be accessed by someone via the JCR API, you are responsible for ensuring that the content will be considered valid when coming from JCR. Some things, like default values, will appear within the JCR layer because of the validation work it does. But for the most part, the content set through the graph API should be what the JCR user expects to see. I think this explains why you were able to use the graph API without it complaining about validity.

           

          Second, you didn't include any of your code that was creating the nodes, and by the errors I'm guessing that you were creating nodes with a primary type of 'nt:unstructured'. That's not really allowed by JCR when using 'nt:file' and 'nt:folder' nodes. All JCR implementations (even Jackrabbit and ModeShape) restrict what you can do once you enter the 'nt:file' or 'nt:folder' world, simply because the 'nt:file' and 'nt:folder' node types will restrict what kind of nodes, properties and children you're allowed to add. For example, let's say that you create a folder '/a/myFolder' of type 'nt:folder' as follows:

           

          (1)  Node a = session.getNode("/a");

          (2)  Node myFolder = a.addNode("myFolder","nt:folder");

           

          I'm assuming that the primary type and mixin types of Node a allow a child of type 'nt:folder'; something like 'nt:unstructured' of course will.

           

          Per the 'nt:folder' node type, the only children you can add under 'myFolder' are those with a primary type of 'nt:hierarchyNode' (the supertype of both 'nt:file' and 'nt:folder'). So you can do this:

           

          (3)  Node myFile = myFolder.addNode("myFile.txt","nt:file");


          But you cannot do this:

           

          (4)  Node somethingElse = myFolder.addNode("somethingElse","nt:unstructured");

           

          since the 'nt:unstructured' node type does not extend 'nt:hierarchyNode'. In fact, you can't even do this:

           

          (5)  Node somethingElse = myFolder.addNode("somethingElse");

           

          because the 'nt:folder' node type does not define a default primary type for its children. This is what the error you found is trying to say:

          javax.jcr.nodetype.ConstraintViolationException: Unable to determine a valid node definition for the node "/{}FolderA/{}newfile.txt" in workspace "workspace1" of "JCR UFOs"

                  at org.modeshape.jcr.SessionCache$NodeEditor.createChild(SessionCache.java:1565)
                  at org.modeshape.jcr.AbstractJcrNode.addNode(AbstractJcrNode.java:1468)
                  at org.modeshape.jcr.AbstractJcrNode.addNode(AbstractJcrNode.java:1344)

           

          So you must specify the primary type as in line (3) above, and it must be either 'nt:file' or 'nt:folder' (or, for the lone child of an 'nt:file', the node must be called 'jcr:content' and must have a primary type of 'nt:resource'). This is what the first error you listed is trying to explain (I think you're supplying "UFOs" as the primary type name, the second parameter, to the 'addNode(...)' method):

          javax.jcr.RepositoryException: org.modeshape.graph.connector.RepositorySourceException: Primary type "UFOs" for path "nt:unstructured" in workspace "/" in workspace1 is not valid for the file system connector.  Valid primary types are nt:file, nt:folder, nt:resource, and dna:resouce.

                  at org.modeshape.jcr.SessionCache.save(SessionCache.java:412)
                  at org.modeshape.jcr.JcrSession.save(JcrSession.java:1346)
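The child-type rules described above can be condensed into a small lookup. This is only a toy model of the rules as stated in this reply, not ModeShape or JCR code; the class name `ChildTypeRules` and the method `isAllowed` are hypothetical, and the sketch assumes that 'nt:file', like 'nt:folder', has no default child type.

```java
import java.util.Map;
import java.util.Set;

/** Toy model of the child-type rules described in this reply; not ModeShape code. */
public class ChildTypeRules {

    // Parent primary type -> child primary types it accepts (simplified).
    private static final Map<String, Set<String>> ALLOWED = Map.of(
            "nt:folder", Set.of("nt:file", "nt:folder"),
            "nt:file", Set.of("nt:resource"));

    /**
     * Returns true if a child with the given name and primary type may be added.
     * A null childType stands for addNode(name) with no explicit primary type.
     */
    public static boolean isAllowed(String parentType, String childName, String childType) {
        if (childType == null) {
            // No default child type is assumed, so the type must be given explicitly.
            return false;
        }
        Set<String> allowed = ALLOWED.get(parentType);
        if (allowed == null || !allowed.contains(childType)) {
            return false;
        }
        // The single child of an nt:file must be named jcr:content.
        return !parentType.equals("nt:file") || childName.equals("jcr:content");
    }
}
```

With this model, lines (3), (4) and (5) above map to `isAllowed("nt:folder", "myFile.txt", "nt:file")`, `isAllowed("nt:folder", "somethingElse", "nt:unstructured")` and `isAllowed("nt:folder", "somethingElse", null)`; only the first is accepted.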

           

          The properties are also restricted based upon these node types, and neither 'nt:file' nor 'nt:folder' allows you to add a property of any name - instead, there are only a few properties that are allowed.  You can, of course, add mixins to each node where the mixins define other (residual or non-residual) properties and even children of other node types.

           

          ModeShape's File System connector [1] is only able to store nodes with primary types of 'nt:file' or 'nt:folder', because it is essentially mapping every node onto a file or folder on the file system. In other words, every node in the repository (outside of '/jcr:system') backed by a FileSystemSource must have a primary type of 'nt:file' or 'nt:folder'.  (Note that this connector is not trying to persist arbitrary content on the local file system; it is literally mapping one-for-one the files and folders under a certain location on your file system into 'nt:file' and 'nt:folder' nodes. If you want to persist arbitrary content on your local file system, I suggest using the JPA connector with HSQLDB, and configuring HSQLDB to store its data files on your file system.)

           

          The File System connector also does not, out-of-the-box, allow you to store extra properties (defined via mixins) because it doesn't know where to store those extra properties. It does have an extension point that lets you define how to store and read those extra properties. This is the CustomPropertiesFactory [2], and using it is very simple (see [3] for an earlier discussion). Basically, the FileSystemSource will use your CustomPropertiesFactory to store and read those extra properties that your mixins would allow. The interface is pretty simple, and the JavaDoc does explain what each method is expected to do. Once implemented, simply set the "customPropertiesFactory" property on the FileSystemSource in your ModeShape configuration file to the name of your class. For example, using your configuration:

           

          <mode:source jcr:name="UFOs" mode:classname="org.modeshape.connector.filesystem.FileSystemSource"
                      mode:workspaceRootPath="C:/test/modeshape/ufoSource"
                      mode:defaultWorkspaceName="workspace1"
                      mode:creatingWorkspacesAllowed="false"
                      mode:updatesAllowed="true"
                      mode:customPropertiesFactory="com.acme.repository.MyCustomPropertiesFactory"/>

           

          I wish we had a default implementation that simply reads/writes these extra properties to a file, using a particular naming convention and using the graph API's ValueFactories (accessible via getValueFactories() method on the ExecutionContext passed into the CustomPropertiesFactory methods) to serialize and deserialize the properties. We just haven't had the time to do it yet.

           

          If you write one and want to contribute it, please do! I'd also be happy to answer any questions about how to write your own CustomPropertiesFactory implementation.
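As a rough starting point, the "extra properties in a file with a naming convention" idea could look like the sketch below. It deliberately does not implement the CustomPropertiesFactory interface (see the JavaDoc in [2] for the real method signatures) and does not use the graph API's ValueFactories; it is a plain-Java helper that such a factory might delegate to. The class name `SidecarPropertyStore`, the "._dat" suffix, and the choice of `java.util.Properties` with string-only values are all assumptions made for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper: keeps a file's extra properties in "<name>._dat" beside it. */
public class SidecarPropertyStore {

    static final String SUFFIX = "._dat"; // assumed naming convention

    static File sidecarFor(File target) {
        return new File(target.getParentFile(), target.getName() + SUFFIX);
    }

    /** Writes the extra properties, or removes the sidecar when none remain. */
    public static void write(File target, Map<String, String> extra) throws IOException {
        File sidecar = sidecarFor(target);
        if (extra.isEmpty()) {
            sidecar.delete();
            return;
        }
        Properties props = new Properties();
        props.putAll(extra);
        try (OutputStream out = new FileOutputStream(sidecar)) {
            props.store(out, "extra properties for " + target.getName());
        }
    }

    /** Reads the extra properties; a missing sidecar means there are none. */
    public static Map<String, String> read(File target) throws IOException {
        Map<String, String> result = new TreeMap<>();
        File sidecar = sidecarFor(target);
        if (!sidecar.isFile()) {
            return result;
        }
        Properties props = new Properties();
        try (InputStream in = new FileInputStream(sidecar)) {
            props.load(in);
        }
        for (String name : props.stringPropertyNames()) {
            result.put(name, props.getProperty(name));
        }
        return result;
    }
}
```

A real implementation would also need to keep the sidecar files from showing up as 'nt:file' nodes themselves, and would need to handle multi-valued and non-string properties, which this sketch ignores.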

           

          I hope this helps!

           

          Best regards,

           

          Randall

           

          [1] http://docs.jboss.org/modeshape/latest/manuals/reference/html_single/reference-guide-en.html#file-system-connector

          [2] http://docs.jboss.org/modeshape/latest/api/org/modeshape/connector/filesystem/CustomPropertiesFactory.html

          [3] http://community.jboss.org/message/543466

          • 2. Re: How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit
            penkween

            Hi Randall,

             

            Thanks for your reply. Below is the code used to create a new nt:folder and nt:file, along with the exceptions encountered:

             

            ##########################################################################################

            [Creating New Folder]

            ##########################################################################################

            SecurityContextCredentials credential = new SecurityContextCredentials(securityContext);
            session  = repository.login(credential);

             

            Node root = session.getRootNode();
            Node folder= root.addNode("newfolder","nt:folder");

             


            dump(root);   //dump() is a custom function written to dump the repository nodes and its properties

             

            ============================Output of dump(root)================================

            /

            /jcr:primaryType = mode:root

            /jcr:uuid = cafebabe-cafe-babe-cafe-babecafebabe

            /jcr:system

            /newfolder

            /newfolder/jcr:createdBy = admin

            /newfolder/jcr:primaryType = nt:folder

            /newfolder/jcr:created = 2010-08-28T02:19:53.479+08:00

             

            ** Please note that at this point the new nt:folder "newfolder" exists only in memory and has not yet been persisted to the file system **

            ============================================================================

             

             

            session.save();     //So, now trying to persist it to the filesystem

             

            ======================Error after session.save()====================================

            org.modeshape.graph.connector.RepositorySourceException: Attempt to set or update invalid property names: [jcr:createdBy]

            at org.modeshape.connector.filesystem.FileSystemSource$StandardPropertiesFactory.ensureValidProperties(FileSystemSource.java:696)
            at org.modeshape.connector.filesystem.FileSystemSource$StandardPropertiesFactory.recordDirectoryProperties(FileSystemSource.java:637)
            at org.modeshape.connector.filesystem.FileSystemWorkspace.putNode(FileSystemWorkspace.java:253)

            etc....

             

            ** Please note that, in spite of this exception, the new "newfolder" is still created in the file system

            =============================================================================

             

            session.logout();

             

             

             

            ##########################################################################################

            [Creating  New File]

            ##########################################################################################

            SecurityContextCredentials credential = new  SecurityContextCredentials(securityContext);
            session  =  repository.login(credential);

             

            Node root = session.getRootNode();
            Node file= root.addNode("newfile.txt","nt:file");
            Node fileContent = file.addNode("jcr:content","mode:resource");
            fileContent.setProperty("jcr:data", "testing");
            fileContent.setProperty("jcr:mimeType","text/plain");

             

            dump(root);   //dump() is a custom function written to dump the  repository nodes and its properties

             

            ============================Output of  dump(root)============================

            /
            /jcr:primaryType = mode:root
            /jcr:uuid = cafebabe-cafe-babe-cafe-babecafebabe
            /jcr:system
            /newfile.txt
            /newfile.txt/jcr:createdBy = admin
            /newfile.txt/jcr:primaryType = nt:file
            /newfile.txt/jcr:created = 2010-08-28T02:41:18.035+08:00
            /newfile.txt/jcr:content
            /newfile.txt/jcr:content/jcr:primaryType = mode:resource
            /newfile.txt/jcr:content/jcr:mimeType = text/plain
            /newfile.txt/jcr:content/jcr:data = testing

             

            ** Please note that at this point the new nt:file "newfile.txt" exists only in memory and has not yet been persisted to the file system **

            ========================================================================

             

            session.save();     //So, now trying to persist it to the filesystem

             

            ============================Error after  session.save()=========================

            org.modeshape.graph.connector.RepositorySourceException: Attempt to set or update invalid property names: [jcr:createdBy]
            at org.modeshape.connector.filesystem.FileSystemSource$StandardPropertiesFactory.ensureValidProperties(FileSystemSource.java:696)
            at org.modeshape.connector.filesystem.FileSystemSource$StandardPropertiesFactory.recordFileProperties(FileSystemSource.java:652)
            at org.modeshape.connector.filesystem.FileSystemWorkspace.putNode(FileSystemWorkspace.java:173)
            at org.modeshape.graph.connector.base.PathWorkspace$PutCommand.apply(PathWorkspace.java:258)
            at org.modeshape.graph.connector.base.PathWorkspace.commit(PathWorkspace.java:192)
            at org.modeshape.graph.connector.base.PathTransaction$WorkspaceChanges.commit(PathTransaction.java:830)

            etc ...

             

            ** Please note that, in spite of this exception, an empty "newfile.txt" file (0 bytes) is still created in the file system

            =======================================================================

             

             

             

                        It seems that an additional property, "jcr:createdBy", is attached automatically by the system when we create new nt:file or nt:folder nodes, and it causes an exception when we try to persist the nodes using session.save(). I tried removing it manually with node.getProperty("jcr:createdBy").remove(). Although dump(root) then shows that "jcr:createdBy" is gone from memory, session.save() still fails with exactly the same exception, as if the property had never been removed. In other words, "jcr:createdBy = admin" seems to reappear during session.save() even though Property.remove() appeared to delete it.

             

                       I will have to sort this out before moving on to the CustomPropertiesFactory implementation. Thank you for your help.

            • 3. Re: How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit
              rhauch

              Hmm... that's definitely a bug. Not sure why we're not seeing that in our integration tests. But the connector should be dealing with the 'jcr:createdBy' property much better, and your code shouldn't have to do a thing.

               

              Would you care to file a new JIRA for this? I should be able to commit a fix into trunk in the next few days.

              • 4. Re: How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit
                penkween

                Hi Randall,

                 

                I just created the JIRA issue [MODE-866] at https://jira.jboss.org/browse/MODE-866 . I have also attached all the source files used (Main.java and its configRepository.xml) to the JIRA issue and also here. Thanks for your help.

                • 5. Re: How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit
                  rhauch

                  Thanks, Danny! I'll update the issue and this thread when I have a fix.

                  • 6. Re: How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit
                    rhauch

                    I just committed a fix to SVN in trunk. See MODE-866 for details (including the patch).

                     

                    I've marked the defect as resolved, since I was able to replicate it in a new integration test, and after the fix the test passes without error. If you test this and still have problems, please reopen the issue.

                     

                    Thanks! Can't wait to hear how this fix works for you.

                     

                    Best regards

                    • 7. Re: How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit
                      penkween

                      Hi Randall,

                       

                      Thanks for your quick fix and the detailed article at http://modeshape.wordpress.com/2010/09/01/custom-properties-on-ntfile-and-ntfolder-nodes/ . Last week I learned some of the basics of CustomPropertiesFactory, as advised in your previous post. So now, together with this fix, we can store and fetch custom file system metadata. However, I am not sure whether I have chosen the right metadata persistence strategy, one in line with ModeShape's strengths.

                       

                              Basically, by using CustomPropertiesFactory we override how each node's properties are written and read, and in doing so we manage to add extra metadata such as "author" to a file as custom properties. Below are some ideas for a metadata persistence strategy:

                       

                      1. [Per file metadata]

                          [Root]

                           /Folder1

                                  /picture1.jpg

                                  /picture1.jpg._dat  (simple key-value pairs, e.g. author="Danny", read and written by the CustomPropertiesFactory class)

                                  /picture2.jpg

                                  /picture2.jpg._dat  (simple key-value pairs, e.g. author="John29", read and written by the CustomPropertiesFactory class)

                       

                      2. [Per directory metadata]

                          [Root]

                           /Folder1

                                  /picture1.jpg

                                  /picture2.jpg

                                  /filemetadata._dat  (simple key-value pairs, e.g. picture1.jpg.author="danny", picture2.jpg.author="John29", read and written by the CustomPropertiesFactory class)
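To make the second strategy concrete, here is a minimal plain-Java sketch of a per-directory metadata file. It is independent of the ModeShape API, so the class name `DirectoryMetadataStore` and its methods are hypothetical. One deliberate difference from the layout above: keys use "/" rather than "." between the file name and the property name, because "/" cannot occur in a file name, so "picture1.jpg" and "picture1.jpg.bak" can never be confused.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper: one metadata file per directory, keyed by "fileName/property". */
public class DirectoryMetadataStore {

    static final String METADATA_FILE = "filemetadata._dat";

    private static Properties load(File dir) throws IOException {
        Properties all = new Properties();
        File store = new File(dir, METADATA_FILE);
        if (store.isFile()) {
            try (InputStream in = new FileInputStream(store)) {
                all.load(in);
            }
        }
        return all;
    }

    /** Sets one property of one file; the whole metadata file is rewritten. */
    public static void set(File dir, String fileName, String property, String value)
            throws IOException {
        Properties all = load(dir);
        all.setProperty(fileName + "/" + property, value);
        try (OutputStream out = new FileOutputStream(new File(dir, METADATA_FILE))) {
            all.store(out, null);
        }
    }

    /** Returns all properties recorded for the given file in this directory. */
    public static Map<String, String> get(File dir, String fileName) throws IOException {
        String prefix = fileName + "/";
        Map<String, String> result = new TreeMap<>();
        Properties all = load(dir);
        for (String key : all.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                result.put(key.substring(prefix.length()), all.getProperty(key));
            }
        }
        return result;
    }
}
```

Compared with one sidecar per file, this keeps the directory listing tidy, but every update rewrites the shared file, so concurrent writers to the same folder would need some form of locking that the sketch does not provide.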

                       

                       

                       

                              While this kind of persistence strategy may allow easy replication across multiple servers for scalability, will it suffer from query performance issues? For example, suppose I want to search for all files whose author is "danny". How could I set this up under ModeShape so that node query performance is not greatly affected?

                       

                              Or should I store the metadata (the nodes' custom properties) in a high-performance distributed key-value cache store such as Infinispan (using the ModeShape Infinispan connector), and make my CustomPropertiesFactory read and write each file's metadata from Infinispan while fetching the file's content through the nt:file resource?

                       

                      Thanks.

                      • 8. Re: How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit
                        rhauch

                        The CustomPropertiesFactory mechanism is only used for persisting those properties not already defined on "nt:file" and "nt:folder". This will thus affect the reading and writing of this information to disk, but generally these extra properties will be small and have very little impact on performance. Plus, like you suggest, it makes it very easy to replicate.

                         

                        Executing queries never directly uses the connectors, but instead queries are executed by operating against internal Lucene indexes maintained by the engine. Certainly these indexes do need to be kept up to date as your content changes, but this is done automatically by ModeShape. Sometimes, this maintenance (initial populating or updates due to changes) may result in the engine reading the content via the connectors, and connector read performance will have an impact.

                         

                        Thus, using CustomPropertiesFactory will have no impact on query execution, and only negligible impact on index maintenance (assuming the extra properties are small and inexpensive to read and write).
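The separation between storage and query can be illustrated with a toy index. ModeShape itself uses Lucene, so the class below is only a stand-in that shows the principle: the stored property is read when content changes, and queries are answered from the index alone. The class name `AuthorIndex`, its methods, and the case-insensitive matching are assumptions made for this sketch.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy stand-in for a search index; ModeShape itself maintains Lucene indexes. */
public class AuthorIndex {

    private final Map<String, SortedSet<String>> pathsByAuthor = new HashMap<>();
    private final Map<String, String> authorByPath = new HashMap<>();

    /** Called when content changes; this is the only time the stored value is needed. */
    public void update(String path, String author) {
        remove(path);
        if (author != null) {
            String key = author.toLowerCase(Locale.ROOT); // assumed: case-insensitive match
            authorByPath.put(path, key);
            pathsByAuthor.computeIfAbsent(key, k -> new TreeSet<>()).add(path);
        }
    }

    /** Drops a path from the index, for example when the file is deleted. */
    public void remove(String path) {
        String old = authorByPath.remove(path);
        if (old != null) {
            Set<String> paths = pathsByAuthor.get(old);
            paths.remove(path);
            if (paths.isEmpty()) {
                pathsByAuthor.remove(old);
            }
        }
    }

    /** Answers the query from the index alone; no file or sidecar is opened. */
    public SortedSet<String> findByAuthor(String author) {
        SortedSet<String> hit = pathsByAuthor.get(author.toLowerCase(Locale.ROOT));
        return hit == null ? new TreeSet<>() : new TreeSet<>(hit);
    }
}
```

In this picture, how the "author" value is persisted (per-file file, per-directory file, or another store) only affects the cost of `update`, never the cost of `findByAuthor`.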

                        • 9. Re: How to use ModeShape 2.2.0 to store custom filesystem metadatas like JackRabbit
                          penkween

                          OK, I understand what you mean. We shall implement and test our project using ModeShape. With federation and connectors, it is simply fantastic. Thanks.