5 Minutes Tutorial for CMS Adapter
The CMS Adapter component acts as a bridge between content management systems and Apache Stanbol. Please note that all components of Apache Stanbol also provide RESTful services, which allow them to be accessed directly from outside. CMS Adapter interacts with content management systems through the JCR and CMIS specifications. In other words, any content repository compliant with the JCR or CMIS specification can make use of CMS Adapter functionalities. For the time being, CMS Adapter offers two main functionalities: "Bidirectional Mapping" and "Contenthub Feed".
Note: The URLs given in the curl commands and links are valid as long as the full launcher of Stanbol is started with the default configuration. In other words, it is assumed that the root URL of Stanbol is http://localhost:8080.
Session Management
To be able to use CMS Adapter features, it is necessary to get a session key beforehand. While obtaining this key, CMS Adapter caches a JCR/CMIS session, which is reused whenever the generated session key is passed to subsequent operations that require interaction with the content repository. A session key can be obtained through the REST services as follows:
curl -X GET -H "Accept: text/plain" "http://localhost:8080/cmsadapter/session?repositoryURL=rmi://localhost:1099/crx&workspaceName=demo&username=admin&password=admin&connectionType=JCR"
In this example a session key is obtained for a JCR compliant repository. CMS Adapter either uses the RMI protocol to get a session from a JCR repository or tries to access the repository via a URL, so the RMI endpoint or URL of the repository is specified in the repositoryURL parameter. The target workspace in the repository is specified as well, together with the username and password used to access it. CMIS repositories are accessed through the AtomPub binding, so for them the repository URL should be specified with this protocol in mind.
Apart from retrieving a session key by providing the connection parameters one by one, as in the RESTful example, the Java API of CMS Adapter also allows obtaining a session key from an already available session object through the SessionManager service. This is the more convenient way to obtain a session key when CMS Adapter is used through its Java API.
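The request above can be wrapped in a small shell function so that the key is captured once and reused. The following is a minimal sketch, not part of Stanbol: the names STANBOL_ROOT and get_session_key are illustrative, and it assumes that the response body contains only the session key as plain text, as the Accept: text/plain header suggests. It relies on curl's -G and --data-urlencode options so that parameter values are percent-encoded.

```shell
# Root URL of the Stanbol launcher; the default matches the tutorial's assumption.
# STANBOL_ROOT is an illustrative variable name, not a Stanbol setting.
STANBOL_ROOT="${STANBOL_ROOT:-http://localhost:8080}"

# Hypothetical helper.
# Usage: get_session_key REPOSITORY_URL WORKSPACE USERNAME PASSWORD CONNECTION_TYPE
get_session_key() {
    # -G sends a GET request and appends the --data-urlencode pairs to the
    # URL as a query string, percent-encoding each value.
    curl -s -G -H "Accept: text/plain" \
        --data-urlencode "repositoryURL=$1" \
        --data-urlencode "workspaceName=$2" \
        --data-urlencode "username=$3" \
        --data-urlencode "password=$4" \
        --data-urlencode "connectionType=$5" \
        "$STANBOL_ROOT/cmsadapter/session"
}
```

With a running launcher, SESSION_KEY=$(get_session_key "rmi://localhost:1099/crx" demo admin admin JCR) would store the key in a shell variable for use in later requests.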
Bidirectional Mapping
This feature provides bidirectional mappings between JCR/CMIS compliant content repositories and external RDF data. Using this feature it is possible to generate RDF data from a content repository or to populate a content repository based on external RDF data.
This functionality is realized by a two-step process that executes the RDFBridge and RDFMapper services of CMS Adapter in sequence. When a content repository is updated based on external RDF data, in the first step the given raw RDF data is annotated with standard terms by RDFBridge. These terms, which are few in number, are described in the CMS Vocabulary section. In the second step RDFMapper processes the annotated RDF and updates the content repository accordingly. In the other direction, in the first step the content repository structure is transformed into RDF annotated with the CMS Vocabulary terms by RDFMappers. In the second step RDFBridges add implementation specific annotations.
From one perspective, the bidirectional mapping feature makes it possible to exploit linked open data, which is already available on the web, in content management systems. Apart from the RDF data already available on the web, any RDF data can be mapped to a content repository. By mapping external RDF data, existing content repository items can be updated or new ones can be created.
This service is also available through the RESTful API. A curl command such as the following maps the RDF data given in the rdfData file to the content repository. The mapping is done according to the configurations of the default RDF Bridge implementation, which can be changed through the Apache Stanbol CMS Adapter Default RDF Bridge Configurations entry in the Configuration panel of the Apache Felix Web Console. It is possible to define several configurations, and all of them are processed during the transformation. If there is no configuration, the provided RDF is processed by RDFMapper directly.
curl -i -X POST --data-urlencode "serializedGraph@rdfData" --data "sessionKey=9ef42d3c-aaa3-494f-ba6f-c247a58ac2db" http://localhost:8080/cmsadapter/map/rdf
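For scripted use, the same request can be parameterized by session key and file name. This is a hedged sketch rather than a Stanbol utility: map_rdf_to_cms and STANBOL_ROOT are illustrative names, and the readability check is an addition of this sketch so that a missing file is reported before any request is sent.

```shell
# Root URL of the Stanbol launcher (illustrative variable name).
STANBOL_ROOT="${STANBOL_ROOT:-http://localhost:8080}"

# Hypothetical helper.
# Usage: map_rdf_to_cms SESSION_KEY RDF_FILE
map_rdf_to_cms() {
    # Fail early if the RDF file is missing or unreadable.
    if [ ! -r "$2" ]; then
        echo "cannot read RDF file: $2" >&2
        return 1
    fi
    # The name@file form makes curl read the file and URL-encode its content
    # before sending it as the serializedGraph form parameter.
    curl -i -X POST \
        --data-urlencode "serializedGraph@$2" \
        --data-urlencode "sessionKey=$1" \
        "$STANBOL_ROOT/cmsadapter/map/rdf"
}
```

A call such as map_rdf_to_cms "$SESSION_KEY" rdfData then corresponds to the command shown above, with SESSION_KEY holding a previously obtained session key.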
This blog post from October 2011 describes the process of adding knowledge from DBpedia to a Nuxeo content management system, which is a CMIS compliant repository.
From the other perspective, thanks to this feature the repository structure can be transformed into RDF through the RESTful services as well. A curl command similar to the following one can be used to map the content repository structure into RDF format.
curl -i -X POST --data "sessionKey=9ef42d3c-aaa3-494f-ba6f-c247a58ac2db&baseURI=http://www.apache.org/stanbol/cms" http://localhost:8080/cmsadapter/map/cms
Note that the same RDF Bridge configurations are used in both directions of the bidirectional mapping. Also, baseURI is a mandatory parameter that is used as the base URI of the RDF to be generated.
An RDF representation of a content management system helps build semantic services on top of the existing system. The Contenthub component of Apache Stanbol can be used to provide semantic indexing and search functionalities based on the RDF representation of content repositories. That is, the RDF representation serves as a resource from which Contenthub populates its underlying semantic index.
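Because baseURI is mandatory, a wrapper script can check for it before contacting the server. The sketch below is illustrative only: map_cms_to_rdf and STANBOL_ROOT are hypothetical names, and it assumes that the generated RDF is returned in the response body, which the tutorial does not state explicitly.

```shell
# Root URL of the Stanbol launcher (illustrative variable name).
STANBOL_ROOT="${STANBOL_ROOT:-http://localhost:8080}"

# Hypothetical helper.
# Usage: map_cms_to_rdf SESSION_KEY BASE_URI OUTPUT_FILE
map_cms_to_rdf() {
    # baseURI is mandatory, so refuse to send a request without it.
    if [ -z "$2" ]; then
        echo "baseURI must not be empty" >&2
        return 1
    fi
    # -f makes curl exit with a non-zero status on HTTP error responses;
    # -o writes the response body (assumed to be the RDF) to OUTPUT_FILE.
    curl -s -f -X POST \
        --data-urlencode "sessionKey=$1" \
        --data-urlencode "baseURI=$2" \
        -o "$3" \
        "$STANBOL_ROOT/cmsadapter/map/cms"
}
```

For example, map_cms_to_rdf "$SESSION_KEY" "http://www.apache.org/stanbol/cms" repository.rdf would save the response to a file named repository.rdf, a file name chosen here purely for illustration.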
Contenthub Feed
The Contenthub feed feature aims to manage content repository items within the Contenthub component of Apache Stanbol. The management process includes only two types of operations: submit and delete.
Submission and deletion operations can be done based on the identifiers or paths of the content repository items. During the submission process, the properties of content repository items are collected and stored along with the actual content. This makes faceted search over the properties of items possible.
The RESTful API of CMS Adapter can be used to submit content repository items to Contenthub.
curl -i -X POST --data "sessionKey=9ef42d3c-aaa3-494f-ba6f-c247a58ac2db&path=/contenthubfeedtest&recursive=true" http://localhost:8080/cmsadapter/contenthubfeed
The previous curl command submits all content repository items under the /contenthubfeedtest path to the Contenthub. In a similar way, content items can be deleted from Contenthub. The following command deletes the content items corresponding to the content repository items under the /contenthubfeedtest path.
curl -i -X DELETE --data "sessionKey=9ef42d3c-aaa3-494f-ba6f-c247a58ac2db&path=/contenthubfeedtest&recursive=true" http://localhost:8080/cmsadapter/contenthubfeed
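Since the submit and delete requests differ only in their HTTP method, both can be expressed with one helper. This is a minimal sketch under stated assumptions: contenthub_feed and STANBOL_ROOT are illustrative names, and defaulting recursive to true simply mirrors the two commands above rather than any documented server default.

```shell
# Root URL of the Stanbol launcher (illustrative variable name).
STANBOL_ROOT="${STANBOL_ROOT:-http://localhost:8080}"

# Hypothetical helper.
# Usage: contenthub_feed submit|delete SESSION_KEY PATH [RECURSIVE]
contenthub_feed() {
    # Map the operation name to the HTTP method used in the tutorial.
    case "$1" in
        submit) method=POST ;;
        delete) method=DELETE ;;
        *) echo "unknown operation: $1" >&2; return 1 ;;
    esac
    # RECURSIVE defaults to true, as in the tutorial's example commands.
    curl -i -X "$method" \
        --data-urlencode "sessionKey=$2" \
        --data-urlencode "path=$3" \
        --data-urlencode "recursive=${4:-true}" \
        "$STANBOL_ROOT/cmsadapter/contenthubfeed"
}
```

With this helper, contenthub_feed submit "$SESSION_KEY" /contenthubfeedtest and contenthub_feed delete "$SESSION_KEY" /contenthubfeedtest correspond to the two commands shown above.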
CMS Vocabulary
This vocabulary aims to provide a standardized mapping between content repositories and RDF data. It includes a small number of essential terms needed to map RDF data to a content repository. In addition to general terms that apply to both JCR and CMIS repositories, there are also JCR or CMIS specific terms.
General Terms
- CMS_OBJECT: In RDF annotated with the CMS vocabulary, if a resource has this URI reference as the value of its rdf:type property, that resource represents a content repository item, e.g. a node in JCR compliant content repositories or an object in CMIS compliant content repositories.
- CMS_OBJECT_NAME: This URI reference represents the name of the content repository item.
- CMS_OBJECT_PATH: This URI reference represents the absolute path of the content repository item.
- CMS_OBJECT_PARENT_REF: This URI reference represents the item to be created as parent of the item having this property.
- CMS_OBJECT_HAS_URI: This URI reference represents the URI which is associated with the content repository item.
JCR Specific Properties
- JCR_PRIMARY_TYPE: This URI reference represents the primary node type of the content repository item associated with the resource within the RDF.
- JCR_MIXIN_TYPES: This URI reference represents the mixin types of the content repository item associated with the resource within the RDF.
CMIS Specific Properties
- CMIS_BASE_TYPE_ID: This URI reference represents the base type of the content repository item associated with the resource within the RDF.