This project has retired. For details please refer to its Attic page.
Apache Stanbol - List Chain

List Chain

The ListChain creates the ExecutionPlan based on the exact order of the configured EnhancementEngines. This gives users a simple way to configure the exact order in which the referenced EnhancementEngines are called during the enhancement process of a content item. However, the ListChain cannot execute engines in parallel, which is a considerable disadvantage compared to the GraphChain.

A typical usage scenario is that users start off by configuring a ListChain and later optimize the execution by migrating the working configuration to a GraphChain.

Configuration

The property "stanbol.enhancer.chain.list.enginelist" provides the list of engine names. Its value MUST be passed as an array of strings, because the order of the configured entries is essential for the configuration.

In addition, engines can be defined as optional. An optional engine does not cause the enhancement process to fail if it is not active or if it fails while processing a content item.

The syntax for defining an engine as optional is shown below. Both variants make the execution of the engine with the given name optional:

<name>;optional
<name>;optional=true
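
To illustrate the entry syntax, here is a minimal sketch of how such an entry could be split into an engine name and an optional flag. The function name `parse_engine_entry` is hypothetical and not part of Stanbol, and treating any value other than "true" as not optional is an assumption of this sketch.

```python
def parse_engine_entry(entry):
    """Split '<name>;optional' or '<name>;optional=true' into (name, optional).

    Hypothetical helper for illustration only; not Stanbol's own parser.
    """
    name, *params = [part.strip() for part in entry.split(";")]
    optional = False
    for param in params:
        key, _, value = param.partition("=")
        if key.strip() == "optional":
            # A bare 'optional' counts as true, as does 'optional=true'.
            # Assumption: any other value means the engine is required.
            optional = value.strip().lower() in ("", "true")
    return name, optional


print(parse_engine_entry("metaxa;optional"))
print(parse_engine_entry("metaxa;optional=true"))
print(parse_engine_entry("langid"))
```

The first two calls describe the same optional engine, while the third describes a required one.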

The following figure shows the configuration dialog for ListChains as provided by the Apache Felix Web Console.

Configuration dialog for the ListChain

It is also possible to configure a ListChain by directly installing a configuration file with the name "{classname}-{configName}.config". Note that {configName} need not be the same as the name of the chain. It is only used by the OSGi environment to distinguish different configurations for {classname}.

To create the same configuration as in the screenshot above, the file would need to look like this:

stanbol.enhancer.chain.name="list"
stanbol.enhancer.chain.list.enginelist=["metaxa;optional","langid","ner","dbpediaLinking"]
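
As a rough sketch, such a file body could also be generated from an ordered list of engine entries. The function `render_list_chain_config` is a hypothetical helper, and it assumes that neither the chain name nor the entries contain characters that need escaping.

```python
def render_list_chain_config(chain_name, engine_entries):
    """Render the two ListChain properties in the order the entries are given.

    Hypothetical helper for illustration; it performs no escaping.
    """
    engines = ",".join(f'"{entry}"' for entry in engine_entries)
    return (
        f'stanbol.enhancer.chain.name="{chain_name}"\n'
        f"stanbol.enhancer.chain.list.enginelist=[{engines}]\n"
    )


# A Python list keeps its order, which matches the ordering requirement.
config = render_list_chain_config(
    "list", ["metaxa;optional", "langid", "ner", "dbpediaLinking"]
)
print(config, end="")
```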

Enhancement Properties support

since 0.12.1

Starting with version 0.12.1, the List Chain supports the configuration of EnhancementProperties.

All EnhancementProperties configured with a Chain are written as RDF to the ExecutionPlan. Chain scoped properties are added directly to the ep:ExecutionPlan instance, while chain and engine scoped properties are added to the ep:ExecutionNode of the corresponding engine.

The following figure and listing provide an example.

ListChain including some Enhancement Properties

The figure shows the definition of three enhancement properties: one chain scoped property and two chain and engine scoped properties. First, the maximum number of suggestions is set to 5 on the chain scope. This is overridden by a specific configuration of the dbpedia-fst engine that sets this value to 10 for this engine. Finally, the dereferenced languages are set to English, German and French for the dbpedia-dereference engine.

The following listing shows the exact same configuration in the .config file format.

stanbol.enhancer.chain.name="list"
stanbol.enhancer.chain.list.enginelist=["tika;optional","langdetect","opennlp-sentence","opennlp-token","opennlp-pos","opennlp-chunker",
    "dbpedia-fst;\ enhancer.max-suggestions\=10",
    "dbpedia-dereference;\ enhancer.engine.dereference.languages\=en,de,fr"]
stanbol.enhancer.chain.chainproperties=["enhancer.max-suggestions\=5"]
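
The override behaviour described above can be sketched as a lookup that prefers engine scoped values over chain scoped ones. This is only an illustration: the function names are hypothetical, values are kept as plain strings, and the entries are assumed to have already been read from the file, so the backslash escapes from the listing no longer appear.

```python
def parse_entry(entry):
    """Split '<name>;key=value;...' into (name, properties)."""
    name, *params = [part.strip() for part in entry.split(";")]
    props = {}
    for param in params:
        key, sep, value = param.partition("=")
        # A parameter without '=' (such as 'optional') is stored as "true".
        props[key.strip()] = value.strip() if sep else "true"
    return name, props


def effective_properties(chain_props, engine_entries):
    """Combine chain scoped and engine scoped properties per engine."""
    result = {}
    for entry in engine_entries:
        name, props = parse_entry(entry)
        props.pop("optional", None)  # 'optional' is not an enhancement property
        merged = dict(chain_props)   # start with the chain scoped values
        merged.update(props)         # engine scoped values take precedence
        result[name] = merged
    return result


chain_props = {"enhancer.max-suggestions": "5"}
entries = [
    "tika;optional",
    "langdetect",
    "dbpedia-fst; enhancer.max-suggestions=10",
    "dbpedia-dereference; enhancer.engine.dereference.languages=en,de,fr",
]
effective = effective_properties(chain_props, entries)
print(effective["langdetect"])
print(effective["dbpedia-fst"])
print(effective["dbpedia-dereference"])
```

In this sketch, dbpedia-fst ends up with a maximum of 10 suggestions, while every other engine keeps the chain scoped value of 5.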

Calculation of the ExecutionPlan

The ExecutionPlan is created based on the exact order of the EnhancementEngines provided by the "stanbol.enhancer.chain.list.enginelist" property. The configuration MUST contain at least one engine. In addition, an engine MUST NOT be listed more than once.
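
The two constraints can be checked with a few lines of code. The following is a minimal sketch with a hypothetical function name, not Stanbol's actual implementation. It returns the engine names in execution order and raises an error for an empty list or a repeated engine.

```python
def build_execution_order(engine_entries):
    """Return engine names in configured order, enforcing the two constraints."""
    if not engine_entries:
        raise ValueError("the engine list must contain at least one engine")
    order = []
    seen = set()
    for entry in engine_entries:
        # Everything before the first ';' is the engine name.
        name = entry.split(";", 1)[0].strip()
        if not name:
            raise ValueError(f"entry {entry!r} does not contain an engine name")
        if name in seen:
            raise ValueError(f"engine {name!r} is listed more than once")
        seen.add(name)
        order.append(name)
    return order


print(build_execution_order(["metaxa;optional", "langid", "ner", "dbpediaLinking"]))
```

The check compares engine names only, so this sketch also rejects "ner" and "ner;optional" appearing in the same list.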