This project has retired. For details please refer to its Attic page.
Apache Stanbol - EnhancementJobManager

EnhancementJobManager

The EnhancementJobManager is the component responsible for the execution of the ExecutionPlan as provided by the Enhancement Chain on the ContentItem.

EnhancementJobManager interface

The interface of the EnhancementJobManager is very simple:

/** Enhances the content item by using the default Chain */
+ enhanceContent(ContentItem ci)
/** Enhances the content item by using the passed Chain */
+ enhanceContent(ContentItem ci, Chain chain)

Note: the passed ContentItem is modified during the enhancement process. EnhancementEngines add extracted knowledge to the metadata of the content item, and additional content parts may also be added to the ContentItem.
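The two operations can be written as a small Java interface. This is only a sketch: ContentItemStub, ChainStub, JobManagerSketch, RecordingJobManager and the defaultChain() helper are hypothetical stand-ins introduced for illustration and are not the Stanbol types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ContentItem; the list plays the role of the metadata graph.
class ContentItemStub {
    final List<String> metadata = new ArrayList<>();
}

// Hypothetical stand-in for Chain; only the name is modelled.
class ChainStub {
    final String name;
    ChainStub(String name) { this.name = name; }
}

// The two operations described above, expressed as an interface.
interface JobManagerSketch {
    /** Enhances the content item by using the passed chain. */
    void enhanceContent(ContentItemStub ci, ChainStub chain);

    /** Assumed helper: the chain to use when the caller passes none. */
    ChainStub defaultChain();

    /** Enhances the content item by using the default chain. */
    default void enhanceContent(ContentItemStub ci) {
        enhanceContent(ci, defaultChain());
    }
}

// Toy implementation that only records which chain processed the item.
class RecordingJobManager implements JobManagerSketch {
    public ChainStub defaultChain() { return new ChainStub("default"); }

    public void enhanceContent(ContentItemStub ci, ChainStub chain) {
        // The passed content item is modified in place, as the note above warns.
        ci.metadata.add("enhanced-by:" + chain.name);
    }
}
```

Callers that need the original state of a content item would have to copy it before handing it to the job manager, because the item itself is changed.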

Enhancement Process

(Figure: Enhancement Job Manager Overview)

While the ExecutionPlan defines which EnhancementEngines are used and how they depend on each other, the EnhancementJobManager is responsible for the actual execution of the enhancement process based on this plan. This section describes the requirements and expectations that an implementation MUST take into account.

The EnhancementJobManager is also responsible for creating and updating the ExecutionMetadata in the metadata of the processed ContentItem. Details about this are provided in the section "Creation/Management of ExecutionMetadata" of the ExecutionMetadata documentation.

Initializing the Enhancement Process

Here one needs to distinguish two cases:

  1. Initialization of a new Enhancement process and
  2. Continuation of an existing Enhancement process.

The EnhancementJobManager can distinguish the two cases by checking whether a content part with the URI "http://stanbol.apache.org/ontology/enhancer/executionMetadata#ChainExecution" is present in the passed ContentItem.

In the first case, the ExecutionPlan to be used by the enhancement process is provided by the Chain as a final graph that is guaranteed not to change. However, because the configuration of a Chain might change at any time, the EnhancementJobManager MUST retrieve the execution plan only once and use it during the entire enhancement process. In addition, the ExecutionPlan MUST also be added to the graph containing the EnhancementMetadata. When continuing a previously aborted enhancement process, the ExecutionPlan MUST be initialized from the ExecutionMetadata provided by the ContentItem.

For details on how to initialize/load the execution metadata see the section "Creation/Management of ExecutionMetadata" of the ExecutionMetadata documentation.
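The distinction between the two cases, and the rule that the plan is read from the Chain only once, can be sketched as follows. The sketch makes strong simplifying assumptions: content parts are modelled as a plain map, the stored content part stands in for the whole metadata graph including the plan, and EnhancementJobInit with its methods is a hypothetical helper, not part of Stanbol.

```java
import java.util.Map;
import java.util.function.Supplier;

class EnhancementJobInit {
    // URI of the content part that holds the execution metadata (taken from the text above).
    static final String CHAIN_EXECUTION_URI =
        "http://stanbol.apache.org/ontology/enhancer/executionMetadata#ChainExecution";

    /** True if the content item already carries execution metadata (case 2). */
    static boolean isContinuation(Map<String, Object> contentParts) {
        return contentParts.containsKey(CHAIN_EXECUTION_URI);
    }

    /**
     * Returns the plan to use for the whole enhancement process.
     * For a new process the chain is asked exactly once and the result is stored
     * with the content item; for a continued process the stored plan is reused.
     */
    static Object resolvePlan(Map<String, Object> contentParts, Supplier<Object> chainPlan) {
        if (isContinuation(contentParts)) {
            return contentParts.get(CHAIN_EXECUTION_URI);
        }
        Object plan = chainPlan.get();               // single read of the chain's plan
        contentParts.put(CHAIN_EXECUTION_URI, plan); // keep it alongside the metadata
        return plan;
    }
}
```

Because the plan is captured once, a later reconfiguration of the chain does not affect a process that is already running or one that is resumed.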

Engine Execution

The ExecutionPlan provides the information needed to determine which EnhancementEngines can be executed in any given state. The following code shows how to determine the executable engines. The snippet is meant to be called after the execution of an EnhancementEngine has completed. Note that in a multi-threaded environment, access to the collections of executed and running engines needs to be synchronized.

Collection<NonLiteral> executed; // ExecutionNodes of already executed engines
Collection<NonLiteral> running;  // ExecutionNodes of currently running engines

// nodes whose dependencies have all been executed
Collection<NonLiteral> next = ExecutionPlanUtils.getExecuteable(executionPlan, executed);
for (NonLiteral node : next) {
    if (!running.contains(node)) {
        // look up the engine referenced by this execution node
        String engineName = EnhancementEngineHelper.getString(executionPlan, node, EX_ENGINE);
        EnhancementEngine engine = tracker.getEngine(engineName);
        if (engine != null) {
            // execute engine
        } else {
            // engine not available: check if optional and throw error if not
        }
    } // else already running -> ignore
}

NOTE that the NonLiterals contained in the two collections are 'ep:ExecutionNode' instances and NOT 'em:EngineExecution' instances. Each 'em:EngineExecution' instance in the ExecutionMetadata is linked by the 'em:executionNode' property to the corresponding 'ep:ExecutionNode' of the ExecutionPlan.
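The rule behind determining executable engines can be illustrated without the RDF graph. The sketch below assumes the plan is available as a simple map from node id to the ids it depends on; ExecutableNodes and its getExecutable method are hypothetical names for this illustration and do not reproduce the Stanbol utility.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class ExecutableNodes {
    /**
     * Returns the nodes that have not been executed yet and whose dependencies
     * have all been executed. Keys of dependsOn are node ids, values are the
     * ids of the nodes each key depends on.
     */
    static Set<String> getExecutable(Map<String, Set<String>> dependsOn, Set<String> executed) {
        Set<String> next = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> entry : dependsOn.entrySet()) {
            boolean notYetExecuted = !executed.contains(entry.getKey());
            boolean dependenciesDone = executed.containsAll(entry.getValue());
            if (notYetExecuted && dependenciesDone) {
                next.add(entry.getKey());
            }
        }
        return next;
    }
}
```

A job manager would still have to filter out nodes that are currently running, as the snippet above does with the running collection.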

Before executing an EnhancementEngine, the EnhancementJobManager needs to check if and how the engine can enhance a content item. This is indicated by the integer returned by the "canEnhance(ContentItem ci)" method.
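How a job manager might act on that integer is sketched below. The individual result codes are not listed here, so the constant names, their values and the three resulting actions are assumptions made for illustration; they should be checked against the EnhancementEngine API before being relied on.

```java
class CanEnhanceDispatch {
    // Assumed result codes; names and values are illustrative only.
    static final int CANNOT_ENHANCE = 0;
    static final int ENHANCE_SYNCHRONOUS = 1;
    static final int ENHANCE_ASYNC = 2;

    // Hypothetical actions a job manager could take for each result.
    enum Action { SKIP, RUN_BLOCKING, RUN_CONCURRENTLY }

    /** Maps the value returned by canEnhance(..) to the action to take. */
    static Action decide(int canEnhanceResult) {
        switch (canEnhanceResult) {
            case CANNOT_ENHANCE:
                return Action.SKIP;             // engine is not applicable to this item
            case ENHANCE_SYNCHRONOUS:
                return Action.RUN_BLOCKING;     // wait for the engine before continuing
            case ENHANCE_ASYNC:
                return Action.RUN_CONCURRENTLY; // engine may run alongside others
            default:
                throw new IllegalArgumentException(
                    "Unknown canEnhance result: " + canEnhanceResult);
        }
    }
}
```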

If the execution of an EnhancementEngine completes, the EnhancementJobManager needs to set the state of the execution to completed and update the execution metadata accordingly.

If a call to "computeEnhancement(ContentItem ci)" results in an Exception, the EnhancementJobManager must mark the execution of the engine as failed and record a description of the exception that occurred. If the execution of the affected engine was optional, the enhancement process continues. Otherwise the enhancement process needs to be stopped and the error needs to be rethrown by the "enhanceContent(..)" method.
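The completed/failed handling can be sketched as follows. The per-engine status map stands in for the 'em:EngineExecution' entries of the ExecutionMetadata, and EngineFailureHandling, its execute method and the status strings are hypothetical names chosen for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

class EngineFailureHandling {
    /** Per-engine state, standing in for the execution metadata. */
    final Map<String, String> status = new LinkedHashMap<>();

    /**
     * Runs one engine. Returns normally if the engine completed or if an
     * optional engine failed; rethrows the failure if the engine was required.
     */
    void execute(String engineName, boolean optional, Callable<Void> engine) throws Exception {
        try {
            engine.call();
            status.put(engineName, "completed");
        } catch (Exception e) {
            // Record the failure together with a description of the exception.
            status.put(engineName, "failed: " + e.getMessage());
            if (!optional) {
                throw e; // required engine: stop the enhancement process
            }
            // optional engine: the enhancement process continues
        }
    }
}
```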

For all the details on how to reflect state changes in the Execution metadata see this section of the documentation of the ExecutionMetadata.

Multi Threaded enhancement processes

If the EnhancementJobManager supports calling EnhancementEngines for the same content item simultaneously in multiple threads, it is important to correctly use the ReadWriteLock provided by the ContentItem.getLock() method.

There are many good examples of how to correctly use "java.util.concurrent.locks.ReadWriteLock" available on the web.
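A minimal sketch of the usual locking pattern is shown below. The LockedMetadata class is hypothetical: its internal lock plays the role of the lock returned by ContentItem.getLock(), and the list of strings stands in for the metadata graph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockedMetadata {
    // Plays the role of the lock returned by ContentItem.getLock().
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> triples = new ArrayList<>();

    /** Modifications take the write lock, which excludes all readers and other writers. */
    void add(String triple) {
        lock.writeLock().lock();
        try {
            triples.add(triple);
        } finally {
            lock.writeLock().unlock(); // always release, even if the update throws
        }
    }

    /** Reads take the read lock, so several readers can proceed at the same time. */
    int size() {
        lock.readLock().lock();
        try {
            return triples.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Releasing the lock in a finally block matters here because an engine that throws while holding the write lock would otherwise block every other engine working on the same content item.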

Finalizing the EnhancementProcess

When the execution has finished (whether it succeeded or failed), the EnhancementJobManager needs to ensure that the 'em:status' and 'em:completed' properties of the 'em:ChainExecution' instance are set. If the execution failed, 'em:statusMessage' should also be set and contain a message that describes the problem.
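The finalization step can be sketched with the chain execution represented as a plain map from property name to value. This is an illustration only: ChainExecutionFinalizer is a hypothetical helper, and the status values "completed" and "failed" are placeholders rather than the values defined by the ExecutionMetadata ontology.

```java
import java.time.Instant;
import java.util.Map;

class ChainExecutionFinalizer {
    /**
     * Sets the closing properties of a chain execution.
     * Pass null as failure when the execution succeeded.
     */
    static Map<String, String> finish(Map<String, String> chainExecution,
                                      Throwable failure, Instant now) {
        // Status and completion time are set in both the success and the failure case.
        chainExecution.put("em:status", failure == null ? "completed" : "failed");
        chainExecution.put("em:completed", now.toString());
        if (failure != null) {
            // The message should describe the problem that stopped the execution.
            chainExecution.put("em:statusMessage", String.valueOf(failure.getMessage()));
        }
        return chainExecution;
    }
}
```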

EnhancementJobManager implementations

EnhancementJobManager implementations need to register themselves as OSGi services. By default, the Stanbol Enhancer uses the implementation with the highest service ranking. The service ranking can be set by providing a configuration that defines an integer value for the property "service.ranking".

EventJobManager

This implementation is provided by the "org.apache.stanbol.enhancer.jobmanager.event" module and is currently used as default. It registers itself (by default) with a service ranking of '0'.

This implementation supports an asynchronous enhancement process by using the "org.osgi.service.event" framework.

WeightedJobManager

This JobManager was used as the default before the introduction of EnhancementChains. It does not support EnhancementChains and enhances passed ContentItems by calling all currently active EnhancementEngines sequentially. It also does not support EnhancementMetadata.

This implementation is provided by the "org.apache.stanbol.enhancer.jobmanager.weightedjobmanager" module and is no longer included in the Apache Stanbol launchers. This JobManager registers itself with a service ranking of "-1000". Users who want to use this job manager need to install the bundle manually and either deactivate the other EnhancementJobManager implementations or reconfigure the service ranking of this one to a value > 0.
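The effect of the rankings mentioned above can be sketched as follows. RankedService and ServiceSelection are hypothetical classes that only model the "highest service ranking wins" rule; the actual lookup is performed by the OSGi service registry.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical model of a registered job manager and its "service.ranking" value.
class RankedService {
    final String name;
    final int ranking;

    RankedService(String name, int ranking) {
        this.name = name;
        this.ranking = ranking;
    }
}

class ServiceSelection {
    /** Returns the service with the highest ranking, or empty if none is registered. */
    static Optional<RankedService> highestRanked(List<RankedService> services) {
        return services.stream().max(Comparator.comparingInt(s -> s.ranking));
    }
}
```

With the default rankings of 0 and -1000, the EventJobManager is selected; after reconfiguring the WeightedJobManager to a ranking above 0, it would be selected instead.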