Execution Plan
The ExecutionPlan is represented as an RDF graph following the ExecutionPlan ontology. It needs to be provided by the Enhancement Chain and is used by the EnhancementJobManager to enhance ContentItems and to write the ExecutionMetadata.
ExecutionPlan Ontology
The RDFS schema used for the execution plan is defined as follows:
- Namespace: ep : http://stanbol.apache.org/ontology/enhancer/executionplan#
- ep:ExecutionPlan : Represents an execution plan defined by all linked execution nodes.
- ep:hasExecutionNode (domain: ep:ExecutionPlan; range: ep:ExecutionNode; inverseOf: ep:inExecutionPlan): links the execution plan with all the execution nodes.
- ep:chain (domain: ep:ExecutionPlan; range: xsd:string): The name of the chain this execution plan is used for.
- ep:ExecutionNode : Class used for all Nodes representing the execution of an Enhancement Engine.
- ep:inExecutionPlan (domain: ep:ExecutionNode; range: ep:ExecutionPlan; inverseOf: ep:hasExecutionNode): Functional property that links the execution node with an execution plan.
- ep:engine (domain: ep:ExecutionNode; range: xsd:string): Links the execution node to the Enhancement Engine by the name of the engine.
- ep:dependsOn (domain: ep:ExecutionNode; range: ep:ExecutionNode): Defines that the execution of this node depends on the completion of the referenced one.
- ep:optional (domain: ep:ExecutionNode; range: xsd:boolean): Can be used to specify that the execution of this EnhancementEngine is optional. If this property is set to TRUE, the engine will be marked as executed even if its execution was not possible (e.g. because no engine with this name was active) or the execution failed (e.g. because of an exception).
Note: the data for the ep:ExecutionPlan and the ep:hasExecutionNode/ep:inExecutionPlan properties typically need not be provided as part of the configuration of a Chain. This information is typically added automatically, based on the assumption that all ep:ExecutionNodes provided in the configuration of a chain are members of the execution plan for that chain. The chain itself therefore adds this information when the configuration is parsed and validated.
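The automatic completion described in the note can be sketched as follows. This is only an illustration under stated assumptions: the Triple class is a minimal stand-in and not a Stanbol or Clerezza type, and the class and method names (PlanMembershipSketch, addPlanMembership) are hypothetical. Only the namespace and the property names are taken from the ontology above.

```java
import java.util.ArrayList;
import java.util.List;

public class PlanMembershipSketch {

    // Namespace of the ExecutionPlan ontology as listed above
    static final String EP = "http://stanbol.apache.org/ontology/enhancer/executionplan#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /** Minimal stand-in for an RDF triple (assumption, not a Stanbol type). */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return subject + " " + predicate + " " + object;
        }
    }

    /**
     * Creates the plan resource and links every configured node to it in
     * both directions (ep:hasExecutionNode and ep:inExecutionPlan).
     */
    static List<Triple> addPlanMembership(String planUri, List<String> nodeUris) {
        List<Triple> triples = new ArrayList<>();
        triples.add(new Triple(planUri, RDF_TYPE, EP + "ExecutionPlan"));
        for (String node : nodeUris) {
            triples.add(new Triple(planUri, EP + "hasExecutionNode", node));
            triples.add(new Triple(node, EP + "inExecutionPlan", planUri));
        }
        return triples;
    }

    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>();
        nodes.add("urn:node1");
        nodes.add("urn:node2");
        for (Triple t : addPlanMembership("urn:execPlan", nodes)) {
            System.out.println(t);
        }
    }
}
```

With this approach a chain configuration only has to describe the nodes themselves (engine, dependencies, optional flag); the membership triples are derived from the set of configured nodes.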
Example
This example shows an ExecutionPlan with the nodes for the "langId", "ner", "dbpediaLinking", "geonamesLinking" and "zemanta" engines. Note that these names refer to actual EnhancementEngine services registered with the current OSGi environment.
This example assumes that:
- "langId" is the singleton instance of LangIdEnhancementEngine
- "ner" is the default instance of the NamedEntityExtractionEnhancementEngine
- "dbpediaLinking" is an instance of the NamedEntityTaggingEngine configured to use the dbpedia.org ReferencedSite of the Entityhub
- "geonamesLinking" is an instance of the NamedEntityTaggingEngine configured to use the geonames.org ReferencedSite
- "zemanta" is the singleton instance of the ZemantaEnhancementEngine
The RDF graph of such a chain would look as follows:

    urn:execPlan
        rdf:type ep:ExecutionPlan
        ep:hasExecutionNode urn:node1, urn:node2, urn:node3, urn:node4, urn:node5
        ep:chain "demoChain"

    urn:node1
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:engine "langId"

    urn:node2
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:dependsOn urn:node1
        ep:engine "ner"

    urn:node3
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:dependsOn urn:node2
        ep:engine "dbpediaLinking"

    urn:node4
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:dependsOn urn:node2
        ep:engine "geonamesLinking"

    urn:node5
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:engine "zemanta"
        ep:optional "true"^^xsd:boolean
This plan defines that the "langId" and "zemanta" engines do not depend on anything and can therefore be executed from the start (even in parallel, if the JobManager executing the chain supports this). The execution of the "ner" engine depends on the detection of the language, and the entity linking to dbpedia and geonames depends on the "ner" engine. Note that "dbpediaLinking" and "geonamesLinking" could also be executed in parallel.
ExecutionPlan Utility
The Enhancer MUST also define a utility that provides the following:

    /** Getter for the list of executable ep:ExecutionNodes */
    + getExecuteable(Graph executionPlan, Set<NonLiteral> completed) : Collection<NonLiteral>
This method takes an execution plan and the set of already completed nodes as input and returns the ExecutionNodes that can be executed next: those that are not yet completed and whose ep:dependsOn references are all completed. The existing utility methods within the EnhancementEngineHelper can be used to retrieve further information from the ep:ExecutionNodes returned by this method.
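The selection logic can be sketched on plain Java collections, without an RDF graph. This is a minimal sketch, not the actual utility: the class ExecutableNodesSketch and its methods are hypothetical stand-ins, nodes are represented by their URI strings, and the dependencies follow the prose description of the example (both linking engines depend on "ner").

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ExecutableNodesSketch {

    /** Node -> nodes it depends on (stand-in for the ep:dependsOn links). */
    static Map<String, Set<String>> examplePlan() {
        Map<String, Set<String>> plan = new LinkedHashMap<>();
        plan.put("urn:node1", Collections.<String>emptySet());     // langId
        plan.put("urn:node2", Collections.singleton("urn:node1")); // ner
        plan.put("urn:node3", Collections.singleton("urn:node2")); // dbpediaLinking
        plan.put("urn:node4", Collections.singleton("urn:node2")); // geonamesLinking
        plan.put("urn:node5", Collections.<String>emptySet());     // zemanta
        return plan;
    }

    /**
     * Returns all nodes that are not yet completed and whose dependencies
     * are all contained in the completed set (sorted for stable output).
     */
    static Set<String> getExecutable(Map<String, Set<String>> dependsOn,
                                     Set<String> completed) {
        Set<String> executable = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : dependsOn.entrySet()) {
            boolean pending = !completed.contains(entry.getKey());
            if (pending && completed.containsAll(entry.getValue())) {
                executable.add(entry.getKey());
            }
        }
        return executable;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> plan = examplePlan();
        Set<String> completed = new HashSet<>();
        Set<String> next;
        // Each iteration prints one group of nodes that could run in parallel
        while (!(next = getExecutable(plan, completed)).isEmpty()) {
            System.out.println(next);
            completed.addAll(next);
        }
    }
}
```

For the example plan, the first call returns the nodes for "langId" and "zemanta", the second the node for "ner", and the third the two linking nodes. An empty result ends the loop; note that in this sketch a cyclic dependency would also produce an empty result before all nodes are completed, so a caller may want to compare the completed set against the full plan.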
The code using this utility will look like this (pseudo code):
    Graph executionPlan = chain.getExecutionPlan();
    Map<String, EnhancementEngine> engines = enhancementEngineManager.getActiveEngines(chain);
    // nodes that have already been processed
    Set<NonLiteral> completed = new HashSet<NonLiteral>();
    Collection<NonLiteral> next;
    // continue until no further node can be executed
    while (!(next = ExecutionPlanUtils.getExecuteable(executionPlan, completed)).isEmpty()) {
        for (NonLiteral node : next) {
            EnhancementEngine engine = engines.get(
                EnhancementEngineHelper.getString(executionPlan, node, EX_ENGINE));
            Boolean optional = EnhancementEngineHelper.get(
                executionPlan, node, EX_OPTIONAL, Boolean.class, literalFactory);
            /* Execute the Engine */
            completed.add(node);
        }
    }
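The "Execute the Engine" step has to take the ep:optional flag into account. One possible way to handle it is sketched below. Everything in this block is an assumption made for illustration: the Engine interface is a simplified stand-in for the real EnhancementEngine, and Outcome and executeNode are hypothetical names. The only behaviour taken from the ontology is that an optional node counts as executed even if its engine is not active or fails.

```java
import java.util.HashMap;
import java.util.Map;

public class OptionalEngineSketch {

    /** Simplified stand-in for an enhancement engine (not the Stanbol interface). */
    interface Engine {
        void enhance(StringBuilder content) throws Exception;
    }

    /** Both outcomes allow the node to be added to the completed set. */
    enum Outcome { EXECUTED, SKIPPED }

    /**
     * Runs the engine of one execution node. A missing or failing engine is
     * tolerated for optional nodes and reported as SKIPPED; for required
     * nodes it aborts the processing with an exception.
     */
    static Outcome executeNode(String engineName, boolean optional,
                               Map<String, Engine> activeEngines,
                               StringBuilder content) {
        try {
            Engine engine = activeEngines.get(engineName);
            if (engine == null) {
                throw new IllegalStateException("engine '" + engineName + "' is not active");
            }
            engine.enhance(content);
            return Outcome.EXECUTED;
        } catch (Exception e) {
            if (optional) {
                return Outcome.SKIPPED; // ep:optional is true: mark as executed anyway
            }
            throw new IllegalStateException(
                "required engine '" + engineName + "' could not be executed", e);
        }
    }

    public static void main(String[] args) {
        Map<String, Engine> engines = new HashMap<>();
        engines.put("langId", c -> c.append(" [lang=en]"));
        StringBuilder content = new StringBuilder("some text");
        System.out.println(executeNode("langId", false, engines, content));
        System.out.println(executeNode("zemanta", true, engines, content));
    }
}
```

In the loop above, a node would be added to the completed set for either outcome, while the exception thrown for a required engine would end the enhancement process for the ContentItem.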