Enhancement Engines and their main features

This page provides an overview of all Enhancement Engine implementations managed by the Apache Stanbol community.

Preprocessing

Natural Language Processing (NLP)

This category contains Engines that process textual content sent to the Stanbol Enhancer.

Language Detection

Language detection engines add Language annotations, as defined by STANBOL-613, to the metadata of the ContentItem.

Sentence Detection

Sentence detection engines add Sentences to the AnalyzedText content part.

Tokenizer Engines

Tokenizer Engines are responsible for adding Tokens to the AnalyzedText content part.
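To make the idea of Sentences and Tokens as spans over the text more concrete, the following is a minimal sketch that uses the JDK's `java.text.BreakIterator`. It does not use the Stanbol API: the `SpanSketch` and `Span` classes are hypothetical stand-ins for the AnalyzedText content part, assumed here to hold typed spans with character offsets.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SpanSketch {

    // Hypothetical stand-in for a span of the analyzed text; not a Stanbol class.
    static final class Span {
        final String type; // "Sentence" or "Token"
        final int start;   // inclusive character offset
        final int end;     // exclusive character offset

        Span(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }

        String text(String content) {
            return content.substring(start, end);
        }
    }

    // Detects sentence spans with the JDK sentence break iterator.
    static List<Span> sentences(String content, Locale locale) {
        return spans("Sentence", BreakIterator.getSentenceInstance(locale), content);
    }

    // Detects token spans with the JDK word break iterator.
    static List<Span> tokens(String content, Locale locale) {
        return spans("Token", BreakIterator.getWordInstance(locale), content);
    }

    private static List<Span> spans(String type, BreakIterator it, String content) {
        List<Span> result = new ArrayList<>();
        it.setText(content);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            // Skip segments that consist only of whitespace.
            if (!content.substring(start, end).trim().isEmpty()) {
                result.add(new Span(type, start, end));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String content = "Stanbol rocks. It works.";
        for (Span s : sentences(content, Locale.ENGLISH)) {
            System.out.println(s.type + " [" + s.start + "," + s.end + "): " + s.text(content));
        }
        for (Span t : tokens(content, Locale.ENGLISH)) {
            System.out.println(t.type + " [" + t.start + "," + t.end + "): " + t.text(content));
        }
    }
}
```

Keeping only offsets in each span, rather than copies of the text, lets later engines (for example POS taggers or chunkers) attach annotations to the same spans without duplicating content.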

Part of Speech (POS) Tagging

POS tagging engines add Part-of-Speech annotations to Tokens present in the AnalyzedText content part.

Chunk/Phrase detection

Chunker (or Phrase Detection) Engines add detected Chunks to the AnalyzedText content part. They also annotate the added Chunks with the type of the detected phrase.

Named Entity Recognition (NER) Engines

NER engines need to write detected Named Entities as 'fise:TextAnnotation's to the metadata of the ContentItem. In addition, they may also add NER annotations to Chunks in the AnalyzedText content part.

Morphological Analysis

This includes Engines that perform some sort of morphological analysis (e.g. lemmatization).

General NLP processing Engines

Linking / Suggestions

This category covers enhancement engines that suggest Entities for features present in the parsed content. An Entity is a uniquely identified resource. Typically it provides (or links to) further information such as the type, a description (text, pictures, videos, …), spatial and/or temporal context, and links to other entities.
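As an illustration of what suggesting Entities for features in the content can look like, here is a minimal sketch of a label lookup. It is not how any particular Stanbol engine is implemented: the `LinkingSketch` class, the `Suggestion` type, the label index and the `http://example.org/` URIs are all hypothetical, and real engines typically match multi-word labels against an indexed vocabulary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class LinkingSketch {

    // Hypothetical suggestion: the mention found in the text plus the URI of the suggested Entity.
    static final class Suggestion {
        final String mention;
        final String entityUri;

        Suggestion(String mention, String entityUri) {
            this.mention = mention;
            this.entityUri = entityUri;
        }
    }

    // Looks up each whitespace-separated word in a label index (lower-cased label -> Entity URIs).
    static List<Suggestion> suggest(String text, Map<String, List<String>> labelIndex) {
        List<Suggestion> suggestions = new ArrayList<>();
        for (String word : text.split("\\s+")) {
            String mention = word.replaceAll("\\p{Punct}", "");
            String key = mention.toLowerCase(Locale.ROOT);
            List<String> uris = labelIndex.getOrDefault(key, Collections.<String>emptyList());
            for (String uri : uris) {
                suggestions.add(new Suggestion(mention, uri));
            }
        }
        return suggestions;
    }

    public static void main(String[] args) {
        // Hypothetical index; one label may map to several Entities.
        Map<String, List<String>> index = new HashMap<>();
        index.put("paris", Arrays.asList(
                "http://example.org/entity/Paris_France",
                "http://example.org/entity/Paris_Texas"));
        for (Suggestion s : suggest("I love Paris.", index)) {
            System.out.println(s.mention + " -> " + s.entityUri);
        }
    }
}
```

Because one label can map to several Entities, a lookup like this usually returns more than one suggestion per mention, which is where disambiguation comes in.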

Sentiment Analysis

This includes Engines that perform word- or chunk-level sentiment classification on the AnalyzedText content part, as well as Engines that summarize those lower-level annotations into Sentiments for sentences, sections, or the whole text. Sentiment summarizations are represented as 'fise:SentimentAnnotation's (TODO: not yet fully specified, see STANBOL-760).
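The summarization step can be pictured with a small sketch. It assumes that word-level sentiment values are numbers in the range [-1, 1] and that a sentence-level Sentiment is their arithmetic mean. Both are assumptions made for illustration, since the representation is not yet fully specified, and the `SentimentSketch` class is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class SentimentSketch {

    // Averages word-level sentiment values (assumed to lie in [-1, 1]).
    // Words without a sentiment value are passed as null and ignored.
    // Returns 0.0 (neutral) if no word carries a value.
    static double summarize(List<Double> wordSentiments) {
        double sum = 0.0;
        int count = 0;
        for (Double value : wordSentiments) {
            if (value != null) {
                sum += value;
                count++;
            }
        }
        return count == 0 ? 0.0 : sum / count;
    }

    public static void main(String[] args) {
        // Hypothetical word-level values for one sentence.
        List<Double> sentence = Arrays.asList(0.5, null, -0.5, 0.75);
        System.out.println("Sentence sentiment: " + summarize(sentence));
    }
}
```

The same function could be applied again to the sentence-level results to obtain a value for a section or the whole text, although a weighted scheme may be more appropriate in practice.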

Disambiguation

Enhancement Engines in this category can disambiguate Entities based on contextual information (e.g. whether "Apple" in a sentence refers to the fruit or the company). Based on that, such engines can adjust existing Entity suggestions or create new ones.
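One simple way to picture context-based disambiguation is to re-rank existing suggestions by how well they fit the surrounding sentence. The sketch below is only an illustration under stated assumptions: the `DisambiguationSketch` and `Candidate` classes, the context word sets, the `http://example.org/` URIs and the scoring formula (the mean of the original confidence and the fraction of context words found in the sentence) are all hypothetical and do not describe an actual Stanbol engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class DisambiguationSketch {

    // Hypothetical Entity suggestion with words that typically appear near the Entity.
    static final class Candidate {
        final String uri;
        final Set<String> contextWords;
        final double confidence;

        Candidate(String uri, Set<String> contextWords, double confidence) {
            this.uri = uri;
            this.contextWords = contextWords;
            this.confidence = confidence;
        }
    }

    // Returns new candidates with adjusted confidence, best match first.
    static List<Candidate> rerank(String sentence, List<Candidate> candidates) {
        Set<String> words = new HashSet<>(
                Arrays.asList(sentence.toLowerCase(Locale.ROOT).split("\\W+")));
        List<Candidate> adjusted = new ArrayList<>();
        for (Candidate c : candidates) {
            int overlap = 0;
            for (String w : c.contextWords) {
                if (words.contains(w)) {
                    overlap++;
                }
            }
            // Fraction of the candidate's context words that occur in the sentence.
            double fit = c.contextWords.isEmpty() ? 0.0 : (double) overlap / c.contextWords.size();
            adjusted.add(new Candidate(c.uri, c.contextWords, (c.confidence + fit) / 2.0));
        }
        adjusted.sort((a, b) -> Double.compare(b.confidence, a.confidence));
        return adjusted;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
                new Candidate("http://example.org/entity/Apple_fruit",
                        new HashSet<>(Arrays.asList("eat", "tree", "juice")), 0.5),
                new Candidate("http://example.org/entity/Apple_company",
                        new HashSet<>(Arrays.asList("iphone", "computer", "shares")), 0.5));
        String sentence = "Apple shares rose after the new iPhone launch";
        for (Candidate c : rerank(sentence, candidates)) {
            System.out.println(c.uri + " " + c.confidence);
        }
    }
}
```

The sketch only adjusts existing suggestions. Creating new ones would additionally require a lookup against the vocabulary.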

Postprocessing / Other