This project has retired. For details please refer to its Attic page.
Apache Stanbol - The Named Entity Extraction Engine

This engine detects named entities in unstructured text. It is built on the Natural Language Processing (NLP) features of Apache OpenNLP (incubating) and uses OpenNLP's maximum entropy models to detect persons, places and organizations.
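As a rough illustration of how text might be handed to the enhancer that hosts this engine, the sketch below builds an HTTP POST request carrying plain text. The URL http://localhost:8080/enhancer, the JSON Accept header and the helper name build_enhancer_request are assumptions made for this example (a locally running Stanbol launcher with default settings); adjust them to your installation.

```python
from urllib.request import Request, urlopen

# Assumption: a Stanbol launcher runs locally and exposes the enhancer here.
ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, url=ENHANCER_URL):
    """Build (but do not send) a POST request carrying plain text.

    Hypothetical helper for illustration; not part of Stanbol.
    """
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Assumption: the server can serialize the enhancement graph as JSON.
            "Accept": "application/json",
        },
        method="POST",
    )


request = build_enhancer_request(
    "The Stanbol enhancer can detect famous cities such as Paris "
    "and people such as Bob Marley."
)

# Sending the request needs a running server, so it is left commented out:
# with urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the example runnable without a server and makes the headers easy to inspect.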

Example Result

For the text "The Stanbol enhancer can detect famous cities such as Paris and people such as Bob Marley.", this engine adds fise:TextAnnotations to the enhancement graph. Among other information, it adds the following annotation, which suggests Bob Marley (of type Person) for the string "Bob Marley":

{
  "@subject": "urn:enhancement-b3d4617d-1760-0374-f471-e0e746003f4e",
  "@type": ["enhancer:Enhancement", "enhancer:TextAnnotation"],
  "dc:created": "2012-02-29T11:34:56.369Z",
  "dc:creator": "org.apache.stanbol.enhancer.engines.opennlp.impl.NEREngineCore",
  "dc:type": "dbp-ont:Person",
  "enhancer:confidence": 0.94647044,
  "enhancer:end": 69,
  "enhancer:extracted-from": "urn:content-item-sha1-37c8a8244041cf6113d4ee04b3a04d0a014f6e10",
  "enhancer:selected-text": "Bob Marley",
  "enhancer:selection-context": "The Stanbol enhancer can detect famous Entities such as Paris or Bob Marley.",
  "enhancer:start": 59
}
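A client consuming such an annotation will typically read the suggested type, the confidence and the selected text. The following minimal sketch shows one way to do that with Python's standard json module. It works on a trimmed copy of the annotation above; the function name summarize and the min_confidence threshold are hypothetical and chosen only for this example.

```python
import json

# Trimmed copy of the annotation shown above (only the fields used here).
ANNOTATION_JSON = """
{
  "@type": ["enhancer:Enhancement", "enhancer:TextAnnotation"],
  "dc:type": "dbp-ont:Person",
  "enhancer:confidence": 0.94647044,
  "enhancer:selected-text": "Bob Marley",
  "enhancer:selection-context":
    "The Stanbol enhancer can detect famous Entities such as Paris or Bob Marley."
}
"""


def summarize(annotation, min_confidence=0.5):
    """Return (selected text, type, confidence) for a confident TextAnnotation.

    Returns None for other annotation types or when the confidence is below
    min_confidence (a hypothetical threshold, not a Stanbol setting).
    """
    if "enhancer:TextAnnotation" not in annotation.get("@type", []):
        return None
    confidence = annotation.get("enhancer:confidence", 0.0)
    if confidence < min_confidence:
        return None
    return (
        annotation["enhancer:selected-text"],
        annotation.get("dc:type"),
        confidence,
    )


annotation = json.loads(ANNOTATION_JSON)
summary = summarize(annotation)
print(summary)

# Locate the selected text inside its surrounding context.
context = annotation["enhancer:selection-context"]
position = context.find(annotation["enhancer:selected-text"])
print(position)
```

Filtering on a confidence threshold is a common way to discard weak suggestions before further processing; a suitable value depends on the application.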

The following figure provides a visual representation of the above graph:

[Figure: fise:TextAnnotation]

See the documentation of the Enhancement Structure for details.