This project has retired. For details please refer to its Attic page.
Apache Stanbol - The OpenCalais Enhancement Engine

The OpenCalais Enhancement Engine

The OpenCalais Enhancement Engine provides an interface to the OpenCalais web service for Named Entity Recognition (NER).

Technical description

The engine sends the text of a content item to the OpenCalais service and retrieves the NER annotations in RDF format. The OpenCalais annotations are then added to the content item's metadata as specified by the Stanbol Enhancement Structures.

The engine natively supports the MIME types text/plain and text/html. Additionally, it can process text that is provided in the content item's metadata as the value of the property

http://www.semanticdesktop.org/ontologies/2007/01/19/nie#plainTextContent

Supported languages are

Requirements for use and configuration options

This component requires an API key from OpenCalais, which can be obtained from http://www.opencalais.com/APIkey. Without an API key, the engine does nothing.

In the OSGi configuration, the key is set as the value of the property

org.apache.stanbol.enhancer.engines.opencalais.license

The unit tests also require the API key; without it, some tests are skipped. With Maven, the key can be set as a system property on the command line:

mvn -Dorg.apache.stanbol.enhancer.engines.opencalais.license=YOUR_API_KEY [install|test]

The following configuration properties are defined:

Usage

Assuming that the Stanbol endpoint with the full launcher is running at

http://localhost:8080

the license key has been defined, and the engine is activated, commands like the following can be used from the command line to submit a text file as a content item:
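A minimal sketch of such a submission, assuming curl is installed and that the enhancer is exposed under the path /enhancer. That path is an assumption and may differ between Stanbol releases, so check the launcher's web interface. The function name submit_text and the Accept header value are likewise illustrative choices, not taken from the engine's documentation.

```shell
#!/bin/sh
# Base URL of the running Stanbol full launcher (as stated in the text above).
STANBOL_URL="http://localhost:8080"
# Assumption: path of the enhancer endpoint; adjust it to your Stanbol release.
ENHANCER_URL="$STANBOL_URL/enhancer"

# Hypothetical helper: POST a plain-text file as a content item and print
# the metadata returned by the server.
# $1 is the path to the text file to submit.
submit_text() {
    curl -s -X POST \
        -H "Content-Type: text/plain" \
        -H "Accept: application/rdf+xml" \
        --data-binary "@$1" \
        "$ENHANCER_URL"
}
```

With the launcher running, a call such as `submit_text testfile.txt` (where testfile.txt is a placeholder file name) should print the enhancement metadata, provided the license key is configured and the engine is active.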

Alternatively, the Stanbol web interface can be used for submitting documents and viewing the metadata at

http://localhost:8080/contenthub