This project has retired. For details please refer to its Attic page.
Apache Stanbol - Getting Started

This tutorial targets developers who want to enrich unstructured textual content with "named entity" tags (locations, persons or organizations such as "Paris", "Barack Obama" or "BBC"). Apache Stanbol can provide such enhancements together with links to public repositories (e.g. DBpedia) or private ones (e.g. an enterprise-specific terminology).

Build and run your Apache Stanbol instance

To build Apache Stanbol from source you need Java 6 and Maven 3.0.3 or later (the exact version is defined in the pom). You will probably also need to give Maven more memory:

% export MAVEN_OPTS="-Xmx1024M -XX:MaxPermSize=256M"

Fetch the sources from the Apache Stanbol code repository

% svn co http://svn.apache.org/repos/asf/stanbol/trunk stanbol

From the source directory run

% mvn clean install

Run the stable launcher of Apache Stanbol on your local machine from the directory {root}/stanbol/launchers/ with:

% java -Xmx1g -jar stable/target/org.apache.stanbol.launchers.stable-{snapshot-version}-SNAPSHOT.jar
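
Because {snapshot-version} depends on the checkout, it can be convenient to locate the built JAR with a glob instead of typing the version. The following is a minimal sketch, not part of the Stanbol distribution: find_launcher_jar is a hypothetical helper name, and it assumes the JAR naming pattern shown in the command above.

```shell
# Hypothetical helper: print the path of the launcher JAR produced by the build.
# $1 = launchers directory, $2 = launcher name, e.g. "stable" (default) or "full".
find_launcher_jar() (
    launchers_dir=$1
    name=${2:-stable}
    # The glob stands in for {snapshot-version}; the first existing match wins.
    for jar in "$launchers_dir/$name/target/org.apache.stanbol.launchers.$name-"*"-SNAPSHOT.jar"; do
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            exit 0
        fi
    done
    echo "no $name launcher JAR under $launchers_dir; run 'mvn clean install' first" >&2
    exit 1
)

# Example, run from {root}/stanbol/launchers/:
#   java -Xmx1g -jar "$(find_launcher_jar . stable)"
```

The function body runs in a subshell, so its variables do not leak into the calling shell, and a missing JAR is reported with a non-zero exit status.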

Your instance runs within the stanbol/sling/ directory and is accessible at

http://localhost:8080
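
The launcher may need some time before it answers HTTP requests, so a script that sends requests right after starting it can fail. A minimal sketch for waiting until the instance responds, assuming curl is installed; wait_for_url and its defaults (30 attempts, 2 seconds apart) are hypothetical choices, not Stanbol settings.

```shell
# Hypothetical helper: poll a URL until it answers without an HTTP error.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 2).
wait_for_url() (
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Pause between attempts, but not before the first one.
        if [ "$i" -gt 0 ]; then
            sleep "$delay"
        fi
        # -f: treat HTTP errors (>= 400) as failure, -s: silent, -o: discard the body
        if curl -fs -o /dev/null "$url"; then
            exit 0
        fi
        i=$((i + 1))
    done
    exit 1
)

# Example:
#   wait_for_url http://localhost:8080 && echo "Stanbol is up"
```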

Post content item, get an enhancement graph

Go to the local HTTP web endpoint

http://localhost:8080/enhancer

This stateless interface allows the caller to submit content to the Apache Stanbol enhancer engines and get the resulting enhancements formatted as RDF at once without storing anything on the server-side.

Simply paste arbitrary English text into the input field to get back the detected entities together with the enhancement graph; for the sample sentence used below, these are Bob Marley and Paris. If you want to work with the REST interface directly, you can also post the text with the cURL command below. The resulting enhancement RDF will be in Turtle notation.

% curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
     --data "The Stanbol enhancer can detect famous cities such as Paris and people such as Bob Marley." \
     http://localhost:8080/enhancer
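
For longer texts it is usually easier to post a file than an inline string. The sketch below wraps the cURL call above in two hypothetical shell functions, accept_type and enhance_file. Only text/turtle is taken from this tutorial; application/rdf+xml is the standard RDF/XML media type and is assumed here to be accepted by the enhancer, so check the /enhancer page of your instance before relying on it.

```shell
# Hypothetical helper: map a short format name to the Accept header value.
# text/turtle comes from the tutorial; application/rdf+xml is an assumption.
accept_type() {
    case $1 in
        turtle) echo "text/turtle" ;;
        rdfxml) echo "application/rdf+xml" ;;
        *) echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: POST the contents of a plain-text file to the enhancer.
# $1 = file, $2 = format name (default turtle), $3 = endpoint (default local instance).
enhance_file() (
    accept=$(accept_type "${2:-turtle}") || exit 1
    # --data-binary sends the file unchanged; plain --data would strip newlines.
    curl -X POST \
         -H "Accept: $accept" \
         -H "Content-type: text/plain" \
         --data-binary "@$1" \
         "${3:-http://localhost:8080/enhancer}"
)

# Example:
#   enhance_file article.txt turtle > enhancements.ttl
```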

Configuration

The "default" enhancement chain includes the following, by default active Enhancement Engines:

You can use the OSGi console (http://{yourdomain}:{port}/) (user/pwd: admin/admin) of your running Stanbol instance to activate and configure additional engines. Additional engines provide keyword extraction, better language support, and integration with services such as GeoNames, Zemanta or OpenCalais. See the overview of available Apache Stanbol Enhancement Engines.

This version of Apache Stanbol can also manage and locally cache external entity repositories such as DBpedia, and it can use custom vocabularies as linking targets. Read more about this scenario using custom vocabularies.

Advanced: Explore Apache Stanbol "full" launcher

The complete feature set of Apache Stanbol, including experimental features, is available via Apache Stanbol's "full launcher". See the list of all available components and their features.

To start the full launcher, execute its JAR with the following command:

$ java -Xmx1g -XX:MaxPermSize=256m \
       -jar full/target/org.apache.stanbol.launchers.full-{snapshot-version}-SNAPSHOT.jar

Alternatively, to run the full launcher as a WAR, use the following commands:

$ export MAVEN_OPTS="-Xmx1g -XX:MaxPermSize=256m"
$ cd launchers/full-war
$ mvn clean package tomcat7:run

Your instance is then available at http://localhost:8080/stanbol.