This project has retired. For details please refer to its Attic page.
Apache Stanbol - Refactor

Refactor

The Refactor is a service which allows to interpret rules in order to perform refactoring of RDF graphs. For the refactoring the set of rules in the recipes are interpreted and run as SPARQL CONTRUCT in which the where clause is derived from the body of the rule and the construct clause is derived from the head of the rule. The output of a refactoring is a transformed graph which satisfies the constraints expressed in the rules. The refactoring in useful for tasks of semantic harmonization of RDF graphs expressed with different ontologies/vocabularies towards their representation with a single ontology or vocagulary. The output of a refactoring is a transformed graph which satisfies the constraints expressed in the rules.

Terminology

Usage Scenarios

Vocabulary harmonization

Supposing we want to use some dataset in Linked Data as external knowledge bases for a generic CMS enhanced with Stanbol. Now the problem how to use data from those datasets expressed with some external and heterogeneous vocabularies or ontologies within the CMS has. Furthermore the CMS has its own way to formalize knowledge, namely the its Ontology Network managed by Stanbol OntoNet. The solution is provided by Refactor which allows to interpret the rules of inference as refactoring rules in order harmonize external data to the Stanbol's ontologies. Figure 1 gives a very quick idea about how the CMS can benefit from the Refactor showing how external data can be aligned and used within the CMS.

Vocabulary harmonization via Stanbol Refactor
Figure 1: the refactor is used to align external data to the ontologies used in a generic CMS enhanced with Stanbol.

We can specify a concrete scenario for a better understanding of the Refactor. Suppose we have configured the CMS (i.e. Stanbol EntityHub) in order to fetch entities about persons from DBpedia. Now we want to represent these entities adopting the vocabulary from schema.org and produce schema.org Rich Snippets in order to provide search engine optimization capabilities to the CMS. What we need to do is to write a recipe and call the Refactor via HTTP REST passing to it the recipe itself and the entities we have fetched from Linked Data.

Features

In the Refactor rules are interpreted as SPARQL CONSTRUCT queries in which the premises (the left part before the arrow in the rule) are the WHERE clause, while the conclusion (the right part after the arrow in the rule) is translated into the construct template, i.e., triple patterns in conjunctive form.

As an example, we can take in account the following rule:

prefix kn = <http://foo.org/kinship#> . 
uncleRule[ has(kn:parent, ?x, ?y) . has(kn:sibling, ?y, ?z) -> has(kn:uncle, ?x, ?z) ]

The rule above is transformed into the following SPARQL CONSTRUCT query:

PREFIX kn: <http://foo.org/kinship#>
CONSTRUCT { ?x kn:uncle ?z }
WHERE { 
    ?x kn:parent ?y .
    ?y kn:sibling ?z
}

The SPARQL engines used internally by the Refactor for running rules is Apache Jena ARQ

We remand any detail about the syntax and the expressivity of the Stanbole Rule language to its section.

Service Endpoints

The Refactor RESTful API is structured as follows: (Please note, that the following links to the actual service endpoint link to a running instance of Apache Stanbol. If you use other domains or ports than "localhost:8080", then please change accordingly)

Refactor Engine ("/refactor"):

The request should be done as it follows:

Example:

curl -G -X GET \
-d input-graph=stored_graph -d recipe=myTestRecipeA -d output-graph=result_graph \
http://localhost:8080/refactor

Refactor Engine ("/refactor/apply"):

The request should be done as it follows:

Example:

curl -X POST -H "Content-type: multipart/form-data" \
-H "Accept: application/rdf+json" \
-F recipe=recipeTestA -F input=@graph.rdf \
http://localhost:8080/refactor/apply

Back to Stanbol Rules