Ontology Registry Manager
Registry management is a facility for Stanbol administrators to pre-configure sets of ontologies that Stanbol should load and store, or simply be aware of, before they are included in a part of the ontology network (e.g. a scope or session). Via the registry manager, it is possible to configure whether these ontologies should be loaded immediately when Stanbol is initialized, or only when explicitly requested. The Ontology Registry Manager is essentially an ontology bookmarker with caching support. It is also possible to cache multiple versions of the same ontology if needed.
Terminology
- A Library is a collection of references to ontologies, which can be located anywhere on the Web. CMS administrators and knowledge managers can create a library by any criterion, e.g. a library of all W3C ontologies, a library of all the ontologies that describe a social network (which can include SIOC, FOAF etc.), a library of ontology alignments (which includes ontologies that align DBPedia to Schema.org, GeoNames to DBPedia, or a custom product ontology to GoodRelations).
- A Registry is an RDF resource (i.e. an ontology itself) that describes one or more libraries. It is the physical object that has to be accessed to gain knowledge about libraries.
Usage Scenarios
Your CMS can handle hundreds of vocabularies for semantic annotation, but you do not want to clutter the system runtime by having all those vocabularies loaded at once. You would like the ontology of a specific vocabulary to be loaded only when someone uses or requests it, and once it is loaded you would like it to be stored internally, rather than fetched from the Web over and over again.
The Ontology Registry Manager can help you with that. With a simple RDF document that references these hundreds of ontologies, it is possible to organize them into libraries, e.g. by topic ("user profile", "product", "event") or by provenance ("W3C", "Industrial standards", "nonstandard" etc.). If a user decides to annotate a content item using schema.org, she can choose to do so, and the Registry Manager will automatically figure out that the schema.org ontology is referenced by the libraries "Industrial standards", "user profile" and "product". None of these libraries has been preloaded yet, so Stanbol will automatically choose the smallest one, say "Industrial standards", and load it. The schema.org ontology will be available from that point on.
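The selection step in this scenario can be pictured with a small, self-contained sketch. It is not Stanbol code: the class and method names are hypothetical, libraries are modelled as plain maps from a library name to the set of ontology URIs it references, the ontology URIs are placeholders, and the alphabetical tie-break is an assumption made only to keep the result deterministic.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class SmallestLibraryChooser {

    /**
     * Hypothetical helper (not a Stanbol API): among the libraries that
     * reference the given ontology, return the name of the one with the
     * fewest ontologies. Ties are broken alphabetically (an assumption).
     */
    static Optional<String> smallestLibraryFor(String ontologyUri,
                                               Map<String, Set<String>> libraries) {
        String best = null;
        int bestSize = Integer.MAX_VALUE;
        for (Map.Entry<String, Set<String>> entry : libraries.entrySet()) {
            Set<String> ontologies = entry.getValue();
            if (!ontologies.contains(ontologyUri)) {
                continue; // this library does not reference the ontology
            }
            boolean smaller = ontologies.size() < bestSize;
            boolean tieBrokenByName = ontologies.size() == bestSize
                    && best != null && entry.getKey().compareTo(best) < 0;
            if (smaller || tieBrokenByName) {
                best = entry.getKey();
                bestSize = ontologies.size();
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Illustrative library contents; the URIs are placeholders.
        Map<String, Set<String>> libraries = Map.of(
                "Industrial standards", Set.of("http://schema.org/",
                        "http://purl.org/goodrelations/v1"),
                "user profile", Set.of("http://schema.org/",
                        "http://xmlns.com/foaf/0.1/", "http://rdfs.org/sioc/ns"),
                "product", Set.of("http://schema.org/",
                        "http://purl.org/goodrelations/v1",
                        "http://example.org/products", "http://example.org/catalogue"),
                "event", Set.of("http://example.org/events"));

        // Prints "Industrial standards": the smallest library referencing schema.org.
        System.out.println(smallestLibraryFor("http://schema.org/", libraries).orElse("(none)"));
    }
}
```

How Stanbol actually measures library size and resolves ties is not stated here, so treat the comparison logic as illustrative only.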
Configuration
Ontology registries (and, by extension, the libraries they reference) are configured by the Stanbol administrator via the Felix Web console. Note that the following links assume Stanbol to be deployed on http://localhost:8080.
- Go to the Felix console Configuration Manager and select Apache Stanbol Ontology Registry Manager (or follow this direct link)
- Under Registry locations you can add or remove the physical URIs of ontology registries, i.e. RDF resources that contain knowledge about ontology libraries.
- If you want all the registry contents to be loaded at once on startup, uncheck the lazy ontology loading box.
- You can select one Caching policy, either Centralised (default) or Distributed. With Centralised caching, all the libraries that reference an ontology with the same URI share the very same version of that ontology. With Distributed caching, each library references its own version of each ontology. Centralised caching is generally recommended: Distributed caching allows multi-version ontology management, but it occupies much more storage space, depending on how many ontologies the libraries have in common.
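The difference between the two caching policies can be illustrated with a toy cache whose key changes with the policy. This is only a sketch under stated assumptions: the class, its enum and its methods are hypothetical stand-ins rather than Stanbol types, ontology content is reduced to a string, and the "latest stored copy wins" behaviour in the centralised case is an arbitrary choice.

```java
import java.util.HashMap;
import java.util.Map;

class CachingPolicySketch {

    // Stand-in for the two policies described above (not the Stanbol type).
    enum CachingPolicy { CENTRALISED, DISTRIBUTED }

    private final CachingPolicy policy;
    private final Map<String, String> cache = new HashMap<>();

    CachingPolicySketch(CachingPolicy policy) {
        this.policy = policy;
    }

    // Centralised: one entry per ontology URI, shared by all libraries.
    // Distributed: one entry per (library, ontology URI) pair.
    private String key(String libraryId, String ontologyUri) {
        return policy == CachingPolicy.CENTRALISED
                ? ontologyUri
                : libraryId + " " + ontologyUri;
    }

    void store(String libraryId, String ontologyUri, String content) {
        cache.put(key(libraryId, ontologyUri), content); // latest copy wins (assumption)
    }

    String lookup(String libraryId, String ontologyUri) {
        return cache.get(key(libraryId, ontologyUri));
    }

    int storedCopies() {
        return cache.size();
    }

    public static void main(String[] args) {
        for (CachingPolicy policy : CachingPolicy.values()) {
            CachingPolicySketch sketch = new CachingPolicySketch(policy);
            sketch.store("libA", "http://example.org/onto", "version 1");
            sketch.store("libB", "http://example.org/onto", "version 2");
            System.out.println(policy + ": " + sketch.storedCopies() + " stored copies");
        }
    }
}
```

With the same two `store` calls, the centralised sketch keeps a single shared copy while the distributed one keeps two, which is the storage trade-off the configuration option controls.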
Usage
Set up an Ontology Registry
To create a Registry, you simply need to make an OWL ontology with certain types of axioms. See Registry Language for examples on how to create a Registry and add Library instances to it.
Then upload the ontology to the Web and add its URI to the Registry locations in the Felix console Configuration.
Note that not only can a single registry describe multiple libraries, but also multiple registries can describe the same library, each adding information on the ontologies referenced by it. Library descriptions are monotonic, in that registries can only add information about libraries, never remove any.
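The monotonic behaviour can be pictured as a set union over what each registry says about a library. The following is a minimal sketch, not the Registry Manager's implementation: registries are modelled as plain maps from a library ID to the ontology URIs it references, and the class name, method name and example URIs are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class MonotonicLibraryMerge {

    /**
     * Hypothetical helper: combines the library descriptions found in several
     * registries. Each registry can only add ontology references to a library;
     * nothing is ever removed, so the result is the union per library.
     */
    static Map<String, Set<String>> merge(List<Map<String, Set<String>>> registries) {
        Map<String, Set<String>> merged = new TreeMap<>();
        for (Map<String, Set<String>> registry : registries) {
            for (Map.Entry<String, Set<String>> library : registry.entrySet()) {
                merged.computeIfAbsent(library.getKey(), id -> new TreeSet<>())
                      .addAll(library.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        String w3c = "http://stanbol.apache.org/ontologies/library/W3C_ontologies";
        // Two registries describing the same library with different references.
        Map<String, Set<String>> registryA = Map.of(w3c, Set.of("http://example.org/onto/a"));
        Map<String, Set<String>> registryB = Map.of(w3c,
                Set.of("http://example.org/onto/a", "http://example.org/onto/b"));

        // The merged library references both ontologies.
        System.out.println(merge(List.of(registryA, registryB)));
    }
}
```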
Access a cached ontology
A cached ontology does not need to be loaded into an OntoNet scope or session in order to be available. It can be accessed at @/ontonet/{ontologyURI}, where {ontologyURI} can be either the ontology ID, if the ontology is named, or the physical URI it was retrieved from.
Load a library into OntoNet
One useful application of ontology libraries is that they can be used to populate an OntoNet ontology collector (space or session) with multiple ontologies with a single command. There are two ways to do so.
Java API
The Registry Manager is an OSGi service component. To obtain the service reference:
@Reference RegistryManager registryManager;
Loading the contents of a library into an ontology collector is done in the same way as loading single ontologies, i.e. by creating a suitable OntologyInputSource. This time, though, it is a special input source called LibrarySource. A LibrarySource creates a new blank ontology (or uses an existing one upon request) and appends all the contents of a library to it via owl:imports statements.
Suppose we want to load the contents of a library called http://stanbol.apache.org/ontologies/library/W3C_ontologies into a scope called AnnotationStandards that already exists (so we'll have to use its custom space). First of all, let us get hold of the space:
@Reference
ONManager onManager;

OntologyScope scope = onManager.getScopeRegistry().getScope("AnnotationStandards");
OntologySpace space = scope.getCustomSpace();
Now create the LibrarySource:
IRI libraryID = IRI.create("http://stanbol.apache.org/ontologies/library/W3C_ontologies");
OntologyInputSource<?,?> libSrc = new LibrarySource(libraryID, registryManager);
Note that all LibrarySource constructors require that a RegistryManager be passed to them. This is because ontology input sources in general are not OSGi components and cannot obtain service references on their own.
Also note that building a LibrarySource will cause the contents of the corresponding library to be loaded and stored into the ontology manager, assuming the lazy loading policy option is set and the library has not been loaded earlier. Creating library sources is one way to "touch" an ontology library and get the Registry Manager to load it.
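As a side note, the "touch" behaviour under lazy loading amounts to load-on-first-use with caching. The sketch below illustrates that idea only; it does not use or reproduce the Registry Manager, and the class, its methods and the loader function are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class LazyLibraryCache {

    // Hypothetical loader: maps a library ID to the ontologies it references.
    private final Function<String, List<String>> loader;
    private final Map<String, List<String>> loaded = new LinkedHashMap<>();
    private int loadCount = 0;

    LazyLibraryCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    boolean isLoaded(String libraryId) {
        return loaded.containsKey(libraryId);
    }

    // "Touching" a library loads it the first time and reuses the cache afterwards.
    List<String> touch(String libraryId) {
        List<String> contents = loaded.get(libraryId);
        if (contents == null) {
            contents = loader.apply(libraryId);
            loaded.put(libraryId, contents);
            loadCount++;
        }
        return contents;
    }

    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        LazyLibraryCache cache = new LazyLibraryCache(
                id -> List.of(id + "/ontologyA", id + "/ontologyB"));
        String w3c = "http://stanbol.apache.org/ontologies/library/W3C_ontologies";

        System.out.println(cache.isLoaded(w3c)); // false: nothing loaded at startup
        cache.touch(w3c);                        // first touch triggers the load
        cache.touch(w3c);                        // second touch hits the cache
        System.out.println(cache.loadCount());   // 1
    }
}
```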
Finally, add the input source to the custom space. Simple as that:
space.addOntology(libSrc);
Note that we called addOntology() although this resulted in adding multiple ontologies. This is because a LibrarySource "fools" OntoNet into thinking a single ontology is being loaded, i.e. the root ontology that depends on the library contents. It will still be possible to access a single imported ontology, though.
REST API
When using the REST API, an ontology library can be loaded into a scope the very moment the scope is created.
To load the contents of library http://stanbol.apache.org/ontologies/library/W3C_ontologies into a new scope called AnnotationStandards:
curl -X PUT "http://localhost:8080/ontonet/ontology/AnnotationStandards?corereg=http://stanbol.apache.org/ontologies/library/W3C_ontologies"
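For clients written in Java, the same request could be issued with the JDK's built-in HTTP client (Java 11 or later). This is a sketch rather than an official client: the class and helper names are hypothetical, the URL layout and the corereg parameter are copied from the curl command above, and the library ID is appended to the query string without encoding, exactly as in that command.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class CreateScopeWithLibrary {

    // Hypothetical helper: builds the PUT request shown in the curl example.
    static HttpRequest buildRequest(String stanbolBase, String scopeId, String libraryId) {
        URI uri = URI.create(stanbolBase + "/ontonet/ontology/" + scopeId
                + "?corereg=" + libraryId);
        return HttpRequest.newBuilder(uri)
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Sends the request; requires a running Stanbol instance at the given base URL.
    static int send(HttpRequest request) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
                "http://localhost:8080",
                "AnnotationStandards",
                "http://stanbol.apache.org/ontologies/library/W3C_ontologies");
        // Only prints the request; call send(request) against a live server.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The request is built separately from being sent so that it can be inspected or logged without a server; the status code returned on success is not documented here, so the sketch leaves its interpretation to the caller.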
Load selected ontologies from library
We are working on that. Stay tuned.
Service Endpoints
Because the configuration of registries is a task for administrators and is performed through the Felix administration console, there are no RESTful services for modifying ontology registries. It is however possible to browse the known ontology libraries.
- Libraries. If called from a Web browser, shows a list of known ontology libraries, and for each of them the caching state and the number of referenced ontologies. Note that this service does not provide information on the registries that describe these libraries; that is classified information for administrators. This endpoint supports only GET with no parameters, and generates text/html only.