Apache Stanbol Components
Apache Stanbol is built as a modular set of components. Each component is accessible via its own RESTful web interface. From this viewpoint, all Apache Stanbol features can be used via RESTful service calls.
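As a rough illustration of what such a RESTful service call could look like, the sketch below builds (but does not send) an HTTP request that posts plain text to the Enhancer. The base URL `http://localhost:8080`, the `/enhancer` path, the `text/turtle` response format and the helper name `build_enhancer_request` are assumptions made for this example; check them against your own installation.

```python
from urllib.request import Request

# Assumption: a locally started launcher listening on port 8080.
STANBOL_BASE = "http://localhost:8080"


def build_enhancer_request(text, base_url=STANBOL_BASE, accept="text/turtle"):
    """Build (but do not send) a POST request carrying plain text.

    The "/enhancer" path and the header values are assumptions for this
    sketch, not a definitive description of the service interface.
    """
    return Request(
        url=f"{base_url}/enhancer",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain", "Accept": accept},
        method="POST",
    )


request = build_enhancer_request("The shop is located in Montparnasse, Paris.")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it, which requires a running Stanbol instance at the assumed address.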
Components do not depend on each other. However, they can easily be combined when needed, as the different usage scenarios show. This ensures that the set of components in use depends on the specific usage scenario and not on the Apache Stanbol architecture.
All components are implemented as OSGi bundles, components and services. By default, Apache Stanbol uses the Apache Felix OSGi environment. However, we generally try to avoid Felix-specific features. If you need to run Stanbol in another OSGi environment and encounter problems, tell us by opening a JIRA issue and/or asking on the Stanbol Developer mailing list.
For deployment, Stanbol uses the Apache Sling launcher. While the Stanbol community maintains different launcher options, including runnable JARs and WAR files, we expect users to configure custom launchers optimized for their usage scenario. It is also possible to use Stanbol with other launchers (such as Apache Karaf) or to add its bundles to any existing OSGi environment.
Figure 2 depicts the main Apache Stanbol components and their arrangement within the Apache Stanbol architecture.
- The Enhancer component, together with its Enhancement Engines, lets you post content to Apache Stanbol and get suggestions for possible entity annotations in return. The enhancements are produced by natural language processing, metadata extraction and linking named entities to public or private entity repositories. Furthermore, Apache Stanbol provides machinery to process this data further and add knowledge and links by applying rules and reasoning. Technically, the enhancements are stored in a triple graph that is maintained by Apache Clerezza.
- The 'SPARQL endpoint' gives access to the RDF graphs of Apache Stanbol. In particular, this includes the graph with all enhancement results managed by the Apache Stanbol Contenthub.
- The 'EnhancerVIE' is a stateful interface for submitting content for analysis and storing the results on the server. The resulting enhanced content items can then be browsed.
- The Rules component provides you with the means to refactor knowledge graphs, e.g. for supporting the schema.org vocabulary for search engine optimization.
- The Reasoner can be used to automatically infer additional knowledge. It is used to derive new facts in the knowledge base. For example, if your enhanced content tells you about a shop located in "Montparnasse", you can infer via a "located-in" relation that the same shop is located in "Paris", in the "Île-de-France" and in "France".
- The Ontology Manager is the facility that manages your ontologies. Ontologies are used to define the knowledge models that describe the metadata of content. Additionally, the semantics of your metadata can be defined through an ontology.
- The CMS Adapter component acts as a bridge between JCR/CMIS-compliant content management systems and Apache Stanbol. It can be used to map existing node structures from JCR/CMIS content repositories to RDF models or vice versa. It also provides services for managing content repository items, called content items, within the Contenthub.
- The Entityhub is the component that lets you cache and manage local indexes of repositories such as DBpedia, as well as custom data (e.g. product descriptions, contact data, specialized topic thesauri).
- The Contenthub is the component that provides a persistent document store backed by Apache Solr. On top of the store, it offers semantic indexing during text-based document submission, and semantic search with faceted search capability over the documents.
- The FactStore is a component that lets you store relations between entities identified by their URIs. A relation between two entities is called a fact.
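The "located-in" inference mentioned for the Reasoner in the list above can be illustrated with a small, self-contained sketch. This is plain Python computing a transitive closure over pairs, written only to show the idea; it is not how the Reasoner component is implemented, and the function name `infer_located_in` and the extra facts linking Montparnasse to Paris, Île-de-France and France are assumptions made for the example.

```python
def infer_located_in(facts):
    """Return the transitive closure of (place, container) pairs.

    Each pair (a, b) reads "a is located in b". If (a, b) and (b, d)
    are known, (a, d) is added, until no new pair can be derived.
    """
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical asserted facts, following the example in the text.
facts = {
    ("shop", "Montparnasse"),
    ("Montparnasse", "Paris"),
    ("Paris", "Île-de-France"),
    ("Île-de-France", "France"),
}

# Keep only the pairs that were derived rather than asserted.
inferred = infer_located_in(facts) - facts
for place, container in sorted(inferred):
    print(f"{place} is located in {container}")
```

Only the first fact would come from the enhanced content; the remaining three stand in for background knowledge that an ontology or entity repository might supply.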