This project has retired. For details please refer to its Attic page.
Apache Stanbol - FactStore Implementation Concept

FactStore Implementation Concept

The FactStore specification is written with a certain kind of implementation in mind. Although the specification does not prescribe a particular implementation, it may be useful to look at this simple implementation concept.

Store Implementation

The store implementation is based on the well-known concept of relational databases. Each fact schema is a table in a relational database. Creating a new fact schema is equivalent to creating a new table whose columns follow the schema; all columns are String attributes because they store IRIs. For performance reasons, the attributes should be indexed. The store only needs to be able to create new schemata. The specification does not provide for altering a schema over time; this could be a future improvement.
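The mapping from a fact schema to an indexed table could be sketched as follows. This is a minimal illustration that uses Python with SQLite, not the Stanbol implementation. The function names, the schema name "employeeOf", and the roles "person" and "organization" are assumptions made for the example, and it assumes the schema name has already been reduced to a plain identifier.

```python
import re
import sqlite3

def check_ident(name):
    """Accept only plain identifiers, since table and column names cannot be bound as parameters."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name

def create_fact_schema(conn, schema_name, roles):
    """Create one table per fact schema, with one indexed text column per role."""
    table = check_ident(schema_name)
    cols = [check_ident(r) for r in roles]
    col_defs = ", ".join(f'"{c}" TEXT NOT NULL' for c in cols)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    for c in cols:
        # Every column holds an IRI and is indexed for lookups.
        conn.execute(f'CREATE INDEX "idx_{table}_{c}" ON "{table}" ("{c}")')

# Hypothetical schema: a fact "employeeOf" relating a person to an organization.
conn = sqlite3.connect(":memory:")
create_fact_schema(conn, "employeeOf", ["person", "organization"])
conn.execute(
    'INSERT INTO "employeeOf" VALUES (?, ?)',
    ("http://example.org/alice", "http://example.org/acme"),
)
```

Identifiers are validated against a strict pattern here because the schema name ends up in the SQL text itself, while fact values are passed as bound parameters.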

Query Implementation

The JSON-LD query structure is designed to map directly to valid SQL statements. If the store is implemented in a relational database, all queries can be transformed into SQL queries against this database. For security reasons, the transformation from JSON-LD to SQL must guard against attacks such as SQL injection.
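One way to perform this transformation safely is to validate every identifier and pass every value as a bound parameter. The sketch below assumes a simplified query shape with "select", "from", and "where" keys, in which prefixed names have already been resolved to plain table and column names. The shape, the helper names, and the sample data are assumptions for illustration, not the exact FactStore query format.

```python
import re
import sqlite3

def ident(name):
    """Validate an identifier and return it quoted; identifiers cannot be bound as parameters."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f'"{name}"'

def to_sql(query):
    """Translate the simplified query dict into (sql, params)."""
    columns = ", ".join(ident(c) for c in query["select"])
    table = ident(query["from"])
    sql = f"SELECT {columns} FROM {table}"
    conditions, params = [], []
    for clause in query.get("where", []):
        for role, iri in clause["="].items():
            conditions.append(f"{ident(role)} = ?")
            params.append(iri)  # values never become part of the SQL text
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params

# Hypothetical query: which persons are employees of a given organization?
query = {
    "select": ["person"],
    "from": "employeeOf",
    "where": [{"=": {"organization": "http://example.org/acme"}}],
}

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "employeeOf" ("person" TEXT, "organization" TEXT)')
conn.execute(
    'INSERT INTO "employeeOf" VALUES (?, ?)',
    ("http://example.org/alice", "http://example.org/acme"),
)
sql, params = to_sql(query)
rows = conn.execute(sql, params).fetchall()
```

Because the IRI in the "where" clause is bound through a placeholder, a value containing quote characters is compared as a literal string and cannot change the structure of the statement.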

As seen in the examples, queries may use attributes of entities to formulate the request. However, the FactStore stores only the IRIs of the entities, not the entities with their attributes. Therefore, the FactStore needs an EntityHub to resolve entities specified by their attributes. The EntityHub must be able to query for entities by example.
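The two-step lookup described here could look roughly like the following sketch. An in-memory dictionary stands in for the EntityHub, and find_by_example is a hypothetical function, not the EntityHub API. The entities, their attributes, and the "employeeOf" table are likewise invented for the example.

```python
import sqlite3

# Hypothetical stand-in for an EntityHub: maps entity IRIs to attributes.
ENTITIES = {
    "http://example.org/acme": {"type": "organization", "city": "Paderborn"},
    "http://example.org/globex": {"type": "organization", "city": "Berlin"},
    "http://example.org/initech": {"type": "organization", "city": "Paderborn"},
}

def find_by_example(example):
    """Return the IRIs of all entities whose attributes match every key in `example`."""
    return sorted(
        iri for iri, attrs in ENTITIES.items()
        if all(attrs.get(k) == v for k, v in example.items())
    )

def persons_employed_by(conn, org_example):
    """Resolve organizations by example first, then query the fact table by IRI."""
    iris = find_by_example(org_example)
    if not iris:
        return []
    # One placeholder per resolved IRI; the list grows with the EntityHub result.
    placeholders = ", ".join("?" for _ in iris)
    sql = (
        'SELECT "person" FROM "employeeOf" '
        f'WHERE "organization" IN ({placeholders}) ORDER BY "person"'
    )
    return [row[0] for row in conn.execute(sql, iris)]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "employeeOf" ("person" TEXT, "organization" TEXT)')
conn.executemany(
    'INSERT INTO "employeeOf" VALUES (?, ?)',
    [
        ("http://example.org/alice", "http://example.org/acme"),
        ("http://example.org/bob", "http://example.org/globex"),
        ("http://example.org/carol", "http://example.org/initech"),
    ],
)
result = persons_employed_by(conn, {"city": "Paderborn"})
```

The fact table itself never sees the "city" attribute. Only the IRIs returned by the lookup reach the SQL statement, so the size of the IN list depends directly on how many entities match the example.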

Note: Depending on the number of entities the EntityHub returns for a given request, this architecture may lead to performance problems. Where the limits of this approach lie in terms of performance still has to be evaluated. The assumption, however, is that in many (or most) scenarios this will not become a problem. If it does, the types of allowed queries may be restricted, e.g. by disallowing queries that use entity attributes in the "where" clause, to avoid performance or memory problems.

