This project has retired. For details please refer to its Attic page.
Apache Stanbol - Stanbol Enhancer Stress Test Utility

Stanbol Enhancer Stress Test Utility

As of STANBOL-670, Apache Stanbol provides a utility that allows users to stress test the Stanbol Enhancer using multiple concurrent requests. This might be useful for both:

In addition, this utility provides some statistics, including

Usage

This utility is part of the Apache Stanbol integration tests and is also run during normal builds against the default chain of the Stanbol Enhancer. Like any integration test, it can also be run standalone against a Stanbol server running at a configured URL.

To use this tool you need to check out and build Apache Stanbol and then change to the {stanbol-source}/integration-tests directory. From within this directory the utility can be called using

mvn -o test -Dtest.server.url={stanbol-server} -Dtest=MultiThreadedTest

This makes 500 requests using 5 concurrent threads against the {stanbol-server}, using DBpedia.org abstracts as content. The integration tests include up to 10000 of those abstracts that can be used for testing.

This utility can be configured using the following system properties:

Here is an example that makes extensive use of custom options:

mvn -o test -Dtest=MultiThreadedTest \
    -Dstanbol.it.multithreadtest.data=/stanbol/test/data/stanbol-test-data.txt.gz \
    -Dstanbol.it.multithreadtest.requests=10000 \
    -Dstanbol.it.multithreadtest.threads=20 \
    -Dstanbol.it.multithreadtest.rdf-format=text/turtle \
    -Dtest.server.url=http://www.example.org:8080/stanbol
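When the test is run repeatedly with different settings, the command line can be assembled by a small script. The following is only a sketch in Python: the helper name build_mvn_command is hypothetical, and the only property names used are the ones that appear in the example above.

```python
import shlex

# Common prefix of the multi-thread test properties shown in the example above.
PREFIX = "stanbol.it.multithreadtest."


def build_mvn_command(server_url, options=None):
    """Build the argument list for one run (hypothetical helper)."""
    args = ["mvn", "-o", "test", "-Dtest=MultiThreadedTest"]
    for name, value in (options or {}).items():
        args.append(f"-D{PREFIX}{name}={value}")
    args.append(f"-Dtest.server.url={server_url}")
    return args


cmd = build_mvn_command(
    "http://www.example.org:8080/stanbol",
    {"requests": 10000, "threads": 20},
)
# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run from within the integration-tests directory.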

NOTES:

Supported Test Data Formats

This tool supports two different test data formats and is also able to read compressed files. The following three subsections provide detailed information.

Plain Text Files

All test data are within a single text file. Single texts are separated by two (or more) empty lines.

The following example includes three content items:

Astronomers discover largest star on record\n
\n
European astronomers have discovered the largest star yet on record; 
it is 300 times the mass of our sun, beyond the previously accepted 
limit of 150 solar masses.\n
\n
Paul Crowther, professor of astrophysics at […]\n
\n
\n
Australian election debate moved to avoid clash with cookery show\n
\n
A televised debate between Australia's candidates for Prime Minister […]\n
\n
\n
The Only Joy In Town\n
\n
by Joni Mitchell\n   
\n
I want to paint a picture\n
Botticelli * style\n
Instead of Venus on a clam *\n
I'd paint this flower child\n

Plain text test data are read sequentially from the provided source. This ensures that only ~100 content items are loaded into memory at any given time, which makes this the preferred option for large test data sets.
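The separation rule can be illustrated with a short Python sketch that reads lines one at a time and yields one content item whenever two or more empty lines have been seen. This is not the utility's actual reader: the function name iter_content_items is hypothetical, and treating whitespace-only lines as empty is an assumption.

```python
from typing import Iterable, Iterator


def iter_content_items(lines: Iterable[str]) -> Iterator[str]:
    """Yield content items separated by two or more empty lines."""
    buffer = []    # lines of the item currently being read
    blank_run = 0  # number of consecutive empty lines just seen
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.strip() == "":  # assumption: whitespace-only counts as empty
            blank_run += 1
            continue
        if buffer and blank_run >= 2:
            # Two or more empty lines: the previous item is complete.
            yield "\n".join(buffer)
            buffer = []
        elif buffer and blank_run == 1:
            # A single empty line is part of the item (e.g. title and body).
            buffer.append("")
        blank_run = 0
        buffer.append(line)
    if buffer:
        yield "\n".join(buffer)


sample = [
    "Astronomers discover largest star on record\n",
    "\n",
    "European astronomers have discovered the largest star yet on record;\n",
    "\n",
    "\n",
    "The Only Joy In Town\n",
    "\n",
    "by Joni Mitchell\n",
]
items = list(iter_content_items(sample))
```

Because the function is a generator, passing it an open file object keeps only the current item in memory, which mirrors the sequential reading described above.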

Text files are recognized by the file ending "txt" of the passed resource. For resources with other endings the property 'stanbol.it.multithreadtest.media-type=text/plain' must be specified. If the test data are not encoded using 'UTF-8', the charset MUST be passed using the 'charset' parameter (e.g. 'stanbol.it.multithreadtest.media-type=text/plain;charset=iso-8859-7').

RDF data

The tool can also use RDF graphs as test data. This is mainly because in many cases it is easiest to use RDF dumps of public datasets, such as DBpedia.org, for testing. Users need to be aware that RDF data are imported into an in-memory graph.

Content Items are extracted by

Supported RDF formats and mapped file endings:

If you want to use a different file ending you need to pass the media type using the 'stanbol.it.multithreadtest.media-type' property.

Support for compressed test data

Both plain text and RDF data can be efficiently compressed. Because of that, this utility also supports compressed files. The compression format is detected by the file ending.

Supported are

Compressed files need to use double endings (e.g. 'test-data.txt.gz' or 'test-data.rdf.bz2').
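The double-ending convention can be sketched in Python as follows. This is an illustration only, not the utility's own detection code: the names OPENERS, split_endings and open_test_data are hypothetical, and only the two compression endings used in the examples above ('gz' and 'bz2') are mapped.

```python
import bz2
import gzip
import tempfile
from pathlib import Path

# Assumption: only the endings that appear in the examples above are mapped.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def split_endings(name):
    """Return (data ending, compression ending or None) for a file name."""
    suffixes = [s.lower() for s in Path(name).suffixes]
    compression = None
    if suffixes and suffixes[-1] in OPENERS:
        compression = suffixes.pop().lstrip(".")
    data = suffixes[-1].lstrip(".") if suffixes else ""
    return data, compression


def open_test_data(path, mode="rt", encoding="utf-8"):
    """Open a test data file, choosing the opener by its last ending."""
    path = Path(path)
    opener = OPENERS.get(path.suffix.lower(), open)
    return opener(path, mode, encoding=encoding)


# Round trip: write a gzip-compressed plain text file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "test-data.txt.gz"
    with open_test_data(target, "wt") as out:
        out.write("First item\n\n\nSecond item\n")
    with open_test_data(target) as src:
        read_back = src.read()
```

The last ending selects the decompression and the ending before it identifies the data format, which is why a name such as 'test-data.gz' alone would not carry enough information.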