Joshua Shinavier edited this page Feb 9, 2018 · 14 revisions

Introduction

LinkedDataSail is an implementation of the OpenRDF Sail API which provides a dynamic, uniform view of the Web of Data. Like the Semantic Web Client Library and Tabulator's AJAR library, LinkedDataSail gathers RDF data incrementally, dereferencing URIs in response to queries.

Features:

  • use of the StackableSail interface, which allows LinkedDataSail to be easily combined with various Semantic Web reasoners, triple stores, and mappings into other graph data models.
  • guaranteed thread-safety w.r.t. concurrent query answering, benefitting throughput in crawlers and other multi-threaded applications.
  • automatic persistence of cached Semantic Web data into a base Sail.
  • use of Named Graphs and compact caching metadata for human-friendliness and interoperability.
  • configurable URI dereferencers by protocol, allowing RDF views of non-HTTP resources.
  • configurable RDFizers by media type, allowing RDF views of information resources such as Web pages and Exif-compatible images.
  • configurable sets of black-listed file extensions and URI spaces, to cut down on needless requests.
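
As a rough illustration of the last three features, the sketch below shows one way a client could choose a dereferencer by URI scheme and skip URIs with black-listed file extensions. It is not LinkedDataSail's implementation: the class DereferencerRegistrySketch and its methods (register, blockExtension, isBlocked, dereference) are hypothetical names invented for this example, and a dereferencer is modeled as a plain function from URI to String.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch; these are not LinkedDataSail's actual classes.
class DereferencerRegistrySketch {
    // URI scheme (e.g. "http", "jar") -> function that fetches a representation
    private final Map<String, Function<URI, String>> byScheme = new HashMap<>();
    // Lower-cased file extensions that are never fetched
    private final Set<String> blockedExtensions = new HashSet<>();

    void register(String scheme, Function<URI, String> dereferencer) {
        byScheme.put(scheme.toLowerCase(Locale.ROOT), dereferencer);
    }

    void blockExtension(String extension) {
        blockedExtensions.add(extension.toLowerCase(Locale.ROOT));
    }

    // True if the last path segment ends in a black-listed extension.
    boolean isBlocked(URI uri) {
        String path = uri.getPath(); // null for opaque URIs
        if (path == null) {
            return false;
        }
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot <= slash) {
            return false; // no extension in the last segment
        }
        String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return blockedExtensions.contains(ext);
    }

    // Returns the fetched content, or empty if the URI is blocked
    // or no dereferencer is registered for its scheme.
    Optional<String> dereference(URI uri) {
        String scheme = uri.getScheme();
        if (scheme == null || isBlocked(uri)) {
            return Optional.empty();
        }
        Function<URI, String> d = byScheme.get(scheme.toLowerCase(Locale.ROOT));
        return d == null ? Optional.empty() : Optional.ofNullable(d.apply(uri));
    }
}
```

Checking the black list before dispatching means a blocked URI never reaches a dereferencer, which is the point of cutting down on needless requests.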

Usage

To use LinkedDataSail in your Java application, either use the standalone Ripple JAR (which includes LinkedDataSail), build Ripple from source (see Running Ripple) and take the LinkedDataSail JAR from the build, or import LinkedDataSail using Maven:

    <dependencies>
        ...
        <dependency>
            <groupId>net.fortytwo</groupId>
            <artifactId>linked-data-sail</artifactId>
            <version>1.5</version>
        </dependency>
        ...
    </dependencies>

LinkedDataSail is always stacked on top of another, "base" Sail, which provides the storage layer for the aggregated RDF data. The base Sail can be any Sail implementation, such as MemoryStore or NativeStore. The simplest way to instantiate LinkedDataSail is to pass the base Sail to its constructor:

    LinkedDataSail sail = new LinkedDataSail(baseSail);
    sail.initialize();
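
The stacking arrangement can be pictured as a read-through cache: the upper layer fetches data for a resource the first time it is asked about, persists it into the base layer, and answers from the base layer afterwards. The sketch below illustrates only that pattern, using plain Java collections instead of the Sail API. TripleStore, InMemoryStore, and ReadThroughStore are hypothetical stand-ins, not OpenRDF or LinkedDataSail classes, and the fetcher function stands in for URI dereferencing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for a storage layer (the role the base Sail plays).
interface TripleStore {
    void add(String subject, String predicate, String object);

    List<String[]> match(String subject);
}

// Simple in-memory storage layer, indexed by subject.
class InMemoryStore implements TripleStore {
    private final Map<String, List<String[]>> bySubject = new HashMap<>();

    public void add(String s, String p, String o) {
        bySubject.computeIfAbsent(s, k -> new ArrayList<>()).add(new String[] {s, p, o});
    }

    public List<String[]> match(String s) {
        return bySubject.getOrDefault(s, new ArrayList<>());
    }
}

// Stacked layer: fetches data for a subject once, persists it into the
// base store, then answers every query from the base store.
class ReadThroughStore implements TripleStore {
    private final TripleStore base;
    private final Function<String, List<String[]>> fetcher; // assumed dereferencing step
    private final Set<String> fetched = new HashSet<>();

    ReadThroughStore(TripleStore base, Function<String, List<String[]>> fetcher) {
        this.base = base;
        this.fetcher = fetcher;
    }

    public synchronized void add(String s, String p, String o) {
        base.add(s, p, o);
    }

    public synchronized List<String[]> match(String s) {
        if (fetched.add(s)) { // true only the first time this subject is seen
            for (String[] t : fetcher.apply(s)) {
                base.add(t[0], t[1], t[2]);
            }
        }
        return base.match(s);
    }
}
```

Because fetched data lands in the base store, anything with direct access to that store can also see it, which mirrors the idea of persisting cached data into the base Sail.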

You can also customize the Sail with specialized URI mappings, URI dereferencers, or RDFizers. For example:

    // Map the URI space of an ontology to a local resource
    URIMap map = new URIMap();
    map.put("http://www.holygoat.co.uk/owl/redwood/0.1/tags/Tagging",
        MyClass.class.getResource("tags.owl").toString());

    // Add a custom URI dereferencer and RDFizer
    LinkedDataCache cache = LinkedDataCache.createDefault(baseSail, map);
    cache.addDereferencer("jar", new JarURIDereferencer());
    cache.addRdfizer(MediaType.IMAGE_JPEG, new ImageRdfizer(), 0.4);

    // Instantiate and initialize the Sail
    LinkedDataSail sail = new LinkedDataSail(baseSail, cache);
    sail.initialize();
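
The third argument to addRdfizer in the example above (0.4) looks like a quality value. Assuming it plays the role of an HTTP Accept q-value (this page does not say so), the following sketch shows how a set of media types and quality values could be turned into an Accept header that prefers higher-quality types. AcceptHeaderSketch and its methods are hypothetical and are not part of LinkedDataCache.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch; assumes the quality value is used as an HTTP q-value.
class AcceptHeaderSketch {
    // Media type -> quality value in (0, 1], kept in registration order
    private final Map<String, Double> quality = new LinkedHashMap<>();

    void register(String mediaType, double q) {
        if (q <= 0.0 || q > 1.0) {
            throw new IllegalArgumentException("quality must be in (0, 1]: " + q);
        }
        quality.put(mediaType, q);
    }

    // Builds an Accept header value, highest quality first.
    // A quality of 1.0 is the HTTP default, so its q parameter is omitted.
    String acceptHeader() {
        StringJoiner joiner = new StringJoiner(", ");
        quality.entrySet().stream()
            .sorted((a, b) -> Double.compare(b.getValue(), a.getValue()))
            .forEach(e -> joiner.add(e.getValue() >= 1.0
                ? e.getKey()
                : e.getKey() + String.format(Locale.ROOT, ";q=%.3f", e.getValue())));
        return joiner.toString();
    }
}
```

Under this reading, a low value such as 0.4 for JPEG images would tell servers to send an image only when no better RDF representation is available.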

See the createDefault method in the LinkedDataCache source code for an example of instantiating LinkedDataCache from scratch. For API details, see the JavaDocs.

Motivation

LinkedDataSail is the Linked Data client of the Ripple scripting language. It enables "one-liner" path-based queries over Linked Data, where the queries are themselves linkable RDF data. LinkedDataSail has been generalized to support other crawlers and query services as well, such as SPARQL endpoints and Sesame-based tools of all kinds. Gremlin is a related graph programming language which has been adapted to use LinkedDataSail.