Selector expressions are a concise way of selecting some set of nodes in an RDF graph, given a particular starting node. For example, given the graph:
evaluating the selector expression foaf:knows/foaf:name with a starting node of <bob> would yield two literal nodes: "Alice" and "Carol".
The syntax for selector expressions was inspired by XPath syntax, so if you are familiar with XPath you will notice many similarities.
Todo: make graphviz output prettier.
At the heart of selector expressions is the notion of traversing a path through the RDF graph along properties, in the same way XPath can traverse a document tree. Selector expressions are always evaluated with respect to a context node in the graph, which is the starting point for traversals.
The simplest traversal consists of a single RDF property, such as foaf:knows. This expression selects all nodes that are the object of a foaf:knows predicate where the starting node is the subject. In other words, it can be considered equivalent to the following SPARQL query, with ?start bound to the starting node:
SELECT ?o
WHERE { ?start foaf:knows ?o . }
(Actually the simplest traversal is the empty string, which always evaluates to the starting node. This is really only meaningful when used with other syntax elements described below.)
If we used <bob> in the example graph above as our starting node, the expression foaf:knows would evaluate to two resource nodes: <alice> and <carol>. In general a selector expression may yield zero or more results. For example, if we used <alice> as a starting node, the result would be empty.
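To make the single-property traversal concrete, here is a minimal sketch that models the graph as a plain list of triples. It is not how the library is implemented; the class name SimpleTraversalSketch, the traverse method, and the triple data (reconstructed from the results described above) are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an RDF graph: a flat list of {subject, predicate, object} triples.
class SimpleTraversalSketch {

    // Assumed example data, reconstructed from the results described in the text.
    static final List<String[]> TRIPLES = Arrays.asList(
        new String[] {"<bob>", "foaf:knows", "<alice>"},
        new String[] {"<bob>", "foaf:knows", "<carol>"},
        new String[] {"<alice>", "foaf:name", "\"Alice\""},
        new String[] {"<carol>", "foaf:name", "\"Carol\""}
    );

    // Selects every object of a triple whose subject is `start` and whose predicate is `property`.
    static List<String> traverse(String start, String property) {
        List<String> result = new ArrayList<>();
        for (String[] t : TRIPLES) {
            if (t[0].equals(start) && t[1].equals(property)) {
                result.add(t[2]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(traverse("<bob>", "foaf:knows"));   // [<alice>, <carol>]
        System.out.println(traverse("<alice>", "foaf:knows")); // []
    }
}
```

The sketch returns results in list order only because the backing store is a list; a real RDF graph gives no such guarantee.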
Multiple traversals may be chained together using / as a separator, as in foaf:knows/foaf:name.
Note that property URIs are always given in their prefixed form. In order to keep the syntax simple, there is no way to specify a complete URI reference in a selector expression.
The direction of a property traversal can be inverted by prepending ! to the property name. For example, given some article as a starting node, the expression dc:creator/!dc:creator/dc:title might be used to select the titles of all articles written by the authors of the starting article.
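Chaining and inversion can be sketched as a loop over the steps of the path, where each step maps the current set of nodes to the next one. The code below is a simplified illustration over an in-memory triple list, not the library's parser or evaluator; PathTraversalSketch, evaluate, and the article data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical evaluator for paths made of "/"-separated steps, each optionally prefixed with "!".
class PathTraversalSketch {

    // Assumed example data: two articles sharing one author.
    static final List<String[]> TRIPLES = Arrays.asList(
        new String[] {"<a1>", "dc:creator", "<ann>"},
        new String[] {"<a2>", "dc:creator", "<ann>"},
        new String[] {"<a1>", "dc:title", "\"First\""},
        new String[] {"<a2>", "dc:title", "\"Second\""}
    );

    static List<String> evaluate(String start, String path) {
        List<String> current = new ArrayList<>();
        current.add(start);
        if (path.isEmpty()) {
            return current; // the empty expression evaluates to the starting node
        }
        for (String step : path.split("/")) {
            boolean inverse = step.startsWith("!");
            String property = inverse ? step.substring(1) : step;
            List<String> next = new ArrayList<>();
            for (String node : current) {
                for (String[] t : TRIPLES) {
                    if (!t[1].equals(property)) {
                        continue;
                    }
                    if (!inverse && t[0].equals(node)) {
                        next.add(t[2]); // forward: subject -> object
                    }
                    if (inverse && t[2].equals(node)) {
                        next.add(t[0]); // inverted: object -> subject
                    }
                }
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        // Titles of all articles written by the author of <a1>.
        System.out.println(evaluate("<a1>", "dc:creator/!dc:creator/dc:title")); // ["First", "Second"]
    }
}
```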
The set of nodes resulting from a traversal can be filtered with a predicate. The predicate is given in square brackets ([]) following the property name. Predicates may appear at any point in the chain of traversals.
The following predicates are supported:
Multiple predicates may be applied by joining them together with the and keyword, as in !dc:creator[type=bibo:Article and uri-prefix='http://example.com/'].
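Conceptually, a predicate is a boolean test applied to each node, and the and keyword composes two tests. The sketch below illustrates that idea with java.util.function.Predicate. It does not use the library's PredicateResolver API; the Resource class and the helper methods typeIs and uriPrefix are hypothetical stand-ins for the type= and uri-prefix= predicates used in the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class PredicateFilterSketch {

    // Hypothetical, simplified resource: a URI plus a single type in prefixed form.
    static final class Resource {
        final String uri;
        final String type;

        Resource(String uri, String type) {
            this.uri = uri;
            this.type = type;
        }
    }

    // Stand-in for a predicate such as [type=bibo:Article].
    static Predicate<Resource> typeIs(String type) {
        return r -> r.type.equals(type);
    }

    // Stand-in for a predicate such as [uri-prefix='http://example.com/'].
    static Predicate<Resource> uriPrefix(String prefix) {
        return r -> r.uri.startsWith(prefix);
    }

    // Keeps only the nodes that satisfy the predicate and returns their URIs.
    static List<String> filterUris(List<Resource> nodes, Predicate<Resource> predicate) {
        List<String> kept = new ArrayList<>();
        for (Resource r : nodes) {
            if (predicate.test(r)) {
                kept.add(r.uri);
            }
        }
        return kept;
    }

    static List<String> demo() {
        List<Resource> works = Arrays.asList(
            new Resource("http://example.com/a1", "bibo:Article"),
            new Resource("http://example.com/b1", "bibo:Book"),
            new Resource("http://other.example/a2", "bibo:Article"));
        // Equivalent in spirit to [type=bibo:Article and uri-prefix='http://example.com/'].
        return filterUris(works, typeIs("bibo:Article").and(uriPrefix("http://example.com/")));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [http://example.com/a1]
    }
}
```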
Custom predicates may be defined at runtime by supplying a custom PredicateResolver implementation.
The result of evaluating a traversal is zero or more RDF nodes (in Java, implementations of Jena’s RDFNode interface). However, it is often necessary to convert these RDF nodes into a more useful data type, or to perform some post-processing on them.
An adaptation is a function which takes an RDF node and “adapts” it in some way. An adaptation can be specified at the end of a selector expression, preceded by # and optionally followed by an argument list. For example, the expression foaf:knows#uri would evaluate to the URIs of the people known to the starting node. The distinction here is important: whereas foaf:knows evaluates to zero or more RDFNodes, foaf:knows#uri evaluates to zero or more Strings giving the URI of each node.
The following adaptations are supported:
formatted-dt: short for “formatted date-time”. This adaptation can only be applied to literal nodes whose values are represented as Joda datetime types. It takes a single string argument, specifying the date-time format to apply. Use it like this: dc:created#formatted-dt('d MMMM yyyy').
Todo: hacks for Joda are not in stock Jena.
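An adaptation can be thought of as a function from a node value to some more convenient type. The sketch below shows what a pattern such as 'd MMMM yyyy' produces, using java.time from the standard library in place of Joda so that it runs without extra dependencies. FormattedDateTimeSketch and formattedDt are hypothetical names, and the English locale is an assumption.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.function.Function;

class FormattedDateTimeSketch {

    // Builds a function that formats a date-time with the given pattern.
    // Assumption: java.time is swapped in for Joda, and the locale is fixed to English.
    static Function<LocalDateTime, String> formattedDt(String pattern) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH);
        return formatter::format;
    }

    public static void main(String[] args) {
        LocalDateTime created = LocalDateTime.of(2010, 2, 3, 14, 30);
        System.out.println(formattedDt("d MMMM yyyy").apply(created)); // 3 February 2010
    }
}
```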
Custom adaptations may be defined at runtime by supplying a custom AdaptationFactory implementation.
RDF graphs by their nature do not define any ordering, so a selector expression like foaf:knows will return its results in arbitrary order. When we expect the result to contain more than one node, it is often useful to ensure a predictable (repeatable) ordering of the resulting nodes.
Sorting can be applied at any point in the chain of traversals, by giving a sort expression enclosed in parentheses (()). The sort expression can be a complete selector expression (including multiple traversals, nested sorts, and any other selector features). The set of nodes in the traversal is then sorted by evaluating the sort expression for each node and using the resulting values as sort keys. The sort expression may optionally be prepended with ~ to indicate a reverse sort.
For example, given an author as a starting node, !dc:creator(dc:title#comparable-lv) would evaluate to the works created by that author, ordered by the title of each work.
Note that the sort expression must always evaluate to a Java object which implements Comparable, so it is typically necessary to apply the comparable-lv adaptation in the sort expression.
If one expression is not enough to uniquely sort each item in the result, multiple sort expressions can be specified using , to separate them.
A sort expression may optionally be followed by a subscript [n], indicating that only the n-th node in the sorted result (counting from zero) should be selected. For example, !dc:creator(~dc:date)[0]/dc:title might be used to select the title of an author’s most recent work.
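Sorting by a key, reversing with ~, combining several keys with a comma, and taking a subscript all map naturally onto java.util.Comparator. The following sketch illustrates the semantics on plain Java objects; it is not the library's implementation, and the Work class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortSketch {

    // Hypothetical stand-in for a work with a dc:title and a dc:date (reduced to a year).
    static final class Work {
        final String title;
        final int year;

        Work(String title, int year) {
            this.title = title;
            this.year = year;
        }
    }

    static final List<Work> WORKS = Arrays.asList(
        new Work("Beta", 2001), new Work("Alpha", 2005), new Work("Gamma", 2001));

    // Sorts a copy, so the input list is left untouched.
    static List<Work> sorted(List<Work> nodes, Comparator<Work> order) {
        List<Work> copy = new ArrayList<>(nodes);
        copy.sort(order);
        return copy;
    }

    // In the spirit of (~dc:date)[0]/dc:title: reverse sort by date, take the first, read its title.
    // Note: get(0) throws on an empty list; this sketch does not handle that case.
    static String mostRecentTitle(List<Work> works) {
        return sorted(works, Comparator.comparingInt((Work w) -> w.year).reversed()).get(0).title;
    }

    // In the spirit of two comma-separated sort expressions: by date, then by title to break ties.
    static List<String> titlesByYearThenTitle(List<Work> works) {
        Comparator<Work> order = Comparator.comparingInt((Work w) -> w.year)
            .thenComparing((Work w) -> w.title);
        List<String> titles = new ArrayList<>();
        for (Work w : sorted(works, order)) {
            titles.add(w.title);
        }
        return titles;
    }

    public static void main(String[] args) {
        System.out.println(mostRecentTitle(WORKS));       // Alpha
        System.out.println(titlesByYearThenTitle(WORKS)); // [Beta, Gamma, Alpha]
    }
}
```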
Selector expressions can be combined using |. The result of the combined expression is the results of each sub-expression, concatenated in sequence. For example: !dc:creator | !bibo:translator.
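As a rough illustration of the | operator, the sketch below concatenates the result lists of several sub-expressions in the order given. It assumes plain concatenation with no de-duplication, which is how the description above reads; UnionSketch and union are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnionSketch {

    // Concatenates the results of each sub-expression, preserving their order.
    @SafeVarargs
    static <T> List<T> union(List<T>... results) {
        List<T> all = new ArrayList<>();
        for (List<T> r : results) {
            all.addAll(r);
        }
        return all;
    }

    public static void main(String[] args) {
        List<String> authored = Arrays.asList("<w1>");           // e.g. result of !dc:creator
        List<String> translated = Arrays.asList("<w2>", "<w3>"); // e.g. result of !bibo:translator
        System.out.println(union(authored, translated));         // [<w1>, <w2>, <w3>]
    }
}
```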
The following classes in the au.id.djc.rdftemplate.selector package are relevant for compiling and evaluating selector expressions:
The Selector interface represents the compiled version of a selector expression. It is parametrised on the result type of the expression.
Returns the result type of this selector expression. (This is the runtime class of the type parameter T.) For a simple traversal this will be RDFNode, or if an adaptation is applied to the selector expression it will be the result type of the adaptation (such as String or Object).
withResultType is a convenience method to cast the type parameter of this Selector. It always returns this instance. Just a dumb hack to keep Java’s static type checking happy.
Use the AntlrSelectorFactory class to compile selector expressions into Selector instances. Instances of this class can safely be shared across threads (for example, as singleton beans in Spring).
The get method compiles the given selector expression into a Selector instance. For example:
Selector<RDFNode> s1 = factory.get("foaf:knows").withResultType(RDFNode.class);
Selector<String> s2 = factory.get("foaf:knows/foaf:name#string-lv").withResultType(String.class);
Configures a custom AdaptationFactory implementation for selectors created by this factory. If this setter is not called, an instance of DefaultAdaptationFactory will be used.
Configures a custom PredicateResolver implementation for selectors created by this factory. If this setter is not called, an instance of DefaultPredicateResolver will be used.
Configures namespace prefix mappings for selectors created by this factory. If this setter is not called, no namespace prefixes will be defined.
Implement the AdaptationFactory interface if you would like to use custom adaptations in your selector expressions.
Your implementation should fall back to a DefaultAdaptationFactory instance, so that selector expressions have access to the builtin adaptations in addition to your custom ones.
Implement the PredicateResolver interface if you would like to use custom predicates in your selector expressions.
Your implementation should fall back to a DefaultPredicateResolver instance, so that selector expressions have access to the builtin predicates in addition to your custom ones.
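The fallback advice for both interfaces amounts to a simple delegation pattern: handle the names you know, and pass everything else to the default implementation. Because the actual method signatures of AdaptationFactory and PredicateResolver are not shown here, the sketch below uses a hypothetical AdaptationLookup interface, hypothetical adaptation names ("trim" and "shout"), and strings in place of RDF nodes.

```java
import java.util.Locale;
import java.util.function.Function;

class FallbackSketch {

    // Hypothetical lookup interface; the real AdaptationFactory signature may differ.
    interface AdaptationLookup {
        Function<String, String> byName(String name);
    }

    // Stand-in for the default implementation that knows the built-in names.
    static final class BuiltinLookup implements AdaptationLookup {
        @Override
        public Function<String, String> byName(String name) {
            if (name.equals("trim")) {
                return String::trim;
            }
            throw new IllegalArgumentException("No such adaptation: " + name);
        }
    }

    // Custom implementation: resolves its own names first, then delegates to the built-ins.
    static final class CustomLookup implements AdaptationLookup {
        private final AdaptationLookup fallback = new BuiltinLookup();

        @Override
        public Function<String, String> byName(String name) {
            if (name.equals("shout")) {
                return value -> value.toUpperCase(Locale.ROOT);
            }
            return fallback.byName(name);
        }
    }

    public static void main(String[] args) {
        AdaptationLookup lookup = new CustomLookup();
        System.out.println(lookup.byName("shout").apply("alice"));   // ALICE
        System.out.println(lookup.byName("trim").apply("  alice ")); // alice
    }
}
```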
Wrap an AntlrSelectorFactory with this class if you want to avoid compiling selectors anew every time. Do not use this class if the number of different selector expressions is unbounded, as it will cause heap exhaustion.
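The caching behaviour, and the reason for the warning about unbounded input, can be sketched with a ConcurrentHashMap keyed on the expression string. This is an illustration of the idea rather than the library's class: CachingCompilerSketch is a hypothetical name, and the wrapped compiler is modelled as a plain Function.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical caching wrapper: compiles each distinct expression once and reuses the result.
class CachingCompilerSketch<S> {

    private final Function<String, S> compiler;
    // Entries are never evicted, so memory grows with the number of distinct expressions.
    private final Map<String, S> cache = new ConcurrentHashMap<>();

    CachingCompilerSketch(Function<String, S> compiler) {
        this.compiler = compiler;
    }

    S get(String expression) {
        return cache.computeIfAbsent(expression, compiler);
    }

    public static void main(String[] args) {
        AtomicInteger compilations = new AtomicInteger();
        CachingCompilerSketch<String> cache = new CachingCompilerSketch<>(expr -> {
            compilations.incrementAndGet();
            return "compiled(" + expr + ")";
        });
        cache.get("foaf:knows");
        cache.get("foaf:knows");
        System.out.println(compilations.get()); // 1
    }
}
```

If expressions are built dynamically from user input, a bounded cache with an eviction policy would be a safer design than this sketch.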