rdftemplate

Library for generating XML documents from RDF data using templates
git clone https://code.djc.id.au/git/rdftemplate/

Selector expressions
====================

Selector expressions are a concise way of selecting some set of nodes in an RDF
graph, given a particular starting node. For example, given the graph:

.. include:: example-graph.rst.inc

evaluating the selector expression ``foaf:knows/foaf:name`` with a starting node of
``<bob>`` would yield two literal nodes: ``"Alice"`` and ``"Carol"``.

The syntax for selector expressions was inspired by `XPath`_ syntax, so if you
are familiar with XPath you will notice many similarities.

.. todo:: make graphviz output prettier

.. _XPath: http://www.w3.org/TR/xpath/

Syntax
------

Traversing
~~~~~~~~~~

At the heart of selector expressions is the notion of traversing a path through
the RDF graph along properties, in the same way XPath can traverse a document
tree. Selector expressions are always evaluated with respect to a context node
in the graph, which is the starting point for traversals.

The simplest traversal consists of a single RDF property, such as
``foaf:knows``. This expression selects all nodes which are the object of
a foaf:knows predicate where the starting node is the subject. In other words,
it can be considered equivalent to the SPARQL query:

.. code-block:: none

   SELECT ?o
   WHERE { ?start foaf:knows ?o . }

(Actually the *simplest* traversal is the empty string, which always evaluates
to the starting node. This is really only meaningful when used with other
syntax elements described below.)

If we used ``<bob>`` in the example graph above as our starting node, the
expression ``foaf:knows`` would evaluate to two resource nodes: ``<alice>`` and
``<carol>``. In general a selector expression may yield zero or more results.
For example, if we used ``<alice>`` as a starting node, the result would be
empty.

Multiple traversals may be chained together using ``/`` as a separator, as in
``foaf:knows/foaf:name``.

Note that property URIs are always given in their prefixed form. In order to
keep the syntax simple, there is no way to specify a complete URI reference in
a selector expression.
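
The traversal rules above can be sketched in plain Java over a toy list of
triples. This only illustrates the semantics and is not how rdftemplate or Jena
implement them: ``TraversalSketch``, ``step`` and ``evaluate`` are hypothetical
names, and the triples are an assumed reading of the example graph.

.. code-block:: java

   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;

   // Toy model of selector traversal; not the rdftemplate implementation.
   class TraversalSketch {

       // Each triple is {subject, property, object}. Assumed graph contents.
       static final List<String[]> GRAPH = Arrays.asList(
           new String[] {"<bob>", "foaf:knows", "<alice>"},
           new String[] {"<bob>", "foaf:knows", "<carol>"},
           new String[] {"<alice>", "foaf:name", "\"Alice\""},
           new String[] {"<carol>", "foaf:name", "\"Carol\""});

       // One traversal step: the objects of every triple whose subject is in
       // `nodes` and whose property equals `property`.
       static List<String> step(List<String> nodes, String property) {
           List<String> result = new ArrayList<>();
           for (String node : nodes) {
               for (String[] t : GRAPH) {
                   if (t[0].equals(node) && t[1].equals(property)) {
                       result.add(t[2]);
                   }
               }
           }
           return result;
       }

       // Evaluates a chain such as "foaf:knows/foaf:name" from a starting node.
       static List<String> evaluate(String start, String expression) {
           List<String> nodes = new ArrayList<>(Arrays.asList(start));
           if (expression.isEmpty()) {
               return nodes; // the empty expression selects the starting node
           }
           for (String property : expression.split("/")) {
               nodes = step(nodes, property);
           }
           return nodes;
       }

       public static void main(String[] args) {
           System.out.println(evaluate("<bob>", "foaf:knows"));
           System.out.println(evaluate("<bob>", "foaf:knows/foaf:name"));
           System.out.println(evaluate("<alice>", "foaf:knows"));
       }
   }

The toy list happens to preserve insertion order; a real RDF graph makes no
such promise.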

Inverse traversal
~~~~~~~~~~~~~~~~~

The direction of a property traversal can be inverted by prepending ``!`` to
the property name. For example, given some article as a starting node, the
expression ``dc:creator/!dc:creator/dc:title`` might be used to select the
titles of all articles written by the authors of the starting article.
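
As a rough illustration of what inversion means, the following plain-Java
sketch walks a toy list of triples in either direction. It is not the
library's implementation: ``InverseTraversalSketch`` and its methods are
hypothetical, and the article and author nodes are invented for the example.

.. code-block:: java

   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;

   // Toy model of forward and inverse traversal; not the rdftemplate code.
   class InverseTraversalSketch {

       // Each triple is {subject, property, object}. All nodes are made up.
       static final List<String[]> GRAPH = Arrays.asList(
           new String[] {"<article1>", "dc:creator", "<jane>"},
           new String[] {"<article2>", "dc:creator", "<jane>"},
           new String[] {"<article1>", "dc:title", "\"First\""},
           new String[] {"<article2>", "dc:title", "\"Second\""});

       // A leading "!" swaps the roles of subject and object for that step.
       static List<String> step(List<String> nodes, String property) {
           boolean inverse = property.startsWith("!");
           String name = inverse ? property.substring(1) : property;
           int from = inverse ? 2 : 0; // position matched against the current node
           int to = inverse ? 0 : 2;   // position that gets selected
           List<String> result = new ArrayList<>();
           for (String node : nodes) {
               for (String[] t : GRAPH) {
                   if (t[from].equals(node) && t[1].equals(name)) {
                       result.add(t[to]);
                   }
               }
           }
           return result;
       }

       static List<String> evaluate(String start, String expression) {
           List<String> nodes = new ArrayList<>(Arrays.asList(start));
           for (String property : expression.split("/")) {
               nodes = step(nodes, property);
           }
           return nodes;
       }

       public static void main(String[] args) {
           System.out.println(
               evaluate("<article1>", "dc:creator/!dc:creator/dc:title"));
       }
   }

In this toy model the starting article's own title is part of the result,
because the starting article is itself one of its author's works.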

.. _predicates:

Predicates
~~~~~~~~~~

The set of nodes resulting from a traversal can be filtered with a predicate.
The predicate is given in square brackets (``[]``) following the property name.
Predicates may appear at any point in the chain of traversals.

The following predicates are supported:

``type``
    Includes only nodes of the given type. Use it like this:
    ``!dc:creator[type=bibo:Article]``.

``uri-prefix``
    Includes only resource nodes whose URI begins with the given string. Use it
    like this: ``dc:identifier[uri-prefix='urn:issn:']``.

Multiple predicates may be applied by joining them together with the ``and``
keyword, as in ``!dc:creator[type=bibo:Article and uri-prefix='http://example.com/']``.

Custom predicates may be defined at runtime by supplying a custom
:java:class:`PredicateResolver` implementation.
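
Conceptually, a predicate is a filter applied to the nodes selected so far,
and ``and`` combines two filters. A minimal sketch of that idea in plain Java
follows; the ``Node`` class is a hypothetical stand-in for a resource with
a single type, and none of these names come from rdftemplate or its
:java:class:`PredicateResolver` interface.

.. code-block:: java

   import java.util.Arrays;
   import java.util.List;
   import java.util.function.Predicate;
   import java.util.stream.Collectors;

   // Toy model of predicate filtering; not the rdftemplate implementation.
   class PredicateSketch {

       // Hypothetical stand-in for a resource node with a URI and one type.
       static final class Node {
           final String uri;
           final String type;

           Node(String uri, String type) {
               this.uri = uri;
               this.type = type;
           }
       }

       static Predicate<Node> type(String expected) {
           return node -> node.type.equals(expected);
       }

       static Predicate<Node> uriPrefix(String prefix) {
           return node -> node.uri.startsWith(prefix);
       }

       // Returns the URIs of the nodes accepted by the predicate.
       static List<String> select(List<Node> nodes, Predicate<Node> predicate) {
           return nodes.stream()
               .filter(predicate)
               .map(node -> node.uri)
               .collect(Collectors.toList());
       }

       public static void main(String[] args) {
           List<Node> nodes = Arrays.asList(
               new Node("http://example.com/a1", "bibo:Article"),
               new Node("http://example.org/a2", "bibo:Article"),
               new Node("http://example.com/b1", "bibo:Book"));
           // Roughly [type=bibo:Article and uri-prefix='http://example.com/']
           System.out.println(select(nodes,
               type("bibo:Article").and(uriPrefix("http://example.com/"))));
       }
   }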

.. _adaptations:

Adapting the result
~~~~~~~~~~~~~~~~~~~

The result of evaluating a traversal is zero or more RDF nodes (in Java,
implementations of Jena’s :java:class:`RDFNode
<com.hp.hpl.jena.rdf.model.RDFNode>` interface). However, it is often necessary
to convert these RDF nodes into a more useful data type, or to perform some
post-processing on them.

An adaptation is a function which takes an RDF node and “adapts” it in some
way. An adaptation can be specified at the end of a selector expression,
preceded by ``#`` and optionally followed by an argument list. For example, the
expression ``foaf:knows#uri`` would evaluate to the URIs of the people known to
the starting node. The distinction here is important: whereas ``foaf:knows``
evaluates to zero or more :java:class:`RDFNodes
<com.hp.hpl.jena.rdf.model.RDFNode>`, ``foaf:knows#uri`` evaluates to zero or
more :java:class:`Strings <java.lang.String>` giving the URI of each node.

The following adaptations are supported:

``uri``
    Returns the URI of the RDF node as a :java:class:`String
    <java.lang.String>`. Throws an exception if applied to a node which is not
    a resource.

``uri-slice``
    Returns a substring of the URI. This adaptation takes a single integer
    argument specifying the number of characters to be removed. Use it like
    this: ``dc:identifier[uri-prefix='urn:issn:']#uri-slice(9)``.

``uri-anchor``
    Returns the anchor part of the URI, excluding the ``#`` character. Returns
    an empty string if there is no anchor part.

``lv``
    Short for “literal value”. Returns the value of the literal RDF node,
    converted to a Java object using Jena’s type conversion facilities (see
    :java:method:`Literal#getValue()
    <com.hp.hpl.jena.rdf.model.Literal#getValue()>`). Throws an exception if
    applied to a node which is not a literal.

``comparable-lv``
    Essentially the same as ``lv``, but with a runtime check to ensure the
    literal value implements :java:class:`Comparable <java.lang.Comparable>`.
    Only exists for type-safety reasons.

``string-lv``
    Like ``lv``, but additionally calls ``toString()`` on the resulting object
    to ensure it is always a String. This adaptation also strips all tags from
    XML literals.

``formatted-dt``
    Short for “formatted date-time”. This adaptation can only be applied to
    literal nodes whose values are represented as Joda datetime types. It takes
    a single string argument, specifying the date-time format to apply. Use it
    like this: ``dc:created#formatted-dt('d MMMM yyyy')``.

    .. todo:: hacks for Joda are not in stock Jena

Custom adaptations may be defined at runtime by supplying a custom
:java:class:`AdaptationFactory` implementation.
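
To make the string-oriented adaptations more concrete, here is a rough
plain-Java approximation of what ``uri-slice``, ``uri-anchor`` and
``formatted-dt`` compute. It works on plain strings and dates rather than RDF
nodes, and it uses ``java.time`` in place of Joda so that the example needs no
extra libraries. It also assumes that ``uri-slice`` removes characters from the
start of the URI, as the ``urn:issn:`` example suggests. The class and method
names are hypothetical and the library's own code may differ.

.. code-block:: java

   import java.time.LocalDate;
   import java.time.format.DateTimeFormatter;
   import java.util.Locale;

   // Approximations of three adaptations on plain values; not library code.
   class AdaptationSketch {

       // Like uri-slice(n), assuming the characters are dropped from the start.
       static String uriSlice(String uri, int n) {
           return uri.substring(n);
       }

       // Like uri-anchor: the part after '#', or "" if there is none.
       static String uriAnchor(String uri) {
           int hash = uri.indexOf('#');
           return hash < 0 ? "" : uri.substring(hash + 1);
       }

       // Like formatted-dt(pattern), with java.time standing in for Joda.
       static String formattedDate(LocalDate date, String pattern) {
           return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH).format(date);
       }

       public static void main(String[] args) {
           System.out.println(uriSlice("urn:issn:1234-5678", 9));
           System.out.println(uriAnchor("http://example.com/doc#section-2"));
           System.out.println(formattedDate(LocalDate.of(2010, 3, 5), "d MMMM yyyy"));
       }
   }

The argument ``9`` in the ``uri-slice`` example matches the length of the
``urn:issn:`` prefix, so under that assumption only the ISSN itself remains.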

Sorting the result
~~~~~~~~~~~~~~~~~~

RDF graphs by their nature do not define any ordering, so a selector expression
like ``foaf:knows`` will return its results in arbitrary order. When we expect
the result to contain more than one node, it is often useful to ensure
a predictable (repeatable) ordering of the resulting nodes.

Sorting can be applied at any point in the chain of traversals, by giving
a sort expression enclosed in parentheses (``()``). The sort expression can be
a complete selector expression (including multiple traversals, nested sorts,
and any other selector features). The set of nodes in the traversal is then
sorted by evaluating the sort expression for each node, and sorting with these
values as keys. The sort expression may optionally be prepended with ``~`` to
indicate a reverse sort.

For example, given an author as a starting node,
``!dc:creator(dc:title#comparable-lv)`` would evaluate to the works created by
that author, ordered by the title of each work.

Note that the sort expression must always evaluate to a Java object which
implements :java:class:`Comparable <java.lang.Comparable>`, so it is typically
necessary to apply the ``comparable-lv`` adaptation in the sort expression.

If one expression is not enough to uniquely sort each item in the result,
multiple sort expressions can be specified using ``,`` to separate them.
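
Sorting by several keys behaves much like a chained
:java:class:`Comparator <java.util.Comparator>` in Java. The sketch below
sorts people by family name and breaks ties by given name, which is roughly
what a sort list such as
``(foaf:familyName#comparable-lv, foaf:givenName#comparable-lv)`` would
express. The data and the ``SortSketch`` class are invented for illustration;
this is not the library's sorting code.

.. code-block:: java

   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Comparator;
   import java.util.List;

   // Toy model of multi-key sorting; not the rdftemplate implementation.
   class SortSketch {

       // Each person is {givenName, familyName}.
       static List<String> sortedNames(List<String[]> people) {
           List<String[]> sorted = new ArrayList<>(people);
           // First key: family name. Second key, used only for ties: given name.
           sorted.sort(Comparator.comparing((String[] p) -> p[1])
               .thenComparing(p -> p[0]));
           List<String> names = new ArrayList<>();
           for (String[] p : sorted) {
               names.add(p[0] + " " + p[1]);
           }
           return names;
       }

       public static void main(String[] args) {
           System.out.println(sortedNames(Arrays.asList(
               new String[] {"Bob", "Smith"},
               new String[] {"Alice", "Smith"},
               new String[] {"Carol", "Jones"})));
       }
   }

The keys are plain strings here, which already implement ``Comparable``. In
a selector expression that guarantee is what ``comparable-lv`` provides.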

Selecting from many results
~~~~~~~~~~~~~~~~~~~~~~~~~~~

A sort expression may optionally be followed by a subscript ``[n]``, indicating
that only the *n*-th node in the result should be selected. For example,
``!dc:creator(~dc:date)[0]/dc:title`` might be used to select the title of an
author’s most recent work.
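
The "most recent work" example amounts to a reverse sort followed by picking
one element. A minimal plain-Java sketch of that combination, using invented
data and a hypothetical ``SubscriptSketch`` class, might look like this. How
the library behaves when the subscript is out of range is not shown here.

.. code-block:: java

   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Comparator;
   import java.util.List;

   // Toy model of "(~dc:date)[0]/dc:title"; not the rdftemplate implementation.
   class SubscriptSketch {

       // Each work is {title, date}; ISO-8601 date strings sort chronologically.
       static String titleOfLatest(List<String[]> works) {
           List<String[]> sorted = new ArrayList<>(works);
           // Reverse sort by date, like (~dc:date).
           sorted.sort(Comparator.comparing((String[] w) -> w[1]).reversed());
           // Take the first element and read its title, like [0]/dc:title.
           return sorted.get(0)[0];
       }

       public static void main(String[] args) {
           System.out.println(titleOfLatest(Arrays.asList(
               new String[] {"Early notes", "1999-01-01"},
               new String[] {"Later essay", "2010-06-30"})));
       }
   }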

Combining multiple expressions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Selector expressions can be combined using ``|``. The result of the combined
expression is the result of each sub-expression, concatenated in sequence.
For example: ``!dc:creator | !bibo:translator``.
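
Read literally, this is list concatenation in the order the sub-expressions
are written. The following sketch shows that reading in plain Java. The
``UnionSketch`` class is hypothetical, and whether the library removes
duplicates is not stated here, so the sketch simply keeps them.

.. code-block:: java

   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;

   // Toy model of "a | b": concatenates results in order; not library code.
   class UnionSketch {

       @SafeVarargs
       static <T> List<T> union(List<T>... results) {
           List<T> combined = new ArrayList<>();
           for (List<T> result : results) {
               combined.addAll(result); // keeps order and duplicates
           }
           return combined;
       }

       public static void main(String[] args) {
           List<String> created = Arrays.asList("<book1>", "<book2>");
           List<String> translated = Arrays.asList("<book3>");
           System.out.println(union(created, translated));
       }
   }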

Evaluating expressions
----------------------

The following classes in the ``au.id.djc.rdftemplate.selector`` package are
relevant for compiling and evaluating selector expressions:

.. java:class:: au.id.djc.rdftemplate.selector.Selector<T>

   This interface represents the compiled version of a selector expression. It
   is parametrised on the result type of the expression.

   .. java:method:: java.lang.Class<T> getResultType()

      Returns the result type of this selector expression. (This is the runtime
      class of the type parameter T.) For a simple traversal this will be
      :java:class:`RDFNode <com.hp.hpl.jena.rdf.model.RDFNode>`, or if an
      adaptation is applied to the selector expression it will be the result
      type of the adaptation (such as :java:class:`String <java.lang.String>`
      or :java:class:`Object <java.lang.Object>`).

   .. java:method:: Selector<Other> withResultType(java.lang.Class<Other> otherType)

      A convenience method to cast the type parameter of this Selector. Always
      returns this instance. Just a dumb hack to keep Java’s static type
      checking happy.

   .. java:method:: java.util.List<T> result(com.hp.hpl.jena.rdf.model.RDFNode node)

      Evaluates this selector expression with respect to the given starting
      node, and returns the result.

   .. java:method:: T singleResult(com.hp.hpl.jena.rdf.model.RDFNode node)

      Evaluates this selector expression with respect to the given starting
      node, and returns the result. If the selector does not evaluate to
      exactly one node, an exception is thrown.

.. java:class:: au.id.djc.rdftemplate.selector.AntlrSelectorFactory

   Use this class to compile selector expressions into :java:class:`Selector`
   instances. Instances of this class can safely be shared across threads (for
   example, as singleton beans in Spring).

   .. java:method:: au.id.djc.rdftemplate.selector.Selector<?> get(java.lang.String expression)

      Compiles the given selector expression into a :java:class:`Selector`
      instance.

      .. code-block:: java

         // "factory" is an already configured AntlrSelectorFactory instance.
         Selector<RDFNode> s1 = factory.get("foaf:knows").withResultType(RDFNode.class);
         Selector<String> s2 = factory.get("foaf:knows/foaf:name#string-lv").withResultType(String.class);

   .. java:method:: void setAdaptationFactory(au.id.djc.rdftemplate.selector.AdaptationFactory adaptationFactory)

      Configures a custom :java:class:`AdaptationFactory` implementation for
      selectors created by this factory. If this setter is not called, an
      instance of :java:class:`DefaultAdaptationFactory
      <au.id.djc.rdftemplate.selector.DefaultAdaptationFactory>` will be used.

   .. java:method:: void setPredicateResolver(au.id.djc.rdftemplate.selector.PredicateResolver predicateResolver)

      Configures a custom :java:class:`PredicateResolver` implementation for
      selectors created by this factory. If this setter is not called, an
      instance of :java:class:`DefaultPredicateResolver
      <au.id.djc.rdftemplate.selector.DefaultPredicateResolver>` will be used.

   .. java:method:: void setNamespacePrefixMap(java.util.Map<String, String> namespacePrefixMap)

      Configures namespace prefix mappings for selectors created by this
      factory. If this setter is not called, no namespace prefixes will be
      defined.

.. java:class:: au.id.djc.rdftemplate.selector.AdaptationFactory

   Implement this interface if you would like to use custom adaptations in your
   selector expressions.

   Your implementation should fall back to
   a :java:class:`DefaultAdaptationFactory
   <au.id.djc.rdftemplate.selector.DefaultAdaptationFactory>` instance, so that
   selector expressions have access to the built-in adaptations in addition to
   your custom ones.

.. java:class:: au.id.djc.rdftemplate.selector.PredicateResolver

   Implement this interface if you would like to use custom predicates in your
   selector expressions.

   Your implementation should fall back to
   a :java:class:`DefaultPredicateResolver
   <au.id.djc.rdftemplate.selector.DefaultPredicateResolver>` instance, so that
   selector expressions have access to the built-in predicates in addition to
   your custom ones.

.. java:class:: au.id.djc.rdftemplate.selector.EternallyCachingSelectorFactory

   Wrap an :java:class:`AntlrSelectorFactory` with this class if you want to
   avoid compiling selectors anew every time. Do not use this class if the
   number of different selector expressions is unbounded, as it will cause heap
   exhaustion.
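
The warning about unbounded expressions is easier to see with a sketch of the
general technique. The class below is a hypothetical, simplified cache that
never evicts entries. It is not the source of
:java:class:`EternallyCachingSelectorFactory`, and the ``compiler`` function
merely stands in for a real selector factory.

.. code-block:: java

   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.function.Function;

   // Sketch of a never-evicting cache around an expensive compile step.
   class CachingFactorySketch<S> {

       private final Function<String, S> compiler;
       private final Map<String, S> cache = new ConcurrentHashMap<>();

       CachingFactorySketch(Function<String, S> compiler) {
           this.compiler = compiler;
       }

       // Compiles each distinct expression once and reuses the result.
       S get(String expression) {
           return cache.computeIfAbsent(expression, compiler);
       }

       // Grows by one for every distinct expression ever requested.
       int size() {
           return cache.size();
       }

       public static void main(String[] args) {
           CachingFactorySketch<String> factory =
               new CachingFactorySketch<>(e -> "compiled:" + e);
           factory.get("foaf:knows");
           factory.get("foaf:knows"); // served from the cache
           factory.get("foaf:name");
           System.out.println(factory.size());
       }
   }

Because nothing is ever removed, memory use in such a cache is proportional to
the number of distinct expressions seen, which is harmless for a fixed set of
templates and a problem when expressions are generated dynamically.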