Graphical “more like this” Query Building
I promised in an earlier blog post to talk about how to create queries over OWL in RDF. So here it is.
As Ivan alluded to in his comment, there are some syntax issues with talking about OWL restrictions in RDF. What is he referring to? Well, let's take the same example from the last blog post, a datatype restriction about things with age>=21. We could write this in Manchester Syntax as
hasAge only xsd:integer [>=21]
But the OWL/RDF rendition of this is where the 'arcane' syntax comes in. We can see it just by looking at the source code in turtle, where it looks like this:
[] a owl:Restriction ;
   owl:allValuesFrom
      [ a rdfs:Datatype ;
        owl:onDatatype xsd:integer ;
        owl:withRestrictions
           ( [ xsd:minInclusive 21 ] )
      ] ;
   owl:onProperty :hasAge .
In the last blog entry, we saw a rule that would match this sort of definition, so that we could classify persons of appropriate ages as Adults. That rule looked like this:
CONSTRUCT {
  ?x a ?restriction .
}
WHERE {
  ?datatype owl:onDatatype xsd:integer .
  ?datatype owl:withRestrictions ?var .
  ?datatype a rdfs:Datatype .
  ?restriction owl:allValuesFrom ?datatype .
  ?restriction a owl:Restriction .
  ?restriction owl:onProperty ?datatypeproperty .
  ?var rdf:first ?var1 .
  ?var1 xsd:minInclusive ?mval .
  ?x ?datatypeproperty ?val .
  FILTER (?val >= ?mval) .
}
How do you write a rule like that? By looking up in the standard how to express datatype restrictions, and how to link those to restricted value sets, and so on. If that seems labor-intensive and error-prone to you, then you're right. It is.
But we can use a power tool to help make this happen. The power tools aren't included in the free version of TopBraid Composer, so if you want to follow along here, you'll need the Maestro Edition; a 30-day trial is available for free.
Start by loading http://workingontologist.org/Examples/adult.rdf into Composer, just as shown before, and open it. We're going to use the model itself as a prototype to create a query. Let's start by looking at an example of the restriction we want to match - look at the definition of Adult in the model:
You can type it in just like that. But that doesn't help us write a SPARQL query to match any restriction of this form. How can we do that? If you click on "Graph" at the bottom of the pane, you can explore this definition, in RDF. If you drill down to the Datatype Restriction itself, you get a view like the top of this figure:
This is just a graphic representation of triples in the model - you can see all the structure of the RDF representation of the restriction.
Now comes the fun part - let's turn this image into a query (which, to avoid suspense, is already shown at the bottom of the figure). We want a query that will match "things like this" restriction. What does "like this" mean? That's what we have to specify - there are some aspects of this example that should be included in the match (like the fact that it is an owl:Restriction, on an rdfs:Datatype based on xsd:integer, and that it is an xsd:minInclusive restriction), and others that should not be included in the match (that the property is :hasAge; after all, we want this to match restrictions on any property). So, we select the things that we want to keep in the query, marked with a small "x" (you can set/reset the "x" by clicking on the small box in each node in the graph).
Once you have selected the aspects that specify what you mean by "like this" (a Datatype Restriction, on some property, with minInclusive over xsd:integers), you can generate the query automatically by clicking the query generation button. You can see the generated query at the bottom of the figure.
All the generator did was to take the triples shown in the figure, and render them in the query. Selected nodes (with "x") appear in the query as themselves; unselected nodes (no "x") become variables. Properties always show up as themselves. The generator makes a best guess at meaningful variable names, using type information where it is available.
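The behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not TopBraid Composer's actual implementation: the function name `generate_where`, the blank node labels (`_:r`, `_:d`, `_:l`, `_:f`) and the `types` hint table are all assumptions made for the example.

```python
def generate_where(triples, selected, types):
    """Render triples as a SPARQL WHERE clause.

    Selected nodes stay as constants, unselected nodes become variables,
    and properties are always emitted as themselves.
    """
    names = {}    # unselected node -> variable name
    used = set()  # variable names already handed out

    def term(node):
        if node in selected:
            return node
        if node not in names:
            # Guess a name from the node's type, falling back to "var".
            base = types.get(node, "var").split(":")[-1].lower()
            name, n = base, 0
            while name in used:
                n += 1
                name = f"{base}{n}"
            used.add(name)
            names[node] = name
        return "?" + names[node]

    lines = [f"  {term(s)} {p} {term(o)} ." for s, p, o in triples]
    return "WHERE {\n" + "\n".join(lines) + "\n}"


# The triples behind the Adult restriction (blank node labels are hypothetical).
triples = [
    ("_:r", "a", "owl:Restriction"),
    ("_:r", "owl:onProperty", ":hasAge"),
    ("_:r", "owl:allValuesFrom", "_:d"),
    ("_:d", "a", "rdfs:Datatype"),
    ("_:d", "owl:onDatatype", "xsd:integer"),
    ("_:d", "owl:withRestrictions", "_:l"),
    ("_:l", "rdf:first", "_:f"),
    ("_:f", "xsd:minInclusive", "21"),
]
# Nodes marked with an "x" in the graph view.
selected = {"owl:Restriction", "rdfs:Datatype", "xsd:integer", "21"}
# Type information used only for naming variables (assumed for this sketch).
types = {
    "_:r": "owl:Restriction",
    "_:d": "rdfs:Datatype",
    ":hasAge": "owl:DatatypeProperty",
}

query = generate_where(triples, selected, types)
print(query)
```

Because :hasAge was left unselected, it comes out as a variable, while the selected constant 21 is kept in the pattern as-is.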
There are a few differences between the generated query and the WHERE clause of the rule:
WHERE {
  ?datatype owl:onDatatype xsd:integer .
  ?datatype owl:withRestrictions ?var .
  ?datatype a rdfs:Datatype .
  ?restriction owl:allValuesFrom ?datatype .
  ?restriction a owl:Restriction .
  ?restriction owl:onProperty ?datatypeproperty .
  ?var rdf:first ?var1 .
  ?var1 xsd:minInclusive ?mval .
  ?x ?datatypeproperty ?val .
  FILTER (?val >= ?mval) .
}
The first difference is ordering of triples - the generator isn't very fussy about the order in which triples are generated, so it is different each time (if you are following along at home, your generated query will probably be different from the one shown here, and also from the rule).
The second difference is the inclusion of a triple to match data, to wit:
?x ?datatypeproperty ?val .
After all, in a rule, we want to say "when some data satisfies this restriction, ..." This clause uses the same variable for the property (?datatypeproperty) as used in the rest of the query.
The final difference has to do with the constant "21". The generated query includes the constant, whereas the rule turns it into a variable (?mval) and adds a filter to compare it to the actual data (?val). After all, the value "21" comes from the model, and shouldn't be built in to the rule.
So yes, these modifications have to be made by hand (using the SPARQL editor, where the generator put the query). The query generator should be seen as a power tool; you still need an operator who knows how to use it, but it simplifies a lot of the heavy lifting for query writing. In this case, we have a rule with 10 clauses (9 triples and a filter). The generator created seven of the triples, and most of the eighth one; the human only had to write the last two clauses. That is, the power tool took care of the "arcane syntax" that Ivan referred to, leaving the human to figure out what they really want the rule to mean.
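To see why the threshold belongs in the model rather than in the rule, here is a plain-Python imitation of what the finished rule does. It is not a SPARQL engine, just a sketch of the rule's effect; the function `apply_rule`, the restriction names and the sample people are all hypothetical.

```python
def apply_rule(restrictions, data):
    """Imitate the CONSTRUCT rule.

    restrictions maps a restriction name to (property, min_inclusive),
    standing in for ?restriction, ?datatypeproperty and ?mval.
    data is a list of (subject, property, value) triples,
    standing in for ?x ?datatypeproperty ?val.
    """
    inferred = set()
    for restriction, (prop, mval) in restrictions.items():
        for x, p, val in data:
            # Same test as FILTER (?val >= ?mval), on the same property.
            if p == prop and val >= mval:
                inferred.add((x, "a", restriction))
    return inferred


# Hypothetical model: the threshold 21 lives here, not in the rule.
restrictions = {":AdultRestriction": (":hasAge", 21)}
# Hypothetical instance data.
data = [
    (":alice", ":hasAge", 34),
    (":bob", ":hasAge", 17),
]

inferred = apply_rule(restrictions, data)

# Adding another restriction to the model needs no change to the rule.
restrictions[":SeniorRestriction"] = (":hasAge", 65)
inferred_more = apply_rule(restrictions, data + [(":carol", ":hasAge", 70)])
```

Had the constant 21 stayed in the rule, the second restriction would have required a second, nearly identical rule.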
I use this feature of TopBraid Composer all the time, in this pattern. I want to write a query that matches some 'arcane' bit of RDF (e.g., from dbpedia, the OWL in RDF standard, the XML DOM, SKOS, etc.). Instead of trying to write a query from scratch, I find (or even build) an example of the thing I want to match. Then I generate the query - automatically guaranteeing that I didn't leave out any triples, that I got all the namespaces and property names correct, that I didn't accidentally collide bnodes by giving them the same variable name, etc. Then I beat up the result to create the query that I really want - in which I define what I want to do with the match.
So when you see an elaborate query with dozens of triples in it, and you wonder what sort of geek can write or maintain such a thing, keep in mind that it might not have been written at all; it might have been generated from an example.







At the PLA 2009 conference last week, Bob McKee, Chief Executive of CILIP, proudly presented a new set of guidelines as to what makes a good library service. Compared with the bulky, text-heavy and complex language of traditional library guidelines, this A5 pamphlet could easily be mistaken for an advert or flyer. However, this is not to be perceived as a bad thing. The concise manner in which it is presented leaves no room for hot air and leaves it to do exactly what it says on the tin: guide.
Last week, the All-Party Parliamentary Group launched their new report: an inquiry into the governance and leadership of the public library service in England. On the basis of the progression we have seen with the DCMS modernisation review, I had little expectation of this report providing any real insight or vision. As I worked my way through the report, I found myself scribbling and highlighting away, only to find the very thought I had just noted to be clarified in the upcoming paragraph. So I was pleasantly surprised to say the least, as I found the report to consider more perspectives than I anticipated.
Day 3 and it’s the final day of the Public Library Association conference 2009. I had low expectations for the day, as I misread the conference programme to believe the day would be dwindling to an end. Yet as the first session began, I was quickly proven wrong.
During the last few sessions, I began concluding my thoughts of the three days and of my first PLA conference. Though officially the themes were centred on community engagement, in hindsight, I felt it was something quite different. Reading between the lines, I felt the main focus of the delegates wasn’t around engaging with their communities at all, but more about justifying their existence. Cases like Wirral and, more recently, the proposed library closures in Aberdeenshire have left librarians constantly thinking about how they can build their portfolio of ammunition, should their service come under the firing line some time soon. And if recent goings-on are anything to go by, it’s almost certain that they will have to in the coming years. Each speaker seemed aware of this too. Though not literally, each was providing ideas and models to do so, with the term ‘outcome based accountability’ sneaking in quite frequently.
The review itself is to be published in a much faster paced climate than previously published reports, she explained, and therefore, the DCMS do not intend for it to be the last word in the conversation. Margaret would like the time to input her thoughts on the paper before release, and publish as a consultation document. The cynic may read this as a lack of ideas or direction on the DCMS’ part, yet others may believe wider consultation is a genuine attempt to engage with those experienced in the field. In her closing statements, she encouraged librarians to get in touch, as she would like to produce a comprehensive and controversial report. She promised that the Government remains committed to strong and modern public library services and will continue to value and champion them.