I use the following query to find all classes with the parent class PUMP:

PREFIX dm: <http://dm.rdlfacade.org/data#>
PREFIX rdl: <http://rds.posccaesar.org/2008/06/OWL/RDL#>
SELECT 
  ?supclDesig ?supclIdPCA
  ?subclDesig ?subclDef
  ?subclIdPCA ?subcl
WHERE 
{
  ?x dm:hasSuperclass ?supcl .
  ?x dm:hasSubclass ?subcl .
  ?subcl rdl:hasIdPCA ?subclIdPCA ;
         rdl:hasDefinition ?subclDef ;
         rdl:hasDesignation ?subclDesig .
  ?supcl rdl:hasIdPCA ?supclIdPCA ;
         rdl:hasDesignation ?supclDesig ;
         rdl:hasDesignation "PUMP"
}
LIMIT 2000

Now, I know that I can duplicate the expression above, change PUMP to VALVE, and put a UNION between the two expressions to get a list of classes where the parent class is either PUMP or VALVE. However, it just itches all over my body having to duplicate code like that. An alternative way to do it is this:

PREFIX dm: <http://dm.rdlfacade.org/data#>
PREFIX rdl: <http://rds.posccaesar.org/2008/06/OWL/RDL#>
SELECT 
  ?supclDesig ?supclIdPCA
  ?subclDesig ?subclDef
  ?subclIdPCA ?subcl
WHERE 
{
  ?muu dm:hasSuperclass ?supcl .
  ?muu dm:hasSubclass ?subcl .
  ?subcl rdl:hasIdPCA ?subclIdPCA ;
         rdl:hasDefinition ?subclDef ;
         rdl:hasDesignation ?subclDesig .
  ?supcl rdl:hasIdPCA ?supclIdPCA .
  ?supcl rdl:hasDesignation ?supclDesig .
  FILTER (regex(?supclDesig, "^VALVE$") || regex(?supclDesig, "^PUMP$"))
}
LIMIT 2000

That, however, has the drawback that a lot more data is gone through and then filtered according to the regexps, so it is much slower. Isn't there a way to write something like this:

PREFIX dm: <http://dm.rdlfacade.org/data#>
PREFIX rdl: <http://rds.posccaesar.org/2008/06/OWL/RDL#>
SELECT 
  ?supclDesig ?supclIdPCA
  ?subclDesig ?subclDef
  ?subclIdPCA ?subcl
WHERE 
{
  ?x dm:hasSuperclass ?supcl .
  ?x dm:hasSubclass ?subcl .
  ?subcl rdl:hasIdPCA ?subclIdPCA ;
         rdl:hasDefinition ?subclDef ;
         rdl:hasDesignation ?subclDesig .
  ?supcl rdl:hasIdPCA ?supclIdPCA ;
         rdl:hasDesignation ?supclDesig ;
         rdl:hasDesignation ("PUMP", "VALVE")
}
LIMIT 2000

("PUMP", "VALVE") above is just an example of what it could look like.

3 Answers

2

Would something like this work? (Untested.)

PREFIX dm: <http://dm.rdlfacade.org/data#>
PREFIX rdl: <http://rds.posccaesar.org/2008/06/OWL/RDL#>
SELECT 
  ?supclDesig ?supclIdPCA
  ?subclDesig ?subclDef
  ?subclIdPCA ?subcl
WHERE 
{
  {?x dm:hasSuperclass ?supcl .
   ?supcl rdl:hasDesignation "PUMP"} UNION
  {?x dm:hasSuperclass ?supcl .
   ?supcl rdl:hasDesignation "VALVE"}
  ?x dm:hasSubclass ?subcl .
  ?subcl rdl:hasIdPCA ?subclIdPCA ;
         rdl:hasDefinition ?subclDef ;
         rdl:hasDesignation ?subclDesig .
  ?supcl rdl:hasIdPCA ?supclIdPCA ;
         rdl:hasDesignation ?supclDesig
}

LIMIT 2000

link|flag
Excellent! It works well, and is fast too. It is clear that it takes a while to grok the SPARQL syntax when coming from the world of SQL - things that look like they work the same are not at all the same. But I'll get there, I'm sure :) Thanks! By the way, it's "rdl", not "rd1", but the font might make it hard to see the difference between the "ell" and the "one". – Mathias Dahl Mar 4 at 8:30
3

Currently there is no mechanism to do what you suggested in your last example, and the working drafts of SPARQL 1.1 do not have one either, though I think they are discussing adding syntax to achieve this: I saw Andy Seaborne tweet something about IN and NOT IN syntax the other day.

For your use case you could do it with a more efficient FILTER like so:

FILTER(SAMETERM(?supclDesig,"PUMP") || SAMETERM(?supclDesig,"VALVE"))

This would still require loading more data than strictly necessary and then filtering over it, but it checks RDF term equality to decide whether the designation is one of the designations you are interested in, which should be far more efficient than the REGEX approach. Also note the use of SAMETERM as opposed to equality (e.g. ?supclDesig = "PUMP"), since equality does value-based comparison, which is typically a more costly operation than term-based equality.
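As a small illustration of that difference, here is a sketch using two ASK queries, to be run separately. The xsd prefix and the integer literals are illustrative assumptions, not taken from the RDL data; they are chosen because the two literals denote the same value but are different RDF terms.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Value-based equality: both literals denote the integer 1,
# so the "=" comparison succeeds.
ASK { FILTER("01"^^xsd:integer = "1"^^xsd:integer) }
```

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Term-based equality: the lexical forms "01" and "1" differ,
# so these are not the same RDF term and the comparison fails.
ASK { FILTER(sameTerm("01"^^xsd:integer, "1"^^xsd:integer)) }
```

For the designations in the question this distinction should not change the results, assuming they are stored as plain literals such as "PUMP" (which the question's first query suggests, since it matches "PUMP" directly).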

Another alternative is to UNION over only the bit of interest. Where you have the triple pattern that checks the parent class, replace it with a group graph pattern that UNIONs over the two possibilities. This reduces the need to duplicate code, e.g.

{{ ?x ex:parentClass ex:A } UNION { ?x ex:parentClass ex:B}}
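
For readers whose engine implements the final SPARQL 1.1 recommendation: the IN operator mentioned at the top of this answer did make it into the standard, along with a VALUES clause for inline data. Below is a sketch of the question's query using VALUES. It assumes the endpoint supports SPARQL 1.1, which has not been verified for the endpoint discussed here, and that each superclass designation is a plain literal.

```sparql
PREFIX dm: <http://dm.rdlfacade.org/data#>
PREFIX rdl: <http://rds.posccaesar.org/2008/06/OWL/RDL#>
SELECT
  ?supclDesig ?supclIdPCA
  ?subclDesig ?subclDef
  ?subclIdPCA ?subcl
WHERE
{
  # Inline list of the allowed parent-class designations (SPARQL 1.1 only).
  VALUES ?supclDesig { "PUMP" "VALVE" }
  ?supcl rdl:hasDesignation ?supclDesig ;
         rdl:hasIdPCA ?supclIdPCA .
  ?x dm:hasSuperclass ?supcl .
  ?x dm:hasSubclass ?subcl .
  ?subcl rdl:hasIdPCA ?subclIdPCA ;
         rdl:hasDefinition ?subclDef ;
         rdl:hasDesignation ?subclDesig .
}
LIMIT 2000
```

The same restriction could also be written as FILTER(?supclDesig IN ("PUMP", "VALVE")). As with any FILTER, how early it is applied depends on the engine's optimizer, so it is worth timing both forms.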
link|flag
Thanks for the prompt reply, it was really useful! The UNION example you gave works but is many many times (seconds vs minutes) slower than the longer UNION example I gave. Maybe I am using it the wrong way? Here is what I tried: ... ?supcl rdl:hasIdPCA ?supclIdPCA . ?supcl rdl:hasDesignation ?supclDesig . {{?supcl rdl:hasDesignation "PUMP"} UNION {?supcl rdl:hasDesignation "VALVE"}} ... This took over seven minutes. If I instead duplicate the whole expression and add a UNION (changing PUMP to VALVE) it takes a few seconds. – Mathias Dahl Feb 26 at 11:51
By the way, the queries are tested here: rdl.rdlfacade.org/… – Mathias Dahl Feb 26 at 11:54
Huh, probably due to how the underlying SPARQL engine executes its queries, resulting in it having to do a big Cartesian product somewhere. I'd stick to using the complete UNION approach for the time being, even if it does replicate code, since the SPARQL engine you're using performs significantly better on that form of query – Rob Vesse Feb 26 at 13:57
Yeah, I'll stick with the longer version. Thanks! – Mathias Dahl Feb 26 at 14:15
0

This query works at http://rdl.rdlfacade.org/data?info=&search=&sparql. Any reasonable SPARQL query processor will be able to execute this as efficiently as the union the OP posted.

PREFIX dm: <http://dm.rdlfacade.org/data#>
PREFIX rdl: <http://rds.posccaesar.org/2008/06/OWL/RDL#>
SELECT 
  ?supclDesig ?supclIdPCA
  ?subclDesig ?subclDef
  ?subclIdPCA ?subcl
WHERE 
{
  ?x dm:hasSuperclass ?supcl .
  ?x dm:hasSubclass ?subcl .
  ?subcl rdl:hasIdPCA ?subclIdPCA ;
         rdl:hasDefinition ?subclDef ;
         rdl:hasDesignation ?subclDesig .
  ?supcl rdl:hasIdPCA ?supclIdPCA ;
         rdl:hasDesignation ?supclDesig ;
         rdl:hasDesignation ?d.
  filter(?d = "PUMP" || ?d = "VALVE").
}
LIMIT 2000
link|flag
I think that also falls victim to: "That however have the drawback that a lot more data is gone through and is then filtered according to the regexps, so it is much slower." Best to filter as soon as possible so as to avoid retrieving all the info from your entire model (e.g. subclIdPCA, subclDef, etc.), only to have to filter it out later anyway. – Jeff Schmitz Mar 8 at 15:38
Not sure I understand. With this query a reasonable SPARQL processor will evaluate { ?subcl rdl:hasDesignation ?x. } with the ?x = "PUMP" and ?x = "VALVE" constraints (assuming that is the most restrictive pattern according to the underlying data provider). In that case you would not be retrieving a lot more data than necessary, only what is required to perform the query processing. A union does exactly the same thing as an OR in this case. – spoon16 Mar 11 at 23:16
I tried this query now (here: rdl.rdlfacade.org/…) and it does not work, gives me an empty result set. – Mathias Dahl Mar 17 at 7:00
Mathais, sorry about that I reused the ?x variable in the query inappropriately. I have fixed the answer and validated results on the page you posted. – spoon16 Mar 17 at 16:20
I apologize for misspelling your name as well :) – spoon16 Mar 19 at 0:21
