@frensjan @afs @TallTed @lisp @JervenBolleman
(I don't even know how to define this issue: feel free to edit the title!)
@frensjan in #100 (comment) started a discussion on which bindings are passed between which SPARQL clauses and formulated some nice queries to exercise these questions.
I posted similar things in #103 (but they are not yet reflected below).
Different SPARQL processors return different results on such basic queries :-(
- Jena on https://sparql.org/sparql
- Blazegraph on Wikidata endpoint
- Virtuoso with "Strict checking of void variables" off (&signal_unconnected=on) on the dbpedia.org endpoint
- GraphDB up to 10.6.1 (RDF4J 4.3.9)
I don't know SPARQL algebra very well, but I guess it all comes from the bottom-up execution semantics of SPARQL.
- "Canonical" below is my understanding (or what someone else said) of what should happen according to the spec
- "Intuition" is what I think should happen, or what is most "useful".
Coincidentally, rdf4j's behavior matches that intuition. I do have a vested interest, so I won't claim this is the "right" behavior.
Now: I have no illusions that the group will change SPARQL semantics to fit my intuitions.
But maybe some option/flag/"mode" can be added to change the treatment of bindings.
At the least, this issue will serve as a big warning for the unwary.
Brackets
- Intuition: adding brackets should not change the results
- Canonical: brackets make a new sub-clause that is executed independently of the others (bottom-up), so in this case should return no results
PREFIX : <http://example.org/>
SELECT * WHERE {
  VALUES ?x { :x }
  {
    FILTER( BOUND(?x) )
    BIND( :y as ?y )
    BIND( ?x as ?z )
  }
}
- Jena, Blazegraph, Virtuoso: no results
- rdf4j: one result :x :y :z
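A toy Python model of bottom-up evaluation may help show where the "no results" answer comes from. It is only a sketch: plain dicts stand in for solution mappings, an unbound variable lookup stands in for an expression error, and the helper names (compatible, join, extend) are made up for illustration, not taken from any engine.

```python
# Solution mappings are plain dicts: variable name -> value.

def compatible(a, b):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(b[k] == v for k, v in a.items() if k in b)

def join(left, right):
    """Join: merge every compatible pair of mappings."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

def extend(rows, var, expr):
    """BIND: add var when expr evaluates; leave it unbound on error."""
    out = []
    for row in rows:
        try:
            out.append({**row, var: expr(row)})
        except KeyError:  # unbound variable, modelled as an expression error
            out.append(dict(row))
    return out

values = [{"x": ":x"}]                          # VALUES ?x { :x }

# The inner group is evaluated on its own, starting from one empty mapping.
inner = [{}]
inner = extend(inner, "y", lambda r: ":y")      # BIND( :y AS ?y )
inner = extend(inner, "z", lambda r: r["x"])    # BIND( ?x AS ?z ): ?x is unbound here
inner = [r for r in inner if "x" in r]          # FILTER( BOUND(?x) ) over the whole group

result = join(values, inner)
print(result)  # []
```

In this model the filter removes the only row of the inner group before the join with VALUES ever happens, which matches the Jena, Blazegraph and Virtuoso answer.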
Optional
- Intuition: OPTIONAL can only "enlarge" the result, it should not remove rows nor bindings
- Canonical: in LHS optional {RHS}, LHS bindings should not be passed to RHS
PREFIX : <http://example.org/>
SELECT * WHERE {
  VALUES ?x { :x }
  OPTIONAL {
    FILTER( BOUND(?x) )
    BIND( :y as ?y )
    BIND( ?x as ?z )
  }
}
- Blazegraph, Virtuoso: one result :x
- Jena: one result :x :y but why is y bound?
- Rdf4j: one result :x :y :z (reported by @frensjan as "Optional incorrectly binds values from LHS in the RHS graph pattern", eclipse-rdf4j/rdf4j#4882)
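A possible explanation for the Jena result, assuming the SPARQL 1.1 translation rules are read correctly: a FILTER placed directly inside the OPTIONAL group is not applied inside that group but becomes the condition of the LeftJoin, so it is tested on the merged row where ?x is already bound. The Python sketch below models that reading; dicts stand in for solution mappings and the helper names are hypothetical.

```python
def compatible(a, b):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(b[k] == v for k, v in a.items() if k in b)

def extend(rows, var, expr):
    """BIND: add var when expr evaluates; leave it unbound on error."""
    out = []
    for row in rows:
        try:
            out.append({**row, var: expr(row)})
        except KeyError:  # unbound variable, modelled as an expression error
            out.append(dict(row))
    return out

def left_join(left, right, cond):
    """LeftJoin: keep merged rows that pass cond, else keep the left row alone."""
    out = []
    for a in left:
        merged = [{**a, **b} for b in right if compatible(a, b)]
        kept = [m for m in merged if cond(m)]
        out.extend(kept if kept else [a])
    return out

lhs = [{"x": ":x"}]                           # VALUES ?x { :x }

# RHS without its FILTER, evaluated on its own from one empty mapping.
rhs = [{}]
rhs = extend(rhs, "y", lambda r: ":y")        # BIND( :y AS ?y ) succeeds
rhs = extend(rhs, "z", lambda r: r["x"])      # BIND( ?x AS ?z ) errors: ?z stays unbound

# FILTER( BOUND(?x) ) is checked on the merged row, where ?x is bound.
result = left_join(lhs, rhs, lambda m: "x" in m)
print(result)  # [{'x': ':x', 'y': ':y'}]
```

Under this reading ?y is bound because its BIND needs nothing from the LHS, while ?z is missing because ?x is not visible when the RHS is evaluated.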
Union
- Intuition: values "before" (outside) Union clauses are used in the union.
  In particular, this is crucial for fetching multivalued props of a subject.
- Canonical: no outside bindings are used in Union clauses
PREFIX : <http://example.org/>
SELECT * WHERE {
  VALUES ?x { :x }
  {} union {
    FILTER( BOUND(?x) )
    BIND( :y as ?y )
    BIND( ?x as ?z )
  }
}
- Jena, Blazegraph, Virtuoso: one result :x
- Rdf4j: two results :x and :x :y :z
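The single-row answer can be reproduced with a small Python sketch of bottom-up evaluation, under the assumption that each Union branch is evaluated without seeing the VALUES bindings and only joined with them afterwards. Dicts stand in for solution mappings; the helper names are illustrative only.

```python
def compatible(a, b):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(b[k] == v for k, v in a.items() if k in b)

def join(left, right):
    """Join: merge every compatible pair of mappings."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

values = [{"x": ":x"}]                      # VALUES ?x { :x }

empty_branch = [{}]                         # {} yields one empty mapping

# Second branch, evaluated without access to ?x from the VALUES clause.
# BIND( :y AS ?y ) succeeds; BIND( ?x AS ?z ) errors, so ?z stays unbound.
second = [{"y": ":y"}]
second = [r for r in second if "x" in r]    # FILTER( BOUND(?x) ) drops the row

union = empty_branch + second               # Union concatenates both branches
result = join(values, union)
print(result)  # [{'x': ':x'}]
```

Only the empty branch survives, so the join with VALUES gives the one row :x.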
Referential Transparency
- Intuition: binding a sub-expression to a variable then using it in bigger expressions should give the same result.
  So by any reasonable referential transparency principle, these should give the same result (except the latter should also bind ?effectivePrice)
- Canonical: @frensjan writes these should give different results, but I don't know why
SELECT * {
  VALUES ?price { 10 }
  OPTIONAL {
    VALUES ?discount { 0.10 }
    FILTER( ?price * (1 - ?discount) < 10 )
  }
}
has different semantics from:
SELECT * {
  VALUES ?price { 10 }
  OPTIONAL {
    VALUES ?discount { 0.10 }
    BIND( ?price * (1 - ?discount) AS ?effectivePrice )
    FILTER( ?effectivePrice < 10 )
  }
}
- Jena and Virtuoso: 10 0.1 and 10
- Blazegraph: 10 and 10
- rdf4j: 10 0.1 and 10 0.1 9.0
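One way to see why the two queries could differ is to model both in Python, assuming (as the SPARQL 1.1 algebra appears to specify) that the FILTER of an OPTIONAL group becomes the LeftJoin condition and is tested on the merged row, while BIND is evaluated inside the RHS where ?price is not visible. This is a sketch with hypothetical helper names, not any engine's code.

```python
def compatible(a, b):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(b[k] == v for k, v in a.items() if k in b)

def holds(cond, row):
    """A filter condition that raises an error counts as false."""
    try:
        return bool(cond(row))
    except KeyError:  # unbound variable, modelled as an expression error
        return False

def extend(rows, var, expr):
    """BIND: add var when expr evaluates; leave it unbound on error."""
    out = []
    for row in rows:
        try:
            out.append({**row, var: expr(row)})
        except KeyError:
            out.append(dict(row))
    return out

def left_join(left, right, cond):
    """LeftJoin: keep merged rows that pass cond, else keep the left row alone."""
    out = []
    for a in left:
        merged = [{**a, **b} for b in right if compatible(a, b)]
        kept = [m for m in merged if holds(cond, m)]
        out.extend(kept if kept else [a])
    return out

lhs = [{"price": 10}]                 # VALUES ?price { 10 }
rhs = [{"discount": 0.10}]            # VALUES ?discount { 0.10 }

# Query 1: the FILTER sees ?price and ?discount together on the merged row.
q1 = left_join(lhs, rhs, lambda m: m["price"] * (1 - m["discount"]) < 10)

# Query 2: the BIND runs inside the RHS, where ?price is unbound, so
# ?effectivePrice is never set and the FILTER on it errors (counts as false).
rhs2 = extend(rhs, "effectivePrice", lambda r: r["price"] * (1 - r["discount"]))
q2 = left_join(lhs, rhs2, lambda m: m["effectivePrice"] < 10)

print(q1)  # [{'price': 10, 'discount': 0.1}]
print(q2)  # [{'price': 10}]
```

In this model the first query returns price and discount while the second returns price only, which lines up with the Jena and Virtuoso rows above.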