Sparql ASK Query to implement unique constraint? #182

Open
tduval-unifylogic opened this issue May 1, 2023 · 9 comments

Comments

@tduval-unifylogic

Greetings again!

I am attempting to implement a unique constraint using SPARQL, as there is no predicate for this (that I know of) in SHACL.
My thought was to use a SPARQL ASK query. I have looked through the tests and examples and cannot find one where an ASK query is used the way I am trying to use it.

Here is what I'm attempting to use that doesn't get me the desired results.
Any suggestions are greatly appreciated!

    @prefix ex: <http://example.com/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix sh: <http://www.w3.org/ns/shacl#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix ex.i: <http://example.com/instance/> .

    # ONTO and SHACL
    ex:Person a owl:Class, sh:NodeShape ;
        sh:validator [
            a sh:SPARQLAskValidator ;
            sh:ask  """
                ask
                where {
                    ?person1 ex:email ?email .
                    ?person2 ex:email ?email .
                    FILTER (?person1 != ?person2)
                }
            """ ;
            sh:message "email addresses must be unique." ; ] 
    .

    # DATA
    ex.i:Person1 a ex:Person ;
        ex:email "email@address.com" 
    .
    ex.i:Person2 a ex:Person ;
        ex:email "email@address.com" 
    .
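
For comparison, SHACL-SPARQL usually attaches a query to a shape through `sh:sparql` with a SELECT query that projects `$this`. `sh:validator` and `sh:SPARQLAskValidator` are meant for defining custom constraint components, so an ASK query placed directly on a shape is likely never evaluated. The following is a minimal sketch of the SELECT form, not a tested solution. The shape name `ex:UniqueEmailShape` is an assumption, and full IRIs are used inside the query so that no `sh:prefixes` declaration is needed.

```turtle
@prefix ex: <http://example.com/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Hypothetical shape name; sh:targetClass makes the set of focus nodes explicit.
ex:UniqueEmailShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:sparql [
        a sh:SPARQLConstraint ;
        sh:message "email addresses must be unique." ;
        # Each solution of the SELECT query is reported as one violation on $this.
        sh:select """
            SELECT DISTINCT $this ?value
            WHERE {
                $this  <http://example.com/email> ?value .
                ?other <http://example.com/email> ?value .
                FILTER (?other != $this)
            }
        """ ;
    ] .
```

With the two persons from the DATA section sharing one email, this sketch would be expected to report both `ex.i:Person1` and `ex.i:Person2`.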
@ajnelson-nist
Contributor

This looks like sh:maxCount, value 1, would meet your needs.

@tduval-unifylogic
Author

tduval-unifylogic commented May 1, 2023

Yes, I would use sh:maxCount if I needed to check whether a single instance has more than one ex:email.

What I'm looking to do is check all instances of ex:Person to see if any of them have the same value for ex:email.

Unless there is something I am missing/not seeing?

@ajnelson-nist
Contributor

Right, I read that backwards, I see now.

I think you could do this by treating the email value as a node, which it formally is, though linguistically I still have a hard time calling literals nodes.

ex:my-email-objects-shape
    a sh:NodeShape ;

    # Target the *object* of the predicate.  So, the Object member of the triple is the node whose shape we're constraining.
    sh:targetObjectsOf ex:email ;
    # Peek backwards to the subject using an inverse path.
    sh:property [
        a sh:PropertyShape ;
        sh:maxCount 1 ;
        sh:path [
            sh:inversePath ex:email ;
        ] ;
    ] ;

    # That should do it.
.

@tduval-unifylogic
Author

Thanks!

Just tried this and it raises a validation error when the email addresses are the same, but it also raises one when they are different.

    @prefix ex: <http://example.com/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix sh: <http://www.w3.org/ns/shacl#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix ex.i: <http://example.com/instance/> .

    # ONTO & SHACL
    ex:PersonShape a sh:NodeShape ;
        sh:targetObjectsOf ex:email ;
        sh:property [
            sh:maxCount 1 ;
            sh:path [
                sh:inversePath ex:email ;
            ] ;
        ] ;
    .
    # DATA
    ex.i:Person1 a ex:Person ;
        ex:email "email@address.com" 
    .
    ex.i:Person2 a ex:Person ;
        ex:email "email1@address.com" 
    .     

@tduval-unifylogic
Author

AGH! I just realized what I did wrong. This works!!

@tduval-unifylogic
Author

tduval-unifylogic commented May 1, 2023

Thanks so much for your help!

Now I'm working on whether I can create a composite unique constraint. This seems to work. Does it look correct semantically?

    @prefix ex: <http://example.com/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix sh: <http://www.w3.org/ns/shacl#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix ex.i: <http://example.com/instance/> .

    # ONTO & SHACL
    ex:Person a owl:Class, sh:NodeShape ;
        sh:property [
            sh:name "unique constraint" ;
            sh:description "" ;
            sh:targetObjectsOf ex:email, ex:address ;
            sh:maxCount 1 ;
            sh:path [
                sh:inversePath ex:email, ex:address ;
            ] ;
        ] ;
    .
    # DATA
    ex.i:Person1 a ex:Person ;
        ex:email "email@address.com" ;
        ex:address "address" .
    ex.i:Person2 a ex:Person ;
        ex:email "email@address.com" ;
        ex:address "address" .  

@ajnelson-nist
Contributor

It looks like I missed that you'd probably meant to reply to me.

I do not know what the semantics are of putting multiple targetObjectsOf into one sh:Shape. It turns out to be unnecessary to have sh:targetObjectsOf in that nested shape, though - you're already using the (implicit) selector from ex:Person, so you don't need sh:target(anything) on the sh:PropertyShape tied with sh:property.

Also, as I recall, a single sh:inversePath cannot take two objects the way you've written it. The --metashacl flag (which I suggested you use in #183) confirms this usage is wrong. Here's the shell transcript from when I put your example into ex.ttl:

$ pyshacl --metashacl --shacl ex.ttl ex.ttl
SHACL File does not validate against the SHACL Shapes SHACL (MetaSHACL) file.
Validation Report
Conforms: False
Results (2):
Constraint Violation in XoneConstraintComponent (http://www.w3.org/ns/shacl#XoneConstraintComponent):
	Severity: sh:Violation
	Source Shape: shsh:ShapeShape
	Focus Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Value Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Message: Node [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ] does not conform to exactly one shape in shsh:NodeShapeShape , shsh:PropertyShapeShape
Constraint Violation in OrConstraintComponent (http://www.w3.org/ns/shacl#OrConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:or ( shsh:PathShape [ sh:nodeKind sh:IRI ] ) ; sh:path sh:path ]
	Focus Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Value Node: [ sh:inversePath ex:address, ex:email ]
	Result Path: sh:path
	Message: Node [ sh:inversePath ex:address, ex:email ] does not conform to one or more shapes in shsh:PathShape , [ sh:nodeKind sh:IRI ]

Validator encountered a Runtime Error:
SHACL File does not validate against the SHACL Shapes SHACL (MetaSHACL) file.
Validation Report
Conforms: False
Results (2):
Constraint Violation in XoneConstraintComponent (http://www.w3.org/ns/shacl#XoneConstraintComponent):
	Severity: sh:Violation
	Source Shape: shsh:ShapeShape
	Focus Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Value Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Message: Node [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ] does not conform to exactly one shape in shsh:NodeShapeShape , shsh:PropertyShapeShape
Constraint Violation in OrConstraintComponent (http://www.w3.org/ns/shacl#OrConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:or ( shsh:PathShape [ sh:nodeKind sh:IRI ] ) ; sh:path sh:path ]
	Focus Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Value Node: [ sh:inversePath ex:address, ex:email ]
	Result Path: sh:path
	Message: Node [ sh:inversePath ex:address, ex:email ] does not conform to one or more shapes in shsh:PathShape , [ sh:nodeKind sh:IRI ]

If you believe this is a bug in pyshacl, open an Issue on the pyshacl github page.

Confirming this has nothing to do with the instance data, here is the same command run against a graph with one owl:Thing individual, and the "# DATA" section cut from ex.ttl:

$ cat thing.ttl
@prefix owl: <http://www.w3.org/2002/07/owl#> .

[] a owl:Thing .
$ pyshacl --metashacl --shacl ex.ttl thing.ttl
SHACL File does not validate against the SHACL Shapes SHACL (MetaSHACL) file.
Validation Report
Conforms: False
Results (2):
Constraint Violation in OrConstraintComponent (http://www.w3.org/ns/shacl#OrConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:or ( shsh:PathShape [ sh:nodeKind sh:IRI ] ) ; sh:path sh:path ]
	Focus Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Value Node: [ sh:inversePath ex:address, ex:email ]
	Result Path: sh:path
	Message: Node [ sh:inversePath ex:address, ex:email ] does not conform to one or more shapes in shsh:PathShape , [ sh:nodeKind sh:IRI ]
Constraint Violation in XoneConstraintComponent (http://www.w3.org/ns/shacl#XoneConstraintComponent):
	Severity: sh:Violation
	Source Shape: shsh:ShapeShape
	Focus Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Value Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Message: Node [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ] does not conform to exactly one shape in shsh:NodeShapeShape , shsh:PropertyShapeShape

Validator encountered a Runtime Error:
SHACL File does not validate against the SHACL Shapes SHACL (MetaSHACL) file.
Validation Report
Conforms: False
Results (2):
Constraint Violation in OrConstraintComponent (http://www.w3.org/ns/shacl#OrConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:or ( shsh:PathShape [ sh:nodeKind sh:IRI ] ) ; sh:path sh:path ]
	Focus Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Value Node: [ sh:inversePath ex:address, ex:email ]
	Result Path: sh:path
	Message: Node [ sh:inversePath ex:address, ex:email ] does not conform to one or more shapes in shsh:PathShape , [ sh:nodeKind sh:IRI ]
Constraint Violation in XoneConstraintComponent (http://www.w3.org/ns/shacl#XoneConstraintComponent):
	Severity: sh:Violation
	Source Shape: shsh:ShapeShape
	Focus Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Value Node: [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ]
	Message: Node [ sh:description Literal("") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path [ sh:inversePath ex:address, ex:email ] ; sh:targetObjectsOf ex:address, ex:email ] does not conform to exactly one shape in shsh:NodeShapeShape , shsh:PropertyShapeShape

If you believe this is a bug in pyshacl, open an Issue on the pyshacl github page.

If you take what is piled into one sh:PropertyShape (object of sh:property) and split it into two sh:PropertyShapes, one for ex:email and one for ex:address, you'll get past the SHACL-SHACL error.
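
Note that two separate property shapes constrain ex:email and ex:address independently, which is not the same as a composite key over the pair. If the combination is what must be unique, one option is a SPARQL-based constraint. This is a minimal sketch under that assumption, not a tested solution. The shape name `ex:UniqueEmailAddressPairShape` and the message text are hypothetical, and full IRIs are used in the query to avoid needing `sh:prefixes`.

```turtle
@prefix ex: <http://example.com/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Hypothetical shape: flags a person when another node has the same
# (email, address) pair. Sharing only one of the two values is allowed.
ex:UniqueEmailAddressPairShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:sparql [
        a sh:SPARQLConstraint ;
        sh:message "The combination of email and address must be unique." ;
        sh:select """
            SELECT DISTINCT $this
            WHERE {
                $this  <http://example.com/email>   ?email ;
                       <http://example.com/address> ?address .
                ?other <http://example.com/email>   ?email ;
                       <http://example.com/address> ?address .
                FILTER (?other != $this)
            }
        """ ;
    ] .
```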

Back to uniqueness-constraining: What you need to do is select the object of the predicate, and then "hop backwards". sh:targetObjectsOf was the selector in my example ex:my-email-objects-shape, because I wrote a sh:NodeShape focused on that property. If you want a sh:NodeShape focused on the class (which I think is a reasonable exercise---it's a shape that roughly says "emails are uniquely used among this class, and likewise for addresses"), you need to use a property path that goes to the object of the property, and then back along all inverses of that Literal.

Here is your example graph showing a corrected implementation and a still-incorrect implementation, also with one more individual that is expected to not trigger an error:

$ cat ex.ttl 
@prefix ex: <http://example.com/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex.i: <http://example.com/instance/> .

# ONTO & SHACL
ex:Person
	a
		owl:Class ,
		sh:NodeShape
		;
	sh:property
		[
			sh:name "unique constraint" ;
			rdfs:comment "This shape is NOT correct yet.  No complaints raised from DATA section's ex:email usage."@en ;
			sh:maxCount 1 ;
			sh:path [
				sh:inversePath ex:email ;
			] ;
		] ,
		[
			sh:name "unique constraint" ;
			sh:maxCount 1 ;
			sh:path (
				ex:address
				[
					sh:inversePath ex:address ;
				]
			) ;
		]
		;
	.

# DATA
ex.i:Person1 a ex:Person ;
ex:email "email@address.com" ;
ex:address "address" .
ex.i:Person2 a ex:Person ;
ex:email "email@address.com" ;
ex:address "address" .
ex.i:Person3 a ex:Person ;
ex:email "email2@address2.com" ;
ex:address "a different address" .

Here is the shell transcript of running that - and because your DATA section is effectively an XFAIL test between Person1 and Person2, you should see that it's not failing everywhere it should be:

$ pyshacl --metashacl --shacl ex.ttl ex.ttl
Validation Report
Conforms: False
Results (2):
Constraint Violation in MaxCountConstraintComponent (http://www.w3.org/ns/shacl#MaxCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path ( ex:address [ sh:inversePath ex:address ] ) ]
	Focus Node: ex.i:Person2
	Result Path: ( ex:address [ sh:inversePath ex:address ] )
	Message: More than 1 values on ex.i:Person2->( ex:address [ sh:inversePath ex:address ] )
Constraint Violation in MaxCountConstraintComponent (http://www.w3.org/ns/shacl#MaxCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("unique constraint") ; sh:path ( ex:address [ sh:inversePath ex:address ] ) ]
	Focus Node: ex.i:Person1
	Result Path: ( ex:address [ sh:inversePath ex:address ] )
	Message: More than 1 values on ex.i:Person1->( ex:address [ sh:inversePath ex:address ] )

Do you see why?
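
One reading of the transcript: only the ex:address shape fired. The ex:email shape uses a bare inverse path, which looks for triples whose object is the person node itself. No such triples exist, so the value count is zero and sh:maxCount 1 is trivially satisfied. Below is a sketch of the email shape rewritten in the same out-and-back form as the address shape. The shape name `ex:PersonEmailUniqueShape` is an assumption, and `sh:targetClass` is used instead of the implicit class target.

```turtle
@prefix ex: <http://example.com/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Hypothetical shape name, targeting ex:Person explicitly.
ex:PersonEmailUniqueShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:name "unique email" ;
        # Step forward to the email literal, then back to every subject that uses it.
        sh:path ( ex:email [ sh:inversePath ex:email ] ) ;
        # A focus node with an email always reaches itself,
        # so more than one value means the email is shared.
        sh:maxCount 1 ;
    ] .
```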

@ajnelson-nist
Contributor

Oops, I realized I made an error in my demonstration. I think the uniqueness constraint needs to include a qualified shape on the class of the thing being hopped "back" towards. Depending on whether this is intended or not, the pervasiveness of the backwards hop can be demonstrated by adding this extra individual to the graph. Note that it is typeless:

ex.i:Person4
ex:email "email2@address2.com" ;
ex:address "a different address" .  

ex.i:Person4 is not an ex:Person. (This example is a little odd for "Persons," but might make sense for other things, like imported ex:ImportedRecords vs. locally-generated ex:LocalRecords.) Should ex.i:Person3 care that a node not classified as ex:Person is using its supposedly-unique email address?

I'm not sure offhand how to write such a qualified shape. It sounds like a good SHACL exercise.
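
One possible answer to that exercise, offered as an untested sketch: `sh:qualifiedValueShape` together with `sh:qualifiedMaxCount` counts only those value nodes that conform to a nested shape, here `sh:class ex:Person`. The shape name `ex:PersonQualifiedUniqueEmailShape` is an assumption.

```turtle
@prefix ex: <http://example.com/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Hypothetical shape: among the subjects sharing an email with the focus node,
# only those typed as ex:Person are counted.
ex:PersonQualifiedUniqueEmailShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        # Out to the email literal, then back to every subject using it.
        sh:path ( ex:email [ sh:inversePath ex:email ] ) ;
        # Count only value nodes that are instances of ex:Person.
        sh:qualifiedValueShape [ sh:class ex:Person ] ;
        # The focus node itself is the single allowed match.
        sh:qualifiedMaxCount 1 ;
    ] .
```

Under this sketch, the typeless ex.i:Person4 would not count against ex.i:Person3, while two ex:Person nodes sharing an email would still be reported. Whether untyped nodes should be ignored is the design question raised above.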

@ashleysommer
Collaborator

@ajnelson-nist Thank you for your help fielding this issue.

@tduval-unifylogic Is this issue resolved? Can it be closed now?
