ShapeRecursionWarning #154

Open
bguayante opened this issue Aug 24, 2022 · 8 comments

@bguayante

I am encountering some unexpected behavior when validating a JSON-LD file against a SHACL shapes file using pyshacl v0.19.1. In all modes of operation, the following warning is returned when validation is run:

/Users/home/.local/share/virtualenvs/SHACL_Validation-NUUjCTak/lib/python3.10/site-packages/pyshacl/constraints/core/shape_based_constraints.py:88: ShapeRecursionWarning: Warning, A Recursive Shape was detected executing a recursive validation sequence 6 levels deep. Backing out.
<NodeShape http://schema.org/Organization>-><PropertyConstraintComponent on <NodeShape http://schema.org/Organization>>->->>><NodeConstraintComponent on >-><NodeShape http://schema.org/Organization>-><PropertyConstraintComponent on <NodeShape http://schema.org/Organization>>
For reference, see https://www.w3.org/TR/shacl/#shapes-recursion
warn(ShapeRecursionWarning(_evaluation_path))

The validation results that are returned appear to be correct (i.e., errors are returned when intentionally introduced into the JSON) and so this is non-fatal, but I would like to verify whether this is a consequence of my implementation or a characteristic of pyshacl's support for recursive shapes. I suspect the former is true as the pattern TypeA:Property -> TypeA is not uncommon on Schema.org and so this does not seem to be a nonstandard use of SHACL.

A minimal example that reproduces the warning follows.

shacl.ttl

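@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

schema:Organization
    a rdfs:Class, sh:NodeShape ;
    sh:property
    [
        sh:path schema:name ;
        sh:label "name" ;
        sh:description "The official name of the entity being described." ;
        sh:datatype xsd:string ;
    ],
    [
        sh:path schema:subOrganization ;
        sh:label "contained organization" ;
        sh:description "An organization contained within a parent organization." ;
        sh:node schema:Organization ;
    ] ;
.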

data.json

{
    "@context": "http://schema.org",
    "@id": "http://www.illinoiscourts.gov/Circuit",
    "@type":"http://schema.org/Organization",
    "name": "State of Illinois Circuit Court",
    "subOrganization":  {
        "@context": "http://schema.org/",
        "@id": "http://www.illinoiscourts.gov/Circuit#Circuit1",
        "name": "State of Illinois Circuit 1",
        "subOrganization":  {
            "@id": "http://www.illinoiscourts.gov/Circuit#Circuit1District1",
            "name": "State of Illinois Circuit 1 District 1"
        }
    }
}

As an addendum, I'll also note that I am seeing the same non-halting error described in this issue under the same circumstances reported by the user.
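For anyone reproducing this from Python rather than the command line, a minimal sketch using pyshacl's `validate` function might look like the following. It assumes the two listings above are saved as shacl.ttl and data.json, that the shapes file declares its schema:, sh:, rdfs: and xsd: prefixes, that pyshacl is installed, and that the remote schema.org JSON-LD context can be fetched. `run_validation` is a hypothetical helper name, not something from the original report.

```python
def run_validation(shapes_path, data_path):
    """Validate a JSON-LD data file against a Turtle shapes file.

    Hypothetical helper: returns (conforms, human-readable report text).
    """
    # Imported inside the function so this module can be loaded even on a
    # machine where pyshacl is not installed.
    from pyshacl import validate

    conforms, results_graph, results_text = validate(
        data_path,
        shacl_graph=shapes_path,
        data_graph_format="json-ld",
        shacl_graph_format="turtle",
    )
    return conforms, results_text


# Example call (file names are assumptions):
#   ok, report = run_validation("shacl.ttl", "data.json")
#   print("Conforms:", ok)
#   print(report)
```

With v0.19.1 the warning quoted above would be expected on stderr alongside the report; the returned values are the validation result itself.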

@ajnelson-nist
Contributor

Is there a reason the sh:path schema:subOrganization property shape uses sh:node schema:Organization instead of sh:class schema:Organization? I'd think from this example that sh:class would meet your "Recursive" needs, without inducing a loop in SHACL references.

@bguayante
Author

> Is there a reason the sh:path schema:subOrganization property shape uses sh:node schema:Organization instead of sh:class schema:Organization? I'd think from this example that sh:class would meet your "Recursive" needs, without inducing a loop in SHACL references.

Thanks for your response, and apologies for the delay in mine.

No, there is no reason that sh:node would be preferred. I patterned the graph using examples from the documentation but did not fully understand them, apparently. For clarification, in what circumstance would sh:node be preferred over sh:class? Intuitively, both seem to point to an instance/node of the same shape but I am guessing that I am generalizing something incorrectly or incompletely. Do you mind explaining what I am missing?

In any case, your suggestion addressed the errors and all seems to be working as expected. Thanks very much for your assistance.

@ashleysommer
Collaborator

ashleysommer commented Sep 7, 2022

Hi @bguayante
You are conflating two different concepts.

sh:class is a Value-Type constraint. It means "each value of this property must have an rdf:type of the given class". I expect this is what you want to achieve in your example above. See the SHACL spec for this constraint type.

sh:node is a more advanced kind of SHACL constraint. It is a Shape-based constraint, which means "each value being checked must conform to this other given shape". See the SHACL spec for this constraint type. It is rare that you would use the sh:node constraint as a beginner, especially if you don't know what it does or what it is used for.

The reason you are getting the two confused in this case is that you are (I suspect unintentionally) using the Implicit Class Target feature of SHACL: instead of defining a SHACL Shape and giving it a sh:targetClass for targeting data, you have declared your target class itself (schema:Organization) to also be a sh:NodeShape. So in this case, your schema's rdfs:Class (schema:Organization), the SHACL Shape (sh:NodeShape), and the SHACL target class are all the same node. This is usually done in cases where you do not want to keep your SHACL Shapes file separate from your ontology definition file (i.e., where you set up your rdfs:Class relationships).

For further clarity, in this example Shape you are targeting all data objects that have a class of schema:Organization and enforcing a property constraint: a PropertyShape requiring that each subOrganization value comply with sh:node schema:Organization. To verify that the value conforms, PySHACL executes the original schema:Organization NodeShape again, which again enforces the property constraint on subOrganization, which again validates using the sh:node constraint, executing the schema:Organization NodeShape once more. I hope you can see the recursive issue you're facing.

Instead, when you use sh:class it becomes: you are targeting all data objects that have a class of schema:Organization and enforcing a property constraint, a PropertyShape requiring that each subOrganization value have the rdf:type schema:Organization. That is what I think you want.
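The difference between the two constraints described above can be sketched with a toy model. This is not pyshacl code: plain dicts stand in for RDF nodes, every nested organization is given an explicit type for simplicity, and all function names are hypothetical. It only illustrates why a sh:node style check re-enters the shape for each nested level while a sh:class style check does not.

```python
# Toy model only: dicts stand in for RDF nodes; nothing here is pyshacl API.
ORG = "http://schema.org/Organization"

district = {"type": ORG, "name": "Circuit 1 District 1", "subOrganization": None}
circuit1 = {"type": ORG, "name": "Circuit 1", "subOrganization": district}
court = {"type": ORG, "name": "Circuit Court", "subOrganization": circuit1}


def check_class(value, expected_class):
    """sh:class style: look only at the value's type; no shape is re-entered."""
    return value["type"] == expected_class


def conforms_to_org_shape(node, use_node_constraint, depth=0, trace=None):
    """Apply a simplified Organization shape to `node`.

    With use_node_constraint=True the sub-organization must itself conform
    to the same shape (sh:node style), so the function calls itself.
    """
    if trace is not None:
        trace.append(depth)  # record each time the shape is entered
    if not isinstance(node["name"], str):
        return False
    child = node["subOrganization"]
    if child is None:
        return True
    if use_node_constraint:
        return conforms_to_org_shape(child, True, depth + 1, trace)
    return check_class(child, ORG)


node_trace, class_trace = [], []
node_ok = conforms_to_org_shape(court, True, trace=node_trace)
class_ok = conforms_to_org_shape(court, False, trace=class_trace)
print(node_trace)   # [0, 1, 2]: the shape is entered once per nesting level
print(class_trace)  # [0]: the shape is entered once, the child is only type-checked
```

In the real validator each organization with the target class is still visited on its own as a focus node, so the sh:class form loses no coverage in this example.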

Note, while investigating this issue, I found a possible bug in the implementation above, that if fixed might allow simple examples like yours to work as expected without throwing this error. I'll update this thread if I make any progress on that.

EDIT: See my next post.

@ashleysommer
Collaborator

ashleysommer commented Sep 7, 2022

Doing some more investigation, I've come to the conclusion that this particular example is not actually recursive in the sense that it would break PySHACL. If the recursion check were not in place, this example would run and complete as expected, as it is not infinitely recursive. However, an evaluation path that looks like this:

NodeShape->propertyPathConstraint->PropertyShape->nodeShapeConstraint->NodeShape->propertyPathConstraint->PropertyShape->nodeShapeConstraint->NodeShape

does trigger the recursion detection mechanism in PySHACL. That mechanism has no way of knowing whether the recursion loop is infinite, so it triggers after the first repeating cycle. This could be optimised, perhaps to trigger on the second cycle, but it would still throw the warning when it gets to Organization->subOrganization->Organization->subOrganization->Organization. So the question is: how many cycles of this pattern should be allowed before it is deemed too deep?
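The trade-off in that question can be illustrated with a small sketch. This is not PySHACL's actual detection code; `walk` and `max_reentries` are hypothetical names, and the evaluation path is reduced to a flat list of step labels. The point is only that any fixed limit is eventually exceeded by data that is nested deeply enough, even though the data itself is finite.

```python
def walk(nesting_levels, max_reentries):
    """Simulate validating a chain of nested organizations.

    Returns (evaluation_path, backed_out). The guard backs out once the
    Organization shape is already on the path more than `max_reentries` times.
    """
    path = []
    for _ in range(nesting_levels):
        if path.count("Organization") > max_reentries:
            return path, True  # guard fired: stop descending
        path += ["Organization", "subOrganization"]
    return path, False


# Three nested levels, as in the data from this issue.
print(walk(3, max_reentries=1))  # guard fires before the third level
print(walk(3, max_reentries=2))  # completes without the guard firing
print(walk(4, max_reentries=2))  # one more level of data and it fires again
```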

I will also note something I should have mentioned in my previous comment:
This message is harmless. It is a warning to let you know that PySHACL protected itself against potentially recursive shapes and broke out of the recursion, and to alert you that there might be something wrong with your Shapes file definitions. It is not an error, and in almost all cases your data is still validated correctly. In this case, the NodeShape schema:Organization constraints are still applied to all "organisation" objects in the data. It doesn't need to recurse to do that.
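Because the message is raised through Python's standard `warnings` machinery (the output in the original report ends in a `warn(...)` call), it can be recorded or filtered like any other warning while the run still returns its result. A minimal sketch follows, using a local stand-in class and a fake validation function rather than pyshacl itself; both are assumptions made for illustration.

```python
import warnings


class ShapeRecursionWarning(Warning):
    """Local stand-in for the warning class named in the issue's output."""


def validate_with_possible_recursion():
    # Stand-in for a validation run that hits the recursion guard.
    warnings.warn(ShapeRecursionWarning("recursive shape detected, backing out"))
    return True  # the run still completes and returns a result


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure nothing is suppressed
    result = validate_with_possible_recursion()

recursion_warnings = [
    w for w in caught if issubclass(w.category, ShapeRecursionWarning)
]
print(result, len(recursion_warnings))  # True 1
```

Recording the warning this way lets a script log it or inspect the evaluation path instead of having it printed to stderr.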

@ashleysommer
Collaborator

@bguayante Can you please try the new PySHACL v0.20.0 released today: https://pypi.org/project/pyshacl/0.20.0/
The situation I described above is handled a little better in this new version.

@ashleysommer
Collaborator

Reopening to keep the discussion open, because I don't think it is finished yet.

@ashleysommer ashleysommer reopened this Sep 8, 2022
@bguayante
Author

Happy to. Run against the example above, the only information returned is the validation result. No warning is presented. This is also the case when run against the more robust files with which I am actually working and the results seem to be accurate when errors are introduced. I'll also note that the --metashacl issue I mentioned in my original post appears to be resolved with this release.

Thank you for your extensive answer, I really appreciate it. I believe I have the concepts straight now and sh:class is indeed what I'm looking for in this case.

@ashleysommer
Collaborator

Yep, you're right, the metashacl issue is also fixed as part of that release.
