Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(shacl): improve type generation #1300

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

noelmcloughlin
Copy link
Contributor

@noelmcloughlin noelmcloughlin commented Feb 20, 2023

This PR improves shaclgen.py by adding nodeKind and datatype for Linkml types (fixes #1299)

TODO: I am not sure if sh:nodeKind sh:IRI is correct and maybe should be sh:nodeKind sh:Literal.

Example output from this PR:

sh:property [ sh:datatype "xsd:anyURI" ;             <=========== added
            sh:description "Specifies a reference to a resource outside of the UCO." ;
            sh:nodeKind sh:IRI ;              <=========== added
            sh:order 3 ;
            sh:path core:externalReference ],
        [ sh:datatype "xsd:string" ;              <=========== added
            sh:description "The name of a particular concept characterization." ;
            sh:maxCount 1 ;
            sh:nodeKind sh:Literal ;               <=========== added
            sh:order 6 ;
            sh:path core:name ]

@cmungall
Copy link
Member

Thanks for taking this on!

I believe sh:datatype should only be used for literals https://www.w3.org/TR/shacl/#validator-DatatypeConstraintComponent

ideally the unit tests would use pyshacl to test that validation works as expected but let's hold off on adding that as a dependency for now. For now we can just check the resulting shacl triples map as extended, and we can do an out of band test to ensure the shacl validates as expected

@codecov-commenter
Copy link

codecov-commenter commented Feb 21, 2023

Codecov Report

Merging #1300 (5ed2084) into main (b2278bb) will decrease coverage by 0.04%.
The diff coverage is 88.37%.

❗ Current head 5ed2084 differs from pull request most recent head 91377b9. Consider uploading reports for the commit 91377b9 to get more accurate results

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##             main    #1300      +/-   ##
==========================================
- Coverage   80.39%   80.36%   -0.04%     
==========================================
  Files          77       77              
  Lines        8923     9004      +81     
  Branches     2197     2213      +16     
==========================================
+ Hits         7174     7236      +62     
- Misses       1332     1345      +13     
- Partials      417      423       +6     
Impacted Files Coverage Δ
linkml/transformers/relmodel_transformer.py 94.90% <66.66%> (-1.80%) ⬇️
linkml/generators/shaclgen.py 84.95% <75.00%> (+1.62%) ⬆️
linkml/generators/jsonschemagen.py 88.80% <95.65%> (+0.86%) ⬆️
linkml/workspaces/example_runner.py 80.21% <100.00%> (-0.47%) ⬇️
linkml/utils/schema_fixer.py 80.51% <0.00%> (-5.62%) ⬇️
linkml/linter/rules.py 92.26% <0.00%> (ø)
linkml/reporting/model.py 81.89% <0.00%> (ø)
linkml/generators/docgen.py 85.37% <0.00%> (ø)
... and 16 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@noelmcloughlin
Copy link
Contributor Author

Thanks @cmungall Good point - we should pair datatype with Literals only.

I am checking my linkml generated shacl against a control rdfs+shacl file someone else created which only declares sh:IRI and sh:Literal nodeKinds. However my PR is naive.

From docs: "The values of sh:nodeKind in a shape are one of the following six instances of the class sh:NodeKind: sh:BlankNode, sh:IRI, sh:Literal sh:BlankNodeOrIRI, sh:BlankNodeOrLiteral and sh:IRIOrLiteral. "

Is there a way calculate correct nodeKind from value of str(Literal(rt.uri)) ?

@cmungall
Copy link
Member

cmungall commented Feb 22, 2023 via email

rt = sv.get_type(r)
if rt.uri:
prop_pv(SH.datatype, rt.uri)
prop_pv(SH.datatype, Literal(rt.uri))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is right - even though the value of sh:datatype is used for validating literals, it is not itself a literal

See:

https://www.w3.org/TR/shacl/#DatatypeConstraintComponent

example:

sh:property [
		sh:path ex:age ;
		sh:datatype xsd:integer ;
	] .

@nlharris
Copy link
Contributor

Looks like this still has changes requested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

shaclgen not generating nodeKind/datatype
4 participants