gen-SHACL "target class" vs "Shape name" #2084

Open
HendrikBorgelt opened this issue Apr 29, 2024 · 4 comments
Labels
  • bug: Something that should work but isn't, with an example and a test case.
  • community-generated
  • documentation: Improvements or additions to documentation
  • generator-misc: Pertaining to more than one generator, or perhaps one that doesn't exist yet

Comments

@HendrikBorgelt

Describe the bug
When creating a SHACL shape with the LinkML generator, the LinkML template currently does not differentiate between the target class and the SHACL shape name. However, SHACL neither requires nor intends that a shape always be its own target class. Therefore, a "target class" attribute should be added, similar to the "class_uri" attribute (sorry if this is not the right LinkML nomenclature).

To reproduce
Steps to reproduce the behavior:

  1. Open a command line
  2. Use the "gen-shacl" converter to generate a SHACL template from a LinkML YAML file, as seen below:

LinkML_issue_HB.yaml

id: https://linkML.com/linkml/tests/DCATap
name: DCATap_LinkML_Template
prefixes:
  linkml: https://w3id.org/linkml/
  dcat: http://www.w3.org/ns/dcat#


imports:
  - linkml:types
default_range: string  

  
classes:
    dcat_dataset:
        class_uri: dcat:Dataset

to
LinkML_Issue_HB.shacl.ttl

# metamodel_version: 1.7.0
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

dcat:Dataset a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:targetClass dcat:Dataset .
  3. Scroll down to 'dcat:Dataset a sh:NodeShape ; ... sh:targetClass dcat:Dataset'
  4. See the error: the NodeShape is an OWL class at the same time
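
The reported pattern (a node shape whose own IRI is also its sh:targetClass) can be spotted mechanically. The snippet below is a minimal sketch using only a regular expression, so it assumes simple, generator-style Turtle like the output above and is not a real Turtle parser. The function name `self_targeting_shapes` is made up for illustration.

```python
import re

# Assumption: shapes are written as "<subject> a sh:NodeShape ; ... sh:targetClass <target> ."
# This pattern is only meant for simple output like the snippet in this issue.
SHAPE_PATTERN = re.compile(
    r"(?P<shape>\S+)\s+a\s+sh:NodeShape\s*;.*?sh:targetClass\s+(?P<target>\S+)\s*\.",
    re.DOTALL,
)


def self_targeting_shapes(turtle_text):
    """Return the shapes whose IRI is identical to their sh:targetClass."""
    return [
        match.group("shape")
        for match in SHAPE_PATTERN.finditer(turtle_text)
        if match.group("shape") == match.group("target")
    ]


reported = """
dcat:Dataset a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:targetClass dcat:Dataset .
"""

separated = """
dcat:DatasetShape a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:targetClass dcat:Dataset .
"""

print(self_targeting_shapes(reported))
print(self_targeting_shapes(separated))
```

The first call flags dcat:Dataset, because the shape and its target share one IRI. The second call finds nothing, because the shape has a name of its own.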

Expected behavior
While it is acceptable to create classes in a SHACL file (sh:class and so on), a shape typically is not equivalent to its target class. A SHACL shape most often validates the axioms of a target class but is not required to fully depict all axioms of a target class. Accordingly, a SHACL shape can check only a part of the axioms of a target class, for example, to ensure computational efficiency or extensibility.
LinkML should therefore either have an additional attribute similar to class_uri, which might be called target_uri, or use the LinkML class name as the shape name. Please look at the examples below:

1st recommendation (would be more feasible since the Shape could carry a prefix):

classes:
    dcat_dataset:
        target_uri: dcat:Dataset
        class_uri: dcat:DatasetShape

should lead to:

dcat:DatasetShape a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:targetClass dcat:Dataset .

2nd possibility (not recommended as this limits the SHACL shape to having just the default prefix as a prefix):

classes:
    dcat_dataset:
        target_uri: dcat:Dataset
        class_uri: dcat:DatasetShape

should lead to:

:dcat_dataset a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:targetClass dcat:Dataset .

About your computer (if applicable, please complete the following information):

  • OS: Win 10
  • Browser: Firefox; for coding, the standard cmd shell (not PowerShell) is used
  • Version: LinkML v1.7.8
HendrikBorgelt added the bug label Apr 29, 2024
@cmungall
Member

cmungall commented May 2, 2024

You can pass the --suffix option to override the default behavior:

https://linkml.io/linkml/generators/shacl.html#cmdoption-gen-shacl-s

@linkml/developers - this issue should not be closed until either a FAQ entry and/or the shaclgen docs make this behavior a bit less hidden.

cmungall added the documentation label May 2, 2024
@HendrikBorgelt
Author

Thanks, @cmungall for the advice, but I am sorry to say that this is not the fix I was hoping for.

Command line options like --suffix, --close and so on force a behavior on all LinkML classes, which should not be the intended solution. In SHACL I might want to have one shape, 4Chem: Simple DataSetShape, which would have dcat:Dataset as its target class, and then a more elaborate shape called 4Catalysis: Extended DataSetShape.
In my tests of the suffix option, both of them would adopt the class_uri dcat:Dataset and thus create two shapes called dcat:DatasetShape. While this differentiates between a class and a target sufficiently for rudimentary purposes, it does not allow the free naming scheme intended in SHACL.
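
The name collision described above can be illustrated with a small model. This is a sketch that assumes a global suffix simply appends to each class_uri, as observed in the tests mentioned here. It is not LinkML's actual generator code, and the class names `simple_dataset` and `extended_dataset` are hypothetical.

```python
from collections import defaultdict


def shape_names_with_suffix(classes, suffix):
    """Model of a global suffix: each shape name is the class_uri plus the suffix."""
    return {name: spec["class_uri"] + suffix for name, spec in classes.items()}


def find_collisions(shape_names):
    """Return every shape name that more than one class would claim."""
    owners = defaultdict(list)
    for class_name, shape in shape_names.items():
        owners[shape].append(class_name)
    return {shape: sorted(names) for shape, names in owners.items() if len(names) > 1}


# Two hypothetical LinkML classes that both validate instances of dcat:Dataset.
classes = {
    "simple_dataset": {"class_uri": "dcat:Dataset"},
    "extended_dataset": {"class_uri": "dcat:Dataset"},
}

names = shape_names_with_suffix(classes, "Shape")
collisions = find_collisions(names)
print(collisions)
```

Because the suffix applies to every class, both classes end up with the shape name dcat:DatasetShape, and neither can be given a name in a different namespace.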

This would restrict the use of LinkML's gen-shacl converter in some instances. For example, suppose I want to create a new shape in my own domain, let's call it 4Catalysis:, to verify a SKOS vocabulary, an ontology, or even just an RDFS terminology that belongs not to my domain but to a domain such as obo:, dcat: or linkml:. If the shape inherits the domain of the target class, I would misrepresent the provenance or even violate the license agreement. While this may not be a concern for some people, I also cannot publish these new SHACL shapes, because they would sit in a domain that I do not own and therefore cannot add them to.

I hope a fix for this is possible, because I would like to use LinkML in some data pipelines where we want to validate metadata in JSON as well as RDFS/SHACL formats, for which LinkML would be a perfect tool. Thanks again for the help.

nlharris added the community-generated and generator-misc labels May 2, 2024
@cmungall
Member

cmungall commented May 8, 2024

apologies, should have read more closely.

Would this be acceptable:

classes:
    dcat_dataset:
        class_uri: dcat:Dataset
        annotations:
            shape: dcat:DatasetShape

leading to:

dcat:DatasetShape a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:targetClass dcat:Dataset .

I think we should keep class_uri as being whatever is expected for rdf:type in the ABox (i.e., the targetClass).
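
A rough sketch of how such a lookup could work: the shape name comes from the proposed shape annotation when it is present, and otherwise falls back to class_uri, which is the behavior reported in this issue. The annotation is only a proposal in this thread. The function names, the `ex` default prefix, and the dict layout are assumptions made for illustration, not LinkML's implementation.

```python
def resolve_shape_and_target(class_name, spec, default_prefix="ex"):
    """Pick the shape IRI and target class for one class definition.

    class_uri stays the sh:targetClass; the (proposed) `shape` annotation,
    if given, names the shape. Without it, the shape reuses the target IRI.
    """
    target = spec.get("class_uri", f"{default_prefix}:{class_name}")
    shape = spec.get("annotations", {}).get("shape", target)
    return shape, target


def render_node_shape(shape, target):
    """Render a closed node shape in the same layout as the examples in this issue."""
    lines = [
        f"{shape} a sh:NodeShape ;",
        "    sh:closed true ;",
        "    sh:ignoredProperties ( rdf:type ) ;",
        f"    sh:targetClass {target} .",
    ]
    return "\n".join(lines)


spec = {
    "class_uri": "dcat:Dataset",
    "annotations": {"shape": "dcat:DatasetShape"},
}
shape, target = resolve_shape_and_target("dcat_dataset", spec)
print(render_node_shape(shape, target))
```

With this kind of per-class lookup, two classes that share class_uri dcat:Dataset could still carry different shape names, including names in another namespace.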

@HendrikBorgelt
Author

Yes, this should fix the current problems and also not disturb class_uri, which of course is required for some other generators.

My only remark is that the annotation name shape might be a little confusing for people who never intend to use LinkML with SHACL, so I would give it a slightly more precise (self-explanatory) name such as shape_uri, shacl_uri or shacl_name.
I only want to mention that there might be a better name for the annotation; I don't want to start a discussion about it, and I think the LinkML developers are more than capable of finding a good name, so I will leave this to you.


3 participants