Attempting to validate nodeKind sh:IRI is useless #161

Open
umbreak opened this issue Aug 10, 2018 · 2 comments

umbreak commented Aug 10, 2018

Validation of a property with "nodeKind": "sh:IRI" and "minCount": 0 doesn't work.

For example, this test against this schema passes for the wrong reason: the workbench does signal an error, but not the one the test is meant to cover.

The error happens before the validator kicks in, when trying to convert the triples into a Jena Model: Exception: Illegal character in path at index 5: wrong IRI. This is just Jena saying that a field expected to have a @type of @id cannot contain this character (a space).
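
For reference, index 5 in that message is the position of the space in "wrong IRI". A minimal Python sketch of that single check follows; the helper name `first_illegal_index` is hypothetical, and it only looks for whitespace, which is far narrower than what a real IRI parser such as Jena's checks:

```python
from typing import Optional


def first_illegal_index(candidate: str) -> Optional[int]:
    """Return the index of the first whitespace character, or None.

    Unescaped whitespace is never allowed in an IRI. This is only one of
    the checks a real parser performs; it is an illustration, not Jena's logic.
    """
    for index, char in enumerate(candidate):
        if char.isspace():
            return index
    return None


print(first_illegal_index("wrong IRI"))  # 5
print(first_illegal_index("wrong"))      # None
```

So "wrong IRI" is rejected because of the space, while "wrong" has no illegal character at all and never triggers this error.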

If you had tried to put "isDefinedBy":"wrong" in the data to test, the test would pass (even though we don't want it to pass, since wrong is not an IRI). This is due to the conversion from JSON-LD to a Jena Model.

If we have something like this:

{
  "@context": {
    "dcterms": "http://example.com/",
    "isDefinedBy": {
      "@id": "dcterms:isDefinedBy",
      "@type": "@id"
    }
  },
  "@type":"dcterms:FileFormat",
  "isDefinedBy":"wrong"
}

when converting it to Jena (without any base), the isDefinedBy triple gets dropped and is never taken into account in the validation step. This is because isDefinedBy does not get expanded to dcterms:isDefinedBy, since its value does not match the rule "@type": "@id".
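
The effect described above can be reproduced with a toy model. This is a rough Python sketch, not the JSON-LD algorithm and not what Jena runs: `to_triples`, `expand` and `is_absolute` are hypothetical helpers, it handles only a single flat node, and `http://example.org/base/` is an assumed base chosen purely for illustration.

```python
from urllib.parse import urljoin, urlparse

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand(term, context):
    """Expand a compact IRI such as 'dcterms:isDefinedBy' using context prefixes."""
    prefix, sep, suffix = term.partition(":")
    if sep and isinstance(context.get(prefix), str):
        return context[prefix] + suffix
    return term


def is_absolute(iri):
    """Rough check: an absolute IRI starts with a scheme such as 'http:'."""
    return bool(urlparse(iri).scheme)


def to_triples(doc, base=None):
    """Toy JSON-LD -> triples conversion for one flat node object."""
    context = doc.get("@context", {})
    subject = "_:b0"  # the example has no @id, so the subject is a blank node
    triples = []
    for key, value in doc.items():
        if key == "@context":
            continue
        if key == "@type":
            triples.append((subject, RDF_TYPE, expand(value, context)))
            continue
        definition = context.get(key)
        if not isinstance(definition, dict):
            continue  # undefined term: ignored in this sketch
        predicate = expand(definition["@id"], context)
        if definition.get("@type") == "@id":
            obj = expand(value, context)
            if base:
                obj = urljoin(base, obj)
            if not is_absolute(obj):
                continue  # relative IRI and no base: the triple is silently dropped
            triples.append((subject, predicate, obj))
        else:
            triples.append((subject, predicate, value))
    return triples


doc = {
    "@context": {
        "dcterms": "http://example.com/",
        "isDefinedBy": {"@id": "dcterms:isDefinedBy", "@type": "@id"},
    },
    "@type": "dcterms:FileFormat",
    "isDefinedBy": "wrong",
}

print(to_triples(doc))                                   # only the rdf:type triple
print(to_triples(doc, base="http://example.org/base/"))  # the isDefinedBy triple survives
```

Without a base, only the rdf:type triple is produced, so a validator sees no isDefinedBy value at all and a shape with "minCount": 0 has nothing to reject.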

The same happens in many other validations, like this test.

Contributor

MFSY commented Aug 13, 2018

Hi,
Thank you for looking at this.
I would not say that "Attempting to validate nodeKind sh:IRI is useless". I think that is too strong a statement.

Checking that a node is an IRI through "nodeKind":"sh:IRI" is definitely useful, and I believe that's why it was added to the SHACL core spec.

But I think you want to point out that, depending on how the user writes his/her JSON-LD document, some triples may never reach the validator because they are ignored when deserialised. And that's a fair point.

I have a few comments on that:

  1. For a user, there is only one validator: the user does not know whether the SHACL validator is based on Jena or not. So in that case, the process of loading the JSON-LD payload into a Jena model prior to sending it to a SHACL validator is completely transparent and corresponds to an atomic operation (I want to validate my JSON-LD payload and get back a validation report).

  2. Let's take the following property shape (which corresponds to the one you mentioned above):

{
  "path": "rdfs:isDefinedBy",
  "name": "Defined by",
  "nodeKind": "sh:IRI",
  "maxCount": 1
}

The following data should fail because "mybase/wrong" is not a correct IRI:

{
  "@context": {
    "dcterms": "http://example.com/",
    "isDefinedBy": {
      "@id": "rdfs:isDefinedBy",
      "@type": "@id"
    },
    "@base":"mybase/"
  },
  "isDefinedBy":"wrong"
}

The following data should pass:

{
  "@context": {
    "dcterms": "http://example.com/",
    "isDefinedBy": {
      "@id": "rdfs:isDefinedBy",
      "@type": "@id"
    },
    "@base":"http://goodbase"
  },
  "isDefinedBy":"wrong"
}
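
To make the difference between the two payloads above concrete, here is a small Python sketch that resolves "wrong" against each @base with standard RFC 3986 reference resolution. The helper name `looks_absolute` is hypothetical, and the scheme test is only a rough stand-in for a real IRI check:

```python
from urllib.parse import urljoin, urlparse


def looks_absolute(iri):
    """Rough check: an absolute IRI starts with a scheme such as 'http:'."""
    return bool(urlparse(iri).scheme)


for base in ("mybase/", "http://goodbase"):
    resolved = urljoin(base, "wrong")
    print(f"{base!r} -> {resolved!r} absolute={looks_absolute(resolved)}")
```

With "mybase/" the result is "mybase/wrong", which is still relative and so not a usable IRI; with "http://goodbase" the result is "http://goodbase/wrong", which is absolute.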

The following data should fail because the rdfs:isDefinedBy property is present with a value which is not an IRI (again, this is the user's perspective). Whether the "isDefinedBy" property is ignored or not depending on the presence of a @base is an implementation concern. If you look at the following data in the JSON-LD playground, you'll see that a default @base (https://json-ld.org/playground/) is used whenever one is not asserted in the user-provided payload. I would imagine Nexus implements such a default base feature.

{
  "@context": {
    "dcterms": "http://example.com/",
    "isDefinedBy": {
      "@id": "rdfs:isDefinedBy",
      "@type": "@id"
    }
  },
  "isDefinedBy":"wrong"
}

Author

umbreak commented Aug 13, 2018

Yes, I fully agree. That's why I was raising awareness that right now the last example you posted should fail, but it doesn't. The current test, which has "isDefinedBy":"wrong IRI", fails as expected, but only because there is a space and Jena is able to catch that error.

I just wanted to clarify that the test is misleading because it seems to cover the case when isDefinedBy is not a URI, but it doesn't cover it.

And the way to fix this is by having some default @base, as you mentioned.
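
A default base could look roughly like the Python sketch below. It is only an illustration under assumptions: `effective_base` and `resolve_id_value` are hypothetical helpers, and the default shown is the playground's base quoted earlier in this thread, where a real deployment would choose its own.

```python
from urllib.parse import urljoin

# Assumption: reuse the playground default quoted in this thread.
DEFAULT_BASE = "https://json-ld.org/playground/"


def effective_base(doc, default_base=DEFAULT_BASE):
    """Return the payload's own @base if it declares one, else the default."""
    context = doc.get("@context", {})
    if isinstance(context, dict) and "@base" in context:
        return context["@base"]
    return default_base


def resolve_id_value(doc, value, default_base=DEFAULT_BASE):
    """Resolve an '@type': '@id' value against the effective base."""
    return urljoin(effective_base(doc, default_base), value)


no_base = {"@context": {"dcterms": "http://example.com/"}, "isDefinedBy": "wrong"}
own_base = {"@context": {"@base": "http://goodbase"}, "isDefinedBy": "wrong"}

print(resolve_id_value(no_base, no_base["isDefinedBy"]))
print(resolve_id_value(own_base, own_base["isDefinedBy"]))
```

With a fallback like this the value always resolves to something, so the isDefinedBy triple is no longer silently dropped and actually reaches the validator.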
