Using SPARQL constraints with the keyword graph on nquads #175

Open
Manoe-K opened this issue Mar 28, 2023 · 1 comment

Comments


Manoe-K commented Mar 28, 2023

Hello,
I am trying to use SHACL to validate the shape of N-Quads data.
For some of the constraints I use SPARQL constraints with the GRAPH keyword, to check that the information within a named graph is coherent with the information in the default graph.

Here's an example of such a constraint:

sh:sparql [
        sh:prefixes foaf: ;
        sh:select """
            SELECT $this
            WHERE {
                GRAPH $this { ?s ?p ?o. } 
                ?s a foaf:Person.
            }
            """ ;
    ] ;
.

My current approach is to load the N-Quads data as an rdflib Dataset and then call validate() with data_graph=my_dataset.

Doing this raises the following exception, even though I am already using a Dataset:

Exception: You performed a query operation requiring a dataset (i.e. ConjunctiveGraph), but operating currently on a single graph.

By reading issue #26, I think I understood that validating over datasets validates over each graph one by one, which would explain the exception.

So is there a way to validate n-quads that would allow me to use such SPARQL constraints?
Or in general, is there a way to validate a Dataset as one block?

Collaborator

ashleysommer commented Mar 31, 2023

Hi @Manoe-K
Thanks for bringing this up. The issue from #26 is resolved, and it is not related to this issue.

The issue thread you want to read is #152; this issue is in fact a duplicate of it.

You are right that what you want to do seems like it should be possible, but due to the architecture of PySHACL, it is not. (And the error you are seeing is misleading.)

The W3C SHACL Spec is written with the assumption that your data graph is a single graph, rather than a dataset or a union of named graphs. To adhere to this assumption, when we added the feature that allows validation of an RDFLib Dataset, it was necessary to run validation iteratively on each named graph separately.
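The per-graph approach described above can be pictured with a toy model in plain Python. This is only a sketch of the idea, not PySHACL's code: a dataset is modelled as a dict from graph name to a set of triples, `None` stands for the default graph, and the names `has_person`, `validate_each_graph` and `subjects_missing_type` are hypothetical.

```python
from typing import Callable, Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]
TripleSet = Set[Triple]
ToyDataset = Dict[Optional[str], TripleSet]  # None = default graph


def has_person(graph: TripleSet) -> bool:
    """Single-graph check: does this graph type anything as foaf:Person?"""
    return any(p == "rdf:type" and o == "foaf:Person" for _, p, o in graph)


def validate_each_graph(
    dataset: ToyDataset, check: Callable[[TripleSet], bool]
) -> Dict[Optional[str], bool]:
    """Run the check on every graph in isolation, one result per graph."""
    return {name: check(graph) for name, graph in dataset.items()}


def subjects_missing_type(dataset: ToyDataset, graph_name: str) -> Set[str]:
    """Cross-graph check: needs the named graph AND the default graph at once."""
    typed = {
        s for s, p, o in dataset[None]
        if p == "rdf:type" and o == "foaf:Person"
    }
    return {s for s, _, _ in dataset[graph_name]} - typed


dataset: ToyDataset = {
    None: {("ex:alice", "rdf:type", "foaf:Person")},
    "ex:g1": {("ex:alice", "foaf:name", "Alice")},
}

# Checked in isolation, ex:g1 cannot see the typing held in the default graph.
per_graph = validate_each_graph(dataset, has_person)
print(per_graph)

# The cross-graph check only works because it is handed the whole dataset.
print(subjects_missing_type(dataset, "ex:g1"))
```

In this model the single-graph check never receives more than one graph, which is why a constraint that compares a named graph against the default graph has nothing to compare against.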

To validate across the whole Dataset at once, it would be necessary to re-engineer large portions of PySHACL and to have it make decisions about validation that are outside the scope of the W3C SHACL Spec, which is dangerous territory. I have not used other SHACL validation engines, but I believe none of them have this ability either, for the same reason.
