BUG : jsonld parser does not randomise bnodes when using local _prefixed identifiers "@id": "_:mybnode01" #2760

Open
marc-portier opened this issue Apr 9, 2024 · 1 comment

marc-portier commented Apr 9, 2024

Parsing this Turtle:

_:b0 a <http://example.org/MyType> .
_:b1 a <http://example.org/MyType> .
_:x9 a <http://example.org/MyType> .

leads to the local bnode identifiers being (correctly) replaced with generated, UUID-based identifiers:

. found 0 -> BNode item.n3()='_:n17eb88c2c4cf4557b23b9407db5723ffb1' 
. found 1 -> BNode item.n3()='_:n17eb88c2c4cf4557b23b9407db5723ffb2' 
. found 2 -> BNode item.n3()='_:n17eb88c2c4cf4557b23b9407db5723ffb3'

and (also correctly) new ones are generated on every run.

Parsing the equivalent JSON-LD, however:

[
  {"@id": "_:b0", "@type": "http://example.org/MyType" },
  {"@id": "_:b1", "@type": "http://example.org/MyType" },
  {"@id": "_:x9", "@type": "http://example.org/MyType" } ]

leads to

. found 0 -> BNode item.n3()='_:b0' 
. found 1 -> BNode item.n3()='_:b1' 
. found 2 -> BNode item.n3()='_:x9' 

This extends the reach and lifetime of these local identifiers far beyond their intended scope.

In practice: loading two distinct JSON-LD files that happen to use the same local bnode identifiers into the same graph will effectively merge the nodes from both.

Note: A similar issue was identified and fixed in rdflib.js --> linkeddata/rdflib.js#555

@marc-portier
Author

In case you get bitten by this bug too:
a dirty workaround is to serialize the graph to another format and parse it again with another rdflib parser that does not have this problem:

from rdflib import Graph


def reparse(g: Graph, format: str = "nt") -> Graph:
    """Dirty workaround for https://github.com/RDFLib/rdflib/issues/2760.

    Reproduces the graph by serializing it and parsing it again
    via an intermediate format (not json-ld!) that is known to work.

    :param g: the graph to reparse
    :param format: the intermediate format to use
    :return: a new graph parsed from the intermediate serialization
    """
    return Graph().parse(data=g.serialize(format=format), format=format)
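
An alternative that avoids the round trip could be to make the labels unique before the JSON-LD reaches the parser. The following is only a naive sketch using the standard library. `relabel_bnodes` is a hypothetical helper, not part of rdflib, and it rewrites only values found under an `"@id"` key, so blank node references expressed in other ways (for example through `@type` or type-coerced terms) are not handled.

```python
import json
import uuid
from typing import Any


def relabel_bnodes(node: Any, suffix: str) -> Any:
    """Return a copy of parsed JSON-LD with each "_:" label under "@id" made unique.

    Hypothetical helper: only "@id" values are rewritten, nothing else.
    """
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "@id" and isinstance(value, str) and value.startswith("_:"):
                # Same label + same suffix -> same new label, so references
                # inside one document keep pointing at the same node.
                out[key] = f"{value}_{suffix}"
            else:
                out[key] = relabel_bnodes(value, suffix)
        return out
    if isinstance(node, list):
        return [relabel_bnodes(item, suffix) for item in node]
    return node


if __name__ == "__main__":
    text = '[{"@id": "_:b0", "@type": "http://example.org/MyType"}]'
    # One fresh suffix per document keeps its labels apart from other documents.
    unique = relabel_bnodes(json.loads(text), uuid.uuid4().hex)
    print(json.dumps(unique))
```

The re-serialized string from `json.dumps` can then be handed to `Graph().parse(data=..., format="json-ld")` as usual.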

@marc-portier marc-portier changed the title jsonld parser does not randomise bnodes when using local _prefixed identifiers "@id": "_:mybnode01" BUG : jsonld parser does not randomise bnodes when using local _prefixed identifiers "@id": "_:mybnode01" Apr 9, 2024