Bug in longturtle serialization #2767

Closed
mschiedon opened this issue Apr 15, 2024 · 2 comments

mschiedon commented Apr 15, 2024

The longturtle serializer fails to emit a whitespace separator between a predicate and a list of objects if one of these objects is a blank node (and the blank node cannot be 'inlined', i.e. is used more than once). The problem can be reproduced using this Python code:

from rdflib import Graph

# Two subjects share one blank node, so it cannot be inlined as [ ... ].
turtle_data = '''\
@prefix ex: <https://example.org/> .

ex:1 a ex:Thing ;
    ex:relatedTo ex:3, _:bnode0 .

ex:2 a ex:Thing ;
    ex:relatedTo _:bnode0 .

_:bnode0 a ex:Thing .
'''

graph = Graph().parse(data=turtle_data, format='turtle')
output = graph.serialize(format='longturtle')
print(output)
assert 'relatedTo_:' not in output, \
    'Missing whitespace separation between predicate' \
    ' and the first blank node of a list of objects.'

The buggy Turtle output looks like this. Note the missing space between the predicate ex:relatedTo and the blank node _:n40fef3a41a034be9a7116df126afd613b1 in the ex:1 case. The ex:2 case is serialized with the space separator because it has a single object rather than a list.

PREFIX ex: <https://example.org/>

ex:1
    a ex:Thing ;
    ex:relatedTo_:n40fef3a41a034be9a7116df126afd613b1 ,
        ex:3 ;
.

ex:2
    a ex:Thing ;
    ex:relatedTo _:n40fef3a41a034be9a7116df126afd613b1 ;
.

_:n40fef3a41a034be9a7116df126afd613b1
    a ex:Thing ;
.
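
To check serialized output for this defect without hard-coding the predicate name, a small scan over the text can flag any line in which a blank-node label directly follows non-whitespace characters. This is only a heuristic sketch: the helper name `find_glued_bnodes` and the regular expression are assumptions made for illustration, not part of rdflib, and the pattern could also match a `_:` sequence that legitimately occurs inside a literal or a prefixed name. The sample strings use a shortened, made-up label `_:b0`.

```python
import re

# Heuristic: a blank-node label ("_:" plus a name character) glued to
# preceding non-whitespace characters, e.g. "ex:relatedTo_:b0".
GLUED_BNODE = re.compile(r"\S_:\w")


def find_glued_bnodes(turtle_text):
    """Return the stripped lines that contain a glued blank-node label."""
    return [
        line.strip()
        for line in turtle_text.splitlines()
        if GLUED_BNODE.search(line)
    ]


# Shortened versions of the output above; "_:b0" is a made-up label.
buggy = "ex:1\n    a ex:Thing ;\n    ex:relatedTo_:b0 ,\n        ex:3 ;\n.\n"
fixed = "ex:1\n    a ex:Thing ;\n    ex:relatedTo _:b0 ,\n        ex:3 ;\n.\n"

print(find_glued_bnodes(buggy))
print(find_glued_bnodes(fixed))
```

The first call reports the line where the predicate and the blank node run together; the second returns an empty list.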

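I believe the issue can be solved in the longturtle.py source code by indenting the first_nl = True line one more level, so that it is only set when a newline and indent were actually written for the first object, as shown in the code below.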

    def objectList(self, objects):
        count = len(objects)
        if count == 0:
            return
        depthmod = (count == 1) and 0 or 1
        self.depth += depthmod
        first_nl = False
        if count > 1:
            if not isinstance(objects[0], BNode):
                self.write("\n" + self.indent(1))
                # FIX: the line below was given one extra indent level.
                first_nl = True
        self.path(objects[0], OBJECT, newline=first_nl)
        for obj in objects[1:]:
            self.write(" ,")
            if not isinstance(obj, BNode):
                self.write("\n" + self.indent(1))
            self.path(obj, OBJECT, newline=True)
        self.depth -= depthmod
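
The effect of that one-level indent can be reproduced without rdflib using a simplified, standalone model of the method above. Everything here is a sketch under assumptions: `BNode` is a stand-in class, `write_object` is a hypothetical replacement for the serializer's `path` method and is assumed to emit a separating space unless `newline` is true, and the `fixed` flag switches between the two placements of `first_nl = True` described in this issue. Depth tracking is left out.

```python
class BNode(str):
    """Stand-in for rdflib's BNode; any other str is a non-blank object."""


INDENT = "\n        "  # assumed newline-plus-indent written before an object


def write_object(label, newline):
    # Assumption: like the serializer's path(), emit a separating space
    # unless the caller says a newline and indent were already written.
    return label if newline else " " + label


def object_list(objects, fixed):
    """Return the text emitted after a predicate for a list of objects."""
    if not objects:
        return ""
    out = []
    first_nl = False
    if len(objects) > 1:
        if not isinstance(objects[0], BNode):
            out.append(INDENT)
            if fixed:
                first_nl = True  # proposed placement: inside the inner if
        if not fixed:
            first_nl = True  # placement before the fix: set for any list
    out.append(write_object(objects[0], first_nl))
    for obj in objects[1:]:
        out.append(" ,")
        if not isinstance(obj, BNode):
            out.append(INDENT)
        out.append(write_object(obj, True))
    return "".join(out)


objects = [BNode("_:b0"), "ex:3"]
print(repr("ex:relatedTo" + object_list(objects, fixed=False)))
print(repr("ex:relatedTo" + object_list(objects, fixed=True)))
```

With `fixed=False` the model glues the blank node to the predicate, as in the reported output; with `fixed=True` the space is emitted because no newline was written for the first object.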

nicholascar (Member) commented

I think this issue has been addressed by PR #2700 but that fix is currently only in the HEAD of this repo, not an RDFlib release yet. It should appear in 7.0.1 or 7.1.0 in the next few weeks when we make that release which will fix a bunch of small things.

mschiedon commented May 23, 2024

> I think this issue has been addressed by PR #2700 but that fix is currently only in the HEAD of this repo, not an RDFlib release yet. It should appear in 7.0.1 or 7.1.0 in the next few weeks when we make that release which will fix a bunch of small things.

Excellent, thank you! I can confirm this addresses the issue. Looking forward to the next rdflib release then 👍
