turtle parsing not correctly handling . in suffix #601

Open
jeswr opened this issue Feb 5, 2023 · 6 comments

@jeswr

jeswr commented Feb 5, 2023

As discussed in this thread, terms like ex:a.b appear to be valid according to the definition of PN_LOCAL in the Turtle grammar. However, rdflib appears to interpret this as ex:a . b rather than as a single term.

The terms are correctly parsed by N3.js but not by the custom parser in this library.
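To make the grammar point concrete, here is a minimal sketch of the PN_LOCAL rule as a regular expression. It is an ASCII-only simplification and not the code of rdflib or N3.js: the real production also admits a wide range of non-ASCII characters and escape sequences, and the name isValidLocalName is made up for illustration.

```javascript
// Simplified, ASCII-only approximation of the Turtle PN_LOCAL production.
// Assumption: non-ASCII characters and percent/backslash escapes (PLX),
// which the real grammar allows, are left out for brevity.
const FIRST = "[A-Za-z0-9_:]";       // PN_CHARS_U | ':' | [0-9]
const MIDDLE = "[A-Za-z0-9_:.\\-]";  // PN_CHARS | '.' | ':'
const LAST = "[A-Za-z0-9_:\\-]";     // PN_CHARS | ':' (no '.')
const LOCAL_NAME = new RegExp(`^${FIRST}(?:${MIDDLE}*${LAST})?$`);

// Hypothetical helper: true when `local` fits the simplified rule.
function isValidLocalName(local) {
  return LOCAL_NAME.test(local);
}
```

Under this approximation a dot is accepted in the middle of a local name, as in a.b, but not as its first or last character.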


It might be worth considering migrating to rdf-parse & rdf-serialize instead of the custom parsers, given that those packages use parsers/serializers that pass 100% of the spec tests, and most are also currently having RDF-star & RDF 1.2 support added to them.

@situx

situx commented Feb 8, 2023

I second this. I recently tried to parse the GeoSPARQL 1.1 vocabularies with rdflib.js, but it does not work because of this problem. All URIs describing examples in the GeoSPARQL specification contain dots.

@timbl
Member

timbl commented Feb 8, 2023

They don't though seem to be going toward full Notation3 support. The current parser in rdflib will parse not only turtle but full Notation3 features like

  • nested graphs { :today a :NiceDay; :temp 15.0 } :wastrue :yesterday
  • variables ?who
  • explicit quantification @forall
  • the implies operator =>
  • RDF paths like :person!parent!name or :person^child!name useful in queries
  • naked names without the leading ':' Alice foaf:knows Bob .
  • Sets

which are useful for knowledge about knowledge, time-qualified data, and rules, and proofs

but also some things which were dropped from the Turtle spec for no understandable reason

  • inverse arcs :Joe :child Alice; is :child of Bob .
  • Defaults to @prefix : <#>. to save time
  • Two valid N3 files concatenated are a valid N3 file

and also it has some things which are just fun and useful, especially for testing

  • Naked dates 2023-02-08 or 2023-02-08T12:55 instead of "2023-02-08T12:55"^^xsd:dateTime

There is probably other stuff but that's it off the top of my head. The parser is old, and was converted from python, but the functionality is more than just turtle.

I guess we could keep it for things explicitly labelled text/n3 and use the other one .. though I know there were many issues trying to connect the RDF object models.

But I suggest we change the notation3 parser to match the turtle spec, by adding a dot to the allowed characters in a name. There may be side-effects, as there will have to be kludge code to check edge cases like :alice. :knows :bob. ...
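One possible shape for that edge-case handling is a greedy scan followed by a back-off over trailing dots. The sketch below is not the rdflib parser's code; readLocalName and NAME_CHAR are hypothetical names, the character set is ASCII-only, and the first-character rules of the grammar are not checked.

```javascript
// Characters that may appear inside a local name (ASCII-only assumption).
const NAME_CHAR = /[A-Za-z0-9_:.\-]/;

// Hypothetical helper: reads a local name starting at index `start` and
// returns the name plus the index where scanning should resume.
function readLocalName(input, start) {
  let end = start;
  // Greedy step: take every character that could belong to the name,
  // including dots.
  while (end < input.length && NAME_CHAR.test(input[end])) end++;
  // Back-off step: a name cannot end in '.', so trailing dots are handed
  // back to the caller, where they act as statement terminators.
  while (end > start && input[end - 1] === ".") end--;
  return { name: input.slice(start, end), next: end };
}
```

Called on ":alice. :knows :bob." at index 1 (just past the colon), this yields the name alice and leaves the dot as the next character to read, while "ex:a.b" at index 3 yields a.b as a single name.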

@timbl
Member

timbl commented Feb 8, 2023

(The history is that Notation3 came first, and then people standardized Turtle as a subset but unfortunately made small incompatible changes in the Turtle spec. Allowing the dot in names, when the dot is already punctuation in the grammar, obviously makes the language more complicated, requiring communication between the tokenizer and the grammar parser. Of course, another option would be to promote a change in the Turtle standard to fix that and a few other things. But life may be too short.)

@jeswr
Author

jeswr commented Feb 8, 2023

They don't though seem to be going toward full Notation3 support. The current parser in rdflib will parse not only turtle but full Notation3 features like [...] I suggest we change the notation3 parser to match the turtle spec.

rdf-parse & rdf-serialize already support most (perhaps even all?) standardised RDF serializations. This includes support for Notation3 and Turtle given by N3.js under the hood.

The only Notation3 features listed above that N3.js may be missing in its parser are naked names & sets (@RubenVerborgh I'm guessing you would know?).

Indeed N3.js is not up to scratch for serializing Notation3 at present as this custom code is required to use it for serializing Notation3 for use with the webassembly distribution of the eye reasoner.

@RubenVerborgh
Member

support for naked names & sets

The N3.js code was based on a combined interpretation of https://www.w3.org/DesignIssues/Notation3.html and https://www.w3.org/TeamSubmission/n3/. I don't think either of them supports naked terms. The set syntax {$ 1, 2, <a> $} is marked as not part of N3.

A new effort is on the way to standardize N3 as a superset of Turtle: https://w3c.github.io/N3/spec/. This is also how N3.js interprets N3, so all Turtle syntax constructs like ex:a.b are also supported.

I think rdflib.js should definitely support the full Turtle syntax (for MIME type text/turtle), and also consider parsing N3 as a superset of Turtle (as @timbl suggests above), which would be in line with the new N3 spec effort.
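As a rough illustration of choosing the grammar by MIME type, a dispatcher could look like the sketch below. The functions parseAsTurtle and parseAsN3 are hypothetical placeholders rather than rdflib.js or N3.js APIs; they only report which grammar would be applied.

```javascript
// Hypothetical stand-ins for two real parsers; they do not parse anything
// and only record which grammar was selected.
function parseAsTurtle(text) { return { grammar: "turtle", text }; }
function parseAsN3(text) { return { grammar: "n3", text }; }

const PARSERS = new Map([
  ["text/turtle", parseAsTurtle],
  ["text/n3", parseAsN3],
]);

function parseByContentType(text, contentType) {
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const mime = contentType.split(";")[0].trim().toLowerCase();
  const parser = PARSERS.get(mime);
  if (!parser) throw new Error(`Unsupported content type: ${mime}`);
  return parser(text);
}
```

If N3 is treated as a superset of Turtle, both entries could eventually point at the same parser, with the MIME type only switching the N3-specific extensions on or off.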

@robertschubert
Copy link

robertschubert commented Mar 25, 2024

I second this also.
Tried to parse a path like this:
sh:path prefix:P2.1.1
The occurrence of the dots leads to an error.
