Problem with % in URI #14

Open
dwbutler opened this issue Apr 22, 2011 · 4 comments

@dwbutler
Member

I have some TTL data as follows:

@prefix db: <http://dbpedia.org/resource/> .
@prefix dbm: <http://dbpedia.org/meta/> .
@prefix dbo: <http://dbpedia.org/ontology/> .

db:Michael_Jackson dbo:activeYearsEndYear "2009-01-01T00:00:00Z";
   dbo:activeYearsStartYear "1964-01-01T00:00:00Z";
   dbo:artistOf db:%28I_Can%27t_Make_It%29_Another_Day,

etc

I receive the following error when trying to parse this data with RDF::Reader.for(:n3).new(data):

   dbo:artistOf db:%28I_Can%27t_Make_It%29_Another_Day,

-------------------^
Error on line 22 at offset 19: Found '%' when parsing a pathtail. expected  | "!","expression" | "^","expression"
    from /opt/local/lib/ruby/gems/1.8/gems/rdf-n3-0.3.1.1/lib/rdf/n3/reader/parser.rb:35:in `parse'
    from /opt/local/lib/ruby/gems/1.8/gems/rdf-n3-0.3.1.1/lib/rdf/n3/reader.rb:95:in `each_statement'

I originally wrote this data using the same parser.

Any help you can provide would be greatly appreciated!

@dwbutler
Member Author

The problem only seems to happen when using prefixes. If I use full URIs, the problem goes away.

<http://dbpedia.org/resource/Michael_Jackson> <http://dbpedia.org/ontology/activeYearsEndYear> "2009-01-01T00:00:00Z";
   <http://dbpedia.org/ontology/activeYearsStartYear> "1964-01-01T00:00:00Z";
   <http://dbpedia.org/ontology/artistOf> <http://dbpedia.org/resource/%28I_Can%27t_Make_It%29_Another_Day>,

@gkellogg
Member

That's because in N3, QNames have a prescribed form that disallows %-encoded characters. A QName has a prefix and a local part, both of which must be NCNames. From Namespaces in XML [1], an NCName is composed of the following characters:

[4]     NCName      ::= (Letter | '_') (NCNameChar)*     /* An XML Name, minus the ":" */
[5]     NCNameChar  ::= Letter | Digit | '.' | '-' | '_' | CombiningChar | Extender

None of these include '%'.
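
To make the rule concrete, here is a minimal sketch of an NCName check in Ruby. It is only an ASCII approximation of productions [4] and [5]: the real Letter, Digit, CombiningChar and Extender classes cover many more Unicode characters. The names NCNAME_ASCII and ncname? are hypothetical and are not part of rdf-n3.

```ruby
# ASCII-only approximation of the NCName production quoted above.
# Assumption: Letter is narrowed to A-Z/a-z and Digit to 0-9;
# CombiningChar and Extender are left out entirely.
NCNAME_ASCII = /\A[A-Za-z_][A-Za-z0-9._-]*\z/

# Hypothetical helper: true if +name+ could serve as a QName prefix
# or local part under the simplified rule above.
def ncname?(name)
  !(NCNAME_ASCII =~ name).nil?
end

puts ncname?("Michael_Jackson")                      # prints true
puts ncname?("%28I_Can%27t_Make_It%29_Another_Day")  # prints false
```

The second name fails on its first character, since '%' is neither a letter nor '_'.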

The bug is in the serializer: when it shortens a URI to a QName, it does not check that the result is a valid QName. Note that RDFa uses CURIEs instead of QNames, and this would be a valid CURIE. I'll update the serializer for N3 (and for RDF/XML) to perform this check. This should result in the following:

    @prefix db: <http://dbpedia.org/resource/> .
    @prefix dbm: <http://dbpedia.org/meta/> .
    @prefix dbo: <http://dbpedia.org/ontology/> .

    db:Michael_Jackson dbo:activeYearsEndYear "2009-01-01T00:00:00Z";
      dbo:activeYearsStartYear "1964-01-01T00:00:00Z";
      dbo:artistOf <http://dbpedia.org/resource/%28I_Can%27t_Make_It%29_Another_Day> .
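
The check described above could look roughly like the following sketch. It is not the rdf-n3 implementation: the shorten method, the LOCAL_PART pattern and the prefixes hash are all hypothetical, and the pattern is an ASCII-only stand-in for the full NCName production.

```ruby
# Hypothetical "shorten only when valid" rule; not the rdf-n3 code.
# Assumption: an ASCII-only pattern stands in for the full NCName production.
LOCAL_PART = /\A[A-Za-z_][A-Za-z0-9._-]*\z/

# Returns "prefix:local" when the part after the namespace is a valid
# local name, otherwise the full URI wrapped in angle brackets.
def shorten(uri, prefixes)
  prefixes.each do |prefix, namespace|
    next unless uri.start_with?(namespace)
    local = uri[namespace.length..-1]
    return "#{prefix}:#{local}" if local =~ LOCAL_PART
  end
  "<#{uri}>"
end

prefixes = {
  "db"  => "http://dbpedia.org/resource/",
  "dbo" => "http://dbpedia.org/ontology/"
}

puts shorten("http://dbpedia.org/resource/Michael_Jackson", prefixes)
# prints db:Michael_Jackson
puts shorten("http://dbpedia.org/resource/%28I_Can%27t_Make_It%29_Another_Day", prefixes)
# prints <http://dbpedia.org/resource/%28I_Can%27t_Make_It%29_Another_Day>
```

Falling back to the bracketed form keeps the output parseable at the cost of a longer line, which matches the expected output shown above.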

[1] http://www.w3.org/TR/1999/REC-xml-names-19990114/#dt-NSDecl

@dwbutler
Member Author

Thanks for the quick response! I am not too familiar with the details of the standards, as you can see. :)

I've been experimenting a bit, and found another workaround for this. If you specify the base_uri instead of a prefix, reading and writing work without errors.

BASE_URI = RDF::URI.new("http://dbpedia.org/resource/")
repository.dump(:n3, {:prefixes => PREFIXES, :base_uri => BASE_URI})

results in:

@base <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .

<Michael_Jackson> dbo:activeYearsEndYear "2009-01-01T00:00:00Z";
   dbo:activeYearsStartYear "1964-01-01T00:00:00Z";
   dbo:artistOf <%28I_Can%27t_Make_It%29_Another_Day>,
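
One way to see why this workaround avoids the error: a term written inside angle brackets is a URI reference rather than a QName, so the NCName restriction never applies to it. The sketch below shows a naive way a writer might produce such relative references. The relativize method is hypothetical, uses plain string prefix matching, and ignores the finer points of relative-reference resolution (for example, a remainder that contains a colon in its first segment).

```ruby
# Hypothetical helper: writes a URI relative to +base+ when it simply
# extends the base string, otherwise writes it in full.
# Assumption: plain prefix matching is good enough for illustration.
def relativize(uri, base)
  if uri.start_with?(base) && uri.length > base.length
    "<#{uri[base.length..-1]}>"
  else
    "<#{uri}>"
  end
end

base = "http://dbpedia.org/resource/"

puts relativize("http://dbpedia.org/resource/Michael_Jackson", base)
# prints <Michael_Jackson>
puts relativize("http://dbpedia.org/resource/%28I_Can%27t_Make_It%29_Another_Day", base)
# prints <%28I_Can%27t_Make_It%29_Another_Day>
puts relativize("http://dbpedia.org/ontology/artistOf", base)
# prints <http://dbpedia.org/ontology/artistOf>
```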

@ghost ghost assigned gkellogg Apr 26, 2013
@dannybtran

Ran into a similar issue when trying to parse data from dbpedia.org.

http://dbpedia.org/data/S'ym.n3

In the last statement on the page, the local part of the QName contains a single quote. This is no doubt caused by the serializer on DBpedia's end. Nevertheless, it is quite frustrating.
