dcat:mediaType must be a resource #237

Open
jze opened this issue Dec 19, 2022 · 3 comments

@jze
Contributor

jze commented Dec 19, 2022

The range of dcat:mediaType has been tightened from dct:MediaTypeOrExtent to dct:MediaType as part of the revision of DCAT. https://www.w3.org/TR/vocab-dcat-2/#Property:distribution_media_type

Currently a URI or a literal is returned. https://github.com/ckan/ckanext-dcat/blob/master/ckanext/dcat/profiles.py#L1411 Only a URI should be used at this point.

@jlanza

jlanza commented Jan 17, 2023

Maybe a newbie question: why is it not required to state explicitly that the resource the URI refers to is an instance of the dct:MediaType class?

if mimetype:
    mimetype_ref = URIRef(mimetype)
    g.add((mimetype_ref, RDF.type, DCT.MediaType))
    g.add((distribution, DCAT.mediaType, mimetype_ref))

@jlanza

jlanza commented Jan 19, 2023

I would like to add another comment on the same issue with the dcat:mediaType value. According to the DCAT-AP spec, both dct:format and dcat:mediaType take a dct:MediaType value.

With that in mind, if you use the full IANA URI, for example https://www.iana.org/assignments/media-types/application/ld+json, or the URI from the data.europa.eu vocabulary as suggested by the European Data Portal Metadata Quality Assessment Methodology, CKAN does not show the resource preview.

Below are two examples of what I mean. The difference affects not just the preview but also how the Dataset is later serialized.

  1. Format set to JSON_LD and mediaType set to the short IANA name application/ld+json. The preview is shown.

[Screenshot: jsonld-noref]

In this case the serialization of the properties of the Dataset as JSON-LD results in:

"dct:format": "JSON_LD",
"dcat:mediaType": "application/ld+json"

  2. Format set to the full URI http://publications.europa.eu/resource/authority/file-type/JSON_LD and mediaType set to https://www.iana.org/assignments/media-types/application/ld+json. The preview is not shown.

[Screenshot: jsonld-ref]

In this case the serialization of the properties of the Dataset as JSON-LD results in:

"dct:format": {
    "@id": "http://publications.europa.eu/resource/authority/file-type/JSON_LD"
},
"dcat:mediaType": {
    "@id": "https://www.iana.org/assignments/media-types/application/ld+json"
}

As you can see, the first example is not fully compliant with DCAT-AP, but CKAN behaves as expected. The second is the other way round: it is compliant, but CKAN does not work as expected.

Given this, I wonder whether it would be sensible to modify the dcat extension, mainly in the profile definitions, to check whether the values of format and mediaType are URI references or plain values. If they are URIs, we leave them untouched. If they are not, the logical thing would be to search for a URI that "resembles" the value, or to directly prepend the IANA or Europa vocabulary domain and path to get the full URI.
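The "prepend the base" variant of this idea could be sketched roughly as follows. This is only an illustration and not code from ckanext-dcat: the function names `is_absolute_uri`, `expand_media_type` and `expand_format` are hypothetical, the two base URIs are taken from the examples above, and it assumes that the short format value (such as JSON_LD) is already a valid code in the Europa file-type vocabulary, which would need to be checked in practice.

```python
from urllib.parse import urlparse

# Base URIs taken from the examples in this comment.
IANA_BASE = "https://www.iana.org/assignments/media-types/"
EU_FILE_TYPE_BASE = "http://publications.europa.eu/resource/authority/file-type/"


def is_absolute_uri(value: str) -> bool:
    """Return True if the value has an http(s) scheme and a host part."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def expand_media_type(value: str) -> str:
    """Leave full URIs untouched; prepend the IANA base to short media types."""
    value = value.strip()
    if is_absolute_uri(value):
        return value
    return IANA_BASE + value


def expand_format(value: str) -> str:
    """Leave full URIs untouched; prepend the Europa file-type base otherwise.

    Assumption: the short value is already a valid file-type code.
    """
    value = value.strip()
    if is_absolute_uri(value):
        return value
    return EU_FILE_TYPE_BASE + value
```

With this sketch, `expand_media_type("application/ld+json")` and `expand_format("JSON_LD")` yield the two full URIs used in the second example, while values that are already full URIs pass through unchanged. It does not address the preview problem, which depends on how CKAN itself interprets the stored format value.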

What do you think? Should I try to work that out?

Thanks for your help and comments.

@seitenbau-govdata
Member

> The range of dcat:mediaType has been tightened from dct:MediaTypeOrExtent to dct:MediaType as part of the revision of DCAT. https://www.w3.org/TR/vocab-dcat-2/#Property:distribution_media_type
>
> Currently a URI or a literal is returned. https://github.com/ckan/ckanext-dcat/blob/master/ckanext/dcat/profiles.py#L1411 Only a URI should be used at this point.

Yes, but the value is serialized as a literal only when it is not a valid URI. This avoids producing an invalid serialized graph. That is more or less necessary, because the Python library rdflib will also serialize a URI reference whose value is not a valid URI.
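To make the fallback concrete, here is a minimal standard-library sketch of the "URI if valid, otherwise literal" decision. It is not the ckanext-dcat implementation: the helpers `looks_like_uri` and `to_ntriples_object` are hypothetical, the validity check is a deliberately simple assumption (a scheme, a host, and none of the characters that N-Triples forbids inside an IRI reference), and the output is just the N-Triples form the object would take.

```python
import re
from urllib.parse import urlparse

# Characters that may not appear inside an N-Triples IRI reference.
_FORBIDDEN_IN_IRI = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def looks_like_uri(value: str) -> bool:
    """Simplified check: no forbidden characters, plus a scheme and a host."""
    if _FORBIDDEN_IN_IRI.search(value):
        return False
    try:
        parsed = urlparse(value)
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. unbalanced IPv6 brackets.
        return False
    return bool(parsed.scheme) and bool(parsed.netloc)


def to_ntriples_object(value: str) -> str:
    """Render the value as an IRI if it looks valid, else as a plain literal."""
    if looks_like_uri(value):
        return "<" + value + ">"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'
```

A short media type such as application/ld+json has no scheme, so it is rendered as a quoted literal, whereas the full IANA URI is rendered in angle brackets. A value like `http://example.org/a b` contains a space and therefore also falls back to a literal instead of producing an invalid IRI in the output.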
