Replies: 4 comments 4 replies
Hi @wardi, I really like how detailed the proposal is. It feels like a change that will be widely applicable. Could you add answers to the standard set of questions:
Updated the description above with feedback from the dev call: added a link to #7856 and made ownership part of the model.
Hey @wardi 👋 Can I ask some clarifying questions?
This is much needed for "Metadata resources" (#7856) to work. Currently, DP+ optionally creates "secondary" resources (summary statistics, frequency tables, quarantined PII candidate rows), and the connection to the primary resource is a bit brittle: it just creates a file with a suffix to indicate the relationship to the primary resource, plus a resource extra holding the resource_id of the secondary resource. For the model, you may also want to add a "format kind" in addition to the "file format", with a parent-child relationship, e.g. geospatial: geojson, kml; compressed: zip, tar.gz; data: exampledata.json, exampledata.xlsx; metadata: datapackage.json, frequency_table.csv. Note how exampledata.json and datapackage.json are differentiated as data and metadata respectively, even though they are both JSON files. See https://docs.rs/file-format/latest/file_format/enum.Kind.html
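To make the "format kind" idea a bit more concrete, here is a rough Python sketch. It is only an illustration under assumptions: `FormatKind`, `FileRecord` and `by_kind` are hypothetical names, not existing CKAN or DP+ APIs, and the kind is stored per file rather than derived from the extension, since the two JSON examples show that the extension alone is not enough.

```python
from dataclasses import dataclass
from enum import Enum


class FormatKind(Enum):
    """Hypothetical parent category for a file format."""
    GEOSPATIAL = "geospatial"
    COMPRESSED = "compressed"
    DATA = "data"
    METADATA = "metadata"


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical record: the kind is stored next to the format."""
    name: str
    file_format: str
    format_kind: FormatKind


# A few of the examples from the comment above.
files = [
    FileRecord("exampledata.json", "json", FormatKind.DATA),
    FileRecord("datapackage.json", "json", FormatKind.METADATA),
    FileRecord("frequency_table.csv", "csv", FormatKind.METADATA),
    FileRecord("exampledata.xlsx", "xlsx", FormatKind.DATA),
]


def by_kind(records, kind):
    """Return the names of all records with the given format kind."""
    return [r.name for r in records if r.format_kind is kind]


metadata_names = by_kind(files, FormatKind.METADATA)
```

With something like this, a site could list all metadata files attached to a resource without relying on filename suffixes.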
CKAN currently allows at most one uploaded file per object, and only groups and resources can have one.
The group or resource model stores a reference to the file in a plain text column that can be updated like any other metadata value. Resources can store the length, hash and format of an uploaded file, but these are metadata fields that users are free to update (or not), and they aren't durably linked to the file itself.
Uploaded files can leak, staying on the underlying storage and costing money even though there is no longer any way to reach them from the CKAN site.
There is no shared way to represent files that aren't yet attached to a group or resource, e.g:
It's not possible to attach multiple files to a resource even when they represent the same data. This would be very useful for:
Model solution
Let's create a model for uploaded files in CKAN that can be linked to resources or groups or anything else that a site might need.
Files would have:
Other possibilities:
This model would make file metadata reliable, allow us to build new features and potentially save people money by better tracking hosted data in CKAN.
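As a rough illustration of the idea (not the proposed schema), the sketch below shows how a file record could carry a size and hash computed from the stored bytes, plus an optional owner link, and how tracked records could be compared against the storage listing to spot leaked files. Every name here (`UploadedFile`, `describe_upload`, `find_leaked`, the field names) is hypothetical, and SHA-256 is only an assumed choice of hash.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class UploadedFile:
    """Hypothetical file record; not an existing CKAN model."""
    id: str
    storage_path: str
    size: int
    sha256: str
    owner_type: Optional[str] = None  # e.g. "resource" or "group" (assumed values)
    owner_id: Optional[str] = None


def describe_upload(file_id: str, storage_path: str, content: bytes,
                    owner_type: Optional[str] = None,
                    owner_id: Optional[str] = None) -> UploadedFile:
    """Derive size and hash from the bytes instead of trusting user input."""
    return UploadedFile(
        id=file_id,
        storage_path=storage_path,
        size=len(content),
        sha256=hashlib.sha256(content).hexdigest(),
        owner_type=owner_type,
        owner_id=owner_id,
    )


def find_leaked(storage_paths: Iterable[str],
                files: Iterable[UploadedFile]) -> List[str]:
    """Paths present on storage that no file record points at."""
    known = {f.storage_path for f in files}
    return sorted(set(storage_paths) - known)


# Usage: one tracked file, one stray object left on storage.
tracked = describe_upload("f1", "uploads/a.csv", b"a,b\n1,2\n", "resource", "r1")
leaked = find_leaked(["uploads/a.csv", "uploads/old.zip"], [tracked])
```

Because the size and hash are computed at upload time and the record is immutable, they stay tied to the stored bytes rather than being free-form metadata.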