Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tighten validation for id fields #7604

Open
amercader opened this issue May 25, 2023 · 2 comments · May be fixed by #8069
Open

Tighten validation for id fields #7604

amercader opened this issue May 25, 2023 · 2 comments · May be fixed by #8069

Comments

@amercader
Copy link
Member

amercader commented May 25, 2023

Having loose or inconsistent validation on id fields can lead to confusing behaviour or security concerns. Let's do a review of the various entities with id field in the model and apply the same validation to all of them (if possible). Validators should include:

  • "Can only be changed by a sysadmin": custom ids only make sense in the context of specific scenarios like migrations, it make sense that only sysadmins (or plugins that ignore auth) touch them
  • "Value is a valid UUID". In the past we've allowed different values here to support potential use cases but it's probably safer to enforce UUIDs for all ids and use custom fields if needed
  • "Id does not already exist when creating": to prevent taking over other entities

Whenever possible we should look at enforcing those not only with validators but also at the SQLAlchemy level, eg using inserts instead of upserts when creating records.

Some of these are already applied to some entities, but it would be good to be consistent across all models

  • Dataset
  • Resource
  • Resource View
  • Groups (org/groups)
  • User
  • Activity
  • Extras
  • Tags (?)
  • ...
@wardi wardi removed the To Discuss label Jun 1, 2023
@wardi wardi added this to the CKAN 2.11 milestone Jun 1, 2023
@wardi wardi added this to Candidates in 🎤 Rockstar Tickets 🎸 via automation Jun 1, 2023
@pdelboca
Copy link
Member

@amercader enforcing UUID will have conflicts with the current implementation of ckanext-harvest right?

Locally (after running a CKAN Harvester) I have resources with ID turismo_f7ce925a-0a48-4e38-8f48-360d6513fd83 that will not validate under this changes.

The following is an example of resource_show with ?id=turismo_f7ce925a-0a48-4e38-8f48-360d6513fd83:

{
  "help": "http://localhost:5000/api/3/action/help_show?name=resource_show",
  "success": true,
  "result": {
    "accessURL": "http://datos.yvera.gob.ar/dataset/fbc269ea-5f71-45b6-b70c-8eb38a03b8db/resource/f7ce925a-0a48-4e38-8f48-360d6513fd83",
    "attributesDescription": "[{\"type\": \"string\", \"description\": \"Temporada\", \"title\": \"temporada\"}, {\"type\": \"string\", \"description\": \"Puerto de amarre\", \"title\": \"puerto\"}, {\"type\": \"string\", \"description\": \"Nombre del crucero\", \"title\": \"crucero\"}, {\"specialTypeDetail\": \"cruceristas\", \"type\": \"number\", \"description\": \"Cantidad de cruceristas\", \"title\": \"cruceristas\"}, {\"type\": \"string\", \"description\": \"Especificaciones sobre los valores ausentes o nulos presentes en la serie de datos.\", \"title\": \"observaciones\"}]",
    "cache_last_updated": null,
    "cache_url": null,
    "created": "2023-09-29T14:43:19.101248",
    "datastore_active": false,
    "datastore_contains_all_records_of_source_file": false,
    "description": "Cantidad de cruceristas por temporada, puerto y nombre del crucero",
    "fileName": "cruceristas-por-puerto-crucero-temporada.csv",
    "format": "CSV",
    "hash": "",
    "id": "turismo_f7ce925a-0a48-4e38-8f48-360d6513fd83",
    "last_modified": "2023-09-29T14:43:19.057358",
    "metadata_modified": "2023-11-07T11:26:55.764273",
    "mimetype": null,
    "mimetype_inner": null,
    "name": "Cruceristas por temporada, puerto y crucero",
    "package_id": "turismo_fbc269ea-5f71-45b6-b70c-8eb38a03b8db",
    "position": 0,
    "resource_type": null,
    "size": null,
    "state": "active",
    "url": "http://datos.yvera.gob.ar/dataset/fbc269ea-5f71-45b6-b70c-8eb38a03b8db/resource/f7ce925a-0a48-4e38-8f48-360d6513fd83/download/cruceristas-por-puerto-crucero-temporada.csv",
    "url_type": null
  }

@amercader
Copy link
Member Author

@pdelboca I don't recall the harvester extension changing the incoming resource ids in any way, but maybe I'm wrong. Or maybe you have other extensions/custom schemas at play?

In any case perhaps we can loosen up the requirement a bit by using an uuid_if_not_a_sysadmin validator but the consensus was that long term it made sense to enforce UUIDs for id fields for security reasons.

@amercader amercader removed this from the CKAN 2.11 milestone Nov 21, 2023
@amercader amercader assigned amercader and unassigned shashigharti Feb 7, 2024
@amercader amercader linked a pull request Feb 9, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
4 participants