Consider technical approach taken for genomics data tracks #44

Open
joncison opened this issue May 21, 2019 · 0 comments


Jose Maria says ...

First of all, our approach is to use two JSON Schema format validation extensions, "curie" and "term", which are needed to validate the genomic track metadata modelled in JSON Schema. The Python code we have already implemented is available at:
https://github.com/fairtracks/fairtracks_standard/tree/master/toolsForValidation/python
For one of the milestones in the implementation study we need to check the correctness of CURIEs registered in identifiers.org, as well as ontological terms or labels. The "curie" and "term" format extensions are documented here:
• A description of the syntax understood by the extensions is available at https://github.com/fairtracks/fairtracks_standard/blob/master/toolsForValidation/README.md
• An example of a JSON Schema using these custom format extensions is available at https://github.com/fairtracks/fairtracks_standard/blob/master/toolsForValidation/test-data/fair_genomic_tracks.schema-fixed2.json
• Examples of JSON documents that pass and fail validation are available at https://github.com/fairtracks/fairtracks_standard/tree/master/toolsForValidation/test-data
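To make the idea of a CURIE check concrete, here is a minimal, standard-library-only sketch. It is not the project's "curie" extension: the regular expression is a simplified `prefix:local_id` shape, and the `allowed_prefixes` set is a hypothetical stand-in for a lookup against the namespaces registered in identifiers.org.

```python
import re

# Simplified CURIE shape "prefix:local_id". This grammar is an assumption
# made for illustration; it is not the syntax defined in the project README.
CURIE_RE = re.compile(r"(?P<prefix>[A-Za-z][A-Za-z0-9_.\-]*):(?P<local>\S+)")


def check_curie(value, allowed_prefixes=None):
    """Return True if value looks like a CURIE whose prefix is allowed.

    allowed_prefixes is a hypothetical stand-in for a registry lookup;
    when it is None, only the overall shape is checked.
    """
    if not isinstance(value, str):
        return False
    match = CURIE_RE.fullmatch(value)
    if match is None:
        return False
    if allowed_prefixes is None:
        return True
    return match.group("prefix").lower() in allowed_prefixes


if __name__ == "__main__":
    prefixes = {"doi", "pubmed"}  # hypothetical subset of registered namespaces
    for candidate in ("doi:10.1000/182", "pubmed:12345", "unknown:1", "no-colon"):
        print(candidate, check_curie(candidate, prefixes))
```

A real implementation would resolve the prefix against the registry and could also apply the per-namespace pattern for the local identifier.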
The JSON Schema standard allows both custom formats and custom keys. At the technical level, when a validation library finds an unknown custom format it either ignores it (Python jsonschema, old Perl JSON::Validator versions) or complains about it (new Perl JSON::Validator versions, Java everit-org/json-schema), depending on the implementation. Custom JSON Schema keys are always ignored, unless the implementation has some mechanism to access them at some validation stage (Python jsonschema). Custom types are also allowed by the standard, but are very poorly supported by the implementations.
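The two behaviours for unknown formats can be illustrated with a toy dispatcher. This is a sketch of the policy difference only; `check_format`, `UnknownFormatError` and the `strict` flag are hypothetical names and do not correspond to the API of any of the libraries mentioned above.

```python
class UnknownFormatError(Exception):
    """Raised in strict mode when a schema names an unregistered format."""


def check_format(value, format_name, checkers, strict=False):
    """Apply the checker registered for format_name, if there is one.

    Lenient mode (strict=False) treats an unknown format as a plain
    annotation and accepts the value; strict mode raises instead.
    """
    checker = checkers.get(format_name)
    if checker is None:
        if strict:
            raise UnknownFormatError(format_name)
        return True
    return checker(value)


if __name__ == "__main__":
    # Hypothetical registry with a deliberately trivial "curie" checker.
    checkers = {"curie": lambda v: isinstance(v, str) and ":" in v}
    print(check_format("doi:10.1000/182", "curie", checkers))
    print(check_format("anything", "term", checkers))  # lenient: accepted
    try:
        check_format("anything", "term", checkers, strict=True)
    except UnknownFormatError as exc:
        print("unknown format:", exc)
```

The practical consequence is that a schema using "curie" or "term" may pass silently in a lenient library that has no such checkers registered, so the extensions need to be hooked in explicitly.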
The implementations are being written to be as modular as possible, so that we can reuse them in other projects in the near future. The extension code is at https://github.com/fairtracks/fairtracks_standard/tree/master/toolsForValidation/python/libs, and it can easily be hooked into any validation. See https://github.com/fairtracks/fairtracks_standard/blob/master/toolsForValidation/python/fairGTrackJsonValidate.py, lines 92 to 125, as well as line 403.
We are aiming to have the same format validation extensions implemented in more programming languages, using the same syntax. However, technical issues have been holding us back. Both the Perl 5 JSON::Validator library and the Java everit-org/json-schema library (like many other libraries) limit what a custom format validator can access: basically only the value to validate, with no access to the JSON Schema context. So I have forked these libraries (https://github.com/fairtracks/json-validator, https://github.com/fairtracks/json-schema/tree/extended_custom_format) and made the changes needed for custom format validators to have access to the JSON Schema context. I have also submitted pull requests to the upstream repositories (jhthorsen/json-validator#156, everit-org/json-schema#301), so the maintainers can decide whether they are interested in this work and whether to integrate these changes.
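The limitation can be sketched as the difference between two checker signatures. The code below is a toy illustration, not the API of the forks or of the upstream libraries; the custom key `x-allowed-terms` and all function names are hypothetical.

```python
def term_value_only(value):
    """Value-only checker: it sees the string, not the schema around it."""
    return isinstance(value, str) and bool(value.strip())


def term_with_context(value, schema):
    """Context-aware checker: reads a custom key next to "format"."""
    allowed = schema.get("x-allowed-terms")  # hypothetical custom key
    if allowed is None:
        # No constraint declared in the schema, fall back to the basic check.
        return term_value_only(value)
    return value in allowed


def validate_string(value, schema, checkers):
    """Toy dispatcher that hands the enclosing subschema to the checker."""
    checker = checkers.get(schema.get("format"))
    if checker is None:
        return True  # unknown or absent format: nothing to check
    return checker(value, schema)


if __name__ == "__main__":
    schema = {
        "type": "string",
        "format": "term",
        "x-allowed-terms": ["liver", "kidney"],
    }
    checkers = {"term": term_with_context}
    print(validate_string("liver", schema, checkers))
    print(validate_string("lung", schema, checkers))
```

With a value-only signature, "lung" cannot be rejected because the checker never learns which terms the schema permits; passing the subschema is what makes schema-driven checks possible.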
The next step is to reimplement the Python code in Perl and Java, with similar behaviour. We are also interested in implementing more custom format extensions, so that other custom checks we already do in other projects (unique values, foreign keys) can be integrated more easily.
