Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-2370: Improve validation of dataset graph names #2373

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Conversation

kinow
Copy link
Member

@kinow kinow commented Mar 27, 2024

GitHub issue resolved #2370

Pull request Description:

@afs, @jmkeil, first stab at #2370. Had a look at 1 and 2, and I saw some parts of the text mentioning "percent-encoded" in requests, and that the backend is supposed to return 400, as it's doing now. From quickly reading those two (skipping parts, trying to find info for validation), I think

  1. empty is valid
  2. spaces are no-no
  3. otherwise it must be a URI
  4. encoded URL is no-no, it must be a valid URI, without encoded parts

I extended the current validation to also check 3 and 4 now. Does that sound right?

This is a draft until I confirm that the fix looks good, then I will either write the tests, or modify the PR again to move the code to some JS module as that a lot easier to write unit-test against, instead of testing code buried somewhere in Vue code/components.

Cheers


  • Tests are included.
  • Documentation change and updates are provided for the Apache Jena website
  • Commits have been squashed to remove intermediate development commit messages.
  • Key commit messages start with the issue number (GH-xxxx)

By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.


See the Apache Jena "Contributing" guide.

Footnotes

  1. https://www.w3.org/TR/2013/REC-sparql11-http-rdf-update-20130321/#http-post

  2. https://www.ietf.org/rfc/rfc3986.html

@kinow
Copy link
Member Author

kinow commented Mar 27, 2024

"" → OK
" " → ERROR
"snoopy" → ERROR
"http://example.org/dataset" → OK
"http://example.org/dataset#graph" → OK
"http://example.org/dataset" → OK
"http://example.org/dataset#graph?Aaa" → OK
"http://example.org/dataset%23graph" → ERROR

@kinow kinow self-assigned this Mar 27, 2024
@kinow kinow requested a review from afs March 27, 2024 12:09
@jmkeil
Copy link
Contributor

jmkeil commented Mar 27, 2024

Any valid absolute IRI should be permitted, including other schemes than http/https, like urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66.

"http://example.org/dataset%23graph" → ERROR

I think this should be OK too, as IRIs serve as identifiers in this case and therefore the Simple String Comparison for equivalence apples. (But @afs probably knows for sure.)

Maybe better leaf it up to the back-end, as it already has sophisticated methods to check that?

@kinow
Copy link
Member Author

kinow commented Mar 27, 2024

Any valid absolute IRI should be permitted, including other schemes than http/https, like urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66.

Haven't tried on this PR, but a quick test on the browser console accepts "urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66" too.

new URL("urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66")
URL { href: "urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66", origin: "null", protocol: "urn:", username: "", password: "", host: "", hostname: "", port: "", pathname: "uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66", search: "" }

I think this should be OK too, as IRIs serve as identifiers in this case and therefore the Simple String Comparison for equivalence apples. (But @afs probably knows for sure.)

👍

Maybe better leaf it up to the back-end, as it already has sophisticated methods to check that?

In some other systems I've seen the UI sending a request to the backend to validate (e.g. Jenkins does that)... this way validation was only done in the backend, and the UI simply relied on the response being OK or NOK + the error explanation.

It's harder to test without the backend, and has an extra cost on the server, but could be a solution here too.
So that should pass the validation without issues.

Happy to implement either way. I think this new version at least improves the previous one. Not sure if there are some corner cases covered by the backend that are not handled correctly in JS though...

Thanks @jmkeil !

@afs afs added the Fuseki UI label Mar 28, 2024
@jmkeil
Copy link
Contributor

jmkeil commented Apr 2, 2024

Not sure what would be the best solution. But the simplest solution I could think of would be to just do the request with the encoded graph name IRI and the back-end would complain, if it is invalid. Of course, that would have the draw back that the user might send a large payload only to learn about an invalid input. This could be avoided by sending a trial request with empty payload to test the graph name IRI. That way, no changes of the back-end are needed. Would that be feasible from the testing-perspective?

@kinow
Copy link
Member Author

kinow commented Apr 2, 2024

Would that be feasible from the testing-perspective?

I think that's doable too. Let's wait to see if @afs has other ideas or suggestions.

Thanks @jmkeil !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fuseki UI Add Data Dataset graph name not escaped
3 participants