Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jena-fuseki-access - Propagate request/service context #1291

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

vtermanis
Copy link

@vtermanis vtermanis commented May 3, 2022

What

Process request/service context in AccessControlledDataset the same way as is done in (parent) query processor, i.e. honour the server => dataset => endpoint context value priority order.

Why

Standard SPARQL queries honour the context of the endpoint (see docs) as well as setting of reuqest-specific timeout via the timeout URL parameter. These two features allow one to:

  • Override context values specified at server and/or dataset level (e.g. disable access to text indexing or change the default timeout for the endpoint)
  • Specify per-request timeouts

(See bottom of summary of example. See also mailing list thread.)

How

  • Use QueryExec to create the QueryExecution instead of QueryExecutionFactory (the latter does not consider HTTPActions's context)
  • Use same timeout parameter logic for pre-request timeouts as in base SPARQL query processor

Note: I am not convinced that the way I've updated AccessControlledDataset (and promoted a helper method from SPARQLQueryProcessor.java to public) is the right way to go. But I can confirm that the following use-case works:

  1. Define a dataset DS1 with jena:text indexing enabled
  2. Define a dataset DS2, with AccessControlledDataset wrapping DS1. (The access actual rules are irrelevant here.)
  3. Define service A exposing a query endpoint for DS1, with extended context: ja:context [ ja:cxtName "http://jena.apache.org/text#index" ; ja:cxtValue false ] ;
  4. Define service B the same as A, but for DS2

Current behaviour (pre-patch):

  • jena:text is exposed only with query endpoint of B. SPARQL queries against A do not match text-indexed properties.

Expected behaviour (post patch):

  • Neither A nor B match text-indexed properties

.query(query)
.context(action.getContext())
;
// Note: Instead of this call (and the above build), the equivalent code from
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeating the same step here for timeout seems not right (but I wasn't sure what a better / the right approach would be).

Copy link
Member

@kinow kinow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use-case sounds good to me. Not sure about the implementation either, so probably needs @afs 's review here.

Thanks!

(also need to confirm all tests pass in Jenkins or GH Actions [manual])

@afs
Copy link
Member

afs commented May 3, 2022

From users@ thread https://lists.apache.org/thread/h0c81qjl8oc83yl2xf7xvt4l0pw4grrf

It looks like there is an issue as to whether the text dataset should push down the context setting for the index or not. Requiring endpoint configuration isn't so user-friendly.

This PR may contain a change to make anyway but this jena-fuseki-access uses context settings DataAccessCtl.symAuthorizationService itself so allowing the endpoints to modify context may be a security risk. Needs investigation - it's been a long time since I looked at the code!

An outstanding question from email:

  • Doesn't that [setting the context to an illegal value] cause warnings in the Fuseki log?

It'll need some tests.

@vtermanis
Copy link
Author

vtermanis commented May 3, 2022

  • Doesn't that [setting the context to an illegal value] cause warnings in the Fuseki log?

(copied from thread reply):

Yes it does - three (expected) warnings from TextQueryPF:

  1. Context setting 'symbol:http://jena.apache.org/text#index'is not a TextIndex
  2. Failed to find the text index : tried context and as a text-enabled dataset
  3. No text index - no text search performed

It'll need some tests.

I'd be happy to assist with this (time permitting) if I'm pointed in the right direction.

@vtermanis
Copy link
Author

Apologies - I updated the description since it had a mistake. The expected behaviour section has changed to:

Current behaviour (pre-patch):

  • jena:text is exposed only with query endpoint of B. SPARQL queries against A do not match text-indexed properties.

Expected behaviour (post patch):

  • Neither A nor B match text-indexed properties

@vtermanis
Copy link
Author

(Changing to draft as suggested, since the best way to address this is still being debated.)

@vtermanis vtermanis marked this pull request as draft May 4, 2022 08:05
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 4, 2022
- Requires context propagation patch (see apache#1291)
- Comment specified before query can be used to override security
  context with a given set of graphs.
- Overridden security context is preserved in the execution context
- Simplified regex pattern only - has to be first non-whitespace line
  to work.
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 4, 2022
- Requires context propagation patch (see apache#1291)
- Comment specified before query can be used to override security
  context with a given set of graphs.
- Overridden security context is preserved in the execution context
- Simplified regex pattern only - has to be first non-whitespace line
  to work.
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 4, 2022
- Requires context propagation patch (see apache#1291)
- Comment specified before query can be used to override security
  context with a given set of graphs.
- Overridden security context is preserved in the execution context
- Simplified regex pattern only - has to be first non-whitespace line
  to work.
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 11, 2022
- "#pragma acl.graphs" preamble can be used to specify one or more
  graphs to restriction query exectution to
- The functionality is only enabled if an access:entry for a specific user
  contains just <urn:jena:accessGraphsDynamic>
- Requires context propagation patch (see apache#1291), since query
  is stored in request-specific context
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 11, 2022
- "#pragma acl.graphs" preamble can be used to specify one or more
  graphs to restriction query exectution to
- The functionality is only enabled if an access:entry for a specific user
  contains just <urn:jena:accessGraphsDynamic>
- Requires context propagation patch (see apache#1291), since query
  is stored in request-specific context
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 13, 2022
- "#pragma acl.graphs" preamble can be used to specify one or more
  graphs to restriction query exectution to
- The functionality is only enabled if an access:entry for a specific user
  contains just <urn:jena:accessGraphsDynamic>
- Only applies to SPARQL query, not GSP (since the latter can already be
  filtered dynamically)
- Requires context propagation patch (see apache#1291), since query
  is stored in request-specific context
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 14, 2022
- "#pragma acl.graphs" preamble can be used to specify one or more
  graphs to restriction query exectution to
- The functionality is only enabled if an access:entry for a specific user
  contains just <urn:jena:accessGraphsDynamic>
- Only applies to SPARQL query, not GSP (since the latter can already be
  filtered dynamically)
- Requires context propagation patch (see apache#1291), since query
  is stored in request-specific context
@vtermanis vtermanis force-pushed the vt-add-ctx-to-fuseki-access branch from b4d73bd to 62b4adc Compare July 14, 2022 10:33
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 14, 2022
- "#pragma acl.graphs" preamble can be used to specify one or more
  graphs to restriction query exectution to
- The functionality is only enabled if an access:entry for a specific user
  contains just <urn:jena:accessGraphsDynamic>
- Only applies to SPARQL query, not GSP (since the latter can already be
  filtered dynamically)
- Requires context propagation patch (see apache#1291), since query
  is stored in request-specific context
- Honour context from HttpAction (as is done in SPARQLQueryProcessor parent)
- Set timeouts from request (as with SPARQLQueryProcessor)
- Without this, context changes configured against an endpoint are ignored if
  the associated dataset is an AccessControlledDataset
@vtermanis vtermanis force-pushed the vt-add-ctx-to-fuseki-access branch from 62b4adc to a1eb933 Compare July 14, 2022 10:35
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 14, 2022
- "#pragma acl.graphs" preamble can be used to specify one or more
  graphs to restriction query exectution to
- The functionality is only enabled if an access:entry for a specific user
  contains just <urn:jena:accessGraphsDynamic>
- Only applies to SPARQL query, not GSP (since the latter can already be
  filtered dynamically)
- Requires context propagation patch (see apache#1291), since query
  is stored in request-specific context
vtermanis added a commit to vtermanis/jena that referenced this pull request Jul 14, 2022
- "#pragma acl.graphs" preamble can be used to specify one or more
  graphs to restriction query exectution to
- The functionality is only enabled if an access:entry for a specific user
  contains just <urn:jena:accessGraphsDynamic>
- Only applies to SPARQL query, not GSP (since the latter can already be
  filtered dynamically)
- Requires context propagation patch (see apache#1291), since query
  is stored in request-specific context
@afs
Copy link
Member

afs commented Jul 14, 2022

@vtermanis -- there has been a bug fixes for general context handling with queries and these are now on the main branch. I don't think it immediately relates to your report but I wanted to make sure you are working against the right state of the codebase.

@vtermanis
Copy link
Author

@afs thanks for the heads-up, do you mean:

Is that the main one or other there others in particular? (Though it'll probably be clear when I rebase what's changed and look at the query servlet & processor code)

@afs
Copy link
Member

afs commented Jul 29, 2022

#1444 is about transactions.

#1375 and parts of some others around that time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants