
Custom RPC from user TA #3303

Closed
jenswi-linaro opened this issue Sep 26, 2019 · 20 comments

@jenswi-linaro (Contributor) commented Sep 26, 2019

Currently we have two proposals for custom RPC from a user TA:
[1] #3270 (core: add a new pta to support any ree service for TA)
[2] #3292 (Generic RPC Support)

After a quick look I hope that the summary below isn't too far from the truth.

[1] routes the custom RPC to tee-supplicant, which can then either call into a dynamic library or pass messages to another user space process.

[2] routes the custom RPC to a separate thread in the calling client.

With [1] the TA has to open a session to the service in normal world before it can start sending requests.

With [2] there's no need to open a new session; instead, the requests are sent back using the client's session.

[2] requires changes in the kernel driver, in the OP-TEE part and in the framework; there are even changes in (extensions to) the user space API. [1] currently doesn't need any kernel changes.

In the original use case for which [2] was defined it seems to be a must to be able to receive and process the RPCs from the user TA in the calling process. I've also heard ideas from other companies about a similar way of receiving the RPCs from the TA. For [1] I don't know why this particular way of receiving the RPCs was selected.

Overall I'm more keen on [2] than [1] for the following reasons:

  1. The RPCs are something between the client and the TA only, no need to make modifications in tee-supplicant for each client/TA.
    EDIT: With [1] tee-supplicant itself will not be changed. However, it will have become dynamically extendable to execute code that is somewhat client- and/or TA-specific. With [2] the code would normally be contained in the client.
  2. Less complicated from the TA's point of view since the session is already established (no need for REE open and close session)
  3. It seems that with [2] the building blocks to implement something like [1] are available, but not the other way around.

I would of course like to see a solution that can support all the use cases that have prompted these two pull requests. Regardless of which solution we start from, there are bound to be further changes.

What do others think, @jforissier, @etienne-lms, @jbech-linaro?
@divneil, @HernanGatta, comments?

@jforissier (Contributor) commented Sep 27, 2019

Hi Jens,

I'm also commenting after a quick look, i.e., I certainly have not considered implementation details at this point. But, I have the same understanding as you regarding how the two proposals work and I also lean towards [2] as a better option.

With [1] I'm also concerned about the possibility of having tee-supplicant execute some client code. IMO that could be problematic because you probably want to control the privileges of that code in normal world: user/group ID, SELinux permissions and whatnot.

The fact that [2] requires changes in the kernel driver while [1] does not is not a significant problem in practice, since we're working upstream and I suppose the maintainer of the TEE subsystem could easily be convinced 😆

@divneil commented Sep 27, 2019

@HernanGatta I have not been well for the past few days and, owing to some laptop troubles as well, I couldn't take out much time to look at it in detail. I got the gist of the solution and it seems to work out nicely. I will come directly to one of the things I was looking to understand. Other than that, I don't have concerns with this.

The question is:
When the CA calls TEEC_OpenSessionEx(), it spawns the thread teec_grpc_thread_proc(), which waits on ioctl(.., TEE_IOC_GRPC_RECV, ..).

Let's assume you have multiple TAs that send out RPCs to Normal World CAs. How do you make sure that the RPC commands land in the right thread?

@jforissier @jenswi-linaro
This is a fairly good patch to begin with. Coming back to one of the points of (1), the .so solution allows the custom functionality to be transparent to the CA, as it need not know about some of the TA's requirements like storing clear data (logs) in the Linux filesystem, or just HMAC'ed data on the fs, to be used perhaps for telemetry. The main idea for the .so was to have stateless functionality, so that it need not be handled by the CA and, in most cases, the CA need not even be informed about it.
(Well, after writing CA too many times, I am recalling my last org's nomenclature (Conditional Access) ;))
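For illustration, here is a minimal sketch of what such a supplicant-side .so plugin entry point might look like; the entry point name, command ID and file path below are made up for this sketch and are not part of the actual proposal:

```c
/*
 * Hypothetical supplicant plugin (.so): stores TA-provided log data in the
 * normal-world filesystem without the CA being involved. The entry point
 * name, command ID and path are illustrative only.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PLUGIN_CMD_STORE_LOG	1

/* Called by tee-supplicant (e.g., after dlopen()/dlsym()) for each custom RPC */
int custom_rpc_handle(uint32_t cmd, const void *data, size_t len)
{
	switch (cmd) {
	case PLUGIN_CMD_STORE_LOG: {
		/* Append the TA's (possibly HMAC'ed) log buffer to a file */
		FILE *f = fopen("/var/log/ta_telemetry.bin", "ab");

		if (!f)
			return -1;
		fwrite(data, 1, len, f);
		fclose(f);
		return 0;
	}
	default:
		return -1;
	}
}
```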

If possible, can you help in figuring out the fault in the usage of shared memory in the zero_copy example (linaro-swg/optee_examples#65)? The usage of shared memory in that example could be used in solution (2) too.

@divneil commented Sep 27, 2019

  1. The RPCs are something between the client and the TA only, no need to make modifications in tee-supplicant for each client/TA.

It can be rephrased ;) because the whole idea was to have a solution where tee-supplicant modifications are not required any more. With this patch, tee-supplicant just hands over the processing to any REE service; it is just doing marshaling/demarshaling of params for the message queue. In the .so case, it will just call the .so function.

  2. Less complicated from the TA's point of view since the session is already established (no need for REE open and close session)

Well, sort of similar. With (1), it is now directly opening a session on the PTA using existing APIs. In (2), the same set of APIs is used, although with a different name (REE inserted).
[EDIT]: Jumbled (1) and (2). Correction below:
(2) directly opens a session on the PTA using existing APIs
(1) the same set of APIs is used, although with a different name (REE inserted)

  3. It seems that with [2] the building blocks to implement something like [1] are available, but not the other way around.

Can you please elaborate?

@divneil commented Sep 27, 2019

With [1] I'm also concerned about the possibility of having tee-supplicant execute some client code. IMO that could be problematic because you probably want to control the privileges of that code in normal world: user/group ID, SELinux permissions and whatnot.

tee-supplicant is not executing the client code. The same set of permission limitations applies as when tee-supplicant loads a TA. :)

@divneil commented Sep 27, 2019

What's this?
https://github.com/OP-TEE/optee_client/pull/170/files#diff-5f2f588f1d9298e1ffcc3ce1c5c0b71cR341-R342
https://github.com/OP-TEE/optee_client/pull/170/files#diff-0870b12fea3a2ee08770f31c8afedea5R74-R75

My understanding of your comment was that you are referring to code in the context of the Client Application, like some data parsing happening on client data. If you are considering the marshaling/de-marshaling to be in the client context, I understand your point now.

However, considering that tee-supplicant already does a lot of stuff like sockets, RPMB, etc., I suppose this shouldn't be considered a problem point.

@HernanGatta commented Oct 1, 2019

The question is:
When the CA calls TEEC_OpenSessionEx(), it spawns the thread teec_grpc_thread_proc(), which waits on ioctl(.., TEE_IOC_GRPC_RECV, ..).

Let's assume you have multiple TAs that send out RPCs to Normal World CAs. How do you make sure that the RPC commands land in the right thread?

@divneil:
RPCs issued by a TA are tied to a specific session, and for each session that is created, a new REE thread is spawned to handle that session's RPC requests. As such, when a TA instance requests an RPC, the request is added to that session's RPC queue and is then handled by that session's RPC processing thread.
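For reference, a rough sketch of that per-session receive loop is below. The TEE_IOC_GRPC_RECV name and teec_grpc_thread_proc() come from the discussion above; the struct layout, the ioctl numbers, the reply ioctl name and the handler are placeholders invented purely to show the control flow:

```c
/*
 * Sketch of the per-session OCALL/RPC receive loop, one thread per session.
 * struct tee_ioctl_grpc_arg, the ioctl numbers, TEE_IOC_GRPC_SEND and
 * handle_ocall() are placeholders; the real layout is defined by the
 * kernel changes in proposal [2].
 */
#include <stdint.h>
#include <sys/ioctl.h>

struct tee_ioctl_grpc_arg {
	uint32_t session;	/* session the RPC belongs to */
	uint32_t func;		/* TA-defined RPC/OCALL function ID */
	uint32_t ret;		/* result written back by the CA */
	/* ... parameters ... */
};

/* Placeholder ioctl numbers; the real ones come from the patched kernel header */
#define TEE_IOC_GRPC_RECV	_IOWR('t', 100, struct tee_ioctl_grpc_arg)
#define TEE_IOC_GRPC_SEND	_IOWR('t', 101, struct tee_ioctl_grpc_arg)

/* CA-provided handler; trivial stub for the sketch */
static uint32_t handle_ocall(struct tee_ioctl_grpc_arg *req)
{
	(void)req;
	return 0;	/* TEEC_SUCCESS */
}

static void *teec_grpc_thread_proc(void *arg)
{
	int fd = *(int *)arg;	/* TEE device fd for this session's context */
	struct tee_ioctl_grpc_arg req = { 0 };

	for (;;) {
		/* Block until the TA tied to this session issues an RPC */
		if (ioctl(fd, TEE_IOC_GRPC_RECV, &req) < 0)
			break;

		req.ret = handle_ocall(&req);

		/* Return the result so the TA can resume */
		if (ioctl(fd, TEE_IOC_GRPC_SEND, &req) < 0)
			break;
	}
	return NULL;
}
```

TEEC_OpenSessionEx() would then pthread_create() one such thread per session, which is what routes each RPC to the right place.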


@jenswi-linaro, @jforissier:
On a design note, since there can only ever be a single thread inside a session at a time given OP-TEE's current design, there will currently never be more than a single RPC request pending for a given session (except for concurrent PTAs). Hence, a case could be made for a design where a configurable number of REE threads are spawned to handle RPC requests for an entire TEE context. This would also work fine down the line if OP-TEE ever gains support for concurrent user-mode TAs.

Another way to implement RPCs is to do it the way SGX does it. OP-TEE's own kernel-mode RPC mechanism works this way already. Instead of adding RPC requests to a queue, per-session or otherwise, and then processing the requests in a REE user-mode thread, we could let the TEE_IOC_INVOKE IOCTL complete when an RPC request arrives, return to the client application with an RPC pending code, process the RPC, send the response back down, then resume the original command. This way, it is always the same thread that invokes a TA command and processes that command's RPC requests. It seems to me that this is a bit more involved than what we currently have by virtue of needing to keep a call context around in kernel mode across IOCTLs, but I haven't thought it all the way through.
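To make that control flow concrete, here is a rough sketch of what the client-side loop could look like with in-thread processing. The TEEC_ORIGIN_OCALL_PENDING code, the handle_ocall() helper and the reuse of TEE_IOC_INVOKE to carry the reply back down are assumptions made purely for illustration, not the existing IOCTL contract:

```c
/*
 * Sketch of in-thread OCALL processing: the same thread that invokes the TA
 * command also services its OCALLs. TEEC_ORIGIN_OCALL_PENDING, handle_ocall()
 * and re-issuing TEE_IOC_INVOKE to deliver the reply are hypothetical.
 */
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/tee.h>	/* TEE_IOC_INVOKE, struct tee_ioctl_invoke_arg, ... */

#define TEEC_ORIGIN_OCALL_PENDING	0x80000001	/* made-up origin code */
#define TEEC_ERROR_GENERIC		0xFFFF0000

/* CA-provided (hypothetical): services the OCALL and writes the reply into
 * the params that follow *arg in the same buffer. */
void handle_ocall(struct tee_ioctl_invoke_arg *arg);

/* arg points to a buffer holding struct tee_ioctl_invoke_arg immediately
 * followed by arg->num_params struct tee_ioctl_param entries, as in the
 * regular TEE_IOC_INVOKE layout. */
static uint32_t invoke_with_ocalls(int fd, struct tee_ioctl_invoke_arg *arg)
{
	struct tee_ioctl_buf_data buf = {
		.buf_ptr = (uintptr_t)arg,
		.buf_len = sizeof(*arg) +
			   arg->num_params * sizeof(struct tee_ioctl_param),
	};

	for (;;) {
		if (ioctl(fd, TEE_IOC_INVOKE, &buf) < 0)
			return TEEC_ERROR_GENERIC;

		if (arg->ret_origin != TEEC_ORIGIN_OCALL_PENDING)
			return arg->ret;	/* TA command completed */

		/* Service the OCALL in this same thread, then loop:
		 * re-issuing the IOCTL (assumed here to carry the reply)
		 * resumes the suspended TA command. */
		handle_ocall(arg);
	}
}
```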

An interesting observation regarding option [2] is that it would allow us to re-implement the entire RPC handling mechanism underneath client applications while leaving the API surface the same. This might be true of option [1], too; perhaps @divneil can pitch in.


With respect to the notion of extending the supplicant's functionality with dynamic libraries, I can see the appeal. However, I would argue that any augmentation of the supplicant's feature set should be upstreamed into the supplicant proper. The reason is that if a TA uses functionality provided by a supplicant extension, then the unit of delivery for a TA, so to speak, becomes a bundle of the TA binary plus any dynamic libraries to extend the supplicant, as opposed to just shipping and updating the TA. There are protocols being discussed like OTrP for secure TA delivery and update, and this dynamic supplicant extension mechanism does not fit within that model as far as I can tell.

P.S.: Sorry for the late reply, I'm technically on vacation until next week :). Also, thank you @jenswi-linaro for starting this discussion, as well as for your interest in this feature.

@jforissier (Contributor) commented

@HernanGatta

This would also work fine down the line if OP-TEE ever gains support for concurrent user-mode TAs.

Do you mean multithreaded TAs (several threads executing the same session simultaneously -- I know that's in violation of GP rules but it makes sense for some use cases and allowing that could be considered an extension)? Because unrelated user-mode TAs can already be executed concurrently on different cores or simply interrupted/scheduled on the same core.

There are use cases I know of for multithreaded TAs, typically for computation-intensive stuff (AI-related). Being able to leverage the REE RPC mechanism to spawn, synchronize and terminate TA threads would be nice IMO. In other words, implement libpthread for TAs and rely on generic RPC to instruct the client app to do what's needed.

@jenswi-linaro (Contributor, Author) commented Oct 2, 2019

In reply to @divneil.

  1. The RPCs are something between the client and the TA only, no need to make modifications in tee-supplicant for each client/TA.

It can be rephrased ;) because the whole idea was to have a solution where tee-supplicant modifications are not required any more. With this patch, tee-supplicant just hands over the processing to any REE service; it is just doing marshaling/demarshaling of params for the message queue. In the .so case, it will just call the .so function.

Sure, but it's still running client-specific code in tee-supplicant's context, unless the message is forwarded to another process. I'll update the text above.

I think it would be wrong to use [1] as a way of extending tee-supplicant with other system services.

  2. Less complicated from the TA's point of view since the session is already established (no need for REE open and close session)

Well, sort of similar.

I don't agree; the difference is that it's the session of the client that is reused instead of a new REE RPC session being established. That a separate session with the system PTA or some other PTA is needed doesn't have to be visible to the TA.

  3. It seems that with [2] the building blocks to implement something like [1] are available, but not the other way around.

Can you please elaborate?

With [2] the client can mimic [1] by passing on the message to some other process or by dynamically loading some code to process the request. It might be possible to twist [1] into mimicking [2], but it would require a detour via tee-supplicant and would also look a bit strange.
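As a purely illustrative example (the plugin path and symbol name below are invented), a client's OCALL handler could itself forward the request to dynamically loaded code, which is essentially what [1] does inside tee-supplicant:

```c
/*
 * Sketch: a CA-side OCALL handler forwarding requests to a dynamically
 * loaded plugin, mimicking [1]'s supplicant extension from within the
 * client. The plugin path and symbol name are made up.
 */
#include <dlfcn.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*plugin_fn)(uint32_t cmd, void *buf, size_t len);

static uint32_t forward_ocall_to_plugin(uint32_t cmd, void *buf, size_t len)
{
	static plugin_fn fn;

	if (!fn) {
		/* Lazily load the plugin the first time an OCALL arrives */
		void *h = dlopen("libta_service_plugin.so", RTLD_NOW);

		if (!h)
			return 0xFFFF0000;	/* TEEC_ERROR_GENERIC */
		fn = (plugin_fn)dlsym(h, "handle_ta_request");
		if (!fn)
			return 0xFFFF0000;
	}
	return fn(cmd, buf, len);
}
```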

@jenswi-linaro (Contributor, Author) commented

In reply to @divneil:

Coming back to one of the points of (1), the .so solution allows the custom functionality to be transparent to the CA, as it need not know about some of the TA's requirements like storing clear data (logs) in the Linux filesystem, or just HMAC'ed data on the fs, to be used perhaps for telemetry. The main idea for the .so was to have stateless functionality, so that it need not be handled by the CA and, in most cases, the CA need not even be informed about it.

I've been wondering what [1] would be used for; this gives some clues.

@HernanGatta has a good point in:

With respect to the notion of extending the supplicant's functionality with dynamic libraries, I can see the appeal. However, I would argue that any augmentation of the supplicant's feature set should be upstreamed into the supplicant proper. The reason is that if a TA uses functionality provided by a supplicant extension, then the unit of delivery for a TA, so to speak, becomes a bundle of the TA binary plus any dynamic libraries to extend the supplicant, as opposed to just shipping and updating the TA. There are protocols being discussed like OTrP for secure TA delivery and update, and this dynamic supplicant extension mechanism does not fit within that model as far as I can tell.

Perhaps we can take care of most of the problems that [1] solves by adding some new services to tee-supplicant.

@jenswi-linaro (Contributor, Author) commented

In reply to @HernanGatta:

On a design note, since there can only ever be a single thread inside a session at a time given OP-TEE's current design, there will currently never be more than a single RPC request pending for a given session (except for concurrent PTAs). Hence, a case could be made for a design where a configurable number of REE threads are spawned to handle RPC requests for an entire TEE context. This would also work fine down the line if OP-TEE ever gains support for concurrent user-mode TAs.

If a client is doing concurrent invokes on the same TEE context (file descriptor) it could increase the number of threads accordingly or use a mechanism similar to the one in tee-supplicant where new threads are spawned as needed.

Another way to implement RPCs is to do it the way SGX does it. OP-TEE's own kernel-mode RPC mechanism works this way already. Instead of adding RPC requests to a queue, per-session or otherwise, and then processing the requests in a REE user-mode thread, we could let the TEE_IOC_INVOKE IOCTL complete when an RPC request arrives, return to the client application with an RPC pending code, process the RPC, send the response back down, then resume the original command. This way, it is always the same thread that invokes a TA command and processes that command's RPC requests. It seems to me that this is a bit more involved than what we currently have by virtue of needing to keep a call context around in kernel mode across IOCTLs, but I haven't thought it all the way through.

I think this should work quite fine, and it's much simpler too. However, we'd need to update, for instance, params_to_user() to allow a completely changed struct tee_param (not a big deal). The client library also must make sure to have room for eventual RPC parameters.

An interesting observation regarding option [2] is that it would allow us to re-implement the entire RPC handling mechanism underneath client applications while leaving the API surface the same.

It would be possible, but I don't think it's desirable from a security point of view. From the secure world's perspective, tee-supplicant and the kernel can be equally trusted (as much as the normal world can be trusted), but a client is less trusted.

It occurred to me that [2] needs to identify the TA issuing the RPC as well. This is needed since the client cannot tell which TA is doing the RPC: the called TA could have opened a session to another TA which in turn does the RPC.

@divneil commented Oct 4, 2019

…the request is added to that session's RPC queue and is then handled by that session's RPC processing thread.

Okay. I think due to my limited experience with OP-TEE (just 3 months overall last year), I may have skipped the part about how the RPC traces back to the calling thread. Based on the explanation, it seems that you have used that path to route the request to a thread in that process context. If it's not too much trouble, can you please point me to the code? Anyway, I will try to go through the sequence by myself at a later time.

@divneil commented Oct 4, 2019

Thanks @jforissier and @jenswi-linaro for a good discussion. I won't belabor it further, as we are more or less at the same understanding :).

If I am needed anywhere, feel free to connect with me.

@jenswi-linaro (Contributor, Author) commented

Thank you @divneil for bringing this up. I hope we'll be able to find a solution that can handle your use cases too.

@HernanGatta commented Oct 14, 2019

I'll try to summarize the various dimensions along which the discussion runs:

Supplicant Involvement

Question: Should OCALLs from a TA to its CA be routed through the supplicant?

It seems that we agree that the supplicant should not be involved in passing OCALL requests between TAs and CAs.

In-thread vs Out-of-thread Processing

Question: Should OCALL requests be placed on a queue and processed by a thread other than the one that initiated the ECALL that in turn caused the OCALL, or should the thread that initiated said ECALL process the OCALL itself by allowing the function invocation IOCTL to return with an RPC pending code?

Implementation [2] implements the first idea because it was the least intrusive with respect to code changes. However, it seems there is sympathy for the second idea. I for one tend to agree: it's cleaner and there is no need to manage threads whose sole purpose is OCALL processing. Implementation [2] will have to be pretty much rewritten.

API Surface for CAs and TAs

Question: What should the APIs look like for the CA to configure OCALL handling, and what should the APIs be for TAs to invoke OCALLs?

[1] has a proposal for both, while [2] only has one for the first. The API surface may perhaps be discussed without knowing the exact implementation: policy vs. mechanism. Needs discussion.

Non-Secure Services for TAs

Question: The supplicant services requests on behalf of OP-TEE, which may in turn be servicing requests on behalf of TAs (e.g., sockets). Should this mechanism be exposed to TAs so that they are able to request services from the supplicant directly?

Either OP-TEE could be extended with more services, which are then passed on to the supplicant for processing, or TAs could have an API to request services from the supplicant directly, or each OCALL could be filtered by the supplicant and, if unhandled, passed to the CA (or vice versa).

Should the supplicant be extensible?

Question: Should third-parties (i.e., non-upstream builders) be able to extend the supplicant with additional services?

I've outlined my position on this, which is to extend the supplicant upstream. That way, TAs only require a sufficiently new version of the upstream supplicant and/or OP-TEE, as opposed to requiring specific extensions to be present. Would this be sufficient to cover @divneil's scenario?

@jenswi-linaro, @jforissier, @divneil: Am I missing something? What do you think?

@jenswi-linaro (Contributor, Author) commented

Hi @HernanGatta,

Well summarized. Here's my view on the questions.

In-thread vs Out-of-thread Processing: Preferably in-thread processing; we need to be careful to stay backwards compatible with old clients/libteec/kernel drivers.

API Surface for CAs and TAs: For CAs I'd prefer an API where a callback is registered by the client on the TEEC_Context (invoking the same session from the callback is not allowed); a rough sketch follows at the end of this comment. For TAs, what's in [2] is a good start. The client will need to know which TA is requesting the service since a TA can invoke another TA which does the OCALL.

Non-Secure Services for TAs: I'd prefer letting OP-TEE provide services to TAs.

Should the supplicant be extensible: Nothing we should encourage; better to have what's needed upstream. The preferred way of providing custom services should be via OCALLs.

I think it's good to call this feature or mechanism OCALL.
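A rough sketch of what such a context-level callback registration could look like, with the originating TA's UUID passed to the callback; all names and the exact parameter list below are hypothetical, just to anchor the discussion:

```c
/*
 * Hypothetical CA-side API: one OCALL handler registered per TEEC_Context.
 * The callback receives the originating TA's UUID so the CA can tell which
 * TA (possibly reached via TA-to-TA invocation) issued the OCALL.
 */
#include <tee_client_api.h>

typedef TEEC_Result (*TEEC_OcallHandler)(const TEEC_UUID *ta_uuid,
					 uint32_t cmd_id,
					 uint32_t param_types,
					 TEEC_Parameter params[4],
					 void *user_data);

/* Would be implemented by libteec; invoking the same session from within
 * the handler would not be allowed. */
TEEC_Result TEEC_RegisterOcallHandler(TEEC_Context *ctx,
				      TEEC_OcallHandler handler,
				      void *user_data);
```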

@HernanGatta commented

@jenswi-linaro:

Thank you for your input. I'll get cracking on an implementation with in-thread processing.

As for the CA needing to know which TA made the OCALL, it's not so much which TA but rather from which session the OCALL originated. If a TA calls another TA which then makes an OCALL, the CA ought to expect that TA's OCALLs, which would then be funneled through the first TA's session with the CA.

@jenswi-linaro (Contributor, Author) commented

Regarding session versus TA with an OCALL: the session is what carries the OCALL, but the meaning of the OCALL depends on the originating TA.

It's possible, or even natural, to have a common definition for a group of TAs which are used together; then the client always knows how an OCALL is to be interpreted. However, assume that one of the TAs (TA-a) happens to call a TA (TA-b) outside this group. Since TA-b isn't part of this group of TAs it uses differently formatted OCALLs, and when the CA receives the OCALL it may misinterpret the request.

If the UUID of the originating TA is delivered with the request, then the CA can easily tell what to do with the request.

@github-actions commented

This issue has been marked as a stale issue because it has been open (more than) 30 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed, otherwise this issue will automatically be closed in 5 days. Note that you can always re-open a closed issue at any time.

@HernanGatta commented Mar 19, 2020

@jenswi-linaro, @jforissier: My apologies for having taken this long to come back with a new implementation. I've finally re-implemented the feature in-thread, as opposed to using the out-of-thread call queue.

I still have to work on adding tests to xtest for the new feature, but I wanted to get out what I have so far so I can get your feedback early. I'm also aware of the style violations; the code looks fine in VS Code, but I have clearly missed the mark here and there, so I'll be fixing those.

PRs:
OPTEE-OS
OP-TEE Client
Linux
OP-TEE Examples

Thanks!

Edit: Looks like merely commenting on this issue did not re-open it.
