Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

un-finished TA invocation will not be automatically canceled after invoker process exit #4781

Open
zhaoxuyang13 opened this issue Aug 5, 2021 · 9 comments
Labels

Comments

@zhaoxuyang13
Copy link

I have a CA and a TA,

  1. CA create a new thread invoke TA command.
  2. TA spinning on a shared buffer, and process request being passed on a shared buffer.
  3. CA sending request by writing on the shared buffer, and receiving result by reading
  4. CA exit,but the TA spinning will not be canceled automatically. ( I have to call TA_CancelRequest manually or pass signal by shared buffer). if i don't cancel it by myself, the resource will not be released, and the CA remain zombie status occupying a core.
    I wonder if this is a problem ?
@jforissier
Copy link
Contributor

Hi @zhaoxuyang13,

Two things:

  • First, I am not sure the programming model you are using is a valid usage of the GlobalPlatform APIs. Basically, there is no guarantee that writes to a buffer by the CA will be visible by the TA unless you are calling TEEC_InvokeCommand(). See the TEE Client API specification section 3.2.5 Memory References.
  • Regardless of the first point, it looks like what you are describing in your point number 4 should not happen. In other words, the TA should be canceled automatically when the CA exits, as described in the TEE Internal Core API spec section 2.1.5 Unexpected Client Termination. So we might have a bug here. Do you have a simple test case you could share?

@zhaoxuyang13
Copy link
Author

Thanks for your quick reply, @jforissier .

  • for your first point, it seems I did not notice the specification. So how should I implement a switchless TA call, so I can eliminate context switch overhead ? Or is there any reason I shouldn't do this?
  • for the second part, I will try to write a test case to reproduce.

@zhaoxuyang13
Copy link
Author

@jforissier
here is the test-case based on optee-example/hello_world.
https://github.com/zhaoxuyang13/ca-term
I run op-tee in qemu environment. using qemu_v8.xml (after login, I chroot to a ubuntu18 image, btw)

after make and install, run ca.
ca first create a new thread invoking ta, then the ca will exit after 3 secs (waiting for invocation finish)
so, I assume the ca should exit normally and invocation cancelled then?
I try to use sudo kill -9 $pid to kill the process,(or exit(0) explicitly in int_handler), but then ps aux show it's still in zombie status, occupying a core.

$ ps aux | grep optee_
zhaoxuy+   297 99.0  0.0      0     0 ?        Zl   09:13   6:01 [optee_example_h] <defunct>

@jforissier
Copy link
Contributor

@zhaoxuyang13 thanks for the test case, very helpful. I understand the issue better now.

  • Firstly, you need to change the TA code to allow cancellation and introduce a cancellation point (TEE_Sleep()) in the busy loop. Otherwise the TA cannot be interrupted.
  • Secondly, the CA needs to actively request cancellation of the pending request before exiting. That I think, is an OP-TEE bug as I mentioned above. I think the driver should request cancellation on process exit. @jenswi-linaro what do you think?

Here is a patch against you test cases to make it work as expected.

diff --git a/host/main.c b/host/main.c
index 119d487..b129964 100644
--- a/host/main.c
+++ b/host/main.c
@@ -42,11 +42,12 @@ void intHandler(int dummy) {
 	exit(0);
 }
 
+TEEC_Operation op;
+
 void *call_ta(void *arg){
 	TEEC_Result res;
 	TEEC_Context ctx;
 	TEEC_Session sess;
-	TEEC_Operation op;
 	TEEC_UUID uuid = TA_HELLO_WORLD_UUID;
 	uint32_t err_origin;
 
@@ -119,6 +120,8 @@ int main(void)
 	pthread_detach(thread_id); // 线程分离,结束时自动回收资源
 	printf("sleep 3\n");
 	sleep(3);
+	printf("TEEC_RequestCancellation()\n");
+	TEEC_RequestCancellation(&op);
 	printf("exited\n");
 	return 0;
 }
diff --git a/ta/hello_world_ta.c b/ta/hello_world_ta.c
index b91eecf..086165f 100644
--- a/ta/hello_world_ta.c
+++ b/ta/hello_world_ta.c
@@ -130,8 +130,11 @@ static TEE_Result dec_value(uint32_t param_types,
 	IMSG("Got value: %u from NW", params[0].value.a);
 	params[0].value.a--;
 	IMSG("Decrease value to: %u", params[0].value.a);
+	TEE_UnmaskCancellation();
 	while(true){
-		;
+		printf("wait\n");
+		if (TEE_Wait(1000) == TEE_ERROR_CANCEL)
+			return;
 	}
 	return TEE_SUCCESS;
 }

@zhaoxuyang13
Copy link
Author

Many thanks for you answer.
I will try to patch my own code :)

@zhaoxuyang13
Copy link
Author

ahh, may I ask one more question?
If I want my TA thread to yield, I should call TEE_Wait(0) right?

@jforissier
Copy link
Contributor

ahh, may I ask one more question?
If I want my TA thread to yield, I should call TEE_Wait(0) right?

This behavior is not specified in the GP spec but in practice (OP-TEE implementation) it will sometime yield, but not always, because tee_time_wait() is not always called when timeout == 0. See here: https://github.com/OP-TEE/optee_os/blob/3.14.0/core/tee/tee_svc.c#L960. tee_time_wait() causes a return to Normal World via a RPC call, which is what will cause the yield (the scheduler is in Normal World).

Perhaps we should change the code in syscall_wait() to make sure tee_time_wait() is called at least once? Feel free to experiment and propose a patch.

@jenswi-linaro
Copy link
Contributor

  • Secondly, the CA needs to actively request cancellation of the pending request before exiting. That I think, is an OP-TEE bug as I mentioned above. I think the driver should request cancellation on process exit. @jenswi-linaro what do you think?

I agree

@github-actions
Copy link

This issue has been marked as a stale issue because it has been open (more than) 30 days with no activity. Remove the stale label or add a comment, otherwise this issue will automatically be closed in 5 days. Note, that you can always re-open a closed issue at any time.

@github-actions github-actions bot added the Stale label Sep 17, 2021
@jforissier jforissier reopened this Sep 23, 2021
@jforissier jforissier added bug and removed Stale labels Sep 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants