Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Xen] optee_driver_init() panic after device PTA invoke error #77

Open
lorc opened this issue May 11, 2020 · 9 comments
Open

[Xen] optee_driver_init() panic after device PTA invoke error #77

lorc opened this issue May 11, 2020 · 9 comments

Comments

@lorc
Copy link

lorc commented May 11, 2020

There are multiple issues with device PTA driver:

  1. It uses old style TMEM arg even with dynamic SHM enabled. Xen is not happy about this
  2. If optee_enumerate_devices() fails, optee_driver_init() fails. But actually optee can work without this feature. Does it means, than older OP-TEE versions are not supported anymore?
  3. Drivers panics during optee_remove()

Basically I'm getting this error log:

optee: probing for conduit method from DT.                                                         
optee: revision 3.8                                                                                
optee: dynamic shared memory is enabled    
(XEN) optee.c:868:d0v0 Guest tries to use old tmem arg
optee: PTA_CMD_GET_DEVICES invoke function err: ffff0006                                           
Unable to handle kernel NULL pointer dereference at virtual address 00000058
Mem abort info:                                 
  Exception class = DABT (current EL), IL = 32 bits  
  SET = 0, FnV = 0                         
  EA = 0, S1PTW = 0                         
Data abort info:                                                                                   
  ISV = 0, ISS = 0x00000005         
  CM = 0, WnR = 0                                                                                  
[0000000000000058] user address but active_mm is swapper                                           
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in:     
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.75-ltsi-yocto-tiny #7                               
Hardware name: Renesas Salvator-XS board based on r8a7795 ES2.0+ (DT)                              
task: ffffffc6dd908000 task.stack: ffffffc6dd904000                             
PC is at tee_shm_free+0x18/0x40                  
LR is at optee_disable_shm_cache+0x88/0xbc       
pc : [<ffffff80083f45a4>] lr : [<ffffff80083f6540>] pstate: 80000045
sp : ffffffc6dd907c80
x29: ffffffc6dd907c80 x28: ffffff8008729300 
x27: 0000000000000007 x26: ffffff80086f8078 
x25: ffffffc6d76d4980 x24: 0000000000000004 
x23: ffffff8008559798 x22: 00000000b200000a 
x21: ffffffc6dd907ce0 x20: ffffffc6dd0c2c00 
x19: 0000000000000000 x18: 0000000000000010 
x17: 0000000000007fff x16: 00000000deadbeef 
x15: 0000000000000000 x14: ffffffc6dce4f028 
x13: ffffffc6d76cfba0 x12: 0000000000000020 
x11: ffffffc6dce4eff8 x10: ffffffc6d76cfba0 
x9 : 0000000000000001 x8 : ffffff80083f5598 
x7 : 0000000000000000 x6 : 0000000000000000 
x5 : 0000000000000000 x4 : 0000000000000000 
x3 : 0000000000000000 x2 : 0000000000000000 
x1 : 0000000000000000 x0 : ffffff80083f6540 
Process swapper/0 (pid: 1, stack limit = 0xffffffc6dd904000)
Call trace:
Exception stack(0xffffffc6dd907b40 to 0xffffffc6dd907c80)
7b40: ffffff80083f6540 0000000000000000 0000000000000000 0000000000000000
7b60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
7b80: ffffff80083f5598 0000000000000001 ffffffc6d76cfba0 ffffffc6dce4eff8
7ba0: 0000000000000020 ffffffc6d76cfba0 ffffffc6dce4f028 0000000000000000
7bc0: 00000000deadbeef 0000000000007fff 0000000000000010 0000000000000000
7be0: ffffffc6dd0c2c00 ffffffc6dd907ce0 00000000b200000a ffffff8008559798
7c00: 0000000000000004 ffffffc6d76d4980 ffffff80086f8078 0000000000000007
7c20: ffffff8008729300 ffffffc6dd907c80 ffffff80083f6540 ffffffc6dd907c80
7c40: ffffff80083f45a4 0000000080000045 ffffffc6dd907cb0 ffffff80083f6520
7c60: ffffffffffffffff ffffffc6dd0c2c00 ffffffc6dd907c80 ffffff80083f45a4
[<ffffff80083f45a4>] tee_shm_free+0x18/0x40
[<ffffff80083f6540>] optee_disable_shm_cache+0x88/0xbc
[<ffffff80083f5a90>] optee_remove+0x20/0x68
[<ffffff80086e08f4>] optee_driver_init+0x5ac/0x5ec
[<ffffff8008083078>] do_one_initcall+0x104/0x114
[<ffffff80086b0f0c>] kernel_init_freeable+0x228/0x22c
[<ffffff80084fbe68>] kernel_init+0x18/0x108
[<ffffff8008084218>] ret_from_fork+0x10/0x18
Code: f9000bf3 aa0003f3 aa1e03e0 d503201f (b9405a60) 
---[ end trace b20a0eef62340328 ]---

Note that Xen mediator returns an error and then kernel panics.

@lorc
Copy link
Author

lorc commented May 11, 2020

This small changes fixes the second issue. But I don't know if it the right way:

diff --git a/drivers/tee/optee/core.c b/drivers/tee/optee/core.c
index 34409c916882..91584b066787 100644
--- a/drivers/tee/optee/core.c
+++ b/drivers/tee/optee/core.c
@@ -728,10 +728,8 @@ static int __init optee_driver_init(void)
                return PTR_ERR(optee);
 
        rc = optee_enumerate_devices();
-       if (rc) {
-               optee_remove(optee);
-               return rc;
-       }
+         if (rc)
+                pr_warn("can't enumerate optee_devices: %d\n", rc);
 
        pr_info("initialized driver\n");

@etienne-lms
Copy link

It uses old style TMEM arg even with dynamic SHM enabled. Xen is not happy about this

Right, it would make sense that driver uses RMEM when dynamic shm capability is supported.

This small changes fixes the second issue. But I don't know if it the right way:

I think your change is legitimate. OP-TEE does not mandate TA/PTA devices enumeration.

Drivers panics during optee_remove()

Indeed, that's bad.

@jforissier
Copy link

2. If optee_enumerate_devices() fails, optee_driver_init() fails. But actually optee can work without this feature. Does it means, than older OP-TEE versions are not supported anymore?

optee_enumerate_devices() should not fail when "TEE device" enumeration is not supported by OP-TEE. It should simply enumerate nothing. I have checked that it is the case on QEMU if the PTA is not compiled in (CFG_DEVICE_ENUM_PTA=n).

So it seems to me we have only (!) two issues here, not three.

@etienne-lms
Copy link

etienne-lms commented May 12, 2020

(edited: rephrased)

@jforissier, you're right. Actually it is explicitly stated in the Linux driver. It should also does not fail if device PTA is found and it returns no device to enumerate: shm_size == 0.

Regarding device enumeration and TMEM:
Looking into device.c, i can't see where TMEM is explicitly used. Looks like the driver allocates SHM.

@lorc
Copy link
Author

lorc commented May 12, 2020

optee_enumerate_devices() should not fail when "TEE device" enumeration is not supported by OP-TEE.

Yes, you are right. In my case tee_client_open_session completes successfully, but then get_devices fails. On one hand, this means that something is not right, but on other hand - all other features still work fine afterwards.
So, I don't know: should we consider faults in optee_enumerate_devices as critical?

Looking into device.c, i can't see where TMEM is explicitly used. Looks like the driver allocates SHM.

I also checked the code. It seems legit, nevertheless Xen is not happy. I'll take a closer look.

And by the way. Is this right? I can see only one parameter used, not four.

@jenswi-linaro
Copy link

No, optee_enumerate_devices() should not be critical.

I didn't know that Xen didn't support TMEM, how does it support the argument structure? Doesn't that always go via TMEM?

It's still OK to send four parameters even if only one is used.

@jenswi-linaro
Copy link

@lorc, your patch makes sense. If you post it on LKML I'll try to get it into 5.7 if it works well.

@lorc
Copy link
Author

lorc commented May 12, 2020

I didn't know that Xen didn't support TMEM, how does it support the argument structure? Doesn't that always go via TMEM?

The problem is not in TMEM per se, but in its location. Xen does not support static SHM. And before 3.9, there was no cases when TMEM was used for user SHM. I believe, when both dynamic SHM and registered SHM features are enabled, optee client uses TMEM only for "register shm" calls, which are handled separately in Xen. "Open session" and "call TA" calls always use RMEM.

I believe Xen is being too restrictive there. I'll see what is needed to allow TMEM buffers when they are residing in guest memory.

So, Jerome is right, there is only two issues in the linux driver. Maybe, even one and half :)

I'll post patch in LKML in the nearest time.

@jforissier
Copy link

@lorc I am changing the title of this issue because it appears to be misleading now.

@jforissier jforissier changed the title optee driver can't boot without without device PTA [Xen] optee_driver_init() panic after device PTA invoke error May 12, 2020
igoropaniuk pushed a commit to igoropaniuk/linux that referenced this issue Nov 18, 2020
And there are no multiple TRBs on EP0 and WA1 workaround,
so it doesn't need to change TRB for EP0. It fixes below oops.

configfs-gadget gadget: high-speed config linaro-swg#1: b
android_work: sent uevent USB_STATE=CONFIGURED
Unable to handle kernel read from unreadable memory at virtual address 0000000000000008
Mem abort info:
android_work: sent uevent USB_STATE=DISCONNECTED
  ESR = 0x96000004
  EC = 0x25: DABT (current EL), IL = 32 bits

  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000004
  CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=00000008b5bb7000
[0000000000000008] pgd=0000000000000000
Internal error: Oops: 96000004 [linaro-swg#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 430 Comm: HwBinder:401_1 Not tainted 5.4.24-06071-g6fa8921409c1-dirty linaro-swg#77
Hardware name: Freescale i.MX8QXP MEK (DT)
pstate: 60400085 (nZCv daIf +PAN -UAO)
pc : cdns3_gadget_ep_dequeue+0x1d4/0x270
lr : cdns3_gadget_ep_dequeue+0x48/0x270
sp : ffff800012763ba0
x29: ffff800012763ba0 x28: ffff00082c653c00
x27: 0000000000000000 x26: ffff000068fa7b00
x25: ffff0000699b2000 x24: ffff00082c6ac000
x23: ffff000834f0a480 x22: ffff000834e87b9c
x21: 0000000000000000 x20: ffff000834e87800
x19: ffff000069eddc00 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000001
x11: ffff80001180fbe8 x10: 0000000000000001
x9 : ffff800012101558 x8 : 0000000000000001
x7 : 0000000000000006 x6 : ffff000835d9c668
x5 : ffff000834f0a4c8 x4 : 0000000096000000
x3 : 0000000000001810 x2 : 0000000000000000
x1 : ffff800024bd001c x0 : 0000000000000001
Call trace:
 cdns3_gadget_ep_dequeue+0x1d4/0x270
 usb_ep_dequeue+0x34/0xf8
 composite_dev_cleanup+0x154/0x170
 configfs_composite_unbind+0x6c/0xa8
 usb_gadget_remove_driver+0x44/0x70
 usb_gadget_unregister_driver+0x74/0xe0
 unregister_gadget+0x28/0x58
 gadget_dev_desc_UDC_store+0x80/0x110
 configfs_write_file+0x1e0/0x2a0
 __vfs_write+0x48/0x90
 vfs_write+0xe4/0x1c8
 ksys_write+0x78/0x100
 __arm64_sys_write+0x24/0x30
 el0_svc_common.constprop.0+0x74/0x168
 el0_svc_handler+0x34/0xa0
 el0_svc+0x8/0xc
Code: 52830203 b9407660 f94042e4 11000400 (b9400841)
---[ end trace 1574516e4c1772ca ]---
Kernel panic - not syncing: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: disabled
CPU features: 0x0002,20002008
Memory Limit: none
Rebooting in 5 seconds..

Fixes: f616c3b ("usb: cdns3: Fix dequeue implementation")
Cc: stable <[email protected]>
Reviewed-by: Jun Li <[email protected]>
Signed-off-by: Peter Chen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants