Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GATTSERVICE_SUBEVENT_HID_SERVICE_CONNECTED not received in Pico W in latest develop branch #651

Open
ricardoquesada opened this issue Dec 21, 2024 · 8 comments

Comments

@ricardoquesada
Copy link
Contributor

ricardoquesada commented Dec 21, 2024

Describe the bug

Using BTstack with Pico W. Trying to connect a gamepad (using Bluepad32) that uses BLE. Everything works Ok.

When I switch to latest develop BTstack (hash: 16e6dda), the gamepad fails to finish the connection... or at least the event GATTSERVICE_SUBEVENT_HID_SERVICE_CONNECTED is not received.

Only happens on Pico W.
Tested on Posix, ESP32, ESP32-C6, ESP32-H2 and works Ok.

I noticed this bug a few months ago in develop branch, but didn't have the time to investigate it property.... but a few days ago when I re-tested everything on latest btstack develop branch, I was no longer able to reproduce it.

To give you more detail, the flow is more or less like the following in BLE:

  • A BLE device is detected while scanning
  • gap_connect()
  • sm_packet_handler() will try to pair
  • Wait for SM_EVENT_REENCRYPTION_COMPLETE or SM_EVENT_PAIRING_COMPLETE
  • device_information_service_client_query() is started
  • Wait for GATTSERVICE_SUBEVENT_DEVICE_INFORMATION_DONE
  • Starts hids_client_connect()
  • ... but never receives any kind of packet in the hids_client_packet_handler()... nothing... not an error, nothing.

To Reproduce

I'll try to create a test case based on hog_host_demo.c

Expected behavior

HCI Packet Logs
I'll attach logs in about ~7 days... sorry for the lack of logs. But wanted to file the bug before going on vacation.

I did a quick test using WANT_HCI_DUMP... but couldn't make it work, but probably something wrong on my side.

Environment: (please complete the following information):

  • Current BTstack branch: latest develop branch
  • Bluetooth Controller: Pico 1 W, and Pico 2 W.

Additional context
Add any other context about the problem here.

@ricardoquesada
Copy link
Contributor Author

attached the the .pklg logs from a Pico 2 W.

One pklg running btstack 1.6.1, and the other running latest develop branch, hash: 5d4d8cc

btstack_logs_161_162.zip

@ricardoquesada
Copy link
Contributor Author

mmm... and the GATTSERVICE_SUBEVENT_HID_SERVICE_CONNECTED is not received because the rp2040 is crashing:

Thread 1 "rp2040.core0" received signal SIGTRAP, Trace/breakpoint trap.
isr_hardfault () at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_crt0/crt0.S:170
170     decl_isr_bkpt isr_hardfault
(gdb) bt
#0  isr_hardfault () at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_crt0/crt0.S:170
#1  <signal handler called>
#2  0xa6023810 in ?? ()
#3  0x1000ea5a in hci_emit_event (event=event@entry=0x2000081c <packet_sent_event> "n", size=size@entry=2, dump=dump@entry=0) at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:8028
#4  0x10012eb2 in event_handler (packet=packet@entry=0x2000081c <packet_sent_event> "n", size=size@entry=2) at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:4706
#5  0x10014004 in packet_handler (packet_type=4 '\004', packet=0x2000081c <packet_sent_event> "n", size=2) at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:4811
#6  0x10019aa2 in hci_transport_cyw43_send_packet (packet_type=<optimized out>, packet=0x2000f4e4 "5\f\005\001@", size=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_cyw43_driver/btstack_hci_transport_cyw43.c:113
#7  0x10010652 in hci_host_num_completed_packets () at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:5868
#8  0x100124c0 in hci_run () at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:7668
#9  0x10013e26 in acl_handler (packet=packet@entry=0x2000bef8 <hci_packet_with_pre_buffer+4> "", size=size@entry=191) at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:1273
#10 0x10013ffa in packet_handler (packet_type=2 '\002', packet=0x2000bef8 <hci_packet_with_pre_buffer+4> "", size=191) at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:4814
#11 0x10019b82 in hci_transport_cyw43_process () at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_cyw43_driver/btstack_hci_transport_cyw43.c:149
#12 0x10019bc8 in hci_transport_data_source_process (ds=<optimized out>, callback_type=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_cyw43_driver/btstack_hci_transport_cyw43.c:49
#13 0x1000da5c in btstack_run_loop_base_poll_data_sources () at /Users/riq/progs/bluepad32/external/btstack/src/btstack_run_loop.c:166
#14 0x100199c2 in btstack_work_pending (context=<optimized out>, worker=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_btstack/btstack_run_loop_async_context.c:139
#15 0x10008962 in async_context_base_execute_once (self=self@entry=0x20000f70 <cyw43_async_context_threadsafe_background>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/async_context_base.c:101
#16 0x10008480 in process_under_lock (self=self@entry=0x20000f70 <cyw43_async_context_threadsafe_background>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/async_context_threadsafe_background.c:251
#17 0x1000862a in low_priority_irq_handler () at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/async_context_threadsafe_background.c:293
#18 <signal handler called>
#19 timer_time_reached (timer=0x40054000, t=9223372036854775807) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/hardware_timer/include/hardware/timer.h:323
#20 time_reached (t=9223372036854775807) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/hardware_timer/include/hardware/timer.h:335
#21 best_effort_wfe_or_timeout (timeout_timestamp=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/common/pico_time/time.c:475
#22 0x200000fe in sem_acquire_block_until (sem=0x20000fa8 <cyw43_async_context_threadsafe_background+56>, until=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/common/pico_sync/sem.c:54
#23 0x10008648 in async_context_threadsafe_background_wait_for_work_until (self_base=<optimized out>, until=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/async_context_threadsafe_background.c:310
#24 0x10019726 in async_context_wait_for_work_until (context=<optimized out>, until=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/include/pico/async_context.h:424
#25 btstack_run_loop_async_context_execute () at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_btstack/btstack_run_loop_async_context.c:86
#26 0x1000dcfe in btstack_run_loop_execute () at /Users/riq/progs/bluepad32/external/btstack/src/btstack_run_loop.c:310
#27 0x100002fe in main () at /Users/riq/progs/bluepad32/examples/pico_w/src/main.c:38

@ricardoquesada
Copy link
Contributor Author

ricardoquesada commented Dec 29, 2024

I did some debugging and the function that has the issue is in hci.c, in particular:

static void hci_emit_event(uint8_t * event, uint16_t size, int dump){
    // dump packet
    if (dump) {
        hci_dump_packet( HCI_EVENT_PACKET, 1, event, size);
    }

   //  Added by me
    printf_hexdump(event, size);

    // dispatch to all event handlers
    btstack_linked_list_iterator_t it;
    btstack_linked_list_iterator_init(&it, &hci_stack->event_handlers);
    int i=0;
    while (btstack_linked_list_iterator_has_next(&it)){
        btstack_packet_callback_registration_t * entry = (btstack_packet_callback_registration_t*) btstack_linked_list_iterator_next(&it);
       // Added by me
        printf("about to call, iteration: %d, addr: %p, callback: %p\n", i, entry, entry ? entry->callback : 0);
        i++;
        entry->callback(HCI_EVENT_PACKET, 0, event, size);
    }
}

and before crashing the values are:

6E 00
about to call, iteration: 0, addr: 2000E36C, callback: 10018F89
about to call, iteration: 1, addr: 2000BEEC, callback: 1002383D
about to call, iteration: 2, addr: 2000BEE4, callback: 100230B5
about to call, iteration: 3, addr: 2000BEC4, callback: 1000CD95
about to call, iteration: 4, addr: 2000BEDC, callback: 1001F365
about to call, iteration: 5, addr: 2000BED4, callback: 1001D7ED

6E 00
about to call, iteration: 0, addr: 2000E36C, callback: 10018F89
about to call, iteration: 1, addr: 2000BEEC, callback: A602383D <--- INVALID ADDRESS. It should be  1002383D

if you pay attention, only the first byte is different.

1002383D   : Ok. Starts with 0x10
A602383D : Bad. Starts with 0xA6

Somewhere there is a off-by-one overflow... or so it seems

@ricardoquesada
Copy link
Contributor Author

ricardoquesada commented Dec 29, 2024

It seems that the byte just before hci_packet_with_pre_buffer gets overwritten:

Buffer defined here:
https://github.com/raspberrypi/pico-sdk/blob/master/src/rp2_common/pico_cyw43_driver/btstack_hci_transport_cyw43.c#L36

I tried adding a "zero buffer" right before hci_packet_with_pre_buffer and I noticed that my buffer gets overriten (and my program works Ok).

So, perhpas it is a pico-sdk bug ? and not a BTstack bug ?
But if so, why does it work with BTstack 1.6.1, but not with 1.6.2 ?

@mringwal
Copy link
Member

Thanks for the troubleshooting. I'm currently at 38C3 in Hamburg. We're added 4 bytes to all GATT Client events and have to use more pre-buffer bytes. In theory, a compile time check should have caught/prevented this issue.

Can you check and report HCI_INCOMING_PRE_BUFFER_SIZE in your or the pico-sdk setup?
If left untouched by the build, 8 are used for classic and 6 for LE-Only. See hci.c:163 and gatt_client.c:1054

@ricardoquesada
Copy link
Contributor Author

ricardoquesada commented Dec 29, 2024

I'm currently at 38C3 in Hamburg.

nice. Enjoy!

Can you check and report HCI_INCOMING_PRE_BUFFER_SIZE

8

I put a hardware breapoint on my own variable (the one that gets overwritten), and I got:

Thread 1 "rp2040.core0" hit Hardware watchpoint 1: hci_packet_with_pre_bufa[7]

Old value = 1 '\001'
New value = 166 '\246'
0x1001dbd6 in setup_long_characteristic_value_packet (gatt_client=gatt_client@entry=0x20008458 <gatt_client_storage+432>, type=type@entry=166 '\246', attribute_handle=72, offset=offset@entry=0,
    value=value@entry=0x2000bf01 <hci_packet_with_pre_buffer+13> "\005\001\t\005\241\001\205\003\005\001u\004\225\001%\aF;\001e\024\t9\201BE", length=length@entry=182) at /Users/riq/progs/bluepad32/external/btstack/src/ble/gatt_client.c:1072
1072        packet[1] = LONG_CHARACTERISTIC_VALUE_EVENT_HEADER_SIZE - 2 + length;
(gdb) bt
#0  0x1001dbd6 in setup_long_characteristic_value_packet (gatt_client=gatt_client@entry=0x20008458 <gatt_client_storage+432>, type=type@entry=166 '\246', attribute_handle=72, offset=offset@entry=0,
    value=value@entry=0x2000bf01 <hci_packet_with_pre_buffer+13> "\005\001\t\005\241\001\205\003\005\001u\004\225\001%\aF;\001e\024\t9\201BE", length=length@entry=182) at /Users/riq/progs/bluepad32/external/btstack/src/ble/gatt_client.c:1072
#1  0x1001dcc6 in report_gatt_long_characteristic_value_blob (gatt_client=gatt_client@entry=0x20008458 <gatt_client_storage+432>, attribute_handle=<optimized out>, blob=blob@entry=0x2000bf01 <hci_packet_with_pre_buffer+13> "\005\001\t\005\241\001\205\003\005\001u\004\225\001%\aF;\001e\024\t9\201BE",
    blob_length=blob_length@entry=182, value_offset=0) at /Users/riq/progs/bluepad32/external/btstack/src/ble/gatt_client.c:1235
#2  0x1001e660 in gatt_client_handle_att_read_response (gatt_client=0x20008458 <gatt_client_storage+432>, packet=packet@entry=0x2000bf00 <hci_packet_with_pre_buffer+12> "\v\005\001\t\005\241\001\205\003\005\001u\004\225\001%\aF;\001e\024\t9\201BE", size=size@entry=183)
    at /Users/riq/progs/bluepad32/external/btstack/src/ble/gatt_client.c:1914
#3  0x1001e7c2 in gatt_client_handle_att_response (gatt_client=0x20008458 <gatt_client_storage+432>, packet=packet@entry=0x2000bf00 <hci_packet_with_pre_buffer+12> "\v\005\001\t\005\241\001\205\003\005\001u\004\225\001%\aF;\001e\024\t9\201BE", size=size@entry=183)
    at /Users/riq/progs/bluepad32/external/btstack/src/ble/gatt_client.c:2101
#4  0x1001f322 in gatt_client_att_packet_handler (packet_type=<optimized out>, handle=64, packet=0x2000bf00 <hci_packet_with_pre_buffer+12> "\v\005\001\t\005\241\001\205\003\005\001u\004\225\001%\aF;\001e\024\t9\201BE", size=183) at /Users/riq/progs/bluepad32/external/btstack/src/ble/gatt_client.c:2489
#5  0x1001b04c in att_dispatch_handle_att_pdu (packet_type=<optimized out>, channel=<optimized out>, packet=<optimized out>, size=<optimized out>) at /Users/riq/progs/bluepad32/external/btstack/src/ble/att_dispatch.c:151
#6  0x1001b13a in att_packet_handler (packet_type=<optimized out>, channel=<optimized out>, packet=<optimized out>, size=<optimized out>) at /Users/riq/progs/bluepad32/external/btstack/src/ble/att_dispatch.c:167
#7  0x10018ec0 in l2cap_acl_le_handler (handle=handle@entry=64, packet=packet@entry=0x2000bef8 <hci_packet_with_pre_buffer+4> "@ \273", size=size@entry=191) at /Users/riq/progs/bluepad32/external/btstack/src/l2cap.c:4911
#8  0x10018f26 in l2cap_acl_handler (packet_type=<optimized out>, channel=<optimized out>, packet=0x2000bef8 <hci_packet_with_pre_buffer+4> "@ \273", size=191) at /Users/riq/progs/bluepad32/external/btstack/src/l2cap.c:4955
#9  0x1000e4ce in hci_emit_acl_packet (packet=packet@entry=0x2000bef8 <hci_packet_with_pre_buffer+4> "@ \273", size=size@entry=191) at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:8048
#10 0x10013e74 in acl_handler (packet=packet@entry=0x2000bef8 <hci_packet_with_pre_buffer+4> "@ \273", size=size@entry=191) at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:1250
#11 0x10014026 in packet_handler (packet_type=2 '\002', packet=0x2000bef8 <hci_packet_with_pre_buffer+4> "@ \273", size=191) at /Users/riq/progs/bluepad32/external/btstack/src/hci.c:4816
#12 0x10019bbe in hci_transport_cyw43_process () at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_cyw43_driver/btstack_hci_transport_cyw43.c:152
#13 0x10019c04 in hci_transport_data_source_process (ds=<optimized out>, callback_type=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_cyw43_driver/btstack_hci_transport_cyw43.c:50
#14 0x1000da5c in btstack_run_loop_base_poll_data_sources () at /Users/riq/progs/bluepad32/external/btstack/src/btstack_run_loop.c:166
#15 0x100199ee in btstack_work_pending (context=<optimized out>, worker=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_btstack/btstack_run_loop_async_context.c:139
#16 0x10008962 in async_context_base_execute_once (self=self@entry=0x20000f70 <cyw43_async_context_threadsafe_background>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/async_context_base.c:101
#17 0x10008480 in process_under_lock (self=self@entry=0x20000f70 <cyw43_async_context_threadsafe_background>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/async_context_threadsafe_background.c:251
#18 0x1000862a in low_priority_irq_handler () at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/async_context_threadsafe_background.c:293
#19 <signal handler called>
#20 timer_time_reached (timer=0x40054000, t=9223372036854775807) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/hardware_timer/include/hardware/timer.h:323
#21 time_reached (t=9223372036854775807) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/hardware_timer/include/hardware/timer.h:335
#22 best_effort_wfe_or_timeout (timeout_timestamp=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/common/pico_time/time.c:475
#23 0x200000fe in sem_acquire_block_until (sem=0x20000fa8 <cyw43_async_context_threadsafe_background+56>, until=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/common/pico_sync/sem.c:54
#24 0x10008648 in async_context_threadsafe_background_wait_for_work_until (self_base=<optimized out>, until=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/async_context_threadsafe_background.c:310
#25 0x10019752 in async_context_wait_for_work_until (context=<optimized out>, until=<optimized out>) at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_async_context/include/pico/async_context.h:424
#26 btstack_run_loop_async_context_execute () at /Volumes/riq_data/src/rpi/pico-sdk/src/rp2_common/pico_btstack/btstack_run_loop_async_context.c:86
#27 0x1000dcfe in btstack_run_loop_execute () at /Users/riq/progs/bluepad32/external/btstack/src/btstack_run_loop.c:310
#28 0x100002fe in main () at /Users/riq/progs/bluepad32/examples/pico_w/src/main.c:38

In particular, the watchpoint gets triggered here:

static uint8_t *
setup_long_characteristic_value_packet(const gatt_client_t *gatt_client, uint8_t type, uint16_t attribute_handle,
                                       uint16_t offset, uint8_t *value, uint16_t length) {
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    // avoid using pre ATT headers.
    // copy value into test packet for testing
    static uint8_t packet[1000];
    memcpy(&packet[LONG_CHARACTERISTIC_VALUE_EVENT_HEADER_SIZE], value, length);
#else
    // before the value inside the ATT PDU
    uint8_t * packet = value - LONG_CHARACTERISTIC_VALUE_EVENT_HEADER_SIZE;
#endif
    // ********** WATCHDOG GETS TRIGGERED HERE **************
    packet[0] = type;
    packet[1] = LONG_CHARACTERISTIC_VALUE_EVENT_HEADER_SIZE - 2 + length;
    little_endian_store_16(packet, 2, gatt_client->con_handle);
    little_endian_store_16(packet, 4, gatt_client->service_id);
    little_endian_store_16(packet, 6, gatt_client->connection_id);
    little_endian_store_16(packet, 8, attribute_handle);
    little_endian_store_16(packet, 10, offset);
    little_endian_store_16(packet, 12, length);
    return packet;
}

Besides that there is some pointer math that looks suspicious, wondering it might be related to the __attribute__((aligned(4))) from hci_packet_with_pre_buffer (???)

@ricardoquesada
Copy link
Contributor Author

After looking at the stack trace, it doesn't seem related to the __attribute__((aligned(4))).

Instead, LONG_CHARACTERISTIC_VALUE_EVENT_HEADER_SIZE is substracting 14 bytes, but the packet passed is & hci_packet_with_pre_buffer[13]

So either LONG_CHARACTERISTIC_VALUE_EVENT_HEADER_SIZE should be 13, or the packet passed should be & hci_packet_with_pre_buffer[14].

nit: if possible, it would be great to have more documentation about the constants. E.g: Why LONG_CHARACTERISTIC_VALUE_EVENT_HEADER_SIZE is defined as 14... from where does this value come from ?

ricardoquesada added a commit to ricardoquesada/pico-sdk that referenced this issue Jan 3, 2025
When using BTstack 1.6.2 (latest stable version), the microcontroller
might crash due to buffer underflow.
The byte before the first byte of hci_packet_with_pre_buffer will get
overwritten.

In particular the problem was that BTstack
`setup_long_characteristic_value_packet()` was receiving
`&hci_packet_with_pre_buffer[13]`, and in that function the packet gets
overwritten starting from "- LONG_CHARACTERISTIC_VALUE_EVENT_HEADER_SIZE", which is 14.
So the byte before hci_packet_with_pre_buffer gets overwritten.

See: https://github.com/bluekitchen/btstack/blob/5d4d8cc7b1d35a90bbd6d5ffd2d3050b2bfc861c/src/ble/gatt_client.c#L1060

This PR follows the same logic implemented in BTstack ESP32 port. See:
https://github.com/bluekitchen/btstack/blob/develop/port/esp32/components/btstack/btstack_port_esp32.c#L104

Fixes bluekitchen/btstack#651
@ricardoquesada
Copy link
Contributor Author

@mringwal I think I found the root cause. Sent PR against pico-sdk.
when you have the chance, could you review this possible fix:

raspberrypi/pico-sdk#2157

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants