USB-NCM driver 0.17 hangs after a period of sending packets (IEC-243) #107

Open
3 tasks done
emaayan opened this issue Dec 21, 2024 · 13 comments · May be fixed by hathach/tinyusb#2950
Assignees
Labels
Status: Reviewing Issue is being reviewed Type: Bug Bug in esp-usb

Comments

@emaayan

emaayan commented Dec 21, 2024

Answers checklist.

  • I have read the component documentation ESP-IDF Components and the issue is not addressed there.
  • I am using target and esp-idf version as defined in component's idf_component.yml
  • I have searched the issue tracker for a similar issue and not found any related issue.

Which component are you using? If you choose Other, provide details in More Information.

device/esp_tinyusb

ESP-IDF version.

5.3.1

Development Kit.

esp-s3

Used Component version.

0.17.1

More Information.

Hi, I'm directly using the TinyUSB port from Espressif, but that repo has no issue tracker.

If I try to use the esp_tinyusb component wrapper, the tinyusb_net_send method eventually fails consistently when sending.
Steps to reproduce:

  1. Download and install the Windows NCM driver.
  2. Upload the sketch.
  3. When the network device is added, give it a static IP of 192.168.5.3 and a gateway of 192.168.5.1.
    esp_tusb_ncm_bug.zip

The program generates byte arrays of random lengths from 20 to 1400. It first sends the size of the byte array and then the actual byte array, while the corresponding client code does the opposite: it reads the size and then the payload.

After a while it just halts in its tracks. Sometimes you get the log error ethernetif_input: IP input error
private static ByteBuffer getByteBuffer(InputStream inputStream, int size) throws IOException {
    final ByteBuffer allocate = ByteBuffer.allocate(size).order(ByteOrder.LITTLE_ENDIAN);
    final byte[] array = allocate.array();
    final int read = inputStream.readNBytes(array, 0, size); // read(array,0,size);
    return allocate;
}

public static void main(String[] args) throws IOException {
    String host = "192.168.5.1";
    final Socket socket = SocketFactory.getDefault().createSocket(host, 19000);
    while (socket.isConnected()) {
        final InputStream inputStream = socket.getInputStream();
        final ByteBuffer byteBuffer = getByteBuffer(inputStream, 4);
        final int sz = byteBuffer.getInt();
        final byte[] bytes = inputStream.readNBytes(sz);
        System.out.println(sz + " " + Arrays.toString(bytes));
    }
}
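
For reference, the framing that the client above expects (a 4-byte little-endian length followed by the payload) could be produced on the sending side roughly as follows. This is only a sketch: the actual sender lives in the attached zip and is not shown in the thread, so the helper names put_le32, get_le32 and frame_encode are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit value as 4 little-endian bytes, independent of host byte order. */
static void put_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)((v >> 8) & 0xFFu);
    out[2] = (uint8_t)((v >> 16) & 0xFFu);
    out[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Read 4 little-endian bytes back into a 32-bit value. */
static uint32_t get_le32(const uint8_t in[4])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Build one frame: [4-byte LE length][payload].
 * Returns the total frame size, or 0 if dst is too small. */
static size_t frame_encode(uint8_t *dst, size_t dst_cap,
                           const uint8_t *payload, uint32_t len)
{
    assert(dst != NULL && (payload != NULL || len == 0));
    if (dst_cap < 4 || dst_cap - 4 < len) {
        return 0;
    }
    put_le32(dst, len);
    if (len > 0) {
        memcpy(dst + 4, payload, len);
    }
    return (size_t)len + 4;
}
```

The length is written byte by byte rather than by copying an int, so the wire format matches ByteOrder.LITTLE_ENDIAN on the Java side regardless of the device's native byte order.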

It hangs on tud_task(), which waits indefinitely.

@emaayan emaayan added the Type: Bug Bug in esp-usb label Dec 21, 2024
@espressif-bot espressif-bot added the Status: Opened Issue is new label Dec 21, 2024
@github-actions github-actions bot changed the title USB-NCM driver 0.17 hangs after a period of sending packets USB-NCM driver 0.17 hangs after a period of sending packets (IEC-243) Dec 21, 2024
@roma-jam
Collaborator

Hi @emaayan ,

thanks for reporting the issue.

The latest release of tinyusb, v0.17.x, uses DMA, so to help narrow down the problem it might be useful to switch back to simple Slave mode via menuconfig (TinyUSB Stack -> TinyUSB DCD -> DCD Mode -> choose Slave/IRQ). Then rebuild, flash, and check whether the hang is still present.

If that doesn't help, you can pin the tinyusb component to a previous release in the manifest file by adding the component version line:

  espressif/tinyusb:
    version: "0.15.0~10"

Meanwhile we will try to find the problem, but given our current plans that might not happen sooner than next month.
Sorry for the inconvenience; feel free to add any relevant information after trying the steps described above.

@emaayan
Author

emaayan commented Dec 22, 2024

Upon further inspection, I saw that the lock-up happens here, in the tud_task call, which is basically
a wrapper for tud_task_ext(UINT32_MAX, false); and appears to just wait indefinitely.
When I replaced that call with tud_task_ext(100, false); I'm assuming it timed out and retried?
I'm not sure what happened here, because if this function is a task, why is it being called directly as well as from xTaskCreate?

esp_err_t tinyusb_net_send(void *buffer, uint16_t len, void *buff_free_arg)
{
    for (;;) {
        /* if TinyUSB isn't ready, we must signal back to lwip that there is nothing we can do */
        if (!tud_ready()) {
            return ESP_ERR_INVALID_STATE;
        }

        /* if the network driver can accept another packet, we make it happen */
        if (tud_network_can_xmit(len)) {
            s_net_obj.packet_to_send.buffer = buffer;
            s_net_obj.packet_to_send.len = len;
            s_net_obj.packet_to_send.buff_free_arg = buff_free_arg;

            tud_network_xmit(&s_net_obj.packet_to_send, s_net_obj.packet_to_send.len);
            return ESP_OK;
        }

        /* transfer execution to TinyUSB in the hopes that it will finish transmitting the prior packet */
        tud_task(); /* <-- the call that hangs */
    }

    return ESP_OK;
}
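
A bounded variant of the loop above could look roughly like the sketch below. It is not the component's code: the TinyUSB calls are replaced by a hypothetical function-pointer table (usb_net_ops_t) so the retry logic can be run without hardware, and the fake_* functions, net_send_bounded and the send_result_t codes are made up for illustration. In a real build the poll slot would correspond to tud_task_ext(timeout_ms, false), as tried above. Whether polling TinyUSB from the sending context is safe at all is a separate question that this sketch does not answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical indirection over the TinyUSB calls used by tinyusb_net_send(). */
typedef struct {
    bool (*ready)(void);                   /* stands in for tud_ready() */
    bool (*can_xmit)(uint16_t len);        /* stands in for tud_network_can_xmit() */
    void (*xmit)(void *ref, uint16_t len); /* stands in for tud_network_xmit() */
    void (*poll)(uint32_t timeout_ms);     /* stands in for tud_task_ext(timeout_ms, false) */
} usb_net_ops_t;

typedef enum { SEND_OK = 0, SEND_NOT_READY, SEND_TIMEOUT } send_result_t;

/* Try to send; poll the USB stack at most max_polls times, then give up
 * instead of blocking forever. */
static send_result_t net_send_bounded(const usb_net_ops_t *ops, void *pkt,
                                      uint16_t len, uint32_t poll_ms,
                                      unsigned max_polls)
{
    assert(ops != NULL);
    for (unsigned attempt = 0; attempt <= max_polls; ++attempt) {
        if (!ops->ready()) {
            return SEND_NOT_READY;
        }
        if (ops->can_xmit(len)) {
            ops->xmit(pkt, len);
            return SEND_OK;
        }
        if (attempt < max_polls) {
            ops->poll(poll_ms); /* bounded wait for the previous packet to finish */
        }
    }
    return SEND_TIMEOUT; /* caller decides whether to drop or retry the packet */
}

/* Fake backend for demonstration: sending becomes possible after N polls. */
static unsigned fake_polls_needed;
static unsigned fake_polls_done;
static unsigned fake_sent;

static bool fake_ready(void) { return true; }
static bool fake_can_xmit(uint16_t len) { (void)len; return fake_polls_done >= fake_polls_needed; }
static void fake_xmit(void *ref, uint16_t len) { (void)ref; (void)len; fake_sent++; }
static void fake_poll(uint32_t timeout_ms) { (void)timeout_ms; fake_polls_done++; }

static const usb_net_ops_t fake_ops = { fake_ready, fake_can_xmit, fake_xmit, fake_poll };
```

With a bound in place, a transmit path that never frees up surfaces as an error to the caller (lwIP can then drop the packet) rather than stalling the whole stack.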

@espressif-bot espressif-bot added Status: Selected for Development Issue is selected for development and removed Status: Opened Issue is new labels Jan 7, 2025
@roma-jam
Collaborator

roma-jam commented Jan 8, 2025

Hi @emaayan,

is the issue still relevant?

Could you also check which core the TinyUSB stack is pinned to?
This can be done via menuconfig: Component config -> TinyUSB Stack -> TinyUSB task configuration -> TinyUSB task affinity.

If it is set to No affinity, please pin it to CPU1, rebuild, flash, and try to reproduce the problem again.

@ctag-fh-kiel

Maybe this is related to
#113

@emaayan
Author

emaayan commented Jan 9, 2025

Yes, it's still relevant. I agree with @ctag-fh-kiel that those should be added, although I'm not sure whether changing them improves performance. There is a workaround: add this to the CMakeLists.txt of the main component, after idf_component_register, to see if it helps you out:

#idf_build_set_property(COMPILE_DEFINITIONS CFG_TUD_NCM_IN_NTB_N=3 APPEND)

Hi, the project that I attached initially contains these settings. Does it make any difference which CPU it's pinned to?

# CONFIG_TINYUSB_NO_DEFAULT_TASK is not set
CONFIG_TINYUSB_TASK_PRIORITY=20
CONFIG_TINYUSB_TASK_STACK_SIZE=4096
# CONFIG_TINYUSB_TASK_AFFINITY_NO_AFFINITY is not set
CONFIG_TINYUSB_TASK_AFFINITY_CPU0=y
# CONFIG_TINYUSB_TASK_AFFINITY_CPU1 is not set
CONFIG_TINYUSB_TASK_AFFINITY=0x0
CONFIG_TINYUSB_INIT_IN_DEFAULT_TASK=y

@peter-marcisovsky
Collaborator

Hi @emaayan, could you please confirm that the solution in #114 fixes your problem as well?

@emaayan
Author

emaayan commented Jan 10, 2025

You mean changing the NTB count to 3?


@peter-marcisovsky
Collaborator

If you could, simply try to configure the NCM transfer blocks (their counts and lengths).

Basically, I'd like to know whether the MR I created to fix the other issue fixes your issue as well, or whether those are unrelated issues. Or you might need a higher count of NCM buffers for your solution to be stable.

@emaayan
Author

emaayan commented Jan 12, 2025

Reference
I actually didn't need the MR for it.
All I needed to do was add these to CMakeLists.txt:
idf_build_set_property(COMPILE_DEFINITIONS CFG_TUD_NCM_IN_NTB_N=1 APPEND)
idf_build_set_property(COMPILE_DEFINITIONS CFG_TUD_NCM_IN_NTB_MAX_SIZE=3200 APPEND)
and change the values to either 4 and 4096 or 3 and 3200
(but it does make sense to expose them properly).
I've attached a modified project where you don't even need to run any stress test.
When the ESP32 was exposed as a DNS and DHCP server, it became unusable within a couple of seconds as it started to be flooded with mDNS queries from my environment, and I started getting these errors:
ethernetif_input: IP input error
ethernetif_input: IP input error
ethernetif_input: IP input error
ethernetif_input: IP input error

I think there might actually be two issues here. The first is the configuration, and the second is what seems to me to be a direct call, inside the tinyusb_net_send method, to a function that was meant to be run as a task:
https://github.com/espressif/esp-iot-solution/blob/86fa25270db8016495169672adbc3f2adfa1bd6f/examples/usb/device/usb_dongle/components/tinyusb_dongle/tinyusb_net.c#L55
If this hangs, the entire stack hangs and just waits.
Changing the parameters definitely improved matters.
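
To make the effect of those compile definitions concrete, here is a small sketch of how a -D style override typically interacts with an #ifndef default, and what the chosen values cost in buffer RAM. The default values below are assumptions for illustration only (the real defaults live in TinyUSB's NCM driver headers), and ntb_pool_bytes is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed defaults for illustration; a definition passed by the build system
 * (for example via COMPILE_DEFINITIONS) takes precedence over these. */
#ifndef CFG_TUD_NCM_IN_NTB_N
#define CFG_TUD_NCM_IN_NTB_N 1
#endif
#ifndef CFG_TUD_NCM_IN_NTB_MAX_SIZE
#define CFG_TUD_NCM_IN_NTB_MAX_SIZE 3200
#endif

/* Bytes of static buffer space needed for `count` transfer blocks of `size` bytes. */
static size_t ntb_pool_bytes(size_t count, size_t size)
{
    return count * size;
}

/* Buffer space implied by the values this translation unit was compiled with. */
static size_t ncm_in_ntb_total_bytes(void)
{
    return ntb_pool_bytes(CFG_TUD_NCM_IN_NTB_N, CFG_TUD_NCM_IN_NTB_MAX_SIZE);
}
```

For the two combinations mentioned above, 3 blocks of 3200 bytes need 9600 bytes and 4 blocks of 4096 bytes need 16384 bytes for the IN direction alone, so the extra stability is paid for in RAM.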

esp_tusb_ncm_bug_v2.zip

@roma-jam
Collaborator

Hi @emaayan,

thanks for the clarification. After the PR, NTB can be configured via menuconfig, so let's keep the problem with the blocked transfer request out of scope for now.

Regarding the call to tud_task() in tinyusb_net_send(): hey @lijunru-hub, it seems we need your assistance.
Could you please check the description here? The tinyusb_net_send() we are talking about is implemented in the esp-iot-solution project, in the usb_dongle example.

Thanks.

@lijunru-hub
Contributor

I made the example from https://github.com/hathach/tinyusb/blob/master/examples/device/net_lwip_webserver/src/main.c#L111

My thought is that since tud_task must process the transfer completion in TinyUSB, actively invoking it can accelerate the handling of messages in the queue, thereby improving transmission efficiency.

@emaayan
Author

emaayan commented Jan 14, 2025

I have two main concerns. First, I'm not sure that function is intended to be invoked directly like this, because the docs say it's intended to be called either from the main loop or from an RTOS loop. I don't know enough about the task function to know for sure:
https://github.com/hathach/tinyusb/blob/2495563600f1cd2220d740895fad701dd48f1fb6/src/device/usbd.h#L69

My bigger concern is that if a single xmit failure drops into that task call, you're locked forever, because it waits indefinitely. That may be OK if you're calling it from a main super loop, but not as a sub-call; I'm not even sure it's a valid thing to do in that example.
As a workaround, I saw that it actually calls tud_task_ext, which lets you specify a timeout, but I'm not sure if that's OK.

tore-espressif added a commit to espressif/tinyusb that referenced this issue Jan 15, 2025
In case we received an invalid datagram, we silently failed
and the buffer was not returned to the empty list -> it was lost.
If this happened more than CFG_TUD_NCM_OUT_NTB_N times, we run out of
NTBs and all OUT transfers are NACKed.

Closes espressif/esp-usb#107
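
The failure mode described in that commit message can be reproduced with a toy buffer pool. The sketch below is not the driver's code: ntb_pool_t, pool_acquire, pool_release and receive_one are hypothetical names, and OUT_NTB_N is a stand-in for CFG_TUD_NCM_OUT_NTB_N. It only illustrates why an error path that forgets to return a buffer eventually starves all reception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OUT_NTB_N 2 /* toy stand-in for CFG_TUD_NCM_OUT_NTB_N */

typedef struct {
    bool in_use[OUT_NTB_N];
} ntb_pool_t;

/* Take a free buffer; returns its index, or -1 when the pool is exhausted. */
static int pool_acquire(ntb_pool_t *p)
{
    assert(p != NULL);
    for (int i = 0; i < OUT_NTB_N; ++i) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Return a buffer to the free ("empty") list. */
static void pool_release(ntb_pool_t *p, int idx)
{
    assert(p != NULL && idx >= 0 && idx < OUT_NTB_N);
    p->in_use[idx] = false;
}

/* Receive one datagram into a pool buffer.
 * Returns false when no buffer is available (the transfer would be NAKed).
 * With release_on_error == false, an invalid datagram leaks its buffer. */
static bool receive_one(ntb_pool_t *p, bool datagram_valid, bool release_on_error)
{
    int idx = pool_acquire(p);
    if (idx < 0) {
        return false;
    }
    if (!datagram_valid) {
        if (release_on_error) {
            pool_release(p, idx); /* the fix: give the buffer back */
        }
        return true; /* transfer accepted, datagram dropped */
    }
    /* ... a real driver would hand the payload to the network stack here ... */
    pool_release(p, idx);
    return true;
}
```

In the leaky variant, OUT_NTB_N invalid datagrams are enough to make every later receive fail, even for valid data; with the release on the error path, the pool never drains.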
@espressif-bot espressif-bot added Status: Reviewing Issue is being reviewed and removed Status: Selected for Development Issue is selected for development labels Jan 15, 2025

7 participants