STM32F10xx Serial bus fault on large data #1399

Open
chrysophylax opened this issue May 22, 2021 · 18 comments
@chrysophylax

chrysophylax commented May 22, 2021

Describe the bug
When sending large amounts of data over the Serial Monitor (string length > 204), the processor enters a hard fault.

Hard fault status register (SCB_HFSR) 0xE000ED2C has bit 30 set to 1 (FORCED: Forced hard fault, 1: Forced hard fault.)
Configurable fault status register (SCB_CFSR) 0xE000ED28 has bit 10 set to 1 (IMPRECISERR: Imprecise data bus error)
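For readers who want to decode these two registers themselves, here is a minimal sketch. The bit positions follow the Cortex-M3 fault status register layout; the helper name `describeFault` and the constant names are hypothetical and not part of the core. The raw values in the usage note are reconstructed from the bits named in this report, not copied from a debugger.

```cpp
#include <cstdint>
#include <string>

// Bit masks for the Cortex-M3 fault status registers named in the report.
constexpr std::uint32_t HFSR_VECTTBL     = 1u << 1;   // fault on vector table read
constexpr std::uint32_t HFSR_FORCED      = 1u << 30;  // escalated configurable fault
constexpr std::uint32_t CFSR_IBUSERR     = 1u << 8;   // instruction bus error
constexpr std::uint32_t CFSR_PRECISERR   = 1u << 9;   // precise data bus error
constexpr std::uint32_t CFSR_IMPRECISERR = 1u << 10;  // imprecise data bus error

// Hypothetical helper: returns a space-separated list of the flags that are set.
std::string describeFault(std::uint32_t hfsr, std::uint32_t cfsr) {
    std::string out;
    auto add = [&out](const char* name) {
        if (!out.empty()) out += ' ';
        out += name;
    };
    if (hfsr & HFSR_VECTTBL)     add("VECTTBL");
    if (hfsr & HFSR_FORCED)      add("FORCED");
    if (cfsr & CFSR_IBUSERR)     add("IBUSERR");
    if (cfsr & CFSR_PRECISERR)   add("PRECISERR");
    if (cfsr & CFSR_IMPRECISERR) add("IMPRECISERR");
    return out;
}
```

With only the reported bits set, `describeFault(0x40000000u, 0x00000400u)` yields `FORCED IMPRECISERR`. On the target, the inputs would typically come from the CMSIS fields `SCB->HFSR` and `SCB->CFSR` inside the hard fault handler. Because the bus error is imprecise, the stacked program counter does not necessarily point at the instruction that caused it.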

To Reproduce
On the latest 2.0.0 core

  1. Upload the following minimal Arduino sketch to the board as usual
void setup() {
  Serial.begin(9600);
}

void loop (void) {
  if (Serial.available()) Serial.print((char)Serial.read());
}
  2. Connect using the serial monitor and send the following:

One morning, when Gregor Samsa woke from troubled dreams, he found himself transformed in his bed into a horrible vermin. He lay on his armour-like back, and if he lifted his head a little he could see his brown belly, slightly domed and divided by arches into stiff sections. The bedding was hardly able to cover it and seemed ready to slide off any moment. His many legs, pitifully thin compared with the size of the rest of him, waved about helplessly as he looked. "What's happened to me?" he thought.

Expected behavior
The text is written back in the Serial monitor.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: Linux 5.12.5-arch1-1
  • Arduino IDE version: 1.8.15
  • STM32 core version: 2.0.0
  • Tools menu settings if not the default: USB Support: CDC Supersede Serial
  • Upload method: STM32CubeProgrammer(SWD)

Board (please complete the following information):

  • Name: BluePill STM32F103C8T6 Medium-density with 128K Flash
  • Hardware Revision: Rev X/ N/A

Additional context

I have bisected this git repository and identified the issue as originating in commit 00acff6 by @fpistm, which updated the HAL Drivers to v1.1.5 for STM32F1xx. The previous commit, 242671d, using HAL v1.1.4, works perfectly as expected.

I attach three screenshots showing the hard fault, the frozen serial monitor, and the working behaviour.
hard-fault
hard-faulted-on-long-serial
working-echo-on-long-serial

@fpistm fpistm self-assigned this May 24, 2021
@fpistm fpistm added this to the 2.x.x milestone Jun 4, 2021
@fpistm fpistm added the bug 🐛 Something isn't working label Jul 13, 2021
@fpistm
Member

fpistm commented Jul 23, 2021

Hi @chrysophylax
It fails because the CDC queue is full, so the application buffer is set to NULL for the next dequeue, but USB continues to receive without checking the buffer's validity, whereas it should wait for space before releasing the PMA buffer.
The update removed this part of the code, as all HAL/LL PCD/USB have been reworked:
07c4ee4

I will try to correct this issue, but it is not so easy. One workaround could be to increase the buffer size.

#define CDC_RECEIVE_QUEUE_BUFFER_SIZE ((uint16_t)(CDC_QUEUE_MAX_PACKET_SIZE * 3))

@benguild

Definitely hoping for a fix here 😔

@benguild

Is there a workaround for this?

@ABOSTM
Contributor

ABOSTM commented Jul 26, 2021

@benguild, as @fpistm said,

One workaround could be to increase the buffer size

But the size depends on your needs and your application, so it is up to you to implement it:
Arduino_Core_STM32/cores/arduino/stm32/usb/cdc/cdc_queue.h

#define CDC_RECEIVE_QUEUE_BUFFER_SIZE ((uint16_t)(CDC_QUEUE_MAX_PACKET_SIZE * 3))

@benguild

What's the recommended maximum value, and do I just replace the value in the library for the board? I can tune it down from there if I can at least try it out without it panicking.

@ABOSTM
Contributor

ABOSTM commented Jul 27, 2021

What's the recommended maximum value, and do I just replace the value in the library for the board?

There is no recommended value, as it depends on the application. It is up to you to test.
Yes, replace it in the library: Arduino_Core_STM32/cores/arduino/stm32/usb/cdc/cdc_queue.h

@benguild

There is no recommended value, as it depends on the application. It is up to you to test.

What's the maximum? And what are the side effects? Is it possible to brick the board this way or enter a state where it can't be reflashed?

@benguild

I changed the multiplier from * 3 on RX to 42 and it seems fine. Not sure if that's excessive or not, but yeah the board works normally now.

@fpistm
Member

fpistm commented Aug 16, 2021

There is no recommended value, as it depends on the application. It is up to you to test.

What's the maximum? And what are the side effects? Is it possible to brick the board this way or enter a state where it can't be reflashed?

CDC_QUEUE_MAX_PACKET_SIZE is 64 so for 42 you allocate 2688 bytes.
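The arithmetic can be checked at compile time with a small sketch. The packet size of 64 and the multipliers 3 and 42 come from this thread; the names `kCdcQueueMaxPacketSize`, `receiveQueueBytes`, and `kMaxPacketsBeforeWrap` are hypothetical stand-ins for illustration. The wrap-around limit below follows only from the `uint16_t` cast in the macro as quoted, and is an assumption about how that macro behaves rather than a documented limit of the core.

```cpp
#include <cstddef>
#include <cstdint>

// Value stated in the thread; the multiplier is what the user edits.
constexpr std::size_t kCdcQueueMaxPacketSize = 64;

// Hypothetical helper mirroring the macro's arithmetic, including its
// uint16_t cast.
constexpr std::uint16_t receiveQueueBytes(std::size_t packets) {
    return static_cast<std::uint16_t>(kCdcQueueMaxPacketSize * packets);
}

// Largest multiplier whose product still fits in the uint16_t cast.
constexpr std::size_t kMaxPacketsBeforeWrap =
    UINT16_MAX / kCdcQueueMaxPacketSize;

static_assert(receiveQueueBytes(3) == 192, "default multiplier");
static_assert(receiveQueueBytes(42) == 2688, "value tried in this thread");
static_assert(kMaxPacketsBeforeWrap == 1023, "65535 / 64, rounded down");
```

In practice the limit is available RAM long before the cast matters: the STM32F103C8 has 20 KB of SRAM (a figure from the part's datasheet, not from this thread), so 2688 bytes is roughly 13% of it.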

@benguild

CDC_QUEUE_MAX_PACKET_SIZE is 64 so for 42 you allocate 2688 bytes.

How quickly does this get cleared? What's the effective RX rate?

@fpistm
Member

fpistm commented Aug 16, 2021

It depends on several things: the host PC, the user application, and so on. The main issue here is that when there is no more space in the Rx buffer, the USB should stop the transfer until free room is available.
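The behaviour described here (refuse new data while the Rx buffer is full, resume once the application has drained it) can be illustrated with a small host-side model. This is only a sketch of the idea: `RxQueue`, `offerPacket`, and `read` are hypothetical names, and none of this is the core's cdc_queue code or the HAL's endpoint handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-capacity receive queue; not the core's cdc_queue code.
class RxQueue {
public:
    explicit RxQueue(std::size_t capacity) : buf_(capacity) {}

    std::size_t freeSpace() const { return buf_.size() - count_; }

    // Accept a whole packet only if it fits. Returning false models the
    // endpoint not being re-armed, so the host would have to retry later.
    bool offerPacket(const std::uint8_t* data, std::size_t len) {
        if (len > freeSpace()) return false;
        for (std::size_t i = 0; i < len; ++i) {
            buf_[(head_ + count_) % buf_.size()] = data[i];
            ++count_;
        }
        return true;
    }

    // Application side: drain up to max bytes, which frees room for the
    // next packet.
    std::size_t read(std::uint8_t* dst, std::size_t max) {
        std::size_t n = 0;
        while (n < max && count_ > 0) {
            dst[n++] = buf_[head_];
            head_ = (head_ + 1) % buf_.size();
            --count_;
        }
        return n;
    }

private:
    std::vector<std::uint8_t> buf_;
    std::size_t head_ = 0;   // index of the oldest byte
    std::size_t count_ = 0;  // bytes currently stored
};
```

With a 192-byte queue (the default 64 * 3), three 64-byte packets are accepted and a fourth is refused until the application reads at least 64 bytes. The point of the design is that a full queue produces a refusal rather than a write through an invalid buffer pointer.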

@fpistm fpistm removed this from the 2.1.0 milestone Sep 28, 2021
@ag88
Contributor

ag88 commented Dec 11, 2021

this is kind of 'off-topic', under bulk transfer section, under 'OUT'
https://www.usbmadesimple.co.uk/ums_3.htm
if the device can't accept the data, it is supposed to send 'NAK', i'm not sure if this is possible in the usb code.
this would likely also need some rounds of tests if this is done. e.g. could we keep sending 'NAK'
if the device is perpetually busy? An unorthodox 'flow control' sort of.

As for the sketch sending data to the host. i'd think it'd be necessary to check Serial.availableForWrite() if the data should not be lost before doing Serial.write() or Serial.print(). Otherwise, i'd think Serial should simply overwrite old data in the buffer, or just drop the latest addition. The host can't be deemed to be 'always there'.
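A minimal sketch of that check, assuming the port offers Arduino's Print-style `availableForWrite()` and `write(const uint8_t*, size_t)` (as `Serial` does) and that `write()` can take at least that many bytes without waiting. The helper name `writeIfRoom` is hypothetical; it is written as a template so it can also be exercised off-target with a stand-in port.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: write only when the whole message fits in the port's
// transmit buffer; otherwise write nothing and let the caller decide whether
// to drop the data or retry later.
template <typename Port>
std::size_t writeIfRoom(Port& port, const std::uint8_t* data, std::size_t len) {
    const int room = port.availableForWrite();
    if (room < 0 || static_cast<std::size_t>(room) < len) {
        return 0;  // not enough room: nothing is written
    }
    return port.write(data, len);
}
```

In a sketch this might be called as `writeIfRoom(Serial, buf, n)`, treating a return value of 0 as "host not keeping up". This drops the newest data, which is one of the two policies mentioned above; overwriting old data would need support inside the transmit queue itself.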

either way, it seemed quite possible for a 'dead lock' situation to happen.

@Ralf9

Ralf9 commented Jan 29, 2022

With this bug I can't use a newer core than 1.9.0

Is there another way than increasing CDC_RECEIVE_QUEUE_BUFFER_SIZE in the library? Arduino_Core_STM32/cores/arduino/stm32/usb/cdc/cdc_queue.h

@fpistm
Member

fpistm commented Jan 29, 2022

Currently not, as the fix is not so simple to deploy.

@Ralf9

Ralf9 commented Jan 29, 2022

Is there a way to increase CDC_RECEIVE_QUEUE_BUFFER_SIZE without changing a file in the core?

@fpistm
Member

fpistm commented Jan 29, 2022

No

@ag88
Contributor

ag88 commented Jan 30, 2022

at the moment, i'd suggest this as a partial solution, you could copy the relevant codes for the usb stack etc and perhaps place that in the sketch folder etc. it may take editing the codes. and omit selecting usb-cdc if you are using that particular set of things in the stack. there are some occasions like SPI etc where I need some custom functionalities i'd do that. for SPI i simply omit the library and copy the SPI library codes into the sketch folder and make a custom SPI driver

@gigaj0ule
Contributor

I had this same issue show up for me when upgrading the core from a November 2022 version to now.

SerialUSB communications simply stopped working after sending a lot of data from the PC.

I can confirm that increasing the size of CDC_RECEIVE_QUEUE_BUFFER_PACKET_NUMBER fixes it (setting it to 20 in my case!!)

But this uses a lot of RAM, which did not use to be the case in the older core.

Was there some kind of flow control that was removed?
