WIP: Queue PWM Commands to enable #133 (#3338)

Closed
wants to merge 40 commits

Conversation

Cirromulus
Contributor

First of all: I like Klipper, thanks!

To enable using Klipper for laser cutting as well (see #133), I tried to copy the move_queue functions from basecmd.c into a new pwm_queue in the MCU firmware.
This is still a work in progress, mainly because the host sometimes sends schedule_pwm commands too late for the requested clock.

Known issues:

  • Code duplication of queue (implement generic queue for both move and pwm commands?)
  • Lots of debug code
  • Queue is only implemented for hardware_pwm; software_pwm will most likely fail
  • The host can't keep up with high-frequency changes of the pwm value (even though the load graph does not indicate serious congestion, < 30%)
    • FAN_MIN_TIME is currently a guess. The host should know about the serial latency (but stepper moves work flawlessly..?) See the pacing sketch after this list.
  • So far, no actual queueing is happening (see below)
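
A minimal sketch of the FAN_MIN_TIME pacing idea from the list above (purely illustrative: the PwmPacer helper is hypothetical; only FAN_MIN_TIME and the set_pwm(print_time, value) call mirror what the existing fan code uses):

FAN_MIN_TIME = 0.100  # guessed safety margin in seconds

class PwmPacer:
    # Hypothetical helper: space successive pwm updates at least
    # FAN_MIN_TIME apart so the MCU never receives a schedule_pwm
    # command whose clock has already passed.
    def __init__(self, mcu_pwm):
        self.mcu_pwm = mcu_pwm
        self.last_time = 0.
    def schedule(self, print_time, value):
        print_time = max(print_time, self.last_time + FAN_MIN_TIME)
        self.mcu_pwm.set_pwm(print_time, value)
        self.last_time = print_time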

I would love to see someone else's thoughts about the host side implementation.

Here are two logfiles from my tests with simulavr:
timer_too_close_1ms.zip
ok_5ms.zip

If the schedule_pwm commands are too close together (1ms FAN_MIN_TIME)

[screenshot: whole_sequence]

a lot of time passes until the command for the second pwm value is sent to the MCU, and by then it is too late (zoomed):

[screenshot: first_timer]

With a guessed FAN_MIN_TIME of 5ms, the sequence already works acceptably:

[screenshot: ok]

But the queue is used very sparingly due to the slow drip of commands.

@Cirromulus
Contributor Author

Laser rastering works "alright" at a slow setting (10mm/s) on my Geeetech A10 (atmega2560).
This is just a workaround, though: commands are still scheduled too late for sending to the MCU if FAN_MIN_TIME is not guessed correctly.

@KevinOConnor
Collaborator

Hi,

As you've found, it's a bit of work to get pwm queuing to work properly. Some high-level comments:

  • It's not necessary to have a separate "move queue" and "pwm queue". The "move queue" can be used for both. We don't really want two different queues, because the micro-controller software will not know how much space to allocate for each. The host can do a better job at this. For a simple example, one can set aside space in the move queue by altering how much queue space is given to the "steppersync" allocation in mcu.py:_connect() (see the sketch after this list).
  • For the micro-controller changes, it will be critical to make sure the max_duration checks (and similar sanity checks) remain functional. The pwm code can be used by heaters where being left on incorrectly could result in significant damage (and similarly for lasers).
  • I suspect the host code is where much of the challenge will be. To have scalable support, the host will need to manage the queue lifetime so that no commands are sent that would overflow the micro-controller queue, and that no one user of the queue starves other users. The host code implements this today via the klippy/chelper/stepcompress.c:steppersync system. That code would probably need to be extended to support pwm objects in addition to steppers.
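
A minimal sketch of the space-partitioning idea from the first bullet (the helper below is hypothetical and not an excerpt of mcu.py:_connect(); it only illustrates splitting the MCU-reported move_count between steppers and pwm):

PWM_QUEUE_RESERVE = 16  # hypothetical number of move-queue slots set aside for pwm

def split_move_queue(move_count, reserve=PWM_QUEUE_RESERVE):
    # Give the "steppersync" allocation fewer slots than the MCU reports,
    # leaving the remainder free for queued pwm commands.
    reserve = min(reserve, max(0, move_count - 1))
    return move_count - reserve, reserve

# Example with an MCU reporting 256 move-queue entries:
stepper_slots, pwm_slots = split_move_queue(256)   # -> (240, 16)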

Cheers,
-Kevin

@KevinOConnor
Collaborator

Interesting. Does it work well?

-Kevin

@Cirromulus
Contributor Author

Currently I am struggling with the fact that queued PWM commands seem to be held back and only sent once the queue fills up a bit more. If I manually set the fan speed to full, zero, and full again, the first message is only sent once the third has been composed, which results in a "Timer too close" shutdown:

Hardwarepwm request clock: 125205673
[...]
Hardwarepwm request clock: 138682978
[...]
Hardwarepwm request clock: 199064455

Sent: 772.247564 772.247044 13: seq: 1b, schedule_pwm_out oid=1 clock=125205673 value=255
Rec:  772.264972 772.247564 11: seq: 1c, shutdown               clock=195082878 static_string_id=Timer too close

Could you explain the second meaning of the overloaded qm->min_clock field?
Should I fill it with some value other than 0 to indicate that a heap_replace (re-sort?) is necessary?
Currently I set min_clock = req_clock, but this may be the problem.

@KevinOConnor
Collaborator

KevinOConnor commented Oct 15, 2020

Currently I am struggling with the fact that queued PWM commands seem to be held back and only sent once the queue fills up a bit more.

I'm not sure why that would be. Note that the steppersync code needs to be flushed (mcu.flush_moves()), but the toolhead should be doing that automatically (as long as the pwm code is calling toolhead.get_last_move_time()).
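
For reference, a sketch of that pattern (the helper itself is hypothetical; lookup_object, get_last_move_time and set_pwm are the existing host-side calls, as far as I can tell):

def queue_pwm_update(printer, mcu_pwm, value):
    # Asking the toolhead for a print time also lets its normal
    # flushing cover the newly queued pwm command.
    toolhead = printer.lookup_object('toolhead')
    print_time = toolhead.get_last_move_time()
    mcu_pwm.set_pwm(print_time, value)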

Could you explain the second meaning of the overloaded qm->min_clock field?
Should I fill it with some value other than 0 to indicate that a heap_replace (re-sort?) is necessary?
Currently I set min_clock = req_clock, but this may be the problem.

The min_clock should be set to the time that the micro-controller will free up the space in the move_queue that was allocated for the given command. I think that should be the same as req_clock for PWM updates.
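
Spelled out as a tiny hypothetical helper (print_time_to_clock is the existing host-side conversion; the rest just restates the rule above):

def pwm_queue_clocks(mcu, print_time):
    # The MCU frees the move-queue slot at the very clock the pwm
    # update is applied, so min_clock equals req_clock here.
    req_clock = mcu.print_time_to_clock(print_time)
    min_clock = req_clock
    return min_clock, req_clock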

-Kevin

@Cirromulus
Contributor Author

Cirromulus commented Oct 15, 2020

Ok, thanks for your comment. I don't actively trigger a steppersync flush (for fear it might collide with the stepcompress, or is that concern unfounded?).
Unlike with some other commands, I did not find a toolhead.get_last_move_time() call in the path for the fan pwm, because it is wrapped in toolhead.register_lookahead_callback().
I will dig deeper into this.

@KevinOConnor
Collaborator

But given acceptable experimental results, I would like to hear your thoughts about a dedicated distinction between soft-pwm and hard-pwm (separate classes), to accommodate the fact that soft-pwm commands can't (and shouldn't?) be queued in the MCU.

We could make separate classes for soft-pwm and hard-pwm. However, if we are to queue pin updates on the micro-controller I think we should do the work to queue digital pin updates, soft pwm updates, and hard pwm updates. That is, once the main work of getting the logic correct is implemented, it would make sense to leverage that work for all pins.

Also, I am thinking about an extras/laser.py implementing M3-M5, and about reverting my changes to extras/fan.py.

That makes sense to me.

Cheers,
-Kevin

@Cirromulus
Contributor Author

So you suggest queueing soft-pwm and gpio as well? This is possible; it would only need a way to minimize code duplication (especially the list operations).

That is, once the main work of getting the logic correct is implemented, it would make sense to leverage that work for all pins.

I am at a point where the setup works for me. How would you suggest testing it properly to validate correctness? What level of completeness does this PR need in order to be accepted? E.g. the extras/laser.py in a separate PR, with this PR's scope being only the queueing of pwm/gpio commands, intended to run closely coupled with the stepper sequences.

If you could give feedback on where you see remaining hot spots in the functional logic, I can finalize this and "copy" the functionality to s-pwm and gpio.
E.g. is there a better way to notice the flushability of the pwm queue in the steppersync? Currently I use self._mcu.flush_moves(print_time) with good results, but I do not fully understand the whole control flow. Maybe there are better alternatives, like some variation of toolhead.flush_step_generation?

@Cirromulus
Contributor Author

PWM updates no longer force-flush the stepper's move queue. In toolhead.py:

def note_synchronous_command(self, kin_time):
    self.note_kinematic_activity(kin_time)
    self._update_move_time(kin_time)
    # The timer will call _flush_handler to check whether the batch
    # should be sent; if not, it reschedules itself.
    self.reactor.update_timer(self.flush_timer, self.reactor.NOW)

The update_timer call arranges a flush even if no moves are queued. _update_move_time is necessary so that new commands know what time to base themselves on.
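
A short usage sketch (hypothetical; it simply combines the existing set_pwm call with the new method above):

def set_pwm_synchronous(toolhead, mcu_pwm, print_time, value):
    # Queue the pwm update, then let the toolhead account for it
    # without force-flushing the stepper move queue.
    mcu_pwm.set_pwm(print_time, value)
    toolhead.note_synchronous_command(print_time)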

This way, I could further increase my benchmark speed to 80mm/s with 10 pwm changes per millimeter -> 800 updates per second for pwm alone. Higher speeds were possible but not stable for burst changes.

About the inclusion of soft-pwm in the move queues: soft-pwm requires more info per scheduled update (three uint32_t -> 12 bytes), which would decrease the total number of available move queue elements (the current biggest move element is stepper_move, at 8 bytes). And this is for an update that should not happen very often anyway (less than once per 0.1 seconds).
Is the cost-benefit ratio here good enough?

@KevinOConnor
Collaborator

How would you suggest proper testing to validate correctness? What kind of completeness needs this PR to be accepted?

My high-level feedback is that I think we would need to be careful here, as the max_duration and shutdown checks are important. That would mean manually testing that each of these checks fires correctly (I can't think of a good way to automate these types of tests.) Also, all the "unknowns" causing scheduling errors would need to be tracked down. I'd also say we would want to break up the changes into a series of small changes that can be independently reviewed for correctness.

About the inclusion of soft-pwm into the move queues: soft-pwm requires more info per scheduling (three uint32_t -> 12 bytes), which would decrease the total amount of available move queue elements (current biggest move element is stepper_move with 8 bytes).

Yeah - that would be an issue. It is possible to shrink the soft-pwm update though - we could send changes to cycle time as a separate command that uses its own queue entry - which would shrink it to the same 11 bytes (on AVR; 16 on ARM) that the struct stepper_move uses.

That said, I don't think you need to make any of these changes - if you want to focus on hard_pwm, I think that's fine.

-Kevin

@Cirromulus
Contributor Author

I'd also say we would want to break up the changes into a series of small changes that can be independently reviewed for correctness.

Yes, I agree. My suggestion is:

  1. The MCU part, with a queue for h-pwm and digital_out, which I will test thoroughly with a focus on the safety timer and default value. This part can "live" with the current command style.
  2. The chelper/MCU_pwm changes using the sync_queue concept to take advantage of the MCU's capabilities (closing #133, "SET_PIN not usable for laser engraving").
  3. extras/laser.py to make Klipper laser-ready (including an option to compensate for the laser's turn-on lag); see the sketch below.
  4. If needed, let set_pin also use sync_queue, along with the underlying implications (updates during homing, etc.).

Is that OK with you?
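
A rough, hypothetical sketch of what step 3 could look like (only the config/gcode registration pattern follows existing Klipper extras modules; everything else is illustrative):

class Laser:
    def __init__(self, config):
        self.printer = config.get_printer()
        # Hypothetical option to compensate for the laser's turn-on lag
        self.on_lag = config.getfloat('on_lag', 0., minval=0.)
        self.pending_power = 0.
        gcode = self.printer.lookup_object('gcode')
        gcode.register_command('M3', self.cmd_M3)
        gcode.register_command('M4', self.cmd_M4)
        gcode.register_command('M5', self.cmd_M5)
    def cmd_M3(self, gcmd):
        # Queue the pwm update slightly early (by on_lag) so the laser
        # reaches the requested power in time.
        self.pending_power = gcmd.get_float('S', 0., minval=0.)
    def cmd_M4(self, gcmd):
        self.cmd_M3(gcmd)
    def cmd_M5(self, gcmd):
        self.pending_power = 0.

def load_config(config):
    return Laser(config)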

@KevinOConnor
Collaborator

The MCU part, with a queue for h-pwm and digital_out, which I will test thoroughly with a focus on the safety timer and default value. This part can "live" with the current command style.

Yes - if it's possible to convert the mcu code while keeping the host code mostly unchanged, then I think that would be a good first step. I'd implement the hard_pwm changes as a set of commits separately from the digital_out commit(s).

The chelpers/MCU_pwm using the sync_queue-concept to take advantage of the MCUs capabilities (closing #133)

Agreed. Note, though, that the code could use some refactoring. The steppers could be treated just like any other command queue that needs access to the move queue. I can do this work.

extras/laser.py to make Klipper laser-ready (including an option to compensate for the laser's turn-on lag)
If needed, let set_pin also use sync_queue, along with the underlying implications (updates during homing, etc.)

I don't know much about lasers, but it sounds good to me.

Thanks,
-Kevin

@Cirromulus closed this Nov 9, 2020