Try fix https://github.com/Irqbalance/irqbalance/issues/303 #312

Merged
merged 2 commits into Irqbalance:master on May 14, 2024

Conversation

balrog-kun
Contributor

This is a proposed fix for #303, up for discussion. Here's the relevant commit message:


There are situations where irqbalance may try to migrate large numbers of
IRQs to a topo_obj; there is no upper bound on the number, because the
placement logic is based mainly on load. The kernel's irq bitmasks limit
the number of IRQs on each CPU, and if we try to migrate more than that,
the write to smp_affinity returns -ENOSPC. This confuses irqbalance's
logic: the topo_obj.interrupts list no longer matches the irqs actually
on that CPU or cache domain, and the result is floods of error messages.
See #303 for details.
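
For context, this is roughly what the failing operation looks like. A minimal
sketch, not irqbalance's actual helper; the function name and error handling
here are illustrative only:

```c
/*
 * Illustrative sketch: writing a cpumask to /proc/irq/<n>/smp_affinity.
 * The kernel rejects the change with ENOSPC once the target CPU has no
 * free IRQ slots left.
 */
#include <errno.h>
#include <stdio.h>

static int set_irq_affinity(int irq, const char *mask_hex)
{
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f)
		return -errno;

	if (fputs(mask_hex, f) == EOF || fflush(f) == EOF) {
		int err = errno;	/* ENOSPC when the CPU's irq slots are exhausted */
		fclose(f);
		return -err;
	}
	fclose(f);
	return 0;
}
```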

For an easy fix, track the number of IRQ slots still free on each CPU.
We start with INT_MAX meaning "unknown", and when we first get a -ENOSPC
we know we have no slots left. From there we update the slot count each
time we migrate IRQs to or from the CPU core topo_obj. If we never see an
-ENOSPC, there is no change to the current logic; we simply never start
tracking.
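
A rough sketch of that bookkeeping. The slots_left field, the INT_MAX
"unknown" sentinel, and OBJ_TYPE_CPU come from the description above; the
struct layout and helper names are illustrative stand-ins, not the actual
patch:

```c
#include <limits.h>

struct topo_obj {
	int obj_type;		/* OBJ_TYPE_CPU, cache domain, ... */
	int slots_left;		/* INT_MAX = unknown, i.e. tracking not started */
	/* ... other fields elided ... */
};

/* Called when a write to smp_affinity for this CPU returns -ENOSPC:
 * from now on we know the CPU is full and start tracking. */
static void cpu_mark_full(struct topo_obj *cpu)
{
	if (cpu->slots_left == INT_MAX)
		cpu->slots_left = 0;
}

/* Called whenever an IRQ migrates to/from a CPU-level topo_obj.
 * Once tracking has started, keep the free-slot count in sync:
 * +1 when an IRQ leaves the CPU, -1 when one arrives. */
static void cpu_account_migration(struct topo_obj *cpu, int delta)
{
	if (cpu->slots_left != INT_MAX)
		cpu->slots_left += delta;
}

/* Placement-time check: skip CPUs we already know have no room.
 * INT_MAX ("unknown") always passes, preserving the old behavior. */
static int cpu_has_room(struct topo_obj *cpu)
{
	return cpu->slots_left > 0;
}
```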

This way we don't need to know ahead of time how many slots the kernel
has for each CPU. The number may be arch-specific (it is about 200 on
x86-64) and depends on the number of managed IRQs the kernel has
registered, so we don't want to guess. This is also more tolerant of
the topo_obj.interrupts lists not matching exactly the kernel's idea of
each irq's current affinity, e.g. due to -EIO errors on the smp_affinity
writes.

For now, only do the tracking at the OBJ_TYPE_CPU level so we don't have
to update slots_left for all parent objs.

The commit doesn't try to stop an ongoing activation of all the IRQs
already scheduled for moving to one CPU when that CPU starts returning
-ENOSPC; we'll still see a bunch of those errors in that iteration.
But in subsequent calculate_placement() iterations we avoid assigning
more IRQs to that CPU than we were previously able to move successfully.

Add migrate_irq_obj and replace existing migrate_irq calls with calls to
the new function. migrate_irq_obj takes source and destination topo_objs
instead of interrupt lists, so as to factor out updating the load on the
destination CPU and info->assigned_obj.

Pass NULL as the destination to move an irq to rebalance_irq_list.
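
For illustration, here is a self-contained sketch of what the new wrapper
does. The types are simplified stand-ins for irqbalance's real structures
(which use GList); only the described behavior (taking topo_objs instead of
interrupt lists, updating the destination load and info->assigned_obj, and
treating a NULL destination as a move to rebalance_irq_list) follows the
patch:

```c
#include <stddef.h>

struct irq_info;
struct irq_list { struct irq_info *head; };	/* stand-in for GList */

struct topo_obj {
	struct irq_list interrupts;	/* IRQs currently on this object */
	unsigned long long load;
};

struct irq_info {
	struct irq_info *next;
	struct topo_obj *assigned_obj;
	unsigned long long load;
};

static struct irq_list rebalance_irq_list;	/* IRQs awaiting placement */

/* Stand-in for the pre-existing migrate_irq(), which moves an irq
 * between two interrupt lists. */
static void migrate_irq(struct irq_list *from, struct irq_list *to,
			struct irq_info *info)
{
	struct irq_info **p;

	if (from) {
		for (p = &from->head; *p; p = &(*p)->next) {
			if (*p == info) {
				*p = info->next;
				break;
			}
		}
	}
	info->next = to->head;
	to->head = info;
}

/* The new wrapper: pass NULL as dst to park the irq on rebalance_irq_list. */
static void migrate_irq_obj(struct topo_obj *src, struct topo_obj *dst,
			    struct irq_info *info)
{
	struct irq_list *to = dst ? &dst->interrupts : &rebalance_irq_list;

	migrate_irq(src ? &src->interrupts : NULL, to, info);
	info->assigned_obj = dst;
	if (dst)
		dst->load += info->load;	/* factored-out load bookkeeping */
}
```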

Drop the unneeded force_irq_migration.
nhorman merged commit ba44a68 into Irqbalance:master on May 14, 2024
10 checks passed