Track IRQ "slots" count per CPU to avoid overflowing
There are situations where irqbalance may try to migrate large numbers of
IRQs to a single topo_obj.  There's no upper bound on the number, as the
placement logic is based mainly on load.  The kernel's IRQ bitmasks limit
the number of IRQs on each CPU, and if irqbalance tries to migrate more,
the write to smp_affinity returns -ENOSPC.  This confuses irqbalance's
logic: the topo_obj.interrupts list no longer matches the IRQs actually
on that CPU or cache domain, which results in floods of error messages.
See #303 for details.

As an easy fix, track the number of IRQ slots still free on each CPU.
We start with INT_MAX, meaning "unknown", and when we first get an
-ENOSPC we know we have no slots left.  From then on, update the slot
count each time we migrate IRQs to or from the CPU core topo_obj.  We
may never see an -ENOSPC, in which case we never start tracking and the
current logic is unchanged.
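
The counter's life cycle described above can be sketched in isolation.
This is a minimal model, not the irqbalance code itself: struct cpu_slots
and the slots_*() helper names are hypothetical stand-ins for the
topo_obj field and the logic this commit adds to activate.c, irqlist.c
and placement.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU topo_obj; only the counter is modeled. */
struct cpu_slots {
	int slots_left;	/* INT_MAX = unknown, <= 0 = full (negative = overcommitted) */
};

/* Reset to "unknown": nothing is tracked until the kernel reports ENOSPC. */
static void slots_reset(struct cpu_slots *cpu)
{
	assert(cpu != NULL);
	cpu->slots_left = INT_MAX;
}

/* An affinity write targeting this CPU failed with ENOSPC. */
static void slots_note_enospc(struct cpu_slots *cpu)
{
	assert(cpu != NULL);
	if (cpu->slots_left > 0)
		cpu->slots_left = -1;	/* first failure: full, one IRQ too many */
	else
		cpu->slots_left--;	/* one more IRQ that did not fit */
}

/* Bookkeeping when an IRQ moves between CPUs; either side may be NULL. */
static void slots_migrate(struct cpu_slots *from, struct cpu_slots *to)
{
	if (from && from->slots_left != INT_MAX)
		from->slots_left++;
	if (to && to->slots_left != INT_MAX)
		to->slots_left--;
}

/* Placement only considers CPUs that are unknown or have free slots. */
static bool slots_can_accept(const struct cpu_slots *cpu)
{
	assert(cpu != NULL);
	return cpu->slots_left > 0;
}
```

In this model a CPU stays at INT_MAX, and is therefore always eligible,
until slots_note_enospc() is called for it.  Moving the failed IRQ away
with slots_migrate(cpu, NULL) then brings the count from -1 back to 0,
which is the "no slots left" state.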

This way we don't need to know ahead of time how many slots the kernel
has for each CPU.  The number may be arch specific (it is about 200 on
x86-64) and depends on the number of managed IRQs the kernel has
registered, so we don't want to guess.  This is also more tolerant of
the topo_obj.interrupts lists not exactly matching the kernel's idea of
each IRQ's current affinity, e.g. due to -EIO errors in the smp_affinity
writes.

For now only do the tracking at OBJ_TYPE_CPU level so we don't have to
update slots_left for all parent objs.

This commit doesn't try to stop an ongoing activation of all the IRQs
already scheduled for moving to one CPU when that CPU starts returning
ENOSPC.  We'll still see a bunch of those errors in that iteration.
But in subsequent calculate_placement() iterations we avoid assigning
more IRQs to that CPU than we were able to successfully move before.
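
The two-iteration behaviour can be illustrated with a toy simulation.
Everything here is hypothetical: struct fake_cpu, FAKE_KERNEL_CAPACITY
and the helper functions are invented for the sketch, and it assumes
that force_rebalance_irq() detaches the failed IRQ from the CPU again
through the same bookkeeping as migrate_irq_obj(cpu, NULL, info).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

#define FAKE_KERNEL_CAPACITY 3	/* hypothetical; the balancer never reads this */

struct fake_cpu {
	int slots_left;	/* balancer's view: INT_MAX = unknown */
	int assigned;	/* IRQs the balancer believes are on this CPU */
	int in_kernel;	/* IRQs the fake kernel actually accepted */
};

/* Stand-in for the smp_affinity write: 0 on success, ENOSPC when full. */
static int fake_affinity_write(struct fake_cpu *cpu)
{
	if (cpu->in_kernel >= FAKE_KERNEL_CAPACITY)
		return ENOSPC;
	cpu->in_kernel++;
	return 0;
}

/* Placement step: returns 1 if the IRQ was assigned, 0 if the CPU was skipped. */
static int place_irq(struct fake_cpu *cpu)
{
	if (cpu->slots_left <= 0)
		return 0;
	if (cpu->slots_left != INT_MAX)
		cpu->slots_left--;
	cpu->assigned++;
	return 1;
}

/* Activation step for one assigned IRQ: returns the result of the write. */
static int activate_irq(struct fake_cpu *cpu)
{
	int err = fake_affinity_write(cpu);

	if (err == ENOSPC) {
		if (cpu->slots_left > 0)
			cpu->slots_left = -1;
		else
			cpu->slots_left--;
		/* Assumed effect of the forced rebalance: detach the IRQ again. */
		cpu->assigned--;
		cpu->slots_left++;
	}
	return err;
}

/* One balancing pass over n IRQs; returns how many writes hit ENOSPC. */
static int run_iteration(struct fake_cpu *cpu, int n)
{
	int placed = 0, failures = 0;

	assert(cpu != NULL && n >= 0);
	for (int i = 0; i < n; i++)
		placed += place_irq(cpu);
	for (int i = 0; i < placed; i++)
		if (activate_irq(cpu) == ENOSPC)
			failures++;
	return failures;
}
```

Starting from slots_left = INT_MAX, a first pass that places more IRQs
than the fake kernel accepts still reports one ENOSPC per excess IRQ.
After that pass slots_left is 0, so a second pass places nothing on the
CPU and produces no further errors.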
balrog-kun committed May 11, 2024
1 parent d16ad5d commit 5405144
Showing 8 changed files with 42 additions and 2 deletions.
13 changes: 12 additions & 1 deletion activate.c
@@ -99,14 +99,25 @@ static void activate_mapping(struct irq_info *info, void *data __attribute__((un
 			"Cannot change IRQ %i affinity: %s\n",
 			info->irq, strerror(errsave));
 		switch (errsave) {
-		case ENOSPC: /* Specified CPU APIC is full. */
 		case EAGAIN: /* Interrupted by signal. */
 		case EBUSY: /* Affinity change already in progress. */
 		case EINVAL: /* IRQ would be bound to no CPU. */
 		case ERANGE: /* CPU in mask is offline. */
 		case ENOMEM: /* Kernel cannot allocate CPU mask. */
 			/* Do not blacklist the IRQ on transient errors. */
 			break;
+		case ENOSPC: /* Specified CPU APIC is full. */
+			if (info->assigned_obj->obj_type != OBJ_TYPE_CPU)
+				break;
+
+			if (info->assigned_obj->slots_left > 0)
+				info->assigned_obj->slots_left = -1;
+			else
+				/* Negative slots to count how many we need to free */
+				info->assigned_obj->slots_left--;
+
+			force_rebalance_irq(info, NULL);
+			break;
 		default:
 			/* Any other error is considered permanent. */
 			info->level = BALANCE_NONE;
2 changes: 2 additions & 0 deletions classify.c
@@ -883,6 +883,8 @@ static void remove_no_existing_irq(struct irq_info *info, void *data __attribute
 		entry = g_list_find_custom(info->assigned_obj->interrupts, info, compare_ints);
 		if (entry) {
 			info->assigned_obj->interrupts = g_list_delete_link(info->assigned_obj->interrupts, entry);
+			/* Probe number of slots again, don't guess whether the IRQ left a free slot */
+			info->assigned_obj->slots_left = INT_MAX;
 		}
 	}
 	free_irq(info, NULL);
10 changes: 10 additions & 0 deletions cputree.c
@@ -595,3 +595,13 @@ int get_cpu_count(void)
 	return g_list_length(cpus);
 }

+static void clear_obj_slots(struct topo_obj *d, void *data __attribute__((unused)))
+{
+	d->slots_left = INT_MAX;
+	for_each_object(d->children, clear_obj_slots, NULL);
+}
+
+void clear_slots(void)
+{
+	for_each_object(numa_nodes, clear_obj_slots, NULL);
+}
3 changes: 3 additions & 0 deletions irqbalance.c
@@ -298,6 +298,7 @@ gboolean scan(gpointer data __attribute__((unused)))
 		} while (need_rebuild);

 		for_each_irq(NULL, force_rebalance_irq, NULL);
+		clear_slots();
 		parse_proc_interrupts();
 		parse_proc_stat();
 		return TRUE;
@@ -695,6 +696,8 @@ int main(int argc, char** argv)
 	parse_proc_interrupts();
 	parse_proc_stat();

+	clear_slots();
+
 #ifdef HAVE_IRQBALANCEUI
 	if (init_socket()) {
 		ret = EXIT_FAILURE;
1 change: 1 addition & 0 deletions irqbalance.h
@@ -98,6 +98,7 @@ extern struct topo_obj *get_numa_node(int nodeid);
 #define cpu_numa_node(cpu) ((cpu)->parent->numa_nodes)
 extern struct topo_obj *find_cpu_core(int cpunr);
 extern int get_cpu_count(void);
+extern void clear_slots(void);

 /*
  * irq db functions
11 changes: 10 additions & 1 deletion irqlist.c
@@ -211,8 +211,17 @@ void migrate_irq_obj(struct topo_obj *from, struct topo_obj *to, struct irq_info

 	migrate_irq(from_list, to_list, info);

-	if (to)
+	if (from) {
+		if (from->slots_left != INT_MAX)
+			from->slots_left++;
+	}
+
+	if (to) {
+		if (to->slots_left != INT_MAX)
+			to->slots_left--;
+
 		to->load += info->load + 1;
+	}

 	info->assigned_obj = to;
 }
3 changes: 3 additions & 0 deletions placement.c
@@ -59,6 +59,9 @@ static void find_best_object(struct topo_obj *d, void *data)
 	if (d->powersave_mode)
 		return;

+	if (d->slots_left <= 0)
+		return;
+
 	newload = d->load;
 	if (newload < best->best_cost) {
 		best->best = d;
1 change: 1 addition & 0 deletions types.h
@@ -56,6 +56,7 @@ struct topo_obj {
 	GList *children;
 	GList *numa_nodes;
 	GList **obj_type_list;
+	int slots_left;
 };

 struct irq_info {
