i#6959: Add exit_if_fraction_left option #7018

Merged: derekbruening merged 5 commits into master from i6959-exit-if-fraction on Oct 4, 2024

@derekbruening derekbruening commented Oct 2, 2024

Adds a new scheduler feature and CLI option exit_if_fraction_left. This applies to -core_sharded and -core_serial modes. When an input reaches EOF, if the number of non-EOF inputs left as a fraction of the original inputs is equal to or less than this value then the scheduler exits (sets all outputs to EOF) rather than finishing off the final inputs. This helps avoid long sequences of idles during staggered endings with fewer inputs left than cores and only a small fraction of the total instructions left in those inputs.
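The exit condition described above can be sketched as a small predicate. This is only an illustration under stated assumptions, not the scheduler's actual code: the function name `should_exit_early` and its parameters are hypothetical, and the check is assumed to run each time an input reaches EOF.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not the DynamoRIO implementation): decides whether the
// scheduler should stop early right after an input has reached EOF.
//   live_inputs:           inputs that have not yet reached EOF.
//   total_inputs:          inputs the scheduler started with.
//   exit_if_fraction_left: threshold in [0, 1].
bool
should_exit_early(std::size_t live_inputs, std::size_t total_inputs,
                  double exit_if_fraction_left)
{
    assert(total_inputs > 0 && live_inputs <= total_inputs);
    double fraction_left =
        static_cast<double>(live_inputs) / static_cast<double>(total_inputs);
    // "Equal to or less than": with a threshold of 0 this is true only once
    // every input has finished, i.e., no early exit happens.
    return fraction_left <= exit_if_fraction_left;
}
```

With 100 inputs and a threshold of 0.05, this sketch keeps running while 6 inputs are live and exits once only 5 remain.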

The default value in scheduler_options_t is 0, as simulators typically already choose to stop at some even point. For analyzers, however, the command-line option defaults to 0.05 (i.e., 5%), which, when tested on a large internal trace, eliminates much of the final idle time from the cores (just about any value over 0.05 works well: the result is not overly sensitive to it).

Compare the numbers below for today's default, which has a long idle tail and therefore a distinct gap between the "cpu busy by time" and "cpu busy by time, ignoring idle past last instr" stats. They come from a 39-core schedule-stats run of a moderately large trace; only the key stats and the first 2 cores are shown, for brevity:

  1567052521 instructions
   878027975 idles
       64.09% cpu busy by record count
       82.38% cpu busy by time
       96.81% cpu busy by time, ignoring idle past last instr
Core #0 schedule: CccccccOXHhUuuuuAaSEOGOWEWQqqqFffIiTETENWwwOWEeeeeeeACMmTQFfOWLWVvvvvFQqqqqYOWOooOWOYOYQOWO_O_W_O_W_O_W_O_WO_WO_O_O_O_O_O_OR_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_RY_YyyyySUuuOSISO_S_S_SOPpSOKO_KO_KCcDKWDB_B_____________________________________________ 
Core #1 schedule: KkLWSFUQPDddddddddXxSUSVRJWKkRNJBWUWwwTttGgRNKkkRWNTtFRWKkRNWUuuGULRFSRSYKkkkRYAYFffGSRYHRYHNWMDddddddddRYGgggggYHNWK_YAHYNnGYSNHWwwwwSWSNKSYyyWKNNWKNNGAKWGggNnNW_NNWE_E_EF__________________________________________________

And now with -exit_if_fraction_left 0.05, where we lose (1567052521 - 1564522227)/1567052521 = 0.16% of the instructions but drastically reduce the tail from 14% of the time to about 1% of the time:

  1564522227 instructions
   120512812 idles
       92.85% cpu busy by record count
       96.39% cpu busy by time
       97.46% cpu busy by time, ignoring idle past last instr
Core #0 schedule: CccccccOXHKYEGGETRARrrPRTVvvvRrrNWwwOOKWVRRrPBbbXUVvvvvvOWKVLWVvvJjSOWKVUuTIiiiFPpppKAaaMFfffAHOKWAaGNBOWKAPPOABCWKPWOKWPCXxxxZOWKCccJSOSWKJUYRCOWKCcSOSUKkkkOROK_O_O_O_O_O 
Core #1 schedule: KkLWSMmmFLSFffffffJjWBbGBUuuuuuuuuuuBDBJJRJWKkRNJWMBKkkRNWKkRNWKkkkRNWXxxxxxZOooAaUIiTHhhhSDNnnnHZzQNnnRNWXxxxxxRNWUuuRNWKXUuXRNKRWKNXxxRWKONNHRKWONURKWXRKXRKNW_KR_KkRK_KRKR_R_R_R_R_R_R_R_R_R_R_R__R__R__R___R___R___R___R___R
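The arithmetic behind the comparison can be reproduced with a few helpers. This is a sketch under assumptions: the function names are hypothetical, the "tail" is taken to be the gap between the two "cpu busy by time" stats, and the `instructions / (instructions + idles)` formula for "cpu busy by record count" is inferred from the printed numbers rather than taken from the tool's source.

```cpp
#include <cassert>
#include <cstdint>

// Fraction of instructions dropped by exiting early.
double
fraction_lost(std::uint64_t instrs_before, std::uint64_t instrs_after)
{
    assert(instrs_before > 0 && instrs_after <= instrs_before);
    return static_cast<double>(instrs_before - instrs_after) /
        static_cast<double>(instrs_before);
}

// Assumed definition of "cpu busy by record count" (matches the stats above).
double
busy_by_record_count(std::uint64_t instrs, std::uint64_t idles)
{
    return static_cast<double>(instrs) / static_cast<double>(instrs + idles);
}

// Idle tail in percentage points: the gap between "cpu busy by time, ignoring
// idle past last instr" and plain "cpu busy by time".
double
tail_points(double busy_by_time_pct, double busy_ignoring_tail_pct)
{
    return busy_ignoring_tail_pct - busy_by_time_pct;
}
```

Plugging in the figures above gives a loss of about 0.16% of the instructions, busy-by-record-count values of about 64.09% and 92.85%, and a tail that shrinks from about 14.43 to about 1.07 percentage points.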

Fixes #6959

@brettcoon brettcoon left a comment

Just a couple suggestions.

clients/drcachesim/scheduler/scheduler.h (outdated, resolved)
clients/drcachesim/scheduler/scheduler.cpp (outdated, resolved)
clients/drcachesim/scheduler/scheduler.h (resolved)
clients/drcachesim/common/options.cpp (outdated, resolved)
clients/drcachesim/common/options.cpp (outdated, resolved)
clients/drcachesim/scheduler/scheduler.cpp (outdated, resolved)
clients/drcachesim/scheduler/scheduler.cpp (outdated, resolved)
@derekbruening derekbruening merged commit 94f5c44 into master Oct 4, 2024
17 checks passed
@derekbruening derekbruening deleted the i6959-exit-if-fraction branch October 4, 2024 00:02
derekbruening added a commit that referenced this pull request Oct 4, 2024
Adds a new scheduler feature and CLI option exit_if_fraction_inputs_left. This
applies to -core_sharded and -core_serial modes. When an input reaches
EOF, if the number of non-EOF inputs left as a fraction of the original
inputs is equal to or less than this value then the scheduler exits
(sets all outputs to EOF) rather than finishing off the final inputs.
This helps avoid long sequences of idles during staggered endings with
fewer inputs left than cores and only a small fraction of the total
instructions left in those inputs.

The default value in scheduler_options_t and the CLI option is 0.05 (i.e., 5%),
which, when tested on a large internal trace, eliminates much of the
final idle time from the cores without losing many instructions.

Compare the numbers below for today's default, which has a long idle tail
and therefore a distinct gap between the "cpu busy by time" and "cpu busy
by time, ignoring idle past last instr" stats. They come from a 39-core
schedule-stats run of a moderately large trace; only the key stats and
the first 2 cores are shown, for brevity:

```
  1567052521 instructions
   878027975 idles
       64.09% cpu busy by record count
       82.38% cpu busy by time
       96.81% cpu busy by time, ignoring idle past last instr
Core #0 schedule: CccccccOXHhUuuuuAaSEOGOWEWQqqqFffIiTETENWwwOWEeeeeeeACMmTQFfOWLWVvvvvFQqqqqYOWOooOWOYOYQOWO_O_W_O_W_O_W_O_WO_WO_O_O_O_O_O_OR_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_RY_YyyyySUuuOSISO_S_S_SOPpSOKO_KO_KCcDKWDB_B_____________________________________________ 
Core #1 schedule: KkLWSFUQPDddddddddXxSUSVRJWKkRNJBWUWwwTttGgRNKkkRWNTtFRWKkRNWUuuGULRFSRSYKkkkRYAYFffGSRYHRYHNWMDddddddddRYGgggggYHNWK_YAHYNnGYSNHWwwwwSWSNKSYyyWKNNWKNNGAKWGggNnNW_NNWE_E_EF__________________________________________________
```

And now with -exit_if_fraction_inputs_left 0.05, where we lose (1567052521 -
1564522227)/1567052521 = 0.16% of the instructions but drastically
reduce the tail from 14% of the time to about 1% of the time:

```
  1564522227 instructions
   120512812 idles
       92.85% cpu busy by record count
       96.39% cpu busy by time
       97.46% cpu busy by time, ignoring idle past last instr
Core #0 schedule: CccccccOXHKYEGGETRARrrPRTVvvvRrrNWwwOOKWVRRrPBbbXUVvvvvvOWKVLWVvvJjSOWKVUuTIiiiFPpppKAaaMFfffAHOKWAaGNBOWKAPPOABCWKPWOKWPCXxxxZOWKCccJSOSWKJUYRCOWKCcSOSUKkkkOROK_O_O_O_O_O 
Core #1 schedule: KkLWSMmmFLSFffffffJjWBbGBUuuuuuuuuuuBDBJJRJWKkRNJWMBKkkRNWKkRNWKkkkRNWXxxxxxZOooAaUIiTHhhhSDNnnnHZzQNnnRNWXxxxxxRNWUuuRNWKXUuXRNKRWKNXxxRWKONNHRKWONURKWXRKXRKNW_KR_KkRK_KRKR_R_R_R_R_R_R_R_R_R_R_R__R__R__R___R___R___R___R___R
```
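To compare idle tails across schedule strings like the ones above, a small helper can measure them directly. This is a sketch that assumes each `_` in a "Core #N schedule" string marks one idle interval, which is how the long underscore runs above read; the function names are hypothetical and not part of the schedule_stats tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Length of the run of '_' characters at the very end of a schedule string.
std::size_t
trailing_idle_run(const std::string &schedule)
{
    std::size_t last_busy = schedule.find_last_not_of('_');
    if (last_busy == std::string::npos)
        return schedule.size(); // Entirely idle (or empty).
    return schedule.size() - last_busy - 1;
}

// Fraction of all characters in the schedule string that are '_'.
double
idle_fraction(const std::string &schedule)
{
    assert(!schedule.empty());
    auto idles = std::count(schedule.begin(), schedule.end(), '_');
    return static_cast<double>(idles) / static_cast<double>(schedule.size());
}
```

For example, a string ending in `KWDB_B_____` has a trailing idle run of 5, while one ending in `OROK_O_O_O` has none.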

Fixes #6959
Development

Successfully merging this pull request may close these issues.

Add scheduler exit-early feature to avoid long tail with sparse activity
2 participants