-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add schedstats instrument #123
base: master
Are you sure you want to change the base?
Conversation
devlib/instrument/__init__.py
Outdated
msg = 'Unexpected channel "{}"; must be in {}' | ||
raise ValueError(msg.format(chan_name, self.channels.keys())) | ||
else: | ||
for chan in self.channels.values(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the else
clause, channels is None
so this will raise a NoAttributeError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
channels
is None
but not self.channels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my bad.
doc/instrumentation.rst
Outdated
``active_channels`` attribute of the ``Instrument``. | ||
|
||
If ``channels`` is provided, it is a list of names of channels to enable and | ||
``sites`` and ``kinds`` are ignored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the fact that user-provided parameters may be silently ignored. Either, they should be combined with the channels or an error should be raised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'll make it raise an error if you provide both channels
and sites or kinds
devlib/instrument/schedstats.py
Outdated
# params. Does it matter? | ||
measurement_type = MeasurementType( | ||
measurement_name, '', measurement_category) | ||
self.add_channel(site=site, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of this should be done int the __init__
-- setup()
is not guaranteed to be invoked on every execution. (Set up is intended for persistent changes, such as deploying assets to the target.
devlib/instrument/schedstats.py
Outdated
measure=measurement_type) | ||
|
||
def teardown(self): | ||
self.target.write_value(self.sysctl_path, self.old_sysctl_value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably happen in reset()
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, ignore the above. Doing this in teardown is correct.
devlib/instrument/schedstats.py
Outdated
based on identifiers from the scheduler code. | ||
""" | ||
sysctl_path = '/proc/sys/kernel/sched_schedstats' | ||
schedstat_path = '/proc/schedstat' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should set mode
to INSTANTANEOUS
devlib/instrument/schedstats.py
Outdated
measures = DOMAIN_MEASURES | ||
else: | ||
raise TargetError( | ||
'Unrecognised schedstats line: "{}"'.format(line)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fails in all our Android systems, since we have additional stats reporting EAS codepaths.
In Android kernel we will have stats like:
cpu0 9703 0 227370 45501 209763 87404 75510237990 76927923766 168389
eas 50510 0 0 29148 0 21362 51469 11340 3022 43 14180 169 22712 40129 3 0 3023 28011 15055 11640
domain0 0f 5237 5089 86 95279 66 0 0 5089 2021 1945 62 74554 19 1 3 1942 43224 36754 5499 4393208 971 0 7128 29626 1 0 1 0 0 0 0 0 0 39102 1466 0
eas 0 0 0 9389 0 0 0 0 0 0 0 0 0 0 0 0 0 0 13480 0
domain1 ff 3981 3733 220 673383 33 2 5 1149 1076 1069 6 36930 2 0 3 176 42131 36952 4441 18425660 738 3 261 36691 1 0 1 0 0 0 0 0 0 83215 794 0
eas 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 15055 0
Where basically, after each cpu
and domain
we do have an eas
entry with EAS specific counters.
I was wondering if we can end up with a more "generic" parser which allows to accept custom defined lines.
Maybe we can use a "map" to define what are the columns for each specific "channel", the map can be defined by default using the hardcoded values you have now... but we can allow the user to specify a custom map.
In that case, the custom map should allow to parse additional fields (e.g. eas stats) as well as open the door to parse stats versions not supported by default.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not design & document an API for that given that these EAS stats are probably the only ever user (also TBH we have broken schedstats with those additions, we should really change the version string to "15.eas0" or something IMO). How about if:
- Unrecognised lines are ignored with a warning
- I rewrite this so we can inherit from it and write a subclass that knows about the EAS stats
- Then, if we decide that the EAS stats format is "stable", or come up with a way to fixup the version string so we can detect changes, we just add support for the EAS stats directly here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rewrite this so we can inherit from it and write a subclass that knows about the EAS stats
+1 for this - if we handle each line in a class member which can access curr_cpu and add curr_domain, then 'eas' lines can be parsed easily in a subclassed instrument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started doing part 2. and realised that it's also pretty unpalatable; the parsing state (curr_cpu
and curr_domain
) makes the code quite arcane and it still isn't generic. If we want to keep fiddling with schedstats we're going to have to fiddle with this code again anyway so I propose just skipping part 2. and only doing parts 1. and 3.
f7c2a09
to
2e4943f
Compare
Latest version
|
I think this could be really useful for use with WA3 but I would say it's low priority. It's also just occurred to me that I should think about how to get this to report diffs, i.e. you'd like to take a sample at the beginning and a sample at the end, and have the metric report the diff in schedstats during workload execution (maybe WA already does that?). Also would like some input from @derkling on the EAS stats thing. Are you OK with having this PR sit around a little bit longer? |
WA has a generic "sysfs_extractor", which takes a list of sysfs locations, which it then pulls before and after each iteration, and then performs a generic diff; but it has no awareness of what it's diff'ing, so it doesn't generate any metrics (you just get a directory structure with the diffs).
Yeah, no problem. There is no rush. It's just that there hasn't been any activity on this for a while, so I just wanted to make sure this is still relevant. |
0821c1f
to
3675dee
Compare
I've just refreshed this and plugged it into WA3 as an experiment.. definitely interesting but the only problem is it produces literally hundreds of metrics for each workload! |
That's not necessarily an issue, if those metrics are actually individually useful. An alternative though, if they're more of a "state dump" for debug/deep-dive analysis, is to write them out to a file that is than added as an Artifact, rather than individually adding them as metrics in WA (or adding only a few key/summary ones as metrics). |
Hmm yeah.. my dream for this Instrument is that you can immediately see that your kernel change e.g. "increased the average number of failed load balances" for a certain workload. All the added stats are potentially individually useful, but the particular one that's individually useful is probably going to change in every situation. On the other hand I've just realised that's only going to happen if you explicitly ask for schedstats data, so maybe it's fine. There are also a few potential aggregates I can think of (aggregate across CPU/sched_domain/idle type/any combination of those 3) but none of them makes any more sense than the others, so probably best considered a future feature. Subject to a bit more testing this is probably ready for merge. |
That sounds good to me. There is nothing in WA which demands or even encourages reducing the number of metrics. We specifically use Ultimately, it comes down to what's easier to work with. As you'll likely be, at least initially, the only/the primary user of this instrument, this is for you to decide. I should also point out that there is also the option of tagging the metrics with some kind of classifier to make it easier to filter/split them from the rest of the results. |
OK sweet. Last time I tested this I saw some suspicious values, once I've investigated that, let's go ahead (this is still low priority though so might be a while). |
5db1f83
to
0f45fbc
Compare
OK I am now using this instrument in anger and getting sane results. |
devlib/instrument/schedstats.py
Outdated
|
||
- If :method:`reset` is called with ``kinds=['alb_pushed']`` then the count | ||
of migrations successfully triggered by active_load_balance will be | ||
colelcted for each sched domain, with a channel for each domain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/colelcted/collected/
devlib/instrument/schedstats.py
Outdated
match = re.search(r'version ([0-9]+)', lines[0]) | ||
if not match or match.group(1) != '15': | ||
raise TargetError( | ||
'Unsupported schedstat version string: "{}"'.format(lines[0])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth to report which versions we support?
devlib/instrument/schedstats.py
Outdated
|
||
# On 4.6+ kernels, schedstats needs to be enabled via kernel cmdline or | ||
# sysctl. If not, we'll just get a load of zeroes. | ||
self.old_sysctl_value = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for code readability, maybe by using some "returns" on success/failure, can we move the following checks (lines 128-145) into a dedicated function, something like _check_availability()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, seems like a good idea, will try it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so I split out this particular chunk (that enables the runtime config) into a separate function but kept the bit that checks for the presence and version of schedstat in the kernel, since the lines
variable is re-used later to add the channels (to avoid more round-trips to the target slowing things down). If you like I can split it up further and just drop the "optimisation".
# create. | ||
# We'll create a site for each CPU and a site for each sched_domain. | ||
for site, measures in self._get_sample().iteritems(): | ||
if site.startswith('cpu'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it not more generic to parse the first column (up to excluding the ID) and create a category with that name?
Something like that should do the job:
site_re = regexp.compile("^(?P<category>[a-z]+)(?P<id>[0-9]+)")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC I thought about that, but the thing is that domain
s are a sub-category of cpu
s, and there's no way to detect that: the parser has to be aware of this relationship. I can't imagine any other types of category (tasks? They have a separate schedstats file though. Task groups & sched entities would too, if they had schedstats), so I can't predict whether it makes sense to assume that the other category is a sub-entity of cpu
s, a sub-category of domain
s, or a new category entirely. Does that make sense?
Edit: Wait, that makes no sense.. I thought this comment was on different code (I assumed it was attached to a line in _get_sample
) and interpreted it totally wrong. Will think about this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK now I understand you!
To be honest while it would be more generic I don't think this is worth it unless there are more categories.. even with the EAS stats there are still only 2 category types, and even with the addition of tasks there are only 3. I don't think it's worth adding a regex (which nobody likes reading) just to deal with that..
# it might not be stable. | ||
continue | ||
else: | ||
self.logger.warning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dunno if it really can make sense... but... can we think about generating at run-time a category which has the name of this line's first column and a "generic list of measures", e.g. measure1, measure2, measure3... etc..
This will allows to parse Android specific fileds and then map the measures to proper names in the client code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that seems like a sensible idea (although in fact it might be easier to just add the EAS-specific schedstats?). Do you mind if we leave that for a separate feature though?
devlib/instrument/schedstats.py
Outdated
|
||
return ret | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove empty lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! I think I will have to wait until monday to fix it up.
devlib/instrument/schedstats.py
Outdated
|
||
# On 4.6+ kernels, schedstats needs to be enabled via kernel cmdline or | ||
# sysctl. If not, we'll just get a load of zeroes. | ||
self.old_sysctl_value = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, seems like a good idea, will try it.
# create. | ||
# We'll create a site for each CPU and a site for each sched_domain. | ||
for site, measures in self._get_sample().iteritems(): | ||
if site.startswith('cpu'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC I thought about that, but the thing is that domain
s are a sub-category of cpu
s, and there's no way to detect that: the parser has to be aware of this relationship. I can't imagine any other types of category (tasks? They have a separate schedstats file though. Task groups & sched entities would too, if they had schedstats), so I can't predict whether it makes sense to assume that the other category is a sub-entity of cpu
s, a sub-category of domain
s, or a new category entirely. Does that make sense?
Edit: Wait, that makes no sense.. I thought this comment was on different code (I assumed it was attached to a line in _get_sample
) and interpreted it totally wrong. Will think about this!
# it might not be stable. | ||
continue | ||
else: | ||
self.logger.warning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that seems like a sensible idea (although in fact it might be easier to just add the EAS-specific schedstats?). Do you mind if we leave that for a separate feature though?
This instrument collects snapshots of the CPU/sched_domain scheduler statistics in /proc/schedstat, and reports a Measure for each of them. It currently doesn't support the per-task stats reported in /proc/<pid>/schedstat, and it ignores the additional EAS schedstats reported by Android kernels. The measures are named according to the corresponding identifiers in the kernel source code. Although This might appear to be an obscure way to refer to them, the information in /proc/schedstats is not going to be any use to anyone who doesn't have the kernel code handy. There is an English description in Documentation/scheduler/sched-stats.txt but it can only really be considered a reference for those who already understand the scheduler. Some of the measures are representing time intervals, but their units are jiffies, the meaning of which depends on the kernel configuration. Therefore no units are reported for the measurements.
Not sure how useful this will actually be hence RFC marker and not too much care or documentation. This basically jimmies
/proc/schedstat
into theInstrument
API. What do you think?Low priority at the moment, don't spend ages reviewing if you're busy :)