From e4b3c9928b4480dd687e5bf341ee4558f63ba4eb Mon Sep 17 00:00:00 2001
From: Patrick Lam
Date: Tue, 17 Sep 2024 16:45:29 +1200
Subject: [PATCH] L29 cosmetic

---
 lectures/L29-slides.tex | 15 +++++++++------
 lectures/L29.tex        |  6 +++---
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/lectures/L29-slides.tex b/lectures/L29-slides.tex
index e50ce99b..0fb0be02 100644
--- a/lectures/L29-slides.tex
+++ b/lectures/L29-slides.tex
@@ -46,10 +46,12 @@ \part{Liar, Liar}
 \frametitle{Assumptions}
 The main assumptions underlying sampling are that:
+\vspace*{-6em}
+\begin{itemize}
+\item Samples are ``random''; and
-Samples are ``random''; and
-
-The sample distribution approximates the actual time-spent distribution.
+\item The sample distribution approximates the actual time-spent distribution.
+\end{itemize}
 \end{frame}
@@ -84,11 +86,12 @@ \part{Lies from Metrics}
 mostly we'll talk about CPU perf counters.
+\vspace*{-6em}
 \begin{center}
 Reference: Paul Khuong,\\
- \tiny
+ \scriptsize
 \url{http://www.pvk.ca/Blog/2014/10/19/performance-optimisation-~-writing-an-essay/}
 \end{center}
@@ -357,7 +360,7 @@ \part{Lies from Counters}
 To make counters as deterministic as possible:
 \begin{itemize}
-\item disable Address Space Layout Randomization (randomized pointer addresses affect hash layouts);
+\item disable Address Space Layout Randomization (security mitigation; but, randomized pointer addresses affect hash layouts);
 \item subtract time spent processing interrupts (IRQs);
 \item profile one thread only (if you can, in your context).
 \end{itemize}
@@ -396,7 +399,7 @@ \part{Lies about Calling Context}
 \begin{center}
 Reference: Yossi Kreinin,\\
- \tiny
+ \scriptsize
 \url{http://www.yosefk.com/blog/how-profilers-lie-the-cases-of-gprof-and-kcachegrind.html}
 \end{center}
diff --git a/lectures/L29.tex b/lectures/L29.tex
index 38d9b413..6f55b357 100644
--- a/lectures/L29.tex
+++ b/lectures/L29.tex
@@ -115,7 +115,7 @@ \section*{Lies from Metrics}
 \end{tabular}
 That 15\% number is a total lie.
-Profilers, even using CPU expense counts, drastically underestimate the impact of mfence,
+Profilers, even using CPU counts, drastically underestimate the impact of mfence,
 and overestimate the impact of locks.
 This is because mfence causes a pipeline flush, and the resulting
@@ -149,7 +149,7 @@ \section*{The Long Tail}
 Well, for one thing, perf samples are done with interrupts. Processing interrupts takes a fair amount of time and if you crank up the rate of interrupts, before long, you are spending all your time handling the interrupts rather than doing useful work. So sampling tools usually don't interrupt the program too often. SHIM gets around this by being more invasive---it instruments the program, adding some periodically executed code that puts information out whenever there is an appropriate event (e.g., function return). This produces a bunch of data which can be dealt with later to produce something useful.
-This instrumentation-based approach is more expensive in general, but note that DTrace\footnote{Note also the comment in the blog post: ``Yes, that includes dtrace, which I'm calling out in particular because any time you have one of these discussions, a dtrace troll will come along to say that dtrace has supported that for years. It's like the common lisp of trace tools, in terms of community trolling.''} and Nethercote's counts tool (discussed in L25) also enable custom instrumentation of select events.
+This instrumentation-based approach is more expensive in general, but note that DTrace\footnote{Note also the comment in the blog post: ``Yes, that includes dtrace, which I'm calling out in particular because any time you have one of these discussions, a dtrace troll will come along to say that dtrace has supported that for years. It's like the common lisp of trace tools, in terms of community trolling.''} and Nethercote's counts tool (discussed in L27) also enable custom instrumentation of select events.
 \section*{Lies from Counters}
 This is fairly niche, but Rust compiler hackers were trying to include
@@ -161,7 +161,7 @@ \section*{Lies from Counters}
 To make counters as deterministic as possible:
 \begin{itemize}[noitemsep]
-\item disable Address Space Layout Randomization (randomized pointer addresses affect hash layouts);
+\item disable Address Space Layout Randomization (security mitigation; but, randomized pointer addresses affect hash layouts);
 \item subtract time spent processing interrupts (IRQs);
 \item profile one thread only (if you can, in your context).
 \end{itemize}