Skip to content

Commit

Permalink
Render text less
Browse files Browse the repository at this point in the history
There is a big performance hit w.r.t. rendering text.

Here is flamegraph of ghc-events-analyze after this patch which
illustrates the problem: http://fuuzetsu.co.uk/images/1482368246.png

The reason for this is diagrams/SVGFonts#16 ;

For now this commit tries to mitigate the issue:

* use a simpler font for vector graphics, `bit`, which loads much, much faster; the downside
  is it's ugly and has fewer characters; once upstream is fixed, we can
  revert and we should be fast(-ish)

* Use `text` from diagrams for thread names; there is a problem here in
  that this `text` has no space as far as diagrams is concerned which
  means we can not tell how much space it occupies and therefore how
  much to move the boxes. We work around this by finding longest thread
  name, rendering it, taking width of that and assuming that's the
  widest one. Of course that's not true in general, there are wide
  glyphs &c. but worst-case scenario, we just overlap the first few
  blocks (and even then probably just the first, unused one).

* For rendering the timeline numbers, memoize last rendering. The times
  on timeline are in strictly increasing order which means for
  `bucketTimes` values, we only render `length (nub bucketTimes)` times
  and not `length (bucketTimes)`.

By the way here is the current heap:
http://fuuzetsu.co.uk/images/1482368318.png

and prof

```
	Thu Dec 22 00:52 2016 Time and Allocation Profiling Report  (Final)

	   ghc-events-analyze +RTS -p -L40 -hc -s -RTS message-handling.eventlog

	total time  =        6.70 secs   (6697 ticks @ 1000 us, 1 processor)
	total alloc = 11,355,497,656 bytes  (excludes profiling overheads)

COST CENTRE                 MODULE                                %time %alloc

dot                         Linear.V2                               7.2   14.4
fmap                        Linear.V2                               6.1    5.0
atParam                     Diagrams.Segment                        4.3    7.5
moveOriginTo.\              Diagrams.Core.Envelope                  2.9    5.5
floatToDigits.gen           Data.Text.Lazy.Builder.RealFloat        2.4    1.5
<>                          Diagrams.Core.Envelope                  2.4    1.2
renderAttr.\                Graphics.Rendering.SVG                  2.3    1.6
floatToDigits               Data.Text.Lazy.Builder.RealFloat        2.3    2.5
getEnvelope.\               Diagrams.Segment                        2.0    4.3
formatRealFloat.doFmt       Data.Text.Lazy.Builder.RealFloat        2.0    1.0
transferFunction            Data.Colour.SRGB                        1.7    1.2
^                           Diagrams.Solve.Polynomial               1.7    1.3
getAttr                     Diagrams.Core.Style                     1.7    0.8
getAttr.ty                  Diagrams.Core.Style                     1.7    0.9
buildAttr                   Graphics.Svg.Core                       1.6    2.7
measure                     Diagrams.Segment                        1.5    1.4
floatToDigits.gen.(...)     Data.Text.Lazy.Builder.RealFloat        1.5    2.0
getEnvelope.\.\             Diagrams.Segment                        1.4    2.2
formatRealFloat             Data.Text.Lazy.Builder.RealFloat        1.3    0.6
quadForm                    Diagrams.Solve.Polynomial               1.1    1.2
^+^                         Linear.Vector                           1.1    0.8
concat.ts'                  Data.Text                               1.0    1.0
getEnvelope.\               Diagrams.Segment                        1.0    1.6
formatRealFloat.doFmt.(...) Data.Text.Lazy.Builder.RealFloat        0.7    1.2
writeReport'.writeLine      GHC.RTS.Events.Analyze.Reports.Timed    0.7    1.5
fromHtmlEscapedText         Blaze.ByteString.Builder.Html.Utf8      0.6    2.1
```

```
Generated message-handling.totals.txt using default script
Generated message-handling.timed.svg using default script
Generated message-handling.timed.txt using default script
  18,049,345,672 bytes allocated in the heap
  16,302,984,912 bytes copied during GC
     204,379,936 bytes maximum residency (114 sample(s))
       1,706,848 bytes maximum slop
             455 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     34700 colls,     0 par    0.976s   0.975s     0.0000s    0.0003s
  Gen  1       114 colls,     0 par   12.782s  12.796s     0.1122s    0.2042s

  INIT    time    0.001s  (  0.001s elapsed)
  MUT     time    9.758s  ( 15.550s elapsed)
  GC      time    7.984s  (  7.989s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time    5.774s  (  5.782s elapsed)
  EXIT    time    0.002s  (  0.002s elapsed)
  Total   time   23.522s  ( 23.542s elapsed)

  %GC     time      33.9%  (33.9% elapsed)

  Alloc rate    1,849,658,888 bytes per MUT second

  Productivity  41.5% of total user, 41.5% of total elapsed
```

This was ran on 819KB eventlog.

In comparison, here is the situation before this and the previous
commit (but mostly this):

Heap: http://fuuzetsu.co.uk/images/1482369579.png ; many more times
memory usage
Flamegraph http://fuuzetsu.co.uk/images/1482369638.png ; looks a bit
different because all that complex text that it's generating has to be
stuck together but if we zoom in a little it's easy to see where time is
going http://fuuzetsu.co.uk/images/1482369694.png

Finally profiling and GC stats:

```
	Thu Dec 22 01:08 2016 Time and Allocation Profiling Report  (Final)

	   ghc-events-analyze +RTS -p -L40 -hc -s -RTS message-handling.eventlog

	total time  =       41.95 secs   (41955 ticks @ 1000 us, 1 processor)
	total alloc = 81,752,383,184 bytes  (excludes profiling overheads)

COST CENTRE             MODULE                                %time %alloc

atParam                 Diagrams.Segment                        9.3   15.8
fmap                    Linear.V2                               8.3    6.6
dot                     Linear.V2                               6.6    9.9
^                       Diagrams.Solve.Polynomial               5.7    3.6
getEnvelope.\           Diagrams.Segment                        4.2    5.0
quadForm                Diagrams.Solve.Polynomial               4.0    3.4
floatToDigits.gen       Data.Text.Lazy.Builder.RealFloat        3.7    2.4
^+^                     Linear.Vector                           3.1    2.1
floatToDigits.gen.(...) Data.Text.Lazy.Builder.RealFloat        2.7    3.0
quadForm.q              Diagrams.Solve.Polynomial               2.5    2.3
writeReport'.showValue  GHC.RTS.Events.Analyze.Reports.Timed    2.4    3.3
moveOriginTo.\          Diagrams.Core.Envelope                  2.1    2.8
>>=                     Text.Parsec.Prim                        1.9    2.4
floatToDigits           Data.Text.Lazy.Builder.RealFloat        1.9    2.7
getEnvelope.\.\         Diagrams.Segment                        1.9    2.4
formatRealFloat.doFmt   Data.Text.Lazy.Builder.RealFloat        1.7    1.0
measure                 Diagrams.Segment                        1.5    1.0
<>                      Diagrams.Core.Envelope                  1.4    0.7
quadForm.d              Diagrams.Solve.Polynomial               1.1    1.3
<*>                     Linear.V2                               1.1    0.2
floatToDigits.gen.mUpN' Data.Text.Lazy.Builder.RealFloat        1.1    0.4
satisfy                 Text.Parsec.Char                        1.1    1.5
formatRealFloat         Data.Text.Lazy.Builder.RealFloat        1.0    0.6
fromHtmlEscapedText     Blaze.ByteString.Builder.Html.Utf8      0.3    1.2
append.x                Data.Text                               0.2    1.5
```

```
Generated message-handling.totals.txt using default script
Generated message-handling.timed.svg using default script
Generated message-handling.timed.txt using default script
 129,320,970,480 bytes allocated in the heap
 499,460,983,080 bytes copied during GC
   1,516,879,816 bytes maximum residency (587 sample(s))
       8,188,512 bytes maximum slop
            2968 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     248630 colls,     0 par    4.016s   4.011s     0.0000s    0.0003s
  Gen  1       587 colls,     0 par   419.172s  420.109s     0.7157s    1.4724s

  INIT    time    0.002s  (  0.002s elapsed)
  MUT     time   56.979s  (262.965s elapsed)
  GC      time  217.769s  (218.280s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time  205.420s  (205.839s elapsed)
  EXIT    time    0.027s  (  0.027s elapsed)
  Total   time  480.199s  (481.274s elapsed)

  %GC     time      45.3%  (45.4% elapsed)

  Alloc rate    2,269,632,625 bytes per MUT second

  Productivity  11.9% of total user, 11.8% of total elapsed
```
We have around an order of magnitude improvement in time and space

We can do more of course but I would consider these short-term fixes. I
plan to see what's happening in SVGFonts before pursuing more issues
here.

I don't want to have to go for a walk when the input size is few tens of
MBs.
  • Loading branch information
Fuuzetsu committed Dec 22, 2016
1 parent 67f68c0 commit 48b3dfc
Showing 1 changed file with 67 additions and 32 deletions.
99 changes: 67 additions & 32 deletions src/GHC/RTS/Events/Analyze/Reports/Timed/SVG.hs
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ module GHC.RTS.Events.Analyze.Reports.Timed.SVG (
writeReport
) where

import Data.Maybe (catMaybes)
import Data.List (foldl')
import Data.Monoid ((<>))
import Diagrams.Backend.SVG (B, renderSVG)
import Diagrams.Prelude (QDiagram, Colour, V2, N, Any, (#), (|||))
Expand All @@ -15,6 +15,7 @@ import qualified Data.Map as Map
import qualified Diagrams.Prelude as D
import qualified Graphics.SVGFonts.Fonts as F
import qualified Graphics.SVGFonts.Text as F
import qualified Diagrams.TwoD.Text as TT

#if !MIN_VERSION_base(4,8,0)
import Data.Monoid (mempty, mconcat)
Expand All @@ -24,7 +25,7 @@ import GHC.RTS.Events.Analyze.Types
import GHC.RTS.Events.Analyze.Reports.Timed hiding (writeReport)

writeReport :: Options -> Quantized -> Report -> FilePath -> IO ()
writeReport options quantized report path =
writeReport options quantized report path = do
uncurry (renderSVG path) $ renderReport options quantized report

type D = QDiagram B V2 (N B) Any
Expand All @@ -44,7 +45,6 @@ renderReport options@Options{..}
rendered = D.vcat $ map (uncurry renderSVGFragment)
$ zip (cycle [D.white, D.ghostwhite])
(SVGTimeline : fragments)

fragments :: [SVGFragment]
fragments = map (renderFragment options) $ zip report (cycle allColors)

Expand All @@ -53,7 +53,7 @@ renderReport options@Options{..}
padHeader (2 * optionsBucketHeight) title
renderSVGFragment bg (SVGLine header blocks) =
-- Add empty block at the start so that the whole thing doesn't shift up
(padHeader optionsBucketHeight header ||| (blocks <> (block options 0 # D.lw D.none)))
(padHeader optionsBucketHeight (renderText header (optionsBucketHeight + 2)) ||| (blocks <> (block options 0 # D.lw D.none)))
`D.atop`
(D.rect lineWidth optionsBucketHeight # D.alignL # D.fc bg # D.lw D.none)
renderSVGFragment _ SVGTimeline =
Expand All @@ -68,29 +68,37 @@ renderReport options@Options{..}
D.translateX (0.5 * optionsBucketWidth) h
<> D.rect headerWidth height # D.alignL # D.lw D.none

-- optimisation: find the longest text header, render
-- it then check the rendered size and use that for
-- width; it does not necessarily mean it's the right
-- width to use but it's good enough considering speed
-- trade-off
headerWidth :: Double
headerWidth = optionsBucketWidth -- extra padding
+ (maximum . catMaybes . map headerWidthOf $ fragments)
headerWidth = optionsBucketWidth + widestHeader -- extra padding

headerWidthOf :: SVGFragment -> Maybe Double
headerWidthOf (SVGLine header _) = Just (D.width header)
headerWidthOf _ = Nothing
widestHeader :: Double
widestHeader =
let headers = [ (header, length header) | SVGLine header _ <- fragments ]
(maxHeader, _) = foldl' (\(s, l) (s', l') ->
if l' > l then (s', l') else (s, l))
("", 0) headers
in D.width $! mkSVGText maxHeader (optionsBucketHeight + 2)

data SVGFragment =
SVGTimeline
| SVGSection D
| SVGLine D D
| SVGLine String D

renderFragment :: Options -> (ReportFragment, Colour Double) -> SVGFragment
renderFragment options@Options{..} = go
where
go :: (ReportFragment, Colour Double) -> SVGFragment
go (ReportSection title,_) = SVGSection (renderText title (optionsBucketHeight + 2))
go (ReportSection title,_) = SVGSection (mkSVGText title (optionsBucketHeight + 2))
go (ReportLine line,c) = uncurry SVGLine $ renderLine options c line

renderLine :: Options -> Colour Double -> ReportLine -> (D, D)
renderLine :: Options -> Colour Double -> ReportLine -> (String, D)
renderLine options@Options{..} lc line@ReportLineData{..} =
( renderText lineHeader (optionsBucketHeight + 2)
( lineHeader -- renderText lineHeader (optionsBucketHeight + 2)
, blocks lc <> bgBlocks options lineBackground
)
where
Expand Down Expand Up @@ -119,23 +127,34 @@ bgBlocks options = go
| b <- [fr .. to]
]

renderText :: String -> Double -> D
renderText str size =
D.stroke textSVG # D.fc D.black # D.lc D.black # D.alignL # D.lw D.none
-- | Create a text diagram that is sized (as opposed to 'renderText').
-- The problem with this function is that it's *extremely* slow and
-- memory hungry in comparison to something simple like 'TT.text'.
-- This function should therefore be used as little as possible.
mkSVGText :: String -> Double -> D
mkSVGText str size =
D.stroke textSVG # D.fc D.black # D.lc D.black # D.alignL # D.lw D.none
where
textSVG = F.textSVG' (textOpts size) str

textOpts :: Double -> TextOpts Double
textOpts size =
TextOpts {
textFont = F.lin
textFont = F.bit
, mode = F.INSIDE_H
, spacing = F.KERN
, underline = False
, textWidth = 0 -- not important
, textHeight = size
}

-- | Render text with diagram's own engine. The issue with this text
-- is that it has no size: we can not tell how wide it is. For a
-- sized-text see 'mkSVGText'.
renderText :: String -> Double -> D
renderText str size =
TT.fontSizeL (size / 2) $ TT.alignedText 0 0.5 str

-- | Translate quantized value to opacity
--
-- For every event and every bucket we record the percentage of that bucket
Expand All @@ -160,27 +179,43 @@ block Options{..} i =

timeline :: Options -> Int -> Timestamp -> D
timeline Options{..} numBuckets bucketSize =
mconcat [ timelineBlock tb # D.translateX (fromIntegral tb * timelineBlockWidth)
| -- bucket number
b <- [0 .. numBuckets - 1]
-- timeline block number, index within this timeline block @(0 .. optionsTickEvery - 1)@
, let (tb, tidx) = b `divMod` optionsTickEvery
-- we show the timeline block when the index is 0
, tidx == 0
]
let timeBlocks = [ tb
| b <- [0 .. numBuckets - 1]
-- timeline block number, index within this timeline block @(0 .. optionsTickEvery - 1)@
, let (tb, tidx) = b `divMod` optionsTickEvery
-- we show the timeline block when the index is 0
, tidx == 0 ]

-- memoize the rendering of the last time label: if it's the same
-- for the next 10 displays, why render it 10 times? Text is expensive.
in case foldl' (\acc tb -> timelineBlock acc tb) mempty timeBlocks of
(_, _, fullDiag) -> fullDiag

where
timelineBlockWidth :: Double
timelineBlockWidth = fromIntegral optionsTickEvery * optionsBucketWidth

moveAlongTimeline :: Int -> D -> D
moveAlongTimeline tb = D.translateX (fromIntegral tb * timelineBlockWidth)

-- Single block on the time-line; every 5 blocks a larger line and a time
-- label; for the remainder just a shorter line
timelineBlock :: Int -> D
timelineBlock tb
| tb `rem` 5 == 0
= D.strokeLine bigLine # D.lw (D.local 0.5)
<> (renderText (bucketTime tb) optionsBucketHeight # D.translateY (optionsBucketHeight - 2))
| otherwise
= D.strokeLine smallLine # D.lw (D.local 0.5) # D.translateY 1
timelineBlock :: (String, D, D) -> Int -> (String, D, D)
timelineBlock (lastStr, lastNumD, fullDiag) tb
| tb `rem` 5 == 0 =
let btime = bucketTime tb
myNum = if lastStr == btime
then lastNumD
else mkSVGText btime optionsBucketHeight # D.translateY (optionsBucketHeight - 2)
myDiag = D.strokeLine bigLine # D.lw (D.local 0.5) <> myNum
in (btime, myNum, fullDiag <> myDiag # moveAlongTimeline tb)
| otherwise =
let myDiag :: D
myDiag = D.strokeLine smallLine
# D.lw (D.local 0.5)
# D.translateY 1
# moveAlongTimeline tb
in (lastStr, lastNumD, fullDiag <> myDiag)

bucketTime :: Int -> String
bucketTime tb = case optionsGranularity of
Expand Down

0 comments on commit 48b3dfc

Please sign in to comment.