Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up local reification by storing Map Names in DsM #130

Open
RyanGlScott opened this issue Feb 4, 2020 · 3 comments
Open

Speed up local reification by storing Map Names in DsM #130

RyanGlScott opened this issue Feb 4, 2020 · 3 comments

Comments

@RyanGlScott
Copy link
Collaborator

Reifying things in Q (e.g., with reify or reifyFixity) is quite cheap, as it essentially amounts to looking up what a Name maps to in an IntMap. Reifying things in DsM (i.e., local reification), on the other hand, is not nearly as cheap. A lookup in DsM effectively requires scanning through the entire list of local Decs until a match for the Name is found.

This isn't so unbearable for one-off reifications, but for programs that repeatedly reify things, this can add up. As one data point, the reifyWithLocals_maybe function took up a significant portion of the time spent in the profiling report for the program in #123 (comment). The singletons library also uses local reification extensively, and the amount of local reifications will only go up after goldfirere/singletons#432 lands. In light of this, it may be worthwhile to try and speed up the asymptotic complexity of local reification.

One possibility is to trade space for time by storing extra information in DsM to speed up local reification. I'm envisioning something like this:

newtype DsM q a = DsM (ReaderT DsEnv q a)

data DsEnv = DsEnv
  { dsEnvDecs      :: [Dec]
  , dsEnvDecMap    :: Map Name Info
  , dsEnvFixityMap :: Map Name Fixity
  , dsEnvTypeMap   :: Map Name Type
  }

class DsMonad m where
  localDeclarations     :: m [Dec]
  reifyWithLocals_maybe :: Name -> m (Maybe Info)
  reifyFixityWithLocals :: Name -> m (Maybe Fixity)
  reifyTypeWithLocals   :: Name -> m (Maybe Type)

instance DsMonad (DsM q) where
  localDeclarations       = DsM $ asks dsEnvDecs
  reifyWithLocals_maybe n = DsM $ asks (lookup n . dsEnvDecMap)
  reifyFixityWithLocals n = DsM $ asks (lookup n . dsEnvFixityMap)
  reifyTypeWithLocals   n = DsM $ asks (lookup n . dsEnvTypeMap)

The dsEnvDecs field of DsEnv would behave just as it does currently: whenever withLocalDeclarations is invoked, the new Decs would get added on. In addition, the new Map fields would also gain new entries. These Map fields would then become the engine that powers the new-and-improved reifyWithLocals_maybe, reifyFixityWithLocals, reifyTypeWithLocals functions, which constitute the heart of L.H.T.Desugar.Reify. This turns local reification from an O(n) operation to an O(log n) operation.

@RyanGlScott
Copy link
Collaborator Author

One thing I omitted in #130 (comment) is that we'll need to update the definition of withLocalDeclarations. Here is a sketch of what that might look like:

withLocalDeclarations :: DsMonad q => [Dec] -> DsM q a -> q a
withLocalDeclarations new_decs (DsM x) = do
  env <- ask
  let new_infos    = ...
      new_fixities = ...
      new_types    = ...
      env' = DsEnv { dsEnvDecs      = orig_decs ++ new_decs
                   , dsEnvDecMap    = combine (dsEnvDecMap    env) new_infos
                   , dsEnvFixityMap = combine (dsEnvFixityMap env) new_fixities
                   , dsEnvTypes     = combine (dsEnvTypeMap   env) new_types
                   }
  runReaderT x env'

combine :: Map Name a -> Map Name a -> Map Name a
combine old_map new_map = ???

Question: how should combine be implemented? Should it be this?

combine old_map new_map = Map.union old_map new_map

Or should it be this?

combine old_map new_map = Map.union new_map old_map

As it turns out, only the former version of combine will preserve the current semantics of withLocalDeclarations. Recall that Map.union is left biased, i.e., it prefers entries from the leftmost Map when duplicate keys are encountered. Therefore, Map.union old_map new_map matches what happens today in withLocalDeclarations, as adding a duplicate local declaration will not shadow the existing local declaration. (I only discovered this fact through experimentation, so this ought to be documented more prominently.)

@goldfirere
Copy link
Owner

This certainly looks like a good idea to me. It's kind of amazing that anyone has been hurt by this performance problem, but I guess people actually find singletons useful. :)

@RyanGlScott
Copy link
Collaborator Author

I have pushed a working implementation of this idea to the T130 branch. Although it compiles and passes the test suite, it unfortunately falls short of a more important goal. Namely, it's not any faster than the current approach. To test this, I constructed a pathological program that requires lots of reifications across a couple thousand local declarations:

{-# LANGUAGE TemplateHaskell #-}
module Bug where

import Language.Haskell.TH
import Language.Haskell.TH.Desugar

$(do [d| type A = Either B B
         data A1
         -- data A2 through data A1999 ...
         data A2000

         type G = Either () ()
         type F = Either G G
         type E = Either F F
         type D = Either E E
         type C = Either D D
         type B = Either C C
       |] >>= \decs -> do
          t <- withLocalDeclarations decs (expand (DConT (mkName "A")))
          runIO $ print t
          return [])

Here is the time it takes to compile this program using th-desugar from the current tip of master (e7aa9b0):

$ time /opt/ghc/8.10.1/bin/ghc -fforce-recomp Bug.hs   
Loaded package environment from /home/rgscott/Documents/Hacking/Haskell/cabal-sandbox/.ghc.environment.x86_64-linux-8.10.0.20200123
[1 of 1] Compiling Bug              ( Bug.hs, Bug.o, Bug.dyn_o )
...
real    0m5.777s
user    0m5.496s
sys     0m0.283s

And here is the time it takes to compile this program using th-desugar from the T130 branch (46755c2):

$ time /opt/ghc/8.10.1/bin/ghc -fforce-recomp Bug.hs
Loaded package environment from /home/rgscott/Documents/Hacking/Haskell/cabal-sandbox/.ghc.environment.x86_64-linux-8.10.0.20200123
[1 of 1] Compiling Bug              ( Bug.hs, Bug.o, Bug.dyn_o )
...
real    0m5.707s
user    0m5.417s
sys     0m0.295s

There's hardly any difference overall! I also tried compiling some lengthier test cases (e.g., the singletons library and the eliminators test suite) and observed little difference in compilation times in those examples as well.

Given the lack of noticeable speedups and the fact that the code in the T130 branch is a fair bit more complicated than the status quo, I'm inclined to give up for now and park this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants