jrp2014/smuggler2

smuggler2

MPL-2.0 license Smuggler2 Build Status Hackage Stackage

Smuggler2 is a Haskell GHC Source Plugin that automatically

  • rewrites module imports to produce a minimal set. This may make code easier to read because the provenance of imported names is explicit.

  • adds or replaces explicit exports to produce a maximalist set for hand pruning. All values, types and classes defined in a module are exported (excluding those that are imported). It does not check whether an exported name is used elsewhere in your package. Limiting exports may make it easier for ghc to optimise some code.

The Haskell Wiki sets out the pros and cons of using explicit import lists. Smuggler2 offers the option of leaving a module's imports open (by not specifying explicitly what is to be imported from them) while developing and then getting Smuggler2 to add minimal explicit import lists. This helps to document modules and, arguably, makes them easier to read by avoiding the need to qualify names to give an indication of where they came from. It can also provide a cross-check that only expected names are being used.
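As a rough illustration of the import rewriting described above, the following sketch shows a small module before and after. The module and its `shout` function are hypothetical, and the exact layout that Smuggler2 emits may differ; only the idea of replacing open imports with explicit lists is taken from the description above.

```haskell
module Main (main) where

-- Before (as you might write it while developing):
--   import Data.Char
--   import Data.List
--
-- After: explicit import lists naming only what is actually used.
import Data.Char (toUpper)
import Data.List (intercalate, sort)

-- Hypothetical helper: sort the words, upper-case them, join with commas.
shout :: [String] -> String
shout = intercalate ", " . map (map toUpper) . sort

main :: IO ()
main = putStrLn (shout ["pear", "apple"])
```

With the explicit lists in place, a reader can tell at a glance that `toUpper` comes from `Data.Char` and that `intercalate` and `sort` come from `Data.List`.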

How to use

Install smuggler2 using cabal install --lib smuggler2.

Adding Smuggler2 to your Cabal dependencies

Add smuggler2 to the dependencies of your project and to your compiler flags. For example, you could include in your project cabal file something like

flag smuggler2
  description: Rewrite sources to cleanup imports, and create explicit exports
  default:     False
  manual:      True

common smuggler-options
  if flag(smuggler2)
    ghc-options: -fplugin=Smuggler2.Plugin
    build-depends: smuggler2 >= 0.3 && < 0.4

and then import: smuggler-options in the appropriate library or executable sections.
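For instance, a library section might pull in the common stanza as sketched below. The section contents (`src`, `MyLib` and the `base` dependency) are placeholders rather than anything taken from this repository. Common stanzas require `cabal-version: 2.2` or later at the top of the .cabal file.

```cabal
library
  import:           smuggler-options
  hs-source-dirs:   src
  exposed-modules:  MyLib
  build-depends:    base
  default-language: Haskell2010
```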

The use of the flag allows you to build with or without source processing. For example,

$ cabal build -fsmuggler2

using the example above.

You might use this approach to refine your imports or get a starting point for your exports, but not rewrite them every time you compile. The use of a flag means that you can also exclude smuggler2 dependencies from your final builds.

Alternatively, using a local version

If you have installed smuggler2 from a local copy of this repository, you may need to add -package-env default -package smuggler2 to your ghc-options if you did not install using the --lib flag to cabal install. (This will depend on your setup and your version of cabal.)

Or use a ghc wrapper

The repository also has a very simple ghc wrapper, ghc-smuggler2, in the app folder that you can tweak to accommodate your local build environment. This allows you to run the plugin over your sources without modifying your .cabal file:

$ cabal build --with-compiler=ghc-smuggler2

or just

$ cabal build -w ghc-smuggler2

You can just run ghcid as usual if you have set up your cabal file to run the plugin.

$ ghcid --command='cabal repl'

Options

Smuggler2 has several (case-insensitive) options, which can be set by adding -fplugin-opt=Smuggler2.Plugin: flags:

  • NoImportProcessing - do no import processing

  • PreserveInstanceImports - remove unused imports, but preserve a library import stub, such as import Mod (), to import only instances of typeclasses from it. (The default.)

  • MinimiseImports - remove unused imports, including any that may be needed only to import typeclass instances. This may, therefore, stop the module from compiling.

  • NoExportProcessing - do no export processing

  • AddExplicitExports - add an explicit list of all available exports (excluding those that are imported) if there is no existing export list. (The default.) You may want to edit it to keep specific values, types or classes local to the module. At present, a typeclass and its class methods are exported individually. You may want to replace those exports with an abbreviation such as C(..).

  • ReplaceExports - replace any existing module export list with one containing all available exports (which, again, you can, of course, then prune to your requirements).

  • LeaveOpenImports and MakeOpenImports take a comma-separated list of module names. The specified modules are to be left open if they were open in the source (in the case of LeaveOpenImports) and made open even if they were not originally (in the case of MakeOpenImports). For example, you could add

    -fplugin-opt=Smuggler2.Plugin:LeaveOpenImports:Relude,RIO,Prelude,Some.Module

    This may be helpful if you use ghc's NoImplicitPrelude language feature and import a prelude manually.

    If the PreserveInstanceImports option was specified, the LeaveOpenImports and MakeOpenImports options override it for the specified modules. They have no effect if NoImportProcessing was specified. If a module is specified both to be left open and made open, it will be made open.

  • Any other option value is used to generate a source file with a new extension of the option value (new in the following example) rather than replacing the original file.

    ghc-options: -fplugin=Smuggler2.Plugin -fplugin-opt=Smuggler2.Plugin:new

    This will create output files with a .new suffix rather than overwriting the originals.
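To see why PreserveInstanceImports is the default, consider a module that needs an import only for an instance. The sketch below uses `Text.Show.Functions` from base, which provides an orphan `Show` instance for functions; the choice of module and the `describe` helper are only illustrations. If the `import Text.Show.Functions ()` line were removed, as would be expected under MinimiseImports, `show f` would have no instance to use and the module would no longer compile.

```haskell
module Main (main) where

-- No names are imported; the module is needed only for the orphan
-- instance `Show (a -> b)` that it defines.
import Text.Show.Functions ()

-- Hypothetical helper that relies on the Show instance for functions.
describe :: (Bool -> Bool) -> String
describe f = "got " ++ show f

main :: IO ()
main = putStrLn (describe not)
```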

Caveats

  • Smuggler2 rewrites the existing imports, rather than attempting to prune them. (This is a more aggressive approach than smuggler, which focuses on removing redundant imports.) It has advantages and disadvantages. The advantage is that a minimal set of imports is generated in a reproducible format. So you can just import a library without specifying any specific imports and Smuggler2 will add an explicit list of things that are used from it. This can be a useful check and better documents your modules. The disadvantage is that imports may be reordered, comments and blank lines dropped, internal imports mixed with external ones, etc.
  • By default Smuggler2 does not remove imports completely, because an import may be used only to import instances of typeclasses. So it will leave stubs like

    import Mod ()

    that you may want to remove manually. Alternatively use the MinimiseImports option to remove them anyway, at the risk of producing code that fails to compile.

  • CPP files will not be processed correctly: the imports will be generated for current CPP settings and any CPP annotations in the import block will be discarded. This may be a particular problem if you are writing code for several generations of ghc and base, for example. Nevertheless, Smuggler2 will generate a new CPP preprocessed output file with a -cpp suffix. retrie solves this problem by generating all possible versions of the module (exponential in the number of #if directives), operating on each version individually, and splicing results back into the original file. A tour de force!

  • Because cabal and ghc don't have full support for distinguishing dependent packages from plugins, you will probably want to build your project's dependencies (which are installed into your local package db) first, before enabling Smuggler2 or ghc-smuggler2. Otherwise they will all be processed by it too as your project builds, which should do no harm but will increase your build time:

    $ cabal build --dependencies-only
    $ cabal clean
    $ cabal build -w ghc-smuggler2

  • If you import pattern synonyms from a library without naming them explicitly in an import list, you do not need the PatternSynonyms language extension. If you import them explicitly, using the pattern keyword, the language extension is required (otherwise you will just get a syntax error on compilation). Smuggler2.Plugin will not add that for you.

  • Multiple separate import lines referring to the same library are not consolidated

  • Literate Haskell .lhs files will be processed into ordinary Haskell files with a -lhs suffix.

  • hiding imports are not needed and are replaced by explicit ones.
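The last caveat can be illustrated with a small sketch. The module below is hypothetical: it defines its own `insert`, which would clash with `Data.List.insert` under an open import. Once the import names only what is used, there is nothing left to hide.

```haskell
module Main (main) where

-- Before (hypothetical input):
--   import Data.List hiding (insert)
--
-- After: only the names actually used are listed, so nothing needs hiding.
import Data.List (sort)

-- A local definition whose name clashes with Data.List.insert.
insert :: Int -> [Int] -> [Int]
insert x xs = sort (x : xs)

main :: IO ()
main = print (insert 2 [3, 1])
```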

Smuggler2 is robust -- it can chew through the Agda codebase of over 370 modules with complex interdependencies and be tripped over by only

  • a couple of ambiguous exports (are we trying to export something defined in the current module or something with the same name from an imported module)

  • and a couple of imports where both qualified and unqualified versions of the module are imported and there are references to both qualified and unqualified versions of the same names

  • smuggler2 depends on the current ghc compiler and base library to check whether an import is redundant. Different versions of the compiler may, of course, need slightly different imports, typically from base. The base library changelog provides some details of what was made available when.

  • The plugin does not run reliably on Windows with versions of ghc prior to 8.10.3. This is probably more of an issue with the way that the tests are run, than Smuggler2 itself.

  • Currently cabal does not have a particular way of specifying plugins that would allow cleaner separation of user code and plugin code. (See, e.g., https://gitlab.haskell.org/ghc/ghc/issues/11244 and haskell/cabal#2965.)

For contributors

Requirements:

  • ghc-8.6.5, ghc-8.8.3 and ghc-8.10.1: Smuggler2 will not compile with earlier versions.
  • The test golden values are for ghc-8.10.1 and ghc-8.8.3. Some of them fail on ghc-8.6.5 because it seems to need to import Data.Bool whereas later versions of GHC don't. The results compile on ghc-8.6.5 and later anyway, but the imports are not as minimal for later versions as they could be. ghc-exactprint 0.6.3.1 adds extra '\r' characters inside comments under Windows, so the tests fail there.
  • cabal >= 3.0 (ideally 3.2)
  • The Windows version of the plugin is a bit flaky because of apparent compiler bugs.

How to build

There is a Makefile at the root of the distribution that covers various maintenance tasks, including building the package.

$ cabal update
$ cabal build --write-ghc-environment-files=always

Writing the ghc environment file allows tests to be run from within the repository using ghc -fplugin=Smuggler2.Plugin without needing to use cabal exec -- ghc -fplugin=Smuggler2.Plugin or a -package smuggler2 flag. Run cabal clean to get rid of the environment file when you are done, to avoid surprises.

To build with debugging:

$ cabal build -fdebug --write-ghc-environment-files=always

Currently this just adds an -fdump-minimal-imports parameter to GHC compilation.

How to run tests

There is a tasty-golden-based test suite that can be run by

$ cabal test smuggler2-test --enable-tests

Further help can be found by

$ cabal run smuggler2-test -- --help

(note the extra --)

For example, if you are running on ghc-8.6.5 you can

$ cabal run smuggler2-test -- --accept

to update the golden outputs to the current results of (failing) tests.

It is sometimes necessary to run cabal clean before running tests to ensure that old build artefacts do not lead to misleading results.

Importing a test module from another test module in the same directory is likely to lead to race conditions as 'Tasty' runs tests in parallel and so will try to generate the same smuggler2 output both when the imported module is being tested directly and when it is being processed when the importing module is being tested. Put the imported module in a subdirectory to avoid this issue, as the test harness only looks for tests in test\tests and not its subdirectories.

Implementation approach

smuggler2 uses the ghc-exactprint library to modify the source code. The documentation for the library is fairly spartan, and the library is not widely used, at least in publicly available code, so the use here can, no doubt, be optimised.

The library is needed because the annotated AST that GHC alone generates does not have enough information to reconstitute the original source. Some parts of the renamed syntax tree (for example, imports) are not found in the typechecked one. ghc-exactprint provides parsers that preserve this information, which is stored in a separate Anns Map used to generate properly formatted source text.

To make manipulation of GHC's AST and ghc-exactprint's Anns easier, ghc-exactprint provides a set of Transform functions. These are intended to facilitate making changes to the AST and adjusting the Anns to suit the changes.

These functions are said to be under heavy development. The approach provided by retrie wraps an AST and Anns into a single type that seems to make AST transformations easier to compose and reduces the risk of the Anns and AST getting out of sync as it is being transformed, something with which the type system doesn't help you since the Anns are stored as a Map. (That approach is not used by smuggler2.)
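The idea of keeping a tree and its annotations in one value can be sketched with toy types. Everything below (`Imports`, `Anns`, `Annotated`, `dropImport`) is a hypothetical stand-in and not the ghc-exactprint or retrie API; it only shows how a single wrapper lets one transformation update both parts together.

```haskell
module Main (main) where

-- Toy stand-ins, NOT the real ghc-exactprint or retrie types.
type Name = String
type Anns = [(Name, String)] -- one annotation (say, a comment) per import

newtype Imports = Imports [Name] deriving (Eq, Show)

-- Pair the tree with its annotations so that they travel together.
data Annotated a = Annotated { tree :: a, anns :: Anns } deriving (Eq, Show)

-- Removing an import drops its annotation in the same step,
-- so the two cannot drift out of sync.
dropImport :: Name -> Annotated Imports -> Annotated Imports
dropImport n (Annotated (Imports ns) notes) =
  Annotated (Imports (filter (/= n) ns)) (filter ((/= n) . fst) notes)

sample :: Annotated Imports
sample =
  Annotated
    (Imports ["Data.Char", "Data.List"])
    [("Data.Char", "-- for toUpper"), ("Data.List", "-- for sort")]

main :: IO ()
main = print (dropImport "Data.List" sample)
```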

Imports

smuggler2 uses GHC to generate a set of minimal imports. It

  • parses the original file
  • dumps the minimal imports that GHC generates and parses them back in (to pick up the annotations needed for printing). The ghc version of getMinimalImports does not handle pattern and type imports correctly. smuggler2 uses a fixed version of that function.
  • drops implicit imports (such as Prelude) and, optionally, imports that are for instances only
  • replaces the original imports with minimal ones
  • exactPrints the result back over the original file (or one with a different suffix, if that was specified as an option to smuggler2)

This round tripping is needed because the AST that ghc provides does not have enough information in it to reconstitute the source (which is why ghc-exactprint exists).
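The filtering and printing steps above can be modelled with a toy sketch. It does not use the GHC API or ghc-exactprint: the `Import` record, the `keep` and `render` helpers and the `InstanceImports` type are assumptions made for illustration, with constructor names borrowed from the plugin options.

```haskell
module Main (main) where

import Data.List (intercalate)

-- Toy model of a minimal import: a module name and the names used from it.
data Import = Import { modName :: String, usedNames :: [String] }

-- Hypothetical stand-in for the two instance-import options.
data InstanceImports = PreserveInstanceImports | MinimiseImports

-- Decide whether a minimal import survives the rewrite.
keep :: InstanceImports -> Import -> Bool
keep _ (Import "Prelude" _) = False        -- implicit import is dropped
keep MinimiseImports (Import _ []) = False -- instance-only stub is dropped
keep _ _ = True

-- Print an import with an explicit list; an empty list gives `import M ()`.
render :: Import -> String
render (Import m ns) = "import " ++ m ++ " (" ++ intercalate ", " ns ++ ")"

rewrite :: InstanceImports -> [Import] -> [String]
rewrite opt = map render . filter (keep opt)

main :: IO ()
main =
  mapM_ putStrLn $
    rewrite
      PreserveInstanceImports
      [ Import "Prelude" ["map"]
      , Import "Data.Char" ["toUpper"]
      , Import "Text.Show.Functions" []
      ]
```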

Exports

Exports are simpler to deal with as GHC's exports_from_avail does the work.
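As a sketch of what export processing amounts to, consider the hypothetical module below. The comments show an assumed maximalist list in which the class and its method are named separately (as noted under AddExplicitExports); the precise form that Smuggler2 prints may differ. The live header shows the list after hand editing, with the class abbreviated as `HasArea (..)`.

```haskell
-- Hypothetical module; before processing it had no export list:
--   module Shapes where
--
-- An assumed maximalist list, naming the class and its method separately:
--   module Shapes (HasArea, area, Square (..), unitSquare) where
--
-- After hand editing, with the class abbreviated as HasArea (..):
module Shapes
  ( HasArea (..)
  , Square (..)
  , unitSquare
  ) where

class HasArea a where
  area :: a -> Double

newtype Square = Square Double

instance HasArea Square where
  area (Square s) = s * s

unitSquare :: Square
unitSquare = Square 1
```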

Other projects

  • Smuggler2 is a rewrite of smuggler that rewrites rather than just pruning existing imports.
  • retrie, a code modding tool that works with GHC 8.10.1.
  • refact-global-hse, an ambitious import refactoring tool. This uses haskell-src-exts rather than ghc-exactprint and so may not work with current versions of GHC.
  • These blog posts contain some fragments on the topic of using ghc-exactprint to manipulate import lists: Terser import declarations and GHC API. (The site doesn't always seem to be up.)

Acknowledgements

Thanks to

  • Dmitrii Kovanikov and Veronika Romashkina who wrote smuggler
  • Alan Zimmerman and Matthew Pickering for ghc-exactprint
  • The ghc authors who have made the compiler internals available through an API.
