3.0.0 - Complete Overhaul
Description
This is a major change! It fundamentally rewrites the core logic for tracking imports and rearranges the arguments to the import tracking functionality. The high-level gist is:
- Rather than capturing imports during the processing of an import_module, the imports are computed after importing the target module by recursively inspecting the bytecode for all modules stemming from the target.
- The tracking no longer needs to launch subprocesses to perform recursion because it does not rely on the diff in sys.modules.
- It's way faster!
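As a rough illustration of what "inspecting the bytecode" can look like, here is a minimal sketch built on the standard library's dis module. The helper name imported_names is hypothetical and this is not the library's actual implementation; it only shows the general idea of collecting IMPORT_NAME instructions from a code object and from any code objects nested inside it (function and class bodies).

```python
import dis
from types import CodeType


def imported_names(code):
    """Collect the module names that a code object asks to import.

    Hypothetical helper for illustration only. Relative imports such as
    `from . import x` show up as an empty string and would need extra
    handling (the import level) to be resolved to a full module name.
    """
    names = set()
    for instr in dis.get_instructions(code):
        if instr.opname == "IMPORT_NAME":
            names.add(instr.argval)
    # Function and class bodies live in nested code objects, so recurse
    for const in code.co_consts:
        if isinstance(const, CodeType):
            names |= imported_names(const)
    return names


source = "import os\nfrom json import dumps\ndef f():\n    import re\n"
code = compile(source, "<example>", "exec")
print(sorted(imported_names(code)))  # ['json', 'os', 're']
```

Because this works from already-compiled code, nothing has to be re-imported in a subprocess to find out what a module asks for.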
But why?
Ok, the old way was working pretty well, so why refactor it all? The obvious answer is speed, but the less obvious answer is actually the correct one: the old implementation was not answering the right question. The old implementation answered the question:
What modules are brought into sys.modules between starting the import of <target> and concluding the import of <target>?
Instead, what we really want to know is:
If we stripped away all code not required for <target>, what modules would we need to have installed for the import of <target> to work?
The difference here comes down to whether you count siblings of nested dependencies. This is much easier to describe with an example:
deep_siblings/
├── __init__.py
├── blocks
│   ├── __init__.py
│   ├── bar_type
│   │   ├── __init__.py
│   │   └── bar.py # imports alog
│   └── foo_type
│       ├── __init__.py
│       └── foo.py # imports yaml
└── workflows
    ├── __init__.py
    └── foo_type
        ├── __init__.py
        └── foo.py # imports ..blocks.foo_type.foo
In this example, under the old implementation, workflows.foo_type.foo would depend on both alog and yaml because the ..blocks portion of the import requires that all of the dependencies of blocks be brought into sys.modules. This, however, voids the value proposition of finding separable import sets. Under the new implementation, workflows.foo_type.foo only depends on yaml because it imports blocks.foo_type.foo from the deepest point where the only requirement is yaml.
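For contrast, the question the old implementation answered can be sketched as a plain diff of sys.modules taken before and after an import. The function name modules_added_by_import is hypothetical and this is not the old code itself; it is only meant to show why sibling dependencies leak in. Everything that lands in sys.modules during the import is counted, whether or not the target actually needs it.

```python
import importlib
import sys


def modules_added_by_import(name):
    """Return the module names that appear in sys.modules while importing `name`.

    Hypothetical helper for illustration only. If `name` (or anything it
    pulls in) was already imported, it will not show up in the result,
    which is one reason a diff-based approach needs a fresh process.
    """
    before = set(sys.modules)
    importlib.import_module(name)
    return set(sys.modules) - before


# colorsys is a small stdlib module that is rarely imported already
print(sorted(modules_added_by_import("colorsys")))
```

Applied to the example above, a diff like this would report alog for workflows.foo_type.foo whenever importing the blocks package happens to pull in bar_type along the way.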
What breaks in the API?
- The side_effects_modules argument is gone. This was a hack to work around the fact that there were some modules that, when trapped by a DeferredModule, would cause the overall import to fail. With the refactor, this is unnecessary as the import proceeds exactly as normal with no interference.
- The output with track_import_stacks is different. It no longer attempts to look like stack traces, but it is actually more useful. Now, instead of a partially-useful stack trace, it's a list of lists where each entry is a stack of module imports that causes the given dependency allocation.
- By default, import_module stops looking for imports at the boundary of the target module's parent library. This means that if a third party module transitively imports another third party module, it won't be allocated to the target unless full_depth=True is given.
- LazyModule is gone! This tool was a bit of a hack anyway and is no longer necessary.
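The library-boundary rule behind full_depth can be pictured with a small sketch. Both helper names below (top_level_package and should_descend) are hypothetical and are not part of the library's API; the sketch assumes the "parent library" is simply the first dotted component of the target module's name.

```python
def top_level_package(module_name):
    """Return the first dotted component, e.g. 'a.b.c' -> 'a'."""
    return module_name.partition(".")[0]


def should_descend(target, imported, full_depth=False):
    """Decide whether to keep looking for imports inside `imported`.

    Hypothetical helper for illustration only. With full_depth=False,
    the search only continues inside modules that belong to the same
    top-level library as the target.
    """
    if full_depth:
        return True
    return top_level_package(imported) == top_level_package(target)


target = "deep_siblings.workflows.foo_type.foo"
print(should_descend(target, "deep_siblings.blocks.foo_type.foo"))  # True
print(should_descend(target, "yaml"))  # False
print(should_descend(target, "yaml", full_depth=True))  # True
```

Under this reading, yaml is still recorded as a dependency of the target, but whatever yaml itself imports is only followed when full_depth=True.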