Torch-TensorRT v2.2.0 #2646
narendasan started this conversation in Show and tell
Dynamo Frontend for Torch-TensorRT, PyTorch 2.2, CUDA 12.1, TensorRT 8.6
Torch-TensorRT 2.2.0 targets PyTorch 2.2, CUDA 12.1 (builds for CUDA 11.8 are available via the PyTorch package index - https://download.pytorch.org/whl/cu118) and TensorRT 8.6. This release is the second major release of Torch-TensorRT, as the default frontend has changed from TorchScript to Dynamo, allowing users to more easily control and customize the compiler in Python.
The Dynamo frontend supports both JIT workflows through `torch.compile` and AOT workflows through `torch.export` + `torch_tensorrt.compile`. It targets the Core ATen Opset (https://pytorch.org/docs/stable/torch.compiler_ir.html#core-aten-ir) and currently has 82% coverage. Just like in TorchScript, graphs will be partitioned based on the ability to map operators to TensorRT, in addition to any graph surgery done in Dynamo.

Output Format
Through the Dynamo frontend, different output formats can be selected for AOT workflows via the `output_format` kwarg. The choices are `torchscript`, where the resulting compiled module will be traced with `torch.jit.trace` (suitable for Python-less deployments); `exported_program`, a new serializable format for PyTorch models; and finally, if you would like to run further graph transformations on the resultant model, `graph_module`, which will return a `torch.fx.GraphModule`.
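As a hedged sketch of the AOT path (the toy model and inputs here are invented for illustration, and the snippet guards on `torch_tensorrt` and a CUDA device being available, since compilation requires a GPU with TensorRT):

```python
# Sketch of selecting an output format in the AOT workflow.
# Guarded so it only compiles when torch_tensorrt and a GPU are present.
try:
    import torch
    import torch_tensorrt

    ready = torch.cuda.is_available()
except Exception:
    ready = False

if ready:
    # Toy model and inputs, purely for illustration
    model = torch.nn.Sequential(torch.nn.Linear(8, 4)).eval().cuda()
    inputs = [torch.randn(2, 8).cuda()]

    # Request a serializable ExportedProgram instead of the default output
    trt_ep = torch_tensorrt.compile(
        model,
        ir="dynamo",
        inputs=inputs,
        output_format="exported_program",
    )
```

Passing `output_format="graph_module"` instead would hand back a `torch.fx.GraphModule` for further transformation.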
Multi-GPU Safety
To address a long-standing source of overhead, single-GPU systems will now operate without the typically required device checks. This check can be re-added when multiple GPUs are available to the host process using `torch_tensorrt.runtime.set_multi_device_safe_mode`. More information can be found here: https://pytorch.org/TensorRT/user_guide/runtime.html
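For example (guarded as a sketch, since it requires `torch_tensorrt` to be installed and importable):

```python
# Sketch: re-enable the runtime's per-inference device check on a
# multi-GPU host, where the single-GPU fast path would be unsafe.
try:
    import torch_tensorrt

    torch_tensorrt.runtime.set_multi_device_safe_mode(True)
    enabled = True
except Exception:
    enabled = False
```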
Capability Validators
In the Dynamo frontend, tests can be written and associated with converters to dynamically enable or disable them based on conditions in the target graph.
For example, the convolution converter in Dynamo only supports 1D, 2D, and 3D convolution. We can therefore create a lambda which, given a convolution FX node, can determine whether the convolution is supported:
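A minimal sketch of such a validator follows (duck-typed here with a stand-in object rather than a real `torch.fx.Node`; it assumes the `aten.convolution.default` schema, where `args[3]` is the stride list, whose length equals the number of spatial dimensions):

```python
from types import SimpleNamespace


def convolution_validator(node) -> bool:
    # args follow the aten.convolution.default schema:
    # (input, weight, bias, stride, padding, dilation, transposed, ...)
    stride = node.args[3]
    # 1D, 2D, and 3D convolutions have stride lists of length 1, 2, or 3
    return len(stride) in (1, 2, 3)


# Stand-in for a 2D-convolution FX node, for illustration only
conv2d_node = SimpleNamespace(args=(None, None, None, [1, 1]))
print(convolution_validator(conv2d_node))  # True
```

In Torch-TensorRT itself, such a validator would be attached to the converter at registration time (see the converter documentation linked below).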
In such a case, where the `Node` is not supported, the node will be partitioned out and run in PyTorch. All capability validators are run prior to partitioning, after the lowering phase.
More information on writing converters for the Dynamo frontend can be found here: https://pytorch.org/TensorRT/contributors/dynamo_converters.html
Breaking Changes
`torch.nn.Module`s or `torch.fx.GraphModule`s provided to `torch_tensorrt.compile` will by default be exported using `torch.export` and then compiled. This default can be overridden by setting the `ir=[torchscript|fx]` kwarg. Any bugs reported will first be attempted to be resolved in the Dynamo stack before attempting other frontends; however, pull requests for additional functionality in the TorchScript and FX frontends from the community will still be accepted.

What's Changed
- chore: Update Torch and Torch-TRT versions and docs on `main` by @gs-olive in #1784
- fix: Allow full model compilation with collection inputs (`input_signature`) by @gs-olive in #1656
- fix: Error caused by invalid binding name in `TRTEngine.to_str()` method by @gs-olive in #1846
- fix: Implement `aten.mean.default` and `aten.mean.dim` converters by @gs-olive in #1810
- fix: Add version checking for `torch._dynamo` import in `__init__` by @gs-olive in #1881
- fix: Improve input weight handling to `acc_ops` convolution layers in FX by @gs-olive in #1886
- fix: Upgrade `main` to TRT 8.6, CUDA 11.8, CuDNN 8.8, Torch Dev by @gs-olive in #1852
- fix: Improve partitioning + lowering systems in `torch.compile` path by @gs-olive in #1879
- fix: Add support for default dimension in `aten.cat` by @gs-olive in #1863
- fix: Address `.numpy()` issue on fake tensors by @gs-olive in #1949
- fix/feat: Add lowering pass to resolve most `aten::Int.Tensor` uses by @gs-olive in #1937
- fix: Add decomposition for `aten.addmm` by @gs-olive in #1953
- fix: Add lowering pass to remove output repacking in `convert_method_to_trt_engine` calls by @gs-olive in #1945
- chore/fix: Update `TRTInterpreter` impl in Dynamo compile [1 / x] by @gs-olive in #2002
- feat: Add `options` kwargs for Torch compile [3 / x] by @gs-olive in #2005
- feat: Add support for output data types in `TRTInterpreter` [2 / x] by @gs-olive in #2004
- chore: Upgrade Torch nightly to `2.1.0.dev20230605` [4 / x] by @gs-olive in #1975
- fix/feat: Move convolution core to `impl` + add feature (FX converter refactor) by @gs-olive in #1972
- feat: Add support for `TorchTensorRTModule` in Dynamo [1 / x] by @gs-olive in #2003
- fix: Add support for `truncate_long_and_double` in Dynamo [8 / x] by @gs-olive in #1983
- fix: Move all `aten` PRs to Dynamo converter registry by @gs-olive in #2070
- examples: Add example usage scripts for `torch_tensorrt.dynamo.compile` path [1.1 / x] by @gs-olive in #1966
- ci: Add automatic GHA job to build + push Docker Container on `main` by @gs-olive in #2129
- chore: Add `pyyaml` import to GHA Docker job by @gs-olive in #2170
- fix: Update `aten.embedding` to reflect schema by @gs-olive in #2182
- feat: Add `_to_copy`, `operator.get` and `clone` ATen converters by @gs-olive in #2161
- fix: Repair broadcasting utility for `aten.where` by @gs-olive in #2228
- fix: Set `dynamic=False` in `torch.compile` call by @gs-olive in #2240
- fix: Allow rank differences in `aten.expand` by @gs-olive in #2234
- fix: Legacy CI `pip` installation by @gs-olive in #2239
- feat: Add support for `require_full_compilation` in Dynamo by @gs-olive in #2138
- fix: Add special cases for `clone` and `to_copy` where input of graph is output by @gs-olive in #2265
- minor fix: Update `get_ir` prefixes by @gs-olive in #2369
- fix: Repair `aten.where` with Numpy + Broadcast by @gs-olive in #2372
- cherry-pick: Key converters and documentation to `release/2.1` by @gs-olive in #2387
- cherry-pick: Transformer XL fix to `release/2.1` by @gs-olive in #2414
- chore: Upgrade `release` to Torch 2.1.1 by @gs-olive in #2472
- fix: `release/2.1` CI Repair by @gs-olive in #2528
- cherry-pick: Port most changes from `main` by @gs-olive in #2574
- cherry-pick: Docker fixes `release/2.2` by @gs-olive in #2628
- cherry-pick: Remove extraneous argument in `compile` (#2635) by @gs-olive in #2638

New Contributors
Full Changelog: v1.4.0...v2.2.0
This discussion was created from the release Torch-TensorRT v2.2.0.