Sync fb internal change to OSS #1892

wushirong · 2023-05-05T19:40:47Z

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

github-actions

Code conforms to C++ style guidelines

github-actions

There are some changes that do not conform to Python style guidelines:

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

github-actions

Code conforms to C++ style guidelines

github-actions

There are some changes that do not conform to Python style guidelines:

--- py/torch_tensorrt/fx/tracer/dispatch_tracer/aten_tracer.py	2023-05-15 21:23:46.302436 +0000
+++ py/torch_tensorrt/fx/tracer/dispatch_tracer/aten_tracer.py	2023-05-15 21:24:03.801292 +0000
@@ -61,10 +61,12 @@

    def deactivate(self) -> None:
        torchdynamo.config.capture_scalar_outputs = True
        torchdynamo.config.guard_nn_modules = True
        torchdynamo.config.dynamic_shapes = True
+
+
@contextmanager
def using_config(config: DynamoConfig) -> Generator[DynamoConfig, None, None]:
    config.activate()
    try:
        yield config

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

483102cd4151f02c2d3632e6b6df7a5e59c0d6f3 Wei Wei <wwei6@meta.com> [fx2trt] move acc op `torch.ops._caffe2.RoIAlign` to fb only
8ce94a01caa090d56adb4708452b52890160ba69 Wei Wei <wwei6@meta.com> [aten2trt] reshape support
422326213bad177019e92c95dbc61af7a427bebc Shirong Wu <shirong@meta.com> nan_to_num aten converter
f729c8a7f1268f329d15e3cf05f1fb9232fab2d9 Huamin Li <huaminli@meta.com> Record TRT/AIT lower context into Scuba gpu_lowering_diagnostics
2df64af6bcf102a0ce40f1c5ab8472370d012904 Wei Wei <wwei6@meta.com> [aten2ait][fx2ait] sin,cos,sqrt,clone support
9fa6469ccb9d00320d78684d748fe1a7e5c3cf60 Janet Yang <qxy11@meta.com> Split nodes w/ float64 inputs from lowering
d2ea242f721156df9e075927ea7956db772d4107 Fei Kou <feikou@meta.com> Handle Ellipsis in dper passes
d053b097a0d1c158cde29792a35c4ec4174d9417 Jason Ansel <jansel@meta.com> Fix tests broken by D42953629
e18c6c76b1678a95c35583dabb41666b33c3df63 Zhijing Li (Accelerator Enablement) <tissue030@meta.com> Add dper test for push_down_split pass
5008c6d200f2a9ca035547204b47eb5e1704ce88 Zhijing Li (Accelerator Enablement) <tissue030@meta.com> Add passes as option to AITTestCase.run_test
f7bc0c543b553ca2f80149995b4c28599a6ea396 Ying Zhang <yingz@meta.com> Back out "Add passes as option to AITTestCase.run_test"
22d4044c66720e0e656f41538c81a3e90ef1a433 Zhijing Li (Accelerator Enablement) <tissue030@meta.com> Relaunch add passes as option to AITTestCase.run_test
ae0de22b6a97bca82c0ef6a14b0be2b570eb443a Eli Uriegas <eliuriegas@meta.com> Remove fx2trt/torch2trt backends (#93822)
b08e568951c911e4c3bbc72b55830fa1d4be4b2b Eli Uriegas <eliuriegas@meta.com> Remove torch/_dynamo/optimizations (#93871)
725266c0b7eb0549060e79b65346d703cc5bc39e Benson Ma <bensonma415@meta.com> [T143761882] Migrate CUTLASS group gemm operators to OSS
44110f5df422e84cd9d9afbf5dfbe742057b9d98 Zhijing Li (Accelerator Enablement) <tissue030@meta.com> Add noop pass for torch.ops.fb.scale_gradient
84befb25b778485c8694ba659248d4d570d92390 Chao Gu <guchao@meta.com> [FX] Add log_softmax
b641713bd774cb7c7bf903f514bff5c87a6f3a33 Wei Wei <wwei6@meta.com> [fx2ait] support torch.unbind, torch.group_norm
d263b38b53b93a18a78cd34b2a1c48711c3c59cd Shirong Wu <shirong@meta.com> Add extra logging for layer norm
eb2591231195cc0ab6780f345f095807a7d45f7c Callum Ryan <callumryan@meta.com> Make GPU test run in bundled mode
f63d3834e87a819f8335c50b351e48f60573d474 Sarunya Pumma <sarunya@meta.com> Back out "[T143761882] Migrate CUTLASS group gemm operators to OSS"
a9f489c1c3a182698385053c0a94b792c4e310ba Shirong Wu <shirong@meta.com> Change opt_profile_replica to 3
b8bdde86f0bae6010062c33aec03a4e13a87a6ab Brian Hirsh <hirsheybar@meta.com> forward fix for new _native_batch_norm_legit_no_training op
e8f4cbd46402e5603cc48d24395db3f0e010581a Shirong Wu <shirong@meta.com> Fix reshape op
b860725bfaf74a0043190d1140ddee987dd82d0c generatedunixname89002005232357 <generatedunixname89002005232357@fb.com> Revert D43278214: Multisect successfully blamed D43278214 for test or build failures
d4ea365cf8aa56d752912f7878b8046e89c804c2 Chunxing Yin <cyin9@meta.com> [mtia] Add sigmoid_backward kernel binding in Glow
a768c82a51a058e56a64ff82f90e619795611b66 Mor Tzur <mortzur@meta.com> lower to ait
8eb52426aaca586ae50fde75cccca6a0827a8328 Wei Wei <wwei6@meta.com> [hstu][fx2ait] op support
55d95ffa096d9de7952a6a1c4628efd67e554d82 Wei Wei <wwei6@meta.com> [fx2ait] temp solution to set correct dynamic batch size for jagged tensor
0a42e2f0874c48e9b60503a25705f0fc6319ff87 Jia Jiunn Ang <jiajiunn@meta.com> [CMF] chunk-squeeze-cat op fusion when split on last dimension
8bd509596a799f1270796772e12be090a6db5d39 Wei Wei <wwei6@meta.com> [aten2trt] update comment
1761b440d646836116fdadf2b5c7b55c7d2b989b Oleg Khabinov <khabinov@meta.com> [fx2ait] Fix a dper pass when acc_ops.squeeze doesn't have a dim
3cc405a92c9fcec886d890de87ac94e024c682a5 Jia Jiunn Ang <jiajiunn@meta.com> [CMF] Fuse chunk-linear-cat into baddbmm
5f42f56c5b5d0bd4c058aa280a980e64dd89b0a9 Xiaodong Wang <xdwang@meta.com> [cudnn] static linking
229969542a2c1e96fe8345ff7adc2fd48f6a0707 Romain Sauvestre <romainsauvestre@meta.com> Remove base_module from acc_tracer target
a174195c484d5a25f06e4c0665bbb2e9d9dcae82 Janet Yang <qxy11@meta.com> Support input_tensor_spec w/ multiple batch dims in TRT
0246365e6facc6dfb13843fa9854802f35c0938a Zhijing Li (Accelerator Enablement) <tissue030@meta.com> Remove noop dropout op with acc tracer
4c287b9f6238e8bbbd80e742262a0eee6efa57de Kunming Ho <kmh@meta.com> Operator support for threshold_backward
71bb34c81289173b83c7e7cf544b851096d9d99d Fei Kou <feikou@meta.com> specialize_int_float to specialize_int from D43925225
037db53f89a7b863ef0fbaa7b94425fd9a08dc96 Wei Wei <wwei6@meta.com> enable torchscripting
77f3dce76fd5407b08826f67213d8299d9d48542 Adnan Akhundov <aakhundov@meta.com> [fx2ait] Extend jagged tensor support
e6b551e48a0c03db63fc46ff85d975b489e30079 Jordan Fix <jfix@meta.com> [acc_tracer] Add dont_retrace_gm option
ada3cbbb3d6c3b3631496a3bceea775f45649c6c Adam Simpkins <simpkins@meta.com> Fix a bunch of invalid Python escape warnings in torch_tensorrt
98254d631e8748a85b05851c97fb74f3e3922cfe Brandon Tran (Realtime Integrity) <bvtran@meta.com> Add torch.nn.functional.normalize to TensorRT
fce21e2248ad0fddfcc12dbe2e3a9a6ac9ea2a5f Shirong Wu <shirong@meta.com> Fix trt input spec
a08bad1ac74a6d1409bb3f2e96953ed0c149d006 Wei Wei <wwei6@meta.com> [fx2ait] changes to improve jagged tensor and add b2b bmm
7745d70a17677777dcb5806e1e8008532f961f5d generatedunixname485339166882981 <noreply+485339166882981@fb.com> [Codemod][[pyunit][static_listing] Convert python unit test dynamic listing to static listing] oncall+gpu_enablement_0
ba33951ae2d2ebc99794aff8026a01a31f9ad8da Shirong Wu <shirong@meta.com> Add ait full op converters
b3bfd69f15fc4e32f27217a3efa8204a2f062af8 Chao Gu <guchao@meta.com> [FX] support index_add in acc ops and tracer
a965bafc517afc81591052e355fd34062b028a89 Shirong Wu <shirong@meta.com> Make fill op read dtype from input/kwarg
72f9b0925eceffc12dfa51769c1bd0cb38a3e50c generatedunixname485339166882981 <noreply+485339166882981@fb.com> [Codemod][[pyunit][static_listing] Convert python unit test dynamic listing to static listing] oncall+gpu_enablement
2e7feece191d6178ff6ec750d8fe481175bb27b9 Max Podkorytov <maxdp@meta.com> [fx2ait] enable lowering to bfloat16
94607911ffb11e78082e061a670b5140e9a55d72 Archie Sravankumar <archishmans@meta.com> Add support for nan_to_num
42fddd20d303dbbc3355a8c09a86d4a74317be97 Max Podkorytov <maxdp@meta.com> [AITemplate] feed_lower_benchmark cli argument tweak for choosing precision
648ec682f2214e67912fe7c800f7ca059195cf4e Huamin Li <huaminli@meta.com> Re-enable previous disabled TRT unit tests
3e5c2aac8a7b9e50efe04fcae361a3c0ee1777a7 Janet Yang <qxy11@meta.com> Skip acc normalization of repeat_interleave if input dims aren't integral
f412f35baeee9a1b17f67b7749ca1f9b8cbbe77b Janet Yang <qxy11@meta.com> Skip acc normalization of repeat if dims aren't ints
5b9cfe428f29e27da76b19029bda03a8b43c17d1 Huamin Li <huaminli@meta.com> add import into generate_standalone_repro
9f88965e87e72658aa6a4973dc870d50b8a22ca4 Fei Kou <feikou@meta.com> lowering with bf16
7f761df34d672c87c40b18369b28bc593374122c Fei Kou <feikou@meta.com> [benchmark] Support bfloat16 in mts_gpu_benchmark
fa9b09e11ba8f888d761e1398367973d30e0aa1e Wei Wei <wwei6@meta.com> [fx2ait] add a simple eager run to verify the input generatation is correct
4f8ca36dbdc72dfa60e667c3592d0a2bc466b994 Max Podkorytov <maxdp@meta.com> [AITemplate] implement per op benchmark backbone
9873be1e82f2dd4a8a768497ac9cdb3b9b95cfe9 Thomas Orozco <torozco@meta.com> buck2: tag more long running tests
0d6827c464aa2141a48a8d768a8c7facd65c0bc4 generatedunixname485339166882981 <noreply+485339166882981@fb.com> [Codemod][[pyunit][static_listing] Convert python unit test dynamic listing to static listing] oncall+gpu_enablement_0_2ea3
04f9c1105a2a6a711d025d5c85b95147343d0ecd Zhijing Li (Accelerator Enablement) <tissue030@meta.com> [fx2ait] Fix acc_ops converter on std when keepdim=False
906bad1deebb235a9c80d0f0d46145da08afa091 Danylo Baibak <baibak@meta.com> Forward fix to switch calling convention back to real tensors
48ffa2ab3dd66487922f9f0bf9a145db6eaf3fe2 Kefei Lu <kefeilu@meta.com> Lowering: validate inference with alternative batch sizes
ca5dc1a2896bd476e3a327db834df859a3fcc11f Jordan Fix <jfix@meta.com> [fba_pass_manager_builder][BE] General cleanup/refactor
afb4df5e84571f466b0f385472493aefb89344cc Shirong Wu <shirong@meta.com> Mask select converter
25e8afb1f8be19ec6c4ef4bc74ea48e64017cde2 Janet Yang <qxy11@meta.com> Fix lowering FusedTsEncodingModule for coffee model
7fdf06ecfc6b4efb7008ce399dcd0c32ef1f1f75 generatedunixname485339166882981 <noreply+485339166882981@fb.com> [Codemod][[pyunit][static_listing] Convert python unit test dynamic listing to static listing] oncall+deprecated_techdebt_do_not_use_4_9c34
a58c5e454412585c4cc48ced1798dbf234cc13b6 Michael Liu <limic@meta.com> Initialize `output_count` in `get_model_info_str`
2c6f13ddcc52e8f833fcd164d0c479ca3398322e Wei Wei <wwei6@meta.com> jagged SHA and MHA module support
2fe5c7cd3b763b839af3d1b05eecc73f1df05286 Shirong Wu <shirong@meta.com> Add BF16 support for ads model
2486edbe5013f3b7e5807503538f3164bdd4ee19 Shirong Wu <shirong@meta.com> Add low_level_module conversion pass
ca7c51407ab0410d311c984b31aeb757dd840bc2 Wei Wei <wwei6@meta.com> [hstu] remove torch_package from RelativeBucketedTimeAndPositionBasedBias after packaged
80596e459343d5630e16a6175eafffd2c25a3123 Shirong Wu <shirong@meta.com> Block a pass that yield problem
ded609195500a8edc5bed80ee85f41b35224c19f Huamin Li <huaminli@meta.com> Do not test test_implicit_batch_dim if >= 8.6
8e8e736e14d23e77fa2bd5e72123d66943f7716f Huamin Li <huaminli@meta.com> Speed up TRT compile time in test env
2db82572e509cfe827c34a4060c058ae44b5547a Jordan Fix <jfix@meta.com> [acc_tracer] Add in use_concrete_args option to enable pytree flatten/unflatten
946f957b6636c6b4f64e52148c9baf6e0351fb5e Wei Wei <wwei6@meta.com> [hstu] changes to bias module and sha/mha pass to adapt to removing presences
d904b26386c2ef98f292edae7c5e98c27119f9d9 Oleg Khabinov <khabinov@meta.com> [fx2ait] Rename split_op_to_reshape to chunk_op_to_reshape
ca36733f0ea67aeeb38a3740f795bbf99b24037b Oleg Khabinov <khabinov@meta.com> [fx2ait] Rewrite chunk_op_to_reshape() to use while loop instead of recursion
4361feb4399eec3816b534991020703d099d2896 Oleg Khabinov <khabinov@meta.com> [fx2ait] Optimize chunk_op_to_reshape()
071b84e3cda4f0175b37ae62c37b2d4f2de7925f Huamin Li <huaminli@meta.com> Disable libkineto for TRT unit tests
92f9acaac8f9a8f0fc2e1382bf4c79d0b94cbea5 Wei Wei <wwei6@meta.com> [fx2ait] improve bf16 support
8b92e8356278eb9676a5299373841593af942fb4 Jongsoo Park <jongsoo@meta.com> [acc_tracer] skip None module in rewriting
0d1d644bad22c86efec12009ca1464587d1e7d38 Kefei Lu <kefeilu@meta.com> Remove non-existent argument doc string
2efe5e78bc8627a30ba132e5b8e14e06538d463f shirong <shirong@fb.com> Temp fix
a15a564a567eb689604d27ca814553e38c287698 shirong <shirong@fb.com> Temporary commit at 4/24/2023, 2:32:22 PM
78825462243c09760ebb73156a4c18bbc9ddee75 shirong <shirong@fb.com> Temporary commit at 4/24/2023, 2:32:37 PM
9bfea274462fd77cb04c38c17bc237541af87c55 laksrm <laksrm@fb.com> [DNR] onboard ctr to aimp with lowering fixes
8bb482b10f7f63270c329c88d5ac028b40f6b757 shirong <shirong@fb.com> Reenable pass

github-actions

Code conforms to Python style guidelines

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

facebook-github-bot added the "cla signed" and "fx" labels May 5, 2023

github-actions bot added the "component: api [Python]" and "component: fx" labels May 5, 2023

github-actions bot requested a review from yinghai May 5, 2023 19:41

github-actions bot approved these changes May 5, 2023

github-actions bot requested changes May 5, 2023

wushirong force-pushed the fb-sync-shirong branch from 06ba60c to 1b99e4c May 5, 2023 21:49

github-actions bot approved these changes May 5, 2023

wushirong force-pushed the fb-sync-shirong branch from 1b99e4c to 1c7e1a9 May 5, 2023 22:38

github-actions bot approved these changes May 5, 2023

wushirong force-pushed the fb-sync-shirong branch from 1c7e1a9 to c8217ba May 15, 2023 21:23

github-actions bot approved these changes May 15, 2023

github-actions bot requested changes May 15, 2023

wushirong force-pushed the fb-sync-shirong branch from c8217ba to 9c70601 May 17, 2023 18:03

github-actions bot approved these changes May 17, 2023

wushirong force-pushed the fb-sync-shirong branch from 9c70601 to df52d36 May 17, 2023 20:28

github-actions bot approved these changes May 17, 2023

wushirong force-pushed the fb-sync-shirong branch from df52d36 to f77a002 May 17, 2023 22:17

github-actions bot approved these changes May 17, 2023

frank-wei self-requested a review May 17, 2023 23:34

frank-wei approved these changes May 17, 2023

github-actions bot approved these changes May 17, 2023

wushirong merged commit fc037c1 into main May 17, 2023
