[torchbench] Background_Matting fails when lowering UpsampleBilinear2D #6520

Open

ysiraichi opened this issue Feb 12, 2024 · 4 comments

@ysiraichi (Collaborator)
🐛 Bug

After converting the Background_Matting model to bfloat16 and running it (see command below), it fails with the following error:

python xla/benchmarks/experiment_runner.py \
    --suite-name torchbench --accelerator cuda \
    --xla PJRT --dynamo None --test eval \
    --no-resume --print-subprocess \
    -k Background_Matting
Traceback (most recent call last):
  File "xla/benchmarks/experiment_runner.py", line 914, in <module>
    main()
  File "xla/benchmarks/experiment_runner.py", line 910, in main
    runner.run()
  File "xla/benchmarks/experiment_runner.py", line 59, in run
    self.run_single_config()
  File "xla/benchmarks/experiment_runner.py", line 254, in run_single_config
    metrics, last_output = self.run_once_and_gather_metrics(
  File "xla/benchmarks/experiment_runner.py", line 331, in run_once_and_gather_metrics
    output, _ = loop(iter_fn=self._default_iter_fn)
  File "xla/benchmarks/experiment_runner.py", line 300, in loop
    output, timing, trace = iter_fn(benchmark_experiment, benchmark_model,
  File "xla/benchmarks/experiment_runner.py", line 222, in _default_iter_fn
    self._mark_step(benchmark_experiment)
  File "xla/benchmarks/experiment_runner.py", line 405, in _mark_step
    xm.mark_step()
  File "xla/torch_xla/core/xla_model.py", line 907, in mark_step
    torch_xla._XLAC._xla_step_marker(
RuntimeError: Error while lowering: [] aten::upsample_bilinear2d, xla_shape=bf16[1,128,512,512]{3,2,1,0}, dynamic_dims: (), output_size=(512, 512), align_corners=1
Error: torch_xla/csrc/resize_ops.cpp:52 : Check failed: input_type == output_type
*** Begin stack trace ***
        tsl::CurrentStackTrace[abi:cxx11]()
        torch_xla::resize::LowerForward2d(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, xla::XlaOp, xla::Shape const&, bool, bool)
        torch_xla::UpsampleBilinear::Lower(torch_xla::LoweringContext*) const
        torch_xla::LoweringContext::LowerNode(torch::lazy::Node const*)
        torch_xla::LoweringContext::LoweringContext(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, torch::lazy::BackendDevice, c10::ArrayRef<torch::lazy::Node const*>, std::unordered_map<torch::lazy::Node const*, torch::lazy::Util::EmitStatus, std::hash<torch::lazy::Node const*>, std::equal_to<torch::lazy::Node const*>, std::allocator<std::pair<torch::lazy::Node const* const, torch::lazy::Util::EmitStatus> > >)
        torch_xla::XLAGraphExecutor::Compile(std::vector<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> >, std::allocator<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> > > > const&, absl::lts_20230802::Span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, torch::lazy::LazyGraphExecutor::SyncTensorCollection const&, torch::lazy::LazyGraphExecutor::PostOrderData*, std::vector<torch::lazy::Value, std::allocator<torch::lazy::Value> > const&)
        torch_xla::XLAGraphExecutor::SyncTensorsGraphInternal(std::vector<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> >, std::allocator<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> > > >*, absl::lts_20230802::Span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, torch::lazy::LazyGraphExecutor::SyncTensorsConfig const&, bool)
        torch_xla::XLAGraphExecutor::SyncTensorsGraph(std::vector<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> >, std::allocator<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> > > >*, absl::lts_20230802::Span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, bool, bool, bool)
        torch_xla::XLAGraphExecutor::SyncLiveTensorsGraph(torch::lazy::BackendDevice const*, c10::ArrayRef<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, bool)




        _PyObject_MakeTpCall
        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault


        _PyEval_EvalFrameDefault
        _PyEval_EvalCodeWithName
        _PyFunction_Vectorcall
        _PyEval_EvalFrameDefault
        _PyEval_EvalCodeWithName
        _PyFunction_Vectorcall
        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault
        _PyEval_EvalCodeWithName
        PyEval_EvalCodeEx
        PyEval_EvalCode



        PyRun_SimpleFileExFlags
        Py_RunMain
        Py_BytesMain
        __libc_start_main
        _start
*** End stack trace ***
input (f32[1,256,256,128]{1,2,3,0}) and output (bf16[1,512,512,128]{1,2,3,0}) must have the same element type
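For context, the check that fires here is the element-type invariant in `torch_xla/csrc/resize_ops.cpp:52`: the resize lowering refuses to emit an op whose lowered input operand and inferred output shape disagree on element type, which is exactly the f32-input / bf16-output mismatch reported above. A minimal sketch of that invariant (hypothetical Python stand-in, not the actual C++):

```python
def check_resize_types(input_type: str, output_type: str) -> None:
    # Sketch of the invariant enforced at torch_xla/csrc/resize_ops.cpp:52:
    # the resize lowering requires the input and output element types to
    # match before emitting the XLA op.
    if input_type != output_type:
        raise RuntimeError(
            f"input ({input_type}) and output ({output_type}) "
            "must have the same element type")

check_resize_types("bf16", "bf16")  # ok

try:
    check_resize_types("f32", "bf16")  # the situation hit by this model
except RuntimeError as e:
    print(e)
```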

To Reproduce

Run the benchmark command shown above.

Environment

  • Reproducible on XLA backend: CUDA
  • torch_xla version: 408b376

cc @miladm @JackCaoG

@ysiraichi (Collaborator, Author)
I investigated this problem a bit. Strangely, inside the UpsampleBilinear::Lower function, operand(0).node reports type bf16, while ShapeHelper::ShapeOfXlaOp(input) reports f32. Looking at the HLO representation of the computation, it is also odd that a few f32 ops are scattered around. Below is the HLO representation of this node's input.
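One detail worth noting (my hypothesis, not confirmed): the f32 constant 0.498039216 that appears in the HLO below matches (in_size - 1) / (out_size - 1), the align_corners=True source-coordinate scale, for a 128 -> 256 resize. That suggests the stray f32 ops come from an earlier bilinear resize expansion computing its interpolation weights, and hence its result, in f32 even though the tensor inputs are bf16. A quick arithmetic check:

```python
def align_corners_scale(in_size: int, out_size: int) -> float:
    # With align_corners=True, output pixel i samples the input at
    # coordinate i * (in_size - 1) / (out_size - 1).
    return (in_size - 1) / (out_size - 1)

# 127 / 255 printed to 9 digits reproduces the constant seen in the HLO.
print(f"{align_corners_scale(128, 256):.9f}")  # 0.498039216
```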

HLO Representation
ENTRY %IrToHlo.878 (p0.1: bf16[128], p1.2: bf16[128], p2.3: bf16[128], p3.4: bf16[128], p4.5: bf16[128], p5.6: bf16[128,256,3,3], p6.8: bf16[256], p7.9: bf16[256], p8.10: bf16[256], p9.11: bf16[256], p10.12: bf16[256], p11.13: bf16[256,256,3,3], p12.14: bf16[256], p13.15: bf16[256], p14.16: bf16[256], p15.17: bf16[256], p16.18: bf16[256], p17.19: bf16[256,256,3,3], p18.21: bf16[256], p19.22: bf16[256], p20.23: bf16[256], p21.24: bf16[256], p22.25: bf16[256], p23.26: bf16[256,256,3,3], p24.27: bf16[256], p25.28: bf16[256], p26.29: bf16[256], p27.30: bf16[256], p28.31: bf16[256], p29.32: bf16[256,256,3,3], p30.34: bf16[256], p31.35: bf16[256], p32.36: bf16[256], p33.37: bf16[256], p34.38: bf16[256], p35.39: bf16[256,256,3,3], p36.40: bf16[256], p37.41: bf16[256], p38.42: bf16[256], p39.43: bf16[256], p40.44: bf16[256], p41.45: bf16[256,256,3,3], p42.47: bf16[256], p43.48: bf16[256], p44.49: bf16[256], p45.50: bf16[256], p46.51: bf16[256], p47.52: bf16[256,256,3,3], p48.53: bf16[256], p49.54: bf16[256], p50.55: bf16[256], p51.56: bf16[256], p52.57: bf16[256], p53.58: bf16[256,256,3,3], p54.60: bf16[256], p55.61: bf16[256], p56.62: bf16[256], p57.63: bf16[256], p58.64: bf16[256], p59.65: bf16[256,256,3,3], p60.66: bf16[256], p61.67: bf16[256], p62.68: bf16[256], p63.69: bf16[256], p64.70: bf16[256], p65.71: bf16[256,256,3,3], p66.73: bf16[256], p67.74: bf16[256], p68.75: bf16[256], p69.76: bf16[256], p70.77: bf16[256], p71.78: bf16[256,256,3,3], p72.79: bf16[256], p73.80: bf16[256], p74.81: bf16[256], p75.82: bf16[256], p76.83: bf16[256], p77.84: bf16[256,256,3,3], p78.86: bf16[256], p79.87: bf16[256], p80.88: bf16[256], p81.89: bf16[256], p82.90: bf16[256], p83.91: bf16[256,256,3,3], p84.92: bf16[256], p85.93: bf16[256], p86.94: bf16[256], p87.95: bf16[256], p88.96: bf16[256], p89.97: bf16[256,256,3,3], p90.99: bf16[256], p91.100: bf16[256], p92.101: bf16[256], p93.102: bf16[256], p94.103: bf16[256], p95.104: bf16[256,256,3,3], p96.105: bf16[256], p97.106: bf16[256], 
p98.107: bf16[256], p99.108: bf16[256], p100.109: bf16[256], p101.110: bf16[256,256,3,3], p102.112: bf16[256], p103.113: bf16[256], p104.114: bf16[256], p105.115: bf16[256], p106.116: bf16[256], p107.117: bf16[256,256,3,3], p108.118: bf16[256], p109.119: bf16[256], p110.120: bf16[256], p111.121: bf16[256], p112.122: bf16[256], p113.123: bf16[256,256,3,3], p114.125: bf16[256], p115.126: bf16[256], p116.127: bf16[256], p117.128: bf16[256], p118.129: bf16[256], p119.130: bf16[256,256,3,3], p120.131: bf16[256], p121.132: bf16[256], p122.133: bf16[256], p123.134: bf16[256], p124.135: bf16[256], p125.136: bf16[256,256,3,3], p126.137: bf16[256], p127.138: bf16[256], p128.139: bf16[256], p129.140: bf16[256], p130.141: bf16[256,448,1,1], p131.142: bf16[64], p132.143: bf16[64], p133.144: bf16[64], p134.145: bf16[64], p135.146: bf16[64,512,1,1], p136.147: bf16[256], p137.148: bf16[256], p138.149: bf16[256], p139.150: bf16[256], p140.151: bf16[256], p141.152: bf16[256,128,3,3], p142.153: bf16[128], p143.154: bf16[128], p144.155: bf16[128], p145.156: bf16[128], p146.157: bf16[128], p147.158: bf16[128,64,3,3], p148.159: bf16[64], p149.160: bf16[64], p150.161: bf16[64], p151.162: bf16[64], p152.163: bf16[64], p153.164: bf16[64,3,7,7], p154.165: bf16[1,3,512,512], p155.210: bf16[256], p156.211: bf16[256], p157.212: bf16[256], p158.213: bf16[256], p159.214: bf16[256], p160.215: bf16[256,128,3,3], p161.216: bf16[128], p162.217: bf16[128], p163.218: bf16[128], p164.219: bf16[128], p165.220: bf16[128], p166.221: bf16[128,64,3,3], p167.222: bf16[64], p168.223: bf16[64], p169.224: bf16[64], p170.225: bf16[64], p171.226: bf16[64], p172.227: bf16[64,3,7,7], p173.228: bf16[1,3,512,512], p174.283: bf16[64], p175.284: bf16[64], p176.285: bf16[64], p177.286: bf16[64], p178.287: bf16[64,512,1,1], p179.288: bf16[256], p180.289: bf16[256], p181.290: bf16[256], p182.291: bf16[256], p183.292: bf16[256], p184.293: bf16[256,128,3,3], p185.294: bf16[128], p186.295: bf16[128], p187.296: bf16[128], 
p188.297: bf16[128], p189.298: bf16[128], p190.299: bf16[128,64,3,3], p191.300: bf16[64], p192.301: bf16[64], p193.302: bf16[64], p194.303: bf16[64], p195.304: bf16[64], p196.305: bf16[64,1,7,7], p197.306: bf16[1,1,512,512], p198.361: bf16[64], p199.362: bf16[64], p200.363: bf16[64], p201.364: bf16[64], p202.365: bf16[64,512,1,1]) -> (f32[1,128,256,256]) {
  %p148.159 = bf16[64]{0} parameter(148)
  %constant.179 = bf16[] constant(1.001e-05)
  %broadcast.180 = bf16[64]{0} broadcast(bf16[] %constant.179), dimensions={}
  %add.181 = bf16[64]{0} add(bf16[64]{0} %p148.159, bf16[64]{0} %broadcast.180)
  %rsqrt.182 = bf16[64]{0} rsqrt(bf16[64]{0} %add.181)
  %p142.153 = bf16[128]{0} parameter(142)
  %constant.191 = bf16[] constant(1.001e-05)
  %broadcast.192 = bf16[128]{0} broadcast(bf16[] %constant.191), dimensions={}
  %add.193 = bf16[128]{0} add(bf16[128]{0} %p142.153, bf16[128]{0} %broadcast.192)
  %rsqrt.194 = bf16[128]{0} rsqrt(bf16[128]{0} %add.193)
  %p136.147 = bf16[256]{0} parameter(136)
  %constant.203 = bf16[] constant(1.001e-05)
  %broadcast.204 = bf16[256]{0} broadcast(bf16[] %constant.203), dimensions={}
  %add.205 = bf16[256]{0} add(bf16[256]{0} %p136.147, bf16[256]{0} %broadcast.204)
  %rsqrt.206 = bf16[256]{0} rsqrt(bf16[256]{0} %add.205)
  %p167.222 = bf16[64]{0} parameter(167)
  %constant.242 = bf16[] constant(1.001e-05)
  %broadcast.243 = bf16[64]{0} broadcast(bf16[] %constant.242), dimensions={}
  %add.244 = bf16[64]{0} add(bf16[64]{0} %p167.222, bf16[64]{0} %broadcast.243)
  %rsqrt.245 = bf16[64]{0} rsqrt(bf16[64]{0} %add.244)
  %p161.216 = bf16[128]{0} parameter(161)
  %constant.254 = bf16[] constant(1.001e-05)
  %broadcast.255 = bf16[128]{0} broadcast(bf16[] %constant.254), dimensions={}
  %add.256 = bf16[128]{0} add(bf16[128]{0} %p161.216, bf16[128]{0} %broadcast.255)
  %rsqrt.257 = bf16[128]{0} rsqrt(bf16[128]{0} %add.256)
  %p155.210 = bf16[256]{0} parameter(155)
  %constant.266 = bf16[] constant(1.001e-05)
  %broadcast.267 = bf16[256]{0} broadcast(bf16[] %constant.266), dimensions={}
  %add.268 = bf16[256]{0} add(bf16[256]{0} %p155.210, bf16[256]{0} %broadcast.267)
  %rsqrt.269 = bf16[256]{0} rsqrt(bf16[256]{0} %add.268)
  %p131.142 = bf16[64]{0} parameter(131)
  %constant.276 = bf16[] constant(1.001e-05)
  %broadcast.277 = bf16[64]{0} broadcast(bf16[] %constant.276), dimensions={}
  %add.278 = bf16[64]{0} add(bf16[64]{0} %p131.142, bf16[64]{0} %broadcast.277)
  %rsqrt.279 = bf16[64]{0} rsqrt(bf16[64]{0} %add.278)
  %p191.300 = bf16[64]{0} parameter(191)
  %constant.320 = bf16[] constant(1.001e-05)
  %broadcast.321 = bf16[64]{0} broadcast(bf16[] %constant.320), dimensions={}
  %add.322 = bf16[64]{0} add(bf16[64]{0} %p191.300, bf16[64]{0} %broadcast.321)
  %rsqrt.323 = bf16[64]{0} rsqrt(bf16[64]{0} %add.322)
  %p185.294 = bf16[128]{0} parameter(185)
  %constant.332 = bf16[] constant(1.001e-05)
  %broadcast.333 = bf16[128]{0} broadcast(bf16[] %constant.332), dimensions={}
  %add.334 = bf16[128]{0} add(bf16[128]{0} %p185.294, bf16[128]{0} %broadcast.333)
  %rsqrt.335 = bf16[128]{0} rsqrt(bf16[128]{0} %add.334)
  %p179.288 = bf16[256]{0} parameter(179)
  %constant.344 = bf16[] constant(1.001e-05)
  %broadcast.345 = bf16[256]{0} broadcast(bf16[] %constant.344), dimensions={}
  %add.346 = bf16[256]{0} add(bf16[256]{0} %p179.288, bf16[256]{0} %broadcast.345)
  %rsqrt.347 = bf16[256]{0} rsqrt(bf16[256]{0} %add.346)
  %p174.283 = bf16[64]{0} parameter(174)
  %constant.354 = bf16[] constant(1.001e-05)
  %broadcast.355 = bf16[64]{0} broadcast(bf16[] %constant.354), dimensions={}
  %add.356 = bf16[64]{0} add(bf16[64]{0} %p174.283, bf16[64]{0} %broadcast.355)
  %rsqrt.357 = bf16[64]{0} rsqrt(bf16[64]{0} %add.356)
  %p198.361 = bf16[64]{0} parameter(198)
  %constant.369 = bf16[] constant(1.001e-05)
  %broadcast.370 = bf16[64]{0} broadcast(bf16[] %constant.369), dimensions={}
  %add.371 = bf16[64]{0} add(bf16[64]{0} %p198.361, bf16[64]{0} %broadcast.370)
  %rsqrt.372 = bf16[64]{0} rsqrt(bf16[64]{0} %add.371)
  %p126.137 = bf16[256]{0} parameter(126)
  %constant.380 = bf16[] constant(1.001e-05)
  %broadcast.381 = bf16[256]{0} broadcast(bf16[] %constant.380), dimensions={}
  %add.382 = bf16[256]{0} add(bf16[256]{0} %p126.137, bf16[256]{0} %broadcast.381)
  %rsqrt.383 = bf16[256]{0} rsqrt(bf16[256]{0} %add.382)
  %p120.131 = bf16[256]{0} parameter(120)
  %constant.400 = bf16[] constant(1.001e-05)
  %broadcast.401 = bf16[256]{0} broadcast(bf16[] %constant.400), dimensions={}
  %add.402 = bf16[256]{0} add(bf16[256]{0} %p120.131, bf16[256]{0} %broadcast.401)
  %rsqrt.403 = bf16[256]{0} rsqrt(bf16[256]{0} %add.402)
  %p114.125 = bf16[256]{0} parameter(114)
  %constant.420 = bf16[] constant(1.001e-05)
  %broadcast.421 = bf16[256]{0} broadcast(bf16[] %constant.420), dimensions={}
  %add.422 = bf16[256]{0} add(bf16[256]{0} %p114.125, bf16[256]{0} %broadcast.421)
  %rsqrt.423 = bf16[256]{0} rsqrt(bf16[256]{0} %add.422)
  %p108.118 = bf16[256]{0} parameter(108)
  %constant.440 = bf16[] constant(1.001e-05)
  %broadcast.441 = bf16[256]{0} broadcast(bf16[] %constant.440), dimensions={}
  %add.442 = bf16[256]{0} add(bf16[256]{0} %p108.118, bf16[256]{0} %broadcast.441)
  %rsqrt.443 = bf16[256]{0} rsqrt(bf16[256]{0} %add.442)
  %p102.112 = bf16[256]{0} parameter(102)
  %constant.460 = bf16[] constant(1.001e-05)
  %broadcast.461 = bf16[256]{0} broadcast(bf16[] %constant.460), dimensions={}
  %add.462 = bf16[256]{0} add(bf16[256]{0} %p102.112, bf16[256]{0} %broadcast.461)
  %rsqrt.463 = bf16[256]{0} rsqrt(bf16[256]{0} %add.462)
  %p96.105 = bf16[256]{0} parameter(96)
  %constant.480 = bf16[] constant(1.001e-05)
  %broadcast.481 = bf16[256]{0} broadcast(bf16[] %constant.480), dimensions={}
  %add.482 = bf16[256]{0} add(bf16[256]{0} %p96.105, bf16[256]{0} %broadcast.481)
  %rsqrt.483 = bf16[256]{0} rsqrt(bf16[256]{0} %add.482)
  %p90.99 = bf16[256]{0} parameter(90)
  %constant.500 = bf16[] constant(1.001e-05)
  %broadcast.501 = bf16[256]{0} broadcast(bf16[] %constant.500), dimensions={}
  %add.502 = bf16[256]{0} add(bf16[256]{0} %p90.99, bf16[256]{0} %broadcast.501)
  %rsqrt.503 = bf16[256]{0} rsqrt(bf16[256]{0} %add.502)
  %p84.92 = bf16[256]{0} parameter(84)
  %constant.520 = bf16[] constant(1.001e-05)
  %broadcast.521 = bf16[256]{0} broadcast(bf16[] %constant.520), dimensions={}
  %add.522 = bf16[256]{0} add(bf16[256]{0} %p84.92, bf16[256]{0} %broadcast.521)
  %rsqrt.523 = bf16[256]{0} rsqrt(bf16[256]{0} %add.522)
  %p78.86 = bf16[256]{0} parameter(78)
  %constant.540 = bf16[] constant(1.001e-05)
  %broadcast.541 = bf16[256]{0} broadcast(bf16[] %constant.540), dimensions={}
  %add.542 = bf16[256]{0} add(bf16[256]{0} %p78.86, bf16[256]{0} %broadcast.541)
  %rsqrt.543 = bf16[256]{0} rsqrt(bf16[256]{0} %add.542)
  %p72.79 = bf16[256]{0} parameter(72)
  %constant.560 = bf16[] constant(1.001e-05)
  %broadcast.561 = bf16[256]{0} broadcast(bf16[] %constant.560), dimensions={}
  %add.562 = bf16[256]{0} add(bf16[256]{0} %p72.79, bf16[256]{0} %broadcast.561)
  %rsqrt.563 = bf16[256]{0} rsqrt(bf16[256]{0} %add.562)
  %p66.73 = bf16[256]{0} parameter(66)
  %constant.580 = bf16[] constant(1.001e-05)
  %broadcast.581 = bf16[256]{0} broadcast(bf16[] %constant.580), dimensions={}
  %add.582 = bf16[256]{0} add(bf16[256]{0} %p66.73, bf16[256]{0} %broadcast.581)
  %rsqrt.583 = bf16[256]{0} rsqrt(bf16[256]{0} %add.582)
  %p60.66 = bf16[256]{0} parameter(60)
  %constant.600 = bf16[] constant(1.001e-05)
  %broadcast.601 = bf16[256]{0} broadcast(bf16[] %constant.600), dimensions={}
  %add.602 = bf16[256]{0} add(bf16[256]{0} %p60.66, bf16[256]{0} %broadcast.601)
  %rsqrt.603 = bf16[256]{0} rsqrt(bf16[256]{0} %add.602)
  %p54.60 = bf16[256]{0} parameter(54)
  %constant.620 = bf16[] constant(1.001e-05)
  %broadcast.621 = bf16[256]{0} broadcast(bf16[] %constant.620), dimensions={}
  %add.622 = bf16[256]{0} add(bf16[256]{0} %p54.60, bf16[256]{0} %broadcast.621)
  %rsqrt.623 = bf16[256]{0} rsqrt(bf16[256]{0} %add.622)
  %p48.53 = bf16[256]{0} parameter(48)
  %constant.640 = bf16[] constant(1.001e-05)
  %broadcast.641 = bf16[256]{0} broadcast(bf16[] %constant.640), dimensions={}
  %add.642 = bf16[256]{0} add(bf16[256]{0} %p48.53, bf16[256]{0} %broadcast.641)
  %rsqrt.643 = bf16[256]{0} rsqrt(bf16[256]{0} %add.642)
  %p42.47 = bf16[256]{0} parameter(42)
  %constant.660 = bf16[] constant(1.001e-05)
  %broadcast.661 = bf16[256]{0} broadcast(bf16[] %constant.660), dimensions={}
  %add.662 = bf16[256]{0} add(bf16[256]{0} %p42.47, bf16[256]{0} %broadcast.661)
  %rsqrt.663 = bf16[256]{0} rsqrt(bf16[256]{0} %add.662)
  %p36.40 = bf16[256]{0} parameter(36)
  %constant.680 = bf16[] constant(1.001e-05)
  %broadcast.681 = bf16[256]{0} broadcast(bf16[] %constant.680), dimensions={}
  %add.682 = bf16[256]{0} add(bf16[256]{0} %p36.40, bf16[256]{0} %broadcast.681)
  %rsqrt.683 = bf16[256]{0} rsqrt(bf16[256]{0} %add.682)
  %p30.34 = bf16[256]{0} parameter(30)
  %constant.700 = bf16[] constant(1.001e-05)
  %broadcast.701 = bf16[256]{0} broadcast(bf16[] %constant.700), dimensions={}
  %add.702 = bf16[256]{0} add(bf16[256]{0} %p30.34, bf16[256]{0} %broadcast.701)
  %rsqrt.703 = bf16[256]{0} rsqrt(bf16[256]{0} %add.702)
  %p24.27 = bf16[256]{0} parameter(24)
  %constant.720 = bf16[] constant(1.001e-05)
  %broadcast.721 = bf16[256]{0} broadcast(bf16[] %constant.720), dimensions={}
  %add.722 = bf16[256]{0} add(bf16[256]{0} %p24.27, bf16[256]{0} %broadcast.721)
  %rsqrt.723 = bf16[256]{0} rsqrt(bf16[256]{0} %add.722)
  %p18.21 = bf16[256]{0} parameter(18)
  %constant.740 = bf16[] constant(1.001e-05)
  %broadcast.741 = bf16[256]{0} broadcast(bf16[] %constant.740), dimensions={}
  %add.742 = bf16[256]{0} add(bf16[256]{0} %p18.21, bf16[256]{0} %broadcast.741)
  %rsqrt.743 = bf16[256]{0} rsqrt(bf16[256]{0} %add.742)
  %p12.14 = bf16[256]{0} parameter(12)
  %constant.760 = bf16[] constant(1.001e-05)
  %broadcast.761 = bf16[256]{0} broadcast(bf16[] %constant.760), dimensions={}
  %add.762 = bf16[256]{0} add(bf16[256]{0} %p12.14, bf16[256]{0} %broadcast.761)
  %rsqrt.763 = bf16[256]{0} rsqrt(bf16[256]{0} %add.762)
  %p6.8 = bf16[256]{0} parameter(6)
  %constant.780 = bf16[] constant(1.001e-05)
  %broadcast.781 = bf16[256]{0} broadcast(bf16[] %constant.780), dimensions={}
  %add.782 = bf16[256]{0} add(bf16[256]{0} %p6.8, bf16[256]{0} %broadcast.781)
  %rsqrt.783 = bf16[256]{0} rsqrt(bf16[256]{0} %add.782)
  %constant.791 = f64[] constant(0.5)
  %convert.792 = f32[] convert(f64[] %constant.791)
  %p0.1 = bf16[128]{0} parameter(0)
  %constant.870 = bf16[] constant(1.001e-05)
  %broadcast.871 = bf16[128]{0} broadcast(bf16[] %constant.870), dimensions={}
  %add.872 = bf16[128]{0} add(bf16[128]{0} %p0.1, bf16[128]{0} %broadcast.871)
  %rsqrt.873 = bf16[128]{0} rsqrt(bf16[128]{0} %add.872)
  %constant.793 = s32[] constant(0)
  %convert.794 = f32[] convert(s32[] %constant.793)
  %broadcast.834 = f32[256,3]{1,0} broadcast(f32[] %convert.794), dimensions={}
  %constant.789 = s32[] constant(1)
  %convert.790 = f32[] convert(s32[] %constant.789)
  %broadcast.832 = f32[256,3]{1,0} broadcast(f32[] %convert.790), dimensions={}
  %broadcast.819 = f32[256]{0} broadcast(f32[] %convert.794), dimensions={}
  %iota.809 = f32[256]{0} iota(), iota_dimension=0
  %constant.810 = f32[] constant(0.498039216)
  %convert.811 = f32[] convert(f32[] %constant.810)
  %broadcast.812 = f32[256]{0} broadcast(f32[] %convert.811), dimensions={}
  %multiply.813 = f32[256]{0} multiply(f32[256]{0} %iota.809, f32[256]{0} %broadcast.812)
  %broadcast.814 = f32[256]{0} broadcast(f32[] %convert.790), dimensions={}
  %subtract.815 = f32[256]{0} subtract(f32[256]{0} %multiply.813, f32[256]{0} %broadcast.814)
  %ceil.816 = f32[256]{0} ceil(f32[256]{0} %subtract.815)
  %constant.817 = s64[] constant(125)
  %convert.818 = f32[] convert(s64[] %constant.817)
  %broadcast.820 = f32[256]{0} broadcast(f32[] %convert.818), dimensions={}
  %clamp.821 = f32[256]{0} clamp(f32[256]{0} %broadcast.819, f32[256]{0} %ceil.816, f32[256]{0} %broadcast.820)
  %subtract.826 = f32[256]{0} subtract(f32[256]{0} %clamp.821, f32[256]{0} %multiply.813)
  %broadcast.827 = f32[256,3]{1,0} broadcast(f32[256]{0} %subtract.826), dimensions={0}
  %iota.828 = f32[3]{0} iota(), iota_dimension=0
  %broadcast.829 = f32[256,3]{1,0} broadcast(f32[3]{0} %iota.828), dimensions={1}
  %add.830 = f32[256,3]{1,0} add(f32[256,3]{1,0} %broadcast.827, f32[256,3]{1,0} %broadcast.829)
  %abs.831 = f32[256,3]{1,0} abs(f32[256,3]{1,0} %add.830)
  %subtract.833 = f32[256,3]{1,0} subtract(f32[256,3]{1,0} %broadcast.832, f32[256,3]{1,0} %abs.831)
  %maximum.835 = f32[256,3]{1,0} maximum(f32[256,3]{1,0} %broadcast.834, f32[256,3]{1,0} %subtract.833)
  %reduce.840 = f32[256]{0} reduce(f32[256,3]{1,0} %maximum.835, f32[] %convert.794), dimensions={1}, to_apply=%AddComputation.836
  %broadcast.841 = f32[256,3]{1,0} broadcast(f32[256]{0} %reduce.840), dimensions={0}
  %divide.842 = f32[256,3]{1,0} divide(f32[256,3]{1,0} %maximum.835, f32[256,3]{1,0} %broadcast.841)
  %broadcast.851 = f32[256,3]{1,0} broadcast(f32[] %convert.794), dimensions={}
  %broadcast.849 = f32[256,3]{1,0} broadcast(f32[] %convert.790), dimensions={}
  %broadcast.805 = f32[256]{0} broadcast(f32[] %convert.794), dimensions={}
  %iota.795 = f32[256]{0} iota(), iota_dimension=0
  %constant.796 = f32[] constant(0.498039216)
  %convert.797 = f32[] convert(f32[] %constant.796)
  %broadcast.798 = f32[256]{0} broadcast(f32[] %convert.797), dimensions={}
  %multiply.799 = f32[256]{0} multiply(f32[256]{0} %iota.795, f32[256]{0} %broadcast.798)
  %broadcast.800 = f32[256]{0} broadcast(f32[] %convert.790), dimensions={}
  %subtract.801 = f32[256]{0} subtract(f32[256]{0} %multiply.799, f32[256]{0} %broadcast.800)
  %ceil.802 = f32[256]{0} ceil(f32[256]{0} %subtract.801)
  %constant.803 = s64[] constant(125)
  %convert.804 = f32[] convert(s64[] %constant.803)
  %broadcast.806 = f32[256]{0} broadcast(f32[] %convert.804), dimensions={}
  %clamp.807 = f32[256]{0} clamp(f32[256]{0} %broadcast.805, f32[256]{0} %ceil.802, f32[256]{0} %broadcast.806)
  %subtract.843 = f32[256]{0} subtract(f32[256]{0} %clamp.807, f32[256]{0} %multiply.799)
  %broadcast.844 = f32[256,3]{1,0} broadcast(f32[256]{0} %subtract.843), dimensions={0}
  %iota.845 = f32[3]{0} iota(), iota_dimension=0
  %broadcast.846 = f32[256,3]{1,0} broadcast(f32[3]{0} %iota.845), dimensions={1}
  %add.847 = f32[256,3]{1,0} add(f32[256,3]{1,0} %broadcast.844, f32[256,3]{1,0} %broadcast.846)
  %abs.848 = f32[256,3]{1,0} abs(f32[256,3]{1,0} %add.847)
  %subtract.850 = f32[256,3]{1,0} subtract(f32[256,3]{1,0} %broadcast.849, f32[256,3]{1,0} %abs.848)
  %maximum.852 = f32[256,3]{1,0} maximum(f32[256,3]{1,0} %broadcast.851, f32[256,3]{1,0} %subtract.850)
  %reduce.857 = f32[256]{0} reduce(f32[256,3]{1,0} %maximum.852, f32[] %convert.794), dimensions={1}, to_apply=%AddComputation.853
  %broadcast.858 = f32[256,3]{1,0} broadcast(f32[256]{0} %reduce.857), dimensions={0}
  %divide.859 = f32[256,3]{1,0} divide(f32[256,3]{1,0} %maximum.852, f32[256,3]{1,0} %broadcast.858)
  %dot.860 = f32[256,3,256,3]{3,2,1,0} dot(f32[256,3]{1,0} %divide.842, f32[256,3]{1,0} %divide.859), lhs_contracting_dims={}, rhs_contracting_dims={}
  %p173.228 = bf16[1,3,512,512]{3,2,1,0} parameter(173)
  %reverse.229 = bf16[1,3,512,512]{3,2,1,0} reverse(bf16[1,3,512,512]{3,2,1,0} %p173.228), dimensions={3}
  %slice.230 = bf16[1,3,512,3]{3,2,1,0} slice(bf16[1,3,512,512]{3,2,1,0} %reverse.229), slice={[0:1], [0:3], [0:512], [508:511]}
  %slice.231 = bf16[1,3,512,3]{3,2,1,0} slice(bf16[1,3,512,512]{3,2,1,0} %reverse.229), slice={[0:1], [0:3], [0:512], [1:4]}
  %concatenate.232 = bf16[1,3,512,518]{3,2,1,0} concatenate(bf16[1,3,512,3]{3,2,1,0} %slice.230, bf16[1,3,512,512]{3,2,1,0} %p173.228, bf16[1,3,512,3]{3,2,1,0} %slice.231), dimensions={3}
  %reverse.233 = bf16[1,3,512,518]{3,2,1,0} reverse(bf16[1,3,512,518]{3,2,1,0} %concatenate.232), dimensions={2}
  %slice.234 = bf16[1,3,3,518]{3,2,1,0} slice(bf16[1,3,512,518]{3,2,1,0} %reverse.233), slice={[0:1], [0:3], [508:511], [0:518]}
  %slice.235 = bf16[1,3,3,518]{3,2,1,0} slice(bf16[1,3,512,518]{3,2,1,0} %reverse.233), slice={[0:1], [0:3], [1:4], [0:518]}
  %concatenate.236 = bf16[1,3,518,518]{3,2,1,0} concatenate(bf16[1,3,3,518]{3,2,1,0} %slice.234, bf16[1,3,512,518]{3,2,1,0} %concatenate.232, bf16[1,3,3,518]{3,2,1,0} %slice.235), dimensions={2}
  %p172.227 = bf16[64,3,7,7]{3,2,1,0} parameter(172)
  %convolution.237 = bf16[1,64,512,512]{3,2,1,0} convolution(bf16[1,3,518,518]{3,2,1,0} %concatenate.236, bf16[64,3,7,7]{3,2,1,0} %p172.227), window={size=7x7}, dim_labels=bf01_oi01->bf01
  %p171.226 = bf16[64]{0} parameter(171)
  %broadcast.238 = bf16[1,512,512,64]{3,2,1,0} broadcast(bf16[64]{0} %p171.226), dimensions={3}
  %transpose.239 = bf16[1,64,512,512]{1,3,2,0} transpose(bf16[1,512,512,64]{3,2,1,0} %broadcast.238), dimensions={0,3,1,2}
  %add.240 = bf16[1,64,512,512]{3,2,1,0} add(bf16[1,64,512,512]{3,2,1,0} %convolution.237, bf16[1,64,512,512]{1,3,2,0} %transpose.239)
  %p170.225 = bf16[64]{0} parameter(170)
  %p169.224 = bf16[64]{0} parameter(169)
  %p168.223 = bf16[64]{0} parameter(168)
  %batch-norm-inference.241 = bf16[1,64,512,512]{3,2,1,0} batch-norm-inference(bf16[1,64,512,512]{3,2,1,0} %add.240, bf16[64]{0} %p170.225, bf16[64]{0} %p169.224, bf16[64]{0} %p168.223, bf16[64]{0} %p167.222), epsilon=1e-05, feature_index=1
  %constant.246 = bf16[] constant(0)
  %broadcast.247 = bf16[1,64,512,512]{3,2,1,0} broadcast(bf16[] %constant.246), dimensions={}
  %maximum.248 = bf16[1,64,512,512]{3,2,1,0} maximum(bf16[1,64,512,512]{3,2,1,0} %batch-norm-inference.241, bf16[1,64,512,512]{3,2,1,0} %broadcast.247)
  %p166.221 = bf16[128,64,3,3]{3,2,1,0} parameter(166)
  %convolution.249 = bf16[1,128,256,256]{3,2,1,0} convolution(bf16[1,64,512,512]{3,2,1,0} %maximum.248, bf16[128,64,3,3]{3,2,1,0} %p166.221), window={size=3x3 stride=2x2 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01
  %p165.220 = bf16[128]{0} parameter(165)
  %broadcast.250 = bf16[1,256,256,128]{3,2,1,0} broadcast(bf16[128]{0} %p165.220), dimensions={3}
  %transpose.251 = bf16[1,128,256,256]{1,3,2,0} transpose(bf16[1,256,256,128]{3,2,1,0} %broadcast.250), dimensions={0,3,1,2}
  %add.252 = bf16[1,128,256,256]{3,2,1,0} add(bf16[1,128,256,256]{3,2,1,0} %convolution.249, bf16[1,128,256,256]{1,3,2,0} %transpose.251)
  %p164.219 = bf16[128]{0} parameter(164)
  %p163.218 = bf16[128]{0} parameter(163)
  %p162.217 = bf16[128]{0} parameter(162)
  %batch-norm-inference.253 = bf16[1,128,256,256]{3,2,1,0} batch-norm-inference(bf16[1,128,256,256]{3,2,1,0} %add.252, bf16[128]{0} %p164.219, bf16[128]{0} %p163.218, bf16[128]{0} %p162.217, bf16[128]{0} %p161.216), epsilon=1e-05, feature_index=1
  %constant.258 = bf16[] constant(0)
  %broadcast.259 = bf16[1,128,256,256]{3,2,1,0} broadcast(bf16[] %constant.258), dimensions={}
  %maximum.260 = bf16[1,128,256,256]{3,2,1,0} maximum(bf16[1,128,256,256]{3,2,1,0} %batch-norm-inference.253, bf16[1,128,256,256]{3,2,1,0} %broadcast.259)
  %p160.215 = bf16[256,128,3,3]{3,2,1,0} parameter(160)
  %convolution.261 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,128,256,256]{3,2,1,0} %maximum.260, bf16[256,128,3,3]{3,2,1,0} %p160.215), window={size=3x3 stride=2x2 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01
  %p159.214 = bf16[256]{0} parameter(159)
  %broadcast.262 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p159.214), dimensions={3}
  %transpose.263 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.262), dimensions={0,3,1,2}
  %add.264 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.261, bf16[1,256,128,128]{1,3,2,0} %transpose.263)
  %p158.213 = bf16[256]{0} parameter(158)
  %p157.212 = bf16[256]{0} parameter(157)
  %p156.211 = bf16[256]{0} parameter(156)
  %batch-norm-inference.265 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.264, bf16[256]{0} %p158.213, bf16[256]{0} %p157.212, bf16[256]{0} %p156.211, bf16[256]{0} %p155.210), epsilon=1e-05, feature_index=1
  %constant.270 = bf16[] constant(0)
  %broadcast.271 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.270), dimensions={}
  %maximum.272 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.265, bf16[1,256,128,128]{3,2,1,0} %broadcast.271)
  %p154.165 = bf16[1,3,512,512]{3,2,1,0} parameter(154)
  %reverse.166 = bf16[1,3,512,512]{3,2,1,0} reverse(bf16[1,3,512,512]{3,2,1,0} %p154.165), dimensions={3}
  %slice.167 = bf16[1,3,512,3]{3,2,1,0} slice(bf16[1,3,512,512]{3,2,1,0} %reverse.166), slice={[0:1], [0:3], [0:512], [508:511]}
  %slice.168 = bf16[1,3,512,3]{3,2,1,0} slice(bf16[1,3,512,512]{3,2,1,0} %reverse.166), slice={[0:1], [0:3], [0:512], [1:4]}
  %concatenate.169 = bf16[1,3,512,518]{3,2,1,0} concatenate(bf16[1,3,512,3]{3,2,1,0} %slice.167, bf16[1,3,512,512]{3,2,1,0} %p154.165, bf16[1,3,512,3]{3,2,1,0} %slice.168), dimensions={3}
  %reverse.170 = bf16[1,3,512,518]{3,2,1,0} reverse(bf16[1,3,512,518]{3,2,1,0} %concatenate.169), dimensions={2}
  %slice.171 = bf16[1,3,3,518]{3,2,1,0} slice(bf16[1,3,512,518]{3,2,1,0} %reverse.170), slice={[0:1], [0:3], [508:511], [0:518]}
  %slice.172 = bf16[1,3,3,518]{3,2,1,0} slice(bf16[1,3,512,518]{3,2,1,0} %reverse.170), slice={[0:1], [0:3], [1:4], [0:518]}
  %concatenate.173 = bf16[1,3,518,518]{3,2,1,0} concatenate(bf16[1,3,3,518]{3,2,1,0} %slice.171, bf16[1,3,512,518]{3,2,1,0} %concatenate.169, bf16[1,3,3,518]{3,2,1,0} %slice.172), dimensions={2}
  %p153.164 = bf16[64,3,7,7]{3,2,1,0} parameter(153)
  %convolution.174 = bf16[1,64,512,512]{3,2,1,0} convolution(bf16[1,3,518,518]{3,2,1,0} %concatenate.173, bf16[64,3,7,7]{3,2,1,0} %p153.164), window={size=7x7}, dim_labels=bf01_oi01->bf01
  %p152.163 = bf16[64]{0} parameter(152)
  %broadcast.175 = bf16[1,512,512,64]{3,2,1,0} broadcast(bf16[64]{0} %p152.163), dimensions={3}
  %transpose.176 = bf16[1,64,512,512]{1,3,2,0} transpose(bf16[1,512,512,64]{3,2,1,0} %broadcast.175), dimensions={0,3,1,2}
  %add.177 = bf16[1,64,512,512]{3,2,1,0} add(bf16[1,64,512,512]{3,2,1,0} %convolution.174, bf16[1,64,512,512]{1,3,2,0} %transpose.176)
  %p151.162 = bf16[64]{0} parameter(151)
  %p150.161 = bf16[64]{0} parameter(150)
  %p149.160 = bf16[64]{0} parameter(149)
  %batch-norm-inference.178 = bf16[1,64,512,512]{3,2,1,0} batch-norm-inference(bf16[1,64,512,512]{3,2,1,0} %add.177, bf16[64]{0} %p151.162, bf16[64]{0} %p150.161, bf16[64]{0} %p149.160, bf16[64]{0} %p148.159), epsilon=1e-05, feature_index=1
  %constant.183 = bf16[] constant(0)
  %broadcast.184 = bf16[1,64,512,512]{3,2,1,0} broadcast(bf16[] %constant.183), dimensions={}
  %maximum.185 = bf16[1,64,512,512]{3,2,1,0} maximum(bf16[1,64,512,512]{3,2,1,0} %batch-norm-inference.178, bf16[1,64,512,512]{3,2,1,0} %broadcast.184)
  %p147.158 = bf16[128,64,3,3]{3,2,1,0} parameter(147)
  %convolution.186 = bf16[1,128,256,256]{3,2,1,0} convolution(bf16[1,64,512,512]{3,2,1,0} %maximum.185, bf16[128,64,3,3]{3,2,1,0} %p147.158), window={size=3x3 stride=2x2 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01
  %p146.157 = bf16[128]{0} parameter(146)
  %broadcast.187 = bf16[1,256,256,128]{3,2,1,0} broadcast(bf16[128]{0} %p146.157), dimensions={3}
  %transpose.188 = bf16[1,128,256,256]{1,3,2,0} transpose(bf16[1,256,256,128]{3,2,1,0} %broadcast.187), dimensions={0,3,1,2}
  %add.189 = bf16[1,128,256,256]{3,2,1,0} add(bf16[1,128,256,256]{3,2,1,0} %convolution.186, bf16[1,128,256,256]{1,3,2,0} %transpose.188)
  %p145.156 = bf16[128]{0} parameter(145)
  %p144.155 = bf16[128]{0} parameter(144)
  %p143.154 = bf16[128]{0} parameter(143)
  %batch-norm-inference.190 = bf16[1,128,256,256]{3,2,1,0} batch-norm-inference(bf16[1,128,256,256]{3,2,1,0} %add.189, bf16[128]{0} %p145.156, bf16[128]{0} %p144.155, bf16[128]{0} %p143.154, bf16[128]{0} %p142.153), epsilon=1e-05, feature_index=1
  %constant.195 = bf16[] constant(0)
  %broadcast.196 = bf16[1,128,256,256]{3,2,1,0} broadcast(bf16[] %constant.195), dimensions={}
  %maximum.197 = bf16[1,128,256,256]{3,2,1,0} maximum(bf16[1,128,256,256]{3,2,1,0} %batch-norm-inference.190, bf16[1,128,256,256]{3,2,1,0} %broadcast.196)
  %p141.152 = bf16[256,128,3,3]{3,2,1,0} parameter(141)
  %convolution.198 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,128,256,256]{3,2,1,0} %maximum.197, bf16[256,128,3,3]{3,2,1,0} %p141.152), window={size=3x3 stride=2x2 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01
  %p140.151 = bf16[256]{0} parameter(140)
  %broadcast.199 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p140.151), dimensions={3}
  %transpose.200 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.199), dimensions={0,3,1,2}
  %add.201 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.198, bf16[1,256,128,128]{1,3,2,0} %transpose.200)
  %p139.150 = bf16[256]{0} parameter(139)
  %p138.149 = bf16[256]{0} parameter(138)
  %p137.148 = bf16[256]{0} parameter(137)
  %batch-norm-inference.202 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.201, bf16[256]{0} %p139.150, bf16[256]{0} %p138.149, bf16[256]{0} %p137.148, bf16[256]{0} %p136.147), epsilon=1e-05, feature_index=1
  %constant.207 = bf16[] constant(0)
  %broadcast.208 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.207), dimensions={}
  %maximum.209 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.202, bf16[1,256,128,128]{3,2,1,0} %broadcast.208)
  %concatenate.366 = bf16[1,512,128,128]{3,2,1,0} concatenate(bf16[1,256,128,128]{3,2,1,0} %maximum.272, bf16[1,256,128,128]{3,2,1,0} %maximum.209), dimensions={1}
  %p202.365 = bf16[64,512,1,1]{3,2,1,0} parameter(202)
  %convolution.367 = bf16[1,64,128,128]{3,2,1,0} convolution(bf16[1,512,128,128]{3,2,1,0} %concatenate.366, bf16[64,512,1,1]{3,2,1,0} %p202.365), window={size=1x1}, dim_labels=bf01_oi01->bf01
  %p201.364 = bf16[64]{0} parameter(201)
  %p200.363 = bf16[64]{0} parameter(200)
  %p199.362 = bf16[64]{0} parameter(199)
  %batch-norm-inference.368 = bf16[1,64,128,128]{3,2,1,0} batch-norm-inference(bf16[1,64,128,128]{3,2,1,0} %convolution.367, bf16[64]{0} %p201.364, bf16[64]{0} %p200.363, bf16[64]{0} %p199.362, bf16[64]{0} %p198.361), epsilon=1e-05, feature_index=1
  %constant.373 = bf16[] constant(0)
  %broadcast.374 = bf16[1,64,128,128]{3,2,1,0} broadcast(bf16[] %constant.373), dimensions={}
  %maximum.375 = bf16[1,64,128,128]{3,2,1,0} maximum(bf16[1,64,128,128]{3,2,1,0} %batch-norm-inference.368, bf16[1,64,128,128]{3,2,1,0} %broadcast.374)
  %p197.306 = bf16[1,1,512,512]{3,2,1,0} parameter(197)
  %reverse.307 = bf16[1,1,512,512]{3,2,1,0} reverse(bf16[1,1,512,512]{3,2,1,0} %p197.306), dimensions={3}
  %slice.308 = bf16[1,1,512,3]{3,2,1,0} slice(bf16[1,1,512,512]{3,2,1,0} %reverse.307), slice={[0:1], [0:1], [0:512], [508:511]}
  %slice.309 = bf16[1,1,512,3]{3,2,1,0} slice(bf16[1,1,512,512]{3,2,1,0} %reverse.307), slice={[0:1], [0:1], [0:512], [1:4]}
  %concatenate.310 = bf16[1,1,512,518]{3,2,1,0} concatenate(bf16[1,1,512,3]{3,2,1,0} %slice.308, bf16[1,1,512,512]{3,2,1,0} %p197.306, bf16[1,1,512,3]{3,2,1,0} %slice.309), dimensions={3}
  %reverse.311 = bf16[1,1,512,518]{3,2,1,0} reverse(bf16[1,1,512,518]{3,2,1,0} %concatenate.310), dimensions={2}
  %slice.312 = bf16[1,1,3,518]{3,2,1,0} slice(bf16[1,1,512,518]{3,2,1,0} %reverse.311), slice={[0:1], [0:1], [508:511], [0:518]}
  %slice.313 = bf16[1,1,3,518]{3,2,1,0} slice(bf16[1,1,512,518]{3,2,1,0} %reverse.311), slice={[0:1], [0:1], [1:4], [0:518]}
  %concatenate.314 = bf16[1,1,518,518]{3,2,1,0} concatenate(bf16[1,1,3,518]{3,2,1,0} %slice.312, bf16[1,1,512,518]{3,2,1,0} %concatenate.310, bf16[1,1,3,518]{3,2,1,0} %slice.313), dimensions={2}
  %p196.305 = bf16[64,1,7,7]{3,2,1,0} parameter(196)
  %convolution.315 = bf16[1,64,512,512]{3,2,1,0} convolution(bf16[1,1,518,518]{3,2,1,0} %concatenate.314, bf16[64,1,7,7]{3,2,1,0} %p196.305), window={size=7x7}, dim_labels=bf01_oi01->bf01
  %p195.304 = bf16[64]{0} parameter(195)
  %broadcast.316 = bf16[1,512,512,64]{3,2,1,0} broadcast(bf16[64]{0} %p195.304), dimensions={3}
  %transpose.317 = bf16[1,64,512,512]{1,3,2,0} transpose(bf16[1,512,512,64]{3,2,1,0} %broadcast.316), dimensions={0,3,1,2}
  %add.318 = bf16[1,64,512,512]{3,2,1,0} add(bf16[1,64,512,512]{3,2,1,0} %convolution.315, bf16[1,64,512,512]{1,3,2,0} %transpose.317)
  %p194.303 = bf16[64]{0} parameter(194)
  %p193.302 = bf16[64]{0} parameter(193)
  %p192.301 = bf16[64]{0} parameter(192)
  %batch-norm-inference.319 = bf16[1,64,512,512]{3,2,1,0} batch-norm-inference(bf16[1,64,512,512]{3,2,1,0} %add.318, bf16[64]{0} %p194.303, bf16[64]{0} %p193.302, bf16[64]{0} %p192.301, bf16[64]{0} %p191.300), epsilon=1e-05, feature_index=1
  %constant.324 = bf16[] constant(0)
  %broadcast.325 = bf16[1,64,512,512]{3,2,1,0} broadcast(bf16[] %constant.324), dimensions={}
  %maximum.326 = bf16[1,64,512,512]{3,2,1,0} maximum(bf16[1,64,512,512]{3,2,1,0} %batch-norm-inference.319, bf16[1,64,512,512]{3,2,1,0} %broadcast.325)
  %p190.299 = bf16[128,64,3,3]{3,2,1,0} parameter(190)
  %convolution.327 = bf16[1,128,256,256]{3,2,1,0} convolution(bf16[1,64,512,512]{3,2,1,0} %maximum.326, bf16[128,64,3,3]{3,2,1,0} %p190.299), window={size=3x3 stride=2x2 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01
  %p189.298 = bf16[128]{0} parameter(189)
  %broadcast.328 = bf16[1,256,256,128]{3,2,1,0} broadcast(bf16[128]{0} %p189.298), dimensions={3}
  %transpose.329 = bf16[1,128,256,256]{1,3,2,0} transpose(bf16[1,256,256,128]{3,2,1,0} %broadcast.328), dimensions={0,3,1,2}
  %add.330 = bf16[1,128,256,256]{3,2,1,0} add(bf16[1,128,256,256]{3,2,1,0} %convolution.327, bf16[1,128,256,256]{1,3,2,0} %transpose.329)
  %p188.297 = bf16[128]{0} parameter(188)
  %p187.296 = bf16[128]{0} parameter(187)
  %p186.295 = bf16[128]{0} parameter(186)
  %batch-norm-inference.331 = bf16[1,128,256,256]{3,2,1,0} batch-norm-inference(bf16[1,128,256,256]{3,2,1,0} %add.330, bf16[128]{0} %p188.297, bf16[128]{0} %p187.296, bf16[128]{0} %p186.295, bf16[128]{0} %p185.294), epsilon=1e-05, feature_index=1
  %constant.336 = bf16[] constant(0)
  %broadcast.337 = bf16[1,128,256,256]{3,2,1,0} broadcast(bf16[] %constant.336), dimensions={}
  %maximum.338 = bf16[1,128,256,256]{3,2,1,0} maximum(bf16[1,128,256,256]{3,2,1,0} %batch-norm-inference.331, bf16[1,128,256,256]{3,2,1,0} %broadcast.337)
  %p184.293 = bf16[256,128,3,3]{3,2,1,0} parameter(184)
  %convolution.339 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,128,256,256]{3,2,1,0} %maximum.338, bf16[256,128,3,3]{3,2,1,0} %p184.293), window={size=3x3 stride=2x2 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01
  %p183.292 = bf16[256]{0} parameter(183)
  %broadcast.340 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p183.292), dimensions={3}
  %transpose.341 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.340), dimensions={0,3,1,2}
  %add.342 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.339, bf16[1,256,128,128]{1,3,2,0} %transpose.341)
  %p182.291 = bf16[256]{0} parameter(182)
  %p181.290 = bf16[256]{0} parameter(181)
  %p180.289 = bf16[256]{0} parameter(180)
  %batch-norm-inference.343 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.342, bf16[256]{0} %p182.291, bf16[256]{0} %p181.290, bf16[256]{0} %p180.289, bf16[256]{0} %p179.288), epsilon=1e-05, feature_index=1
  %constant.348 = bf16[] constant(0)
  %broadcast.349 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.348), dimensions={}
  %maximum.350 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.343, bf16[1,256,128,128]{3,2,1,0} %broadcast.349)
  %concatenate.351 = bf16[1,512,128,128]{3,2,1,0} concatenate(bf16[1,256,128,128]{3,2,1,0} %maximum.272, bf16[1,256,128,128]{3,2,1,0} %maximum.350), dimensions={1}
  %p178.287 = bf16[64,512,1,1]{3,2,1,0} parameter(178)
  %convolution.352 = bf16[1,64,128,128]{3,2,1,0} convolution(bf16[1,512,128,128]{3,2,1,0} %concatenate.351, bf16[64,512,1,1]{3,2,1,0} %p178.287), window={size=1x1}, dim_labels=bf01_oi01->bf01
  %p177.286 = bf16[64]{0} parameter(177)
  %p176.285 = bf16[64]{0} parameter(176)
  %p175.284 = bf16[64]{0} parameter(175)
  %batch-norm-inference.353 = bf16[1,64,128,128]{3,2,1,0} batch-norm-inference(bf16[1,64,128,128]{3,2,1,0} %convolution.352, bf16[64]{0} %p177.286, bf16[64]{0} %p176.285, bf16[64]{0} %p175.284, bf16[64]{0} %p174.283), epsilon=1e-05, feature_index=1
  %constant.358 = bf16[] constant(0)
  %broadcast.359 = bf16[1,64,128,128]{3,2,1,0} broadcast(bf16[] %constant.358), dimensions={}
  %maximum.360 = bf16[1,64,128,128]{3,2,1,0} maximum(bf16[1,64,128,128]{3,2,1,0} %batch-norm-inference.353, bf16[1,64,128,128]{3,2,1,0} %broadcast.359)
  %concatenate.273 = bf16[1,512,128,128]{3,2,1,0} concatenate(bf16[1,256,128,128]{3,2,1,0} %maximum.272, bf16[1,256,128,128]{3,2,1,0} %maximum.209), dimensions={1}
  %p135.146 = bf16[64,512,1,1]{3,2,1,0} parameter(135)
  %convolution.274 = bf16[1,64,128,128]{3,2,1,0} convolution(bf16[1,512,128,128]{3,2,1,0} %concatenate.273, bf16[64,512,1,1]{3,2,1,0} %p135.146), window={size=1x1}, dim_labels=bf01_oi01->bf01
  %p134.145 = bf16[64]{0} parameter(134)
  %p133.144 = bf16[64]{0} parameter(133)
  %p132.143 = bf16[64]{0} parameter(132)
  %batch-norm-inference.275 = bf16[1,64,128,128]{3,2,1,0} batch-norm-inference(bf16[1,64,128,128]{3,2,1,0} %convolution.274, bf16[64]{0} %p134.145, bf16[64]{0} %p133.144, bf16[64]{0} %p132.143, bf16[64]{0} %p131.142), epsilon=1e-05, feature_index=1
  %constant.280 = bf16[] constant(0)
  %broadcast.281 = bf16[1,64,128,128]{3,2,1,0} broadcast(bf16[] %constant.280), dimensions={}
  %maximum.282 = bf16[1,64,128,128]{3,2,1,0} maximum(bf16[1,64,128,128]{3,2,1,0} %batch-norm-inference.275, bf16[1,64,128,128]{3,2,1,0} %broadcast.281)
  %concatenate.376 = bf16[1,192,128,128]{3,2,1,0} concatenate(bf16[1,64,128,128]{3,2,1,0} %maximum.375, bf16[1,64,128,128]{3,2,1,0} %maximum.360, bf16[1,64,128,128]{3,2,1,0} %maximum.282), dimensions={1}
  %concatenate.377 = bf16[1,448,128,128]{3,2,1,0} concatenate(bf16[1,256,128,128]{3,2,1,0} %maximum.272, bf16[1,192,128,128]{3,2,1,0} %concatenate.376), dimensions={1}
  %p130.141 = bf16[256,448,1,1]{3,2,1,0} parameter(130)
  %convolution.378 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,448,128,128]{3,2,1,0} %concatenate.377, bf16[256,448,1,1]{3,2,1,0} %p130.141), window={size=1x1}, dim_labels=bf01_oi01->bf01
  %p129.140 = bf16[256]{0} parameter(129)
  %p128.139 = bf16[256]{0} parameter(128)
  %p127.138 = bf16[256]{0} parameter(127)
  %batch-norm-inference.379 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %convolution.378, bf16[256]{0} %p129.140, bf16[256]{0} %p128.139, bf16[256]{0} %p127.138, bf16[256]{0} %p126.137), epsilon=1e-05, feature_index=1
  %constant.384 = bf16[] constant(0)
  %broadcast.385 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.384), dimensions={}
  %maximum.386 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.379, bf16[1,256,128,128]{3,2,1,0} %broadcast.385)
  %reverse.387 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.386), dimensions={3}
  %slice.388 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.387), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.389 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.387), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.390 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.388, bf16[1,256,128,128]{3,2,1,0} %maximum.386, bf16[1,256,128,1]{3,2,1,0} %slice.389), dimensions={3}
  %reverse.391 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.390), dimensions={2}
  %slice.392 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.391), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.393 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.391), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.394 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.392, bf16[1,256,128,130]{3,2,1,0} %concatenate.390, bf16[1,256,1,130]{3,2,1,0} %slice.393), dimensions={2}
  %p125.136 = bf16[256,256,3,3]{3,2,1,0} parameter(125)
  %convolution.395 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.394, bf16[256,256,3,3]{3,2,1,0} %p125.136), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p124.135 = bf16[256]{0} parameter(124)
  %broadcast.396 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p124.135), dimensions={3}
  %transpose.397 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.396), dimensions={0,3,1,2}
  %add.398 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.395, bf16[1,256,128,128]{1,3,2,0} %transpose.397)
  %p123.134 = bf16[256]{0} parameter(123)
  %p122.133 = bf16[256]{0} parameter(122)
  %p121.132 = bf16[256]{0} parameter(121)
  %batch-norm-inference.399 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.398, bf16[256]{0} %p123.134, bf16[256]{0} %p122.133, bf16[256]{0} %p121.132, bf16[256]{0} %p120.131), epsilon=1e-05, feature_index=1
  %constant.404 = bf16[] constant(0)
  %broadcast.405 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.404), dimensions={}
  %maximum.406 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.399, bf16[1,256,128,128]{3,2,1,0} %broadcast.405)
  %reverse.407 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.406), dimensions={3}
  %slice.408 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.407), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.409 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.407), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.410 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.408, bf16[1,256,128,128]{3,2,1,0} %maximum.406, bf16[1,256,128,1]{3,2,1,0} %slice.409), dimensions={3}
  %reverse.411 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.410), dimensions={2}
  %slice.412 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.411), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.413 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.411), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.414 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.412, bf16[1,256,128,130]{3,2,1,0} %concatenate.410, bf16[1,256,1,130]{3,2,1,0} %slice.413), dimensions={2}
  %p119.130 = bf16[256,256,3,3]{3,2,1,0} parameter(119)
  %convolution.415 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.414, bf16[256,256,3,3]{3,2,1,0} %p119.130), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p118.129 = bf16[256]{0} parameter(118)
  %broadcast.416 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p118.129), dimensions={3}
  %transpose.417 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.416), dimensions={0,3,1,2}
  %add.418 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.415, bf16[1,256,128,128]{1,3,2,0} %transpose.417)
  %p117.128 = bf16[256]{0} parameter(117)
  %p116.127 = bf16[256]{0} parameter(116)
  %p115.126 = bf16[256]{0} parameter(115)
  %batch-norm-inference.419 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.418, bf16[256]{0} %p117.128, bf16[256]{0} %p116.127, bf16[256]{0} %p115.126, bf16[256]{0} %p114.125), epsilon=1e-05, feature_index=1
  %constant.124 = bf16[] constant(1)
  %broadcast.424 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.124), dimensions={}
  %multiply.425 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.419, bf16[1,256,128,128]{3,2,1,0} %broadcast.424)
  %add.426 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %maximum.386, bf16[1,256,128,128]{3,2,1,0} %multiply.425)
  %reverse.427 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %add.426), dimensions={3}
  %slice.428 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.427), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.429 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.427), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.430 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.428, bf16[1,256,128,128]{3,2,1,0} %add.426, bf16[1,256,128,1]{3,2,1,0} %slice.429), dimensions={3}
  %reverse.431 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.430), dimensions={2}
  %slice.432 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.431), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.433 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.431), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.434 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.432, bf16[1,256,128,130]{3,2,1,0} %concatenate.430, bf16[1,256,1,130]{3,2,1,0} %slice.433), dimensions={2}
  %p113.123 = bf16[256,256,3,3]{3,2,1,0} parameter(113)
  %convolution.435 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.434, bf16[256,256,3,3]{3,2,1,0} %p113.123), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p112.122 = bf16[256]{0} parameter(112)
  %broadcast.436 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p112.122), dimensions={3}
  %transpose.437 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.436), dimensions={0,3,1,2}
  %add.438 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.435, bf16[1,256,128,128]{1,3,2,0} %transpose.437)
  %p111.121 = bf16[256]{0} parameter(111)
  %p110.120 = bf16[256]{0} parameter(110)
  %p109.119 = bf16[256]{0} parameter(109)
  %batch-norm-inference.439 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.438, bf16[256]{0} %p111.121, bf16[256]{0} %p110.120, bf16[256]{0} %p109.119, bf16[256]{0} %p108.118), epsilon=1e-05, feature_index=1
  %constant.444 = bf16[] constant(0)
  %broadcast.445 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.444), dimensions={}
  %maximum.446 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.439, bf16[1,256,128,128]{3,2,1,0} %broadcast.445)
  %reverse.447 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.446), dimensions={3}
  %slice.448 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.447), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.449 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.447), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.450 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.448, bf16[1,256,128,128]{3,2,1,0} %maximum.446, bf16[1,256,128,1]{3,2,1,0} %slice.449), dimensions={3}
  %reverse.451 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.450), dimensions={2}
  %slice.452 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.451), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.453 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.451), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.454 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.452, bf16[1,256,128,130]{3,2,1,0} %concatenate.450, bf16[1,256,1,130]{3,2,1,0} %slice.453), dimensions={2}
  %p107.117 = bf16[256,256,3,3]{3,2,1,0} parameter(107)
  %convolution.455 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.454, bf16[256,256,3,3]{3,2,1,0} %p107.117), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p106.116 = bf16[256]{0} parameter(106)
  %broadcast.456 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p106.116), dimensions={3}
  %transpose.457 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.456), dimensions={0,3,1,2}
  %add.458 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.455, bf16[1,256,128,128]{1,3,2,0} %transpose.457)
  %p105.115 = bf16[256]{0} parameter(105)
  %p104.114 = bf16[256]{0} parameter(104)
  %p103.113 = bf16[256]{0} parameter(103)
  %batch-norm-inference.459 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.458, bf16[256]{0} %p105.115, bf16[256]{0} %p104.114, bf16[256]{0} %p103.113, bf16[256]{0} %p102.112), epsilon=1e-05, feature_index=1
  %constant.111 = bf16[] constant(1)
  %broadcast.464 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.111), dimensions={}
  %multiply.465 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.459, bf16[1,256,128,128]{3,2,1,0} %broadcast.464)
  %add.466 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %add.426, bf16[1,256,128,128]{3,2,1,0} %multiply.465)
  %reverse.467 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %add.466), dimensions={3}
  %slice.468 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.467), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.469 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.467), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.470 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.468, bf16[1,256,128,128]{3,2,1,0} %add.466, bf16[1,256,128,1]{3,2,1,0} %slice.469), dimensions={3}
  %reverse.471 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.470), dimensions={2}
  %slice.472 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.471), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.473 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.471), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.474 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.472, bf16[1,256,128,130]{3,2,1,0} %concatenate.470, bf16[1,256,1,130]{3,2,1,0} %slice.473), dimensions={2}
  %p101.110 = bf16[256,256,3,3]{3,2,1,0} parameter(101)
  %convolution.475 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.474, bf16[256,256,3,3]{3,2,1,0} %p101.110), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p100.109 = bf16[256]{0} parameter(100)
  %broadcast.476 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p100.109), dimensions={3}
  %transpose.477 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.476), dimensions={0,3,1,2}
  %add.478 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.475, bf16[1,256,128,128]{1,3,2,0} %transpose.477)
  %p99.108 = bf16[256]{0} parameter(99)
  %p98.107 = bf16[256]{0} parameter(98)
  %p97.106 = bf16[256]{0} parameter(97)
  %batch-norm-inference.479 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.478, bf16[256]{0} %p99.108, bf16[256]{0} %p98.107, bf16[256]{0} %p97.106, bf16[256]{0} %p96.105), epsilon=1e-05, feature_index=1
  %constant.484 = bf16[] constant(0)
  %broadcast.485 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.484), dimensions={}
  %maximum.486 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.479, bf16[1,256,128,128]{3,2,1,0} %broadcast.485)
  %reverse.487 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.486), dimensions={3}
  %slice.488 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.487), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.489 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.487), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.490 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.488, bf16[1,256,128,128]{3,2,1,0} %maximum.486, bf16[1,256,128,1]{3,2,1,0} %slice.489), dimensions={3}
  %reverse.491 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.490), dimensions={2}
  %slice.492 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.491), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.493 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.491), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.494 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.492, bf16[1,256,128,130]{3,2,1,0} %concatenate.490, bf16[1,256,1,130]{3,2,1,0} %slice.493), dimensions={2}
  %p95.104 = bf16[256,256,3,3]{3,2,1,0} parameter(95)
  %convolution.495 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.494, bf16[256,256,3,3]{3,2,1,0} %p95.104), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p94.103 = bf16[256]{0} parameter(94)
  %broadcast.496 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p94.103), dimensions={3}
  %transpose.497 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.496), dimensions={0,3,1,2}
  %add.498 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.495, bf16[1,256,128,128]{1,3,2,0} %transpose.497)
  %p93.102 = bf16[256]{0} parameter(93)
  %p92.101 = bf16[256]{0} parameter(92)
  %p91.100 = bf16[256]{0} parameter(91)
  %batch-norm-inference.499 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.498, bf16[256]{0} %p93.102, bf16[256]{0} %p92.101, bf16[256]{0} %p91.100, bf16[256]{0} %p90.99), epsilon=1e-05, feature_index=1
  %constant.98 = bf16[] constant(1)
  %broadcast.504 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.98), dimensions={}
  %multiply.505 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.499, bf16[1,256,128,128]{3,2,1,0} %broadcast.504)
  %add.506 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %add.466, bf16[1,256,128,128]{3,2,1,0} %multiply.505)
  %reverse.507 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %add.506), dimensions={3}
  %slice.508 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.507), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.509 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.507), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.510 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.508, bf16[1,256,128,128]{3,2,1,0} %add.506, bf16[1,256,128,1]{3,2,1,0} %slice.509), dimensions={3}
  %reverse.511 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.510), dimensions={2}
  %slice.512 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.511), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.513 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.511), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.514 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.512, bf16[1,256,128,130]{3,2,1,0} %concatenate.510, bf16[1,256,1,130]{3,2,1,0} %slice.513), dimensions={2}
  %p89.97 = bf16[256,256,3,3]{3,2,1,0} parameter(89)
  %convolution.515 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.514, bf16[256,256,3,3]{3,2,1,0} %p89.97), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p88.96 = bf16[256]{0} parameter(88)
  %broadcast.516 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p88.96), dimensions={3}
  %transpose.517 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.516), dimensions={0,3,1,2}
  %add.518 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.515, bf16[1,256,128,128]{1,3,2,0} %transpose.517)
  %p87.95 = bf16[256]{0} parameter(87)
  %p86.94 = bf16[256]{0} parameter(86)
  %p85.93 = bf16[256]{0} parameter(85)
  %batch-norm-inference.519 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.518, bf16[256]{0} %p87.95, bf16[256]{0} %p86.94, bf16[256]{0} %p85.93, bf16[256]{0} %p84.92), epsilon=1e-05, feature_index=1
  %constant.524 = bf16[] constant(0)
  %broadcast.525 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.524), dimensions={}
  %maximum.526 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.519, bf16[1,256,128,128]{3,2,1,0} %broadcast.525)
  %reverse.527 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.526), dimensions={3}
  %slice.528 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.527), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.529 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.527), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.530 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.528, bf16[1,256,128,128]{3,2,1,0} %maximum.526, bf16[1,256,128,1]{3,2,1,0} %slice.529), dimensions={3}
  %reverse.531 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.530), dimensions={2}
  %slice.532 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.531), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.533 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.531), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.534 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.532, bf16[1,256,128,130]{3,2,1,0} %concatenate.530, bf16[1,256,1,130]{3,2,1,0} %slice.533), dimensions={2}
  %p83.91 = bf16[256,256,3,3]{3,2,1,0} parameter(83)
  %convolution.535 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.534, bf16[256,256,3,3]{3,2,1,0} %p83.91), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p82.90 = bf16[256]{0} parameter(82)
  %broadcast.536 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p82.90), dimensions={3}
  %transpose.537 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.536), dimensions={0,3,1,2}
  %add.538 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.535, bf16[1,256,128,128]{1,3,2,0} %transpose.537)
  %p81.89 = bf16[256]{0} parameter(81)
  %p80.88 = bf16[256]{0} parameter(80)
  %p79.87 = bf16[256]{0} parameter(79)
  %batch-norm-inference.539 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.538, bf16[256]{0} %p81.89, bf16[256]{0} %p80.88, bf16[256]{0} %p79.87, bf16[256]{0} %p78.86), epsilon=1e-05, feature_index=1
  %constant.85 = bf16[] constant(1)
  %broadcast.544 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.85), dimensions={}
  %multiply.545 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.539, bf16[1,256,128,128]{3,2,1,0} %broadcast.544)
  %add.546 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %add.506, bf16[1,256,128,128]{3,2,1,0} %multiply.545)
  %reverse.547 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %add.546), dimensions={3}
  %slice.548 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.547), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.549 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.547), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.550 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.548, bf16[1,256,128,128]{3,2,1,0} %add.546, bf16[1,256,128,1]{3,2,1,0} %slice.549), dimensions={3}
  %reverse.551 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.550), dimensions={2}
  %slice.552 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.551), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.553 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.551), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.554 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.552, bf16[1,256,128,130]{3,2,1,0} %concatenate.550, bf16[1,256,1,130]{3,2,1,0} %slice.553), dimensions={2}
  %p77.84 = bf16[256,256,3,3]{3,2,1,0} parameter(77)
  %convolution.555 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.554, bf16[256,256,3,3]{3,2,1,0} %p77.84), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p76.83 = bf16[256]{0} parameter(76)
  %broadcast.556 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p76.83), dimensions={3}
  %transpose.557 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.556), dimensions={0,3,1,2}
  %add.558 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.555, bf16[1,256,128,128]{1,3,2,0} %transpose.557)
  %p75.82 = bf16[256]{0} parameter(75)
  %p74.81 = bf16[256]{0} parameter(74)
  %p73.80 = bf16[256]{0} parameter(73)
  %batch-norm-inference.559 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.558, bf16[256]{0} %p75.82, bf16[256]{0} %p74.81, bf16[256]{0} %p73.80, bf16[256]{0} %p72.79), epsilon=1e-05, feature_index=1
  %constant.564 = bf16[] constant(0)
  %broadcast.565 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.564), dimensions={}
  %maximum.566 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.559, bf16[1,256,128,128]{3,2,1,0} %broadcast.565)
  %reverse.567 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.566), dimensions={3}
  %slice.568 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.567), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.569 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.567), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.570 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.568, bf16[1,256,128,128]{3,2,1,0} %maximum.566, bf16[1,256,128,1]{3,2,1,0} %slice.569), dimensions={3}
  %reverse.571 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.570), dimensions={2}
  %slice.572 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.571), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.573 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.571), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.574 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.572, bf16[1,256,128,130]{3,2,1,0} %concatenate.570, bf16[1,256,1,130]{3,2,1,0} %slice.573), dimensions={2}
  %p71.78 = bf16[256,256,3,3]{3,2,1,0} parameter(71)
  %convolution.575 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.574, bf16[256,256,3,3]{3,2,1,0} %p71.78), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p70.77 = bf16[256]{0} parameter(70)
  %broadcast.576 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p70.77), dimensions={3}
  %transpose.577 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.576), dimensions={0,3,1,2}
  %add.578 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.575, bf16[1,256,128,128]{1,3,2,0} %transpose.577)
  %p69.76 = bf16[256]{0} parameter(69)
  %p68.75 = bf16[256]{0} parameter(68)
  %p67.74 = bf16[256]{0} parameter(67)
  %batch-norm-inference.579 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.578, bf16[256]{0} %p69.76, bf16[256]{0} %p68.75, bf16[256]{0} %p67.74, bf16[256]{0} %p66.73), epsilon=1e-05, feature_index=1
  %constant.72 = bf16[] constant(1)
  %broadcast.584 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.72), dimensions={}
  %multiply.585 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.579, bf16[1,256,128,128]{3,2,1,0} %broadcast.584)
  %add.586 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %add.546, bf16[1,256,128,128]{3,2,1,0} %multiply.585)
  %reverse.587 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %add.586), dimensions={3}
  %slice.588 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.587), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.589 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.587), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.590 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.588, bf16[1,256,128,128]{3,2,1,0} %add.586, bf16[1,256,128,1]{3,2,1,0} %slice.589), dimensions={3}
  %reverse.591 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.590), dimensions={2}
  %slice.592 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.591), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.593 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.591), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.594 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.592, bf16[1,256,128,130]{3,2,1,0} %concatenate.590, bf16[1,256,1,130]{3,2,1,0} %slice.593), dimensions={2}
  %p65.71 = bf16[256,256,3,3]{3,2,1,0} parameter(65)
  %convolution.595 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.594, bf16[256,256,3,3]{3,2,1,0} %p65.71), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p64.70 = bf16[256]{0} parameter(64)
  %broadcast.596 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p64.70), dimensions={3}
  %transpose.597 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.596), dimensions={0,3,1,2}
  %add.598 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.595, bf16[1,256,128,128]{1,3,2,0} %transpose.597)
  %p63.69 = bf16[256]{0} parameter(63)
  %p62.68 = bf16[256]{0} parameter(62)
  %p61.67 = bf16[256]{0} parameter(61)
  %batch-norm-inference.599 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.598, bf16[256]{0} %p63.69, bf16[256]{0} %p62.68, bf16[256]{0} %p61.67, bf16[256]{0} %p60.66), epsilon=1e-05, feature_index=1
  %constant.604 = bf16[] constant(0)
  %broadcast.605 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.604), dimensions={}
  %maximum.606 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.599, bf16[1,256,128,128]{3,2,1,0} %broadcast.605)
  %reverse.607 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.606), dimensions={3}
  %slice.608 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.607), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.609 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.607), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.610 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.608, bf16[1,256,128,128]{3,2,1,0} %maximum.606, bf16[1,256,128,1]{3,2,1,0} %slice.609), dimensions={3}
  %reverse.611 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.610), dimensions={2}
  %slice.612 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.611), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.613 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.611), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.614 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.612, bf16[1,256,128,130]{3,2,1,0} %concatenate.610, bf16[1,256,1,130]{3,2,1,0} %slice.613), dimensions={2}
  %p59.65 = bf16[256,256,3,3]{3,2,1,0} parameter(59)
  %convolution.615 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.614, bf16[256,256,3,3]{3,2,1,0} %p59.65), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p58.64 = bf16[256]{0} parameter(58)
  %broadcast.616 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p58.64), dimensions={3}
  %transpose.617 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.616), dimensions={0,3,1,2}
  %add.618 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.615, bf16[1,256,128,128]{1,3,2,0} %transpose.617)
  %p57.63 = bf16[256]{0} parameter(57)
  %p56.62 = bf16[256]{0} parameter(56)
  %p55.61 = bf16[256]{0} parameter(55)
  %batch-norm-inference.619 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.618, bf16[256]{0} %p57.63, bf16[256]{0} %p56.62, bf16[256]{0} %p55.61, bf16[256]{0} %p54.60), epsilon=1e-05, feature_index=1
  %constant.59 = bf16[] constant(1)
  %broadcast.624 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.59), dimensions={}
  %multiply.625 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.619, bf16[1,256,128,128]{3,2,1,0} %broadcast.624)
  %add.626 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %add.586, bf16[1,256,128,128]{3,2,1,0} %multiply.625)
  %reverse.627 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %add.626), dimensions={3}
  %slice.628 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.627), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.629 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.627), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.630 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.628, bf16[1,256,128,128]{3,2,1,0} %add.626, bf16[1,256,128,1]{3,2,1,0} %slice.629), dimensions={3}
  %reverse.631 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.630), dimensions={2}
  %slice.632 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.631), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.633 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.631), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.634 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.632, bf16[1,256,128,130]{3,2,1,0} %concatenate.630, bf16[1,256,1,130]{3,2,1,0} %slice.633), dimensions={2}
  %p53.58 = bf16[256,256,3,3]{3,2,1,0} parameter(53)
  %convolution.635 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.634, bf16[256,256,3,3]{3,2,1,0} %p53.58), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p52.57 = bf16[256]{0} parameter(52)
  %broadcast.636 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p52.57), dimensions={3}
  %transpose.637 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.636), dimensions={0,3,1,2}
  %add.638 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.635, bf16[1,256,128,128]{1,3,2,0} %transpose.637)
  %p51.56 = bf16[256]{0} parameter(51)
  %p50.55 = bf16[256]{0} parameter(50)
  %p49.54 = bf16[256]{0} parameter(49)
  %batch-norm-inference.639 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.638, bf16[256]{0} %p51.56, bf16[256]{0} %p50.55, bf16[256]{0} %p49.54, bf16[256]{0} %p48.53), epsilon=1e-05, feature_index=1
  %constant.644 = bf16[] constant(0)
  %broadcast.645 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.644), dimensions={}
  %maximum.646 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.639, bf16[1,256,128,128]{3,2,1,0} %broadcast.645)
  %reverse.647 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.646), dimensions={3}
  %slice.648 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.647), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.649 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.647), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.650 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.648, bf16[1,256,128,128]{3,2,1,0} %maximum.646, bf16[1,256,128,1]{3,2,1,0} %slice.649), dimensions={3}
  %reverse.651 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.650), dimensions={2}
  %slice.652 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.651), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.653 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.651), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.654 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.652, bf16[1,256,128,130]{3,2,1,0} %concatenate.650, bf16[1,256,1,130]{3,2,1,0} %slice.653), dimensions={2}
  %p47.52 = bf16[256,256,3,3]{3,2,1,0} parameter(47)
  %convolution.655 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.654, bf16[256,256,3,3]{3,2,1,0} %p47.52), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p46.51 = bf16[256]{0} parameter(46)
  %broadcast.656 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p46.51), dimensions={3}
  %transpose.657 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.656), dimensions={0,3,1,2}
  %add.658 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.655, bf16[1,256,128,128]{1,3,2,0} %transpose.657)
  %p45.50 = bf16[256]{0} parameter(45)
  %p44.49 = bf16[256]{0} parameter(44)
  %p43.48 = bf16[256]{0} parameter(43)
  %batch-norm-inference.659 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.658, bf16[256]{0} %p45.50, bf16[256]{0} %p44.49, bf16[256]{0} %p43.48, bf16[256]{0} %p42.47), epsilon=1e-05, feature_index=1
  %constant.46 = bf16[] constant(1)
  %broadcast.664 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.46), dimensions={}
  %multiply.665 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.659, bf16[1,256,128,128]{3,2,1,0} %broadcast.664)
  %add.666 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %add.626, bf16[1,256,128,128]{3,2,1,0} %multiply.665)
  %reverse.667 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %add.666), dimensions={3}
  %slice.668 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.667), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.669 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.667), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.670 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.668, bf16[1,256,128,128]{3,2,1,0} %add.666, bf16[1,256,128,1]{3,2,1,0} %slice.669), dimensions={3}
  %reverse.671 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.670), dimensions={2}
  %slice.672 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.671), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.673 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.671), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.674 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.672, bf16[1,256,128,130]{3,2,1,0} %concatenate.670, bf16[1,256,1,130]{3,2,1,0} %slice.673), dimensions={2}
  %p41.45 = bf16[256,256,3,3]{3,2,1,0} parameter(41)
  %convolution.675 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.674, bf16[256,256,3,3]{3,2,1,0} %p41.45), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p40.44 = bf16[256]{0} parameter(40)
  %broadcast.676 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p40.44), dimensions={3}
  %transpose.677 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.676), dimensions={0,3,1,2}
  %add.678 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.675, bf16[1,256,128,128]{1,3,2,0} %transpose.677)
  %p39.43 = bf16[256]{0} parameter(39)
  %p38.42 = bf16[256]{0} parameter(38)
  %p37.41 = bf16[256]{0} parameter(37)
  %batch-norm-inference.679 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.678, bf16[256]{0} %p39.43, bf16[256]{0} %p38.42, bf16[256]{0} %p37.41, bf16[256]{0} %p36.40), epsilon=1e-05, feature_index=1
  %constant.684 = bf16[] constant(0)
  %broadcast.685 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.684), dimensions={}
  %maximum.686 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.679, bf16[1,256,128,128]{3,2,1,0} %broadcast.685)
  %reverse.687 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.686), dimensions={3}
  %slice.688 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.687), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.689 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.687), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.690 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.688, bf16[1,256,128,128]{3,2,1,0} %maximum.686, bf16[1,256,128,1]{3,2,1,0} %slice.689), dimensions={3}
  %reverse.691 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.690), dimensions={2}
  %slice.692 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.691), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.693 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.691), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.694 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.692, bf16[1,256,128,130]{3,2,1,0} %concatenate.690, bf16[1,256,1,130]{3,2,1,0} %slice.693), dimensions={2}
  %p35.39 = bf16[256,256,3,3]{3,2,1,0} parameter(35)
  %convolution.695 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.694, bf16[256,256,3,3]{3,2,1,0} %p35.39), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p34.38 = bf16[256]{0} parameter(34)
  %broadcast.696 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p34.38), dimensions={3}
  %transpose.697 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.696), dimensions={0,3,1,2}
  %add.698 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.695, bf16[1,256,128,128]{1,3,2,0} %transpose.697)
  %p33.37 = bf16[256]{0} parameter(33)
  %p32.36 = bf16[256]{0} parameter(32)
  %p31.35 = bf16[256]{0} parameter(31)
  %batch-norm-inference.699 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.698, bf16[256]{0} %p33.37, bf16[256]{0} %p32.36, bf16[256]{0} %p31.35, bf16[256]{0} %p30.34), epsilon=1e-05, feature_index=1
  %constant.33 = bf16[] constant(1)
  %broadcast.704 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.33), dimensions={}
  %multiply.705 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.699, bf16[1,256,128,128]{3,2,1,0} %broadcast.704)
  %add.706 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %add.666, bf16[1,256,128,128]{3,2,1,0} %multiply.705)
  %reverse.707 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %add.706), dimensions={3}
  %slice.708 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.707), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.709 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.707), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.710 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.708, bf16[1,256,128,128]{3,2,1,0} %add.706, bf16[1,256,128,1]{3,2,1,0} %slice.709), dimensions={3}
  %reverse.711 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.710), dimensions={2}
  %slice.712 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.711), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.713 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.711), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.714 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.712, bf16[1,256,128,130]{3,2,1,0} %concatenate.710, bf16[1,256,1,130]{3,2,1,0} %slice.713), dimensions={2}
  %p29.32 = bf16[256,256,3,3]{3,2,1,0} parameter(29)
  %convolution.715 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.714, bf16[256,256,3,3]{3,2,1,0} %p29.32), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p28.31 = bf16[256]{0} parameter(28)
  %broadcast.716 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p28.31), dimensions={3}
  %transpose.717 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.716), dimensions={0,3,1,2}
  %add.718 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.715, bf16[1,256,128,128]{1,3,2,0} %transpose.717)
  %p27.30 = bf16[256]{0} parameter(27)
  %p26.29 = bf16[256]{0} parameter(26)
  %p25.28 = bf16[256]{0} parameter(25)
  %batch-norm-inference.719 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.718, bf16[256]{0} %p27.30, bf16[256]{0} %p26.29, bf16[256]{0} %p25.28, bf16[256]{0} %p24.27), epsilon=1e-05, feature_index=1
  %constant.724 = bf16[] constant(0)
  %broadcast.725 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.724), dimensions={}
  %maximum.726 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.719, bf16[1,256,128,128]{3,2,1,0} %broadcast.725)
  %reverse.727 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.726), dimensions={3}
  %slice.728 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.727), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.729 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.727), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.730 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.728, bf16[1,256,128,128]{3,2,1,0} %maximum.726, bf16[1,256,128,1]{3,2,1,0} %slice.729), dimensions={3}
  %reverse.731 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.730), dimensions={2}
  %slice.732 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.731), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.733 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.731), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.734 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.732, bf16[1,256,128,130]{3,2,1,0} %concatenate.730, bf16[1,256,1,130]{3,2,1,0} %slice.733), dimensions={2}
  %p23.26 = bf16[256,256,3,3]{3,2,1,0} parameter(23)
  %convolution.735 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.734, bf16[256,256,3,3]{3,2,1,0} %p23.26), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p22.25 = bf16[256]{0} parameter(22)
  %broadcast.736 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p22.25), dimensions={3}
  %transpose.737 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.736), dimensions={0,3,1,2}
  %add.738 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.735, bf16[1,256,128,128]{1,3,2,0} %transpose.737)
  %p21.24 = bf16[256]{0} parameter(21)
  %p20.23 = bf16[256]{0} parameter(20)
  %p19.22 = bf16[256]{0} parameter(19)
  %batch-norm-inference.739 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.738, bf16[256]{0} %p21.24, bf16[256]{0} %p20.23, bf16[256]{0} %p19.22, bf16[256]{0} %p18.21), epsilon=1e-05, feature_index=1
  %constant.20 = bf16[] constant(1)
  %broadcast.744 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.20), dimensions={}
  %multiply.745 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.739, bf16[1,256,128,128]{3,2,1,0} %broadcast.744)
  %add.746 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %add.706, bf16[1,256,128,128]{3,2,1,0} %multiply.745)
  %reverse.747 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %add.746), dimensions={3}
  %slice.748 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.747), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.749 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.747), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.750 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.748, bf16[1,256,128,128]{3,2,1,0} %add.746, bf16[1,256,128,1]{3,2,1,0} %slice.749), dimensions={3}
  %reverse.751 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.750), dimensions={2}
  %slice.752 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.751), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.753 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.751), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.754 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.752, bf16[1,256,128,130]{3,2,1,0} %concatenate.750, bf16[1,256,1,130]{3,2,1,0} %slice.753), dimensions={2}
  %p17.19 = bf16[256,256,3,3]{3,2,1,0} parameter(17)
  %convolution.755 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.754, bf16[256,256,3,3]{3,2,1,0} %p17.19), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p16.18 = bf16[256]{0} parameter(16)
  %broadcast.756 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p16.18), dimensions={3}
  %transpose.757 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.756), dimensions={0,3,1,2}
  %add.758 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.755, bf16[1,256,128,128]{1,3,2,0} %transpose.757)
  %p15.17 = bf16[256]{0} parameter(15)
  %p14.16 = bf16[256]{0} parameter(14)
  %p13.15 = bf16[256]{0} parameter(13)
  %batch-norm-inference.759 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.758, bf16[256]{0} %p15.17, bf16[256]{0} %p14.16, bf16[256]{0} %p13.15, bf16[256]{0} %p12.14), epsilon=1e-05, feature_index=1
  %constant.764 = bf16[] constant(0)
  %broadcast.765 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.764), dimensions={}
  %maximum.766 = bf16[1,256,128,128]{3,2,1,0} maximum(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.759, bf16[1,256,128,128]{3,2,1,0} %broadcast.765)
  %reverse.767 = bf16[1,256,128,128]{3,2,1,0} reverse(bf16[1,256,128,128]{3,2,1,0} %maximum.766), dimensions={3}
  %slice.768 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.767), slice={[0:1], [0:256], [0:128], [126:127]}
  %slice.769 = bf16[1,256,128,1]{3,2,1,0} slice(bf16[1,256,128,128]{3,2,1,0} %reverse.767), slice={[0:1], [0:256], [0:128], [1:2]}
  %concatenate.770 = bf16[1,256,128,130]{3,2,1,0} concatenate(bf16[1,256,128,1]{3,2,1,0} %slice.768, bf16[1,256,128,128]{3,2,1,0} %maximum.766, bf16[1,256,128,1]{3,2,1,0} %slice.769), dimensions={3}
  %reverse.771 = bf16[1,256,128,130]{3,2,1,0} reverse(bf16[1,256,128,130]{3,2,1,0} %concatenate.770), dimensions={2}
  %slice.772 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.771), slice={[0:1], [0:256], [126:127], [0:130]}
  %slice.773 = bf16[1,256,1,130]{3,2,1,0} slice(bf16[1,256,128,130]{3,2,1,0} %reverse.771), slice={[0:1], [0:256], [1:2], [0:130]}
  %concatenate.774 = bf16[1,256,130,130]{3,2,1,0} concatenate(bf16[1,256,1,130]{3,2,1,0} %slice.772, bf16[1,256,128,130]{3,2,1,0} %concatenate.770, bf16[1,256,1,130]{3,2,1,0} %slice.773), dimensions={2}
  %p11.13 = bf16[256,256,3,3]{3,2,1,0} parameter(11)
  %convolution.775 = bf16[1,256,128,128]{3,2,1,0} convolution(bf16[1,256,130,130]{3,2,1,0} %concatenate.774, bf16[256,256,3,3]{3,2,1,0} %p11.13), window={size=3x3}, dim_labels=bf01_oi01->bf01
  %p10.12 = bf16[256]{0} parameter(10)
  %broadcast.776 = bf16[1,128,128,256]{3,2,1,0} broadcast(bf16[256]{0} %p10.12), dimensions={3}
  %transpose.777 = bf16[1,256,128,128]{1,3,2,0} transpose(bf16[1,128,128,256]{3,2,1,0} %broadcast.776), dimensions={0,3,1,2}
  %add.778 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %convolution.775, bf16[1,256,128,128]{1,3,2,0} %transpose.777)
  %p9.11 = bf16[256]{0} parameter(9)
  %p8.10 = bf16[256]{0} parameter(8)
  %p7.9 = bf16[256]{0} parameter(7)
  %batch-norm-inference.779 = bf16[1,256,128,128]{3,2,1,0} batch-norm-inference(bf16[1,256,128,128]{3,2,1,0} %add.778, bf16[256]{0} %p9.11, bf16[256]{0} %p8.10, bf16[256]{0} %p7.9, bf16[256]{0} %p6.8), epsilon=1e-05, feature_index=1
  %constant.7 = bf16[] constant(1)
  %broadcast.784 = bf16[1,256,128,128]{3,2,1,0} broadcast(bf16[] %constant.7), dimensions={}
  %multiply.785 = bf16[1,256,128,128]{3,2,1,0} multiply(bf16[1,256,128,128]{3,2,1,0} %batch-norm-inference.779, bf16[1,256,128,128]{3,2,1,0} %broadcast.784)
  %add.786 = bf16[1,256,128,128]{3,2,1,0} add(bf16[1,256,128,128]{3,2,1,0} %add.746, bf16[1,256,128,128]{3,2,1,0} %multiply.785)
  %transpose.787 = bf16[1,128,128,256]{1,2,3,0} transpose(bf16[1,256,128,128]{3,2,1,0} %add.786), dimensions={0,3,2,1}
  %convert.788 = f32[1,128,128,256]{1,2,3,0} convert(bf16[1,128,128,256]{1,2,3,0} %transpose.787)
  %broadcast.808 = f32[256,256,1]{2,1,0} broadcast(f32[256]{0} %clamp.807), dimensions={0}
  %broadcast.822 = f32[256,256,1]{2,1,0} broadcast(f32[256]{0} %clamp.821), dimensions={1}
  %concatenate.823 = f32[256,256,2]{2,1,0} concatenate(f32[256,256,1]{2,1,0} %broadcast.808, f32[256,256,1]{2,1,0} %broadcast.822), dimensions={2}
  %convert.824 = s32[256,256,2]{2,1,0} convert(f32[256,256,2]{2,1,0} %concatenate.823)
  %gather.825 = f32[1,3,3,256,256,256]{5,4,3,2,1,0} gather(f32[1,128,128,256]{1,2,3,0} %convert.788, s32[256,256,2]{2,1,0} %convert.824), offset_dims={0,1,2,3}, collapsed_slice_dims={}, start_index_map={1,2}, index_vector_dim=2, slice_sizes={1,3,3,256}
  %dot.861 = f32[256,256,1,256]{3,2,1,0} dot(f32[256,3,256,3]{3,2,1,0} %dot.860, f32[1,3,3,256,256,256]{5,4,3,2,1,0} %gather.825), lhs_batch_dims={2,0}, lhs_contracting_dims={3,1}, rhs_batch_dims={4,5}, rhs_contracting_dims={1,2}
  %transpose.862 = f32[1,256,256,256]{3,0,2,1} transpose(f32[256,256,1,256]{3,2,1,0} %dot.861), dimensions={2,0,1,3}
  %transpose.863 = f32[1,256,256,256]{1,0,2,3} transpose(f32[1,256,256,256]{3,0,2,1} %transpose.862), dimensions={0,3,2,1}
  %p5.6 = bf16[128,256,3,3]{3,2,1,0} parameter(5)
  %convolution.864 = f32[1,128,256,256]{3,2,1,0} convolution(f32[1,256,256,256]{1,0,2,3} %transpose.863, bf16[128,256,3,3]{3,2,1,0} %p5.6), window={size=3x3 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01
  %p4.5 = bf16[128]{0} parameter(4)
  %broadcast.865 = bf16[1,256,256,128]{3,2,1,0} broadcast(bf16[128]{0} %p4.5), dimensions={3}
  %transpose.866 = bf16[1,128,256,256]{1,3,2,0} transpose(bf16[1,256,256,128]{3,2,1,0} %broadcast.865), dimensions={0,3,1,2}
  %convert.867 = f32[1,128,256,256]{1,3,2,0} convert(bf16[1,128,256,256]{1,3,2,0} %transpose.866)
  %add.868 = f32[1,128,256,256]{3,2,1,0} add(f32[1,128,256,256]{3,2,1,0} %convolution.864, f32[1,128,256,256]{1,3,2,0} %convert.867)
  %p3.4 = bf16[128]{0} parameter(3)
  %p2.3 = bf16[128]{0} parameter(2)
  %p1.2 = bf16[128]{0} parameter(1)
  %batch-norm-inference.869 = f32[1,128,256,256]{3,2,1,0} batch-norm-inference(f32[1,128,256,256]{3,2,1,0} %add.868, bf16[128]{0} %p3.4, bf16[128]{0} %p2.3, bf16[128]{0} %p1.2, bf16[128]{0} %p0.1), epsilon=1e-05, feature_index=1
  %constant.874 = f32[] constant(0)
  %broadcast.875 = f32[1,128,256,256]{3,2,1,0} broadcast(f32[] %constant.874), dimensions={}
  %maximum.876 = f32[1,128,256,256]{3,2,1,0} maximum(f32[1,128,256,256]{3,2,1,0} %batch-norm-inference.869, f32[1,128,256,256]{3,2,1,0} %broadcast.875)
  ROOT %tuple.877 = (f32[1,128,256,256]{3,2,1,0}) tuple(f32[1,128,256,256]{3,2,1,0} %maximum.876)
}
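For context, the lowering that fails corresponds to a plain bilinear upsample in eager mode. Below is a minimal sketch of that op; the output shape, `output_size`, and `align_corners` come from the error message (`bf16[1,128,512,512]`, `output_size=(512, 512)`, `align_corners=1`), while the 256x256 input spatial size is an assumption, since the error only reports the output shape.

```python
import torch
import torch.nn.functional as F

# Eager-mode sketch of the op that fails to lower on XLA.
# Output shape bf16[1,128,512,512] and align_corners=True are taken from
# the error message; the 256x256 input spatial size is assumed.
x = torch.randn(1, 128, 256, 256, dtype=torch.bfloat16)
out = F.interpolate(x, size=(512, 512), mode="bilinear", align_corners=True)
print(out.shape, out.dtype)
```

On an XLA device, the same call would run on a tensor moved to `xm.xla_device()`, with `xm.mark_step()` triggering the lowering that raises the error above.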

@miladm (Collaborator) commented Feb 12, 2024:

cc @vanbasten23

@miladm (Collaborator) commented Feb 12, 2024:

can you please list out the models that run into this issue @ysiraichi?

@ysiraichi (Collaborator, Author) commented:

This is the only model that runs into this specific issue. There is, however, a similar issue with hf_GPT2: #6521.

@ysiraichi ysiraichi changed the title [benchmarks] Background_Matting fails when lowering UpsampleBilinear2D [torchbench] Background_Matting fails when lowering UpsampleBilinear2D Feb 29, 2024