food_1B_seqqq_bs40.txt
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
--> World Size = 8
--> Device_count = 8
--> running with these defaults train_config(seed=2023, verbose=True, total_steps_to_run=None, warmup_steps=5, print_memory_summary=False, num_epochs=12, model_weights_bf16=False, use_mixed_precision=True, use_low_precision_gradient_policy=False, use_tf32=True, optimizer='dadapt_adanip', ap_use_kahan_summation=False, sharding_strategy=<ShardingStrategy.FULL_SHARD: 1>, print_sharding_plan=False, run_profiler=False, profile_folder='tp_fsdp/profile_tracing', log_every=1, num_workers_dataloader=2, batch_size_training=40, fsdp_activation_checkpointing=True, run_validation=True, memory_report=True, nccl_debug_handler=True, distributed_debug=True, use_non_recursive_wrapping=False, use_parallel_attention=False, use_multi_query_attention=True, use_fused_attention=True, use_deferred_init=False, use_tp=False, image_size=224, use_synthetic_data=False, use_pokemon_dataset=False, use_beans_dataset=False, save_model_checkpoint=False, load_model_checkpoint=False, checkpoint_max_save_count=2, save_optimizer=False, load_optimizer=False, optimizer_checkpoint_file='Adam-vit--1.pt', checkpoint_model_filename='vit--1.pt')
clearing gpu cache for all ranks
--> running with torch dist debug set to detail
--> total memory per gpu (GB) = 22.035
wrapping policy is functools.partial(<function transformer_auto_wrap_policy at 0x7f2caf4630d0>, transformer_layer_cls={<class 'models.smart_vit.vit_main.ResPostBlock'>})
******************* bulding the model here ************
**** Use MQA = True
local_cfg.NUM_CLASSES=101
{'patch_size': 14, 'embed_dim': 1408, 'depth': 52, 'num_heads': 16, 'num_classes': 101, 'image_size': 224, 'use_fused_attention': True, 'use_upper_fusion': True, 'use_multi_query_attention': False}
Num classes = 101
Building with Sequential Attention
******************* bulding the model here ************
**** Use MQA = True
local_cfg.NUM_CLASSES=101
{'patch_size': 14, 'embed_dim': 1408, 'depth': 52, 'num_heads': 16, 'num_classes': 101, 'image_size': 224, 'use_fused_attention': True, 'use_upper_fusion': True, 'use_multi_query_attention': False}
Num classes = 101
Building with Sequential Attention
******************* bulding the model here ************
**** Use MQA = True
local_cfg.NUM_CLASSES=101
{'patch_size': 14, 'embed_dim': 1408, 'depth': 52, 'num_heads': 16, 'num_classes': 101, 'image_size': 224, 'use_fused_attention': True, 'use_upper_fusion': True, 'use_multi_query_attention': False}
Num classes = 101
Building with Sequential Attention
******************* bulding the model here ************
**** Use MQA = True
local_cfg.NUM_CLASSES=101
{'patch_size': 14, 'embed_dim': 1408, 'depth': 52, 'num_heads': 16, 'num_classes': 101, 'image_size': 224, 'use_fused_attention': True, 'use_upper_fusion': True, 'use_multi_query_attention': False}
Num classes = 101
Building with Sequential Attention
******************* bulding the model here ************
**** Use MQA = True
local_cfg.NUM_CLASSES=101
{'patch_size': 14, 'embed_dim': 1408, 'depth': 52, 'num_heads': 16, 'num_classes': 101, 'image_size': 224, 'use_fused_attention': True, 'use_upper_fusion': True, 'use_multi_query_attention': False}
Num classes = 101
Building with Sequential Attention
******************* bulding the model here ************
**** Use MQA = True
local_cfg.NUM_CLASSES=101
{'patch_size': 14, 'embed_dim': 1408, 'depth': 52, 'num_heads': 16, 'num_classes': 101, 'image_size': 224, 'use_fused_attention': True, 'use_upper_fusion': True, 'use_multi_query_attention': False}
Num classes = 101
Building with Sequential Attention
******************* bulding the model here ************
**** Use MQA = True
local_cfg.NUM_CLASSES=101
{'patch_size': 14, 'embed_dim': 1408, 'depth': 52, 'num_heads': 16, 'num_classes': 101, 'image_size': 224, 'use_fused_attention': True, 'use_upper_fusion': True, 'use_multi_query_attention': False}
Num classes = 101
Building with Sequential Attention
--> Prepping 1B model ...
stats is ready....? _stats=defaultdict(<class 'list'>, {'best_accuracy': 0.0}), local_rank=0, rank=0
******************* bulding the model here ************
using deferred? False
**** Use MQA = True
local_cfg.NUM_CLASSES=101
{'patch_size': 14, 'embed_dim': 1408, 'depth': 52, 'num_heads': 16, 'num_classes': 101, 'image_size': 224, 'use_fused_attention': True, 'use_upper_fusion': True, 'use_multi_query_attention': False}
Num classes = 101
Building with Sequential Attention
Classifer head set for num_classes=101
Classifer head set for num_classes=101
Classifer head set for num_classes=101
Classifer head set for num_classes=101
Classifer head set for num_classes=101
Classifer head set for num_classes=101
Model has 52 layers
Classifer head set for num_classes=101
Classifer head set for num_classes=101
vit, GPU peak memory allocation: 0.0GB, GPU peak memory reserved: 0.0GB, GPU peak memory active: 0.0GB
--> 1B built.
built model with 1245.913021M params
bf16 check passed
--> Running with mixed precision MixedPrecision(param_dtype=torch.bfloat16, reduce_dtype=torch.bfloat16, buffer_dtype=torch.bfloat16, keep_low_precision_grads=False, cast_forward_inputs=False, cast_root_forward_inputs=True) policy
backward prefetch set to BackwardPrefetch.BACKWARD_PRE
sharding set to ShardingStrategy.FULL_SHARD
--> Batch Size = 40
vit, GPU peak memory allocation: 0.0GB, GPU peak memory reserved: 0.0GB, GPU peak memory active: 0.0GB
Activation Checkpointing with Sequential - ResPostBlock
--> FSDP activation checkpointing in use
local rank 0 init time = 3.7240866319989436
memory stats reset, ready to track
Running with DAdapt AdanIP optimizer
Epoch: 0 starting...
Activation Checkpointing with Sequential - ResPostBlock
Activation Checkpointing with Sequential - ResPostBlock
Activation Checkpointing with Sequential - ResPostBlock
Activation Checkpointing with Sequential - ResPostBlock
Activation Checkpointing with Sequential - ResPostBlock
Activation Checkpointing with Sequential - ResPostBlock
Activation Checkpointing with Sequential - ResPostBlock
step: 1: time taken for the last 1 steps is 11.2631, including opt 13.340048, loss is 4.84375
step: 2: time taken for the last 1 steps is 9.4237, including opt 11.501806, loss is 4.96875
step: 3: time taken for the last 1 steps is 9.4137, including opt 11.476092, loss is 4.875
step: 4: time taken for the last 1 steps is 9.4276, including opt 11.479196, loss is 4.9375
step: 5: time taken for the last 1 steps is 9.3852, including opt 11.455376, loss is 5.03125
step: 6: time taken for the last 1 steps is 9.3988, including opt 11.452257, loss is 4.71875
step: 7: time taken for the last 1 steps is 9.4061, including opt 11.471692, loss is 4.59375
step: 8: time taken for the last 1 steps is 9.4078, including opt 11.461837, loss is 4.90625
step: 9: time taken for the last 1 steps is 9.4337, including opt 11.503760, loss is 4.8125
step: 10: time taken for the last 1 steps is 9.4337, including opt 11.502152, loss is 4.875
step: 11: time taken for the last 1 steps is 9.4750, including opt 11.537771, loss is 5.03125
step: 12: time taken for the last 1 steps is 9.4158, including opt 11.475121, loss is 4.90625
step: 13: time taken for the last 1 steps is 9.4357, including opt 11.491728, loss is 4.75
step: 14: time taken for the last 1 steps is 9.4146, including opt 11.475699, loss is 5.03125
step: 15: time taken for the last 1 steps is 9.4082, including opt 11.475239, loss is 4.78125
step: 16: time taken for the last 1 steps is 9.3952, including opt 11.449722, loss is 4.75
step: 17: time taken for the last 1 steps is 9.4040, including opt 11.464655, loss is 4.8125
step: 18: time taken for the last 1 steps is 9.3972, including opt 11.451152, loss is 4.625
step: 19: time taken for the last 1 steps is 9.4396, including opt 11.484147, loss is 4.6875
step: 20: time taken for the last 1 steps is 9.3389, including opt 11.387716, loss is 4.875
step: 21: time taken for the last 1 steps is 9.3725, including opt 11.438219, loss is 4.96875
step: 22: time taken for the last 1 steps is 9.4431, including opt 11.514353, loss is 4.71875
step: 23: time taken for the last 1 steps is 9.4093, including opt 11.469514, loss is 4.75
step: 24: time taken for the last 1 steps is 9.4268, including opt 11.481884, loss is 4.84375
step: 25: time taken for the last 1 steps is 9.4247, including opt 11.476191, loss is 4.84375
step: 26: time taken for the last 1 steps is 9.4187, including opt 11.464394, loss is 4.6875
step: 27: time taken for the last 1 steps is 9.3693, including opt 11.424056, loss is 4.8125
step: 28: time taken for the last 1 steps is 9.4095, including opt 11.470768, loss is 4.75
step: 29: time taken for the last 1 steps is 9.4066, including opt 11.478506, loss is 4.59375
step: 30: time taken for the last 1 steps is 9.3989, including opt 11.448286, loss is 4.625
step: 31: time taken for the last 1 steps is 9.3875, including opt 11.465101, loss is 4.6875
step: 32: time taken for the last 1 steps is 9.4180, including opt 11.482848, loss is 4.71875
step: 33: time taken for the last 1 steps is 9.4097, including opt 11.461093, loss is 4.71875
step: 34: time taken for the last 1 steps is 9.4403, including opt 11.484447, loss is 4.75
step: 35: time taken for the last 1 steps is 9.3792, including opt 11.439472, loss is 4.5625
step: 36: time taken for the last 1 steps is 9.3723, including opt 11.441670, loss is 4.46875
step: 37: time taken for the last 1 steps is 9.3953, including opt 11.464172, loss is 4.59375
step: 38: time taken for the last 1 steps is 9.4009, including opt 11.455902, loss is 4.53125
step: 39: time taken for the last 1 steps is 9.4160, including opt 11.482568, loss is 4.46875
step: 40: time taken for the last 1 steps is 9.3743, including opt 11.451550, loss is 4.625
step: 41: time taken for the last 1 steps is 9.4517, including opt 11.525169, loss is 4.59375
step: 42: time taken for the last 1 steps is 9.4140, including opt 11.489067, loss is 4.5625
step: 43: time taken for the last 1 steps is 9.4349, including opt 11.509362, loss is 4.53125
step: 44: time taken for the last 1 steps is 9.4358, including opt 11.512817, loss is 4.71875
step: 45: time taken for the last 1 steps is 9.3721, including opt 11.434136, loss is 4.53125
step: 46: time taken for the last 1 steps is 9.3919, including opt 11.443220, loss is 4.59375
step: 47: time taken for the last 1 steps is 9.3879, including opt 11.450353, loss is 4.46875
step: 48: time taken for the last 1 steps is 9.4231, including opt 11.486390, loss is 4.46875
step: 49: time taken for the last 1 steps is 9.4497, including opt 11.512747, loss is 4.71875
step: 50: time taken for the last 1 steps is 9.3751, including opt 11.415161, loss is 4.40625
step: 51: time taken for the last 1 steps is 9.3985, including opt 11.461091, loss is 4.21875
step: 52: time taken for the last 1 steps is 9.4185, including opt 11.478948, loss is 4.5
step: 53: time taken for the last 1 steps is 9.4461, including opt 11.529348, loss is 4.5625
step: 54: time taken for the last 1 steps is 9.3645, including opt 11.442898, loss is 4.34375
step: 55: time taken for the last 1 steps is 9.4230, including opt 11.484075, loss is 4.625
step: 56: time taken for the last 1 steps is 9.3989, including opt 11.469749, loss is 4.5
step: 57: time taken for the last 1 steps is 9.3595, including opt 11.430087, loss is 4.625
step: 58: time taken for the last 1 steps is 9.4202, including opt 11.485704, loss is 4.59375
step: 59: time taken for the last 1 steps is 9.3934, including opt 11.458304, loss is 4.3125
step: 60: time taken for the last 1 steps is 9.3711, including opt 11.431211, loss is 4.34375
step: 61: time taken for the last 1 steps is 9.3938, including opt 11.451713, loss is 4.5
step: 62: time taken for the last 1 steps is 9.3832, including opt 11.434626, loss is 4.25
step: 63: time taken for the last 1 steps is 9.4095, including opt 11.480940, loss is 4.5
step: 64: time taken for the last 1 steps is 9.4408, including opt 11.497334, loss is 4.40625
step: 65: time taken for the last 1 steps is 9.4082, including opt 11.491589, loss is 4.40625
step: 66: time taken for the last 1 steps is 9.4473, including opt 11.513734, loss is 4.4375
step: 67: time taken for the last 1 steps is 9.3994, including opt 11.457286, loss is 4.0625
step: 68: time taken for the last 1 steps is 9.4332, including opt 11.498949, loss is 4.4375
step: 69: time taken for the last 1 steps is 9.4144, including opt 11.475203, loss is 4.59375
step: 70: time taken for the last 1 steps is 9.4358, including opt 11.495526, loss is 4.5625
step: 71: time taken for the last 1 steps is 9.3944, including opt 11.464044, loss is 4.8125
step: 72: time taken for the last 1 steps is 9.4292, including opt 11.483151, loss is 4.4375
step: 73: time taken for the last 1 steps is 9.4071, including opt 11.487452, loss is 4.4375
step: 74: time taken for the last 1 steps is 9.4425, including opt 11.510492, loss is 4.40625
step: 75: time taken for the last 1 steps is 9.3815, including opt 11.441687, loss is 4.40625
step: 76: time taken for the last 1 steps is 9.4071, including opt 11.460156, loss is 4.5
step: 77: time taken for the last 1 steps is 9.3782, including opt 11.448515, loss is 4.21875
step: 78: time taken for the last 1 steps is 9.4201, including opt 11.492965, loss is 4.46875
step: 79: time taken for the last 1 steps is 9.4100, including opt 11.464198, loss is 4.4375
step: 80: time taken for the last 1 steps is 9.4076, including opt 11.478237, loss is 4.34375
step: 81: time taken for the last 1 steps is 9.3622, including opt 11.419988, loss is 4.5
step: 82: time taken for the last 1 steps is 9.4197, including opt 11.480606, loss is 4.4375
step: 83: time taken for the last 1 steps is 9.3785, including opt 11.437220, loss is 4.375
step: 84: time taken for the last 1 steps is 9.3850, including opt 11.439312, loss is 4.34375
step: 85: time taken for the last 1 steps is 9.3787, including opt 11.436296, loss is 4.46875
step: 86: time taken for the last 1 steps is 9.3948, including opt 11.456871, loss is 4.34375
step: 87: time taken for the last 1 steps is 9.4072, including opt 11.444022, loss is 4.375
step: 88: time taken for the last 1 steps is 9.3628, including opt 11.433337, loss is 4.65625
step: 89: time taken for the last 1 steps is 9.4551, including opt 11.536972, loss is 4.46875
step: 90: time taken for the last 1 steps is 9.4127, including opt 11.459951, loss is 4.21875
step: 91: time taken for the last 1 steps is 9.3755, including opt 11.435441, loss is 4.375
step: 92: time taken for the last 1 steps is 9.4256, including opt 11.470386, loss is 4.5625
step: 93: time taken for the last 1 steps is 9.3925, including opt 11.448233, loss is 4.34375
step: 94: time taken for the last 1 steps is 9.3960, including opt 11.457381, loss is 4.4375
step: 95: time taken for the last 1 steps is 9.4390, including opt 11.500223, loss is 4.46875
step: 96: time taken for the last 1 steps is 9.3988, including opt 11.450413, loss is 4.40625
step: 97: time taken for the last 1 steps is 9.4074, including opt 11.477993, loss is 4.375
step: 98: time taken for the last 1 steps is 9.3989, including opt 11.469656, loss is 4.34375
step: 99: time taken for the last 1 steps is 9.3637, including opt 11.425572, loss is 4.28125
step: 100: time taken for the last 1 steps is 9.3815, including opt 11.441835, loss is 4.34375
step: 101: time taken for the last 1 steps is 9.3900, including opt 11.453919, loss is 4.53125
step: 102: time taken for the last 1 steps is 9.3968, including opt 11.480069, loss is 4.34375
step: 103: time taken for the last 1 steps is 9.3939, including opt 11.468360, loss is 4.40625
step: 104: time taken for the last 1 steps is 9.4539, including opt 11.516620, loss is 4.28125
step: 105: time taken for the last 1 steps is 9.3805, including opt 11.449989, loss is 4.40625
step: 106: time taken for the last 1 steps is 9.4351, including opt 11.494983, loss is 4.3125
step: 107: time taken for the last 1 steps is 9.4021, including opt 11.466608, loss is 4.4375
step: 108: time taken for the last 1 steps is 9.3945, including opt 11.446064, loss is 4.5625
step: 109: time taken for the last 1 steps is 9.4482, including opt 11.499631, loss is 4.25
step: 110: time taken for the last 1 steps is 9.4338, including opt 11.482933, loss is 4.46875
step: 111: time taken for the last 1 steps is 9.4223, including opt 11.485695, loss is 4.0
step: 112: time taken for the last 1 steps is 9.3627, including opt 11.430242, loss is 4.3125
step: 113: time taken for the last 1 steps is 9.3943, including opt 11.450852, loss is 4.34375
step: 114: time taken for the last 1 steps is 9.3971, including opt 11.461727, loss is 4.5
step: 115: time taken for the last 1 steps is 9.3729, including opt 11.439388, loss is 4.34375
step: 116: time taken for the last 1 steps is 9.3899, including opt 11.448849, loss is 4.15625
step: 117: time taken for the last 1 steps is 9.3949, including opt 11.470041, loss is 4.34375
step: 118: time taken for the last 1 steps is 9.3939, including opt 11.469519, loss is 4.3125
step: 119: time taken for the last 1 steps is 9.3800, including opt 11.441139, loss is 4.40625
step: 120: time taken for the last 1 steps is 9.4217, including opt 11.481314, loss is 4.4375
step: 121: time taken for the last 1 steps is 9.4034, including opt 11.477586, loss is 4.125
step: 122: time taken for the last 1 steps is 9.4150, including opt 11.473501, loss is 4.40625
step: 123: time taken for the last 1 steps is 9.3938, including opt 11.461568, loss is 4.21875
step: 124: time taken for the last 1 steps is 9.3853, including opt 11.451783, loss is 4.25
step: 125: time taken for the last 1 steps is 9.3988, including opt 11.478466, loss is 4.375
step: 126: time taken for the last 1 steps is 9.4027, including opt 11.471987, loss is 4.3125
step: 127: time taken for the last 1 steps is 9.4303, including opt 11.492205, loss is 4.40625
step: 128: time taken for the last 1 steps is 9.3938, including opt 11.446505, loss is 4.34375
step: 129: time taken for the last 1 steps is 9.4391, including opt 11.496885, loss is 4.34375
step: 130: time taken for the last 1 steps is 9.3961, including opt 11.437565, loss is 4.28125
step: 131: time taken for the last 1 steps is 9.3981, including opt 11.473846, loss is 4.375
step: 132: time taken for the last 1 steps is 9.4230, including opt 11.478226, loss is 4.1875
step: 133: time taken for the last 1 steps is 9.4014, including opt 11.450571, loss is 4.28125
step: 134: time taken for the last 1 steps is 9.3910, including opt 11.449475, loss is 4.15625
step: 135: time taken for the last 1 steps is 9.3991, including opt 11.463601, loss is 4.4375
step: 136: time taken for the last 1 steps is 9.3719, including opt 11.431061, loss is 4.53125
step: 137: time taken for the last 1 steps is 9.4635, including opt 11.515579, loss is 4.625
step: 138: time taken for the last 1 steps is 9.3793, including opt 11.433769, loss is 4.25
step: 139: time taken for the last 1 steps is 9.3933, including opt 11.447942, loss is 4.40625
step: 140: time taken for the last 1 steps is 9.4184, including opt 11.493299, loss is 4.125
step: 141: time taken for the last 1 steps is 9.4189, including opt 11.473823, loss is 4.21875
step: 142: time taken for the last 1 steps is 9.3459, including opt 11.399847, loss is 4.15625
step: 143: time taken for the last 1 steps is 9.4070, including opt 11.447401, loss is 4.40625
step: 144: time taken for the last 1 steps is 9.4009, including opt 11.451496, loss is 4.40625
step: 145: time taken for the last 1 steps is 9.3972, including opt 11.443168, loss is 4.375
step: 146: time taken for the last 1 steps is 9.3919, including opt 11.450273, loss is 4.4375
step: 147: time taken for the last 1 steps is 9.3680, including opt 11.441748, loss is 4.40625
step: 148: time taken for the last 1 steps is 9.4126, including opt 11.462787, loss is 4.21875
step: 149: time taken for the last 1 steps is 9.4240, including opt 11.486212, loss is 4.125
step: 150: time taken for the last 1 steps is 9.3599, including opt 11.411660, loss is 4.25
step: 151: time taken for the last 1 steps is 9.4023, including opt 11.456817, loss is 4.0625
step: 152: time taken for the last 1 steps is 9.4250, including opt 11.485327, loss is 4.21875
step: 153: time taken for the last 1 steps is 9.3798, including opt 11.454165, loss is 4.40625
step: 154: time taken for the last 1 steps is 9.3747, including opt 11.436834, loss is 4.3125
step: 155: time taken for the last 1 steps is 9.4050, including opt 11.471986, loss is 4.3125
step: 156: time taken for the last 1 steps is 9.4063, including opt 11.464679, loss is 4.46875
step: 157: time taken for the last 1 steps is 9.3829, including opt 11.438219, loss is 4.28125
step: 158: time taken for the last 1 steps is 9.3835, including opt 11.449126, loss is 4.40625
step: 159: time taken for the last 1 steps is 9.4069, including opt 11.466046, loss is 4.40625
step: 160: time taken for the last 1 steps is 9.4136, including opt 11.479406, loss is 4.40625
step: 161: time taken for the last 1 steps is 9.3942, including opt 11.454194, loss is 4.5
step: 162: time taken for the last 1 steps is 9.4010, including opt 11.457991, loss is 4.21875
step: 163: time taken for the last 1 steps is 9.3957, including opt 11.467690, loss is 4.3125
step: 164: time taken for the last 1 steps is 9.4270, including opt 11.482030, loss is 4.46875
step: 165: time taken for the last 1 steps is 9.3789, including opt 11.434787, loss is 4.375
step: 166: time taken for the last 1 steps is 9.3441, including opt 11.400670, loss is 4.34375
step: 167: time taken for the last 1 steps is 9.4130, including opt 11.484826, loss is 4.15625
step: 168: time taken for the last 1 steps is 9.3743, including opt 11.453407, loss is 4.28125
step: 169: time taken for the last 1 steps is 9.3972, including opt 11.470097, loss is 4.53125
step: 170: time taken for the last 1 steps is 9.4080, including opt 11.482421, loss is 4.25
step: 171: time taken for the last 1 steps is 9.3962, including opt 11.471543, loss is 4.375
step: 172: time taken for the last 1 steps is 9.4019, including opt 11.464205, loss is 4.09375
step: 173: time taken for the last 1 steps is 9.3526, including opt 11.402273, loss is 4.21875
step: 174: time taken for the last 1 steps is 9.3936, including opt 11.448681, loss is 4.28125
step: 175: time taken for the last 1 steps is 9.4403, including opt 11.500091, loss is 4.375
step: 176: time taken for the last 1 steps is 9.3990, including opt 11.453891, loss is 4.21875
step: 177: time taken for the last 1 steps is 9.4041, including opt 11.464046, loss is 4.03125
step: 178: time taken for the last 1 steps is 9.3804, including opt 11.445695, loss is 4.25
step: 179: time taken for the last 1 steps is 9.3806, including opt 11.441214, loss is 4.15625
step: 180: time taken for the last 1 steps is 9.4325, including opt 11.518668, loss is 4.21875
step: 181: time taken for the last 1 steps is 9.3790, including opt 11.450908, loss is 4.5625
step: 182: time taken for the last 1 steps is 9.3750, including opt 11.441389, loss is 4.375
step: 183: time taken for the last 1 steps is 9.3765, including opt 11.442236, loss is 4.28125
step: 184: time taken for the last 1 steps is 9.4071, including opt 11.460891, loss is 4.3125
step: 185: time taken for the last 1 steps is 9.4118, including opt 11.471905, loss is 4.03125
step: 186: time taken for the last 1 steps is 9.4335, including opt 11.491899, loss is 4.09375
step: 187: time taken for the last 1 steps is 9.4057, including opt 11.451257, loss is 4.15625
step: 188: time taken for the last 1 steps is 9.3929, including opt 11.453368, loss is 4.28125
step: 189: time taken for the last 1 steps is 9.3774, including opt 11.454038, loss is 4.1875
step: 190: time taken for the last 1 steps is 9.3655, including opt 11.431973, loss is 4.5
step: 191: time taken for the last 1 steps is 9.4125, including opt 11.486911, loss is 4.34375
step: 192: time taken for the last 1 steps is 9.3741, including opt 11.427835, loss is 4.4375
step: 193: time taken for the last 1 steps is 9.4176, including opt 11.476845, loss is 4.15625
step: 194: time taken for the last 1 steps is 9.3621, including opt 11.427428, loss is 4.375
step: 195: time taken for the last 1 steps is 9.4031, including opt 11.474934, loss is 3.96875
step: 196: time taken for the last 1 steps is 9.4148, including opt 11.494553, loss is 4.46875
step: 197: time taken for the last 1 steps is 9.3848, including opt 11.442613, loss is 4.0625
step: 198: time taken for the last 1 steps is 9.4174, including opt 11.479000, loss is 4.03125
step: 199: time taken for the last 1 steps is 9.4377, including opt 11.509066, loss is 4.46875
step: 200: time taken for the last 1 steps is 9.3892, including opt 11.458462, loss is 4.34375
step: 201: time taken for the last 1 steps is 9.4024, including opt 11.460934, loss is 4.28125
step: 202: time taken for the last 1 steps is 9.4318, including opt 11.500893, loss is 4.21875
step: 203: time taken for the last 1 steps is 9.3991, including opt 11.461899, loss is 4.4375
step: 204: time taken for the last 1 steps is 9.3950, including opt 11.458367, loss is 4.125
step: 205: time taken for the last 1 steps is 9.3844, including opt 11.423071, loss is 4.25
step: 206: time taken for the last 1 steps is 9.3977, including opt 11.465806, loss is 4.0
step: 207: time taken for the last 1 steps is 9.3787, including opt 11.429632, loss is 4.25
step: 208: time taken for the last 1 steps is 9.4036, including opt 11.475098, loss is 4.375
step: 209: time taken for the last 1 steps is 9.4183, including opt 11.478568, loss is 4.65625
step: 210: time taken for the last 1 steps is 9.3938, including opt 11.463320, loss is 4.34375
step: 211: time taken for the last 1 steps is 9.4070, including opt 11.475894, loss is 3.96875
step: 212: time taken for the last 1 steps is 9.3897, including opt 11.468218, loss is 4.53125
step: 213: time taken for the last 1 steps is 9.4563, including opt 11.505069, loss is 4.0625
step: 214: time taken for the last 1 steps is 9.3757, including opt 11.446231, loss is 4.0625
step: 215: time taken for the last 1 steps is 9.4180, including opt 11.485618, loss is 4.5
step: 216: time taken for the last 1 steps is 9.4154, including opt 11.476403, loss is 4.40625
step: 217: time taken for the last 1 steps is 9.4061, including opt 11.465642, loss is 3.9375
step: 218: time taken for the last 1 steps is 9.4206, including opt 11.492988, loss is 4.0625
step: 219: time taken for the last 1 steps is 9.3729, including opt 11.428424, loss is 4.09375
step: 220: time taken for the last 1 steps is 9.3969, including opt 11.441978, loss is 4.34375
step: 221: time taken for the last 1 steps is 9.3642, including opt 11.432912, loss is 4.125
step: 222: time taken for the last 1 steps is 9.3690, including opt 11.421738, loss is 4.25
step: 223: time taken for the last 1 steps is 9.3916, including opt 11.447294, loss is 3.90625
step: 224: time taken for the last 1 steps is 9.3680, including opt 11.431197, loss is 4.4375
step: 225: time taken for the last 1 steps is 9.4196, including opt 11.493489, loss is 4.21875
step: 226: time taken for the last 1 steps is 9.3856, including opt 11.459461, loss is 4.375
step: 227: time taken for the last 1 steps is 9.4133, including opt 11.466736, loss is 4.1875
step: 228: time taken for the last 1 steps is 9.4061, including opt 11.472329, loss is 4.3125
step: 229: time taken for the last 1 steps is 9.4146, including opt 11.480548, loss is 4.3125
step: 230: time taken for the last 1 steps is 9.3980, including opt 11.449739, loss is 4.21875
step: 231: time taken for the last 1 steps is 9.3947, including opt 11.460648, loss is 4.09375
step: 232: time taken for the last 1 steps is 9.4031, including opt 11.462769, loss is 3.984375
step: 233: time taken for the last 1 steps is 9.3762, including opt 11.437806, loss is 4.03125
step: 234: time taken for the last 1 steps is 9.4038, including opt 11.457846, loss is 4.0625
step: 235: time taken for the last 1 steps is 9.4198, including opt 11.476734, loss is 4.375
step: 236: time taken for the last 1 steps is 9.4290, including opt 11.491078, loss is 4.3125
step: 237: time taken for the last 1 steps is 9.4000, including opt 11.430783, loss is 4.5
val_loss : 4.1247 : val_acc: 0.0843
updating stats...
Epoch: 1 starting...
step: 1: time taken for the last 1 steps is 12.9396, including opt 15.484605, loss is 4.167440414428711
step: 2: time taken for the last 1 steps is 12.9831, including opt 15.521644, loss is 4.310443878173828
step: 3: time taken for the last 1 steps is 12.9087, including opt 15.418536, loss is 4.30471134185791
step: 4: time taken for the last 1 steps is 12.9458, including opt 15.466213, loss is 4.140689373016357
step: 5: time taken for the last 1 steps is 12.9599, including opt 15.473787, loss is 4.02284574508667
step: 6: time taken for the last 1 steps is 13.0020, including opt 15.529746, loss is 4.221026420593262
step: 7: time taken for the last 1 steps is 13.0069, including opt 15.541109, loss is 3.8605129718780518
step: 8: time taken for the last 1 steps is 12.9384, including opt 15.461183, loss is 4.223278999328613
step: 9: time taken for the last 1 steps is 12.9533, including opt 15.487463, loss is 4.335699081420898
step: 10: time taken for the last 1 steps is 12.9733, including opt 15.508762, loss is 4.242855548858643
step: 11: time taken for the last 1 steps is 12.9844, including opt 15.492400, loss is 4.403379440307617
step: 12: time taken for the last 1 steps is 12.9642, including opt 15.499506, loss is 4.292414665222168
step: 13: time taken for the last 1 steps is 12.9246, including opt 15.439789, loss is 4.1653289794921875
step: 14: time taken for the last 1 steps is 13.0044, including opt 15.540658, loss is 4.184687614440918
step: 15: time taken for the last 1 steps is 12.9271, including opt 15.470079, loss is 4.247711658477783
step: 16: time taken for the last 1 steps is 12.9218, including opt 15.480423, loss is 4.382834434509277
step: 17: time taken for the last 1 steps is 12.9454, including opt 15.472416, loss is 4.146566390991211
step: 18: time taken for the last 1 steps is 13.0268, including opt 15.577655, loss is 4.343238353729248
step: 19: time taken for the last 1 steps is 12.9735, including opt 15.492442, loss is 4.3585638999938965
step: 20: time taken for the last 1 steps is 13.0258, including opt 15.524184, loss is 4.130335807800293
step: 21: time taken for the last 1 steps is 13.0380, including opt 15.555067, loss is 4.08211612701416
step: 22: time taken for the last 1 steps is 13.1038, including opt 15.625860, loss is 4.112156391143799
step: 23: time taken for the last 1 steps is 12.9475, including opt 15.506193, loss is 4.238764762878418
step: 24: time taken for the last 1 steps is 12.9958, including opt 15.548911, loss is 4.058945178985596
step: 25: time taken for the last 1 steps is 12.9666, including opt 15.461774, loss is 4.297049522399902
step: 26: time taken for the last 1 steps is 12.9741, including opt 15.496772, loss is 4.319031238555908
step: 27: time taken for the last 1 steps is 12.9627, including opt 15.493593, loss is 4.4946208000183105
step: 28: time taken for the last 1 steps is 12.9657, including opt 15.518299, loss is 4.26330041885376
step: 29: time taken for the last 1 steps is 12.9270, including opt 15.447360, loss is 4.025725364685059
step: 30: time taken for the last 1 steps is 12.9309, including opt 15.468969, loss is 4.115725517272949
step: 31: time taken for the last 1 steps is 12.9462, including opt 15.477381, loss is 4.021841526031494
step: 32: time taken for the last 1 steps is 12.9593, including opt 15.499305, loss is 4.1673479080200195
step: 33: time taken for the last 1 steps is 13.0138, including opt 15.554763, loss is 4.161728858947754
step: 34: time taken for the last 1 steps is 12.8948, including opt 15.419153, loss is 4.107372283935547
step: 35: time taken for the last 1 steps is 13.0125, including opt 15.528621, loss is 4.09205436706543
step: 36: time taken for the last 1 steps is 12.9275, including opt 15.440338, loss is 4.0176239013671875
step: 37: time taken for the last 1 steps is 12.9229, including opt 15.463348, loss is 4.128479957580566
step: 38: time taken for the last 1 steps is 12.9322, including opt 15.468640, loss is 4.28554630279541
step: 39: time taken for the last 1 steps is 12.9644, including opt 15.496979, loss is 4.138161659240723
step: 40: time taken for the last 1 steps is 13.0677, including opt 15.586997, loss is 4.0256876945495605
step: 41: time taken for the last 1 steps is 12.9324, including opt 15.483683, loss is 3.972130537033081
step: 42: time taken for the last 1 steps is 12.9441, including opt 15.479485, loss is 4.0311174392700195
step: 43: time taken for the last 1 steps is 12.8994, including opt 15.417847, loss is 3.9185538291931152
step: 44: time taken for the last 1 steps is 12.9409, including opt 15.488409, loss is 4.065271377563477
step: 45: time taken for the last 1 steps is 12.9286, including opt 15.443700, loss is 3.9932618141174316
step: 46: time taken for the last 1 steps is 12.9120, including opt 15.455338, loss is 4.081323146820068
step: 47: time taken for the last 1 steps is 13.0052, including opt 15.540198, loss is 3.8715367317199707
step: 48: time taken for the last 1 steps is 12.9929, including opt 15.545744, loss is 4.2177228927612305
step: 49: time taken for the last 1 steps is 12.9673, including opt 15.518516, loss is 4.45845890045166
step: 50: time taken for the last 1 steps is 12.9798, including opt 15.523351, loss is 4.039594650268555
step: 51: time taken for the last 1 steps is 12.9620, including opt 15.496831, loss is 3.9254963397979736
step: 52: time taken for the last 1 steps is 12.9157, including opt 15.461945, loss is 3.976642608642578
step: 53: time taken for the last 1 steps is 12.9590, including opt 15.446785, loss is 4.0883283615112305
step: 54: time taken for the last 1 steps is 12.9424, including opt 15.490139, loss is 4.116480827331543
step: 55: time taken for the last 1 steps is 12.8908, including opt 15.409193, loss is 4.153362274169922
step: 56: time taken for the last 1 steps is 12.9281, including opt 15.451170, loss is 4.133035182952881
step: 57: time taken for the last 1 steps is 12.8520, including opt 15.363346, loss is 4.266077995300293
step: 58: time taken for the last 1 steps is 12.9321, including opt 15.445749, loss is 4.132763862609863
step: 59: time taken for the last 1 steps is 13.0451, including opt 15.572758, loss is 3.9971771240234375
step: 60: time taken for the last 1 steps is 12.9307, including opt 15.476062, loss is 4.082694053649902
step: 61: time taken for the last 1 steps is 12.8837, including opt 15.368966, loss is 4.143889904022217
step: 62: time taken for the last 1 steps is 12.9980, including opt 15.516172, loss is 4.068274021148682
step: 63: time taken for the last 1 steps is 12.9981, including opt 15.548573, loss is 4.149722099304199
step: 64: time taken for the last 1 steps is 12.9677, including opt 15.511054, loss is 4.112788200378418
step: 65: time taken for the last 1 steps is 12.9627, including opt 15.517936, loss is 4.032925128936768
step: 66: time taken for the last 1 steps is 12.9166, including opt 15.453117, loss is 4.121283531188965
step: 67: time taken for the last 1 steps is 13.0573, including opt 15.593669, loss is 3.4774010181427
step: 68: time taken for the last 1 steps is 12.9726, including opt 15.506117, loss is 3.9426848888397217
step: 69: time taken for the last 1 steps is 12.9427, including opt 15.503791, loss is 4.1868414878845215
step: 70: time taken for the last 1 steps is 12.9720, including opt 15.506082, loss is 4.382619380950928
step: 71: time taken for the last 1 steps is 13.0162, including opt 15.546473, loss is 4.17063570022583
step: 72: time taken for the last 1 steps is 12.9903, including opt 15.525604, loss is 4.100834369659424
step: 73: time taken for the last 1 steps is 13.0342, including opt 15.583878, loss is 4.226833343505859
step: 74: time taken for the last 1 steps is 12.9811, including opt 15.494874, loss is 3.965019702911377
step: 75: time taken for the last 1 steps is 12.9650, including opt 15.485584, loss is 4.0560808181762695
step: 76: time taken for the last 1 steps is 12.8947, including opt 15.432226, loss is 4.037145614624023
step: 77: time taken for the last 1 steps is 12.9411, including opt 15.482930, loss is 3.998286724090576
step: 78: time taken for the last 1 steps is 12.9716, including opt 15.524530, loss is 4.0506205558776855
step: 79: time taken for the last 1 steps is 12.9936, including opt 15.521055, loss is 4.36601448059082
step: 80: time taken for the last 1 steps is 12.9495, including opt 15.484536, loss is 4.167021751403809
step: 81: time taken for the last 1 steps is 12.9443, including opt 15.423739, loss is 4.165156364440918
step: 82: time taken for the last 1 steps is 12.9280, including opt 15.441017, loss is 4.001506805419922
step: 83: time taken for the last 1 steps is 12.9672, including opt 15.494712, loss is 3.8665778636932373
step: 84: time taken for the last 1 steps is 12.9858, including opt 15.546013, loss is 4.1099772453308105
step: 85: time taken for the last 1 steps is 13.0493, including opt 15.599342, loss is 4.197932243347168
step: 86: time taken for the last 1 steps is 12.9717, including opt 15.529596, loss is 3.909334182739258
step: 87: time taken for the last 1 steps is 13.0683, including opt 15.637194, loss is 4.1783952713012695
step: 88: time taken for the last 1 steps is 13.0121, including opt 15.544597, loss is 4.438275337219238
step: 89: time taken for the last 1 steps is 12.9617, including opt 15.482368, loss is 4.112347602844238
step: 90: time taken for the last 1 steps is 12.9866, including opt 15.547726, loss is 3.975226640701294
step: 91: time taken for the last 1 steps is 12.9881, including opt 15.533653, loss is 3.936728000640869
step: 92: time taken for the last 1 steps is 13.0575, including opt 15.602130, loss is 3.9754414558410645
step: 93: time taken for the last 1 steps is 12.9520, including opt 15.496204, loss is 3.954409122467041
step: 94: time taken for the last 1 steps is 12.9411, including opt 15.471723, loss is 4.208159446716309
step: 95: time taken for the last 1 steps is 12.9817, including opt 15.491983, loss is 4.028080940246582
step: 96: time taken for the last 1 steps is 12.9718, including opt 15.506463, loss is 3.9571659564971924
step: 97: time taken for the last 1 steps is 13.0134, including opt 15.572786, loss is 3.999809980392456
step: 98: time taken for the last 1 steps is 12.9330, including opt 15.443197, loss is 4.0678510665893555
step: 99: time taken for the last 1 steps is 12.9593, including opt 15.484007, loss is 4.009326457977295
step: 100: time taken for the last 1 steps is 12.9158, including opt 15.455107, loss is 3.8418259620666504
step: 101: time taken for the last 1 steps is 12.9877, including opt 15.500998, loss is 4.291062355041504
step: 102: time taken for the last 1 steps is 12.9821, including opt 15.502387, loss is 4.0232834815979
step: 103: time taken for the last 1 steps is 12.9495, including opt 15.475742, loss is 4.089905738830566
step: 104: time taken for the last 1 steps is 13.0075, including opt 15.567437, loss is 4.056121826171875
step: 105: time taken for the last 1 steps is 13.0179, including opt 15.546013, loss is 4.030211448669434
step: 106: time taken for the last 1 steps is 12.9129, including opt 15.461455, loss is 3.965719699859619
step: 107: time taken for the last 1 steps is 12.9828, including opt 15.529472, loss is 4.331610679626465
step: 108: time taken for the last 1 steps is 12.8204, including opt 15.366524, loss is 4.134668827056885
step: 109: time taken for the last 1 steps is 12.9237, including opt 15.464372, loss is 3.9802699089050293
step: 110: time taken for the last 1 steps is 12.8747, including opt 15.409375, loss is 4.267245292663574
step: 111: time taken for the last 1 steps is 12.8704, including opt 15.390920, loss is 3.700106382369995
step: 112: time taken for the last 1 steps is 12.9629, including opt 15.479098, loss is 3.865509510040283
step: 113: time taken for the last 1 steps is 12.9455, including opt 15.484842, loss is 3.878971576690674
step: 114: time taken for the last 1 steps is 12.9171, including opt 15.426737, loss is 4.177242279052734
step: 115: time taken for the last 1 steps is 12.9746, including opt 15.516199, loss is 4.108732223510742
step: 116: time taken for the last 1 steps is 12.9418, including opt 15.484719, loss is 3.7063651084899902
step: 117: time taken for the last 1 steps is 12.9917, including opt 15.526182, loss is 4.03324031829834
step: 118: time taken for the last 1 steps is 12.9353, including opt 15.451454, loss is 4.08077335357666
step: 119: time taken for the last 1 steps is 12.9082, including opt 15.416273, loss is 3.8962974548339844
step: 120: time taken for the last 1 steps is 12.9771, including opt 15.499500, loss is 4.275963306427002
step: 121: time taken for the last 1 steps is 12.9590, including opt 15.484139, loss is 3.6256115436553955
step: 122: time taken for the last 1 steps is 12.8942, including opt 15.396664, loss is 4.223722457885742
step: 123: time taken for the last 1 steps is 12.9140, including opt 15.452754, loss is 3.8450114727020264
step: 124: time taken for the last 1 steps is 12.9281, including opt 15.454626, loss is 3.9138126373291016
step: 125: time taken for the last 1 steps is 12.9417, including opt 15.459686, loss is 4.245029449462891
step: 126: time taken for the last 1 steps is 12.8724, including opt 15.377251, loss is 3.9536757469177246
step: 127: time taken for the last 1 steps is 12.9797, including opt 15.486872, loss is 3.850186586380005
step: 128: time taken for the last 1 steps is 12.9182, including opt 15.414702, loss is 4.176522254943848
step: 129: time taken for the last 1 steps is 12.8911, including opt 15.399451, loss is 4.1182332038879395
step: 130: time taken for the last 1 steps is 12.9208, including opt 15.434717, loss is 3.80688738822937
step: 131: time taken for the last 1 steps is 12.8675, including opt 15.376936, loss is 4.030346870422363
step: 132: time taken for the last 1 steps is 12.9278, including opt 15.458185, loss is 4.123183727264404
step: 133: time taken for the last 1 steps is 12.8988, including opt 15.389774, loss is 4.2830705642700195
step: 134: time taken for the last 1 steps is 12.8968, including opt 15.420472, loss is 3.904261827468872
step: 135: time taken for the last 1 steps is 12.9451, including opt 15.462808, loss is 4.093894958496094
step: 136: time taken for the last 1 steps is 12.9078, including opt 15.458155, loss is 4.320589065551758
step: 137: time taken for the last 1 steps is 12.9732, including opt 15.504683, loss is 4.034948825836182
step: 138: time taken for the last 1 steps is 12.9268, including opt 15.471451, loss is 3.8169326782226562
step: 139: time taken for the last 1 steps is 12.9442, including opt 15.481073, loss is 3.897155284881592
step: 140: time taken for the last 1 steps is 12.9664, including opt 15.482441, loss is 3.7415566444396973
step: 141: time taken for the last 1 steps is 12.9158, including opt 15.442649, loss is 4.038598537445068
step: 142: time taken for the last 1 steps is 12.9533, including opt 15.479935, loss is 4.149287700653076
step: 143: time taken for the last 1 steps is 12.9067, including opt 15.479794, loss is 4.366002082824707
step: 144: time taken for the last 1 steps is 12.9430, including opt 15.493544, loss is 4.13712215423584
step: 145: time taken for the last 1 steps is 12.9148, including opt 15.446710, loss is 4.321962356567383
step: 146: time taken for the last 1 steps is 13.0166, including opt 15.566175, loss is 4.076898097991943
step: 147: time taken for the last 1 steps is 12.9258, including opt 15.446396, loss is 4.3491106033325195
step: 148: time taken for the last 1 steps is 12.9243, including opt 15.450511, loss is 3.8747940063476562
step: 149: time taken for the last 1 steps is 12.9808, including opt 15.514072, loss is 3.7246711254119873
step: 150: time taken for the last 1 steps is 13.0091, including opt 15.562476, loss is 3.822829484939575
step: 151: time taken for the last 1 steps is 12.9837, including opt 15.539664, loss is 3.5614707469940186
step: 152: time taken for the last 1 steps is 12.9537, including opt 15.493366, loss is 3.694937229156494
step: 153: time taken for the last 1 steps is 12.9679, including opt 15.497889, loss is 4.039566993713379
step: 154: time taken for the last 1 steps is 12.9923, including opt 15.534631, loss is 3.9887452125549316
step: 155: time taken for the last 1 steps is 12.9496, including opt 15.453995, loss is 4.047961235046387
step: 156: time taken for the last 1 steps is 12.9339, including opt 15.483441, loss is 4.155400276184082
step: 157: time taken for the last 1 steps is 12.9742, including opt 15.488947, loss is 4.296553134918213
step: 158: time taken for the last 1 steps is 12.9849, including opt 15.521920, loss is 4.198834419250488
step: 159: time taken for the last 1 steps is 13.0191, including opt 15.581592, loss is 4.109445571899414
step: 160: time taken for the last 1 steps is 12.9269, including opt 15.431576, loss is 4.1314897537231445
step: 161: time taken for the last 1 steps is 12.9327, including opt 15.455484, loss is 4.201825141906738
step: 162: time taken for the last 1 steps is 12.9502, including opt 15.471120, loss is 3.868298053741455
step: 163: time taken for the last 1 steps is 12.9141, including opt 15.455554, loss is 3.7572085857391357
step: 164: time taken for the last 1 steps is 12.9603, including opt 15.497371, loss is 4.350882530212402
step: 165: time taken for the last 1 steps is 12.8947, including opt 15.412590, loss is 3.9138236045837402
step: 166: time taken for the last 1 steps is 12.9546, including opt 15.479428, loss is 3.73881459236145
step: 167: time taken for the last 1 steps is 12.9135, including opt 15.440014, loss is 3.7829749584198
step: 168: time taken for the last 1 steps is 12.9010, including opt 15.435443, loss is 4.134357452392578
step: 169: time taken for the last 1 steps is 12.9437, including opt 15.486006, loss is 3.9704291820526123
step: 170: time taken for the last 1 steps is 13.0512, including opt 15.585337, loss is 3.8195278644561768
step: 171: time taken for the last 1 steps is 12.9688, including opt 15.505422, loss is 4.154661655426025
step: 172: time taken for the last 1 steps is 12.9081, including opt 15.436310, loss is 3.8458938598632812
step: 173: time taken for the last 1 steps is 12.9159, including opt 15.451505, loss is 3.994682788848877
step: 174: time taken for the last 1 steps is 12.9744, including opt 15.491071, loss is 3.9827303886413574
step: 175: time taken for the last 1 steps is 13.0385, including opt 15.574370, loss is 3.9415225982666016
step: 176: time taken for the last 1 steps is 12.9875, including opt 15.532671, loss is 4.0056939125061035
step: 177: time taken for the last 1 steps is 13.0040, including opt 15.540879, loss is 3.904787540435791
step: 178: time taken for the last 1 steps is 12.9388, including opt 15.464014, loss is 3.8894119262695312
step: 179: time taken for the last 1 steps is 13.0049, including opt 15.549776, loss is 4.014441967010498
step: 180: time taken for the last 1 steps is 12.9981, including opt 15.533463, loss is 3.518415927886963
step: 181: time taken for the last 1 steps is 13.0379, including opt 15.578986, loss is 4.456755638122559
step: 182: time taken for the last 1 steps is 12.9333, including opt 15.466089, loss is 3.919780731201172
step: 183: time taken for the last 1 steps is 12.9355, including opt 15.476069, loss is 3.856545925140381
step: 184: time taken for the last 1 steps is 13.0540, including opt 15.586934, loss is 3.865908145904541
step: 185: time taken for the last 1 steps is 13.0032, including opt 15.513469, loss is 3.7256417274475098
step: 186: time taken for the last 1 steps is 12.9785, including opt 15.486298, loss is 3.809781312942505
step: 187: time taken for the last 1 steps is 12.9627, including opt 15.477745, loss is 3.5904552936553955
step: 188: time taken for the last 1 steps is 12.9180, including opt 15.433560, loss is 3.8654086589813232
step: 189: time taken for the last 1 steps is 12.9231, including opt 15.474602, loss is 3.983609437942505
step: 190: time taken for the last 1 steps is 12.9456, including opt 15.471073, loss is 3.9745116233825684
step: 191: time taken for the last 1 steps is 12.9331, including opt 15.437964, loss is 4.032744407653809
step: 192: time taken for the last 1 steps is 12.9808, including opt 15.516388, loss is 4.048998832702637
step: 193: time taken for the last 1 steps is 12.9150, including opt 15.437906, loss is 3.993208646774292
step: 194: time taken for the last 1 steps is 12.9166, including opt 15.441233, loss is 4.2695746421813965
step: 195: time taken for the last 1 steps is 12.9687, including opt 15.457888, loss is 3.9331963062286377
step: 196: time taken for the last 1 steps is 12.9443, including opt 15.470049, loss is 3.9313583374023438
step: 197: time taken for the last 1 steps is 12.9514, including opt 15.473977, loss is 3.8616459369659424
step: 198: time taken for the last 1 steps is 12.9375, including opt 15.473077, loss is 3.8012237548828125
step: 199: time taken for the last 1 steps is 12.9915, including opt 15.504104, loss is 3.876148223876953
step: 200: time taken for the last 1 steps is 12.9415, including opt 15.477687, loss is 4.076910495758057
step: 201: time taken for the last 1 steps is 12.9374, including opt 15.453683, loss is 3.8360702991485596
step: 202: time taken for the last 1 steps is 12.9037, including opt 15.446213, loss is 4.039969444274902
step: 203: time taken for the last 1 steps is 12.9512, including opt 15.472159, loss is 4.178214073181152
step: 204: time taken for the last 1 steps is 12.9826, including opt 15.494752, loss is 3.8460800647735596
step: 205: time taken for the last 1 steps is 12.9096, including opt 15.439256, loss is 3.8968143463134766
step: 206: time taken for the last 1 steps is 12.9127, including opt 15.429628, loss is 3.592207431793213
step: 207: time taken for the last 1 steps is 13.0112, including opt 15.527710, loss is 3.9597442150115967
step: 208: time taken for the last 1 steps is 13.0084, including opt 15.551590, loss is 3.934818744659424
step: 209: time taken for the last 1 steps is 12.9340, including opt 15.466067, loss is 4.338805198669434
step: 210: time taken for the last 1 steps is 13.0348, including opt 15.574777, loss is 4.065990924835205
step: 211: time taken for the last 1 steps is 12.9279, including opt 15.432986, loss is 3.476410388946533
step: 212: time taken for the last 1 steps is 12.9265, including opt 15.455601, loss is 3.8762543201446533
step: 213: time taken for the last 1 steps is 12.9253, including opt 15.439193, loss is 4.150082588195801
step: 214: time taken for the last 1 steps is 13.0092, including opt 15.536773, loss is 3.8242526054382324
step: 215: time taken for the last 1 steps is 12.9073, including opt 15.445593, loss is 3.827552080154419
step: 216: time taken for the last 1 steps is 12.9314, including opt 15.455007, loss is 4.002578258514404
step: 217: time taken for the last 1 steps is 12.9600, including opt 15.494655, loss is 3.757901668548584
step: 218: time taken for the last 1 steps is 12.9485, including opt 15.480834, loss is 3.609649658203125
step: 219: time taken for the last 1 steps is 13.0042, including opt 15.546495, loss is 3.832533359527588
step: 220: time taken for the last 1 steps is 12.9662, including opt 15.504641, loss is 3.744711399078369
step: 221: time taken for the last 1 steps is 12.8926, including opt 15.403493, loss is 3.6496822834014893
step: 222: time taken for the last 1 steps is 12.9847, including opt 15.546947, loss is 3.8788063526153564
step: 223: time taken for the last 1 steps is 13.0225, including opt 15.559668, loss is 3.547461986541748
step: 224: time taken for the last 1 steps is 12.9725, including opt 15.513461, loss is 4.108065128326416
step: 225: time taken for the last 1 steps is 12.9484, including opt 15.488515, loss is 3.7978196144104004
step: 226: time taken for the last 1 steps is 12.9548, including opt 15.518584, loss is 3.8338141441345215
step: 227: time taken for the last 1 steps is 12.9512, including opt 15.449868, loss is 3.891359806060791
step: 228: time taken for the last 1 steps is 12.9622, including opt 15.492038, loss is 3.6764252185821533
step: 229: time taken for the last 1 steps is 12.9835, including opt 15.520390, loss is 4.283459663391113
step: 230: time taken for the last 1 steps is 12.9892, including opt 15.540339, loss is 3.733856201171875
step: 231: time taken for the last 1 steps is 12.9593, including opt 15.511137, loss is 3.7730743885040283
step: 232: time taken for the last 1 steps is 12.9507, including opt 15.458694, loss is 3.740229368209839
step: 233: time taken for the last 1 steps is 12.9527, including opt 15.516920, loss is 3.828632354736328
step: 234: time taken for the last 1 steps is 12.9201, including opt 15.454271, loss is 3.730721950531006
step: 235: time taken for the last 1 steps is 12.8955, including opt 15.421086, loss is 3.930812120437622
step: 236: time taken for the last 1 steps is 12.9120, including opt 15.449381, loss is 4.08349609375
step: 237: time taken for the last 1 steps is 12.7587, including opt 15.183150, loss is 4.140689849853516
val_loss : 3.7658 : val_acc: 0.1363
updating stats...
Epoch: 2 starting...
step: 1: time taken for the last 1 steps is 12.9442, including opt 15.500418, loss is 3.8693785667419434
step: 2: time taken for the last 1 steps is 13.0088, including opt 15.561602, loss is 3.9792373180389404
step: 3: time taken for the last 1 steps is 12.8946, including opt 15.422806, loss is 4.081082344055176
step: 4: time taken for the last 1 steps is 12.8780, including opt 15.381064, loss is 3.6788439750671387
step: 5: time taken for the last 1 steps is 12.9047, including opt 15.431922, loss is 3.773883104324341
step: 6: time taken for the last 1 steps is 12.9458, including opt 15.473321, loss is 3.636366605758667
step: 7: time taken for the last 1 steps is 12.9351, including opt 15.436436, loss is 3.5335609912872314
step: 8: time taken for the last 1 steps is 12.8874, including opt 15.401332, loss is 3.9463844299316406
step: 9: time taken for the last 1 steps is 12.9272, including opt 15.444679, loss is 4.226304054260254
step: 10: time taken for the last 1 steps is 12.9444, including opt 15.459855, loss is 3.937394380569458
step: 11: time taken for the last 1 steps is 12.9503, including opt 15.469539, loss is 4.087075233459473
step: 12: time taken for the last 1 steps is 12.9083, including opt 15.429811, loss is 3.8830344676971436
step: 13: time taken for the last 1 steps is 12.8840, including opt 15.394755, loss is 3.6829559803009033
step: 14: time taken for the last 1 steps is 12.9502, including opt 15.454894, loss is 3.976586103439331
step: 15: time taken for the last 1 steps is 12.9239, including opt 15.450068, loss is 4.190851211547852
step: 16: time taken for the last 1 steps is 12.9341, including opt 15.429611, loss is 3.952547788619995
step: 17: time taken for the last 1 steps is 12.9296, including opt 15.440862, loss is 3.7871603965759277
step: 18: time taken for the last 1 steps is 12.9603, including opt 15.491840, loss is 4.0150861740112305
step: 19: time taken for the last 1 steps is 12.9941, including opt 15.516226, loss is 4.3052449226379395
step: 20: time taken for the last 1 steps is 12.8718, including opt 15.398144, loss is 3.899247646331787
step: 21: time taken for the last 1 steps is 12.9971, including opt 15.513976, loss is 3.785951614379883
step: 22: time taken for the last 1 steps is 12.8837, including opt 15.416794, loss is 3.6675915718078613
step: 23: time taken for the last 1 steps is 12.9539, including opt 15.484563, loss is 3.9332480430603027
step: 24: time taken for the last 1 steps is 12.9074, including opt 15.454839, loss is 3.8857052326202393
step: 25: time taken for the last 1 steps is 12.9634, including opt 15.503206, loss is 3.8974671363830566
step: 26: time taken for the last 1 steps is 12.9905, including opt 15.521319, loss is 3.9284679889678955
step: 27: time taken for the last 1 steps is 12.9472, including opt 15.466173, loss is 4.175166130065918
step: 28: time taken for the last 1 steps is 12.9444, including opt 15.469876, loss is 3.9889779090881348
step: 29: time taken for the last 1 steps is 13.0158, including opt 15.555745, loss is 3.843360185623169
step: 30: time taken for the last 1 steps is 12.9154, including opt 15.440644, loss is 3.8078861236572266
step: 31: time taken for the last 1 steps is 12.9920, including opt 15.535599, loss is 3.5937564373016357
step: 32: time taken for the last 1 steps is 12.9185, including opt 15.415441, loss is 3.8238329887390137
step: 33: time taken for the last 1 steps is 12.9115, including opt 15.424296, loss is 3.7380683422088623
step: 34: time taken for the last 1 steps is 12.9345, including opt 15.482251, loss is 3.932255983352661
step: 35: time taken for the last 1 steps is 12.9662, including opt 15.490415, loss is 3.8617348670959473
step: 36: time taken for the last 1 steps is 12.9865, including opt 15.497824, loss is 3.783409833908081
step: 37: time taken for the last 1 steps is 12.9115, including opt 15.450164, loss is 3.6983535289764404
step: 38: time taken for the last 1 steps is 12.9041, including opt 15.411453, loss is 4.040431022644043
step: 39: time taken for the last 1 steps is 12.9316, including opt 15.440201, loss is 3.948491334915161
step: 40: time taken for the last 1 steps is 12.9256, including opt 15.432503, loss is 3.9251182079315186
step: 41: time taken for the last 1 steps is 12.9618, including opt 15.482773, loss is 3.8617775440216064
step: 42: time taken for the last 1 steps is 12.9818, including opt 15.513099, loss is 3.8632075786590576
step: 43: time taken for the last 1 steps is 12.9491, including opt 15.496150, loss is 3.6430039405822754
step: 44: time taken for the last 1 steps is 12.8868, including opt 15.408708, loss is 3.7017974853515625
step: 45: time taken for the last 1 steps is 12.9301, including opt 15.460009, loss is 3.477083683013916
step: 46: time taken for the last 1 steps is 12.9688, including opt 15.503328, loss is 4.0226545333862305
step: 47: time taken for the last 1 steps is 12.9602, including opt 15.477783, loss is 3.5224056243896484
step: 48: time taken for the last 1 steps is 12.9789, including opt 15.518287, loss is 4.023545265197754
step: 49: time taken for the last 1 steps is 12.8682, including opt 15.405310, loss is 4.040861129760742
step: 50: time taken for the last 1 steps is 12.8977, including opt 15.429769, loss is 3.8449320793151855
step: 51: time taken for the last 1 steps is 12.9900, including opt 15.543626, loss is 3.5373528003692627
step: 52: time taken for the last 1 steps is 12.9680, including opt 15.525267, loss is 3.7762482166290283
step: 53: time taken for the last 1 steps is 12.9841, including opt 15.507757, loss is 3.9796111583709717
step: 54: time taken for the last 1 steps is 12.9231, including opt 15.498917, loss is 3.9565861225128174
step: 55: time taken for the last 1 steps is 12.9349, including opt 15.455965, loss is 3.65606951713562
step: 56: time taken for the last 1 steps is 12.9445, including opt 15.453082, loss is 3.806586742401123
step: 57: time taken for the last 1 steps is 12.9321, including opt 15.448373, loss is 3.981889247894287
step: 58: time taken for the last 1 steps is 12.8700, including opt 15.408945, loss is 3.919682025909424
step: 59: time taken for the last 1 steps is 12.9804, including opt 15.523428, loss is 3.446833372116089
step: 60: time taken for the last 1 steps is 12.8511, including opt 15.356082, loss is 3.8101649284362793
step: 61: time taken for the last 1 steps is 12.9115, including opt 15.442744, loss is 3.9135518074035645
step: 62: time taken for the last 1 steps is 12.9924, including opt 15.508216, loss is 3.9118220806121826
step: 63: time taken for the last 1 steps is 12.8832, including opt 15.421386, loss is 3.840965747833252
step: 64: time taken for the last 1 steps is 13.0525, including opt 15.593725, loss is 3.6466617584228516
step: 65: time taken for the last 1 steps is 12.9645, including opt 15.498373, loss is 3.7638537883758545
step: 66: time taken for the last 1 steps is 12.9627, including opt 15.467786, loss is 3.6427388191223145
step: 67: time taken for the last 1 steps is 12.9252, including opt 15.461765, loss is 3.3356337547302246
step: 68: time taken for the last 1 steps is 12.9163, including opt 15.444053, loss is 3.4050605297088623
step: 69: time taken for the last 1 steps is 12.9154, including opt 15.426556, loss is 3.961452007293701
step: 70: time taken for the last 1 steps is 12.8952, including opt 15.419515, loss is 4.251779556274414
step: 71: time taken for the last 1 steps is 12.9304, including opt 15.443650, loss is 3.784717559814453
step: 72: time taken for the last 1 steps is 12.8891, including opt 15.392598, loss is 3.981940507888794
step: 73: time taken for the last 1 steps is 12.8288, including opt 15.349109, loss is 3.6910758018493652
step: 74: time taken for the last 1 steps is 12.8900, including opt 15.436270, loss is 3.6866517066955566
step: 75: time taken for the last 1 steps is 12.9285, including opt 15.465192, loss is 3.7594590187072754
step: 76: time taken for the last 1 steps is 12.8934, including opt 15.415413, loss is 4.074532985687256
step: 77: time taken for the last 1 steps is 12.8943, including opt 15.376234, loss is 3.7796006202697754
step: 78: time taken for the last 1 steps is 12.8903, including opt 15.435956, loss is 3.7586357593536377
step: 79: time taken for the last 1 steps is 12.8939, including opt 15.435058, loss is 3.766085386276245
step: 80: time taken for the last 1 steps is 12.9284, including opt 15.474598, loss is 4.014730930328369
step: 81: time taken for the last 1 steps is 12.8834, including opt 15.404382, loss is 3.877291202545166
step: 82: time taken for the last 1 steps is 13.0028, including opt 15.535750, loss is 3.632462739944458
step: 83: time taken for the last 1 steps is 12.9918, including opt 15.530176, loss is 3.812713623046875
step: 84: time taken for the last 1 steps is 12.8818, including opt 15.420387, loss is 3.7077033519744873
step: 85: time taken for the last 1 steps is 12.9755, including opt 15.545513, loss is 4.224617958068848
step: 86: time taken for the last 1 steps is 12.9743, including opt 15.524801, loss is 3.6131770610809326
step: 87: time taken for the last 1 steps is 12.9182, including opt 15.458738, loss is 4.029125690460205
step: 88: time taken for the last 1 steps is 12.9826, including opt 15.534688, loss is 4.307351112365723
step: 89: time taken for the last 1 steps is 12.9514, including opt 15.476123, loss is 3.769984006881714
step: 90: time taken for the last 1 steps is 12.9856, including opt 15.524625, loss is 3.794914960861206
step: 91: time taken for the last 1 steps is 12.9238, including opt 15.441860, loss is 3.7221591472625732
step: 92: time taken for the last 1 steps is 12.9347, including opt 15.468599, loss is 3.822981357574463
step: 93: time taken for the last 1 steps is 12.9471, including opt 15.460849, loss is 3.6502506732940674
step: 94: time taken for the last 1 steps is 12.9769, including opt 15.505740, loss is 3.889395236968994
step: 95: time taken for the last 1 steps is 12.9349, including opt 15.484605, loss is 3.964022159576416
step: 96: time taken for the last 1 steps is 12.9996, including opt 15.524884, loss is 3.8354854583740234
step: 97: time taken for the last 1 steps is 13.0363, including opt 15.561466, loss is 3.5679984092712402
step: 98: time taken for the last 1 steps is 12.9713, including opt 15.527176, loss is 3.7331442832946777
step: 99: time taken for the last 1 steps is 13.0213, including opt 15.595122, loss is 3.7279372215270996
step: 100: time taken for the last 1 steps is 12.9131, including opt 15.469147, loss is 3.7998039722442627
step: 101: time taken for the last 1 steps is 12.9822, including opt 15.527607, loss is 4.0680437088012695
step: 102: time taken for the last 1 steps is 13.0174, including opt 15.554097, loss is 3.864187240600586
step: 103: time taken for the last 1 steps is 12.9696, including opt 15.518837, loss is 3.4726994037628174
step: 104: time taken for the last 1 steps is 12.9783, including opt 15.507070, loss is 3.7774112224578857
step: 105: time taken for the last 1 steps is 12.9477, including opt 15.495096, loss is 3.6553001403808594
step: 106: time taken for the last 1 steps is 13.0102, including opt 15.534119, loss is 3.4753241539001465
step: 107: time taken for the last 1 steps is 12.9694, including opt 15.494922, loss is 4.052267551422119
step: 108: time taken for the last 1 steps is 12.8628, including opt 15.398893, loss is 3.792501926422119
step: 109: time taken for the last 1 steps is 12.9833, including opt 15.497093, loss is 3.9620304107666016
step: 110: time taken for the last 1 steps is 12.9509, including opt 15.476830, loss is 3.816257953643799
step: 111: time taken for the last 1 steps is 12.9042, including opt 15.443487, loss is 3.275386095046997
step: 112: time taken for the last 1 steps is 12.9869, including opt 15.516625, loss is 3.495631456375122
step: 113: time taken for the last 1 steps is 12.9317, including opt 15.437650, loss is 3.5483336448669434
step: 114: time taken for the last 1 steps is 12.9450, including opt 15.493485, loss is 3.9034581184387207
step: 115: time taken for the last 1 steps is 12.9363, including opt 15.466371, loss is 3.9867405891418457
step: 116: time taken for the last 1 steps is 12.9397, including opt 15.468494, loss is 3.7894797325134277
step: 117: time taken for the last 1 steps is 12.9591, including opt 15.491241, loss is 3.819809675216675
step: 118: time taken for the last 1 steps is 12.8991, including opt 15.444613, loss is 3.759282350540161
step: 119: time taken for the last 1 steps is 12.9390, including opt 15.464764, loss is 3.558647632598877
step: 120: time taken for the last 1 steps is 12.9128, including opt 15.461951, loss is 3.775280714035034
step: 121: time taken for the last 1 steps is 12.9186, including opt 15.417841, loss is 3.2794430255889893
step: 122: time taken for the last 1 steps is 12.9386, including opt 15.448179, loss is 3.7108657360076904
step: 123: time taken for the last 1 steps is 12.9343, including opt 15.459695, loss is 3.4786174297332764
step: 124: time taken for the last 1 steps is 12.9770, including opt 15.525058, loss is 3.5597338676452637
step: 125: time taken for the last 1 steps is 12.9434, including opt 15.498511, loss is 3.89516019821167
step: 126: time taken for the last 1 steps is 12.8340, including opt 15.353129, loss is 3.675840377807617
step: 127: time taken for the last 1 steps is 12.8955, including opt 15.410745, loss is 3.7606403827667236
step: 128: time taken for the last 1 steps is 12.8651, including opt 15.372340, loss is 3.9862377643585205
step: 129: time taken for the last 1 steps is 12.9400, including opt 15.496903, loss is 3.969996929168701
step: 130: time taken for the last 1 steps is 12.9548, including opt 15.486128, loss is 3.617870807647705
step: 131: time taken for the last 1 steps is 12.9647, including opt 15.507628, loss is 3.835763931274414
step: 132: time taken for the last 1 steps is 12.9123, including opt 15.447646, loss is 3.9005188941955566
step: 133: time taken for the last 1 steps is 12.9638, including opt 15.491998, loss is 3.683218002319336
step: 134: time taken for the last 1 steps is 12.9394, including opt 15.425437, loss is 3.717252731323242
step: 135: time taken for the last 1 steps is 12.9059, including opt 15.424013, loss is 4.217154026031494
step: 136: time taken for the last 1 steps is 13.0227, including opt 15.560290, loss is 4.108799457550049
step: 137: time taken for the last 1 steps is 12.9820, including opt 15.517214, loss is 3.8628177642822266
step: 138: time taken for the last 1 steps is 12.9314, including opt 15.477653, loss is 3.463736057281494
step: 139: time taken for the last 1 steps is 12.9048, including opt 15.409589, loss is 3.4335436820983887
step: 140: time taken for the last 1 steps is 12.9824, including opt 15.516419, loss is 3.4740982055664062
step: 141: time taken for the last 1 steps is 13.0369, including opt 15.556335, loss is 3.7226855754852295
step: 142: time taken for the last 1 steps is 12.9169, including opt 15.458154, loss is 3.6321969032287598
step: 143: time taken for the last 1 steps is 13.0048, including opt 15.526147, loss is 4.139720439910889
step: 144: time taken for the last 1 steps is 12.9202, including opt 15.445394, loss is 3.696511745452881
step: 145: time taken for the last 1 steps is 12.9502, including opt 15.472101, loss is 4.052031517028809
step: 146: time taken for the last 1 steps is 12.9695, including opt 15.530929, loss is 3.719536542892456
step: 147: time taken for the last 1 steps is 13.0113, including opt 15.534097, loss is 3.701813220977783
step: 148: time taken for the last 1 steps is 12.9885, including opt 15.479730, loss is 3.7236251831054688
step: 149: time taken for the last 1 steps is 12.9250, including opt 15.473389, loss is 3.659418821334839
step: 150: time taken for the last 1 steps is 12.9159, including opt 15.452405, loss is 3.7285995483398438
step: 151: time taken for the last 1 steps is 12.9199, including opt 15.470051, loss is 3.4710097312927246
step: 152: time taken for the last 1 steps is 12.9747, including opt 15.517505, loss is 3.3675448894500732
step: 153: time taken for the last 1 steps is 12.8559, including opt 15.399139, loss is 3.7412056922912598
step: 154: time taken for the last 1 steps is 12.8315, including opt 15.373615, loss is 3.5863394737243652
step: 155: time taken for the last 1 steps is 12.9727, including opt 15.503167, loss is 3.668532609939575
step: 156: time taken for the last 1 steps is 12.9381, including opt 15.494199, loss is 4.021303653717041
step: 157: time taken for the last 1 steps is 12.9155, including opt 15.436383, loss is 3.8147780895233154
step: 158: time taken for the last 1 steps is 12.9454, including opt 15.481202, loss is 3.7744317054748535
step: 159: time taken for the last 1 steps is 12.9060, including opt 15.409739, loss is 4.249209880828857
step: 160: time taken for the last 1 steps is 12.9412, including opt 15.439086, loss is 3.7267208099365234
step: 161: time taken for the last 1 steps is 12.9571, including opt 15.486141, loss is 3.9516587257385254
step: 162: time taken for the last 1 steps is 12.8691, including opt 15.386513, loss is 3.6936638355255127
step: 163: time taken for the last 1 steps is 12.9219, including opt 15.442968, loss is 3.419236421585083
step: 164: time taken for the last 1 steps is 12.8657, including opt 15.366190, loss is 4.328894138336182
step: 165: time taken for the last 1 steps is 12.9186, including opt 15.408548, loss is 4.017320156097412
step: 166: time taken for the last 1 steps is 12.8518, including opt 15.366262, loss is 3.4963481426239014
step: 167: time taken for the last 1 steps is 12.8981, including opt 15.389576, loss is 3.6090798377990723
step: 168: time taken for the last 1 steps is 12.9646, including opt 15.496112, loss is 3.568174362182617
step: 169: time taken for the last 1 steps is 12.8651, including opt 15.407665, loss is 3.602795124053955
step: 170: time taken for the last 1 steps is 12.9845, including opt 15.494582, loss is 3.6015982627868652
step: 171: time taken for the last 1 steps is 12.8716, including opt 15.421354, loss is 4.120584487915039
step: 172: time taken for the last 1 steps is 12.9543, including opt 15.496878, loss is 3.3847193717956543
step: 173: time taken for the last 1 steps is 12.8864, including opt 15.405001, loss is 3.6578030586242676
step: 174: time taken for the last 1 steps is 13.0012, including opt 15.547443, loss is 3.5505294799804688
step: 175: time taken for the last 1 steps is 12.9691, including opt 15.515635, loss is 3.8324851989746094
step: 176: time taken for the last 1 steps is 12.9633, including opt 15.498666, loss is 3.5782501697540283
step: 177: time taken for the last 1 steps is 12.9995, including opt 15.538529, loss is 3.503589153289795
step: 178: time taken for the last 1 steps is 12.8677, including opt 15.413456, loss is 3.865133285522461
step: 179: time taken for the last 1 steps is 12.9153, including opt 15.461205, loss is 3.8488001823425293
step: 180: time taken for the last 1 steps is 12.8797, including opt 15.408162, loss is 3.503871202468872
step: 181: time taken for the last 1 steps is 12.9252, including opt 15.431849, loss is 4.041670799255371
step: 182: time taken for the last 1 steps is 12.9049, including opt 15.430162, loss is 3.8116672039031982
step: 183: time taken for the last 1 steps is 12.9389, including opt 15.485245, loss is 3.630807399749756
step: 184: time taken for the last 1 steps is 12.9640, including opt 15.468776, loss is 3.721752643585205
step: 185: time taken for the last 1 steps is 12.9218, including opt 15.445239, loss is 3.581904649734497
step: 186: time taken for the last 1 steps is 12.8807, including opt 15.413580, loss is 3.573009967803955
step: 187: time taken for the last 1 steps is 12.9947, including opt 15.511935, loss is 3.598236083984375
step: 188: time taken for the last 1 steps is 12.8576, including opt 15.370087, loss is 3.7251949310302734
step: 189: time taken for the last 1 steps is 12.9276, including opt 15.450574, loss is 3.7362399101257324
step: 190: time taken for the last 1 steps is 12.9990, including opt 15.558790, loss is 3.7169318199157715
step: 191: time taken for the last 1 steps is 12.9364, including opt 15.450858, loss is 3.7119698524475098
step: 192: time taken for the last 1 steps is 12.9471, including opt 15.485657, loss is 3.7040343284606934
step: 193: time taken for the last 1 steps is 12.9536, including opt 15.484104, loss is 3.5547409057617188
step: 194: time taken for the last 1 steps is 12.9502, including opt 15.472856, loss is 3.919175624847412
step: 195: time taken for the last 1 steps is 12.8888, including opt 15.418819, loss is 3.726811170578003
step: 196: time taken for the last 1 steps is 12.9400, including opt 15.475915, loss is 3.8413748741149902
step: 197: time taken for the last 1 steps is 12.9597, including opt 15.486987, loss is 3.790189027786255
step: 198: time taken for the last 1 steps is 12.8814, including opt 15.423330, loss is 3.387907028198242
step: 199: time taken for the last 1 steps is 12.9542, including opt 15.484891, loss is 3.6528961658477783
step: 200: time taken for the last 1 steps is 12.9157, including opt 15.418907, loss is 3.831444263458252
step: 201: time taken for the last 1 steps is 12.8434, including opt 15.372467, loss is 3.826207399368286
step: 202: time taken for the last 1 steps is 12.9342, including opt 15.439191, loss is 4.011013984680176
step: 203: time taken for the last 1 steps is 12.8541, including opt 15.381104, loss is 4.2047810554504395
step: 204: time taken for the last 1 steps is 12.9228, including opt 15.434483, loss is 3.8292431831359863
step: 205: time taken for the last 1 steps is 12.8958, including opt 15.427595, loss is 3.7016398906707764
step: 206: time taken for the last 1 steps is 12.9149, including opt 15.430025, loss is 3.537548065185547
step: 207: time taken for the last 1 steps is 12.9216, including opt 15.448825, loss is 3.817686080932617
step: 208: time taken for the last 1 steps is 12.8758, including opt 15.394265, loss is 3.9235167503356934
step: 209: time taken for the last 1 steps is 12.9049, including opt 15.422406, loss is 3.8598697185516357
step: 210: time taken for the last 1 steps is 13.0007, including opt 15.532795, loss is 3.808232545852661
step: 211: time taken for the last 1 steps is 13.0503, including opt 15.563010, loss is 3.180088520050049
step: 212: time taken for the last 1 steps is 12.8978, including opt 15.433463, loss is 3.900413990020752
step: 213: time taken for the last 1 steps is 12.9318, including opt 15.462909, loss is 3.8021774291992188
step: 214: time taken for the last 1 steps is 12.9407, including opt 15.457791, loss is 3.6988117694854736
step: 215: time taken for the last 1 steps is 12.9380, including opt 15.458375, loss is 3.8368582725524902
step: 216: time taken for the last 1 steps is 12.9659, including opt 15.486094, loss is 3.6612448692321777
step: 217: time taken for the last 1 steps is 13.0166, including opt 15.543893, loss is 3.4830710887908936
step: 218: time taken for the last 1 steps is 12.9695, including opt 15.489317, loss is 3.1466619968414307
step: 219: time taken for the last 1 steps is 12.9664, including opt 15.508867, loss is 3.565274477005005
step: 220: time taken for the last 1 steps is 12.9791, including opt 15.493776, loss is 3.316155195236206
step: 221: time taken for the last 1 steps is 12.8882, including opt 15.422657, loss is 3.4011497497558594
step: 222: time taken for the last 1 steps is 12.8779, including opt 15.416411, loss is 3.8370423316955566
step: 223: time taken for the last 1 steps is 12.9202, including opt 15.455450, loss is 3.1403775215148926
step: 224: time taken for the last 1 steps is 12.8252, including opt 15.342908, loss is 4.10186243057251
step: 225: time taken for the last 1 steps is 12.9047, including opt 15.440386, loss is 3.711658000946045
step: 226: time taken for the last 1 steps is 12.8974, including opt 15.424906, loss is 3.8342010974884033
step: 227: time taken for the last 1 steps is 12.8964, including opt 15.417400, loss is 3.5993683338165283
step: 228: time taken for the last 1 steps is 13.0100, including opt 15.537453, loss is 3.391702175140381
step: 229: time taken for the last 1 steps is 12.9159, including opt 15.433802, loss is 3.8852157592773438
step: 230: time taken for the last 1 steps is 12.8858, including opt 15.422072, loss is 3.527730941772461
step: 231: time taken for the last 1 steps is 12.8843, including opt 15.367741, loss is 3.571895122528076
step: 232: time taken for the last 1 steps is 12.9135, including opt 15.404237, loss is 3.8786094188690186
step: 233: time taken for the last 1 steps is 12.9468, including opt 15.470755, loss is 3.689422607421875
step: 234: time taken for the last 1 steps is 12.9095, including opt 15.438729, loss is 3.762986660003662
step: 235: time taken for the last 1 steps is 12.9713, including opt 15.468927, loss is 4.044989109039307
step: 236: time taken for the last 1 steps is 12.8516, including opt 15.374075, loss is 3.830427885055542
step: 237: time taken for the last 1 steps is 12.6264, including opt 15.067387, loss is 4.106186866760254
val_loss : 3.5847 : val_acc: 0.1679
updating stats...
Epoch: 3 starting...
step: 1: time taken for the last 1 steps is 12.8758, including opt 15.389861, loss is 3.850447177886963
step: 2: time taken for the last 1 steps is 12.9057, including opt 15.428613, loss is 3.918501377105713
step: 3: time taken for the last 1 steps is 12.9590, including opt 15.495163, loss is 3.8393447399139404
step: 4: time taken for the last 1 steps is 12.8820, including opt 15.443376, loss is 3.5741050243377686
step: 5: time taken for the last 1 steps is 12.9896, including opt 15.523921, loss is 3.6870036125183105
step: 6: time taken for the last 1 steps is 12.9079, including opt 15.424634, loss is 3.496582508087158
step: 7: time taken for the last 1 steps is 12.9207, including opt 15.462881, loss is 3.462451219558716
step: 8: time taken for the last 1 steps is 12.9681, including opt 15.469160, loss is 3.7258973121643066
step: 9: time taken for the last 1 steps is 12.8833, including opt 15.416888, loss is 3.8339531421661377
step: 10: time taken for the last 1 steps is 12.8924, including opt 15.427411, loss is 3.5096611976623535
step: 11: time taken for the last 1 steps is 12.8972, including opt 15.398716, loss is 4.048272132873535
step: 12: time taken for the last 1 steps is 12.8842, including opt 15.386993, loss is 3.7614645957946777
step: 13: time taken for the last 1 steps is 12.8886, including opt 15.416074, loss is 3.3098082542419434
step: 14: time taken for the last 1 steps is 12.8301, including opt 15.352362, loss is 4.003854751586914
step: 15: time taken for the last 1 steps is 12.9753, including opt 15.495409, loss is 3.9652836322784424
step: 16: time taken for the last 1 steps is 12.9164, including opt 15.413550, loss is 3.8235225677490234
step: 17: time taken for the last 1 steps is 12.9305, including opt 15.445741, loss is 3.5726001262664795
step: 18: time taken for the last 1 steps is 12.8025, including opt 15.304949, loss is 3.8658032417297363
step: 19: time taken for the last 1 steps is 12.8771, including opt 15.394322, loss is 4.181225299835205
step: 20: time taken for the last 1 steps is 12.8763, including opt 15.422109, loss is 3.622831344604492
step: 21: time taken for the last 1 steps is 12.9114, including opt 15.425892, loss is 3.566603899002075
step: 22: time taken for the last 1 steps is 12.9546, including opt 15.445686, loss is 3.514868974685669
step: 23: time taken for the last 1 steps is 12.9126, including opt 15.425624, loss is 3.833653688430786
step: 24: time taken for the last 1 steps is 12.8106, including opt 15.315744, loss is 3.290740966796875
step: 25: time taken for the last 1 steps is 12.8920, including opt 15.393887, loss is 3.6859936714172363
step: 26: time taken for the last 1 steps is 12.8986, including opt 15.442122, loss is 4.184823513031006
step: 27: time taken for the last 1 steps is 12.9150, including opt 15.440011, loss is 4.087386131286621
step: 28: time taken for the last 1 steps is 12.9889, including opt 15.535068, loss is 3.9971728324890137
step: 29: time taken for the last 1 steps is 12.9089, including opt 15.432907, loss is 3.6413350105285645
step: 30: time taken for the last 1 steps is 12.8989, including opt 15.397594, loss is 3.6482913494110107
step: 31: time taken for the last 1 steps is 12.9820, including opt 15.498545, loss is 3.515425443649292
step: 32: time taken for the last 1 steps is 12.8628, including opt 15.356711, loss is 3.5614027976989746
step: 33: time taken for the last 1 steps is 12.8281, including opt 15.346552, loss is 3.693094253540039
step: 34: time taken for the last 1 steps is 12.8676, including opt 15.362768, loss is 3.6241180896759033
step: 35: time taken for the last 1 steps is 12.8688, including opt 15.373358, loss is 3.4916813373565674
step: 36: time taken for the last 1 steps is 12.9056, including opt 15.417670, loss is 3.704329252243042
step: 37: time taken for the last 1 steps is 12.9849, including opt 15.512247, loss is 3.6497550010681152
step: 38: time taken for the last 1 steps is 12.9312, including opt 15.467974, loss is 3.656665086746216
step: 39: time taken for the last 1 steps is 12.9619, including opt 15.472399, loss is 3.680276870727539
step: 40: time taken for the last 1 steps is 12.9418, including opt 15.437608, loss is 3.7009830474853516
step: 41: time taken for the last 1 steps is 12.8373, including opt 15.356011, loss is 3.796445846557617
step: 42: time taken for the last 1 steps is 12.9777, including opt 15.486048, loss is 3.825899124145508
step: 43: time taken for the last 1 steps is 13.0015, including opt 15.525326, loss is 3.4007065296173096
step: 44: time taken for the last 1 steps is 13.0467, including opt 15.558323, loss is 3.5433361530303955
step: 45: time taken for the last 1 steps is 12.8692, including opt 15.407467, loss is 3.480074644088745
step: 46: time taken for the last 1 steps is 12.9389, including opt 15.481784, loss is 3.7855911254882812
step: 47: time taken for the last 1 steps is 12.9161, including opt 15.422138, loss is 3.57977294921875
step: 48: time taken for the last 1 steps is 12.8954, including opt 15.432739, loss is 3.894746780395508
step: 49: time taken for the last 1 steps is 12.9629, including opt 15.468826, loss is 3.993455410003662
step: 50: time taken for the last 1 steps is 12.9072, including opt 15.412170, loss is 3.6020736694335938
step: 51: time taken for the last 1 steps is 12.9011, including opt 15.422876, loss is 3.3021578788757324
step: 52: time taken for the last 1 steps is 12.9462, including opt 15.496021, loss is 3.4053432941436768
step: 53: time taken for the last 1 steps is 12.9035, including opt 15.423616, loss is 3.7588977813720703
step: 54: time taken for the last 1 steps is 12.8963, including opt 15.390591, loss is 3.890963315963745
step: 55: time taken for the last 1 steps is 12.9422, including opt 15.484032, loss is 3.555626392364502
step: 56: time taken for the last 1 steps is 12.9311, including opt 15.429685, loss is 3.5128817558288574
step: 57: time taken for the last 1 steps is 12.8874, including opt 15.427317, loss is 3.8082916736602783
step: 58: time taken for the last 1 steps is 12.9212, including opt 15.417509, loss is 3.8167643547058105
step: 59: time taken for the last 1 steps is 12.9362, including opt 15.460818, loss is 3.269003391265869
step: 60: time taken for the last 1 steps is 12.9547, including opt 15.490236, loss is 3.5460143089294434
step: 61: time taken for the last 1 steps is 12.9537, including opt 15.466359, loss is 3.760171413421631
step: 62: time taken for the last 1 steps is 12.9689, including opt 15.510064, loss is 3.5497467517852783
step: 63: time taken for the last 1 steps is 12.9352, including opt 15.431441, loss is 3.6221401691436768
step: 64: time taken for the last 1 steps is 12.9883, including opt 15.497996, loss is 3.5380280017852783
step: 65: time taken for the last 1 steps is 12.8839, including opt 15.401450, loss is 3.621501922607422
step: 66: time taken for the last 1 steps is 12.9163, including opt 15.430975, loss is 3.534719944000244
step: 67: time taken for the last 1 steps is 12.8899, including opt 15.379703, loss is 3.040767192840576
step: 68: time taken for the last 1 steps is 12.9959, including opt 15.500005, loss is 3.518791913986206
step: 69: time taken for the last 1 steps is 12.9328, including opt 15.448372, loss is 3.815081834793091
step: 70: time taken for the last 1 steps is 12.8792, including opt 15.401789, loss is 4.026322364807129
step: 71: time taken for the last 1 steps is 12.8674, including opt 15.398872, loss is 3.625756025314331
step: 72: time taken for the last 1 steps is 12.8840, including opt 15.396788, loss is 3.8352203369140625
step: 73: time taken for the last 1 steps is 12.8609, including opt 15.386285, loss is 3.5460171699523926
step: 74: time taken for the last 1 steps is 12.9433, including opt 15.428664, loss is 3.6238770484924316
step: 75: time taken for the last 1 steps is 12.7501, including opt 15.263395, loss is 3.6286983489990234
step: 76: time taken for the last 1 steps is 12.8994, including opt 15.422184, loss is 3.8733742237091064
step: 77: time taken for the last 1 steps is 12.9004, including opt 15.400892, loss is 3.577197551727295
step: 78: time taken for the last 1 steps is 12.9831, including opt 15.469310, loss is 3.985025405883789
step: 79: time taken for the last 1 steps is 12.9044, including opt 15.418798, loss is 3.2982497215270996
step: 80: time taken for the last 1 steps is 12.9078, including opt 15.428104, loss is 3.5881423950195312
step: 81: time taken for the last 1 steps is 12.8602, including opt 15.376176, loss is 3.571816921234131
step: 82: time taken for the last 1 steps is 12.8427, including opt 15.354308, loss is 3.6531360149383545
step: 83: time taken for the last 1 steps is 12.9693, including opt 15.488738, loss is 3.667128086090088
step: 84: time taken for the last 1 steps is 12.9339, including opt 15.444362, loss is 3.7207419872283936
step: 85: time taken for the last 1 steps is 12.9262, including opt 15.428766, loss is 4.058774471282959
step: 86: time taken for the last 1 steps is 12.9295, including opt 15.457137, loss is 3.625124454498291
step: 87: time taken for the last 1 steps is 12.9321, including opt 15.470272, loss is 3.7473902702331543
step: 88: time taken for the last 1 steps is 12.9522, including opt 15.468776, loss is 3.9235401153564453
step: 89: time taken for the last 1 steps is 12.9380, including opt 15.451127, loss is 3.845609664916992
step: 90: time taken for the last 1 steps is 12.9523, including opt 15.479926, loss is 3.617638111114502
step: 91: time taken for the last 1 steps is 12.9128, including opt 15.421896, loss is 3.701866626739502
step: 92: time taken for the last 1 steps is 12.8880, including opt 15.433577, loss is 3.819185256958008
step: 93: time taken for the last 1 steps is 12.8984, including opt 15.409822, loss is 3.578911542892456
step: 94: time taken for the last 1 steps is 12.9390, including opt 15.444121, loss is 3.6822457313537598
step: 95: time taken for the last 1 steps is 12.9726, including opt 15.481013, loss is 3.944187641143799
step: 96: time taken for the last 1 steps is 12.9518, including opt 15.470816, loss is 3.5304970741271973
step: 97: time taken for the last 1 steps is 12.9471, including opt 15.450621, loss is 3.6983871459960938
step: 98: time taken for the last 1 steps is 12.9052, including opt 15.403150, loss is 3.6770267486572266
step: 99: time taken for the last 1 steps is 12.9204, including opt 15.416047, loss is 3.7626125812530518
step: 100: time taken for the last 1 steps is 12.9174, including opt 15.442979, loss is 3.689861297607422
step: 101: time taken for the last 1 steps is 12.9173, including opt 15.437241, loss is 3.9444937705993652
step: 102: time taken for the last 1 steps is 12.9650, including opt 15.492455, loss is 3.76350736618042
step: 103: time taken for the last 1 steps is 12.8332, including opt 15.352000, loss is 3.365955352783203
step: 104: time taken for the last 1 steps is 12.9347, including opt 15.465519, loss is 3.7175488471984863
step: 105: time taken for the last 1 steps is 12.8685, including opt 15.355808, loss is 3.596315860748291
step: 106: time taken for the last 1 steps is 12.8631, including opt 15.374160, loss is 3.1915295124053955
step: 107: time taken for the last 1 steps is 12.8880, including opt 15.372206, loss is 3.903794765472412
step: 108: time taken for the last 1 steps is 12.9099, including opt 15.409998, loss is 3.544762134552002
step: 109: time taken for the last 1 steps is 12.9228, including opt 15.439552, loss is 3.6249403953552246
step: 110: time taken for the last 1 steps is 13.0086, including opt 15.499981, loss is 3.711193799972534
step: 111: time taken for the last 1 steps is 12.8937, including opt 15.394070, loss is 3.1679463386535645
step: 112: time taken for the last 1 steps is 12.9722, including opt 15.461345, loss is 3.383730411529541
step: 113: time taken for the last 1 steps is 12.9082, including opt 15.408362, loss is 3.5059139728546143
step: 114: time taken for the last 1 steps is 13.0441, including opt 15.576561, loss is 3.532622814178467
step: 115: time taken for the last 1 steps is 12.9548, including opt 15.471687, loss is 3.784928798675537
step: 116: time taken for the last 1 steps is 12.8449, including opt 15.346841, loss is 3.6342501640319824
step: 117: time taken for the last 1 steps is 12.8942, including opt 15.426289, loss is 3.6560301780700684
step: 118: time taken for the last 1 steps is 12.9269, including opt 15.416285, loss is 3.663069486618042
step: 119: time taken for the last 1 steps is 12.9168, including opt 15.415083, loss is 3.6044952869415283
step: 120: time taken for the last 1 steps is 12.9729, including opt 15.507153, loss is 3.815861940383911
step: 121: time taken for the last 1 steps is 12.9139, including opt 15.402746, loss is 3.1694087982177734
step: 122: time taken for the last 1 steps is 12.9708, including opt 15.492156, loss is 3.6695799827575684
step: 123: time taken for the last 1 steps is 12.9901, including opt 15.519064, loss is 3.2285284996032715
step: 124: time taken for the last 1 steps is 12.9064, including opt 15.424682, loss is 3.2342381477355957
step: 125: time taken for the last 1 steps is 12.9984, including opt 15.523388, loss is 3.6105258464813232
step: 126: time taken for the last 1 steps is 12.9609, including opt 15.463489, loss is 3.4300220012664795
step: 127: time taken for the last 1 steps is 12.9628, including opt 15.500603, loss is 3.4844627380371094
step: 128: time taken for the last 1 steps is 12.9176, including opt 15.439389, loss is 3.738433837890625
step: 129: time taken for the last 1 steps is 12.9311, including opt 15.434257, loss is 3.8698782920837402
step: 130: time taken for the last 1 steps is 12.9239, including opt 15.401520, loss is 3.622169017791748
step: 131: time taken for the last 1 steps is 12.9322, including opt 15.454633, loss is 3.953129529953003
step: 132: time taken for the last 1 steps is 12.9496, including opt 15.467794, loss is 3.7623531818389893
step: 133: time taken for the last 1 steps is 12.9395, including opt 15.475105, loss is 3.9012513160705566
step: 134: time taken for the last 1 steps is 12.8619, including opt 15.376457, loss is 3.640132427215576
step: 135: time taken for the last 1 steps is 12.8713, including opt 15.390433, loss is 3.926755428314209
step: 136: time taken for the last 1 steps is 12.8766, including opt 15.394422, loss is 3.7170090675354004
step: 137: time taken for the last 1 steps is 12.9308, including opt 15.432727, loss is 3.692924976348877
step: 138: time taken for the last 1 steps is 12.9009, including opt 15.405311, loss is 3.2269325256347656
step: 139: time taken for the last 1 steps is 12.9403, including opt 15.465116, loss is 3.290743350982666
step: 140: time taken for the last 1 steps is 12.9162, including opt 15.395824, loss is 3.2843480110168457
step: 141: time taken for the last 1 steps is 12.9091, including opt 15.417799, loss is 3.6836981773376465
step: 142: time taken for the last 1 steps is 12.8815, including opt 15.384154, loss is 3.484724760055542
step: 143: time taken for the last 1 steps is 12.9406, including opt 15.475830, loss is 3.873539447784424
step: 144: time taken for the last 1 steps is 12.8875, including opt 15.419949, loss is 3.3060073852539062
step: 145: time taken for the last 1 steps is 12.9555, including opt 15.472954, loss is 3.9039669036865234
step: 146: time taken for the last 1 steps is 13.0341, including opt 15.550731, loss is 3.572918653488159
step: 147: time taken for the last 1 steps is 12.8776, including opt 15.405139, loss is 3.960991621017456
step: 148: time taken for the last 1 steps is 12.9133, including opt 15.426279, loss is 3.324143171310425
step: 149: time taken for the last 1 steps is 12.8078, including opt 15.333707, loss is 3.4263787269592285
step: 150: time taken for the last 1 steps is 12.8921, including opt 15.403645, loss is 3.5133895874023438
step: 151: time taken for the last 1 steps is 12.9378, including opt 15.432031, loss is 3.2527718544006348
step: 152: time taken for the last 1 steps is 12.8966, including opt 15.415637, loss is 3.2782351970672607
step: 153: time taken for the last 1 steps is 12.9413, including opt 15.435282, loss is 3.8511993885040283
step: 154: time taken for the last 1 steps is 12.9340, including opt 15.444955, loss is 3.755619764328003
step: 155: time taken for the last 1 steps is 12.9224, including opt 15.441126, loss is 3.507967472076416
step: 156: time taken for the last 1 steps is 12.9423, including opt 15.442500, loss is 3.5681724548339844
step: 157: time taken for the last 1 steps is 12.9390, including opt 15.456769, loss is 3.8080849647521973
step: 158: time taken for the last 1 steps is 12.9217, including opt 15.427735, loss is 3.6451644897460938
step: 159: time taken for the last 1 steps is 12.9089, including opt 15.439120, loss is 4.014640808105469
step: 160: time taken for the last 1 steps is 12.9397, including opt 15.469562, loss is 3.562218427658081
step: 161: time taken for the last 1 steps is 12.9165, including opt 15.447373, loss is 3.833735942840576
step: 162: time taken for the last 1 steps is 12.8798, including opt 15.399373, loss is 3.3969902992248535
step: 163: time taken for the last 1 steps is 12.9610, including opt 15.480253, loss is 3.686244487762451
step: 164: time taken for the last 1 steps is 13.0229, including opt 15.565086, loss is 4.185545921325684
step: 165: time taken for the last 1 steps is 12.9861, including opt 15.510920, loss is 3.561957836151123
step: 166: time taken for the last 1 steps is 12.9399, including opt 15.480123, loss is 3.6180591583251953
step: 167: time taken for the last 1 steps is 12.9647, including opt 15.468926, loss is 3.424541473388672
step: 168: time taken for the last 1 steps is 12.9879, including opt 15.490615, loss is 3.6972687244415283
step: 169: time taken for the last 1 steps is 12.9469, including opt 15.479034, loss is 3.4680397510528564
step: 170: time taken for the last 1 steps is 12.9381, including opt 15.468832, loss is 3.448010206222534
step: 171: time taken for the last 1 steps is 12.8889, including opt 15.405982, loss is 3.9643054008483887
step: 172: time taken for the last 1 steps is 12.9181, including opt 15.435524, loss is 3.4254117012023926
step: 173: time taken for the last 1 steps is 12.9086, including opt 15.422223, loss is 3.4156863689422607
step: 174: time taken for the last 1 steps is 12.9707, including opt 15.508315, loss is 3.6665046215057373
step: 175: time taken for the last 1 steps is 12.9509, including opt 15.471068, loss is 3.7447731494903564