Weight-shared supernet training with pretrained BERT
- See modeling_bert_super.py for the model changes.
- See supernet_engine.py and supernet_train.py for the main supernet code.
- See v1 for the scripts from the first version of weight-shared training, which is deprecated.
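
For readers new to the idea, here is a minimal sketch of what "weight-shared" usually means in supernet training: every sub-network reuses a slice of one shared weight matrix instead of owning separate parameters. This is an illustration only and is not the code in supernet_engine.py or modeling_bert_super.py; the class `SharedLinear`, its `forward` signature, and the width choices are all hypothetical names picked for this example.

```python
import random


class SharedLinear:
    """Hypothetical linear layer whose sub-networks share one weight matrix.

    A sub-network of width (in_dim, out_dim) uses the first out_dim rows and
    the first in_dim columns of the full matrix, so smaller sub-networks are
    nested inside larger ones.
    """

    def __init__(self, max_in, max_out, seed=0):
        rng = random.Random(seed)
        # One set of parameters, sized for the largest sub-network.
        self.weight = [
            [rng.uniform(-0.1, 0.1) for _ in range(max_in)]
            for _ in range(max_out)
        ]
        self.bias = [0.0] * max_out

    def forward(self, x, out_dim):
        in_dim = len(x)
        # Slice the shared parameters instead of allocating new ones.
        return [
            sum(w * v for w, v in zip(self.weight[o][:in_dim], x)) + self.bias[o]
            for o in range(out_dim)
        ]


layer = SharedLinear(max_in=8, max_out=8)
x = [1.0] * 8

# Assumed training-loop shape: sample a sub-network width at each step.
rng = random.Random(1)
width = rng.choice([2, 4, 8])  # hypothetical search space
sampled = layer.forward(x[:width], out_dim=width)

small = layer.forward(x[:4], out_dim=4)  # a narrow sub-network
full = layer.forward(x, out_dim=8)       # the full supernet layer
```

Because both calls read from the same `layer.weight`, a gradient update made while training one sampled sub-network would also change the parameters seen by every other sub-network that overlaps its slice.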