[Model][MXNet] MXNet Tree LSTM example #279
Conversation
Continuing from your last comments:

Maybe you could use this? https://github.com/dmlc/dgl/blob/master/python/dgl/data/tree.py#L68 . We could expose this member and document it clearly.

Not really. That's still only available after instantiating the SST class.
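To make that constraint concrete, here is a minimal sketch of the access pattern under discussion; the `mode` argument and the `vocab` member name are assumptions read off the linked source, not a confirmed public API:

```python
from dgl.data import SST

# Hedged sketch: the vocabulary is only reachable from a constructed
# dataset object, not from the module itself (member name assumed
# from the linked tree.py source).
trainset = SST(mode='train')
vocab = trainset.vocab
print('vocab size:', len(vocab))
```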
@szha feel free to change the
Epoch 00019 | Step 00005 | Loss 4429.1836 | Acc 0.8113 | Root Acc 0.4727 | Time(s) 0.1602
Epoch 00019 | Step 00010 | Loss 4375.8833 | Acc 0.8147 | Root Acc 0.5352 | Time(s) 0.1601
Epoch 00019 | Step 00015 | Loss 4424.3398 | Acc 0.8081 | Root Acc 0.5703 | Time(s) 0.1600
Epoch 00019 | Step 00020 | Loss 4459.7549 | Acc 0.8126 | Root Acc 0.5156 | Time(s) 0.1598
Epoch 00019 | Step 00025 | Loss 4357.2935 | Acc 0.8135 | Root Acc 0.4961 | Time(s) 0.1596
Epoch 00019 | Step 00030 | Loss 4382.1328 | Acc 0.8193 | Root Acc 0.4961 | Time(s) 0.1593
Epoch 00019 training time 7.0636s
Epoch 00019 | Dev Acc 0.8139 | Root Acc 0.4723
0.04089534687986153
0.04089534687986153
Epoch 00020 | Step 00005 | Loss 4621.7788 | Acc 0.8121 | Root Acc 0.5508 | Time(s) 0.1593
Epoch 00020 | Step 00010 | Loss 4439.5488 | Acc 0.8166 | Root Acc 0.5117 | Time(s) 0.1593
Epoch 00020 | Step 00015 | Loss 4391.4717 | Acc 0.8120 | Root Acc 0.5430 | Time(s) 0.1593
Epoch 00020 | Step 00020 | Loss 4558.4761 | Acc 0.8156 | Root Acc 0.5586 | Time(s) 0.1594
Epoch 00020 | Step 00025 | Loss 4441.6011 | Acc 0.8065 | Root Acc 0.5977 | Time(s) 0.1592
Epoch 00020 | Step 00030 | Loss 4231.6099 | Acc 0.8100 | Root Acc 0.5195 | Time(s) 0.1593
Epoch 00020 training time 8.3417s
Epoch 00020 | Dev Acc 0.8143 | Root Acc 0.4668
0.040486393411062915
0.040486393411062915
Epoch 00021 | Step 00005 | Loss 4228.3027 | Acc 0.8208 | Root Acc 0.5469 | Time(s) 0.1595
Epoch 00021 | Step 00010 | Loss 4437.4014 | Acc 0.8099 | Root Acc 0.5117 | Time(s) 0.1594
Epoch 00021 | Step 00015 | Loss 4464.4297 | Acc 0.8190 | Root Acc 0.5273 | Time(s) 0.1595
Epoch 00021 | Step 00020 | Loss 4361.5220 | Acc 0.8083 | Root Acc 0.5117 | Time(s) 0.1598
Epoch 00021 | Step 00025 | Loss 4393.3721 | Acc 0.8164 | Root Acc 0.4961 | Time(s) 0.1598
Epoch 00021 | Step 00030 | Loss 4480.3940 | Acc 0.8085 | Root Acc 0.4727 | Time(s) 0.1601
Epoch 00021 training time 8.7110s
Epoch 00021 | Dev Acc 0.8138 | Root Acc 0.4714
------------------------------------------------------------------------------------
Epoch 00011 | Test Acc 0.8063 | Root Acc 0.4855
It seems that the speed has improved a lot!

Is this ready to be reviewed?

Yes, it's ready to be reviewed.
Looks good to me.
> [**Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks**](http://arxiv.org/abs/1503.00075)
> *Kai Sheng Tai, Richard Socher, and Christopher Manning*.

The provided implementation can achieve a test accuracy of 51.72, which is comparable with the result reported in the original paper: 51.0 (±0.5).
Does MXNet Tree-LSTM produce the same result as PyTorch? That's interesting.
    return batcher_dev

def prepare_glove():
    if not (os.path.exists('glove.840B.300d.txt')
I think the PyTorch Tree-LSTM should prepare GloVe inside the training script too.
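For reference, a hedged sketch of what preparing GloVe inside the training script could look like; the download URL, archive name, and the use of `mxnet.gluon.utils.download` are assumptions based on the standard GloVe release, not the code in this PR:

```python
import os
import zipfile

from mxnet.gluon.utils import download

def prepare_glove():
    # Assumption: fetch the standard glove.840B.300d release if the
    # extracted text file is not already present in the working directory.
    if not os.path.exists('glove.840B.300d.txt'):
        zip_path = download('http://nlp.stanford.edu/data/glove.840B.300d.zip')
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall('.')
```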
                        {'learning_rate': args.lr})

dur = []
L = gluon.loss.SoftmaxCrossEntropyLoss(axis=1)
In the DyNet implementation, they use `reduction=sum` instead of `mean`. I'm also not sure which one is better, but in practice using `sum` produces better results.
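For illustration, a minimal sketch of the two reductions being compared, using dummy tensors rather than the Tree-LSTM batches from this PR; Gluon's loss returns per-sample values, so the reduction is the explicit `sum()`/`mean()` call:

```python
import mxnet as mx
from mxnet import gluon

L = gluon.loss.SoftmaxCrossEntropyLoss(axis=1)
logits = mx.nd.random.randn(8, 5)               # dummy predictions: 8 nodes, 5 classes
labels = mx.nd.array([0, 1, 2, 3, 4, 0, 1, 2])  # dummy sentiment labels

per_sample = L(logits, labels)   # one loss value per node
loss_sum = per_sample.sum()      # DyNet-style sum reduction
loss_mean = per_sample.mean()    # mean reduction
```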
@szha does what you did improve its speed?
Hybridization has a more noticeable effect on throughput when the batch size is bigger.
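As a rough illustration (a stand-in dense network, not the Tree-LSTM cell from this PR), hybridizing a Gluon block and feeding it a large batch looks like this:

```python
import mxnet as mx
from mxnet import gluon

net = gluon.nn.HybridSequential()
net.add(gluon.nn.Dense(128, activation='relu'),
        gluon.nn.Dense(5))
net.initialize()
net.hybridize()  # compile to a symbolic graph; the speedup grows with batch size

out = net(mx.nd.random.randn(256, 300))  # larger batches amortize per-call overhead
```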
Description
Continues #234.