The released code uses a temporal Transformer, but the temporal attention treats every frame equally; there appears to be no mechanism such as a time embedding to distinguish frames. Does this mean the network cannot model the temporal relationships between frames?
Hi, thank you for your suggestion. We did not include positional encoding in our experiments, so a model trained on 16/32 frames can be easily extended to other temporal lengths. That said, adding a temporal position encoding may work better for a fixed temporal length.
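For reference, here is a minimal PyTorch sketch (hypothetical names, not the released code) of how a learnable temporal position embedding could be bolted onto a temporal attention block. It illustrates the trade-off above: the embedding gives attention frame-order information, but ties the module to a maximum frame count at training time.

```python
import torch
import torch.nn as nn

class TemporalAttentionWithPE(nn.Module):
    """Self-attention along the frame axis, with an optional learnable
    temporal position embedding (a hypothetical add-on; the released
    code applies temporal attention without any frame-order signal)."""

    def __init__(self, dim: int, num_heads: int = 8, max_frames: int = 32,
                 use_pos_emb: bool = True):
        super().__init__()
        self.use_pos_emb = use_pos_emb
        # One embedding per frame index; this fixes the maximum temporal length.
        self.pos_emb = nn.Parameter(torch.zeros(1, max_frames, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch * height * width, frames, dim) -- spatial positions are
        # folded into the batch so attention runs over the frame axis only.
        t = x.shape[1]
        h = self.norm(x)
        if self.use_pos_emb:
            # Adding the embedding lets attention distinguish frame order,
            # at the cost of pinning the module to max_frames.
            h = h + self.pos_emb[:, :t]
        out, _ = self.attn(h, h, h)
        return x + out  # residual connection


# Example: 16 frames of 320-dim tokens at 4 spatial positions, batch size 2.
block = TemporalAttentionWithPE(dim=320, max_frames=32)
tokens = torch.randn(2 * 4, 16, 320)
print(block(tokens).shape)  # torch.Size([8, 16, 320])
```

A fixed sinusoidal encoding (as in the original Transformer) would extrapolate to unseen frame counts more gracefully than the learnable table shown here, which may matter if you want to keep the length flexibility described above.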