tsm model different from original #1572

Open
knowlessthanenough opened this issue Sep 4, 2024 · 3 comments
Labels
wontfix This will not be worked on

Comments

@knowlessthanenough

I have recently been running TSM, and its layers seem to differ from the original model.
In the create_engine function in tsm_r50.py, fc1 has num_outputs = OUTPUT_SIZE, which is 400, the number of classes.
Up to here everything looks normal, but then the output is reshaped to [num_segments, output_size]. I don't understand why, or even how a tensor of shape [400] can be reshaped to [8, 400].
After that comes add_reduce with axes=1 and keep_dims=False, so does the shape become [8]?
Then softmax is applied on axis 1? The tensor is already 1-D, so how can there be an axis 1? And if it has been reduced to 8 (the segments), how do we know the class of the video?

I know this is an old repo, but if anyone understands the concept I would be very grateful.

@wang-xinyu
Owner

@irvingzhang0512 pls help

@knowlessthanenough
Author

I converted the network to explicit batch input and found that it does the same thing as the original; the code just uses a different method. What it actually does is this: after pooling2 the shape is [4, 8, 2048, 1, 1]. I reshape it to [4*8, 2048], do a matrix multiplication with the fc1 weight [class_num, 2048], and finally add the fc1 bias [1, 1, class_num]. That gives [batch_size, segment, class_num]. Then I do a reduce with axes=2 (because I have a batch dimension). At first I was confused about why it would reduce over the class_num dimension, but when I printed the result I found that it reduces over the segment dimension, and the output shape is [batch_size, class_num]. I think what it is doing is combining the information from the multiple segments (I am still trying to understand the original paper, and why axis 2 rather than 1 is the segment dimension). Then softmax is applied to output the probabilities.
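
For anyone following along, the shape flow described above can be sketched in NumPy. This is only an illustration under assumptions: the tensors are random stand-ins rather than the real weights, the variable names are made up for the sketch, and the reduce is assumed to be an average over segments (the thread does not say which reduce operation the engine uses).

```python
import numpy as np

# Sizes taken from the comment above; all tensor values are random stand-ins.
batch_size, num_segments, feat_dim, class_num = 4, 8, 2048, 400

rng = np.random.default_rng(0)
pooled = rng.standard_normal((batch_size, num_segments, feat_dim, 1, 1)).astype(np.float32)
fc1_weight = rng.standard_normal((class_num, feat_dim)).astype(np.float32)
fc1_bias = rng.standard_normal(class_num).astype(np.float32)

# Flatten batch and segment so every frame goes through fc1 independently.
frames = pooled.reshape(batch_size * num_segments, feat_dim)          # [32, 2048]
frame_logits = frames @ fc1_weight.T + fc1_bias                       # [32, 400]

# Restore the segment dimension: one row of class scores per segment.
logits = frame_logits.reshape(batch_size, num_segments, class_num)    # [4, 8, 400]

# Combine the segments of each video (assumed here to be an average).
video_logits = logits.mean(axis=1)                                    # [4, 400]

# Softmax over the class dimension, shifted for numerical stability.
shifted = video_logits - video_logits.max(axis=1, keepdims=True)
exp_scores = np.exp(shifted)
probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)            # [4, 400]

# One predicted class index per video.
predicted_class = probs.argmax(axis=1)                                # [4]
```

The point of the sketch is that the reduce removes the segment dimension, not the class dimension, so each video still ends up with a full vector of 400 class probabilities.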

I think the difference arises because this code works on a batch, whereas the one I was looking at is the online demo, which has no 8 segments and processes frames one by one. (As for the reduce layer, I am still checking the TensorRT API.)
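
On the reduce layer: as far as I understand the TensorRT API, the axes argument of add_reduce and the axes attribute of the softmax layer are bitmasks rather than axis indices, where bit i selects dimension i. That would explain why axes=1 reduces the segment dimension of a [segment, class] tensor, and why axes=2 reduces the segment dimension once a batch dimension is added in front. A minimal sketch of the decoding, where axes_from_bitmask is a hypothetical helper written for this illustration and not part of TensorRT:

```python
from typing import List


def axes_from_bitmask(mask: int, rank: int) -> List[int]:
    """Return the dimension indices whose bits are set in an axes bitmask.

    Hypothetical helper for illustration only; it mirrors the
    "bit i selects dimension i" convention and does not call TensorRT.
    """
    return [dim for dim in range(rank) if mask & (1 << dim)]


# Implicit batch: the reshaped tensor is [segment, class]; axes=1 sets bit 0.
implicit_reduce_axes = axes_from_bitmask(1, rank=2)   # [0], the segment dimension

# After the reduce the tensor is [class]; softmax axes=1 again sets bit 0.
implicit_softmax_axes = axes_from_bitmask(1, rank=1)  # [0], the class dimension

# Explicit batch: the tensor is [batch, segment, class]; axes=2 sets bit 1.
explicit_reduce_axes = axes_from_bitmask(2, rank=3)   # [1], the segment dimension
```

Read this way, axes=1 in the original code and axes=2 in the explicit-batch version both point at the segment dimension, so the two networks agree.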


stale bot commented Nov 10, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label on Nov 10, 2024