I have recently been running TSM, and its layers seem to differ from what I expected.
In the create_engine function of tsm_r50.py, after fc1, num_outputs = OUTPUT_SIZE, which is 400, the number of classes.
Up to this point it looks normal, but then the code reshapes the output to [num_segments, output_size]. I don't understand why, or even how a tensor of shape [400] can be reshaped to [8, 400].
After that it calls add_reduce with axes=1 and keep_dims=False, so does the result become [8]?
Then it applies softmax on axis 1? The tensor is already 1-D, so how can there be an axis 1? And if it is reduced to 8 values (one per segment), how do we know the class of the video?
I know this is an old repo, but if anyone understands the concept I would be very grateful.
I converted it to an explicit batch input and found that it does the same thing as the original. The code uses a different method, but in fact what it does is this: after pooling2 the shape is [4, 8, 2048, 1, 1]. I reshape that to [4*8, 2048], matrix-multiply it with the fc1 weight [class_num, 2048], and finally add the fc1 bias [1, 1, class_num], which gives [batch_size, segment, class_num]. Then I reduce on axis 2 (because I have a batch dimension). At first I was confused about why it would reduce over the class_num dimension, but when I printed the result I found that it reduces over the segment dimension: the output shape is [batch_size, class_num]. I think it is combining the information from the multiple segments (I am still trying to understand the original paper, and why 2 rather than 1 selects the segment dimension). Then it applies softmax to output probabilities.
I think the difference arises because this code works on a batch, whereas the one I looked at is the online demo, which has no 8 segments and processes its input one at a time. (As for the reduce layer, I am still checking the TensorRT API.)
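The steps described in this comment can be sketched in NumPy. This is only an illustration under stated assumptions, not the repo's code: the reduce operation is assumed to be an average over segments (the thread does not say which operation is used), the weights are random, and `axes_from_bitmask`, `softmax` and `segment_consensus` are hypothetical helper names. One likely explanation for the "2 rather than 1" puzzle is that, as far as I can tell from the TensorRT documentation, the `axes` argument of `add_reduce` (and the `axes` property of the softmax layer) is a bitmask rather than an axis index: bit i selects dimension i, so axes=1 selects dimension 0 and axes=2 selects dimension 1.

```python
import numpy as np


def axes_from_bitmask(mask):
    """Decode a TensorRT-style axes bitmask: bit i set means axis i is selected."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def segment_consensus(features, fc_weight, fc_bias, batch_size, num_segments):
    """fc1 -> reshape -> reduce over segments -> softmax, as described above."""
    flat = features.reshape(batch_size * num_segments, -1)   # [B*T, 2048]
    logits = flat @ fc_weight.T + fc_bias                    # [B*T, class_num]
    logits = logits.reshape(batch_size, num_segments, -1)    # [B, T, class_num]
    (segment_axis,) = axes_from_bitmask(2)                   # bitmask 2 -> axis 1
    # Assumption: the reduce layer averages the per-segment scores.
    video_logits = logits.mean(axis=segment_axis)            # [B, class_num]
    return softmax(video_logits, axis=-1)


# Random stand-ins for the pooled features and the fc1 parameters.
rng = np.random.default_rng(0)
B, T, D, C = 4, 8, 2048, 400
features = rng.standard_normal((B, T, D, 1, 1)).astype(np.float32)
fc_weight = (rng.standard_normal((C, D)) * 0.01).astype(np.float32)
fc_bias = np.zeros(C, dtype=np.float32)

probs = segment_consensus(features, fc_weight, fc_bias, B, T)
print(probs.shape)            # (4, 400)
print(axes_from_bitmask(1))   # [0]
print(axes_from_bitmask(2))   # [1]
```

Under this reading, the original implicit-batch code with axes=1 would reduce dimension 0 of [8, 400], leaving one score per class ([400]) rather than one per segment, and the softmax with axes=1 would then run over that single remaining dimension. The predicted class of the video is the argmax of the resulting probabilities.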
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.