Feet prediction quality #24
Comments
Some of the results you show above may be improved by cropping with a larger bounding box. SPIN is trained with multiple losses; the relevant ones, I think, are:
According to nkolot/SPIN#39 (comment), the ground-truth keypoints include the ankle but no keypoints on the feet. OpenPose includes 2 keypoints on the feet, but openpose_train_weight is set to zero by default. I think this is a gap shared by SPIN and VIBE, so I don't think it is related to the GRU (temporal) part of VIBE.
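To make the point about openpose_train_weight concrete, here is a minimal sketch of a confidence-weighted 2D keypoint loss. It is not SPIN's or VIBE's actual code: the function name, the argument names, and the assumption that the OpenPose joints occupy the first `n_openpose` rows are all hypothetical and chosen only for illustration.

```python
import numpy as np

def weighted_keypoint_loss(pred_2d, target_2d, conf,
                           n_openpose, openpose_weight, gt_weight):
    """Confidence-weighted squared error over 2D keypoints.

    pred_2d, target_2d: (J, 2) arrays of predicted and target keypoints.
    conf: (J,) per-joint confidence.
    Assumption: the first n_openpose joints come from OpenPose,
    the remaining ones from ground-truth annotations.
    """
    weights = conf.astype(float).copy()
    weights[:n_openpose] *= openpose_weight   # scale the OpenPose joints
    weights[n_openpose:] *= gt_weight         # scale the annotated joints
    per_joint = ((pred_2d - target_2d) ** 2).sum(axis=1)
    return float((weights * per_joint).mean())

# One hypothetical OpenPose foot joint (row 0) that is badly predicted,
# plus two annotated joints that are predicted perfectly.
pred = np.zeros((3, 2))
target = np.zeros((3, 2))
target[0] = [3.0, 4.0]
conf = np.ones(3)

loss_off = weighted_keypoint_loss(pred, target, conf, 1, 0.0, 1.0)
loss_on = weighted_keypoint_loss(pred, target, conf, 1, 1.0, 1.0)
```

With the OpenPose weight at zero, the error on the foot joint contributes nothing to the loss, so the network gets no 2D supervision on the feet from that term.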
Some papers suggest a foot velocity loss.
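As a rough illustration of what such a term could look like, the sketch below penalizes foot-joint motion on frames where the foot is labeled as touching the ground. It does not reproduce any specific paper or VIBE itself: the function name, the array layout, and the availability of per-frame contact labels are assumptions.

```python
import numpy as np

def foot_velocity_loss(foot_positions, contact):
    """Penalize foot motion while the foot is labeled as in contact.

    foot_positions: (T, F, 3) positions of F foot joints over T frames.
    contact: (T, F) array, 1.0 where the joint touches the ground.
    Assumption: contact labels are given; how to obtain them is out of scope.
    """
    velocity = foot_positions[1:] - foot_positions[:-1]   # (T-1, F, 3)
    # Count a frame transition only if the joint is in contact in both frames.
    both = contact[1:] * contact[:-1]                      # (T-1, F)
    sq_speed = (velocity ** 2).sum(axis=-1)                # (T-1, F)
    denom = max(float(both.sum()), 1.0)                    # avoid divide-by-zero
    return float((both * sq_speed).sum() / denom)

# One foot joint over three frames: it slides by 1 unit, then stays put.
positions = np.array([[[0.0, 0.0, 0.0]],
                      [[1.0, 0.0, 0.0]],
                      [[1.0, 0.0, 0.0]]])
planted = np.ones((3, 1))      # foot labeled as planted in every frame
airborne = np.zeros((3, 1))    # foot never in contact

sliding_penalty = foot_velocity_loss(positions, planted)
free_motion_penalty = foot_velocity_loss(positions, airborne)
```

The penalty is non-zero only when a planted foot moves, which is the foot-sliding artifact such a loss is meant to discourage. It would not by itself fix raised toes in single frames.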
@ikvision Thanks a lot for the clarification!
Do you have an idea of when you would release the feet enhancement? I think it would help a lot to further improve the overall quality of the pose estimates. Thanks for the great work.
That is true. We are trying to find a workaround for this: either better feet keypoints or a constraint. We don't have a precise estimate right now.
Dear @mkocabas, thank you for the great work!
I have watched and analysed a number of indoor videos that illustrate the approach. While the reconstruction results are impressive overall,
there are still a number of leg positioning drawbacks, especially noticeable in the feet pose.
In some frames the feet are not detected correctly; in most of the incorrectly predicted frames the toes are raised. IMHO, it is likely that the whole network was trained with poor feet labeling. Is that a correct assumption?
I am now trying to understand the root cause of this issue and what can be done
to reduce the feet prediction error.
I see that your predictor has a convolutional backbone inherited from the SPIN solution:
https://github.com/nkolot/SPIN
But I haven't figured out what data and what labelling it was trained on.
I mean, does it have only one foot joint or several? Have you retrained SPIN on your datasets?
And if the CNN backbone is retrained, is it necessary to retrain the temporal part of VIBE too?
Or can I leave it untouched for now?
Thanks a lot in advance, I would greatly appreciate your response.