Question when I use dsnt in my net #17

Open
QcQcM opened this issue Nov 4, 2020 · 8 comments

@QcQcM

QcQcM commented Nov 4, 2020

I have trained a network that predicts facial key points by supervising the generated heatmaps. The network uses a max operation to obtain the coordinates of 68 facial key points from the 68-channel key point heatmap output by an FCN. I now want to train this network jointly with another network, but the max operation is not differentiable, so I want to replace it with dsnt.
So I use batch_location_dsnt = dsntnn.dsnt(heatmap) (the heatmap comes from the FCN; it is a 68 * 1 * 16 * 16 tensor),
but the batch_location_dsnt I obtain is

`tensor([[[ -0.5989, -0.2222]],

    [[ -0.6683,  -0.0225]],

    [[ -0.7003,   0.1874]],

    [[ -0.7120,   0.5027]],

    [[ -0.6451,   0.7451]],

    [[ -0.5105,   1.0081]],

    [[ -0.4522,   1.1898]],

    [[ -0.2934,   1.2817]],

    [[ -0.0759,   0.9567]],

    [[  0.1304,   1.0607]],

    [[  0.3462,   1.4314]],

    [[  0.7308,   1.3509]],

    [[  0.8871,   1.0625]],

    [[  1.1645,   0.7980]],

    [[  1.4735,   0.5973]],

    [[  1.3658,   0.1797]],

    [[  1.2114,  -0.1012]],

    [[ -0.7434,  -0.7085]],

    [[ -0.6286,  -0.7392]],

    [[ -0.4630,  -0.7343]],

    [[ -0.2988,  -0.6485]],

    [[ -0.1515,  -0.5185]],

    [[  0.0185,  -0.5908]],

    [[  0.3039,  -0.6446]],

    [[  0.5553,  -0.6704]],

    [[  0.8032,  -0.6359]],

    [[  0.9848,  -0.4610]],

    [[ -0.1231,  -0.3595]],

    [[ -0.2189,  -0.2581]],

    [[ -0.2404,  -0.0784]],

    [[ -0.3306,   0.1073]],

    [[ -0.4281,   0.2564]],

    [[ -0.3071,   0.3424]],

    [[ -0.2748,   0.3945]],

    [[ -0.1277,   0.3686]],

    [[  0.0404,   0.3399]],

    [[ -0.5630,  -0.4150]],

    [[ -0.4809,  -0.4761]],

    [[ -0.3541,  -0.4953]],

    [[ -0.2261,  -0.3877]],

    [[ -0.4000,  -0.3473]],

    [[ -0.5188,  -0.3881]],

    [[  0.2428,  -0.3442]],

    [[  0.4070,  -0.3346]],

    [[  0.5273,  -0.3868]],

    [[  0.7190,  -0.2441]],

    [[  0.5536,  -0.2888]],

    [[  0.4207,  -0.2777]],

    [[ -0.3997,   0.7421]],

    [[ -0.3004,   0.5801]],

    [[ -0.3018,   0.5292]],

    [[ -0.1713,   0.4833]],

    [[ -0.0893,   0.4787]],

    [[  0.0906,   0.6432]],

    [[  0.3095,   0.7009]],

    [[  0.1567,   0.8734]],

    [[ -0.0456,   1.1209]],

    [[ -0.1621,   1.0680]],

    [[ -0.2678,   1.0100]],

    [[ -0.3905,   0.8635]],

    [[ -0.3840,   0.7459]],

    [[ -0.2615,   0.6243]],

    [[ -0.1569,   0.5345]],

    [[ -0.1064,   0.6030]],

    [[  0.2071,   0.6364]],

    [[ -0.0748,   0.8947]],

    [[ -0.1838,   0.7509]],

    [[ -0.2617,   0.8739]]], device='cuda:0', grad_fn=<CatBackward>)`

Obviously, [-0.5989, -0.2222] doesn't look like coordinates. Why is dsnt not outputting the x and y coordinates of the maximum, like the max operation does? How can I get the correct coordinates of the key points?

@anibali
Owner

anibali commented Nov 4, 2020

Please read the basic usage guide.

Importantly, the target coordinates are normalized so that they are in the range (-1, 1). The DSNT layer always outputs coordinates in this range.

You can use the image size to convert from normalized coordinates to pixel coordinates.

Now, I can also see that you have some coordinates which are slightly outside of the (-1, 1) range. This implies to me that you have not normalized the heatmaps (e.g. using dsntnn.flat_softmax).
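
To make the range claim concrete, here is a minimal plain-Python sketch of the idea behind DSNT: the output is the heatmap-weighted sum of normalized pixel-centre coordinates. It is not the dsntnn implementation; the helper names and the 5x5 heatmap are hypothetical, and the coordinate formula is the Xij = (2j - (n+1))/n expression discussed later in this thread. With a heatmap that sums to 1 the result stays inside (-1, 1); with an unnormalized heatmap it scales with the total mass and can leave that range.

```python
import math

def normalized_axis(n):
    """Normalized pixel-centre coordinates (2j - (n + 1)) / n for j = 1..n."""
    return [(2 * j - (n + 1)) / n for j in range(1, n + 1)]

def flat_softmax_2d(heatmap):
    """Softmax over all pixels of a 2-D heatmap (list of rows), so values sum to 1."""
    peak = max(max(row) for row in heatmap)  # subtract the max for numerical stability
    exps = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]

def expected_xy(heatmap):
    """Heatmap-weighted sum of normalized coordinates, returned as (x, y)."""
    xs = normalized_axis(len(heatmap[0]))
    ys = normalized_axis(len(heatmap))
    x = sum(v * xs[j] for row in heatmap for j, v in enumerate(row))
    y = sum(v * ys[i] for i, row in enumerate(heatmap) for v in row)
    return x, y

# Hypothetical raw network output: 5x5 values with one strong peak
# at row 2, column 3 (0-indexed).
raw = [[0.0] * 5 for _ in range(5)]
raw[2][3] = 10.0

# Normalized first: the result is close to (0.4, 0) and inside (-1, 1).
x_norm, y_norm = expected_xy(flat_softmax_2d(raw))

# Not normalized: the heatmap sums to 10, so x is scaled by 10 and leaves the range.
x_raw, y_raw = expected_xy(raw)

print(x_norm, y_norm)
print(x_raw, y_raw)
```

Under these assumptions, coordinates outside (-1, 1) are a sign that the heatmap passed to the weighted sum did not sum to 1.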

@QcQcM
Author

QcQcM commented Nov 6, 2020

Thank you for your prompt reply.
In fact, I normalized the heatmap with heatmap = dsntnn.flat_softmax(heatmap), following the example. Maybe it is because my torch version is 1.2.0? I saw that you mentioned this in an earlier answer to someone else.
Another question, which may be a stupid one: how can I convert the normalized coordinates to pixel coordinates?
When I was reading the example in the paper, I didn't quite understand how x=0.4, y=0 is finally mapped back to the original coordinates.
[Screenshot 2020-11-06 09-09-23]
Is the final coordinate the intersection of the column with the value 0.4 in the X matrix and the row with the value 0 in the Y matrix?
So we know the value of X is 0.4 and the value of n is 5, and because Xij = (2j - (n+1))/n (as shown below), we know that j is 4. By the same reasoning, i is 3, so the pixel coordinate is (4, 3), i.e. the third row and fourth column?
[Screenshot 2020-11-06 09-09-35]
Should I use the same method to convert the results obtained from dsnt back to pixel coordinates?
Thank you.

@anibali
Owner

anibali commented Nov 6, 2020

> In fact, I normalized the heatmap with heatmap = dsntnn.flat_softmax(heatmap) based on the example. Maybe because my torch is 1.2.0? I found that you mentioned in the answer to other people before.

If you use dsnt directly after flat_softmax, it should not be possible for values to appear outside of the (-1, 1) range.

> Another question I want to ask maybe stupid, how can convert the normalized coordinates to the pixel coordinates?

There's a function that will do the conversion for you:

dsntnn/dsntnn/__init__.py

Lines 322 to 334 in 779631f

    def normalized_to_pixel_coordinates(coords, size):
        """Convert from normalized coordinates to pixel coordinates.

        Args:
            coords: Coordinate tensor, where elements in the last dimension are ordered as (x, y, ...).
            size: Number of pixels in each spatial dimension, ordered as (..., height, width).

        Returns:
            `coords` in pixel coordinates.
        """
        if torch.is_tensor(coords):
            size = coords.new_tensor(size).flip(-1)
        return 0.5 * ((coords + 1) * size - 1)

Your understanding of how the conversion works seems to be correct.
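
As a worked check of that conversion, the following plain-Python sketch inverts Xij = (2j - (n+1))/n for the paper's example (x = 0.4, y = 0, n = 5) and compares the result with the single-axis form of the expression in the function quoted above. The helper names are hypothetical. One caveat the arithmetic suggests: the hand derivation counts pixels from 1, while the quoted expression yields positions counted from 0, so the two differ by exactly one.

```python
def index_from_normalized(value, n):
    """Invert X = (2j - (n + 1)) / n to get the 1-indexed pixel position j."""
    return (n * value + n + 1) / 2

def normalized_to_pixel(value, n):
    """Single-axis form of 0.5 * ((coords + 1) * size - 1): a 0-indexed position."""
    return 0.5 * ((value + 1) * n - 1)

n = 5
col_1_indexed = index_from_normalized(0.4, n)  # about 4: fourth column
row_1_indexed = index_from_normalized(0.0, n)  # 3.0: third row
col_0_indexed = normalized_to_pixel(0.4, n)    # about 3: same column, counted from 0
row_0_indexed = normalized_to_pixel(0.0, n)    # 2.0: same row, counted from 0

print(col_1_indexed, row_1_indexed)
print(col_0_indexed, row_0_indexed)
```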

@QcQcM
Author

QcQcM commented Nov 6, 2020

I used pip install dsntnn==0.4.0a0 to change the version of dsntnn.

The code that uses dsnt is:
heatmap = dsntnn.flat_softmax(heatmap)
batch_location_dsnt = dsntnn.dsnt(heatmap)

The value of batch_location_dsnt I get is
`tensor([[[-7.8823e-04, 4.4607e-04]],

    [[-7.1920e-04,  1.8634e-03]],

    [[-6.0907e-04,  3.9498e-03]],

    [[-4.3593e-04,  5.8416e-03]],

    [[ 8.3447e-05,  7.0462e-03]],

    [[ 4.7119e-04,  5.1756e-03]],

    [[ 9.2238e-05,  4.8673e-04]],

    [[ 4.5002e-06,  1.5318e-04]],

    [[ 1.6151e-04,  1.1787e-04]],

    [[ 2.4366e-04,  1.8312e-04]],

    [[ 1.7077e-04,  1.4871e-04]],

    [[ 1.0362e-03,  1.1041e-03]],

    [[ 1.0263e-03,  1.1403e-03]],

    [[ 1.0668e-04,  7.3150e-05]],

    [[ 6.6787e-05,  1.5837e-04]],

    [[-6.5744e-05,  6.1721e-05]],

    [[ 2.0477e-04,  7.8917e-05]],

    [[-4.1145e-04,  1.0529e-04]],

    [[-1.0380e-04, -7.3744e-04]],

    [[ 6.6893e-04, -1.2747e-03]],

    [[ 1.6504e-03, -8.9851e-04]],

    [[ 2.5342e-03, -2.5265e-04]],

    [[ 3.0837e-03, -8.1700e-04]],

    [[ 4.1877e-03, -1.2371e-03]],

    [[ 5.6369e-03, -1.4836e-03]],

    [[ 6.1638e-03, -9.8917e-04]],

    [[ 5.0859e-03, -2.0368e-04]],

    [[ 2.6908e-03,  2.7435e-04]],

    [[ 2.9002e-03,  1.1736e-03]],

    [[ 2.6792e-03,  2.1132e-03]],

    [[ 2.5727e-03,  2.8178e-03]],

    [[ 1.8448e-03,  3.5268e-03]],

    [[ 2.7345e-03,  4.2092e-03]],

    [[ 2.6624e-03,  3.8066e-03]],

    [[ 3.3309e-03,  4.0315e-03]],

    [[ 4.3377e-03,  4.2360e-03]],

    [[ 7.3338e-04,  6.7948e-04]],

    [[ 8.0203e-04, -4.6089e-05]],

    [[ 1.5023e-03,  3.3472e-04]],

    [[ 1.8472e-03,  1.5168e-04]],

    [[ 1.5913e-03,  3.4170e-04]],

    [[ 9.1051e-04,  3.3060e-04]],

    [[ 4.7607e-03,  3.5889e-04]],

    [[ 4.5108e-03,  2.9832e-05]],

    [[ 4.9918e-03, -1.3429e-04]],

    [[ 6.7976e-03,  4.4054e-04]],

    [[ 5.4981e-03,  5.2540e-04]],

    [[ 5.6122e-03,  4.2932e-04]],

    [[ 1.7819e-03,  7.3131e-03]],

    [[ 1.6770e-03,  5.3437e-03]],

    [[ 2.6431e-03,  5.7792e-03]],

    [[ 2.8788e-03,  5.6338e-03]],

    [[ 3.3270e-03,  5.2575e-03]],

    [[ 3.8019e-03,  5.3061e-03]],

    [[ 4.7200e-03,  6.3319e-03]],

    [[ 3.4578e-03,  6.0037e-03]],

    [[ 3.2287e-03,  6.9463e-03]],

    [[ 2.3241e-03,  6.2275e-03]],

    [[ 2.1963e-03,  6.6319e-03]],

    [[ 1.7331e-03,  6.3944e-03]],

    [[ 2.2274e-03,  7.5258e-03]],

    [[ 2.3830e-03,  5.9583e-03]],

    [[ 2.6063e-03,  5.4184e-03]],

    [[ 3.0387e-03,  5.4277e-03]],

    [[ 4.6358e-03,  6.6240e-03]],

    [[ 3.1824e-03,  6.5282e-03]],

    [[ 2.9334e-03,  6.8267e-03]],

    [[ 2.4876e-03,  6.8808e-03]],

    [[-4.4794e-02, -4.6768e-02]]], device='cuda:0', grad_fn=<FlipBackward>)`

Does the output look right now? Thank you, I look forward to your reply.
Today is also a day to work hard! Best wishes!

@anibali
Owner

anibali commented Nov 6, 2020

> does the output look right now ?

The output is valid, but I can't say whether they are the right answers for your problem 😉

@QcQcM
Author

QcQcM commented Nov 6, 2020

The first two outputs of dsnt:

[[ 2.3011e-03,  4.8983e-03]],

        [[ 2.5807e-03,  5.0080e-03]],

The first two coordinates output by dsnt (as returned by normalized_to_pixel_coordinates):

tensor([[[7.4616, 7.5037]],

        [[7.4597, 7.5175]],

The result of finding the maximum value:
index of w: 2., 2.
index of h: 8., 10.
which means the point coordinates are (2, 8) and (2, 10).

The first two of the 68 heatmaps:
[Screenshot 2020-11-06 14-53-16]

[Screenshot 2020-11-06 14-53-23]

As you can see, the result of finding the maximum value is not the same as the dsnt result.
😭 😭 😭
Is it because I am now applying dsnt directly to heatmaps from a network that was trained without dsnt?
Do I need to retrain with dsnt?

@anibali
Owner

anibali commented Nov 6, 2020

> Is it because I am now directly using dsnt to find the value on the heatmap previously trained without dsnt?
> Do I need to retrain with dsnt?

I didn't realise you were trying to avoid retraining. I think that the problem you're having is that if you don't retrain, flat_softmax will cause the heatmap values to "flatten" which biases dsnt towards the centre of the image.

Your options are:

  1. Retrain.

  2. Instead of using flat_softmax, try normalising the heatmaps by subtracting the minimum value and dividing by the sum:

heatmap -= heatmap.flatten(-2).min(-1)[0][..., None, None]
heatmap /= heatmap.sum([-1, -2], keepdim=True)

You may as well try 2), and then if that doesn't work, try 1).
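
The flattening effect can be illustrated with a small plain-Python sketch. It assumes a hypothetical 16x16 heatmap from a network trained against targets in [0, 1], with background 0 and a single peak of 1.0; none of the helper names come from dsntnn. After a softmax the peak carries only about 1% of the total mass, so the weighted x coordinate collapses towards 0 (the image centre), whereas subtracting the minimum and dividing by the sum puts all the mass back on the peak.

```python
import math

def normalized_axis(n):
    """Normalized pixel-centre coordinates (2j - (n + 1)) / n for j = 1..n."""
    return [(2 * j - (n + 1)) / n for j in range(1, n + 1)]

def expected_x(weights):
    """Weighted sum of normalized x coordinates over a 2-D list of weights."""
    xs = normalized_axis(len(weights[0]))
    return sum(v * xs[j] for row in weights for j, v in enumerate(row))

def softmax_normalize(heatmap):
    """Softmax over all pixels (what a flat softmax does to one heatmap)."""
    exps = [[math.exp(v) for v in row] for row in heatmap]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]

def min_sum_normalize(heatmap):
    """Option 2: subtract the minimum, then divide by the sum.
    Assumes the heatmap is not constant, otherwise the sum is zero."""
    lowest = min(min(row) for row in heatmap)
    shifted = [[v - lowest for v in row] for row in heatmap]
    total = sum(sum(row) for row in shifted)
    return [[v / total for v in row] for row in shifted]

# Hypothetical pre-trained heatmap: peak of 1.0 at row 8, column 2 (0-indexed).
size = 16
heatmap = [[0.0] * size for _ in range(size)]
heatmap[8][2] = 1.0

peak_x = normalized_axis(size)[2]                    # -0.6875
x_softmax = expected_x(softmax_normalize(heatmap))   # pulled almost to 0
x_min_sum = expected_x(min_sum_normalize(heatmap))   # equals peak_x for this clean peak

print(peak_x, x_softmax, x_min_sum)
```

Real heatmaps have non-zero background and broader peaks, so option 2 will generally not recover the argmax exactly; the sketch only shows why it is less biased towards the centre than a softmax in this idealized case.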

@QcQcM
Author

QcQcM commented Nov 6, 2020

I have to say, you are so kind!!!
I have tried 2). It seems better than before, but still far from the right result, and I noticed that the result is better when the heatmap is not normalized than when it is.
I will try to retrain the net. Please look forward to my good news! :stuck_out_tongue::stuck_out_tongue::stuck_out_tongue:
Nice to meet you!
