problem about loss value #18
This usually means that all the embeddings have collapsed onto a single point. One solution that might work is to lower your learning rate so that this collapse doesn't happen.
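One quick way to check for this kind of collapse is to monitor the mean pairwise distance between the embeddings in a batch; if it falls to (near) zero, everything has collapsed. A minimal NumPy sketch — the helper name `mean_pairwise_distance` is made up for illustration:

```python
import numpy as np

def mean_pairwise_distance(embeddings):
    """Mean Euclidean distance over all distinct pairs of embeddings."""
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    n = embeddings.shape[0]
    return dists.sum() / (n * (n - 1))

# A collapsed batch: every embedding is the same point.
collapsed = np.full((8, 16), 0.5)
# A healthy batch: embeddings are spread out.
healthy = np.random.RandomState(0).randn(8, 16)

print(mean_pairwise_distance(collapsed))  # 0.0 -> collapse
print(mean_pairwise_distance(healthy))    # clearly positive
```

Logging this value alongside the loss makes it obvious whether a flat loss curve is a collapse or just slow convergence.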
Thanks, but I lowered my learning rate down to 1e-6 and the problem still exists. Does my learning rate need to be even lower? My dataset is CIFAR-10, and the network is AlexNet.
I realize the problem now.
You could just duplicate the loss so that it has the right shape:

```python
loss = ...  # scalar triplet loss
loss = tf.ones([batch_size, 1]) * loss
```
No, I have tested it.
@Cong222 Hi, how did you solve the problem that triplet_loss is a scalar tensor, which is inconsistent with the loss shape Keras expects?
Hello, I think the problem mainly comes from the fact that in Keras, any custom loss should be designed this way: "The function should take the following two arguments: `y_true` and `y_pred`." This is in practice impossible for any embedding learning task, but maybe there could be a workaround for it...
@Cong222 did you find the way we can use the triplet loss in keras? Even I have the same issue with the loss value. |
I can't find the way. |
@Cong222 I meet the same problem but set lower learining rate the loss converged. |
Thank you @ChristieLin . Changing the learning rate worked for me. |
I also meet same question, loss value is approximate of margin,i found that the distance is close to 0,I don't know how it was caused.Does the output of the network need to be L2 normalized, what is the role of L2 normalization? |
#18 (comment) |
@Cong222 @ChristieLin Can you elaborate how you used this loss function with Keras with incompatible y_true and y_pred shapes? |
Hello, I have a similar problem, I use transfer learning on vggface with keras, combined with triplet loss. val_loss does not change every time to 0.500. Because the training data is too much, I read and store data into the ".h5" file, each time I train, I will read each batch from that file. Then I create a Data Generate that returns batch_x and batch_y. I use model.fit_generator to train the model, however the error occurs when val_loss doesn't change every time down to 0.500. My learning_rate is 0.001. |
I am facing similar problem with my model. Training loss is stuck at the margin with very low learning rate as well. Is there any solution yet? |
As @vijayanand-Git pointed it out, the loss function introduced in this repository is not to be applied as-is in a Keras environment. A small enhancment is needed, that in the answer above is adding the line ( In Keras, the default shape for To elaborate a bit more on the expected shapes of For those who are still looking for a working example in Keras, I created a notebook that shows how omoindrot 's triplet loss function can be used with Keras, check it out here: https://github.com/ma1112/keras-triplet-loss |
Adding |
but there are not labels on triplet loss, there is only the embeddings and the margin. |
When using triplet loss, labels help the algorithm determine which pairs are positive and which pairs are negative, by inspecting whether the labels for two training examples are the same or not. Two training examples with the same label are considered a positive pair and will have their embeddings close together in the embedding space. So the only important concept around labels is that they should be the same for every example from a given class and they should be different for examples from different classes. Keeping that in mind you can use any numeric value as a label. Particularly, if your dataset has |
@ma1112 thanks for the explanation, but if I understand you correctly, your samples are combinations of pairs? not triplets? |
@JJKK1313 Sorry for the confusing answer, let me elaborate further. If you wish to use the triplet loss implementation found in this repo, your samples should be individual samples just as if you trained a network without using triplet loss. I.e. in case of working with the MNIST dataset, in which there are 60k grayscale images of hand written digits, each with a size of 28x28, you can use that dataset as-is to train a network with the triplet loss algorithm. So your input tensor should have a size of 60kx28x28x1. (Note that you should keep labels as integers from 0 to 9 when working with triplet loss, whereas if you were to use softmax activation + crossentropy loss, you'd one-hot encode the labels.) That is because the triplet loss implementation found in this repo implements online triplet mining, and picks the best triplets from a batch of images during the time the model is being trained. As triplets are created on-the-fly, the algorithm needs to know whether for a given anchor another sample is negative or positive. Hence you need to have labels for online triplet mining. And you are quite right, if you were to use a model with offline triplet mining, i.e. if you fed the network with triplets of samples during training, then you would not need to pass labels to the network. However in that case you could not use the triplet loss function you find in this repo and your model would be probably worse than one with online triplet mining. |
Ohhhhhh nnooowww I got it! Thank you very much for the explanation @ma1112!! |
Epoch 1/60
97/97 [==============================] - 24s - loss: 1.0072 - mAP: 0.1649 - val_loss: 0.9624 - val_mAP: 0.1296
Epoch 2/60
97/97 [==============================] - 22s - loss: 1.0060 - mAP: 0.1959 - val_loss: 0.9647 - val_mAP: 0.0784
Epoch 3/60
97/97 [==============================] - 21s - loss: 1.0051 - mAP: 0.2268 - val_loss: 0.9851 - val_mAP: 0.1536
Epoch 4/60
97/97 [==============================] - 21s - loss: 1.0051 - mAP: 0.1650 - val_loss: 0.9519 - val_mAP: 0.1808
Epoch 5/60
97/97 [==============================] - 21s - loss: 1.0034 - mAP: 0.2474 - val_loss: 0.9696 - val_mAP: 0.3072
Epoch 6/60
97/97 [==============================] - 21s - loss: 1.0025 - mAP: 0.2577 - val_loss: 0.9895 - val_mAP: 0.3584
Epoch 7/60
97/97 [==============================] - 21s - loss: 1.0044 - mAP: 0.2990 - val_loss: 0.9717 - val_mAP: 0.5392
Epoch 8/60
97/97 [==============================] - 21s - loss: 1.0007 - mAP: 0.2784 - val_loss: 0.9902 - val_mAP: 0.4096
Hi, I found something wrong for the loss value.
The loss value is almost not changing at the time model training.
And the loss value is changed when I change margin value, loss value is approximate of margin
The text was updated successfully, but these errors were encountered: