The domain classifier loss is not decreasing. #12

Open
zhongpeixiang opened this issue Mar 10, 2023 · 8 comments

@zhongpeixiang

[Image: training curves, including the domain classifier loss]

As shown in the image above, the domain classifier loss stays almost constant throughout training. I use a ViT as the feature extractor, a linear layer as the label classifier, and a two-layer MLP as the domain classifier.

What are the possible causes, and what do typical domain classifier loss curves look like?

Thanks
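For readers unfamiliar with the layout being described: below is a minimal PyTorch sketch of such a setup, with the domain head sitting behind a gradient reversal layer (GRL). The layer sizes and names are illustrative assumptions, not taken from this repository.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -alpha in the backward pass."""
    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.alpha, None

class DANN(nn.Module):
    """Shared feature extractor, a linear label head, and a 2-layer MLP domain head behind a GRL."""
    def __init__(self, feature_extractor, feat_dim, num_classes):
        super().__init__()
        self.features = feature_extractor             # e.g. a ViT backbone returning (B, feat_dim)
        self.label_head = nn.Linear(feat_dim, num_classes)
        self.domain_head = nn.Sequential(             # binary output: source vs. target
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 2))

    def forward(self, x, alpha=1.0):
        f = self.features(x)
        return self.label_head(f), self.domain_head(GradReverse.apply(f, alpha))
```

Because of the GRL, the domain loss is minimized with respect to the domain head's own weights but maximized with respect to the feature extractor, which is why a flat domain loss can mean either that the adversarial game has balanced out or that nothing is being learned at all.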

@cs-mshah

cs-mshah commented Apr 3, 2023

I am facing this issue as well. The DANN domain loss converges very quickly. Here are the plots from this repository without any changes to the code: wandb-DANN. In another experiment I used a resnet18 with a domain classifier similar to resnet's classifier head, and there too the loss almost instantly stabilised around 0.7. Is there a good repository that clearly explains how DANN works and gives a practical, working example of stable training?

@taotaowang97479

I had the same problem. With the code and dataset provided by the author, training works fine: as training progresses, the domain classifier loss stabilizes at 0.65-0.67 (training loss shown in the figure below).
[Image: train_loss]
However, with my own network and the same training procedure, the domain classifier loss sits at 0.69 from the very beginning, which suggests the domain classifier is not learning anything and is simply assigning source or target with 50% probability.
Can anyone figure out how to fix this?
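A quick sanity check on the 0.69 figure (an editorial note, not from the thread): 0.69 is essentially ln 2, the cross-entropy of a binary classifier that always predicts 50/50, so a domain loss pinned there from the start does indicate chance-level predictions.

```python
import math
import torch
import torch.nn.functional as F

# Cross-entropy of a 2-way domain classifier that always outputs 50/50.
logits = torch.zeros(8, 2)                      # equal logits -> probability 0.5 for each domain
domains = torch.randint(0, 2, (8,))             # arbitrary source/target labels
print(F.cross_entropy(logits, domains).item())  # ~0.6931
print(math.log(2))                              # 0.6931..., the chance-level baseline
```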

@cs-mshah

I have since shifted to a more robust codebase: https://github.com/thuml/Transfer-Learning-Library

@taotaowang97479

So is it the code itself? I went to that codebase and looked at its DANN implementation, and I didn't see a big difference in how it is written.

@JialingRichard

Does this mean the domain classifier didn't learn anything at all? Did you solve the problem, for example by finding another repo whose code works better?

@taotaowang97479

Yes, the domain classifier has not learned anything. I have since abandoned my project, as it is not an issue with the code itself. In my experience, hyperparameter tuning does not help much (I tried random search over hyperparameters). I believe the key to successful training is whether the dataset matches the network architecture: if the data is too challenging, the network is unlikely to capture the domain shift; if the architecture is overly complex, the adversarial training easily becomes unbalanced. Either factor can leave the domain classifier making essentially random predictions.

@JialingRichard

Hi, I just ran a training with this code this week. I found it helps to keep alpha (the weight of the GRL layer) very small at first, so the domain classifier becomes strong enough, and then gradually scale alpha up so the feature extractor starts 'cheating'. As a result, the domain classifier accuracy on source and target is very high at first, then drops, and finally both settle very close to 0.5, as in the picture below (the red and purple lines are the two domain classifier accuracies). This gives better performance overall; in this repo's code the change of alpha is handled by a mathematical schedule. Hope this is useful for training DANN.
[Image: label and domain classifier accuracy curves]
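For reference, the ramp-up described above is usually implemented with the schedule from the original DANN paper, which this repository's "math function" appears to follow; a minimal sketch (variable names are my own):

```python
import numpy as np

def grl_alpha(step, total_steps, gamma=10.0):
    """Ramp the GRL weight alpha smoothly from 0 to 1 as training progresses."""
    p = step / total_steps                         # training progress in [0, 1]
    return 2.0 / (1.0 + np.exp(-gamma * p)) - 1.0

# alpha is near 0 early on (the domain classifier trains freely), then approaches 1.
for step in (0, 100, 500, 1000):
    print(step, round(grl_alpha(step, total_steps=1000), 3))
```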

@taotaowang97479

Hi, your accuracy curves match the loss in the image I shared for this DANN project. In DANN applications in my field, alpha is sometimes fixed throughout training, while in others, as in this project, it gradually increases with the number of training iterations. When I previously ran random search over hyperparameters I always kept alpha fixed; now I am trying to adjust alpha dynamically. Thanks for the reminder.
