[deeplab] Training deeplab model with ADE20K dataset #3730
Comments
|
@aquariusjay Thanks for the hints. Now I have started the training, using the provided VOC model checkpoint, setting There are still two things confusing me:
|
Oh, will it be OK to prepare a pull request for the ADE20K dataset? |
Regarding your previous questions:
We currently do not have any plan to prepare that. Cheers, |
I'm currently having similar issues attempting to train with a custom dataset and was hoping you could offer some insight.
The link you included "here" appears to need a Google SSO to login. I am assuming that was a link to the train_util.py script. Here are the changes I have currently made to implement your architecture on my custom dataset:
However, when I run this, my code appears to train successfully but then runs into an issue with the confusion matrix during evaluation (I include the traceback below for reference). Any tips/suggestions on how to fix this? Thanks for your help! Error Traceback:
|
3. train_utils.py
• I modify the code here so that the exclude_list only includes the
`_LOGITS_SCOPE_NAME`, as you stated above.
exclude_list = ['_LOGITS_SCOPE_NAME']
if not initialize_last_layer:
    exclude_list.extend(last_layers)
this should be
exclude_list = [_LOGITS_SCOPE_NAME]
That is, _LOGITS_SCOPE_NAME is a variable defined elsewhere (search for it).
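To see why the quotes matter, here is a minimal sketch in plain Python (no TensorFlow). It assumes, as a later comment in this thread suggests, that `_LOGITS_SCOPE_NAME` holds the string 'logits'. The helper `variables_to_restore` and the variable names are hypothetical and only imitate the idea of excluding variables by scope; this is not the actual train_utils.py code.

```python
# Assumption: the constant in train_utils.py holds the scope name 'logits'.
_LOGITS_SCOPE_NAME = 'logits'

# Hypothetical variable names standing in for a model's variables.
MODEL_VARIABLES = [
    'xception_65/entry_flow/conv1_1/weights',
    'aspp0/weights',
    'logits/semantic/weights',
    'logits/semantic/biases',
]


def variables_to_restore(variable_names, exclude_list):
    """Return the names that do not fall under any excluded scope."""
    kept = []
    for name in variable_names:
        excluded = any(
            name == scope or name.startswith(scope + '/')
            for scope in exclude_list
        )
        if not excluded:
            kept.append(name)
    return kept


# Quoted: the list holds the literal text '_LOGITS_SCOPE_NAME', which matches
# no scope, so the logits variables would still be restored from the checkpoint.
quoted = variables_to_restore(MODEL_VARIABLES, ['_LOGITS_SCOPE_NAME'])

# Unquoted: the list holds 'logits', so the logits variables are skipped.
unquoted = variables_to_restore(MODEL_VARIABLES, [_LOGITS_SCOPE_NAME])

print(len(quoted), len(unquoted))
```

With the quoted form nothing is filtered out, which would explain a shape mismatch when the checkpoint was trained with a different number of classes.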
|
I am trying to train the deeplab model with the ADE20k datasets. |
@brett-whitford When I use my own data, I get the same error as you. Can you share your solution? |
@wonderit Of course. Please wait for a while until I have access to my GPU server. |
@wonderit Here is the patch for converting training data and training deeplabv3 on ADE20K. https://gist.github.com/walkerlala/82d978e68407e65158e8825cd470d7e1 (it can also be found at http://fastdrivers.org/misc/patch-for-ade20k.patch ) You can apply this patch on top of commit 1d38a22 or 5281c9a without conflict. Note:
I am also going to submit a PR to get these into the repo. However, I don't have enough GPUs to get a good pretrained model (I only have two Nvidia 1080s...). If you can obtain a decent pretrained model, please share! |
Also, anyone interested in adding ADE20K to deeplabv3 can take a look at this PR I just created: #3853 |
@walkerlala When you use val.py, did you get the error that 'predictions' is out of bound? It is the same as @brett-whitford's question. |
@walkerlala Can you share your eval script? |
@walkerlala @aquariusjay I am not sure whether I understand it correctly: If so, following @aquariusjay 's suggestion, in "train_utils.py":
exclude_list = [_LOGITS_SCOPE_NAME]
if not initialize_last_layer:
    exclude_list.extend(last_layers)
if set Shouldn't it be the following? |
Hi, I'm training on my own dataset as well (only two classes). When I set
exclude_list = ['logits']
if not initialize_last_layer:
    exclude_list.extend(last_layers)
Then when I run vis.py, it gives me all black images (not binary). When I only set
when Does anyone know why this happens? Thanks! |
@lydialixia exclude_list = ['global_step'] But I am still confused about whether one should set |
When you want to fine-tune DeepLab on other datasets, there are a few cases:
|
Hi @walkerlala: did you manage to finetune the ADE20K dataset? |
@georgosgeorgos No, in the end I could not fine-tune the model on the ADE20K dataset. I don't have enough GPUs. Every time I try to fine-tune the batch normalization parameters, the model blows up with an out-of-memory error. So I froze the batch normalization layers during training. In the end I only got a model with "modest" performance: Here is the original image (too large to display here): http://www.fastdrivers.org/misc/stuffseg-origin.jpg Here is the segmentation result: However, I can get a satisfying result with PSPNet: According to the slides from the 2017 Coco + Places Workshop, deeplabv3 should also be able to do that, but I haven't had any luck fine-tuning it. Hopefully Google can provide a fine-tuned pre-trained model in the future @aquariusjay . |
@brett-whitford - Hi Brett, I am having the exact same problem as you. How did you end up solving it? |
@shipeng-uestc - Hi shipeng, did you manage to solve the issue? I am currently using |
when I run NotFoundError (see above for traceback): Key aspp1_depthwise/BatchNorm/beta not found in checkpoint |
@hhwxxx Hello, in your answer to lydialixia, do you mean in train_util.py, exclude_list should be like this: but I still can't start training, the information is: I have also tried exclude_list = ['_LOGITS_SCOPE_NAME'], this doesn't work. |
Hello. Maybe you can try this: As to the And I have no idea about |
Just set |
@BeSlower Yes, the solution works for me, but there is another problem: the result is all black with no other labels, even though the loss decreases during training. Can anyone help me? |
@qmy612 Did you get the problem solved? I am having the exact same problem as you. |
@xiangjinwu Yes, the answer from hhwxxx works. |
@apolo74 Thanks I got the output now |
Happy to hear that! |
Hi, I tried training on my own data (classes = 2 = 1 + background). |
Hi, I have the same problem as you: the predicted mask is a black image. |
Assuming that you are re-training on your own data that, for example, has 2 classes... in my toy case I mentioned I created a dataset with circles and squares. Then I have 2 classes BUT the parameter called "--num-classes" should be 4 because: 2 (own classes) + 1 (background) + 1 (ignore_label) Hi, I tried training on my custom dataset |
Hey guys! Have you ever evaluated the provided ade20k pretrained models on the val set? I have tested them, but both mobilenetv2_ade20k_train and xception65_ade20k_train score about 3%-4% lower than the reported performance. |
Thanks to your descriptive comment, I was able to successfully train deeplab on my custom dataset (14000 images). |
Hey guys, I would really appreciate some help on this matter. Thanks in advance |
@ma8tsch did you manage to freeze some layers eventually ? If yes, can you pls provide some details ? |
Thank you for your help :) setting
allowed me to no longer have all black masks. I am now getting color-spotted, but not acceptable, masks after only 100 steps. To phrase what you said more clearly (for me at least), you are saying that images should be labeled with only values from 1...N where N is the number of classes, and 0 is reserved for background, and possibly even N+1 because of the ignore label (I am not utilizing this). In other words, if you have 2 classes (circle and triangle), you will have 4 labels/indexes in your image.
How can I confirm that this is the case for my dataset? I'll report back tomorrow after 10,000 steps to confirm. |
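One way to check this is to list the distinct pixel values that actually occur in a label mask. The sketch below is plain Python on a toy mask given as a list of rows; the helper names are hypothetical, and `ignore_label=255` is an assumption borrowed from the VOC convention. In practice you would first load each label image as a single-channel array. If the masks are stored as RGB color images rather than index images, the values will not be small integers, which this check would reveal.

```python
from collections import Counter


def label_histogram(mask_rows):
    """Count how often each label value occurs in a 2-D mask (list of rows)."""
    counts = Counter()
    for row in mask_rows:
        counts.update(row)
    return counts


def unexpected_labels(mask_rows, num_classes, ignore_label=255):
    """Return label values outside 0..num_classes-1, other than ignore_label."""
    allowed = set(range(num_classes))
    allowed.add(ignore_label)  # assumption: 255 marks pixels to ignore
    return sorted(set(label_histogram(mask_rows)) - allowed)


# Toy 3x4 mask: 0 = background, 1 and 2 = object classes, 255 = ignore.
toy_mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 255],
    [0, 0, 7, 1],  # 7 is deliberately out of range
]

print(unexpected_labels(toy_mask, num_classes=3))
```

An empty list means every pixel value is either a valid class index or the ignore label; anything else points at masks that need to be re-indexed.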
How did y'all color index your images? It seems that my images ARE color indexed as @apolo74 specified. Here is what my model got after 10000 steps: This is what a color indexed image looks like in my dataset (not from same picture as above): Any possible help? |
Hi, I am trying to run deeplab on my own dataset, but I get an error when running train.py. It is related to the number of classes: I have 5, but apparently the program is expecting 21, like the number of classes in the VOC dataset. |
@aquariusjay |
Hi, my loss does not change; it has become stagnant. I have tried everything related to deeplabv3+ that is mentioned on every blog. The changes in train_util.py are:
# Variables that will not be restored.
exclude_list = ['global_step','logits']
My train.py command: nohup python deeplab/train.py Please help 👍 |
@aquariusjay Hi, May I know how we can quantify our dataset to find out these values. |
@PallawiSinghal Did you solve it? I also want to change the loss_weight |
@jinyuan30 Did you solve it? I also want to change the loss_weight |
@aquariusjay Hi there~, about the class imbalance problem: the new version of deeplab's train_utils.py seems to have changed the code, so maybe I can't add variables like label1_weight = to fix the class imbalance problem. |
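For choosing such per-class weights, one common heuristic (not something prescribed in this thread or by deeplab) is to weight each class inversely to how many pixels it covers in the training labels. The sketch below assumes you have already counted pixels per label; the function name and the toy counts are hypothetical.

```python
def inverse_frequency_weights(pixel_counts):
    """Map each label to total / (num_labels * count); rare labels get larger weights."""
    # Labels that never occur are dropped to avoid dividing by zero.
    present = {label: n for label, n in pixel_counts.items() if n > 0}
    total = sum(present.values())
    return {label: total / (len(present) * n) for label, n in present.items()}


# Toy counts: background (0) covers 900 pixels, the object class (1) only 100.
toy_counts = {0: 900, 1: 100}
weights = inverse_frequency_weights(toy_counts)
print(weights)
```

With these toy counts the rare class ends up with a weight of 5.0 and the background with a weight below 1. Whether such weights help in practice depends on the dataset, so treat them as a starting point to tune.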
Hello, it seems that I meet the same problem, have you solved it yet? |
@LightingX Hi, friend! Have you figured out how to adjust the loss weight in the new version of train_utils.py? |
Did you solve it? I have the same problem now :/. |
@Alive1024 @claudiourbina Hey guys, in the latest implemented version, it seems we can adjust the weight by params. When training, add
|
@essalahsouad Hi! Did you solve the problem with black images? It is still an issue for me. |
System information
Describe the problem
This is a feature request. I am trying to train the deeplab model with the ADE20K dataset (see this presentation). I have finished the data format conversion and "successfully" trained the model on a small subset of ADE20K. Below is the modification to file
research/deeplab/datasets/segmentation_dataset.py
which is used to extract segmentation data. The problem is that the ADE20K dataset has 150 classes, which differs from the VOC and Cityscapes datasets. That causes a problem with the checkpoint file: currently there are only pretrained models for the VOC and Cityscapes datasets. So we have two choices here:
Are there any alternatives to these?
If anyone has a workable solution for the ADE20K dataset, it would be really appreciated.