
Reproducing problem on cityscapes #150 (Closed)

FishYuLi opened this issue Nov 15, 2017 · 18 comments

FishYuLi commented Nov 15, 2017

Hello! I have read your paper very carefully and tried to reproduce your experiments on the Cityscapes dataset (photo to label). I changed the input size from 256 to 128 and used resnet_6blocks for G as stated in the appendix, trained a model, and evaluated the photo-to-label generator. You report (0.58, 0.22, 0.16) for pixel acc., cls acc., and IoU, but my best results are (0.51, 0.16, 0.10), which is quite a large margin. I wonder if there are any other details I should change. What are your configs for Cityscapes? Can the batch size influence the final results? Also, how did you evaluate the segmentation results of 128x128 generated images? Did you resize the original label image to 128x128 and do the evaluation at that size? Thanks a lot. This work is amazing.

junyanz (Owner) commented Nov 15, 2017

Thanks for your nice words. We use batchSize=1; we got worse results with a larger batchSize. I remember a few other details (e.g., saving the output as png images, loadSize=143, fineSize=128). We use the evaluation code in pix2pix. See more discussion here.
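
For concreteness, the training invocation implied by these options might look like the sketch below. It assumes the environment-variable option style of the original Torch CycleGAN repo; the option names are reconstructed from this thread and may differ across versions (the PyTorch port spells them --batch_size, --load_size, --crop_size, --netG instead).

```
# Hypothetical reconstruction; verify option names against your repo version.
DATA_ROOT=./datasets/cityscapes name=cityscapes_label2photo \
batchSize=1 loadSize=143 fineSize=128 which_model_netG=resnet_6blocks \
th train.lua
```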

FishYuLi (Author) commented:

@junyanz Ok, I see. Thanks a lot! I use batchSize=20 on 4 GPUs to accelerate training. I will train the model again and try your evaluation code in pix2pix.

FishYuLi commented Nov 16, 2017

@junyanz Sorry, it's me again. I tried your evaluation code. It looks normal when I take the fake image as input, but strangely, when the input is the real image, the segmentation results become very bad. My questions are:

  1. What IoU does this Caffe model get on Cityscapes?
  2. Are there any special configs that we need to change when running on normal images?

[results screenshot] The numbers are (pixel acc., cls acc., IoU).

(Do I need to open a new issue in pix2pix?)

tinghuiz (Collaborator) commented:

Please see the notes from the original pix2pix Torch repo (copied below):

> The pre-trained model does not work well on Cityscapes at the original resolution (1024x2048), as it was trained on 256x256 images that are resized to 1024x2048. The purpose of the resizing was to 1) keep the label maps untouched at the original high resolution and 2) avoid the need to change the standard FCN training code for Cityscapes. To get the ground-truth numbers in the paper, you need to resize the original Cityscapes images to 256x256 before running the evaluation code.

junyanz (Owner) commented Dec 29, 2017

Please see this discussion.

lx7555 commented Mar 24, 2019

I want to evaluate on the Cityscapes dataset because I am trying to reproduce the FCN-score results from pix2pix using the code at https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix.
I have tried many ways, but I cannot get any results when I run the scripts: all metrics come out zero.
I still want to know how to configure the Cityscapes dataset. There are three folders: gtFine, original images, and predictions.
Should the gtFine and prediction images be color or grayscale? And what size should each of these three types of images be?
I use Python 2 and Caffe. Is it possible that the problem is that I am using Python 2.7 rather than Python 3?
@FishYuLi

junyanz (Owner) commented Mar 25, 2019

@tinghuiz

ZhangCZhen commented:

Hello, I evaluated the results on the Cityscapes dataset, and the values I obtained are far from those reported in the paper. So I would like to ask: is the code in 'evaluate.py' only used to evaluate label2image results? Can I evaluate image2label? After I use an RGB image to generate semantic segmentation results, how do I evaluate them? Thanks a lot!

FishYuLi (Author) commented Jan 6, 2020

> Hello, I evaluated the results on the Cityscapes dataset, and the values I obtained are far from those reported in the paper. So I would like to ask: is the code in 'evaluate.py' only used to evaluate label2image results? Can I evaluate image2label? After I use an RGB image to generate semantic segmentation results, how do I evaluate them? Thanks a lot!

@ZhangCZhen

  1. Yes, the given 'evaluate.py' is for label2image. It feeds the generated image to an FCN model to produce a mask, and evaluates that mask against the ground-truth mask.
  2. Of course you can use it to evaluate image2label. Just load your generated label directly, convert the RGB label to an H x W x C prediction, and you can re-use the evaluation code in 'evaluate.py'.

ZhangCZhen commented:

@FishYuLi Thanks for your reply.

  1. When you used 'evaluate.py' to evaluate the label2RGB results, did you get values similar to those in the paper? I used the real RGB images from the dataset for testing, and the accuracy is still poor.
  2. Do you mean that if I use 'evaluate.py' to evaluate image2label results, I don't need the FCN model?

FishYuLi (Author) commented Jan 8, 2020

@ZhangCZhen

  1. Yes, I got almost the same FCN score.
  2. Yes. The FCN model is used to map the generated realistic image back to a label map so that it can be compared with the GT masks. Please read the pix2pix paper carefully to see exactly how the evaluation metric is designed.
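
For reference, the three scores reported throughout this thread (pixel acc., cls acc., IoU) are all derived from a confusion matrix between predicted and ground-truth label maps. A minimal sketch of that computation, as an illustration rather than the repo's exact code:

```python
# Minimal sketch of the FCN-score metrics computed from a confusion matrix,
# mirroring a pix2pix-style evaluation.
import numpy as np

def fcn_scores(pred, gt, num_classes):
    """pred, gt: H x W integer label maps. gt values >= num_classes
    (e.g. ignored labels) are masked out; pred must lie in [0, num_classes)."""
    mask = gt < num_classes
    hist = np.bincount(
        num_classes * gt[mask].astype(int) + pred[mask].astype(int),
        minlength=num_classes ** 2,
    ).reshape(num_classes, num_classes).astype(np.float64)
    tp = np.diag(hist)
    pixel_acc = tp.sum() / hist.sum()
    # Classes absent from gt yield NaN and are skipped by nanmean.
    cls_acc = np.nanmean(tp / hist.sum(axis=1))
    iou = np.nanmean(tp / (hist.sum(axis=1) + hist.sum(axis=0) - tp))
    return pixel_acc, cls_acc, iou
```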

ZhangCZhen commented:

@FishYuLi
Thank you for your reply, you are really helpful!!

ZhangCZhen commented:

@FishYuLi
Sorry, it's me again. When you evaluate the results of image2label, you need to convert the semantic segmentation results into IDs, right? How do you convert the semantically segmented RGB map to an ID map? I mean, the pixel values of the color label image you get do not exactly match the pixel values in 'labels.py'. How do you evaluate the semantic segmentation results? Thanks a lot.
I would be very grateful if you could send me your evaluation code. My email: 825762985@qq.com

FishYuLi (Author) commented:

@ZhangCZhen
In my case, I first got the standard color map for the Cityscapes dataset. (I mean you should find the exact RGB color of each category in the ground-truth label maps.) Then, for each pixel of the generated label, compute the L2 distance between its RGB value and the RGB colors of all categories, and assign that pixel to the category with the smallest L2 distance. Then you get the H x W x C prediction, as sketched below.
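
A minimal sketch of this nearest-color assignment, assuming NumPy; the palette rows shown are examples and should be completed from the official cityscapesScripts labels.py:

```python
# Minimal sketch: assign each RGB pixel of a generated label image to the
# Cityscapes category whose palette color is nearest in L2 distance.
import numpy as np

# Example palette rows (road, sidewalk, building); add one RGB row per
# evaluated category, taken from cityscapesScripts labels.py.
PALETTE = np.array([
    [128, 64, 128],
    [244, 35, 232],
    [70, 70, 70],
], dtype=np.float32)

def rgb_to_class_ids(label_rgb):
    """label_rgb: H x W x 3 uint8 array -> H x W array of category indices."""
    pixels = label_rgb.reshape(-1, 3).astype(np.float32)               # (H*W, 3)
    dists = np.linalg.norm(pixels[:, None, :] - PALETTE[None, :, :], axis=2)
    return dists.argmin(axis=1).reshape(label_rgb.shape[:2])           # (H, W)
```

The resulting index map can then be one-hot encoded if the evaluation code expects an H x W x C prediction.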

ZhangCZhen commented:

@FishYuLi
With your help, I have reproduced the evaluation results of label2image, and I am very grateful for that. Thank you very much! But I don't understand your method above for evaluating image2label, so I have a few questions to ask you.

  1. Did you use 'evaluate.py' when evaluating image2label? I assume you did?
  2. If you used 'evaluate.py' to evaluate the generated image2label results, is the ground truth '_gtFine_labelIds.png'? The accuracy computation during evaluation uses the 'trainIds' of the ground truth (the 'assign_trainIds' function in 'cityscape.py' does the alignment), but I don't know how to convert the generated color label map to trainIds values.
  3. "Of course you can use it to evaluate image2label. Just load your generated label directly, convert the RGB label to an H x W x C prediction, and you can re-use the evaluation code in 'evaluate.py'." -- This is your reply. I don't know how to convert the RGB label to H x W x C, or why it is converted.

Looking forward to your reply. Thanks a lot!

CR-Gjx commented Apr 15, 2020

> @junyanz Ok, I see. Thanks a lot! I use batchSize=20 on 4 GPUs to accelerate training. I will train the model again and try your evaluation code in pix2pix.

Hi, did you re-train the model and get results similar to those in the paper? Could you share some details (e.g., parameters)? Right now I cannot reproduce the results using the default parameters. Thanks! @FishYuLi

FishYuLi (Author) commented Apr 15, 2020

> @junyanz Ok, I see. Thanks a lot! I use batchSize=20 on 4 GPUs to accelerate training. I will train the model again and try your evaluation code in pix2pix.
>
> Hi, did you re-train the model and get results similar to those in the paper? Could you share some details (e.g., parameters)? Right now I cannot reproduce the results using the default parameters. Thanks! @FishYuLi

@CR-Gjx Yes, I got results similar to those in the paper with just the default parameters. You may try setting batchSize=1 on a single GPU. I also tried enlarging the batch size, but found that the model is very sensitive to it: results become worse with a large batch size. Also note that the input size should be 128x128, not 256.

CR-Gjx commented Apr 21, 2020

Thanks for your help, I have reproduced the results.
