
Failed to reproduce the results in the paper when training from scratch #11

Open
becxer opened this issue Jan 7, 2022 · 5 comments

becxer commented Jan 7, 2022

Hello, we have a problem with reproducing the results in the paper.

With the official code and the default training parameters, we cannot reach the reported scores on any benchmark except IC03 and IC13.

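| Method | Train Opt | Epoch | IC03 | IC13 | IC15 | IIIT5k | SVT | SVTP | CUTE |
|---|---|---|---|---|---|---|---|---|---|
| PREN (Paper) | - | - | 94.90 | 94.70 | 79.20 | 92.10 | 92.00 | 83.90 | 81.30 |
| PREN (w/ Official code) | default | 3 | 95.23 | 94.52 | 76.97 | 84.33 | 87.33 | 79.23 | 71.18 |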

We used all of the ST and MJ data in LMDB format. We haven't changed any code except the part that loads images and labels.
By any chance, did you apply any preprocessing when creating the image files that is not in the current code?

It is also very strange that the score on the CUTE dataset is about 10 percentage points lower than the reported one.
Can you guide us in detail on how to reproduce the results?
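
For reference, the per-benchmark gaps can be computed directly from the two rows of scores reported in this comment. This is only a minimal sketch; the names `accuracy_gaps`, `PAPER`, and `REPRODUCED` are illustrative and do not come from the PREN repository.

```python
# Scores copied from the two rows reported in this comment.
BENCHMARKS = ["IC03", "IC13", "IC15", "IIIT5k", "SVT", "SVTP", "CUTE"]
PAPER = [94.90, 94.70, 79.20, 92.10, 92.00, 83.90, 81.30]
REPRODUCED = [95.23, 94.52, 76.97, 84.33, 87.33, 79.23, 71.18]


def accuracy_gaps(reproduced, paper, names):
    """Return reproduced minus paper accuracy, in percentage points."""
    return {name: round(r - p, 2) for name, r, p in zip(names, reproduced, paper)}


gaps = accuracy_gaps(REPRODUCED, PAPER, BENCHMARKS)
for name, gap in gaps.items():
    # A negative value means the reproduced score is below the paper's.
    print(f"{name:>7}: {gap:+.2f}")
```

The largest shortfalls are on CUTE and IIIT5k, while IC03 and IC13 are within a fraction of a point of the paper.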

RuijieJ (Owner) commented Jan 9, 2022

Hi, this is strange. I have run the model with different random seeds and get similar results each time.
The training data I use is not in LMDB format. I downloaded the original versions, and for SynthText I manually cropped the word images from the original images.
I will try to figure this out by also training on the LMDB data soon.
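
The exact cropping procedure is not described in this thread, so the following is only a sketch of one plausible approach: take the axis-aligned rectangle that encloses the four corner points of a word box and clamp it to the image. The function name `quad_to_crop_rect` and the corner-point input format are assumptions made for illustration.

```python
import math


def quad_to_crop_rect(quad, img_w, img_h):
    """Return (left, top, right, bottom) enclosing a 4-point word box.

    quad is a list of four (x, y) corner points in pixel coordinates.
    The rectangle is clamped to the image; None is returned if the
    box lies entirely outside the image or has no area.
    """
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    left = max(0, math.floor(min(xs)))
    top = max(0, math.floor(min(ys)))
    right = min(img_w, math.ceil(max(xs)))
    bottom = min(img_h, math.ceil(max(ys)))
    if right <= left or bottom <= top:
        return None
    return left, top, right, bottom


# Hypothetical word box inside a 100x40 image.
rect = quad_to_crop_rect([(10.4, 5.2), (50.7, 6.0), (50.1, 20.9), (9.8, 20.1)], 100, 40)
print(rect)
```

The resulting tuple has the same ordering as the box argument of Pillow's `Image.crop`. Differences in how tightly words are cropped (padding, rotation handling) are one possible source of a mismatch between a manually cropped set and a prebuilt LMDB set, though the thread does not confirm this.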

becxer (Author) commented Jan 10, 2022

Thanks for the response!

I have two additional questions about this.
(1) When building batches, did you sample randomly from the combined list of MJ and ST images, without controlling the ratio between the two datasets?
(2) Could you point me to a download link for the exact training set you used?

RuijieJ (Owner) commented Jan 10, 2022

(1) Yes, we simply sampled from the whole data, without considering the ratio between the datasets.
(2) For MJSynth, please refer to the official site; for SynthText, please also refer to this link.
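
To make answer (1) concrete, here is a minimal sketch of sampling from the merged pool with no per-dataset ratio. It is not taken from the PREN code; `make_batches` and the toy item lists are hypothetical.

```python
import random


def make_batches(mj_items, st_items, batch_size, seed=0):
    """Shuffle the merged MJ + ST pool and cut it into batches.

    No per-dataset quota is enforced, so on average each batch
    mirrors the share of each dataset in the merged pool.
    """
    pool = list(mj_items) + list(st_items)
    rng = random.Random(seed)  # fixed seed only to keep the sketch repeatable
    rng.shuffle(pool)
    return [pool[i:i + batch_size] for i in range(0, len(pool), batch_size)]


# Toy stand-ins for the two datasets: (source, index) pairs.
mj = [("MJ", i) for i in range(6)]
st = [("ST", i) for i in range(4)]
batches = make_batches(mj, st, batch_size=4)
for batch in batches:
    print(batch)
```

In PyTorch, the analogous setup would be a `ConcatDataset` over the two datasets loaded with `shuffle=True`. A ratio-controlled alternative would instead draw a fixed number of samples per dataset for every batch.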

milely commented Mar 23, 2022

Hello, have you found the problem? I also trained on the MJ and ST datasets in LMDB format and got results similar to yours, but did not reach the performance reported in the paper.

qyfff commented Nov 9, 2022

How do I train on the two datasets?
