
Incorrect example command in README, broken MNIST gen_dataset.py script, and documentation improvements needed in key parts of the code #7

Open
rmcavoy opened this issue Nov 5, 2019 · 3 comments

rmcavoy commented Nov 5, 2019

The README says to run the following command for dSNE:
python main_sda.py --method dsnet --cfg cfg/digits-a.json --bb lenetplus --bs 256 --src MT --tgt MM --nc 10 --size 32 --log-itv 100 --dropout --hybridize
This needs a minor correction: "dsnet" should be replaced with "dsne", since dsnet is not currently a valid option for main_sda.py:
python main_sda.py --method dsne --cfg cfg/digits-a.json --bb lenetplus --bs 256 --src MT --tgt MM --nc 10 --size 32 --log-itv 100 --dropout --hybridize

A larger problem is that the MNIST gen_dataset.py script transforms the dataset into an incorrect format in several ways, and consequently errors out immediately when one tries to load the dataset with the example commands (I used dsne with train_sda.py):

  • It creates images shaped (batch, height, width) via .reshape(-1, rows, cols) (line 16 of gen_dataset.py) instead of (batch, height, width, channel) via .reshape(-1, rows, cols, 1), which causes the resize transforms to fail.
  • It stores a pickle of a dictionary with keys TR and TE, each containing its data as a list [img, lbl]. While this is the correct shape for code deeper in the pipeline (e.g. training_sda.py), load_pkl in utils/io.py requires that the data be unpacked as tr_x, tr_y, te_x, te_y = pkl.load(f).
  • By default it uses self.__class__.__name__ instead of self.__class__.__name__.lower(), even though the cfg file expects a lowercase pickle file name and the class name is all-caps by default.
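To make the three fixes concrete, here is a minimal sketch of what a corrected gen_dataset.py would do. The names and dummy data below are illustrative, not the repository's actual code:

```python
import pickle
import numpy as np

rows, cols = 32, 32

# Dummy raw image bytes standing in for the decoded MNIST arrays
# (2 train images + 2 test images).
raw_train = np.zeros(2 * rows * cols, dtype=np.uint8)
raw_test = np.zeros(2 * rows * cols, dtype=np.uint8)

# Fix 1: keep an explicit channel axis so the resize transforms
# receive (batch, height, width, channel) data.
tr_x = raw_train.reshape(-1, rows, cols, 1)   # not .reshape(-1, rows, cols)
te_x = raw_test.reshape(-1, rows, cols, 1)
tr_y = np.zeros(len(tr_x), dtype=np.int64)
te_y = np.zeros(len(te_x), dtype=np.int64)

class MNIST:  # stand-in for the dataset class
    pass

# Fix 3: lowercase the class name so the file matches the cfg entry.
fname = MNIST.__name__.lower() + ".pkl"   # "mnist.pkl", not "MNIST.pkl"

# Fix 2: store a flat 4-tuple, matching the unpacking in utils/io.py
#   tr_x, tr_y, te_x, te_y = pkl.load(f)
# instead of a {"TR": [img, lbl], "TE": [img, lbl]} dictionary.
with open(fname, "wb") as f:
    pickle.dump((tr_x, tr_y, te_x, te_y), f)

with open(fname, "rb") as f:
    tr_x2, tr_y2, te_x2, te_y2 = pickle.load(f)
```

With this layout, load_pkl's four-way unpacking succeeds and the images carry the channel axis the transforms expect.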

As a more minor correction, it would also be nice if gen_dataset.py lived in the datasets folder, since that is where the datasets are supposed to be located. It would also be good to provide a fully documented README example that walks through every step of the calculation (including moving and converting the data), so users don't have to dig through the code and example input files to figure out where to put the datasets and under which directory names.

Also, for future maintainability and extensibility, it would be good to expand and clarify the comments and variable naming in the important pieces of the code.

For example, at first glance, someone new to mxnet might read the loop inside dSNE's train_batch (line 1018 of training_sda.py) as using zip to iterate over the individual samples in the minibatch, when it is actually iterating over the list of ctx that the batch is split across. That may be obvious to someone very familiar with mxnet, but it will not necessarily be obvious or readable for new users coming from e.g. PyTorch, where the GPU context split is handled seamlessly in the background. This confusion is not helped by the naming scheme in e.g. dSNELoss (in the custom layers), which refers to sizes as "bs_src" and "bs_tgt" or "N, K" and "M, K"; these names are only obvious if (a) the user has not assumed the loop in train_batch is over the batch index, and (b) the user already has a good idea of which variable refers to what.
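To illustrate the pattern in question without requiring mxnet, here is a plain-NumPy sketch of how a batch gets split across devices; "ctx" is just a string label here, whereas in mxnet it would be an mx.gpu(i) context and the split would come from gluon.utils.split_and_load:

```python
import numpy as np

num_devices = 2
ctx_list = [f"gpu({i})" for i in range(num_devices)]

batch_src = np.arange(8 * 3).reshape(8, 3)   # 8 source samples, 3 features
batch_tgt = np.arange(8 * 3).reshape(8, 3)   # 8 target samples, 3 features

# Each batch is split into one shard per device.
src_shards = np.array_split(batch_src, num_devices)
tgt_shards = np.array_split(batch_tgt, num_devices)

# The zip below iterates over DEVICE SHARDS, not individual samples:
# each x_src / x_tgt is a (batch_size_per_device, features) array.
for ctx, x_src, x_tgt in zip(ctx_list, src_shards, tgt_shards):
    assert x_src.shape == (8 // num_devices, 3)
    # the forward/backward pass for this shard would run on `ctx` here
```

The loop body executes once per device, which is why the loss code inside sees whole per-device sub-batches rather than single samples.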


kailigo commented Dec 27, 2019

@rmcavoy, that is exactly what I ran into. I spent several days figuring these issues out. I doubt the authors tested the code before uploading it.

Btw, @rmcavoy, did you manage to have the code run for the other datasets?


rmcavoy commented Dec 30, 2019

No, I didn't bother trying to get it working for the other datasets, as I moved on to other things. As for the lack of testing, I would assume they have a server where all the data is laid out and formatted correctly, so there was never any need for them to get these pieces running in order to use the code.

@joshuacwnewton commented

Thank you kindly for taking the time to point these issues out, @kailigo and @rmcavoy! You've potentially saved me a lot of time and troubleshooting.
