- The handwriting dataset used is the IAM dataset.
- The code wrapper was taken from the YouTube-8M contest.
- lstm2d.py and lstm1d.py were taken from TensorFlow contrib.
- The training and testing inputs are stored in TFRecords files.
- Python 2.7
- TensorFlow 1.1
- Note: I used that Docker image to run the project; change the TensorFlow version to 1.1.0.
- open the make tfRecods.ipynb notebook
- change the paths '../xml/' and '../forms/' to the paths of the xml and forms folders from the IAM dataset
- run all the cells in the notebook
- the tfrecords files will be created in the 'test-batch1' folder
- note: sorted(glob.glob(pathXML+"*.xml"))[:200] will process only 200 images; increase the 200 to process more
- note: this notebook will create tfrecords with images of shape (350, 25)
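The XML pass of the notebook can be sketched roughly as below. The function name `iam_line_labels` and the `limit` parameter are placeholders, and the sketch assumes each `<line>` element in the IAM ground-truth XML carries its transcription in a `text` attribute; adjust if your copy of the dataset differs:

```python
import glob
import xml.etree.ElementTree as ET

def iam_line_labels(path_xml, limit=200):
    """Collect (line id, transcription) pairs from IAM ground-truth XML.

    Assumes each <line> element stores its transcription in a 'text'
    attribute, as in the IAM XML files.
    """
    labels = []
    for xml_file in sorted(glob.glob(path_xml + "*.xml"))[:limit]:
        root = ET.parse(xml_file).getroot()
        for line in root.iter("line"):
            labels.append((line.get("id"), line.get("text")))
    return labels
```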
python train.py --slices 55 --width 12 --stride 1 --Bwidth 350 --vocabulary_size 29 \
--height 25 --train_data_pattern test-batch1/handwritten-test-{}.tfrecords --train_dir models-feds \
--test_data_pattern test-batch1/handwritten-test-{}.tfrecords --max_steps 20 --batch_size 20 --beam_size 1 \
--input_chanels 1 --start_new_model --rnn_cell LSTM --model MDLSTMCTCModel --num_epochs 6000
python train.py --slices 55 --width 12 --stride 1 --Bwidth 350 --vocabulary_size 29 \
--height 25 --train_data_pattern test-batch1/handwritten-test-{}.tfrecords --train_dir models-feds \
--test_data_pattern test-batch1/handwritten-test-{}.tfrecords --max_steps 20 --batch_size 20 --beam_size 1 \
--input_chanels 1 --start_new_model --rnn_cell LSTM --model LSTMCTCModel --num_epochs 6000
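The `{}` in `--train_data_pattern` is presumably filled in with shard indices by the input pipeline. A minimal sketch of expanding such a pattern into a list of shard filenames (the helper name and the 1-based shard numbering are assumptions, not taken from the repo):

```python
def expand_pattern(pattern, num_shards):
    """Expand a 'handwritten-test-{}.tfrecords' style pattern into
    concrete shard filenames; shard numbering here starts at 1."""
    return [pattern.format(i) for i in range(1, num_shards + 1)]
```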
python inference.py --slices 55 --width 12 --Bwidth 350 --stride 1 \
--input_chanels 1 --height 25 --input_data_pattern test-batch1/handwritten-test-1.tfrecords \
--train_dir models-feds --batch_size 20 --beam_size 1
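`--beam_size 1` amounts to greedy CTC decoding: take the argmax class at each time step, collapse consecutive repeats, then drop blanks. A NumPy sketch, assuming the blank label is the last class index as in TensorFlow's CTC ops (the function itself is illustrative, not code from this repo):

```python
import numpy as np

def ctc_greedy_decode(logits, blank):
    """logits: (time, num_classes) array; returns the collapsed label list."""
    best = np.argmax(logits, axis=1)          # most likely class per time step
    collapsed = [int(best[0])] if len(best) else []
    for label in best[1:]:                    # merge consecutive repeats
        if label != collapsed[-1]:
            collapsed.append(int(label))
    return [l for l in collapsed if l != blank]  # drop blank symbols
```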
- --slices: number of slices the image is cut into
- --width: width of the sliding window
- --stride: step of the sliding window
- --Bwidth: image width
- --train_data_pattern ../tf-data/handwritten-test-{}.tfrecords
- --train_dir separable_lstm
- --test_data_pattern ../tf-data/handwritten-test-{}.tfrecords
- --max_steps 6000
- --batch_size 20
- --beam_size 3
- --input_chanels 1
- --model: the model class name from handwritten models.py
- --base_learning_rate 0.001
- --num_readers 2
- --export_model_steps 500
- --display_step 10
- --display_step_lme 100
- --start_new_model
- --hidden: number of neurons in the LSTM cell
- --layers: number of LSTM layers
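Together, --slices, --width, and --stride describe a sliding window over the image width (--Bwidth). A minimal NumPy sketch of cutting an image into overlapping vertical slices; the exact slicing in this repo may differ, and the numbers in the test are illustrative rather than the flag values above:

```python
import numpy as np

def slice_image(image, num_slices, width, stride):
    """Cut a (height, Bwidth) image into num_slices windows of the given
    width, moving `stride` pixels at a time along the width axis."""
    return np.stack([image[:, i * stride : i * stride + width]
                     for i in range(num_slices)])
```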
tensorboard --logdir=separable_lstm --port=8080
- To see some prediction examples, check out the Jupyter notebook