OCR: Add IAM corpus with unk decoding support #6
Ashish, could you please rebase against the ocr branch?
num_targets=$(tree-info $tree_dir/tree | grep num-pdfs | awk '{print $2}')
learning_rate_factor=$(echo "print 0.5/$xent_regularize" | python)
common1="required-time-offsets= height-offsets=-2,-1,0,1,2 num-filters-out=36"
You can remove required-time-offsets= altogether.
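For reference, a minimal sketch of how the quoted line might read once the empty option is dropped, assuming nothing else on the line changes (the final form in the merged recipe is not shown in this thread):

```shell
# Assumed post-review form: the empty required-time-offsets= option is removed,
# and the remaining options are kept exactly as in the quoted diff line.
common1="height-offsets=-2,-1,0,1,2 num-filters-out=36"
echo "$common1"
```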
6a93702 to f8eb4fd
Thanks, rebased it against the ocr branch. Updated headers in the new recipes.
Some notes about the headers
@@ -29,8 +29,8 @@ alignment_subsampling_factor=1
chunk_width=340,300,200,100
Please update the results for this recipe if they are not already updated.
@@ -33,8 +33,8 @@ alignment_subsampling_factor=1
chunk_width=340,300,200,100
Also for this recipe.
Change the description to "chainali_1a is as 1a except it uses chain alignments (using 1a system) instead of gmm alignments" and then append the output (and the command itself) of compare_wer.sh for 1a and chainali_1a (after 1 blank line).
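The requested header layout (description, one blank line, the command itself, then its output) can be sketched with a small helper. `print_header` is a hypothetical name introduced here for illustration and is not part of the recipes; it only shows the layout and does not produce any results itself:

```shell
# Hypothetical helper (not part of the recipes): print a recipe header in the
# layout asked for in the review.
print_header() {
  description=$1
  shift
  # Description line followed by one blank line.
  printf '# %s\n\n' "$description"
  # The command itself, as a comment.
  printf '# %s\n' "$*"
  # The command's output, with every line turned into a comment.
  "$@" | sed 's/^/# /'
}
```

It would be called with the description followed by the compare_wer.sh command and the two experiment directories for 1a and chainali_1a; the exact script location and directory names depend on the local setup and are not given in this thread.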
@@ -0,0 +1,235 @@
#!/bin/bash

# chainali_1b uses chain model for lattice instead of gmm-hmm model. It has more cnn layers as compared to 1a
Change this to "chainali_1b is as chainali_1a except it has 3 more cnn layers."
Then append the compare_wer.sh output (with the command) after adding a blank line.
@@ -0,0 +1,226 @@
#!/bin/bash
Please remove this and 1d for now. The improvements are not significant.
Sorry. Updated headers for run_cnn_1a.sh, run_cnn_chainali_1a.sh, and run_cnn_chainali_1b.sh. Removed run_cnn_chainali_1c.sh and run_cnn_chainali_1d.sh.
Thanks. Merging...
* OCR: Add IAM corpus with unk decoding support (#3)
* Add a new English OCR database 'UW3'
* Some minor fixes re IAM corpus
* Fix an issue in IAM chain recipes + add a new recipe (#6)
* Some fixes based on the pull request review
* Various fixes + cleaning on IAM
* Fix LM estimation and add extended dictionary + other minor fixes
* Add README for IAM
* Add output filter for scoring
* Fix a bug RE switch to python3
* Add updated results + minor fixes
* Remove unk decoding -- gives almost no gain
* Add UW3 OCR database
* Fix cmd.sh in IAM + fix usages of train/decode_cmd in chain recipes
* Various minor fixes on UW3
* Rename iam/s5 to iam/v1
* Add README file for UW3
* Various cosmetic fixes on UW3 scripts
* Minor fixes in IAM