implement CTC with keras? #383
Comments
Do you have a reference for what you are trying to implement? As well as your attempt so far. |
Hi @fchollet, the original CTC paper by Alex Graves can be found here. Basically, CTC is a special loss function for handling alignment. For example, in speech recognition, suppose the input sequence has length t (so the RNN output also has length t); the target sequence usually has a length w smaller than t. CTC removes the need to pre-segment the inputs and post-segment the network outputs. I was trying to add a new cost function in objectives.py, based on this ctc.py file. The model compiles, but I get several errors when I call model.fit(). I guess the reason lies in these lines, which imply that the two arguments to the cost function must have the same shape. Correct me if I have misunderstood anything. |
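To make the alignment idea above concrete, here is a minimal pure-Python sketch of the CTC forward recursion as described in the Graves paper. It is not taken from any of the implementations linked in this thread; the function names, the `blank=0` convention, and the nested-list input format are assumptions made for illustration, and it works with raw probabilities rather than in log space, so it is only suitable for short sequences.

```python
import math

def collapse(path, blank=0):
    """Map a frame-level path to a label sequence: merge repeats, then drop blanks."""
    out = []
    prev = None
    for symbol in path:
        if symbol != prev and symbol != blank:
            out.append(symbol)
        prev = symbol
    return out

def ctc_neg_log_likelihood(probs, target, blank=0):
    """Negative log-likelihood of `target` given per-frame probabilities.

    probs[t][k] is the probability of symbol k at frame t (rows sum to 1).
    target is a list of non-blank labels, shorter than or equal to len(probs).
    """
    # Extended target: a blank before, between, and after the labels.
    ext = [blank]
    for label in target:
        ext += [label, blank]
    num_states = len(ext)

    # alpha[s] = total probability of all path prefixes ending in state s.
    alpha = [0.0] * num_states
    alpha[0] = probs[0][blank]
    if num_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * num_states
        for s in range(num_states):
            total = alpha[s]                      # stay in the same state
            if s >= 1:
                total += alpha[s - 1]             # advance by one state
            # Skip over a blank, allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end in the last label or in the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if num_states > 1 else 0.0)
    if likelihood == 0.0:
        return float("inf")  # no valid alignment (e.g. too few frames)
    return -math.log(likelihood)
```

For example, with two frames, a uniform distribution over {blank, label 1}, and target `[1]`, the three paths (1, blank), (blank, 1) and (1, 1) each have probability 0.25, so the function returns -log(0.75). Note that the loss takes the full length-t output and the shorter length-w target, which is why it does not fit a cost function whose two arguments must share a shape.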
@amaas implemented the CTC loss strictly faithful to the original paper in a very straightforward way. |
It should be relatively straightforward to port our CTC implementation into the Keras framework. Note that our fast version is cython (which doesn't seem to be used elsewhere in Keras). Without cython the loops to compute alignments required to evaluate the CTC loss were painfully slow. |
@amaas: do you have a Theano implementation? Or can your fast version work with Theano? |
@Jedi00 No, we wrote our RNNs from scratch without Theano. If you want to replace the NN architecture though you could take just our CTC loss and make it a Theano function. It only needs to interact with the final layer so it should be mostly unchanged in a Theano implementation. |
Hi @blackyang, did you implement Lasagne's CTC in Keras? If you did, could you tell me how to do it? |
Hi @jinserk, I was stuck at the same place, so I used Lasagne, which I think is more extensible. By the way, I recommend amaas's implementation instead of Lasagne's CTC, since the latter is somewhat problematic. |
I tried it too; unfortunately, I failed... |
The following paper trained a convolutional bidirectional LSTM network to recognize natural scene text without text line segmentation. The open source code implements CTC in C++ for the Torch7 framework in Lua. The C++ code can be modified for use from Python. [1] B. Shi, X. Bai, C. Yao. An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition. CoRR abs/1507.05717, 2015. |
Baidu just released their open source CPU and GPU implementation of CTC here. It is released as a C library with bindings for Torch. The C library should be easy to integrate into many different projects. |
@ekelsen thanks for the pointer! |
Here is an implementation of Theano bindings for Baidu's warp-ctc: https://github.com/sherjilozair/ctc Is there any plan for Keras to bind this? |
And from TensorFlow... https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/ctc |
You guys have any luck with this implementation? |
I maintain a repository of CTC with various implementations, including cython, numba/python and theano versions; check here: https://github.com/daweileng/Precise-CTC. You can use either the CTC_precise or the CTC_for_train class; both are fine for RNN training. The CTC objective is different from the current objective functions in Keras and requires a different masking mechanism. I also maintain a repository of Keras MOD with CTC incorporated; check here: https://github.com/daweileng/keras_MOD. Currently, only train_on_batch() is modified to be compatible with CTC. This is enough for me, so I have no definite plans to modify other parts of Keras. |
Oh this is perfect and exactly what I am interested in. Thank you! |
Just to let you know, there is this discussion about versions that wrap Baidu's implementation. There are currently two wrappers: https://github.com/mcf06/theano_ctc and https://github.com/sherjilozair/ctc |
@daweileng Do you have any instructions/examples as to how to use your Keras MOD? |
In the repository https://github.com/daweileng/Precise-CTC, there is a folder named 'Test', where you can find a demo script, 'mnist_ctc_v4.py'. |
@daweileng In mnist_ctc_v4.py you import from NN_auxiliary and from mytheano_utils. Can you share these too? Maybe in Keras MOD? |
For those interested: I updated my CTC-integrated Keras fork to base version 1.0.4; check here: https://github.com/daweileng/keras_MOD/tree/MOD_1.0.4. So far, the following train/test functions work well with the CTC cost: |
@daweileng Sadly you did not fork the Keras repository. Instead you just copied the files over and added everything (including your patches) in one commit. Can you do that properly (e.g., press the fork button on github, clone, add your changes, commit separately, push) so your patches become actually visible? That'd be awesome. |
@githubnemo As explained in the README, I didn't make a pull request because, to avoid a mass modification of Keras' masking mechanism, I currently override the sample_weights and masks variables of Keras. In theory this should not cause problems for other networks, but I'm not 100% sure about this. Besides, the modification of the fit() function is not done yet. I'd like to collect enough feedback before making an official pull request to the Keras master branch. If you just want to know what has changed, you can compare the contents of the two repositories. Progress: FCN now works with LSTM + CTC! |
See also #3436 |
@harikrishnavydana The ocr example runs fine for me on both Theano and Tensorflow. If it is not working for you, please review the issue guidelines (update keras) and if the issue persists, open a new issue. |
Thank you @patyork, I was using an older version of Keras. |
Hi, thanks @patyork. Just to understand: you are putting text in images, and using some of these to train and others to validate? But you are using full words to train, not characters? Pycairo is a complicated library to install :) |
I am trying to use the Keras CTC loss in a bidirectional LSTM, i.e. https://github.com/lvapeab/ABiViRNet. A loss function in Keras takes 2 arguments, but ctc_batch_cost takes 4. Can somebody tell me how to handle this? |
Apparently, there is a ctc_loss implementation in Keras. There's an open issue on Keras' |
Hello... Do we already have a CTC sample in the Keras repository? |
You mean this one? If so, yeah I know there is already a sample. It's just when I search for "Keras CTC" in google, this issue comes up and I just thought it would be nice to let people know that such an implementation already exists. |
Great |
Is it ok to use Is there any plan to integrate WarpCTC to Keras? |
Could you please tell what input_length and label_length specify? As per the documentation it seems label_length contains the lengths of ground truth strings (in case of OCR). But I'm not sure what input_length means. |
I think input_length refers to your sequence length and label_length refers to the ground truth label length. |
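As a rough illustration of the two length arguments, the sketch below builds them for a small batch in plain Python. It assumes the convention documented for ctc_batch_cost, where each argument holds one length per sample with shape (samples, 1). The helper names, the pad value of -1, and the downsampling factor are hypothetical and only serve the example.

```python
def pad_labels(label_seqs, pad_value=-1):
    """Pad variable-length label sequences into a rectangular batch.

    Returns the padded labels and label_length, which records how many
    entries of each padded row are real labels rather than padding.
    """
    max_len = max(len(seq) for seq in label_seqs)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in label_seqs]
    label_length = [[len(seq)] for seq in label_seqs]
    return padded, label_length

def input_lengths(frame_counts, downsample=1):
    """Number of valid time steps in the network output for each sample.

    If the network shrinks the time axis (e.g. by pooling), the input
    frame count is divided by that factor (a hypothetical parameter here).
    """
    return [[frames // downsample] for frames in frame_counts]

# Two samples: label sequences of length 3 and 1, inputs of 100 and 80 frames,
# with a network that is assumed to reduce the time axis by a factor of 4.
labels, label_length = pad_labels([[1, 2, 3], [4]])
input_length = input_lengths([100, 80], downsample=4)
```

Here `label_length` is `[[3], [1]]` and `input_length` is `[[25], [20]]`: the first counts ground-truth labels per sample, the second counts the output time steps the loss should consider for that sample.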
Hi there,
Has anyone implemented a Connectionist Temporal Classification (CTC) loss with Keras?
I attempted to add such a cost function to the objectives.py file, based on rakeshvar's code. The model compiles, but I get several errors when I call model.fit(). I am new to Theano, so it's really tough for me to debug...
It shouldn't be hard in theory, so I guess I made some "naive" mistakes...