implement CTC with keras? #383

blackyang · 2015-07-12T21:16:30Z

Hi there,

Has anyone implemented a Connectionist Temporal Classification (CTC) loss with Keras?

I attempted to add such a cost function to the objectives.py file, based on rakeshvar's code. The model compiles, but I get several errors when I call model.fit(). I am new to Theano, so it's really tough for me to debug...

It shouldn't be hard in theory, so I guess I made some "naive" mistakes...

fchollet · 2015-07-14T07:19:01Z

Do you have a reference for what you are trying to implement, as well as your attempt so far?

blackyang · 2015-07-14T16:50:54Z

Hi @fchollet, the original CTC paper by Alex Graves can be found here.

Basically, CTC is a special loss function that handles alignment. For example, in speech recognition, suppose the input sequence has length t (so the RNN output also has length t); the target sequence usually has a length w smaller than t. CTC removes the need for pre-segmentation of the inputs and post-segmentation of the network outputs.
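To make the alignment idea concrete, here is a minimal pure-Python sketch of the CTC forward (alpha) recursion, which sums the probability of every frame-level alignment that collapses to the target. It is an illustration only, not the code from any repository mentioned in this thread. The function name, the argument layout, and the choice of index 0 for the blank symbol are assumptions, and it works with raw probabilities rather than log-probabilities, so it is only suitable for short sequences.

```python
import math

def ctc_neg_log_likelihood(probs, labels, blank=0):
    """Return -log p(labels | probs) via the CTC forward recursion.

    probs  : list of T rows, each a list of per-class probabilities
    labels : list of target class indices without blanks
    blank  : index of the blank symbol (assumed to be 0 here)
    """
    # Extended target: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S, T = len(ext), len(probs)

    # alpha[s]: probability mass of all partial alignments that end in ext[s]
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping the blank is allowed only between two different labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # A valid alignment ends on the last label or on the trailing blank
    p = alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)
    return -math.log(p) if p > 0.0 else math.inf
```

For example, with two frames of uniform probabilities over {blank, a} and the target "a", the three alignments "a-", "-a" and "aa" each have probability 0.25, so `ctc_neg_log_likelihood([[0.5, 0.5], [0.5, 0.5]], [1])` equals -log(0.75), about 0.2877. A target that cannot fit in the available frames (for instance "aa" in two frames, which needs a blank in between) has probability zero and the function returns infinity.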

I was trying to add a new cost function to objectives.py, based on this ctc.py file. The model compiles, but I get several errors when I call model.fit(). I guess the reason lies in these lines, which imply that the two arguments to the cost function must have the same shape. Correct me if I have misunderstood anything.

futurely · 2015-07-23T02:51:29Z

@amaas implemented the CTC loss strictly faithful to the original paper in a very straightforward way.

blackyang · 2015-07-23T02:55:05Z

@futurely thanks! Currently I am using this with lasagne :-)

amaas · 2015-07-23T17:28:41Z

It should be relatively straightforward to port our CTC implementation into the Keras framework. Note that our fast version is written in Cython (which doesn't seem to be used elsewhere in Keras). Without Cython, the loops that compute the alignments required to evaluate the CTC loss were painfully slow.

ghost · 2015-08-19T06:19:25Z

@amaas: do you have a Theano implementation? Or can your fast version work with Theano?

amaas · 2015-08-25T19:28:15Z

@Jedi00 No, we wrote our RNNs from scratch without Theano. If you want to replace the NN architecture though you could take just our CTC loss and make it a Theano function. It only needs to interact with the final layer so it should be mostly unchanged in a Theano implementation.

jinserk · 2015-09-22T08:09:09Z

Hi @blackyang, did you port Lasagne's CTC to Keras? If you did, could you tell me how?
Keras' loss objects are all functions defined in objectives.py, and they seem to be called from the compile() function in models.py. Each one is wrapped by the weighted_objective() function, which calls the loss function with only two parameters, y_true and y_pred. However, Lasagne's CTC is a class, and its apply() function seems to require 4 parameters. I'm stuck here.
Thank you.

blackyang · 2015-09-22T16:02:19Z

Hi @jinserk, I was stuck at the same place, so I used Lasagne, which I think is more extensible. By the way, I recommend amaas's implementation instead of Lasagne's CTC, since the latter is somewhat problematic.

Michlong · 2015-12-23T09:32:43Z

I tried it too, but unfortunately failed...

futurely · 2015-12-23T09:52:41Z

The following paper trained a convolutional bidirectional LSTM network to recognize text in natural scenes without text line segmentation. The open source code implements CTC in C++ for the Torch7 framework in Lua. The C++ code can be adapted for use from Python.

[1] B. Shi, X. Bai, C. Yao. An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition. CoRR abs/1507.05717, 2015.

ekelsen · 2016-01-14T18:47:48Z

Baidu just released their open source CPU and GPU implementation of CTC here:
https://github.com/baidu-research/warp-ctc

It is released as a C library with bindings for Torch. The C library should be easy to integrate into many different projects.

blackyang · 2016-01-14T19:02:29Z

@ekelsen thanks for the pointer!

ZhangAustin · 2016-01-27T22:53:52Z

Here is an implementation of Theano bindings for Baidu's warp-ctc: https://github.com/sherjilozair/ctc

Is there any plan for Keras to bind this?

futurely · 2016-03-13T03:15:42Z

https://github.com/baidu-research/warp-ctc

mschonwe · 2016-03-13T04:17:14Z

And from TensorFlow... https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/ctc

shantanudev · 2016-05-12T16:04:44Z

You guys have any luck with this implementation?

ghost · 2016-05-13T01:26:33Z

I maintain a repository of CTC with various implementations, including cython, numba/python and theano versions, check here: https://github.com/daweileng/Precise-CTC. You can use CTC_precise or CTC_for_train class, they're both fine for RNN training.

The CTC objective is different from the current objective functions in Keras, and requires a different masking mechanism. I also maintain a repository of Keras MOD with CTC incorporated, check here: https://github.com/daweileng/keras_MOD. Currently, only train_on_batch() is modified to be compatible with CTC. This is enough for me, so there is no definite plan to modify other parts of Keras.

shantanudev · 2016-05-13T05:41:38Z

Oh this is perfect and exactly what I am interested in. Thank you!

nouiz · 2016-05-13T16:55:16Z

Just to let you know, there is a discussion about a version that wraps Baidu's implementation, which could be faster:

Theano/Theano#3871 (comment)

There are currently two wrapper versions:

https://github.com/mcf06/theano_ctc

and

https://github.com/sherjilozair/ctc

lingz · 2016-06-01T11:49:57Z

@daweileng Do you have any instructions/examples as to how to use your Keras MOD?

ghost · 2016-06-02T01:14:53Z

In the repository https://github.com/daweileng/Precise-CTC there is a folder named 'Test', where you can find the demo script 'mnist_ctc_v4.py'.

vkatsouros · 2016-06-22T14:43:11Z

@daweileng In mnist_ctc_v4.py you import from NN_auxiliary and from mytheano_utils. Can you share these too? Maybe in Keras MOD?

ghost · 2016-07-06T01:50:14Z

For those interested: I updated my CTC-integrated Keras fork to base version 1.0.4, check here: https://github.com/daweileng/keras_MOD/tree/MOD_1.0.4. So far, the following train/test functions work well with the CTC cost:

train_on_batch()
test_on_batch()
predict_on_batch()

githubnemo · 2016-07-07T01:50:08Z

@daweileng Sadly you did not fork the Keras repository. Instead you just copied the files over and added everything (including your patches) in one commit. Can you do that properly (e.g., press the fork button on github, clone, add your changes, commit separately, push) so your patches become actually visible? That'd be awesome.

ghost · 2016-07-07T02:43:16Z

@githubnemo As explained in the README, I didn't make a pull request because, to avoid a massive modification of Keras' masking mechanism, I currently override the sample_weights and masks variables of Keras. In theory this should not cause problems for other networks, but I'm not 100% sure about this. Besides, the modification of the fit() function is not done yet. I'd like to collect enough feedback before an official pull request to the Keras master branch.

If you just want to know what has changed, you can compare the contents of the two repositories.

Progress: Now FCN can work with LSTM + CTC!

pasky · 2016-08-12T20:20:26Z

See also #3436

patyork · 2017-01-13T19:04:26Z

@harikrishnavydana The ocr example runs fine for me on both Theano and Tensorflow.

If it is not working for you, please review the issue guidelines (update keras) and if the issue persists, open a new issue.

HariKrishna-Vydana · 2017-01-14T06:28:33Z

Thank you, I was using an older version of Keras @patyork

besanson · 2017-01-26T17:18:55Z

Hi, thanks @patyork. Just to understand: you are rendering text into images, and using some of these to train and others to validate? But you are using full words to train and not characters. Pycairo is a complicated library to install :)

anuj-rathore · 2017-08-17T08:30:58Z

I am trying to use Keras CTC in a bidirectional LSTM, i.e. https://github.com/lvapeab/ABiViRNet
The network is:
https://pastebin.com/9QXbJSwE

A Keras loss function takes 2 arguments, but ctc_batch_cost takes 4. Can somebody tell me how to handle this?

selcouthlyBlue · 2018-01-16T01:34:48Z

Apparently, there is a ctc_loss implementation in Keras. There's an open issue on Keras' ctc_batch_cost in the tensorflow_backend.

hypernote · 2018-01-16T01:59:11Z

Hello... Do we already have a CTC sample in the Keras repository?

selcouthlyBlue · 2018-01-16T02:03:54Z

You mean this one? If so, yeah I know there is already a sample. It's just when I search for "Keras CTC" in google, this issue comes up and I just thought it would be nice to let people know that such an implementation already exists.

hypernote · 2018-01-16T02:11:06Z

Great

rasto2211 · 2018-01-16T22:24:19Z

Is it ok to use ctc_batch_cost as keras loss function and pass it to model.compile? All the losses that are implemented in:
https://github.com/keras-team/keras/blob/master/keras/losses.py
take only one sample. Is it efficient?

Is there any plan to integrate WarpCTC to Keras?

saisumanth007 · 2018-01-28T12:28:31Z

Could you please explain what input_length and label_length specify? As per the documentation, it seems label_length contains the lengths of the ground truth strings (in the case of OCR). But I'm not sure what input_length means.

aayushee · 2018-02-05T17:42:39Z

I think input_length refers to your sequence length and label_length refers to the ground truth label length.
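As a small illustration of that distinction, the sketch below builds both length arrays for a toy OCR batch in plain Python. The helper name, the alphabet encoding (a=0, b=1, ...), the padding value, and the figure of 32 time steps are all assumptions made up for this example. The column layout, one length per sample, follows the (samples, 1) shape that the Keras documentation gives for the length arguments of ctc_batch_cost.

```python
def pad_labels(label_seqs, pad_value=-1):
    """Pad variable-length label sequences to a rectangle and record true lengths.

    pad_value is an arbitrary filler chosen for illustration; the true
    lengths tell the loss which entries are real labels.
    """
    lengths = [len(seq) for seq in label_seqs]
    max_len = max(lengths, default=0)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in label_seqs]
    # One length per sample, stored as a column: shape (samples, 1)
    label_length = [[n] for n in lengths]
    return padded, label_length

# Two ground truth strings, "cat" and "hi", with a hypothetical encoding a=0, b=1, ...
labels = [[2, 0, 19], [7, 8]]
padded_labels, label_length = pad_labels(labels)

# Assumption: the network emits 32 time steps per image and all of them are
# valid, so input_length is 32 for every sample in the batch.
time_steps = 32
input_length = [[time_steps] for _ in labels]

# CTC can only align a target that fits into the available time steps.
assert all(t[0] >= l[0] for t, l in zip(input_length, label_length))
```

Here label_length is [[3], [2]] (characters per ground truth string) while input_length is [[32], [32]] (time steps of network output per sample). If the inputs were padded along the time axis, input_length would hold the unpadded number of steps for each sample instead.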

futurely mentioned this issue Dec 25, 2015

mxnet and ocr apache/mxnet#1023

Closed

blackyang closed this as completed Feb 23, 2017
