OCR-D processor is leaky #66
I've seen these kinds of memory leaks happen with TF 1, but AFAICR not with TF 2. (See https://github.com/qurator-spk/sbb_column_classifier - I think just upgrading fixed it, but maybe the "TF best practices" were necessary too.)
What I describe happens on TF 2.13.1, which should be fully supported. This issue is a show-stopper for me, as with OCR-D, it's not even possible to keep the results already produced (since they are only persisted in the METS at the end of the loop). @mikegerber what do you mean by TF Best Practices – some particular document perhaps?
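One way to reduce the damage of a late OOM kill, independent of fixing the leak itself, is to persist each page's result as soon as it is produced instead of collecting everything for a single write at the end of the loop. This is a minimal generic sketch, not the actual OCR-D METS API; `process` and the per-page JSON files are hypothetical stand-ins:

```python
import json
import pathlib
import tempfile

def process(page):
    # Hypothetical stand-in for the real per-page OCR step
    return {"page": page, "text": f"ocr result {page}"}

outdir = pathlib.Path(tempfile.mkdtemp())
for page in range(3):
    result = process(page)
    # Persist each result as soon as it exists, instead of collecting
    # everything and writing once after the loop finishes: a crash on
    # page N then loses only page N, not all N-1 earlier results.
    (outdir / f"page_{page}.json").write_text(json.dumps(result))
```

With OCR-D specifically, the equivalent would be flushing the METS incrementally rather than only after the loop.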
The things I did in sbb_column_classifier to make it process ~20 million pages:

1a. Updating to TF 2

I'm not sure if I did 1b to fix any memory leaks; it may have just been for better performance.
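Before applying any TF-specific fixes, it can help to establish whether the growth is even visible to Python. A stdlib-only sketch using `tracemalloc` (the `process_page` function and its deliberate leak are made up for illustration): if RSS climbs by gigabytes but the snapshot diff stays small, the leak is in native allocations (TF/CUDA) rather than Python objects.

```python
import tracemalloc

def process_page(page, sink):
    # Stand-in for per-page work; deliberately leaks ~1 KiB per call
    sink.append(bytes(1024))

tracemalloc.start()
leaked = []
before = tracemalloc.take_snapshot()
for page in range(100):
    process_page(page, leaked)
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Largest Python-side growth between the two snapshots, grouped by the
# source line that allocated it; points straight at the leaking call site
top = after.compare_to(before, "lineno")[0]
print("top growth:", top.size_diff, "bytes")
```

Taking a snapshot every N pages instead of just twice gives a growth curve per call site.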
When processing a document of 1.5k pages of medium size (1-2 MP each), I am observing a slow but steady increase in RSS from 4 GB up to 14 GB after 1.2k pages, at which point the process gets killed by the OS (`Killed`). I do not see any Python bindings accessible to the input file loop which could accumulate such data without ever being GCed.
I am on CUDA 11.8.
Has anybody seen this before?
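To track growth like this per page without extra dependencies such as psutil, the current RSS can be sampled directly from `/proc`. A Linux-only sketch (the `rss_bytes` helper name is made up; field 2 of `/proc/self/statm` is the resident page count):

```python
import os

def rss_bytes():
    # Current resident set size, Linux-only: the second field of
    # /proc/self/statm is resident pages; multiply by the page size
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")

baseline = rss_bytes()
# Simulate processing that retains memory: ~50 MB of zero-filled buffers
retained = [bytearray(1 << 20) for _ in range(50)]
growth = rss_bytes() - baseline
print("RSS grew by", growth // (1 << 20), "MiB")
```

Logging `rss_bytes()` after each page makes it easy to see whether the growth is linear in pages processed (a per-page leak) or steps up at particular inputs.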