Python prototype — 4m 51s
C version — 1.2s
Naive Python implementation with a preliminary calculation array — 18s
Scanline fill with a preliminary calculation array — 16s
Multiprocessing — 4s
Breadth-first filling with memoization. Up to two times faster.
Solution based on the native Python bitwise operators: & ~ |
.
Much faster.
End of support of Python 3.4.