group some modules into isolated venvs… #118
Conversation
- for modules with certain sets of dependencies, delegate to another (isolated) virtual env and place a shell script in the top-level venv
- update some executable names
- remove `fix-pip`
- update documentation
This is merely a starting point. One can easily imagine making better/more groups, or Docker delegation (CLI or REST). Also, maybe there is some way to reduce the writing effort for each module.
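For readers unfamiliar with the mechanism: a delegation stub of this kind could look roughly like the sketch below. The sub-venv path and processor name are placeholders, not the actual values used in this PR.

```sh
#!/bin/sh
# Hypothetical wrapper placed in the top-level venv's bin/ directory.
# It re-executes the identically named processor from the isolated
# sub-venv, so callers never need to know which venv a module lives in.
SUB_VENV=/path/to/ocrd_all/venv/sub-venv-tf1   # placeholder path
exec "$SUB_VENV/bin/$(basename "$0")" "$@"
```

Because such a stub is an ordinary script, it also works on filesystems without symlink support and could later be swapped for a `docker run` or REST call without touching the callers.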
I have successfully tested this inside a Docker container. I first had to re-instate our custom …
Looks great, will test it now.
A non-recursive solution could be based on master...stweil:tf1. It uses symbolic links to run processors in the TF1 venv.
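For comparison, the symlink variant would amount to something like the following sketch (paths and the processor name are placeholders):

```sh
# Hypothetical illustration of the symlink variant: expose a processor
# that is installed in the separate TF1 venv through the main venv's bin/.
ln -s ../../venv-tf1/bin/ocrd-example-processor venv/bin/ocrd-example-processor
```

When invoked through the link, the target script's own shebang (pointing at the TF1 venv's interpreter) is what gets executed, so the processor runs with its isolated dependencies.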
It's not just about separating TF1. There are many more potential and actual incompatibilities (TF <2.2 vs >=2.2, Keras 2.3.* vs 2.4.*, Torch <1.0 vs <1.5 vs >=1.5, opencv-python vs -headless, Pillow ...). Some of these have merely flown under the radar until now. Plus, think about going into thin Docker containers (wrapping CLIs or even a REST API) as the next step. So by merely copying all variables and rules (as in your proposal) we would quickly run into a mess (esp. since you have to drag along any change to them in all the others as well). Plus, symlinks are not supported by all currently relevant FS IIRC.
They are supported, at least on any machine and operating system that I have tried so far. We don't have plans to support Windows without WSL, do we?
Oh, I think I confused HPFS (which doesn't support them but is irrelevant) and HFS+. And yes, I guess at least within WSL even Windows should be fine with symlinks nowadays.
Indeed, there are. Luckily we only have to handle actual incompatibilities. Are there still problems with opencv-python? I thought all of them were using opencv-python-headless by now?
The main point was: there are too many for your kind of solution. The other point was: we don't really know what incompatibilities we already have. They rarely surface as long as no one here uses complex workflows in Docker images and we don't have regression tests for that. Besides, potential incompatibilities can always become actual ones when a module changes in the future.
Code is still compiling but so far it looks good :)
IIUC there is a potential project partner for phase 3 that intends to roll out OCR-D on a pure Windows (Server) environment. We'll continue to target Ubuntu 18.04 as the main platform, and when in doubt, we'll stick to POSIX. However, symlinks seem like a less maintainable approach, and unless there's an advantage to them over the more generalized multi-venv solution here, I'd avoid them.
I noticed that, too, but think that's not feasible, so IMO even trying it is a waste of time. With Docker or a virtual Linux machine there exist solutions which are known to work. And with newer Windows, WSL is also a good solution.
Both @bertsky's solution and mine use multiple venvs. Running a …
One known existing problem is that some prebuilt Tensorflow modules don't exist for every Python version. This is not a problem as long as we stick to Ubuntu 18.04, but for newer Debian / Ubuntu it could help to support different Python versions for the different venvs.
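If it comes to that, pinning a sub-venv to an older interpreter (assuming one is installed) could look like this; the version and package pin below are only an example:

```sh
# Hypothetical sketch: create the TF1 sub-venv with an older interpreter
# that still has prebuilt TF 1.x wheels, independently of the Python
# version used by the main venv.
python3.7 -m venv venv-tf1
venv-tf1/bin/pip install "tensorflow-gpu<2"
```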
Yes, both solutions are multi-venv, but I like that @bertsky's approach is more generic: not just for the TF1/TF2 issue but for possible other dependency incompatibilities. The venv delegation mechanism is the same by and large. Wrapper scripts have the advantage over symlinks that the delegation is explicit in the script, whereas symlinks carry that information only implicitly. If the OS or the filesystem - think a removable FAT or NTFS drive - doesn't support them, this needlessly breaks the deployment IMHO.
I agree with you that Linux, Docker or even WSL are the best option. And we will continue to target Ubuntu 18.04. But there are Windows Server environments out there where Linux in a VM or Docker are not viable for whatever reason. We're not actively supporting that at the moment but if somebody made the effort, I would not want to prevent it with platform-dependent design choices.
Fully agree, it is unfortunate that we cannot support Python 3.8 at the moment. I have been thinking about alleviating this by setting up a PPA or other Debian repo and generating …
Both approaches are identically generic. I have only implemented TF1 as an example, but of course any number of other venvs can be added in the same way, and it would also be possible to use actual files instead of symbolic links when needed, so that's not the essential difference. The difference is what the Makefile looks like and whether it requires recursive make or not.
As we all agree that there will be more venvs needed than two, I think it is important to also have a common understanding of the contents of those different venvs. It should be clearly defined what goes into the primary venv and what is the difference for the sub-venvs. The PR currently uses …

I'd try to put most modules into the primary venv. In my understanding that includes modules without peculiarities and modules which follow best / recommended practice (using current software versions => TF2, using OpenCV headless, ...). So I would not make a primary venv without any TF software and put all TF software into different sub-venvs, and similarly with OpenCV headless or not (where we currently don't have a problem as far as I know).
I think we all agree this is a non-argument (for a few bytes of shell script). We also (more or less) agreed that FS should be able to cope with symlinks.
I argued precisely against that above. @stweil's solution needs to mirror many variables and recipes for each additional sub-venv:
IMO this not only bloats the Makefile, it also makes it less readable and more error-prone.
They are not, IMO: the non-recursive approach cannot yet deal with non-Python (or not purely Python) install targets. It's not trivial to wrap recipes like …
I have explained that above already. The current choices are merely illustrations to get the new mechanism going and fix the most pressing inconsistencies. Headless stands for …
Like I said above, I believe it's better to group all GPU-enabled modules into sub-venvs for now. Not only because they usually have mutually exclusive sets of dependencies, but because we still have no mechanism to make GPUs work in Docker, so grouping them might facilitate future solutions.
I've tested it and unfortunately I cannot get it working. After …
… fails to run. Any ideas what to look for to fix this?
- ensure the target-specific `VIRTUAL_ENV` for modules with clashing dependencies gets exported to the sub-make
- call sub-make with `--always-make` to ensure the nested recipe gets run as well when the outer recipe was needed
- call sub-make with `--assume-old` on the module directory to avoid updating the submodule twice
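Spelled out as a plain shell invocation, the nested call described in these points is roughly equivalent to the following sketch; the module and executable names are placeholders, not the actual targets in this PR:

```sh
# Hypothetical stand-in for the nested make call:
#  - VIRTUAL_ENV is exported so the sub-make installs into the isolated venv
#  - --always-make forces the nested recipe to run whenever the outer one did
#  - -o (--assume-old) on the module directory keeps the sub-make from
#    updating the submodule a second time
export VIRTUAL_ENV="$PWD/venv-tf1"
make --always-make -o ocrd_example_module "$VIRTUAL_ENV/bin/ocrd-example-processor"
```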
Probably need to differentiate between 2.1 (for ocrd_pc_segmentation, maybe others) and 2.2 (ocrd_anybaseocr?). Using …
Fine by me, if that's what it takes. I can only tell for sure once the new run finishes.
Sorry for the noise, but here's the newest hiccup:
Apparently not all modules have been checked out:
Running …
That's strange. The …
I recommend against vfat. For a fresh start, I always do:

```sh
git submodule deinit --all -f
git submodule status | while read stat dir ver; do rmdir $dir; done
rm -fr venv
```
This was a clean checkout, so no Makefile.local or variable overrides. Cannot say what …
That won't work anyway (I tried and failed before).
Thx. I'll try with a fresh checkout, but this is useful documentation for debugging. Perhaps add it to the contributor guide?
That would take substantially more space than just …
I could introduce it as …
BTW your problems could also be explained by that alone. As soon as …
Damn!
TF's handling of GH issues is a joke: "Oh, this issue hasn't been solved in 7 days! Makes us look bad. Better close the issue automatically. Users seriously wounded will come back to our knife anyway."
The space ran out after this; there was at least enough space left for … Cannot reproduce now, though, fortunately:
Maybe a network problem.
Looks like this has since been removed upstream – 4 days ago. How long do they take to update their releases on PyPI? Who knows what other failures are lurking behind this one. Perhaps …
And what if we made that …
:( No space issues this time but again, something went wrong with anybaseocr and typegroups classifier. In fact, only these tools have been installed:
Wrapper scripts for …
Off to another fresh start, again.
Are you sure you didn't just fall into the above …
(due to failing TF dependencies, possibly others – this was too ambitious or too early)

This reverts commit a6e7e4e.
Yes, there was no error that broke …
Thanks for the log. This is proof to me that you did fall into the …
Anyway, since I have reverted the …
No, it was plain …