OCR example #6560

Merged
merged 39 commits into from
Jun 14, 2024

Conversation

Contributor

@andreasnaoum andreasnaoum commented Jun 13, 2024

What

New example for Document Analysis and Text Detection (OCR).

This example demonstrates how to visualize and verify document layout analysis and text detection using PaddleOCR. It uses PP-Structure, an intelligent document analysis system developed by the PaddleOCR team to help developers with document understanding tasks such as layout analysis and table recognition. In the layout analysis task, the image first goes through the layout analysis model, which divides it into areas such as text, table, and figure; these areas are then analyzed separately. The layout classifications and the text detections (including confidence levels) are visualized in the Rerun viewer. Finally, the recovery text document section presents the restored document in sorted order. Clicking on the restored text highlights the corresponding text area.
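
To illustrate what "sorted order" could mean for the restored document, here is a minimal sketch that orders detected regions top to bottom and then left to right. The `Region` fields, the `sort_reading_order` name, and the `row_tolerance` parameter are assumptions made for illustration; PP-Structure's own recovery logic is not shown in this PR and may differ.

```python
from typing import TypedDict


class Region(TypedDict):
    """Hypothetical shape of one detected layout region."""

    type: str        # e.g. "title", "text", "table", "figure"
    bbox: list[int]  # [x0, y0, x1, y1] in pixels
    text: str


def sort_reading_order(regions: list[Region], row_tolerance: int = 10) -> list[Region]:
    """Order regions top to bottom, then left to right within a row.

    Regions whose top edge lies within `row_tolerance` pixels of the first
    region in the current row are treated as belonging to that row.
    """
    by_top = sorted(regions, key=lambda r: r["bbox"][1])

    rows: list[list[Region]] = []
    for region in by_top:
        if rows and region["bbox"][1] - rows[-1][0]["bbox"][1] <= row_tolerance:
            rows[-1].append(region)
        else:
            rows.append([region])

    ordered: list[Region] = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda r: r["bbox"][0]))
    return ordered
```

A single-column page reduces to a plain sort by the top edge; the row grouping only matters when two regions sit side by side.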

ocr

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!

To run all checks from main, comment on the PR with @rerun-bot full-check.


github-actions bot commented Jun 13, 2024

Deployed docs

Commit Link
581895c https://landing-htcyznait-rerun.vercel.app/docs

@andreasnaoum andreasnaoum added the "examples" and "include in changelog" labels Jun 13, 2024
Member

Wumpf commented Jun 13, 2024

please add a bit of a description and screenshots to the PR so that if someone comes here from the changelog they know what this is about :)

@Wumpf Wumpf requested review from Wumpf June 13, 2024 12:53
Member

Wumpf commented Jun 13, 2024

having issues running the example on mac. pixi run -e examples ocr gives me:

  × error updating pypi prefix
  ├─▶ Failed to download distributions
  ├─▶ Failed to fetch wheel: faiss-cpu==1.7.1.post2
  ├─▶ Failed to build: `faiss-cpu==1.7.1.post2`
  ╰─▶ Build backend failed to build wheel through `build_wheel()` with exit status: 1
      --- stdout:
      running bdist_wheel
      running build
      running build_py
      running build_ext
      building 'faiss._swigfaiss' extension
      swigging faiss/faiss/python/swigfaiss.i to faiss/faiss/python/swigfaiss_wrap.cpp
      swig -python -c++ -Doverride= -I/usr/local/include -Ifaiss -o faiss/faiss/python/swigfaiss_wrap.cpp faiss/faiss/python/swigfaiss.i
      --- stderr:
      error: command 'swig' failed: No such file or directory
      ---

is the example generally not mac compatible?

Member

Wumpf commented Jun 14, 2024

I still can't run it on Mac:

pixi run -e examples-ocr ocr
  × error updating pypi prefix
  ├─▶ Failed to download distributions
  ├─▶ Failed to fetch wheel: scipy==1.13.1
  ├─▶ Failed to build: `scipy==1.13.1`
  ╰─▶ Build backend failed to build wheel through `build_wheel()` with exit status: 1
      --- stdout:
      + meson setup /Users/andreas/Library/Caches/rattler/cache/uv-cache/built-wheels-v3/pypi/scipy/1.13.1/UE5L6cOSKv1YRVrPWGSGv/scipy-1.13.1.tar.gz /Users/andreas/Library/Caches/rattler/cache/uv-cache/built-wheels-v3/pypi/scipy/1.13.1/UE5L6cOSKv1YRVrPWGSGv/scipy-1.13.1.tar.gz/.mesonpy-rd6c6q76 -Dbuildtype=release -Db_ndebug=if-release -Db_vscrt=md --native-file=/Users/andreas/Library/Caches/rattler/cache/uv-cache/built-wheels-v3/pypi/scipy/1.13.1/UE5L6cOSKv1YRVrPWGSGv/scipy-1.13.1.tar.gz/.mesonpy-rd6c6q76/meson-python-native-file.ini
      The Meson build system
      Version: 1.4.1
      Source dir: /Users/andreas/Library/Caches/rattler/cache/uv-cache/built-wheels-v3/pypi/scipy/1.13.1/UE5L6cOSKv1YRVrPWGSGv/scipy-1.13.1.tar.gz
      Build dir: /Users/andreas/Library/Caches/rattler/cache/uv-cache/built-wheels-v3/pypi/scipy/1.13.1/UE5L6cOSKv1YRVrPWGSGv/scipy-1.13.1.tar.gz/.mesonpy-rd6c6q76
      Build type: native build
      Project name: scipy
      Project version: 1.13.1
      C compiler for the host machine: cc (clang 15.0.0 "Apple clang version 15.0.0 (clang-1500.3.9.4)")
      C linker for the host machine: cc ld64 1053.12
      C++ compiler for the host machine: c++ (clang 15.0.0 "Apple clang version 15.0.0 (clang-1500.3.9.4)")
      C++ linker for the host machine: c++ ld64 1053.12
      Cython compiler for the host machine: cython (cython 3.0.10)
      Host machine cpu family: aarch64
      Host machine cpu: aarch64
      Program python found: YES (/Users/andreas/Library/Caches/rattler/cache/uv-cache/.tmpXGEEN3/.venv/bin/python)
      Found pkg-config: YES (/opt/homebrew/bin/pkg-config) 0.29.2
      Run-time dependency python found: YES 3.11
      Program cython found: YES (/Users/andreas/Library/Caches/rattler/cache/uv-cache/.tmpXGEEN3/.venv/bin/cython)
      Compiler for C supports arguments -Wno-unused-but-set-variable: YES
      Compiler for C supports arguments -Wno-unused-function: YES
      Compiler for C supports arguments -Wno-conversion: YES
      Compiler for C supports arguments -Wno-misleading-indentation: YES
      Library m found: YES
      Fortran compiler for the host machine: gfortran (gcc 13.2.0 "GNU Fortran (Homebrew GCC 13.2.0) 13.2.0")
      Fortran linker for the host machine: gfortran ld64 1053.12
      Compiler for Fortran supports arguments -Wno-conversion: YES
      Compiler for C supports link arguments -Wl,-ld_classic: YES
      Checking if "-Wl,--version-script" : links: NO
      Program pythran found: YES 0.15.0 0.15.0 (/Users/andreas/Library/Caches/rattler/cache/uv-cache/.tmpXGEEN3/.venv/bin/pythran)
      Found CMake: /opt/homebrew/bin/cmake (3.28.3)
      WARNING: CMake Toolchain: Failed to determine CMake compilers state
      Run-time dependency xsimd found: NO (tried pkgconfig, framework and cmake)
      Run-time dependency threads found: YES
      Library npymath found: YES
      Library npyrandom found: YES
      pybind11-config found: YES (/Users/andreas/Library/Caches/rattler/cache/uv-cache/.tmpXGEEN3/.venv/bin/pybind11-config) 2.12.0
      Run-time dependency pybind11 found: YES 2.12.0
      Run-time dependency scipy-openblas found: NO (tried pkgconfig)
      Run-time dependency openblas found: NO (tried pkgconfig, framework and cmake)
      Run-time dependency openblas found: NO (tried pkgconfig, framework and cmake)

      ../scipy/meson.build:163:9: ERROR: Dependency "OpenBLAS" not found, tried pkgconfig, framework and cmake

      A full log can be found at /Users/andreas/Library/Caches/rattler/cache/uv-cache/built-wheels-v3/pypi/scipy/1.13.1/UE5L6cOSKv1YRVrPWGSGv/scipy-1.13.1.tar.gz/.mesonpy-rd6c6q76/meson-logs/meson-log.txt
      --- stderr:

      ---

Member

Wumpf commented Jun 14, 2024

after install in a fresh virtual env I get:

python -m ocr --help
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/andreas/dev/test-py-env/rerun/examples/python/ocr/ocr.py", line 19, in <module>
    from paddleocr import PPStructure
  File "/Users/andreas/dev/test-py-env/.venv/lib/python3.12/site-packages/paddleocr/__init__.py", line 14, in <module>
    from .paddleocr import *
  File "/Users/andreas/dev/test-py-env/.venv/lib/python3.12/site-packages/paddleocr/paddleocr.py", line 21, in <module>
    import paddle
  File "/Users/andreas/dev/test-py-env/.venv/lib/python3.12/site-packages/paddle/__init__.py", line 28, in <module>
    from .base import core  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/andreas/dev/test-py-env/.venv/lib/python3.12/site-packages/paddle/base/__init__.py", line 77, in <module>
    from . import dataset
  File "/Users/andreas/dev/test-py-env/.venv/lib/python3.12/site-packages/paddle/base/dataset.py", line 20, in <module>
    from ..utils import deprecated
  File "/Users/andreas/dev/test-py-env/.venv/lib/python3.12/site-packages/paddle/utils/__init__.py", line 16, in <module>
    from . import (  # noqa: F401
  File "/Users/andreas/dev/test-py-env/.venv/lib/python3.12/site-packages/paddle/utils/cpp_extension/__init__.py", line 15, in <module>
    from .cpp_extension import (
  File "/Users/andreas/dev/test-py-env/.venv/lib/python3.12/site-packages/paddle/utils/cpp_extension/cpp_extension.py", line 21, in <module>
    import setuptools
ModuleNotFoundError: No module named 'setuptools'

installing setuptools makes it work. So that should be enough to get this workflow fixed.
pixi setup issues seem to be more complex
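
One way to surface a missing dependency like `setuptools` before the deep import traceback above is a small pre-flight check. This is only a sketch under stated assumptions: `missing_modules` and `require_modules` are hypothetical helpers, not part of the example or of PaddleOCR, and the suggested `pip install` hint assumes the import name matches the package name.

```python
import importlib.util


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def require_modules(names: list[str]) -> None:
    """Exit with an actionable hint instead of a long import traceback."""
    missing = missing_modules(names)
    if missing:
        # Assumption: the import name equals the installable package name,
        # which holds for setuptools but not for every package.
        raise SystemExit(
            "Missing required packages: " + ", ".join(missing)
            + ". Try: pip install " + " ".join(missing)
        )
```

Calling `require_modules(["setuptools"])` before `from paddleocr import PPStructure` would turn the failure into a one-line message.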

Member

@Wumpf Wumpf left a comment

I reckon this example is one of those that only runs on a single core for an eternity if you don't have an nvidia gpu? Takes ages to do anything on my Mac. Needs a note in the description, probably even in the --help text.
As of writing it has been running for 10min without any response (wtf, it's ocr, we used to do this in the 70s -.-), would it be possible to show some progress bar on the model execution? At least there should be a message when it starts, right now I'm not sure if the process is actually just stuck
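
A lightweight way to show that the process is not stuck is to print a message when a slow step starts and another when it ends. The sketch below assumes a hypothetical `announce` context manager and an `[ocr]` message prefix; neither is taken from the example's code, and it reports elapsed time rather than a true progress bar.

```python
import contextlib
import sys
import time
from typing import Iterator, TextIO


@contextlib.contextmanager
def announce(step: str, stream: TextIO = sys.stderr) -> Iterator[None]:
    """Print a start message and the elapsed time for a long-running step."""
    print(f"[ocr] {step} ...", file=stream, flush=True)
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the wrapped step raises, so the user sees how long it took.
        elapsed = time.perf_counter() - start
        print(f"[ocr] {step} finished in {elapsed:.1f}s", file=stream, flush=True)
```

Wrapping the model call, for example `with announce("running layout analysis"):`, would at least confirm that execution has started.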



def main() -> None:
    parser = argparse.ArgumentParser(description="OCR Example - Layout Analysis and Text Detections")
Member

maybe add a note that this will automatically download the model 🤔
which btw. in my case took ages since it downloaded it with less than 100kib/s 😞
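
The download note could live in the `--help` output itself, for instance as an `epilog` on the parser. This is a sketch of that idea: the `build_parser` name and the wording of the note are assumptions, and only the `description` string comes from the diff above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the argument parser with a note about the automatic model download."""
    return argparse.ArgumentParser(
        description="OCR Example - Layout Analysis and Text Detections",
        # Hypothetical wording; shown at the bottom of `--help`.
        epilog=(
            "Note: the models are downloaded automatically on the first run, "
            "which may take a while on a slow connection."
        ),
    )
```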

Member

Wumpf commented Jun 14, 2024

it hasn't done anything after 20min runtime on my M1max macbook. We probably should just earmark it as not supported on Mac..?

Contributor Author

andreasnaoum commented Jun 14, 2024

I'm also using it on an M1 MacBook and the maximum wait time was 30 seconds. I will try building it in a new environment to check the time @Wumpf

andreasnaoum and others added 2 commits June 14, 2024 11:47
Run guidelines

Co-authored-by: Andreas Reich <andreas@rerun.io>
Co-authored-by: Andreas Reich <andreas@rerun.io>
@andreasnaoum
Contributor Author

Do you have any suggestions for the name?

The description is this:
This example visualizes layout analysis and text detection of documents.

I wanted to keep a simple name; that's why I set it to OCR, but we can change it.

Member

Wumpf commented Jun 14, 2024

paddle_ocr/PaddleOCR would make more sense than just OCR I reckon, since this is what the example is about, right?

Member

Wumpf commented Jun 14, 2024

pixi environment fix confirmed

Member

Wumpf commented Jun 14, 2024

demo now also works from pixi

@andreasnaoum
Contributor Author

Name changed to PaddleOCR @Wumpf

Member

@Wumpf Wumpf left a comment

left one more question / thing that needs fixing, but otherwise I believe we're good to go!

@andreasnaoum
Contributor Author

Readme looks fine to me @Wumpf

@Wumpf Wumpf merged commit 0179f79 into main Jun 14, 2024
34 checks passed
@Wumpf Wumpf deleted the andreasnaoum-ocr-example branch June 14, 2024 13:54