Getting pixel coordinates as output #593
MrSpaghettiCode asked this question in Q&A (unanswered; 1 comment, 1 reply)
The bounds are available on the iterators. See the console demo here:
https://github.com/charlesw/tesseract-samples
So in your case:
* OCR the page
* Iterate through the results, identifying the words of interest
* Get the region of each match from the iterator (TryGetBoundingBox)
Hope that helps 🙂
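Once a word's bounding box is known, the remaining work is plain arithmetic: offset a rectangle from that box and clamp it to the page. The sketch below shows only that geometry, in Python so it stays self-contained. It does not call the wrapper. `Box`, `find_word` and `region_right_of` are hypothetical names made up for illustration, and the `(text, box)` pairs stand in for whatever the iterator and TryGetBoundingBox return for each word.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Pixel rectangle with a top-left origin (hypothetical stand-in type)."""
    x1: int
    y1: int
    x2: int
    y2: int


def find_word(words: Iterable[Tuple[str, Box]], buzzword: str) -> Optional[Box]:
    """Return the box of the first word matching buzzword (case-insensitive)."""
    target = buzzword.strip().lower()
    for text, box in words:
        if text.strip().lower() == target:
            return box
    return None


def region_right_of(anchor: Box, page_width: int, page_height: int,
                    width: int, pad: int = 4) -> Box:
    """Build a rectangle to the right of anchor, clamped to the page.

    The rectangle starts pad pixels after the anchor's right edge, spans
    the anchor's height plus pad on each side, and is at most width wide.
    """
    x1 = min(anchor.x2 + pad, page_width)
    y1 = max(anchor.y1 - pad, 0)
    x2 = min(x1 + width, page_width)
    y2 = min(anchor.y2 + pad, page_height)
    return Box(x1, y1, x2, y2)


# Example data: (text, box) pairs as a first OCR pass might report them.
words = [
    ("Invoice", Box(10, 10, 80, 30)),
    ("Total:", Box(100, 200, 160, 220)),
]
anchor = find_word(words, "total:")
if anchor is not None:
    roi = region_right_of(anchor, page_width=300, page_height=400, width=200)
    print(roi)
```

The resulting rectangle is what would be handed to a second, region-limited OCR pass. Because the offsets are relative to the found word, the same rule keeps working when the data sits at different positions from document to document.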
On Sat, 22 Jan 2022, 01:14 MrSpaghettiCode wrote:
Hello,
I am using your wrapper to OCR some documents and I was wondering if it is
possible to get the pixel coordinates of found words.
I am trying to read certain data by drawing rectangles around it and
feeding them to Tesseract afterwards, but unfortunately the data coordinates
vary from document to document.
So, what I want to do:
1. Let Tesseract search for a buzzword.
2. Get the coordinates of the found word.
3. Draw a rectangle around a certain area relative to it.
4. Feed the rectangle to Tesseract.
Normally I would just scan the whole thing and extract the data, but since
my scans are super unreliable and there is much unneeded data, I have to do
this rectangle stuff.
Thanks in advance