Segmenting and saving each class as image to allow OCR #9
@elnazsn1988 You can try the code below. It extracts the bounding boxes, then crops the image based on each bounding box and saves the crop.

```python
import numpy as np
from detectron2.engine.defaults import DefaultPredictor
from detectron2.data.detection_utils import read_image
from detectron2.structures import Boxes
from PIL import Image

classes = ['text', 'title', 'list', 'table', 'figure']

default_predictor = DefaultPredictor(cfg)
img = read_image(path_to_image, format="BGR")
predictions = default_predictor(img)
instances = predictions["instances"].to('cpu')

pred_classes = instances.pred_classes
labels = [classes[i] for i in pred_classes]

boxes = instances.pred_boxes
if isinstance(boxes, Boxes):
    boxes = boxes.tensor.numpy()
else:
    boxes = np.asarray(boxes)

img = Image.fromarray(img)
for label, bbox in zip(labels, boxes):
    if label == "table":
        cropped_img = img.crop(bbox)
        cropped_img.save(f"{label}_{bbox}.png")
```
 |
@hpanwar08 thanks for the above, it throws an error because `img` is extracted as a NumPy object. |
@elnazsn1988 I have updated the code; it should work now. |
@hpanwar08 Where do I add the code? In which file? |
@akshay94950 You can add this code to a new Python file and run it. |
@hpanwar08 NameError: name 'croppped_img' is not defined |
Can you show the entire code? |
```python
import argparse
from predictor import VisualizationDemo

# constants
WINDOW_NAME = "COCO detections"

pred_classes = instances.pred_classes
for label, bbox in zip(labels, boxes):
```
 |
It was because of a spelling mistake (`croppped_img`) in the above code. Thanks @hpanwar08, but now it shows this error: |
Thanks, it's working @hpanwar08 |
Hi @hpanwar08, the code's working great, thanks. Do you happen to know how I can annotate a new image and retrain the existing weights on it? I've been trying to download the full dataset from: |
I did not understand your question. If you want to annotate new images then you can have a look at https://github.com/wkentaro/labelme |
@hpanwar08 is there a way to add a new annotated image and train your pretrained model on it without retraining the whole thing? As in use your weights from the trained model and retrain on a new image set. |
Yes, you can annotate your custom dataset, save it in COCO format, and train using train_net_dla.py |
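For reference, the COCO format mentioned above is a single JSON file with `images`, `annotations`, and `categories` lists. A minimal sketch of one annotated page (all ids, file names, and coordinates are illustrative placeholders, not values from this repo):

```python
import json

# Minimal COCO-style annotation file for one page containing one "table"
# region. Ids, file names, and coordinates are placeholders.
coco = {
    "images": [
        {"id": 1, "file_name": "page_0001.png", "width": 1275, "height": 1650}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 4,              # "table" in this class list
            "bbox": [100, 200, 400, 300],  # [x, y, width, height]
            "area": 400 * 300,
            "iscrowd": 0,
        }
    ],
    "categories": [
        {"id": 1, "name": "text"},
        {"id": 2, "name": "title"},
        {"id": 3, "name": "list"},
        {"id": 4, "name": "table"},
        {"id": 5, "name": "figure"},
    ],
}

with open("custom_train.json", "w") as f:
    json.dump(coco, f)
```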
Can we obtain the prediction output image without the figure class, for use in OCR? Or how can we use it for text extraction? |
One follow-up question: how do I extract the confidence score for each of the boxes? |
You could directly crop the images with the bounding boxes. |
Thank you! |
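On the confidence-score question: in detectron2 the scores sit on the same `Instances` object as the boxes and classes (`instances.scores`), so they can be pulled out alongside the labels and used to drop low-confidence boxes. A minimal sketch of the filtering step using dummy arrays (the 0.5 threshold is an arbitrary choice, not a value from this repo):

```python
import numpy as np

def filter_by_score(boxes, labels, scores, threshold=0.5):
    """Keep only detections whose confidence meets the threshold."""
    keep = scores >= threshold
    kept_labels = [label for label, k in zip(labels, keep) if k]
    return boxes[keep], kept_labels, scores[keep]

# In the detectron2 code above, the real values would come from the predictor:
#   scores = instances.scores.numpy()
boxes = np.array([[0, 0, 10, 10], [5, 5, 20, 20], [2, 2, 8, 8]])
labels = ["text", "figure", "table"]
scores = np.array([0.92, 0.31, 0.77])

boxes, labels, scores = filter_by_score(boxes, labels, scores)
```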
@hpanwar08 How do I get the images saved by the above code in order, so that they can later be used for OCR with Tesseract? |
These images will be saved in the same directory in which you run the code. |
I mean the images are not in order. I want to use them for OCR. How do I sort them into the order of the original input?
|
What do you want to sort on? |
The images are saved in the order of the predictions, but if I use them in that order I can't extract meaningful text from the input image. I want all prediction classes except figures, and I want to use them for a text-to-speech module.
|
Exclude the figure label when iterating the labels: `if label != "figure": ...` |
I had done that and got the images, but they are not in the order of the input. I want to use the images for text extraction, and since they're not in order, the text I'm extracting is jumbled and not meaningful.
|
Now I got what you are saying. You may need to sort the segments based on the bounding box location, but it will depend on the page layout: some images may have a two-column layout, some a single column, some a mixed layout. You need to write your logic based on these constraints. |
Yeah... that's the problem. Can you help with something regarding this? Thanks for the help you've shown till now.
|
One solution could be to write a classifier that identifies the type of page layout, e.g. single column, two column, etc., and then write the sorting logic based on that type, e.g. sorting on x or y. |
Can you please help me write the logic for a single-column image? |
Try this:

```python
import numpy as np
from detectron2.engine.defaults import DefaultPredictor
from detectron2.data.detection_utils import read_image
from detectron2.structures import Boxes
from PIL import Image

classes = ['text', 'title', 'list', 'table', 'figure']

default_predictor = DefaultPredictor(cfg)
img = read_image(path_to_image, format="BGR")
predictions = default_predictor(img)
instances = predictions["instances"].to('cpu')

pred_classes = instances.pred_classes
labels = [classes[i] for i in pred_classes]

boxes = instances.pred_boxes
if isinstance(boxes, Boxes):
    boxes = boxes.tensor.numpy()
else:
    boxes = np.asarray(boxes)

img = Image.fromarray(img)

# bbox = [xmin, ymin, xmax, ymax]
# sort the boxes and labels by ymin so crops come out top-to-bottom
sorted_by_bbox_ymin = sorted(zip(boxes, labels), key=lambda x: x[0][1])
boxes, labels = list(zip(*sorted_by_bbox_ymin))
boxes = list(boxes)
labels = list(labels)

for idx, (label, bbox) in enumerate(zip(labels, boxes)):
    if label == "text":
        cropped_img = img.crop(bbox)
        cropped_img.save(f"{idx}_{label}_{bbox}.png")
```
 |
Fixed the code, should work now. |
Yooo...thanks :) |
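The snippet above sorts by `ymin` only, which is enough for a single-column page. For the two-column case discussed earlier, one possible sketch (assuming boxes in `[xmin, ymin, xmax, ymax]` format and a known page width, both hypothetical here): assign each box to a column by its horizontal centre, then read the left column top-to-bottom before the right one.

```python
def two_column_reading_order(boxes, labels, page_width):
    """Sort detections into left-column-then-right-column reading order.

    boxes  : list of [xmin, ymin, xmax, ymax]
    labels : class name per box
    """
    mid = page_width / 2
    left, right = [], []
    for box, label in zip(boxes, labels):
        centre_x = (box[0] + box[2]) / 2
        (left if centre_x < mid else right).append((box, label))
    # Within each column, read top to bottom (sort by ymin).
    ordered = (sorted(left, key=lambda item: item[0][1])
               + sorted(right, key=lambda item: item[0][1]))
    return [label for _, label in ordered], [box for box, _ in ordered]

# Toy page: two boxes per column, given out of order.
boxes = [[600, 50, 1000, 200],   # right column, top
         [50, 300, 400, 450],    # left column, bottom
         [50, 50, 400, 200],     # left column, top
         [600, 300, 1000, 450]]  # right column, bottom
labels = ["title", "text", "text", "table"]
ordered_labels, ordered_boxes = two_column_reading_order(boxes, labels, page_width=1100)
```

Mixed layouts would still need extra handling, e.g. full-width titles that span both columns.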
Thanks @hpanwar08, this is working nicely on some of the images. Did you use the complete PubLayNet dataset for training? If not, I'd like to train your model on the remaining data; can you specify which portion of the dataset you didn't use? |
Hi, I am wondering if you have an example Python script which I could run to test the pretrained Detectron model? That would be a huge help. |
There is a command written in the README.md which you can try. |
I was more looking for something like this which I could also manipulate afterwards. I am working in JupyterLab. Do you also have a simple example with such a structure, but also including the setup of `cfg` and all the required libraries?
|
You could use the above code, it should get the predictions for you. You need to install detectron2 first and copy the cfg from this repo to your installation. |
I used almost half of the PubLayNet data; it is mentioned in the README file. |
I already installed detectron2, but if I just run the code above as-is, it says the library is not imported and `cfg` is not defined. I guess something is missing, like `import argparse` and `cfg = get_cfg()`. |
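For context, the missing setup looks roughly like the sketch below, assuming detectron2 is installed. The config and weights paths are placeholders for the config file copied from this repo and the downloaded pretrained weights, not actual file names:

```python
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
# Placeholder paths: substitute the config file from this repo and the
# pretrained weights you downloaded.
cfg.merge_from_file("path/to/config.yaml")
cfg.MODEL.WEIGHTS = "path/to/model_weights.pth"
cfg.MODEL.DEVICE = "cpu"  # or "cuda" if a GPU is available

predictor = DefaultPredictor(cfg)
```

With `cfg` and `predictor` defined like this, the prediction-and-crop code earlier in the thread should run as written.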
🚀 Feature
Hi, is there an internal feature which lets each class be saved as a separate segment or image? I am trying to identify tables, separate them, and then run them through a tabular data analyzer and OCR. So far I am able to get the image predictions with your code, but not the actual annotations/segmented fields for further analysis/OCR.