add additional bad examples

johnathanchiu · Oct 7, 2024 · c157dd5
1 parent ed620b8
Showing 2 changed files with 7 additions and 3 deletions.
diff --git a/README.md b/README.md
@@ -31,10 +31,8 @@ img.show()
 
 ## Examples
 
-<p>
 <img src="https://github.com/johnathanchiu/recursive-segmentation/blob/main/examples/outputs/apple_output.jpg" alt="Image 1" width="400"/> 
 <img src="https://github.com/johnathanchiu/recursive-segmentation/blob/main/examples/outputs/dell_output.jpg" alt="Image 2" width="400"/>
-</p>
 
 See `main.py` or `ex.ipynb` for examples on how to draw the images.
 
@@ -50,12 +48,18 @@ pip install -r requirements.txt
 
 This algorithm works particularly well on documents that contain many diagrams and are well spaced. It performs poorly on documents that are purely text-based, though there is usually no need to segment such documents; they can be passed into RAG directly. It could be interesting to detect these pages and skip the segmentation step entirely.
 
-At the moment, I am looking to build out an ML model to determine when to split chunks in the page. The main principle would be to train a seq2seq model that outputs a binary sequence. The sequence input is the slices of the image and the output is a binary sequence where a 1 represents a split in the image and 0 otherwise.
+At the moment, I am looking to build out an ML model to determine when to split chunks in the page. The main principle would be to train a seq2seq model that outputs a binary sequence. The sequence input is the slices of the image and the output is a binary sequence where a 1 represents a split in the image and 0 otherwise. Basic training code setup can be found on my other [branch](https://github.com/johnathanchiu/recursive-segmentation/tree/jchiu/model-training-code/model).
 
 ### Limitations
 
 As with any bounding-box segmentation algorithm, the main limitation is the shape of the segmentation. Edge cases arise when the content of the input image is not laid out in a grid-like shape. For example, if an image contains an "L"-shaped object, a rectangular bounding box cannot isolate that object. If anyone has ideas on how to improve this, please feel free to suggest them!
 
+For largely text-based PDFs, the results can look like this.
+
+<img src="https://github.com/johnathanchiu/recursive-segmentation/blob/main/examples/outputs/somato_output.jpg" alt="Image 3"/>
+
+I'm still looking for a solution, so feel free to suggest one if you have ideas.
+
 ## Contributing
 
 Feel free to contribute to this repository through Pull Requests and Issues. Reach out to me if you have any ideas surrounding this that you want to discuss!
diff --git a/examples/outputs/somato_output.jpg b/examples/outputs/somato_output.jpg
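
The README text in this commit describes the planned model output as a binary sequence over image slices, where a 1 marks a split and 0 otherwise. The following is a minimal sketch of that target encoding only, not the training code on the linked branch. The function names `split_labels` and `labels_to_segments` are hypothetical, and the convention that a 1 at index `i` means "cut immediately before slice `i`" is an assumption made for illustration.

```python
from typing import List, Sequence, Tuple


def split_labels(num_slices: int, split_positions: Sequence[int]) -> List[int]:
    """Build a 0/1 target sequence with one entry per image slice.

    Assumed convention (not from the repository): a 1 at index i means
    the page is cut immediately before slice i.
    """
    labels = [0] * num_slices
    for pos in split_positions:
        if not 0 <= pos < num_slices:
            raise ValueError(f"split position {pos} is outside 0..{num_slices - 1}")
        labels[pos] = 1
    return labels


def labels_to_segments(labels: Sequence[int]) -> List[Tuple[int, int]]:
    """Convert a binary split sequence back into half-open (start, end) slice ranges."""
    segments = []
    start = 0
    for i, flag in enumerate(labels):
        if flag == 1:
            # Close the current segment, unless it would be empty (split at index 0).
            if i > start:
                segments.append((start, i))
            start = i
    # Everything after the last split forms the final segment.
    if start < len(labels):
        segments.append((start, len(labels)))
    return segments


if __name__ == "__main__":
    labels = split_labels(8, [3, 6])
    print(labels)                      # [0, 0, 0, 1, 0, 0, 1, 0]
    print(labels_to_segments(labels))  # [(0, 3), (3, 6), (6, 8)]
```

Under this assumed convention, a page with no predicted splits decodes to a single segment covering all slices, which matches the README's idea of leaving purely text-based pages unsegmented.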