Hierarchical Topic Parsing #77
Comments
At the moment, the system is detecting only one level of section headers.
New capabilities will come, but for now, the system only recognises titles and the first level of section headers.
Can I know when this cool feature of header hierarchy is released?
Following because very interested
First of all, amazing work and thanks for making this open source!
Issue
I'm interested in obtaining hierarchical topics after converting PDF -> MD. My naive attempt produced a well-extracted document, but with a somewhat flat structure.
I used the out-of-the-box pipeline for this extraction, `DocumentConverter()`.
Is there a way I could obtain hierarchical topics (`#`, `##`, `###`)? What components of the pipeline can I change?
Thanks!
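Until the pipeline itself detects nested headers, one possible workaround is to post-process the exported Markdown. The sketch below is not part of the library: `relevel_headings` and its `base_level` parameter are hypothetical names, and it assumes the section titles in the PDF carry numeric prefixes such as `1`, `1.1`, `1.1.1`, which it uses to guess the depth. Headings without such a prefix are left untouched.

```python
import re

# Matches a Markdown ATX heading: one or more '#', whitespace, then the title.
_HEADING = re.compile(r"^(#+)\s+(.*)$")
# Matches a dotted section number at the start of a title, e.g. "2", "2.1", "2.1.3."
_NUMBERING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+")


def relevel_headings(markdown: str, base_level: int = 2) -> str:
    """Re-assign heading levels from numeric prefixes (hypothetical helper).

    A title numbered "1" gets `base_level` hashes, "1.1" gets one more,
    and so on, capped at 6. Unnumbered headings are kept as they are.
    """
    out = []
    for line in markdown.splitlines():
        heading = _HEADING.match(line)
        if heading:
            title = heading.group(2)
            numbering = _NUMBERING.match(title)
            if numbering:
                # Depth is the number of dot-separated components in the prefix.
                depth = numbering.group(1).count(".") + 1
                level = min(base_level + depth - 1, 6)
                line = "#" * level + " " + title
        out.append(line)
    return "\n".join(out)


flat = "\n".join([
    "# Title",
    "## 1 Intro",
    "## 1.1 Background",
    "## 1.1.1 Details",
    "## Appendix",
])
nested = relevel_headings(flat)
print(nested)
```

This is only a heuristic: a title that happens to start with a number (for example a year) would be re-levelled too, and documents without numbered sections gain nothing from it.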