0.5.6
- Fix problem with PDF partition (duplicated test)
Enhancements
- contains_english_word(), used heavily in text processing, is 10x faster.
Features
- Add --metadata-include and --metadata-exclude parameters to unstructured-ingest
- Add clean_non_ascii_chars to remove non-ASCII characters from a Unicode string
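
As a rough illustration of what removing non-ASCII characters from a Unicode string can look like, here is a minimal standard-library sketch. The helper name strip_non_ascii is hypothetical and is not the library's function; the encode/decode approach is an assumption about one reasonable way to do it, not a statement of how clean_non_ascii_chars is implemented.

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character outside the ASCII range (code points above 127).

    Hypothetical helper for illustration only; not part of the library.
    """
    # errors="ignore" discards any character that cannot be encoded as ASCII.
    return text.encode("ascii", errors="ignore").decode("ascii")


if __name__ == "__main__":
    sample = "\x88This text contains non-ascii characters!\x88"
    print(strip_non_ascii(sample))
```

Note that this approach deletes characters rather than transliterating them, so accented letters are dropped, not replaced by their unaccented forms.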
Fixes
- Fix duplicated elements issue with partition_pdf(..., strategy="fast")