0.5.6

@cragwolfe cragwolfe released this 21 Mar 20:42
3c95b97

  • Fix problem with PDF partition (duplicated test)

Enhancements

  • contains_english_word(), used heavily in text processing, is 10x faster.

Features

  • Add --metadata-include and --metadata-exclude parameters to unstructured-ingest
  • Add clean_non_ascii_chars to remove non-ASCII characters from a Unicode string
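
To illustrate what removing non-ASCII characters from a Unicode string can look like, here is a minimal standalone sketch using only the Python standard library. The function name strip_non_ascii is hypothetical and chosen for this example; it is not the library's clean_non_ascii_chars, and the library's actual implementation may differ.

```python
def strip_non_ascii(text: str) -> str:
    """Return `text` with every non-ASCII character dropped.

    Hypothetical stand-in for illustration only; not the library's
    clean_non_ascii_chars.
    """
    # errors="ignore" discards any character that has no ASCII encoding
    # instead of raising UnicodeEncodeError.
    return text.encode("ascii", errors="ignore").decode("ascii")


sample = "Caf\u00e9 \u2022 price: 5\u20ac"  # contains é, a bullet, and a euro sign
print(strip_non_ascii(sample))
```

Note that characters are removed rather than transliterated, so "é" is dropped instead of becoming "e", and whitespace around a removed symbol is left in place.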

Fixes

  • Fix duplicated elements issue with partition_pdf(..., strategy="fast")