noah-art3mis/bookbinder

Bookbinder & Antiquarian

Text cleanup.

Bookbinder.py cleans text with regular expressions; Antiquarian.py cleans text with an AI model. openai_batch_api/batch.ipynb runs the AI cleanup in batches and prepares fine-tuning files.
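
To give a sense of what regex-based cleanup can look like, here is a minimal sketch. It is not taken from Bookbinder.py: the rule list, the `RULES` name, and the `clean_text` function are hypothetical, and the actual patterns have to be chosen per project.

```python
import re

# Hypothetical cleanup rules, applied in order.
# These are illustrative assumptions, not the patterns used in Bookbinder.py.
RULES = [
    # Rejoin words that were hyphenated across a line break.
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),
    # Strip trailing spaces and tabs at the end of a line.
    (re.compile(r"[ \t]+\n"), "\n"),
    # Unwrap single line breaks inside a paragraph.
    (re.compile(r"(?<!\n)\n(?!\n)"), " "),
    # Collapse runs of three or more newlines into one paragraph break.
    (re.compile(r"\n{3,}"), "\n\n"),
    # Collapse repeated spaces and tabs into a single space.
    (re.compile(r"[ \t]{2,}"), " "),
]


def clean_text(text: str) -> str:
    """Apply every rule in RULES to the text, in order."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text.strip()


if __name__ == "__main__":
    sample = "The book-\nbinder  repairs   \nold pages.\n\n\n\nNext paragraph."
    print(clean_text(sample))
```

The order of the rules matters: hyphenated words are rejoined before line breaks are unwrapped. The hyphen rule also joins genuine compounds that happen to break at the hyphen, which is one reason rules like these need manual review for each book.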

How to

Every project has different requirements, so the process has to be adapted by hand each time:

  1. Edit the config files
  2. Edit the Python scripts
  3. Run the scripts

TODO

  • Fix cost estimation so that it covers only the file that was generated.
  • Fix the batch notebook so that it works when a batch job uses several files.
  • Refactor.
  • Add segmentation.
  • Add deferral to save on compute.

About

Tools for making bad ebooks readable.
