NEW FEATURE - Added Kindle-EPUB-Fixer
- Originally developed by [innocenat](https://github.com/innocenat/kindle-epub-fix), this tool corrects the following potential issues in every EPUB processed by CWA:
    - Fixes UTF-8 encoding problems by adding a UTF-8 declaration if no encoding is specified.
    - Fixes the hyperlink problem (which results in Amazon rejecting the EPUB) that occurs when the NCX table of contents links to `<body>` with an ID hash.
    - Detects an invalid or missing language tag in the metadata and prompts the user to select a new language.
    - Removes stray `<img>` tags with no source field.
- This ensures maximum compatibility of each EPUB file with the Amazon Send-to-Kindle service and, for those who don't use Amazon devices, has the side benefit of cleaning up lower-quality files.
- This feature is on by default and can be toggled on and off by the user in the CWA Settings panel.
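
As a rough illustration of two of the fixes listed above, here is a minimal Python sketch. It is not the code added in this commit, nor innocenat's implementation: the function names `ensure_utf8_declaration` and `remove_stray_images` are hypothetical, and the regex-based approach is an assumption made to keep the example short.

```python
import re

# Declaration to prepend when a document does not state its encoding.
XML_DECLARATION = '<?xml version="1.0" encoding="utf-8"?>'


def ensure_utf8_declaration(xhtml: str) -> str:
    """Return the document with a UTF-8 XML declaration if no encoding is declared."""
    stripped = xhtml.lstrip()
    match = re.match(r'<\?xml\s[^>]*\?>', stripped)
    if match and 'encoding' in match.group(0):
        return xhtml  # an encoding is already specified, leave untouched
    if match:
        # A declaration exists but lacks an encoding: drop it before prepending ours.
        stripped = stripped[match.end():].lstrip()
    return XML_DECLARATION + '\n' + stripped


def remove_stray_images(xhtml: str) -> str:
    """Remove <img> tags that carry no src attribute."""
    def keep_or_drop(m: re.Match) -> str:
        tag = m.group(0)
        return tag if re.search(r'\ssrc\s*=', tag, re.IGNORECASE) else ''

    return re.sub(r'<img\b[^>]*>', keep_or_drop, xhtml, flags=re.IGNORECASE)
```

A real fixer would more likely work on the parsed XML of each file inside the EPUB archive; plain string handling is used here only to show the idea.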

Minor Changes:
- All CWA Python scripts now conform to the snake_case naming convention
- Minor refactoring of the ingest_processor script
crocodilestick committed Dec 11, 2024
1 parent 2e110dd commit 1a63a51
Showing 15 changed files with 206 additions and 57 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -23,6 +23,7 @@ TEST
DEV
*/REFERENCE
cwa.db
+*.epub

# Dev files
changelogs/
7 changes: 4 additions & 3 deletions root/app/calibre-web/cps/cwa_functions.py
@@ -55,7 +55,7 @@ def cwa_switch_theme():
@login_required_if_no_ano
def cwa_library_refresh():
flash(_("Library Refresh: Initialising Book Ingest System, please wait..."), category="cwa_refresh")
-result = subprocess.run(['python3', '/app/calibre-web-automated/scripts/ingest-processor.py', '/cwa-book-ingest'])
+result = subprocess.run(['python3', '/app/calibre-web-automated/scripts/ingest_processor.py', '/cwa-book-ingest'])
return_code = result.returncode

# if return_code == 100:
@@ -88,7 +88,8 @@ def set_cwa_settings():
"auto_zip_backups",
"cwa_update_notifications",
"auto_convert",
-"auto_metadata_enforcement"]
+"auto_metadata_enforcement",
+"kindle_epub_fixer"]
string_settings = ["auto_convert_target_format"]
for format in ignorable_formats:
string_settings.append(f"ignore_ingest_{format}")
@@ -233,7 +234,7 @@ def cwa_flash_status():


def flask_logger():
-subprocess.Popen(['python3', '/app/calibre-web-automated/scripts/convert-library.py'])
+subprocess.Popen(['python3', '/app/calibre-web-automated/scripts/convert_library.py'])
if os.path.isfile("/config/convert-library.log") == False:
with open('/config/convert-library.log', 'w') as create_new_log:
pass
10 changes: 10 additions & 0 deletions root/app/calibre-web/cps/templates/cwa_settings.html
@@ -36,6 +36,16 @@ <h4 style="padding-bottom: 14px;">Web UI Settings</h4>
<label for="cwa_update_notifications" style="padding-left: 10px;">Enable CWA Update Notifications</label><br>
<p class="cwa-settings-tooltip">When active, you will no longer receive notifications in the Web UI when a new version of CWA is released</p>

+{% if cwa_settings['kindle_epub_fixer'] %}
+<input type="checkbox" id="kindle_epub_fixer" name="kindle_epub_fixer" value="True" checked style="accent-color: var(--color-secondary);" data-toggle="tooltip" data-placement="right" title="Does NOT require restart for changes to take effect">
+{% else %}
+<input type="checkbox" id="kindle_epub_fixer" name="kindle_epub_fixer" value="True" style="accent-color: var(--color-secondary);" data-toggle="tooltip" data-placement="right" title="Does NOT require restart for changes to take effect">
+{% endif %}
+<label for="kindle_epub_fixer" style="padding-left: 10px;">Enable CWA Kindle EPUB Fixer</label><br>
+<p class="cwa-settings-tooltip">When active, the encoding among other attributes of all EPUB files processed by CWA will be checked and fixed to ensure maximum compatibility with Amazon's Send-to-Kindle Service (TLDR: if you've ever had EPUB files that Amazon just constantly rejects for seemingly no reason, this should prevent that from happening again)</p>
+<p class="cwa-settings-tooltip"><i>This tool was adapted from the kindle-epub-fix.netlify.app tool made by innocenat</i></p>


<h4 style="padding-bottom: 14px;">Automatic Backup Settings</h4>

{% if cwa_settings['auto_backup_imports'] %}
2 changes: 1 addition & 1 deletion root/etc/s6-overlay/s6-rc.d/cwa-auto-library/run
@@ -1,6 +1,6 @@
#!/bin/bash

-python3 /app/calibre-web-automated/scripts/auto-library.py
+python3 /app/calibre-web-automated/scripts/auto_library.py

if [[ $? == 1 ]]
then
2 changes: 1 addition & 1 deletion root/etc/s6-overlay/s6-rc.d/cwa-auto-zipper/run
@@ -40,7 +40,7 @@ do
echo "[cwa-auto-zipper] Next run in $SECS seconds."
sleep $SECS & # We sleep in the background to make the script interruptible via SIGTERM when running in docker
wait $!
-python3 /app/calibre-web-automated/scripts/auto-zip.py
+python3 /app/calibre-web-automated/scripts/auto_zip.py
if [[ $? == 1 ]]
then
echo "[cwa-auto-zipper] Error occurred during script initialisation (see errors above)."
2 changes: 1 addition & 1 deletion root/etc/s6-overlay/s6-rc.d/cwa-ingest-service/run
@@ -13,6 +13,6 @@ echo "[cwa-ingest-service]: Watching folder: $WATCH_FOLDER"
s6-setuidgid abc inotifywait -m -r --format="%e %w%f" -e close_write -e moved_to "$WATCH_FOLDER" |
while read -r events filepath ; do
echo "[cwa-ingest-service]: New files detected - $filepath - Starting Ingest Processor..."
-python3 /app/calibre-web-automated/scripts/ingest-processor.py "$filepath"
+python3 /app/calibre-web-automated/scripts/ingest_processor.py "$filepath"
done

2 changes: 1 addition & 1 deletion root/etc/s6-overlay/s6-rc.d/cwa-init-remove-locks/run
@@ -1,6 +1,6 @@
#!/bin/bash

-declare -a lockFiles=("ingest-processor.lock" "convert-library.lock" "cover_enforcer.lock")
+declare -a lockFiles=("ingest_processor.lock" "convert_library.lock" "cover_enforcer.lock")

echo "[cwa-init-remove-locks] Checking for leftover lock files from previous instance..."

File renamed without changes.
File renamed without changes.
22 changes: 15 additions & 7 deletions scripts/convert-library.py → scripts/convert_library.py
@@ -11,6 +11,7 @@
import atexit

from cwa_db import CWA_DB
+from kindle_epub_fixer import EPUBFixer


logger = logging.getLogger(__name__)
@@ -25,7 +26,7 @@ def print_and_log(string) -> None:
# already running, then the script is closed, the user is notified and the program
# exits with code 2
try:
-lock = open(tempfile.gettempdir() + '/convert-library.lock', 'x')
+lock = open(tempfile.gettempdir() + '/convert_library.lock', 'x')
lock.close()
except FileExistsError:
print_and_log("[convert-library]: CANCELLING... convert-library was initiated but is already running")
@@ -34,7 +35,7 @@ def print_and_log(string) -> None:

# Defining function to delete the lock on script exit
def removeLock():
-os.remove(tempfile.gettempdir() + '/convert-library.lock')
+os.remove(tempfile.gettempdir() + '/convert_library.lock')

# Will automatically run when the script exits
atexit.register(removeLock)
@@ -60,9 +61,10 @@ def __init__(self) -> None: #args
self.cwa_settings = self.db.cwa_settings
self.target_format = self.cwa_settings['auto_convert_target_format']
self.convert_ignored_formats = self.cwa_settings['auto_convert_ignored_formats']
+self.kindle_epub_fixer = self.cwa_settings['kindle_epub_fixer']

-self.supported_book_formats = ['azw', 'azw3', 'azw4', 'cbz', 'cbr', 'cb7', 'cbc', 'chm', 'djvu', 'docx', 'epub', 'fb2', 'fbz', 'html', 'htmlz', 'lit', 'lrf', 'mobi', 'odt', 'pdf', 'prc', 'pdb', 'pml', 'rb', 'rtf', 'snb', 'tcr', 'txt', 'txtz']
-self.hierarchy_of_success = ['epub', 'lit', 'mobi', 'azw', 'azw3', 'fb2', 'fbz', 'azw4', 'prc', 'odt', 'lrf', 'pdb', 'cbz', 'pml', 'rb', 'cbr', 'cb7', 'cbc', 'chm', 'djvu', 'snb', 'tcr', 'pdf', 'docx', 'rtf', 'html', 'htmlz', 'txtz', 'txt']
+self.supported_book_formats = {'azw', 'azw3', 'azw4', 'cbz', 'cbr', 'cb7', 'cbc', 'chm', 'djvu', 'docx', 'epub', 'fb2', 'fbz', 'html', 'htmlz', 'lit', 'lrf', 'mobi', 'odt', 'pdf', 'prc', 'pdb', 'pml', 'rb', 'rtf', 'snb', 'tcr', 'txt', 'txtz'}
+self.hierarchy_of_success = {'epub', 'lit', 'mobi', 'azw', 'azw3', 'fb2', 'fbz', 'azw4', 'prc', 'odt', 'lrf', 'pdb', 'cbz', 'pml', 'rb', 'cbr', 'cb7', 'cbc', 'chm', 'djvu', 'snb', 'tcr', 'pdf', 'docx', 'rtf', 'html', 'htmlz', 'txtz', 'txt'}

self.current_book = 1
self.ingest_folder, self.library_dir, self.tmp_conversion_dir = self.get_dirs('/app/calibre-web-automated/dirs.json')
@@ -128,8 +130,8 @@ def convert_library(self):
continue

if self.target_format == "kepub":
-successful, target_filepath = self.convert_to_kepub(filename, file_extension)
-if not successful:
+convert_successful, target_filepath = self.convert_to_kepub(filename, file_extension)
+if not convert_successful:
print_and_log(f"[convert-library]: Conversion of {os.path.basename(file)} was unsuccessful. See the following error:\n{e}")
self.current_book += 1
continue
@@ -152,7 +154,13 @@ def convert_library(self):
self.current_book += 1
continue

-try: # Import converted book to library. As of V3.0.0, "add_format" is used instead of add
+if self.target_format == "epub" and self.kindle_epub_fixer:
+try:
+EPUBFixer(target_filepath).process()
+except Exception as e:
+print_and_log(f"[convert-library] An error occurred while processing {os.path.basename(target_filepath)} with the kindle-epub-fixer. See the following error:\n{e}")
+
+try: # Import converted book to library. As of V3.0.0, "add_format" is used instead of "add"
subprocess.run(["calibredb", "add_format", book_id, target_filepath, f"--library-path={self.library_dir}"], check=True)

if self.cwa_settings['auto_backup_imports']:
3 changes: 2 additions & 1 deletion scripts/cwa_db.py
@@ -34,7 +34,8 @@ def __init__(self, verbose=False):
"auto_convert_target_format": "epub",
"auto_convert_ignored_formats":"",
"auto_ingest_ignored_formats":"",
-"auto_metadata_enforcement":1}
+"auto_metadata_enforcement":1,
+"kindle_epub_fixer":1}

self.tables, self.schema = self.make_tables()
self.ensure_settings_schema_match()
3 changes: 2 additions & 1 deletion scripts/cwa_schema.sql
@@ -31,5 +31,6 @@ CREATE TABLE IF NOT EXISTS cwa_settings(
auto_convert_target_format TEXT DEFAULT "epub" NOT NULL,
auto_convert_ignored_formats TEXT DEFAULT "" NOT NULL,
auto_ingest_ignored_formats TEXT DEFAULT "" NOT NULL,
-auto_metadata_enforcement SMALLINT DEFAULT 1 NOT NULL
+auto_metadata_enforcement SMALLINT DEFAULT 1 NOT NULL,
+kindle_epub_fixer SMALLINT DEFAULT 1 NOT NULL
);