Saving with set_content produces empty HTML #300

Open
netw0rkf10w opened this issue Jan 22, 2024 · 3 comments

@netw0rkf10w

Hello,

First of all thank you so much for your great work!

I have been trying to use your library to modify an existing ePub, but for some reason the saved file contains empty HTML:

import os
import argparse
from bs4 import BeautifulSoup
import ebooklib
from ebooklib import epub

def modify_epub(file_name, output):
    book = epub.read_epub(file_name)

    for item in book.get_items():
        if item.get_type() == ebooklib.ITEM_DOCUMENT:
            # soup = BeautifulSoup(item.get_content(), 'html.parser')
            soup = BeautifulSoup(item.get_content(), 'lxml')
            item.set_content(str(soup))

    output = os.path.expanduser(output)
    if os.path.exists(output):
        print(f'Removing existing file before saving: {output}')
        os.remove(output)
    epub.write_epub(output, book)

The problem seems to be in the line item.set_content(str(soup)). Could you please tell me what's wrong?

Thank you so much in advance for your help!

@c1924959470

c1924959470 commented Jan 22, 2024 via email

@atempcode1

In my experience, this comes down to the bytes vs. str distinction in Python: the content is expected to be bytes, not a str. Changing

item.set_content(str(soup))

to

item.set_content(str(soup).encode(encoding='UTF-8'))

works for me.
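
For reference, here is a minimal sketch of the original function with that change applied. The helper name to_utf8_bytes is my own (hypothetical, not part of ebooklib), and I am assuming the rest of the function from the first post is otherwise fine. The ebooklib and bs4 imports sit inside the function only so the helper can be tried on its own.

```python
def to_utf8_bytes(markup):
    """Return markup as UTF-8 bytes; bytes input is passed through unchanged.

    Hypothetical helper for illustration, not part of ebooklib.
    """
    if isinstance(markup, bytes):
        return markup
    # str() also covers a BeautifulSoup object, which serializes to a str.
    return str(markup).encode("utf-8")


def modify_epub(file_name, output):
    # Imported lazily so the helper above works without these packages installed.
    from bs4 import BeautifulSoup
    import ebooklib
    from ebooklib import epub

    book = epub.read_epub(file_name)
    for item in book.get_items():
        if item.get_type() == ebooklib.ITEM_DOCUMENT:
            soup = BeautifulSoup(item.get_content(), "lxml")
            # Pass bytes rather than str, as suggested above.
            item.set_content(to_utf8_bytes(soup))
    epub.write_epub(output, book)


# Quick check of the helper alone: the result is bytes, not str.
print(type(to_utf8_bytes("<p>caf\u00e9</p>")).__name__)
```

I have only seen this work on my own files, so treat it as a starting point rather than a confirmed fix.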

@c1924959470

c1924959470 commented Oct 22, 2024 via email
