Python | Machine Learning | Coding

# 📚 Python Tutorial: Convert EPUB to PDF (Preserving Images)
#Python #EPUB #PDF #EbookConversion #Automation

This guide shows how to convert EPUB files, including those with images, to PDF using Python.

---

## 🔹 Required Tools & Libraries
We'll use these Python packages:
- ebooklib - For EPUB parsing
- pdfkit (wrapper for wkhtmltopdf) - For PDF generation
- beautifulsoup4 - For parsing the extracted HTML
- Pillow - For image validation (optional)

pip install ebooklib pdfkit beautifulsoup4 pillow

Also install system dependencies:

# On Ubuntu/Debian
sudo apt-get install wkhtmltopdf

# On macOS
brew install wkhtmltopdf

# On Windows (download from wkhtmltopdf.org)
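
Before running the conversion, it can help to confirm that the binary is actually reachable. The following is a minimal sketch using only the standard library; the helper name `find_tool` is an assumption for illustration and is not part of pdfkit.

```python
import shutil

def find_tool(name="wkhtmltopdf"):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)

if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("wkhtmltopdf not found on PATH; install it before using pdfkit")
    else:
        print(f"wkhtmltopdf found at {path}")
```

If the binary lives outside PATH (common on Windows), pdfkit also accepts an explicit location via `pdfkit.configuration(wkhtmltopdf=...)`, passed to `from_file` as `configuration=`.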

---

## 🔹 Step 1: Extract EPUB Contents
First, we'll unpack the EPUB file to access its HTML and images.

import os

import ebooklib
from ebooklib import epub
from bs4 import BeautifulSoup  # used in Step 2

def extract_epub(epub_path, output_dir):
    book = epub.read_epub(epub_path)

    # Create output directory
    os.makedirs(output_dir, exist_ok=True)

    html_files = []
    # Extract chapters and images, keeping the EPUB's internal folder layout
    for item in book.get_items():
        # The ITEM_* constants live in the top-level ebooklib module, not in ebooklib.epub
        if item.get_type() in (ebooklib.ITEM_IMAGE, ebooklib.ITEM_DOCUMENT):
            target = os.path.join(output_dir, item.get_name())
            # Item names may include subfolders such as "images/cover.jpg"
            os.makedirs(os.path.dirname(target) or '.', exist_ok=True)
            with open(target, 'wb') as f:
                f.write(item.get_content())
            if item.get_type() == ebooklib.ITEM_DOCUMENT:
                html_files.append(item.get_name())

    return html_files
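
Item names come from inside the EPUB, so a malformed file could contain names such as `../../somewhere/else`. Below is a hedged sketch of a guard that could wrap the `os.path.join` calls in the extraction function; `safe_join` is a hypothetical helper, not part of ebooklib.

```python
import os

def safe_join(output_dir, item_name):
    """Join item_name onto output_dir, refusing paths that escape it."""
    base = os.path.abspath(output_dir)
    target = os.path.abspath(os.path.join(base, item_name))
    # commonpath equals base only when target is inside (or equal to) base
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"Unsafe path in EPUB: {item_name!r}")
    return target
```

Raising an error instead of silently skipping the item makes a suspicious EPUB visible to the caller, which can then decide whether to abort the conversion.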

---

## 🔹 Step 2: Convert HTML to PDF
Now we'll convert the extracted HTML files to PDF while preserving images.

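import pdfkit
from PIL import Image  # For image validation (optional)

def html_to_pdf(html_files, output_pdf, base_dir):
    options = {
        'encoding': "UTF-8",
        'quiet': '',
        'enable-local-file-access': '',  # Critical for local images
        'no-outline': None,
        'margin-top': '15mm',
        'margin-right': '15mm',
        'margin-bottom': '15mm',
        'margin-left': '15mm',
    }

    # Validate images (optional)
    for html_file in html_files:
        html_path = os.path.join(base_dir, html_file)
        with open(html_path, encoding='utf-8') as f:
            soup = BeautifulSoup(f, 'html.parser')
        changed = False
        for img in soup.find_all('img'):
            src = img.get('src', '')
            # Pillow cannot open SVG files, so leave those untouched
            if not src or src.lower().endswith('.svg'):
                continue
            # src is relative to the HTML file's folder, not to base_dir
            img_path = os.path.join(os.path.dirname(html_path), src)
            try:
                with Image.open(img_path) as im:
                    im.verify()  # Validate image
            except Exception as e:
                print(f"Image error in {html_file}: {e}")
                img.decompose()  # Remove broken images
                changed = True
        # Write the cleaned HTML back; otherwise the removals never reach wkhtmltopdf
        if changed:
            with open(html_path, 'w', encoding='utf-8') as f:
                f.write(str(soup))

    # Convert to PDF
    pdfkit.from_file(
        [os.path.join(base_dir, f) for f in html_files],
        output_pdf,
        options=options
    )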

---

## 🔹 Step 3: Complete Conversion Function
Combine everything into a single workflow.

def epub_to_pdf(epub_path, output_pdf, temp_dir="temp_epub"):
    try:
        print(f"Converting {epub_path} to PDF...")
        
        # Step 1: Extract EPUB
        print("Extracting EPUB contents...")
        html_files = extract_epub(epub_path, temp_dir)
        
        # Step 2: Convert to PDF
        print("Generating PDF...")
        html_to_pdf(html_files, output_pdf, temp_dir)
        
        print(f"Success! PDF saved to {output_pdf}")
        return True
    
    except Exception as e:
        print(f"Conversion failed: {str(e)}")
        return False
    finally:
        # Clean up temporary files
        if os.path.exists(temp_dir):
            import shutil
            shutil.rmtree(temp_dir)
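
To run the workflow from a terminal, a small command-line wrapper is one option. The sketch below covers only argument handling; `default_output_path` and `parse_args` are hypothetical names introduced for illustration.

```python
import argparse
import os

def default_output_path(epub_path):
    """Derive 'book.pdf' from 'book.epub', keeping the directory."""
    root, _ = os.path.splitext(epub_path)
    return root + ".pdf"

def parse_args(argv=None):
    """Parse the EPUB path and an optional output path from the command line."""
    parser = argparse.ArgumentParser(description="Convert an EPUB file to PDF")
    parser.add_argument("epub_path", help="path to the input .epub file")
    parser.add_argument("-o", "--output",
                        help="output PDF path (default: input name with .pdf)")
    args = parser.parse_args(argv)
    if args.output is None:
        args.output = default_output_path(args.epub_path)
    return args
```

In the script that defines `epub_to_pdf`, the entry point could then be `args = parse_args()` followed by `epub_to_pdf(args.epub_path, args.output)`.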

---

## 🔹 Advanced Options
### 1. Custom Styling
Add CSS to improve PDF appearance:

def html_to_pdf(html_files, output_pdf, base_dir):
    # Write the stylesheet first so wkhtmltopdf can read it
    css = """
    body { font-family: "Times New Roman", serif; font-size: 12pt; }
    img { max-width: 100%; height: auto; }
    """
    css_path = os.path.abspath(os.path.join(base_dir, 'styles.css'))
    with open(css_path, 'w') as f:
        f.write(css)

    options = {
        # ... previous options ...
        'user-style-sheet': css_path,  # Custom CSS, same file that was written above
    }

    pdfkit.from_file(
        [os.path.join(base_dir, f) for f in html_files],
        output_pdf,
        options=options
    )


#PDF #EPUB #TelegramBot #Python #SQLite #Project

Lesson: Building a PDF <> EPUB Telegram Converter Bot

This lesson walks you through creating a fully functional Telegram bot from scratch. The bot will accept PDF or EPUB files, convert them to the other format, and log each transaction in an SQLite database.

---

Part 1: Prerequisites & Setup

First, we need to install the necessary Python library for the Telegram Bot API. We will also rely on Calibre's command-line tools for conversion.

Important: You must install Calibre on the system where the bot will run and ensure its ebook-convert tool is in your system's PATH.

pip install python-telegram-bot==20.3

#Setup #Prerequisites

---

Part 2: Database Initialization

We'll use SQLite to log every successful conversion. Create a file named database_setup.py and run it once to create the database file and the table.

# database_setup.py
import sqlite3

def setup_database():
    conn = sqlite3.connect('conversions.db')
    cursor = conn.cursor()
    
    # Create table to store conversion logs
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS conversions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id INTEGER NOT NULL,
            original_filename TEXT NOT NULL,
            converted_filename TEXT NOT NULL,
            conversion_type TEXT NOT NULL,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
    ''')
    
    conn.commit()
    conn.close()
    print("Database setup complete. 'conversions.db' is ready.")

if __name__ == '__main__':
    setup_database()
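
With the table in place, each successful conversion can be recorded with a parameterized INSERT. This is a minimal sketch, assuming the schema above; the function name `log_conversion` and the example value `"pdf_to_epub"` for `conversion_type` are assumptions, not taken from the lesson.

```python
import sqlite3

def log_conversion(conn, user_id, original_filename, converted_filename, conversion_type):
    """Insert one row into the conversions table and return its id."""
    cursor = conn.execute(
        """
        INSERT INTO conversions
            (user_id, original_filename, converted_filename, conversion_type)
        VALUES (?, ?, ?, ?)
        """,
        # Placeholders keep user-supplied filenames out of the SQL text
        (user_id, original_filename, converted_filename, conversion_type),
    )
    conn.commit()
    return cursor.lastrowid
```

A call might look like `log_conversion(sqlite3.connect('conversions.db'), 42, 'a.pdf', 'a.epub', 'pdf_to_epub')`. The `timestamp` column is filled in by its `DEFAULT CURRENT_TIMESTAMP`.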

#Database #SQLite #Initialization

---

Part 3: The Main Bot Script - Imports & Basic Commands

Now, let's create our main bot file, converter_bot.py. We'll start with imports and the initial /start and /help commands.

# converter_bot.py
import logging
import os
import sqlite3
import subprocess
from telegram import Update
from telegram.ext import Application, CommandHandler, MessageHandler, filters, ContextTypes

# Enable logging
logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', level=logging.INFO)

# --- Bot Token ---
TELEGRAM_TOKEN = "YOUR_TELEGRAM_BOT_TOKEN"

# --- Command Handlers ---
async def start(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    user = update.effective_user
    await update.message.reply_html(
        rf"Hi {user.mention_html()}! Send me a PDF or EPUB file to convert.",
    )

async def help_command(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    await update.message.reply_text("Simply send a .pdf file to get an .epub, or send an .epub file to get a .pdf. Note: Conversion quality depends on the source file's structure.")
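
Hard-coding the token is fine for a first test, but it is easy to leak when the file is shared. One alternative is to read it from the environment. This is a sketch only; the variable name `TELEGRAM_TOKEN` and the helper `load_token` are assumptions.

```python
import os

def load_token(env_var="TELEGRAM_TOKEN"):
    """Read the bot token from the environment; fail loudly if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your bot token")
    return token
```

In converter_bot.py the assignment would then become `TELEGRAM_TOKEN = load_token()`.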

#TelegramBot #Python #Boilerplate

---

Part 4: The Core Conversion Logic

This function will be the heart of our bot. It uses the ebook-convert command-line tool (from Calibre) to perform the conversion. It's crucial that Calibre is installed correctly for this to work.
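
The lesson's own function is not shown here, so the following is only a sketch of how such a step might look. It assumes that `ebook-convert` is on PATH and that the output format is chosen by swapping the file extension; `target_path_for`, `convert_file`, and the 300-second timeout are hypothetical choices.

```python
import os
import shutil
import subprocess

# Maps an input extension to the extension of the converted file
TARGET_EXT = {".pdf": ".epub", ".epub": ".pdf"}

def target_path_for(input_path):
    """Return the output path for a .pdf or .epub input, or None if unsupported."""
    root, ext = os.path.splitext(input_path)
    new_ext = TARGET_EXT.get(ext.lower())
    return root + new_ext if new_ext else None

def convert_file(input_path, timeout=300):
    """Run Calibre's ebook-convert; return the output path, or None on failure."""
    output_path = target_path_for(input_path)
    if output_path is None or shutil.which("ebook-convert") is None:
        return None
    try:
        # ebook-convert infers both formats from the file extensions
        subprocess.run(
            ["ebook-convert", input_path, output_path],
            check=True, capture_output=True, timeout=timeout,
        )
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
        return None
    return output_path if os.path.exists(output_path) else None
```

Because `subprocess.run` blocks, an async handler would probably want to call this through `asyncio.to_thread(convert_file, path)` so the bot stays responsive during long conversions.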
