Data Science Jupyter Notebooks
Explore the world of Data Science through Jupyter Notebooks—insights, tutorials, and tools to boost your data journey. Code, analyze, and visualize smarter with every post.
Topic: Python Script to Convert a Shared ChatGPT Link to PDF – Step-by-Step Guide

---

### Objective

In this lesson, we’ll build a Python script that:

• Takes a ChatGPT share link (e.g., https://chat.openai.com/share/abc123)
• Downloads the HTML content of the chat
• Converts it to a PDF file using pdfkit and wkhtmltopdf

This is useful for archiving, sharing, or printing ChatGPT conversations in a clean format.

---

### 1. Prerequisites

Before starting, you need the following libraries and tools:

#### • Install pdfkit and requests

pip install pdfkit requests


#### • Install wkhtmltopdf

Download from:
[https://wkhtmltopdf.org/downloads.html](https://wkhtmltopdf.org/downloads.html)

Add the directory that contains the installed wkhtmltopdf binary to your system PATH so that pdfkit can find it.
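
Before running the main script, it can help to confirm that the binary is actually visible on PATH. The sketch below is one way to check, using only the standard library; the helper names `find_wkhtmltopdf` and `make_pdfkit_config` are made up for this example. If the binary lives outside PATH, `pdfkit.configuration(wkhtmltopdf=...)` can point pdfkit at it explicitly.

```python
import shutil


def find_wkhtmltopdf(binary_name="wkhtmltopdf"):
    """Return the full path of the binary if it is on PATH, else None."""
    return shutil.which(binary_name)


def make_pdfkit_config(binary_path):
    """Build a pdfkit configuration that points at an explicit binary path."""
    import pdfkit  # imported here so the PATH check works without pdfkit

    return pdfkit.configuration(wkhtmltopdf=binary_path)


if __name__ == "__main__":
    path = find_wkhtmltopdf()
    if path is None:
        print("wkhtmltopdf was not found on PATH")
    else:
        print(f"wkhtmltopdf found at: {path}")
```

The object returned by `make_pdfkit_config` can be passed to `pdfkit.from_file(..., configuration=config)` when the PATH lookup fails.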

---

### 2. Python Script: Convert Shared ChatGPT URL to PDF

import os
import sys

import pdfkit
import requests

# Define output filename and the temporary HTML file
output_file = "chatgpt_conversation.pdf"
temp_file = "temp_chat.html"

# ChatGPT shared URL (user input)
chat_url = input("Enter the ChatGPT share URL: ").strip()

# Verify the URL format
if not chat_url.startswith("https://chat.openai.com/share/"):
    print("Invalid URL. Must start with https://chat.openai.com/share/")
    sys.exit(1)

try:
    # Download HTML content (the timeout keeps the script from hanging)
    response = requests.get(chat_url, timeout=30)
    if response.status_code != 200:
        raise RuntimeError(f"Failed to load the chat: {response.status_code}")

    html_content = response.text

    # Save HTML to temporary file
    with open(temp_file, "w", encoding="utf-8") as f:
        f.write(html_content)

    # Convert HTML to PDF
    pdfkit.from_file(temp_file, output_file)

    print(f"\nPDF saved as: {output_file}")

except Exception as e:
    print(f"Error: {e}")

finally:
    # Remove the temp file, even if the download or conversion failed
    if os.path.exists(temp_file):
        os.remove(temp_file)
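
The prefix check in the script accepts any string that starts with the share prefix, including one with no conversation ID after it. A slightly stricter check could parse the URL first. The following is a minimal sketch of that idea; the function name `is_share_url` and the rule "exactly one path segment after /share/" are assumptions made for illustration, not something the share format documents.

```python
from urllib.parse import urlparse

# Host taken from the prefix used in the script above
ALLOWED_HOST = "chat.openai.com"


def is_share_url(url):
    """Return True for URLs shaped like https://chat.openai.com/share/<id>."""
    parts = urlparse(url.strip())
    if parts.scheme != "https" or parts.netloc != ALLOWED_HOST:
        return False
    # Split the path into non-empty segments, e.g. ["share", "abc123"]
    segments = [s for s in parts.path.split("/") if s]
    return len(segments) == 2 and segments[0] == "share"
```

In the script, `if not is_share_url(chat_url):` could then replace the `startswith` test.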


---

### 3. Notes

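• This approach works only if the shared page is publicly accessible without a login, which ChatGPT share links are meant to be. Any response other than HTTP 200 makes the script stop with an error.
• The PDF reproduces the page as wkhtmltopdf renders it, including the site's theme and layout.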
• You can customize the PDF output using pdfkit options (like page size, margins, etc.).
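
As an illustration of the last note, pdfkit accepts an `options` dictionary whose keys mirror wkhtmltopdf command-line flags. The sketch below sets the page size, margins, and encoding; the helper names `build_pdf_options` and `convert_with_options` and the default values are assumptions chosen for the example.

```python
def build_pdf_options(page_size="A4", margin="15mm"):
    """Return a pdfkit options dict with a page size and uniform margins."""
    return {
        "page-size": page_size,
        "margin-top": margin,
        "margin-right": margin,
        "margin-bottom": margin,
        "margin-left": margin,
        "encoding": "UTF-8",
    }


def convert_with_options(html_path, pdf_path, options):
    """Convert an HTML file to PDF using the given options."""
    import pdfkit  # imported here so the options helper works without pdfkit

    pdfkit.from_file(html_path, pdf_path, options=options)
```

For example, `convert_with_options("temp_chat.html", "chatgpt_conversation.pdf", build_pdf_options("Letter", "10mm"))` would produce a Letter-sized PDF with 10 mm margins.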

---

### 4. Optional Enhancements

• Add GUI with Tkinter
• Accept multiple URLs
• Add PDF metadata (title, author, etc.)
• Clean the downloaded HTML with BeautifulSoup before rendering it offline
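
For the "accept multiple URLs" idea, each conversation needs its own output file so that later PDFs do not overwrite earlier ones. One possible approach, sketched below, derives the filename from the last path segment of the share URL. The helper names `output_name_for` and `plan_batch` and the `chatgpt_<id>.pdf` naming scheme are assumptions for this example.

```python
from urllib.parse import urlparse


def output_name_for(url):
    """Derive a PDF filename from the last path segment of a share URL."""
    share_id = urlparse(url).path.rstrip("/").split("/")[-1]
    return f"chatgpt_{share_id}.pdf"


def plan_batch(urls):
    """Return (url, output_file) pairs, skipping blanks and duplicates."""
    plan = []
    seen = set()
    for url in urls:
        url = url.strip()
        if not url or url in seen:
            continue
        seen.add(url)
        plan.append((url, output_name_for(url)))
    return plan
```

The download-and-convert steps from the main script can then run once per pair in the returned list.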

---

### Exercise

• Try converting multiple ChatGPT share links to PDF
• Customize the styling with your own CSS
• Add a timestamp or watermark to the PDF
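
As a starting point for the styling and timestamp exercises, pdfkit can apply an external stylesheet through its `css` argument, and wkhtmltopdf can print text in the page footer through options such as `footer-right`. The sketch below combines the two. The helper names, the "Exported ..." wording, and the stylesheet path are assumptions, and footer support may depend on which wkhtmltopdf build is installed.

```python
from datetime import datetime


def footer_options(now=None):
    """Return pdfkit options that print an export timestamp in the footer."""
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%d %H:%M")
    return {
        "footer-right": f"Exported {stamp}",
        "footer-font-size": "8",
    }


def convert_styled(html_path, pdf_path, css_path, options):
    """Convert an HTML file to PDF with a custom stylesheet and options."""
    import pdfkit  # imported here so the options helper works without pdfkit

    pdfkit.from_file(html_path, pdf_path, options=options, css=css_path)
```

A call such as `convert_styled("temp_chat.html", "chatgpt_conversation.pdf", "custom.css", footer_options())` would apply both, where `custom.css` is a hypothetical stylesheet you supply.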

---

#Python #ChatGPT #PDF #WebScraping #Automation #pdfkit #tkinter

https://t.iss.one/CodeProgrammer
🔥 Trending Repository: Stirling-PDF

📝 Description: #1 Locally hosted web application that allows you to perform various operations on PDF files

🔗 Repository URL: https://github.com/Stirling-Tools/Stirling-PDF

🌐 Website: https://stirlingpdf.com

📖 Readme: https://github.com/Stirling-Tools/Stirling-PDF#readme

📊 Statistics:
🌟 Stars: 65.1K stars
👀 Watchers: 202
🍴 Forks: 5.6K forks

💻 Programming Languages: Java - HTML - JavaScript - CSS - Rich Text Format - Shell

🏷️ Related Topics:
#java #docker #pdf #pdf_converter #pdf_manipulation #pdfmerger #pdf_merger #pdf_tools #pdf_editor #pdf_web_apps #pdf_ocr


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: markitdown

📝 Description: Python tool for converting files and office documents to Markdown.

🔗 Repository URL: https://github.com/microsoft/markitdown

📖 Readme: https://github.com/microsoft/markitdown#readme

📊 Statistics:
🌟 Stars: 74K stars
👀 Watchers: 255
🍴 Forks: 4.1K forks

💻 Programming Languages: Python - Dockerfile

🏷️ Related Topics:
#markdown #pdf #openai #microsoft_office #autogen #langchain #autogen_extension


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: Dolphin

📝 Description: The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

🔗 Repository URL: https://github.com/bytedance/Dolphin

📖 Readme: https://github.com/bytedance/Dolphin#readme

📊 Statistics:
🌟 Stars: 6.3K stars
👀 Watchers: 53
🍴 Forks: 516 forks

💻 Programming Languages: Python - Shell

🏷️ Related Topics:
#python #pdf #parser #ocr #pdf_converter #document_analysis #pdf_parser #layout_analysis #vlm_ocr


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: siyuan

📝 Description: A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.

🔗 Repository URL: https://github.com/siyuan-note/siyuan

🌐 Website: https://b3log.org/siyuan

📖 Readme: https://github.com/siyuan-note/siyuan#readme

📊 Statistics:
🌟 Stars: 37.6K stars
👀 Watchers: 159
🍴 Forks: 2.3K forks

💻 Programming Languages: TypeScript - Go - JavaScript - SCSS - HTML - CSS

🏷️ Related Topics:
#electron #markdown #pdf #ocr #s3 #webdav #self_hosted #openai #note_taking #evernote #anki #knowledge_base #obsidian #notion #notes_app #local_first #chatgpt #ollama #deepseek


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: pdfplumber

📝 Description: Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

🔗 Repository URL: https://github.com/jsvine/pdfplumber

📖 Readme: https://github.com/jsvine/pdfplumber#readme

📊 Statistics:
🌟 Stars: 8.5K stars
👀 Watchers: 99
🍴 Forks: 781 forks

💻 Programming Languages: Python - Makefile

🏷️ Related Topics:
#pdf #pdf_parsing #table_extraction


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: PDFMathTranslate

📝 Description: PDF scientific paper translation with preserved formats. AI-based full-text bilingual translation of PDF documents that fully preserves the layout; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero

🔗 Repository URL: https://github.com/Byaidu/PDFMathTranslate

🌐 Website: https://pdf2zh.com

📖 Readme: https://github.com/Byaidu/PDFMathTranslate#readme

📊 Statistics:
🌟 Stars: 28.2K stars
👀 Watchers: 104
🍴 Forks: 2.5K forks

💻 Programming Languages: Python

🏷️ Related Topics:
#python #pdf #latex #translation #math #mcp #japanese #english #openai #translate #document #chinese #edit #modify #russian #korean #zotero #obsidian #pdf2zh


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: MinerU

📝 Description: Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

🔗 Repository URL: https://github.com/opendatalab/MinerU

🌐 Website: https://opendatalab.github.io/MinerU/

📖 Readme: https://github.com/opendatalab/MinerU#readme

📊 Statistics:
🌟 Stars: 45.7K stars
👀 Watchers: 183
🍴 Forks: 3.8K forks

💻 Programming Languages: Python - Dockerfile

🏷️ Related Topics:
#python #pdf #parser #ocr #pdf_converter #extract_data #document_analysis #pdf_parser #layout_analysis #ai4science #pdf_extractor_rag #pdf_extractor_llm #pdf_extractor_pretrain


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: PDFPatcher

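📝 Description: PDF Patcher (PDF补丁丁): a PDF toolbox that can edit bookmarks, crop and rotate pages, remove restrictions, extract or merge documents, inspect document structure, extract images, convert pages to images, and more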

🔗 Repository URL: https://github.com/wmjordan/PDFPatcher

🌐 Website: https://pdfpatcher.cnblogs.com/

📖 Readme: https://github.com/wmjordan/PDFPatcher#readme

📊 Statistics:
🌟 Stars: 10.9K stars
👀 Watchers: 101
🍴 Forks: 1.4K forks

💻 Programming Languages: C# - C - C++ - HTML

🏷️ Related Topics:
#pdf #pdf_converter #pdf_generation #pdf_document_processor


==================================
🧠 By: https://t.iss.one/DataScienceM