Learn Python Coding
Learn Python through simple, practical examples and real coding ideas. Clear explanations, useful snippets, and hands-on learning for anyone starting or improving their programming skills.

Admin: @HusseinSheikho || @Hussein_Sheikho
Convert PDF to structured JSON — in a couple of lines and without hassle! 📄

Today, we'll create a mini-service that takes a PDF document, extracts its text, and asks GPT to organize the content into structured fields: title, author, date, and a list of sections. 🚀

First, let's import the necessary libraries and set up the API key:

import os
from PyPDF2 import PdfReader
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

Now, let's extract the text from the PDF. We'll loop through all the pages and combine them into a single string:

reader = PdfReader("document.pdf")
text = "\n".join(page.extract_text() for page in reader.pages)

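Some pages (scans, image-only pages) may yield no text at all, so it can help to skip empty results before joining. Here is a minimal sketch of that idea; the helper names join_page_text and pdf_to_text are our own illustration, not part of PyPDF2:

```python
from typing import Iterable, Protocol


class PageLike(Protocol):
    """Anything with an extract_text() method, e.g. a PyPDF2 page object."""

    def extract_text(self) -> str: ...


def join_page_text(pages: Iterable[PageLike]) -> str:
    """Join the text of every page with newlines, skipping empty pages."""
    chunks = []
    for page in pages:
        # Treat a missing or whitespace-only result as "no text on this page".
        page_text = (page.extract_text() or "").strip()
        if page_text:
            chunks.append(page_text)
    return "\n".join(chunks)


def pdf_to_text(path: str) -> str:
    """Read a PDF from disk and return its text (requires PyPDF2)."""
    # Imported here so join_page_text stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader

    text = join_page_text(PdfReader(path).pages)
    if not text:
        # Assumption: no text at all usually means a scanned PDF that needs OCR.
        raise ValueError(f"No extractable text in {path}; it may be a scanned PDF.")
    return text
```

With this in place, text = pdf_to_text("document.pdf") could stand in for the two lines above.
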
Next, we'll send the extracted text to GPT. We'll ask the model to return structured JSON with the necessary fields:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": (
            "You are a PDF parser. Return a JSON with the fields: title, author, date, sections. "
            "Each section is an object with name and summary."
        )},
        {"role": "user", "content": text}
    ]
)
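
Two things can go wrong with the request above: a long PDF may not fit into the model's context window, and the model may wrap its answer in extra text. One possible way to reduce both risks is sketched below. The build_request helper and the MAX_CHARS value are our own assumptions; response_format={"type": "json_object"} is the Chat Completions JSON mode option, which asks the model to reply with a single JSON object.

```python
SYSTEM_PROMPT = (
    "You are a PDF parser. Return a JSON with the fields: title, author, date, sections. "
    "Each section is an object with name and summary."
)

# Hypothetical cap on input size; tune it to the context window of your model.
MAX_CHARS = 100_000


def build_request(text: str, model: str = "gpt-4o", max_chars: int = MAX_CHARS) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        # JSON mode: the prompt must mention JSON, which SYSTEM_PROMPT does.
        "response_format": {"type": "json_object"},
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            # Naive truncation: anything past max_chars is simply dropped.
            {"role": "user", "content": text[:max_chars]},
        ],
    }
```

Usage would then look like response = client.chat.completions.create(**build_request(text)). Truncation is the simplest option here; for very long documents, splitting the text into parts and merging the results may work better.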

Output the result:

structured = response.choices[0].message.content.strip()
print(structured)
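
Note that structured is still a plain string at this point. To work with it as data, it needs to be parsed, and it is worth checking that the expected fields are actually there. A minimal sketch, assuming the model may sometimes wrap the JSON in a Markdown code fence (parse_structured and REQUIRED_FIELDS are hypothetical names introduced for this example):

```python
import json

REQUIRED_FIELDS = ("title", "author", "date", "sections")
FENCE = "`" * 3  # Markdown code fence marker


def parse_structured(raw: str) -> dict:
    """Parse the model's reply into a dict and check the expected fields."""
    cleaned = raw.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (with or without a language tag)...
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # ...and everything from the closing fence onward.
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    data = json.loads(cleaned)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"Missing fields: {', '.join(missing)}")
    return data
```

For example, data = parse_structured(structured) gives a regular dict, so data["title"] and data["sections"] can be used directly.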

🔥 Suitable for contracts, reports, methodologies, and other text-based PDFs: the model's reply is a JSON string that can be parsed and used in your code.

#PDF #JSON #Python #GPT #Automation #DataScience
