Code Stars
2.02K subscribers
11.1K photos
11.4K links
Code Stars alerts you to GitHub repos gaining stars rapidly. Stay ahead of the curve and discover trending projects before they go viral! #AI #GitHub #OpenSource #Tech #MachineLearning #Python #Programming #Java #Javascript #React #Docker #Devops
paperless-ngx/paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Language:Python
Total stars: 24407
Stars trend:
31 Jan 2025
6am ▎ +2
7am ▎ +2
8am ▎ +2
9am ▎ +2
10am +0
11am ▉ +7
12pm █▏ +9
1pm █▉ +15
2pm █▋ +13
3pm █▉ +15
4pm █▌ +12
5pm █▋ +13

#python
#angular, #archiving, #django, #dms, #documentmanagement, #documentmanagementsystem, #machinelearning, #ocr, #opticalcharacterrecognition, #pdf
ocrmypdf/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Language:Python
Total stars: 14952
Stars trend:
2 Feb 2025
6am ▏ +1
7am +0
8am ▏ +1
9am ▏ +1
10am ▍ +3
11am ▎ +2
12pm ▍ +3
1pm █▎ +10
2pm ███▎ +26
3pm █▌ +12
4pm █▋ +13
5pm █▉ +15

#python
#imageprocessing, #ocr, #pdf, #python, #tesseract
Goldziher/kreuzberg
A text extraction library supporting PDFs, images, office documents and more
Language:Python
Total stars: 304
Stars trend:
15 Feb 2025
12am █ +8
1am ▋ +5
2am █ +8
3am ▊ +6
4am ▉ +7
5am ▉ +7
6am ▊ +6
7am ▎ +2
8am █ +8
9am █ +8
10am █▋ +13

#python
#asyncio, #docx, #ocr, #pdf, #textextraction
CatchTheTornado/text-extract-api
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
Language:Python
Total stars: 2248
Stars trend:
15 Feb 2025
6am ▉ +7
7am █▎ +10
8am ▌ +4
9am ▉ +7
10am ▉ +7
11am ▍ +3
12pm ▊ +6
1pm ▋ +5
2pm █ +8
3pm █▎ +10
4pm █ +8
5pm ▍ +3

#python
#anonymization, #api, #extract, #json, #llm, #ocr, #ocrpython, #pdf, #pii
opendatalab/MinerU
A high-quality, one-stop open-source data extraction tool for converting PDF to Markdown and JSON.
Language:Python
Total stars: 27107
Stars trend:
3 Mar 2025
3am █▋ +13
4am ▋ +5
5am ▉ +7
6am █▍ +11
7am █▏ +9
8am ▊ +6
9am ▉ +7
10am █ +8
11am ▊ +6
12pm ▊ +6
1pm █ +8
2pm ▉ +7

#python
#ai4science, #documentanalysis, #extractdata, #layoutanalysis, #ocr, #parser, #pdf, #pdfconverter, #pdfextractorllm, #pdfextractorpretrain, #pdfextractorrag, #pdfparser, #python
oomol-lab/pdf-craft
PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books. The project has just started.
Language:Python
Total stars: 1537
Stars trend:
10 Apr 2025
4pm ▏ +1
5pm +0
6pm +0
7pm +0
8pm +0
9pm +0
10pm ▏ +1
11pm ▏ +1
11 Apr 2025
12am ████▊ +38
1am ██████████▊ +86
2am ████████ +64

#python
#ai, #document, #ocr, #pdf
umlx5h/LLPlayer
The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more!
Language:C#
Total stars: 838
Stars trend:
12 Apr 2025
3am ▎ +2
4am ▎ +2
5am ▍ +3
6am ▏ +1
7am ▌ +4
8am ▏ +1
9am ▏ +1
10am ▍ +3
11am █▎ +10
12pm ████▏ +33
1pm ▍ +3
2pm █▋ +13

#csharp
#asr, #csharp, #fasterwhisper, #flyleaf, #languagelearning, #llm, #mediaplayer, #ocr, #ollama, #player, #video, #videoplayer, #whisper, #wpf, #ytdlp
kotaro-kinoshita/yomitoku
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.
Language:Python
Total stars: 697
Stars trend:
20 Apr 2025
10am ▊ +6
11am █▎ +10
12pm █▍ +11
1pm █▉ +15
2pm █▊ +14
3pm ▌ +4
4pm █ +8
5pm ▌ +4
6pm +0
7pm +0
8pm ▍ +3
9pm ▌ +4

#python
#deeplearning, #layoutanalysis, #ocr, #python, #pytorch
hiroi-sora/Umi-OCR
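Free, open-source, offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Includes built-in libraries for multiple languages.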
Language:Python
Total stars: 33184
Stars trend:
7 May 2025
4am ▎ +2
5am ▌ +4
6am ▎ +2
7am ▍ +3
8am ▉ +7
9am █▎ +10
10am ▊ +6
11am █▎ +10
12pm ▊ +6
1pm ▉ +7
2pm █▏ +9
3pm █▏ +9

#python
#ocr, #ocrpython, #paddleocr, #qml, #qt, #screenshot, #umiocr
clawsoftware/clawPDF
Open Source Virtual (Network) Printer for Windows that allows you to create PDFs, OCR text, and print images, with advanced features usually available only in enterprise solutions.
Language:C#
Total stars: 1043
Stars trend:
19 May 2025
12pm ▍ +3
1pm █████▌ +44
2pm ███████▎ +58
3pm ██████▌ +52
4pm ██▋ +21

#csharp
#imageprocessing, #merge, #networkprinter, #ocr, #pdf, #pdfmerger, #pdfprinter, #print, #printer, #terminalserver, #windows