Most used PDF operations performed with various free apps
Highlight text in yellow with Document Viewer (Evince)
select text | right click or Ctrl-H
You can change the color by right-clicking | Annotation Properties
flatpak install flathub org.gnome.Evince
flatpak run org.gnome.Evince
In Evince, to print many pages on one page choose
Print | Print to File
Page Setup | Pages per side: 1, 2, 4, 6, 9 or 16
------------------------------------------------
PDF pages per side (more options) and to make booklets out of a linear PDF
https://kjo.herbesfolles.org/bookletimposer/
sudo apt install bookletimposer
------------------------------------------------
Combine many images from a directory into a PDF.
sudo apt install img2pdf
(or with Python)
pip3 install img2pdf
If a few of your images are much bigger than the others, you might get extremely small pages in your document. Use Pix or Image Viewer (Xviewer) to quickly browse through the images and check their dimensions. If most are around 2000 x 1500 and just a few are 3000 x 2500 or larger, set a maximum width and height with --imgsize and all PDF pages will be relatively uniform in size, with no white margins.
img2pdf *.jpg -o output.pdf
img2pdf --imgsize 2000x2000 *.jpg -o output.pdf
------------------------------------------------
OCR (Optical Character Recognition) a PDF document while retaining the image and putting the OCR'ed text hidden behind it
1) install tesseract 5.x which is 15% faster than 4.x
sudo add-apt-repository ppa:alex-p/tesseract-ocr-devel
sudo apt update
sudo apt install tesseract-ocr
2) install ocrmypdf
pip install ocrmypdf
3) install JBIG2 for image compression
https://ocrmypdf.readthedocs.io/en/latest/jbig2.html
OCR a PDF
ocrmypdf input.pdf output.pdf
OCR a PDF and add metadata
ocrmypdf --title "title" --author "author" input.pdf output.pdf
OCR a PDF and optimize file size by compressing images
ocrmypdf -O 3 input.pdf output.pdf
Only optimizing a PDF and skipping OCR
ocrmypdf -s -O 3 --skip-big .1 input.pdf output.pdf
-s is the same as --skip-text (skips pages that already have text)
-O (that's a letter O, not a zero) is --optimize, and 3 does aggressive lossy optimizations (including lossy JBIG2)
--skip-big tells it to skip OCR on any page over 0.1 megapixels (which would be every page)
--output-type pdf disables PDF/A generation and maintains annotations
Batch ocrmypdf limiting it to 2 PDFs at a time
sudo apt install parallel
(in dir of PDFs)
mkdir output
parallel --tag -j 2 ocrmypdf -s -O 3 --skip-big .1 '{}' 'output/{}' ::: *.pdf
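If GNU parallel isn't available, the same "2 PDFs at a time" idea can be sketched in Python. This is only a rough sketch, not what I actually run: it assumes ocrmypdf is on your PATH, and the helper names build_command and ocr_all are made up for illustration. The flags are the same ones as in the parallel command above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(src: Path, out_dir: Path) -> list[str]:
    """Build the ocrmypdf call for one PDF (hypothetical helper)."""
    return ["ocrmypdf", "-s", "-O", "3", "--skip-big", ".1",
            str(src), str(out_dir / src.name)]


def ocr_all(pdf_dir: Path, jobs: int = 2) -> list[int]:
    """Run ocrmypdf on every PDF in pdf_dir, at most `jobs` at a time.

    Returns the exit code of each ocrmypdf run, in file-name order.
    """
    out_dir = pdf_dir / "output"
    out_dir.mkdir(exist_ok=True)
    pdfs = sorted(pdf_dir.glob("*.pdf"))
    # Each worker thread only waits on one external ocrmypdf process,
    # so max_workers caps how many PDFs are processed at once.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        codes = pool.map(
            lambda pdf: subprocess.run(build_command(pdf, out_dir)).returncode,
            pdfs,
        )
        return list(codes)
```

Calling ocr_all(Path(".")) from the directory of PDFs would mirror the parallel one-liner, with results written to output/.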
------------------------------------------------
To extract, delete, rotate, split, or combine PDF pages use PDF Slicer (Windows, Linux; keyboard to rearrange) or PDF Arranger (drag PDF pages to rearrange)
flatpak install flathub com.github.junrrein.PDFSlicer
flatpak run com.github.junrrein.PDFSlicer
flatpak install flathub com.github.jeromerobert.pdfarranger
flatpak run com.github.jeromerobert.pdfarranger
------------------------------------------------
Combine PDFs
pdftk one.pdf two.pdf three.pdf cat output combined.pdf
pdftk *.pdf cat output combined.pdf
To combine in natural order, list the files with ls -v first (-v = natural sort of (version) numbers within text). This only works if the file names contain no spaces.
ls -v *.pdf > namelist
pdftk $(cat namelist) cat output combined.pdf
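To see why the natural sort matters when combining numbered PDFs, here is a small Python sketch. It only approximates what ls -v does (it is not the same algorithm), and natural_key is a made-up helper name.

```python
import re


def natural_key(name: str) -> list:
    """Split a name into text and number chunks so that 2 sorts before 10."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so text is always compared with text and numbers with numbers.
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", name)]


names = ["part10.pdf", "part2.pdf", "part1.pdf"]
print(sorted(names))                   # plain sort: part1, part10, part2
print(sorted(names, key=natural_key))  # natural sort: part1, part2, part10
```

With a plain sort, part10.pdf would land in the combined PDF before part2.pdf.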
------------------------------------------------
Clean PDF metadata
ExifCleaner https://exifcleaner.com (AppImage, DEB, rpm, Windows, Mac)
Drag your PDFs into the ExifCleaner window and their metadata is wiped
------------------------------------------------
Crop a PDF (not just crop and hide margins) (Linux, Mac, Windows)
Master PDF Editor (DEB, rpm) costs $70, but you can use the non-expiring free version
https://code-industry.net/free-pdf-editor/
Crop a PDF in Master PDF Editor (free version)
Crop a page or pages manually by selecting the area to keep. Click
Document | Crop Pages
------------------------------------------------
Edit text in a PDF with LibreOffice Draw. It has problems with some complicated documents though.
------------------------------------------------
NormCap is an app that lets you select any part of your screen, OCR it, and copy the extracted text to your clipboard, which can then be pasted into a text editor.
In Applications | Settings | Keyboard under Application Shortcuts add Ctrl+Print and for the command navigate to the NormCap-unstable-x86_64.AppImage
------------------------------------------------
View a PDF in Dark Mode / Night Mode: grayscale, NOT inverted images like most PDF viewers
Master PDF Editor (free version)
Settings | Display and check Replace Document Colors
change Page Background: black color #2c2c2c and Text: white or light gray
To change to Dark Mode grayscale click
View | Replace Document Colors
Also Zathura has Dark Mode grayscale
sudo apt install zathura
Add comic book support (cbz, cbr files), with dark mode grayscale on the images too
sudo apt install zathura-cb
Zathura has no thumbnails and does not show document properties
nano ~/.config/zathura/zathurarc
Paste the following into the zathurarc text configuration file and save it. You don't need to create this configuration file, as Zathura already has dark mode with CTRL-R. Only do this if you don't want pure black and pure white colors.
set recolor true
set recolor-darkcolor "#dcdccc"
set recolor-lightcolor "#1f1f1f"
Zathura keyboard shortcut keys, since there isn't any menu
/ search for text
n next search result
shift+n previous search result
d toggles dual page display
o open another PDF
r rotate page
s fit to screen
- zoom out
+ zoom in
Tab toggle index
CTRL-P print
CTRL-R toggle recolor (dark mode)
F5 Presentation mode
F11 fullscreen
GitHub - dynobo/normcap: OCR powered screen-capture tool to capture information instead of images
SparkelDrinkIdeas_ocr.pdf
2.7 MB
34 Sparkel Drink Ideas all non-alcoholic and no tea ones. Just put this together for personal reference.
WorkingWithPDFs.pdf
114.7 KB
Working with PDFs. Made a PDF of the above two posts for quick reference. Pretty much sums up all the stuff I do on a regular basis with PDFs.
The Videomass profiles I use.
Profile name: hevc 265 libopus 16k crf 28
Description: crf 28 or 31 for VHS quality
output format: mp4
-c:v hevc -crf 28 -c:a libopus -b:a 16k -vf scale="-2:720"
Profile name: burn-in subtitles
Description: x265, rescale to 720p, srt subs hard-coded
output format: mp4
-c:v hevc -crf 28 -c:a libopus -b:a 16k -vf scale="-2:720",subtitles="/home/mint/Videos/English.srt"
Profile name: replace audio and rescale to 720p
Description: x265 replace audio track
output format: mp4
-i "/home/mint/Music/music.opus" -c:v hevc -crf 28 -vf scale="-2:720" -c:a copy -map 0:v:0 -map 1:a:0 -map_metadata 0 -shortest
Profile name: Extract audio from video converts to Opus
Description: in Opus 16kbps (talk) 96k (music)
output format: opus
-vn -sn -map 0:1 -c:a:0 libopus -b:a 16k -vbr off -map_metadata 0
Profile name: encode video to audio only opus
Description: 16kbps for speech, talk audio only from video
output format: opus
-vn -c:a libopus -b:a 16k -vbr off
Profile name: Add audio stream to video (copy)
Description: muxing audio with video
output format: mp4
-i "/home/mint/music.opus" -c:v copy -c:a copy -map_metadata 0 -shortest
Profile name: audio filter Dynamic Audio Normalization
Description: collection of mp3s prepare for opus chaptered audiobook
output format: opus
-vn -c:a libopus -b:a 32k -vbr off -ar 48000 -af dynaudnorm,aformat=channel_layouts=stereo
Profile name: noise reduction
Description: highpass=500 lowpass=1000
output format: opus
-c:a libopus -vbr off -b:a 32k -ar 48000 -af highpass=200,lowpass=3000,afftdn,aformat=channel_layouts=stereo,volume=12dB,dynaudnorm
Profile name: remove silence
Description: start_periods=0 to only remove silence from beginning
output format: opus
-c:a libopus -vbr off -b:a 32k -ar 48000 -af highpass=500,lowpass=1000,afftdn,aformat=channel_layouts=stereo,volume=12dB,"silenceremove=start_periods=1:stop_periods=-1:start_threshold=-50dB:stop_threshold=-50dB:start_silence=1:start_duration=2:stop_duration=5:detection=peak",dynaudnorm
ffmpeg 5.x no longer requires the option -strict -2 when using opus in mp4. If still on ffmpeg 4.x you need to use it.
Download all the presets here https://t.iss.one/geektips/321
downsub.com works great for getting subtitles
Telegram doesn't have subtitles and most users aren't gonna download a video and the SRT subs and watch locally. So until telegram does we must burn-in subtitles. See previous post how to do so with videomass.
1) paste the link to the video
2) click on SRT to download the English subtitles
3) optionally translate from one language to another (I've done this with German to English)
4) download the language you want it translated to
Make a chaptered audiobook with the chapter times already specified.
On YouTube it lists the chapter times, but the video is a single segment for chapters 1-10, and so on.
0:00:00 Opening
0:00:49 Chapter 1: The Beginning
0:09:07 Chapter 2: Regality
0:29:27 Chapter 3: Polar Symbolism; The Lord of Peace and Justice
0:40:28 Chapter 4: The Law, The State, the Empire
0:58:40 Chapter 5: The Mystery of the Rite
1:11:39 Chapter 6: On the Primordial Nature of the Patriciate
1:28:07 Chapter 7: Spiritual Virility
1:38:52 Chapter 8: The Two Paths in the Afterlife
1:53:47 Chapter 9: Life and Death of Civilizations
2:08:05 Chapter 10: Initiation and Consecration
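If you want the chapter times above as plain seconds (handy for double-checking where each split should land), a few lines of Python will do. This is just a sketch; parse_chapters is a made-up name and it assumes every line looks like "H:MM:SS Title".

```python
def parse_chapters(text: str) -> list[tuple[int, str]]:
    """Turn 'H:MM:SS Title' lines into (start_seconds, title) pairs."""
    chapters = []
    for line in text.strip().splitlines():
        stamp, title = line.strip().split(" ", 1)
        seconds = 0
        # Works for both MM:SS and H:MM:SS time stamps.
        for part in stamp.split(":"):
            seconds = seconds * 60 + int(part)
        chapters.append((seconds, title))
    return chapters


listing = """\
0:00:00 Opening
0:00:49 Chapter 1: The Beginning
0:09:07 Chapter 2: Regality
"""

for start, title in parse_chapters(listing):
    print(start, title)
```

For the three lines in the example this prints 0, 49 and 547 seconds next to the chapter names.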
First in videomass download all the segments for the audiobook using yt-dlp.
In LosslessCut (Win, Mac, Linux AppImage; free, GPL) load up the mp4 for chapters 1 to 10, press G (to go to a specific time) and input the time of each chapter break. Press B to set a break / split point there.
LosslessCut export options
1) choose separate files which you'll rename later in your file manager to match the chapter names
2) Smart cut (experimental) Yes as this will cut exactly at the break/split point.
Now just set metadata to file name in MusicBrainz Picard and import into freac and make your chaptered opus audiobook.
Huge audiobook collections from various video / audio sources come in various sample rates. The rate itself is irrelevant, but mixing sample rates (44.1 kHz, 48 kHz) in one audiobook makes the chapter times inaccurate.
Opus compression does not depend on the input sample rate; timestamps are measured in 48 kHz units even if the full bandwidth is not used.
1) In freac, when you attempt to Encode to a single file opus chaptered audiobook and you see various sample rates, stop. First encode the audio files with videomass before loading them into freac.
Preset filter to use in videomass and output to opus
-vn -c:a libopus -b:a 32k -ar 48000 -af dynaudnorm,aformat=channel_layouts=mono
-ar 48000 (is mandatory) and mono (is highly recommended)
dynaudnorm = Dynamic Audio Normalization, so it corrects low volume (optional)
Don't forget with hundreds of chapters you can't access them all on VLC on your computer so need to use VLC on your phone.
Opus is superior at low bitrates across (orange) narrowband 8 kHz, (yellow) wideband 16 kHz, (green) super-wideband 24 kHz and fullband 48 kHz, with ultra low bitrate and latency, which is important for real-time communication. Zoom, Discord, PlayStation Network, Jitsi and WhatsApp all use Opus for their voice communication.
To reduce the file size of a video while retaining video resolution and quality, this is all I do with videomass (free, GPL; Windows, Linux or Mac), which is just a GUI frontend for ffmpeg, which is also free.
1) Click Presets Manager then drag video into queue
2) choose the following preset (need to set up initially)
-c:v hevc -crf 28 -c:a libopus -b:a 16k -vf scale="-2:720"
3) remove -vf scale="-2:720" if you don't wanna upscale nor downscale this video to 720p
4) click Convert
5) click OK
Containers: there is mp4 and mkv. Telegram likes mp4 to be able to stream for iPhones, etc.
Audio codecs: aac (most common) and opus, which I use since it's superior at lower bitrates.
-crf 28 I use for most video clips and use -crf 31 for VHS quality stuff. If you're doing a Hollywood movie use -crf 23 (lower is higher quality). Also you wouldn't re-encode audio so just -c:a copy instead of -c:a libopus and don't specify a bitrate -b:a 16k
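To make the choices above easier to see side by side, here is a rough Python sketch that assembles the same ffmpeg options. Videomass doesn't need this (a preset is just the option text); the function name and parameters are made up for illustration.

```python
def preset_args(crf: int = 28, scale_720p: bool = True,
                copy_audio: bool = False) -> list[str]:
    """Assemble the ffmpeg options from the preset above (hypothetical helper)."""
    args = ["-c:v", "hevc", "-crf", str(crf)]
    if copy_audio:
        # For a movie: keep the original audio, no bitrate option.
        args += ["-c:a", "copy"]
    else:
        args += ["-c:a", "libopus", "-b:a", "16k"]
    if scale_720p:
        args += ["-vf", "scale=-2:720"]
    return args


print(" ".join(preset_args()))                     # most video clips
print(" ".join(preset_args(crf=31)))               # VHS quality stuff
print(" ".join(preset_args(crf=23, scale_720p=False, copy_audio=True)))
```

The last call corresponds to the movie case: crf 23, no rescale, and -c:a copy instead of re-encoding the audio.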
ffmpeg 5.x no longer requires -strict -2 for opus with mp4
Videomass: here are the presets I use. You can download them here https://t.iss.one/geektips/331 it's just a text file, All Opus, Subs, x265.prst, and choose Import preset. Obviously you must change the path in a few of them. Or you can copy and paste them in from this post.
https://t.iss.one/geektips/311
Easily remove line breaks in LibreOffice which can occur if you copy text from a PDF.
1) Shows line breaks
2) pressing Ctrl-Shift-L to activate the macro to remove line breaks
3) shows line breaks removed and the text flows to the end of the page
How to setup the macro in LibreOffice
https://t.iss.one/geektips/270
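Outside LibreOffice, the same clean-up can be sketched in a few lines of Python. This is not the macro from the linked post, just an illustration of the idea; unwrap is a made-up name and it assumes blank lines mark real paragraph breaks.

```python
import re


def unwrap(text: str) -> str:
    """Join hard-wrapped lines; blank lines still separate paragraphs."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # split() with no arguments drops the line breaks inside a paragraph.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


copied = ("Text copied from\na PDF often breaks\nat every line.\n\n"
          "Second paragraph\nhere.")
print(unwrap(copied))
```

Each paragraph comes out as one long line, so the text can flow to the end of the page again.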
How to make an opus chaptered audiobook (quick overview)
I downloaded a 1.2 GB m4a (m4b) audiobook, but it didn't have chapters, so I had to download an 8.6 GB video text audiobook that had chapters and extract the zip file. I could have tried to detect chapters automatically based on silence and rename them manually, but decided against it.
Rename the files in your file manager, then load them in MusicBrainz Picard and set the Title (chapter name) metadata to be the same as the filenames with Ctrl-Shift-T
For those on Mac, use MP3Tag, which can rename files from tags (rename files based on the tag information) and import tags from filenames.
Add a cover if there isn't one already in the metadata. Also set Artist to the author and Album to the title of the audiobook. Press Encode. Notice on Telegram it shows Kevin B. MacDonald instead of Kevin MacDonald. You can use tageditor to fix that if you wish.
Telegram shows file sizes in MiB (mebibytes, 1024 x 1024 bytes) rather than MB (megabytes, 1000 x 1000 bytes, the International System of Units (SI) meaning)
100 MB = 95 MiB (~5% less)
1 GB = 0.93 GiB
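The arithmetic behind those two figures, as a small Python sketch (the function names are only for illustration):

```python
def mb_to_mib(megabytes: float) -> float:
    """Convert decimal megabytes (10**6 bytes) to mebibytes (2**20 bytes)."""
    return megabytes * 10**6 / 2**20


def gb_to_gib(gigabytes: float) -> float:
    """Convert decimal gigabytes (10**9 bytes) to gibibytes (2**30 bytes)."""
    return gigabytes * 10**9 / 2**30


print(round(mb_to_mib(100), 2))  # 95.37
print(round(gb_to_gib(1), 2))    # 0.93
```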
Linux Mint terminal
ls -lh shows 182M which is 182 MiB
ls -l --si shows 192M which is 192 MB
Those on Linux Mint (or other distros, it's on Flathub): Warpinator has an iOS app in beta (requires iOS 13+). You must start up Warpinator first and then open the app.
Beta Link for Warpinator iOS app
https://testflight.apple.com/join/7ndmZa31