GeekTips
Linux Mint, video encoding, ffmpeg, geek tips, regex, pdf manipulation, substitcher, mpv config
Ocenaudio (free; Win, Linux, Mac) can also normalize audio.

1) Import all the chapter audio files by dragging them into Ocenaudio. It takes a while to analyze them (about 3x longer than Audacity). Double-click the first chapter and choose Effects | Normalize or press the Normalize button.

2) For each subsequent chapter, double-click the track to select it and press Ctrl-Y to repeat the normalization.

3) Choose File | Save All. It writes the opus files at a variable bitrate of around 50kbps, overwriting the original ones.

4) In the rare case that a chapter doesn't normalize like the other tracks (judging by the waveform graph), use the Gain tool and increase the dB appropriately.

For advanced users: you can use MKVToolnix to extract the chapters without re-encoding them although you'll have to manually rename the chapter names. https://t.iss.one/geektips/79
Automatically create audiobook chapters from a single mp3 audiobook that has no chapters.

1) Import single mp3 audiobook into Audacity 3.x or higher

2) Ctrl-A to select all, then Analyze | Label Sounds (this process took me 5 minutes on my laptop). It automatically detects chapter breaks wherever there is silence of at least 3 seconds, as seen in the screenshot. You could try 2.5 seconds or even 2 seconds but don't go below that.

3) In Audacity choose File | Export | Export Multiple and Format: Opus Bitrate: 128kbps (same as original mp3)

4) VBR Mode: Off (Variable Bitrate mode)

5) Split files based on: Labels. By default they are Chapter 001, Chapter 002, Chapter 003, etc. if you set it like the screenshot in the previous post.

6) Name files: Using Label/Track Name. Click Export and it'll re-encode the one mp3 file into multiple opus audio files, one for each chapter detected.

7) Rename them to match the book's chapter names. Then, before importing them into freac to make an opus chaptered audiobook, drag them into MusicBrainz Picard and tag from file names.
Many times the auto detection doesn't work, so you can find the chapters manually. Once you locate the starting point of the next chapter, click it, then choose Select | Clip Boundaries | Previous Clip Boundary to Cursor. Then Edit | Labels | Add Label at Selection (Ctrl+B) and name your chapter. This is a tedious process and lots of work but might be worth it if the audiobook is a great one.

To manually locate chapters, zoom in and look for big breaks in the waveform graph.

An even quicker way if you have just a few chapters is to use LosslessCut (Linux, Mac, Win) free GPL. It doesn't have to analyze the mp3 or opus file. Create in and out points and separate the files to export.
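
For a command-line cross-check of where the long silences fall, ffmpeg's silencedetect filter reports the same kind of gaps. This is only a sketch of an alternative, not the Audacity method described above: the function name silence_breaks, the -30dB noise floor and the file name book.mp3 are all assumptions to adjust.

```shell
# Hypothetical helper: read ffmpeg silencedetect log lines on stdin and print
# "end_time duration" (both in seconds) for each detected silence.
silence_breaks() {
  awk '/silence_end:/ {
    for (i = 1; i <= NF; i++) {
      if ($i == "silence_end:") end = $(i + 1)
      if ($i == "silence_duration:") dur = $(i + 1)
    }
    print end, dur
  }'
}

# Usage (assumes ffmpeg is installed; d=3 means silences of at least 3 seconds):
# ffmpeg -hide_banner -i book.mp3 -af silencedetect=noise=-30dB:d=3 -f null - 2>&1 | silence_breaks
```

Each printed end time is roughly where the next chapter starts, which gives you candidate split points to try in LosslessCut.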
Chapters created automatically and here is the result in VLC on Linux.
Most used PDF operations performed with various free apps

Highlight text in yellow with Document Viewer (Evince)
select text | right click or Ctrl-H
able to change color by right-clicking | Annotation Properties

flatpak install flathub org.gnome.Evince
flatpak run org.gnome.Evince

In Evince to print many pages in one page
choose Print | Print to File
Page Setup | Pages per side:
1, 2, 4, 6, 9 or 16
------------------------------------------------

PDF pages per side (more options) and to make booklets out of a linear PDF
https://kjo.herbesfolles.org/bookletimposer/
sudo apt install bookletimposer
------------------------------------------------

Combine many images from a directory into a PDF.
sudo apt install img2pdf
(or python)
pip3 install img2pdf

img2pdf *.jpg -o output.pdf

If you have images and a few are much bigger than the others you might get extremely small pages in your document. Use Pix or Image View (Xviewer) to quickly browse through the images and check out the image dimensions. So if most are say 2000 x 1500 or so and just a few are 3000 x 2500 or higher set the max pixel height and width and all PDF pages will be relatively uniform in size and with no white margins.

img2pdf --imgsize 2000x2000 *.jpg -o output.pdf
------------------------------------------------

OCR (Optical Character Recognition) a PDF document while retaining the image and putting the OCR'ed text hidden behind it

1) install tesseract 5.x which is 15% faster than 4.x
sudo add-apt-repository ppa:alex-p/tesseract-ocr-devel
sudo apt update
sudo apt install tesseract-ocr

2) install ocrmypdf
pip install ocrmypdf

3) install JBIG2 for image compression
https://ocrmypdf.readthedocs.io/en/latest/jbig2.html

OCR a PDF
ocrmypdf input.pdf output.pdf

OCR a PDF and add metadata
ocrmypdf --title "title" --author "author" input.pdf output.pdf

OCR a PDF and optimize file size by compressing images
ocrmypdf -O 3 input.pdf output.pdf

Only optimizing a PDF and skipping OCR
ocrmypdf -s -O 3 --skip-big .1 input.pdf output.pdf

-s is the same as --skip-text (skips pages that already contain text, e.g. already OCR'd)
-O (that's a letter O, not a zero) is --optimize and 3 does aggressive lossy optimizations (including lossy JBIG2)
--skip-big tells it to skip OCR on any page over 0.1 megapixels (which would be every page)
--output-type pdf disables PDF/A generation and maintains annotations

Batch ocrmypdf limiting it to 2 pdfs at a time
sudo apt install parallel
mkdir output
(in dir of PDFs)
parallel --tag -j 2 ocrmypdf -s -O 3 --skip-big .1 '{}' 'output/{}' ::: *.pdf
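
If GNU parallel isn't available, a plain loop does the same job one PDF at a time. A minimal sketch, assuming you want the result next to the original with an _opt suffix (the naming used for the optimized file in a later post); opt_name and optimize_all are made-up helper names.

```shell
# Hypothetical helper: "scan.pdf" -> "scan_opt.pdf"
opt_name() {
  printf '%s_opt.pdf\n' "${1%.pdf}"
}

# Optimize every PDF in the current directory, one at a time,
# skipping files that are already *_opt.pdf outputs.
optimize_all() {
  for f in *.pdf; do
    [ -e "$f" ] || continue
    case "$f" in *_opt.pdf) continue ;; esac
    ocrmypdf -s -O 3 --skip-big .1 "$f" "$(opt_name "$f")"
  done
}

# Usage (in the directory of PDFs, assumes ocrmypdf is installed):
# optimize_all
```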
------------------------------------------------

To extract, delete, rotate, split or combine PDF pages use PDF Slicer (Windows, Linux; rearrange pages with the keyboard) or PDF Arranger (drag PDF pages to rearrange)

flatpak install flathub com.github.junrrein.PDFSlicer
flatpak run com.github.junrrein.PDFSlicer

flatpak install flathub com.github.jeromerobert.pdfarranger
flatpak run com.github.jeromerobert.pdfarranger
------------------------------------------------

Combine PDFs
pdftk one.pdf two.pdf three.pdf cat output combined.pdf
pdftk *.pdf cat output combined.pdf
-v = natural sort of (version) numbers within text
ls -v *.pdf > namelist
pdftk $(cat namelist) cat output combined.pdf
(the namelist approach only works if the file names contain no spaces)
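
When the file names contain spaces, the list can be read line by line instead. A sketch under a few assumptions: sort -V is the GNU coreutils version sort, the output is always called combined.pdf, and natural_pdfs / combine_all are hypothetical helper names.

```shell
# Print the PDFs in the current directory, one per line, in natural
# (version) order, leaving out a previous combined.pdf.
natural_pdfs() {
  for f in *.pdf; do
    [ -e "$f" ] && [ "$f" != "combined.pdf" ] && printf '%s\n' "$f"
  done | sort -V
}

# Build the argument list one line at a time so spaces in names survive
# (names containing newlines are not handled), then hand it to pdftk.
combine_all() {
  set --
  while IFS= read -r name; do
    [ -n "$name" ] && set -- "$@" "$name"
  done <<EOF
$(natural_pdfs)
EOF
  [ "$#" -gt 0 ] && pdftk "$@" cat output combined.pdf
}

# Usage (in the directory of PDFs, assumes pdftk is installed):
# combine_all
```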
------------------------------------------------

Clean PDF metadata
https://exifcleaner.com AppImage, DEB, rpm, Windows, Mac
drag your PDFs into ExifCleaner window and their metadata is wiped
------------------------------------------------

Crop a PDF (not just crop and hide margins) (Linux, Mac, Windows)
Master PDF Editor (DEB, rpm) costs $70 but you can use the non-expiring free version
https://code-industry.net/free-pdf-editor/

Crop a PDF in Master PDF Editor (free version)
Crop a page or pages manually by selecting the area to keep. Click Document | Crop Pages

------------------------------------------------
Edit text in a PDF with LibreOffice Draw. Has problems with some complicated documents though.
------------------------------------------------

NormCap is an app that lets you select any part of your screen, OCR it and copy the extracted text to your clipboard, which can then be pasted into a text editor.

In Applications | Settings | Keyboard under Application Shortcuts add Ctrl+Print and for the command navigate to the NormCap-unstable-x86_64.AppImage
------------------------------------------------

View a PDF in Dark Mode / Night Mode in grayscale, NOT with inverted images like most PDF viewers

Master PDF Editor (free version)
Settings | Display and check Replace Document Colors
change Page Background: black color #2c2c2c and Text: white or light gray
To change to Dark Mode grayscale click View | Replace Document Colors

Also Zathura has Dark Mode grayscale
sudo apt install zathura
Add comic book support cbz cbr files dark mode grayscale on the images too
sudo apt install zathura-cb

Zathura has no thumbnails nor shows document properties

nano ~/.config/zathura/zathurarc
Paste the following into the zathurarc text configuration file and save it. You don't need to create this configuration file as it already has dark mode CTRL-R. So only do this if you don't want pure black and pure white colors.

set recolor true
set recolor-darkcolor "#dcdccc"
set recolor-lightcolor "#1f1f1f"

Zathura Keyboard shortcut keys since there isn't any menu

/ search for text
n next search result
shift+n previous search result

d toggles dual page display
o open another PDF
r rotate page
s fit to width (a for best fit)
- zoom out
+ zoom in
Tab toggle index
CTRL-P print
CTRL-R toggle recolor (dark mode)
F5 Presentation mode
F11 fullscreen
Optimized PDF from 3.1MB to 176KB

ocrmypdf -s -O 3 --skip-big .1 Harvey\ Weinstein\ accuser\ Mogul\ lacks\ male\ genitalia.pdf Harvey\ Weinstein\ accuser\ Mogul\ lacks\ male\ genitalia_opt.pdf 

metadata cleaned with ExifCleaner. Screenshot shows metadata before being wiped
SparkelDrinkIdeas_ocr.pdf
2.7 MB
34 Sparkel Drink Ideas all non-alcoholic and no tea ones. Just put this together for personal reference.
WorkingWithPDFs.pdf
114.7 KB
Working with PDFs. Made a PDF of the above two posts for quick reference. Pretty much sums up all the stuff I do on a regular basis with PDFs.
The videomass profiles I use.

Profile name: hevc 265 libopus 16k crf 28
Description: crf 28 or 31 for VHS quality
output format: mp4
-c:v hevc -crf 28 -c:a libopus -b:a 16k -vf scale="-2:720"

Profile name: burn-in subtitles
Description: x265, rescale to 720p, srt subs hard-coded
output format: mp4
-c:v hevc -crf 28 -c:a libopus -b:a 16k -vf scale="-2:720",subtitles="/home/mint/Videos/English.srt"

Profile name: replace audio and rescale to 720p
Description: x265 replace audio track
output format: mp4
-i "/home/mint/Music/music.opus" -c:v hevc -crf 28 -vf scale="-2:720" -c:a copy -map 0:v:0 -map 1:a:0 -map_metadata 0 -shortest

Profile name: Extract audio from video converts to Opus
Description: in Opus 16kbps (talk) 96k (music)
output format: opus
-vn -sn -map 0:1 -c:a:0 libopus -b:a 16k -vbr off -map_metadata 0

Profile name: encode video to audio only opus
Description: 16kbps for speech, talk audio only from video
output format: opus
-vn -c:a libopus -b:a 16k -vbr off

Profile name: Add audio stream to video (copy)
Description: muxing audio with video
output format: mp4
-i "/home/mint/music.opus" -c:v copy -c:a copy -map_metadata 0 -shortest

Profile name: audio filter Dynamic Audio Normalization
Description: collection of mp3s prepare for opus chaptered audiobook
output format: opus
-vn -c:a libopus -b:a 32k -vbr off -ar 48000 -af dynaudnorm,aformat=channel_layouts=stereo

Profile name: noise reduction
Description: highpass=500 lowpass =1000
output format: opus
-c:a libopus -vbr off -b:a 32k -ar 48000 -af highpass=200,lowpass=3000,afftdn,aformat=channel_layouts=stereo,volume=12dB,dynaudnorm

Profile name: remove silence
Description: start_periods=0 to only remove silence from beginning
output format: opus
-c:a libopus -vbr off -b:a 32k -ar 48000 -af highpass=500,lowpass=1000,afftdn,aformat=channel_layouts=stereo,volume=12dB,"silenceremove=start_periods=1:stop_periods=-1:start_threshold=-50dB:stop_threshold=-50dB:start_silence=1:start_duration=2:stop_duration=5:detection=peak",dynaudnorm

ffmpeg 5.x no longer requires the option -strict -2 when using opus in mp4. If still on ffmpeg 4.x you need to use it.

Download all the presets here https://t.iss.one/geektips/321
downsub.com works great for getting subtitles

Telegram doesn't have subtitles and most users aren't gonna download a video and the SRT subs and watch locally. So until telegram does we must burn-in subtitles. See previous post how to do so with videomass.

1) paste the link to the video

2) click on SRT to download the English subtitles

3) optionally translate from one language to another (I've done this with German to English)

4) download the language you want it translated to
Make a chaptered audiobook with the chapter times already specified.

On youtube it lists the times but has a single segment for chapters 1-10 and so on.

0:00:00 Opening
0:00:49 Chapter 1: The Beginning
0:09:07 Chapter 2: Regality
0:29:27 Chapter 3: Polar Symbolism; The Lord of Peace and Justice
0:40:28 Chapter 4: The Law, The State, the Empire
0:58:40 Chapter 5: The Mystery of the Rite
1:11:39 Chapter 6: On the Primordial Nature of the Patriciate
1:28:07 Chapter 7: Spiritual Virility
1:38:52 Chapter 8: The Two Paths in the Afterlife
1:53:47 Chapter 9: Life and Death of Civilizations
2:08:05 Chapter 10: Initiation and Consecration

First, in videomass, download all the segments for the audiobook using yt-dlp.
In LosslessCut (Win, Mac, Linux AppImage; free, GPL) load the mp4 for chapters 1 to 10, press G (go to a specific time) and enter the time of a chapter break. Press B to set a break / split point there.
LosslessCut export options

1) choose separate files which you'll rename later in your file manager to match the chapter names

2) Smart cut (experimental) Yes as this will cut exactly at the break/split point.

Now just set metadata to file name in MusicBrainz Picard and import into freac and make your chaptered opus audiobook.
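
If you would rather script the split points than type each time into LosslessCut (for example to feed ffmpeg's -ss and -to options), the YouTube time list converts to seconds easily. A minimal sketch; to_seconds and chapter_table are hypothetical helper names and chapters.txt is an assumed file holding the time list.

```shell
# Convert H:MM:SS (or MM:SS) to whole seconds. awk reads "09" as decimal 9.
to_seconds() {
  printf '%s\n' "$1" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; print s }'
}

# Read "H:MM:SS Title" lines on stdin and print "seconds<TAB>title".
chapter_table() {
  while read -r stamp title; do
    [ -n "$stamp" ] && printf '%s\t%s\n' "$(to_seconds "$stamp")" "$title"
  done
}

# Usage:
# chapter_table < chapters.txt
```

For the list above, "1:11:39 Chapter 6: ..." becomes 4299 seconds.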
Huge audiobook collections gathered from various video / audio sources come in various sample rates. The sample rate of a single source is irrelevant, but mixing sample rates (44.1kHz, 48kHz) in one audiobook creates the problem of chapter times not being accurate.

Opus compression does not depend on the input sample rate; timestamps are measured in 48 kHz units even if the full bandwidth is not used.

1) In freac, if you attempt to encode to a single-file opus chaptered audiobook and you see various sample rates, stop. First re-encode the audio files with videomass before loading them into freac.

Preset filter to use in videomass and output to opus
-vn -c:a libopus -b:a 32k -ar 48000 -af dynaudnorm,aformat=channel_layouts=mono

-ar 48000 (is mandatory) and mono (is highly recommended)

dynaudnorm = Dynamic Audio Normalization so corrects low volume (optional)
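
To check a folder of source files for mixed sample rates before loading anything into freac, ffprobe (installed with ffmpeg) can print the rate of each file. A rough sketch; sample_rate and all_48k are made-up helper names, and the *.mp3 glob is an assumption about your source files.

```shell
# Print the sample rate of the first audio stream of a file
# (assumes ffprobe is installed).
sample_rate() {
  ffprobe -v error -select_streams a:0 -show_entries stream=sample_rate -of csv=p=0 "$1"
}

# Read one sample rate per line on stdin; succeed only if every one is 48000.
all_48k() {
  awk '$1 != 48000 { bad = 1 } END { exit bad }'
}

# Usage:
# for f in *.mp3; do sample_rate "$f"; done | all_48k || echo "re-encode with -ar 48000 first"
```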

Don't forget that with hundreds of chapters you can't access them all in VLC on your computer, so you need to use VLC on your phone.
Shows why Opus has superior audio quality at 16kbps. All audiobooks at 16kbps 48kHz or 44kHz.
Opus is superior at (orange) narrowband 8kHz, (yellow) wideband 16kHz, (green) super-wideband 24kHz and fullband 48kHz, with ultra low bitrate and latency, which is important for real-time communication. Zoom, Discord, PlayStation Network, Jitsi and WhatsApp all use Opus for their voice communication.
To reduce the file size of a video while retaining video resolution and quality, this is all I do with videomass (free GPL; Windows, Linux or Mac), which is just a frontend GUI for ffmpeg, which is also free.

1) Click Presets Manager then drag video into queue

2) choose the following preset (need to set up initially)
-c:v hevc -crf 28 -c:a libopus -b:a 16k -vf scale="-2:720"

3) remove -vf scale="-2:720" if you don't wanna upscale nor downscale this video to 720p

4) click Convert

5) click OK

Containers: there are mp4 and mkv. Telegram likes mp4 so it can stream on iPhones, etc.

Audio codecs: aac (most common) and opus, which I use since it's superior at lower bitrates.

-crf 28 I use for most video clips and use -crf 31 for VHS quality stuff. If you're doing a Hollywood movie use -crf 23 (lower is higher quality). Also you wouldn't re-encode audio so just -c:a copy instead of -c:a libopus and don't specify a bitrate -b:a 16k

ffmpeg 5.x no longer requires -strict -2 for opus with mp4
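
The same preset can be run without the GUI by handing the options straight to ffmpeg. A minimal sketch, assuming ffmpeg 5.x or newer; shrink is a hypothetical wrapper name and the RUN variable is a made-up dry-run hook (RUN=echo prints the arguments instead of encoding).

```shell
# $1 = input, $2 = output, $3 = optional CRF (default 28; 31 for VHS quality
# stuff, 23 for a Hollywood movie, as described above).
shrink() {
  ${RUN:-ffmpeg} -i "$1" -c:v hevc -crf "${3:-28}" -c:a libopus -b:a 16k -vf scale=-2:720 "$2"
}

# Usage:
# shrink input.mp4 output.mp4        # encode with crf 28
# RUN=echo shrink input.mp4 out.mp4  # only show the ffmpeg arguments
```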
Here are the videomass presets I use. You can download them here https://t.iss.one/geektips/331 as a single text file, All Opus, Subs, x265.prst, then choose Import preset.
Obviously you must change the path in a few of them. Or you can copy and paste them in from this post.
https://t.iss.one/geektips/311
Easily remove line breaks in LibreOffice which can occur if you copy text from a PDF.

1) Shows line breaks

2) pressing Ctrl-Shift-L to activate the macro to remove line breaks

3) shows line breaks removed and the text flows to the end of the page

How to setup the macro in LibreOffice
https://t.iss.one/geektips/270