GeekTips
Linux Mint, video encoding, ffmpeg, geek tips, regex, pdf manipulation, substitcher, mpv config
Generate a PDF TOC (Table of Contents) and bookmarks from filenames. I tried jpdftweak, jpdfbookmarks but PDFSAM (Basic Version free, Linux and Windoze) does the job perfectly.
It creates a clickable table of contents at the beginning and also bookmarks. It also compressed the file from 916MB (874MiB) (pdftk *.pdf cat output combined.pdf) to 810MB (773MiB) with PDFSAM.
Showing Thumbnail of the top portion of the page for the clickable link in the Table of Contents.
Back to adding some Subtitles with Gaupol (flatpak).
Edit a few videos together and add soundtracks in Shotcut (Linux, Mac, Windoze GPL)
Using LosslessCut (Linux, Mac, Windoze GPL) to make quick editing cuts of mp3s. Trim off the first 22 seconds and last 20 seconds of each file before encoding into an Opus chaptered audiobook with fre:ac.
These are the options that I use for mp3s. But for videos I use SmartCut or keyframe cuts.
Queue up the files you wish to batch download in Videomass (Linux, Mac, Windoze GPL free), which uses yt-dlp to download from youtube, bitchute, odysee, etc.
Download all the videos in Videomass
Queue up all the videos you wish to re-encode to reduce file size keeping 720p.
-c:v hevc -crf 28 -c:a libopus -b:a 16k -vf scale="-2:720"

The preset I use. -2 keeps the aspect ratio whether you upscale or downscale the video, and rounds the width to an even number. hevc = H.265 (the x265 encoder). I use -crf 28 for most videos, -crf 23 for a great documentary or movie, and -crf 31 for VHS quality.
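To reuse the preset with different -crf values, the options above could be assembled in a small Python helper. This is only a sketch: the function name build_ffmpeg_cmd and the file names in.mp4 and out.mkv are made up for illustration, and the command is printed rather than run.

```python
import shlex

def build_ffmpeg_cmd(src, dst, crf=28, height=720):
    """Return the ffmpeg argument list for the preset above (hypothetical helper)."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "hevc", "-crf", str(crf),        # 28 default, 23 for quality, 31 for VHS-like
        "-c:a", "libopus", "-b:a", "16k",
        "-vf", f"scale=-2:{height}",             # -2: keep aspect ratio, even width
        dst,
    ]

# Placeholder file names; print the command instead of executing it.
print(shlex.join(build_ffmpeg_cmd("in.mp4", "out.mkv")))
print(shlex.join(build_ffmpeg_cmd("in.mp4", "out.mkv", crf=23)))
```

The list can be handed to subprocess.run as-is, which avoids having to quote file names that contain spaces.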
Legogender
Liaspec
Liaspec
Librafeminine
Librafeminine
Libragender
Libragender
Libralesbian
Libralesbian
Libramasculine
Libramasculine
Libramaverique
Libramaverique
Librandrogyne
Librandrogyne
Libranonbinary
Libranonbinary
Lilafluid
Lilafluid
Lingender
Lingender
Linkgender
Littlefluid
Littlefluid
Lolgender
Ludogender
Lunagender
Lunagender
Lunarset
Lunarset

Remove duplicate lines without changing order
nl -w1 gender.txt | sort -k2 | uniq -f1 | sort -n | cut -f2- > output.txt

or this one works too keeping original order
awk '!seen[$0]++' gender.txt > output.txt

Legogender
Liaspec
Librafeminine
Libragender
Libralesbian
Libramasculine
Libramaverique
Librandrogyne
Libranonbinary
Lilafluid
Lingender
Linkgender
Littlefluid
Lolgender
Ludogender
Lunagender
Lunarset
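
The same order-preserving dedup as the awk one-liner can be sketched in Python. The function name dedupe_keep_order is made up, and the sample is just a few names from the list above.

```python
def dedupe_keep_order(lines):
    """Drop repeated lines, keeping the first occurrence of each in place."""
    # dict keys are unique and remember insertion order (Python 3.7+).
    return list(dict.fromkeys(lines))

sample = ["Lolgender", "Lunagender", "Lunagender", "Lunarset", "Lunarset"]
print(dedupe_keep_order(sample))
```

Like awk's seen[$0]++, this keeps the first copy of every line and never sorts, so the original order survives.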
regex searches for and replaces a dash - followed by exactly 13 digits
-(\d){13}
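A quick way to see what {13} does is to try it in Python. This is only a sketch: the file names are invented, and Python's re module stands in for whatever renaming tool is actually used. {13} means exactly 13 repetitions; "up to 13" would be written {1,13}.

```python
import re

# Hypothetical file names: one with a 13-digit suffix, one with only 5 digits.
names = ["clip-1234567890123.mp4", "clip-12345.mp4"]

exactly_13 = re.compile(r"-(\d){13}")   # dash + exactly 13 digits
up_to_13 = re.compile(r"-\d{1,13}")     # dash + between 1 and 13 digits

cleaned_exact = [exactly_13.sub("", n) for n in names]
cleaned_upto = [up_to_13.sub("", n) for n in names]
print(cleaned_exact)
print(cleaned_upto)
```

Only the 13-digit name is changed by the first pattern; the second pattern strips the short suffix too.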
regex searches for and replaces any digits mixed with periods / dots, like 00.34.77, which are the timestamps created by LosslessCut
I used a run of . (each one matches any single character) instead of a quantifier. I used 23 or 24 ..... periods.
-(\d)........................

this also works
-(\d+).(\d+).(\d+).(\d+).(\d+).(\d+).(\d+).(\d+)
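
The variants above can be compared in a small Python sketch. The file name is hypothetical, and the -hh.mm.ss.mmm-hh.mm.ss.mmm suffix is an assumption inferred from the length of the patterns. .{24} is the quantifier form of 24 dots, at least in Python's re module.

```python
import re

# Hypothetical LosslessCut-style name: "-start-end" timestamps before the extension.
name = "Chapter One-00.00.22.000-00.45.10.000.mp3"

patterns = [
    r"-(\d)" + "." * 24,   # a digit, then 24 dots typed out
    r"-(\d).{24}",         # the same thing written with a quantifier
    r"-(\d+).(\d+).(\d+).(\d+).(\d+).(\d+).(\d+).(\d+)",  # eight digit groups
]

# The unescaped . matches any character, including the dash between the two timestamps.
results = [re.sub(p, "", name, count=1) for p in patterns]
print(results)
```

All three remove the same span here, leaving only the name and the extension.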
(\D)-(\d)........................
$1
There are multiple dashes in the filename, so (\D) matches a non-digit (like a, b, c, d, etc.) followed by a dash, and the replacement $1 puts back that first captured character, which is the letter. Otherwise the last letter gets chopped off the end of the filename.

If there is a numeral at the end, like Part 1, change the first (\D) to lowercase (\d) to match a digit, like so
(\d)-(\d)........................
\1
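
Here is a minimal Python sketch of why the capture group has to be put back. The file names and the timestamp suffix are hypothetical. Python writes the replacement as \1; other tools use $1.

```python
import re

# Hypothetical names with an assumed LosslessCut-style timestamp suffix.
letter_name = "Chapter One-00.00.22.000-00.45.10.000.mp3"
digit_name = "Part 1-00.00.22.000-00.45.10.000.mp3"

# (\D) captures the non-digit before the dash; \1 puts it back.
kept = re.sub(r"(\D)-(\d).{24}", r"\1", letter_name)
# Replacing with nothing also deletes the captured letter.
chopped = re.sub(r"(\D)-(\d).{24}", "", letter_name)
# When the name ends in a digit, capture (\d) instead.
part = re.sub(r"(\d)-(\d).{24}", r"\1", digit_name)

print(kept, chopped, part, sep="\n")
```

The second result shows the chopped-off letter that the $1 / \1 replacement avoids.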
This book I'm making into an audiobook but the original OCR on the document is pretty much impossible to correct.

pdftotext -layout book.pdf output.txt
so I had to force OCR it
ocrmypdf --force-ocr book.pdf book_ocr.pdf
and now it's a tad better. Formatting isn't all that important for text to speech.
Removing hyphens from words hyphenated at the end of a line. For the text to speech to work correctly, defi- nitely needs to become definitely, while non-partisanship must not change, as it's correct as it is.
-$\n\s+
- is a dash
$ says at the end of a line
\n is a line break
\s is whitespace (blank spaces)
\s+ is one or more whitespace characters
It didn't get im- portance nor cir- cumstances, as there wasn't any whitespace after the line break. So search and replace all again using -$\n
now importance and circumstances are correct and non-partisanship isn't changed. Now just have to spell check it before feeding it to ttstool (text to speech)
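
The two search-and-replace passes can be folded into one by using \s* (zero or more whitespace) instead of \s+. This is a sketch in Python, not the editor I actually used; the function name dehyphenate and the sample text are made up.

```python
import re

def dehyphenate(text):
    """Join words that were split with a hyphen at the end of a line."""
    # re.MULTILINE lets $ match at the end of every line, not only at the end of the text.
    # \s* also covers the case where the next line has no leading whitespace.
    return re.sub(r"-$\n\s*", "", text, flags=re.MULTILINE)

sample = "defi-\n   nitely\nnon-partisanship\nim-\nportance\n"
print(dehyphenate(sample))
```

A hyphen in the middle of a line, as in non-partisanship, is never touched. A compound that happens to break right at its own hyphen would be joined too, so the spell check afterwards is still worth doing.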