GeekTips
Linux Mint, video encoding, ffmpeg, geek tips, regex, pdf manipulation, substitcher, mpv config
Per a decibel chart, anything at -50 dB or below seems good to mark as silence, since normal conversation is around 60 dB. Add this part in Videomass to remove silence from the entire audio (beginning, middle, end). Useful for doing hundreds of files at a time.

-c:a libopus -vbr off -b:a 32k -ar 48000 -af highpass=500,lowpass=1000,afftdn,aformat=channel_layouts=stereo,volume=12dB,"silenceremove=start_periods=1:stop_periods=-1:start_threshold=-50dB:stop_threshold=-50dB:start_silence=1:start_duration=2:stop_duration=5:detection=peak",dynaudnorm

If you only wish to remove silence from the beginning of the audio, set stop_periods=0 instead (leaving start_periods=1).

What's the equivalent of Audacity's minimum silence duration, say 5 seconds? stop_duration=5 is.
Silence removed with Videomass... total is 3m 3s
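If you want to run the same filter straight from ffmpeg rather than Videomass, here's a batch sketch. It's a dry run that only prints each command (the filenames in1.m4a/in2.m4a are assumptions); delete the leading echo to actually encode.

```shell
# Dry run: print the full ffmpeg command for each input file instead of running it.
AF='silenceremove=start_periods=1:stop_periods=-1:start_threshold=-50dB:stop_threshold=-50dB:detection=peak:start_duration=2:stop_duration=5'
for f in in1.m4a in2.m4a; do
  echo ffmpeg -i "$f" -c:a libopus -b:a 32k -ar 48000 -af "$AF,dynaudnorm" "${f%.m4a}.opus"
done
```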
When an audiobook doesn't have chapters and you're too lazy to automatically detect silent breaks (like 3-second gaps) with Audacity, use LosslessCut: Edit | Segments | Create fixed duration segments, setting 30-minute chapters.
followup to removing silence with ffmpeg
I guess minimum silence duration works with, for example, stop_duration=30. It always removes silence at the beginning, though.

In the black waveform you can see the 12s gap is still there, as are portions of the 1m 8s gap (1st part 14s, 2nd part 8s; the 3rd part was longer than 30s and got removed). It also removed the 1m 20s gap and seemed to trim the end.

The red waveform shows stop_duration=15, and it seems to have worked well.

Last thing: I tried start_duration=2 and it removed one click but missed the other. So based on this, it should be pretty safe to use these default settings.

stop_duration=30:start_duration=2
Image upscaling 2x, 4x, 8x, etc. Some online services do this. GIMP | Image | Scale with Linear, Cubic or NoHalo doesn't compare, even if you apply a Sharpen filter afterwards. Not even close.

https://github.com/nihui/waifu2x-ncnn-vulkan

./waifu2x-ncnn-vulkan -i ~/Pictures/rainbow.jpg -o ~/Pictures/rainbow4x.jpg -s 4 -n 1

Usage: waifu2x-ncnn-vulkan -i infile -o outfile [options]...

-n noise-level   denoise level (-1/0/1/2/3, default=0)
-s scale         upscale ratio (1/2/4/8/16/32, default=2)
-t tile-size     >=32, default=auto; if you get a GPU error, reduce the tile size to something like 64 or 128

to batch process see here
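In case that link doesn't survive here, a minimal batch loop might look like this. It's a dry run that just prints each command (remove the echo to actually process); the input/ and output/ directory names are assumptions, and as shown further down the tool can also take directories directly with -i and -o.

```shell
# Dry-run sketch of batch 2x upscaling every jpg in input/ into output/.
mkdir -p output
for f in input/*.jpg; do
  echo ./waifu2x-ncnn-vulkan -i "$f" -o "output/$(basename "$f")" -s 2 -n 0
done
```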
rainbow.jpg (132.6 KB): original image, 680x549
rainbow4x.jpg (3.2 MB): upscaled 4x to 2720x2196

I tried a few online AI image upscaling services and this gives similar results.
rainbowGimp.jpg (1.3 MB): upscaled 4x in GIMP with NoHalo, and the result is just terrible.
rainbow_4ximgscaler.com.jpg (2.1 MB): imgscaler dot com ($$$) for comparison, 4x to 2720x2196
VLC on your computer (Linux, Mac or Windows) shows chapters. In Preferences, under Interface settings, set Continue playback to Always so it'll resume from where you stopped.

If you wish to increase or decrease the playback speed (0.25x to 4.00x), be sure to check under Preferences | Audio that Enable time-stretching audio is ticked, as it adjusts the pitch to improve output at faster or slower speeds. Unfortunately, VLC on Linux doesn't show chapter durations or starting times.
VLC (free) on iOS can't stop playback at the end of a chapter, but it does have a sleep timer and supports variable playback speed (0.25x to 8.00x). On iOS it'll show the duration of each chapter, while VLC on Android shows the starting time of each chapter. In VLC Settings on iOS, make sure Continue audio playback is set to Always so it'll resume from the point you last listened to. For variable playback speed, make sure Time-stretching audio is checked.
Making a PDF with an index of images along with their corresponding filenames. (Not technically bookmarks, though even I'm confused about which term is correct, it's so confusing. Bookmarks are usually user-created markers for certain points while reading a PDF document, so index would be more accurate.)

The plan is to compress each image, then feed everything to PDFSAM to create the indexed PDF. Plain img2pdf *.jpg -o output.pdf gives one PDF with all the images but doesn't show the filenames of the jpgs, thus this solution. Install ocrmypdf and parallel if they aren't already installed. If you wish to avoid ocrmypdf's image compression, convert/compress beforehand with whatever image editor, as img2pdf builds the PDF with lossless compression.

Open a terminal in the directory of jpgs or other images:

mkdir output ; parallel -j2 img2pdf {} -o 'output/{.}.pdf' ::: *.jpg && parallel --tag -j2 ocrmypdf -s -O 2 --skip-big .1 '{}' './{}' ::: output/*.pdf


-j2 = max two jobs simultaneously
-O 2 = optimization level; 3 is best if image quality is acceptable, which it usually is. Only when OCRing scanned text does it sometimes blur the text; then use 2 instead. (It's a letter O, not the numeral zero.)
-s = skip text OCR
--skip-big .1 = skips OCR on any image larger than 0.1 megapixels, which here is every page
{.} = outputs the filename without its extension. If you just use {} there, it outputs filename.jpg.pdf instead of the filename.pdf you want. I did that at first and had to batch rename them with Thunar or Nemo. The command line to batch rename files from filename.jpg.pdf to filename.pdf:
rename 's/\.jpg//' *.pdf
But luckily this step is avoided.
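If you don't have GNU parallel, the same two-stage pipeline can be sketched as a plain sequential loop. This is a dry run that only echoes each command; remove the echo to actually execute (img2pdf and ocrmypdf with the flags explained above).

```shell
# Stage 1: one PDF per jpg, named without the .jpg extension.
mkdir -p output
for f in *.jpg; do
  echo img2pdf "$f" -o "output/${f%.jpg}.pdf"
done
# Stage 2: optimize each PDF in place with ocrmypdf.
for p in output/*.pdf; do
  echo ocrmypdf -s -O 2 --skip-big .1 "$p" "$p"
done
```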

In PDFSAM Basic drag all pdf files into the merge section and under Bookmarks handling: Create one entry for each merged document. Done.

If it's a huge number of images and PDFSAM crashes, launch the PDFSAM Java app with 4.4 GB of RAM like so: java -Xmx4400m -jar /opt/pdfsam-basic/pdfsam-basic-4.3.1.jar

To quickly compare which optimization level you want:
ocrmypdf -O 2 -s --skip-big .1 someimage.jpg someimage_opt2.pdf
ocrmypdf -O 3 -s --skip-big .1 someimage.jpg someimage_opt3.pdf
Original image in PDF: 240 KB; -O 2: 205 KB; -O 3: 115 KB. To me the file size savings with still-acceptable image quality are worth it, so I'm going with -O 3 (optimization level 3).

For these 175 celebrity images it went from 39 MB down to 18 MB, so a tad more than 2x file size savings.
Then clean with ExifCleaner (free app; I use the AppImage on Linux) to strip out all PDF metadata. In this case it deleted Creator, Producer and ModifyDate.
I wanted the images pretty much uniform in size (no PDF with small, medium and super-large images mixed together), so I capped all of them at a max width and height of 1000 pixels using
img2pdf --imgsize 1000x1000 *.jpg -o output.pdf

But before I did that, there were some small, say 400-pixel, photos in the Ben Garrison, Sheeple and Ads collections. So I batch-upscaled many of them 2x with waifu2x-ncnn-vulkan; see the next post. Even if an image is already, say, 1500x1500 and you upscale it 2x to 3000x3000, that's fine, since img2pdf caps it at 1000x1000 anyway.
Copy waifu2x-ncnn-vulkan into the /usr/bin path with elevated privileges:
sudo nemo
or sudo thunar
which lets us copy the 3 model directories and the waifu2x-ncnn-vulkan executable to /usr/bin

command line (-r = recursive)
sudo cp -r waifu2x-ncnn-vulkan models-* /usr/bin
when you wish to delete (be careful)
cd /usr/bin
sudo rm -r waifu2x-ncnn-vulkan models-*

Upscale only one image
waifu2x-ncnn-vulkan -i input.jpg -o output.jpg -s 4 -n 2
To batch upscale many images (jpg/png/webp) and specify the input and output directory plus the image format
waifu2x-ncnn-vulkan -i ~/Pictures/input -o ~/Pictures/output -s 2 -n 0 -f jpg -t 64

-s = upscale ratio (1/2/4/8/16/32, default=2)
-n = denoise level (-1/0/1/2/3, default=0)
Don't need an expensive AMD or Nvidia graphics card either. Just takes longer.
-f = format type for batch-processing directories
-t = tile size (>=32, default=auto). I got a GPU error, so I reduced the tile size to 64 with -t 64 and it worked fine, albeit even slower.
Shows the white space that k2pdfopt crops/trims/removes automatically.

k2pdfopt input.pdf -ui- -x -mode tm -om 0.01,0.01,0.01,0.01 -c

outputs a file named input_k2opt.pdf
-ui- disables the interactive GUI on Linux
-x exits when finished
-c color output (the default is black and white)
-mode tm trim margins (auto crop)
-om 0.01,0.01,0.01,0.01 output margins; adds just a little margin to the left, top, right and bottom of the pages

OCR works, but I'll stick with ocrmypdf: when k2pdfopt OCRs text and images, it converts the text to images, which is terrible (images with overlaid OCR text). With ocrmypdf you can --force-ocr and it keeps your text as text, overlaying text on the images too. Huge, major difference.

k2pdfopt input.pdf -ui- -p 1-4 -x -mode tm -om 0.01,0.01,0.01,0.01 -ocr t -ocrhmax 1.5 -ocrdpi 400 -ocrvis s -ocrd p -c

-p 1-4 = page range (pages 1-4); 1- is page 1 to the end, e for even and o for odd pages
-ocrd p sends Tesseract a page at a time rather than a line at a time. This was necessary.
k2pdfopt interactive mode: use this the first time if you decide to OCR, as it'll download the language training model automatically.
Used ScanTailor to take a PDF from 19.2 MB (left side, with yellow background) down 3x to 6.5 MB (right side), put uniform margins on each page and add a chapter index.

1) extract images with pdfimages

2) change black background with White text to a White background and black texts with convert

3) ScanTailor margins, deskew, despeckle

4) combine .tif images and pipe to ocrmypdf for OCR and image optimization

5) create bookmarks / index with booky.sh
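Step 4 is the only one without a command further down, so here's a sketch of one way to do it: img2pdf can write the combined PDF to stdout, and ocrmypdf accepts - to read from stdin. The out/ directory (ScanTailor's output) and the file names are assumptions; the command is built into a variable and only echoed as a dry run.

```shell
# Combine ScanTailor's .tif pages and pipe into ocrmypdf for OCR + optimization.
CMD='img2pdf out/*.tif | ocrmypdf -O 3 --skip-big .1 - combined_ocr.pdf'
echo "$CMD"   # dry run; execute with: eval "$CMD"
```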
pdfimages -list -f 1 -l 5 Conquest-of-a-Continent.pdf

-list only lists the images in the PDF without extracting them
-f first page to process
-l last page to process

Showing just pages 1 to 5, you can see each page has 3 multi-layered images: an image with an alpha channel, a yellow background, and a JBIG2 smask (soft mask).

In the directory with the Conquest of a Continent PDF:
mkdir images
pdfimages -jp2 -p Conquest-of-a-Continent.pdf images/img

Extracts all images from the PDF; -p prepends the page number to each image
img-001-000.jp2
img-001-001.jp2
img-001-002.pbm
img-002-003.jp2
img-002-004.jp2
img-002-005.pbm
img-003-006.jp2
img-003-007.jp2
img-003-008.pbm

The cover looks bad, so screenshot it and save it as -001cover.jpg in the images/ directory. (If you save the image from the PDF directly, it'll have an alpha channel.)

We only want the pbm files, so delete the .jp2 files
cd images/
rm *.jp2
The pbm images are inverted, so we need to negate (invert) them with convert to get a white background with black text. We'll also convert them to png so ScanTailor can read them.

for f in *.pbm; do convert "$f" -negate "${f%.pbm}.png"; done

Check that the png images look good, then delete the pbm images
rm *.pbm

edit: Oops, found out ScanTailor can convert that yellow background to white itself, so no need for pdfimages, inverting, etc. If it's a multi-layered PDF (masks, soft masks, stencils) and you want one image per page instead of 3, use pdftoppm instead of pdfimages

pdftoppm -png -r 300 input.pdf dump/img