Aspiring Data Science
384 subscribers
465 photos
12 videos
12 files
2.16K links
Заметки экономиста о программировании, прогнозировании и принятии решений, научном методе познания.
Контакт: @fingoldo

I call myself a data scientist because I know just enough math, economics & programming to be dangerous.
Download Telegram
#postgres #pgbench #zfs

Потестил pg_bench PG15 на SSD в ext4 и zfs (с шифрованием/сжатием и без) в Ubuntu 22.04.

ZFS ENCRYPTED & COMPRESSED (LZ4):

sudo -i -u postgres pgbench -c 10 -j 2 -t 10000 example

latency average = 4.891 ms
initial connection time = 34.946 ms
tps = 2044.768698 (without initial connection time)

sudo -i -u postgres pgbench -c 10 -j 6 -t 10000 example

latency average = 4.552 ms
initial connection time = 17.378 ms
tps = 2196.925697 (without initial connection time)

ZFS ENCRYPTED & COMPRESSED (LZ4) TUNED WITH TIMESCALEDB:

sudo -i -u postgres pgbench -c 10 -j 2 -t 10000 example

latency average = 4.031 ms
initial connection time = 31.661 ms
tps = 2480.899924 (without initial connection time)

latency average = 3.850 ms
initial connection time = 31.273 ms
tps = 2597.574037 (without initial connection time)

latency average = 3.889 ms
initial connection time = 32.503 ms
tps = 2571.192661 (without initial connection time)


sudo -i -u postgres pgbench -c 10 -j 6 -t 10000 example

latency average = 3.784 ms
initial connection time = 27.329 ms
tps = 2642.536852 (without initial connection time)

latency average = 3.741 ms
initial connection time = 17.922 ms
tps = 2673.011469 (without initial connection time)

latency average = 3.679 ms
initial connection time = 16.235 ms
tps = 2718.262035 (without initial connection time)


ZFS NON-ENCRYPTED & COMPRESSED (LZ4):

sudo -i -u postgres pgbench -c 10 -j 2 -t 10000 example

latency average = 4.660 ms
initial connection time = 38.052 ms
tps = 2146.130682 (without initial connection time)

latency average = 4.507 ms
initial connection time = 39.378 ms
tps = 2218.757361 (without initial connection time)

latency average = 4.266 ms
initial connection time = 36.697 ms
tps = 2344.329379 (without initial connection time)


sudo -i -u postgres pgbench -c 10 -j 6 -t 10000 example

latency average = 3.936 ms
initial connection time = 21.317 ms
tps = 2540.753366 (without initial connection time)

latency average = 3.915 ms
initial connection time = 17.523 ms
tps = 2554.329177 (without initial connection time)

latency average = 4.076 ms
initial connection time = 21.361 ms
tps = 2453.572519 (without initial connection time)


ZFS NON-ENCRYPTED & COMPRESSED (LZ4) TUNED WITH TIMESCALEDB:

sudo -i -u postgres pgbench -c 10 -j 2 -t 10000 example

latency average = 3.648 ms
initial connection time = 34.840 ms
tps = 2741.206504 (without initial connection time)

latency average = 3.473 ms
initial connection time = 39.593 ms
tps = 2879.122987 (without initial connection time)


latency average = 3.592 ms
initial connection time = 34.987 ms
tps = 2783.957080 (without initial connection time)

sudo -i -u postgres pgbench -c 10 -j 6 -t 10000 example

latency average = 3.346 ms
initial connection time = 21.312 ms
tps = 2988.591798 (without initial connection time)

latency average = 3.178 ms
initial connection time = 18.359 ms
tps = 3147.071197 (without initial connection time)

latency average = 3.337 ms
initial connection time = 19.320 ms
tps = 2996.486500 (without initial connection time)

EXT4 NON-ENCRYPTED NON-COMPRESSED TUNED WITH TIMESCALEDB

sudo -i -u postgres pgbench -c 10 -j 2 -t 10000 example

latency average = 2.596 ms
initial connection time = 86.466 ms
tps = 3852.682810 (without initial connection time)

latency average = 2.751 ms
initial connection time = 89.400 ms
tps = 3635.586811 (without initial connection time)

latency average = 2.719 ms
initial connection time = 92.922 ms
tps = 3677.184800 (without initial connection time)

sudo -i -u postgres pgbench -c 10 -j 6 -t 10000 example

latency average = 2.643 ms
initial connection time = 50.687 ms
tps = 3784.041569 (without initial connection time)

latency average = 2.518 ms
initial connection time = 42.306 ms
tps = 3971.158272 (without initial connection time)

latency average = 2.560 ms
initial connection time = 40.711 ms
tps = 3906.430062 (without initial connection time)
#postgres

"pg_repack is a PostgreSQL extension which lets you remove bloat from tables and indexes, and optionally restore the physical order of clustered indexes. Unlike CLUSTER and VACUUM FULL it works online, without holding an exclusive lock on the processed tables during processing. pg_repack is efficient to boot, with performance comparable to using CLUSTER directly."

https://github.com/reorg/pg_repack/
#db #tuning #postgres

Любопытная ситуация. За много лет DBA-шное сообщество, кажется, ничего так и не придумало для оптимизации параметров СУБД. Уж на примере Постгре точно. Разрабам ядра PG положить с прибором на это, они слишком старомодны и консервативны, до сих пор считают, что рабочую память, да и все остальные параметры, сисадмин должен ручками прописывать. Есть pg_bench, но неясно, как его толком применить к настройке. Ведь параметры СУБД взаимодействуют с настройками ОС и ФС, причем зачастую нелинейно. А есть же ещё параметры железа. А еще версии ОС, ФС, СУБД. А еще разные запросы и разное распределение данных в таблицах, поэтому универсальные рекомендации дать трудно. Надо или брутфорсить сотни тысяч комбинаций pgbench-ем (на сервере близком к боевому), или оверпровижинить, или забивать на оптимальность. Почему я один в этом вижу проблему? PostgresPro, возможно, как-то над этим работают, но с их ценником в миллион за ядро я их даже не рассматриваю.
#postgres #db

А тем временим готовится к релизу Постгре 16.

Из интересного:

Allow parallelization of FULL and internal right OUTER hash joins (Melanie Plageman, Thomas Munro)
Allow optimization of always-increasing window functions ntile(), cume_dist() and percent_rank() (David Rowley)

Add system view pg_stat_io view to track IO statistics (Melanie Plageman)
Record statistics on the last sequential and index scans on tables (Dave Page) This information appears in pg_stat_all_tables and pg_stat_all_indexes.

Create a predefined role and grantable privilege with permission to perform maintenance operations (Nathan Bossart) The predefined role is is called pg_maintain.
Add predefined role pg_create_subscription with permission to create subscriptions (Robert Haas)

Add support for regular expression matching on database and role entries in pg_hba.conf (Bertrand Drouvot)

Allow NUMERIC to process hexadecimal, octal, and binary integers of any size (Dean Rasheed) Previously only unquoted eight-byte integers were supported with these non-decimal bases.
Allow underscores in integer and numeric constants (Peter Eisentraut, Dean Rasheed) This can improve readability for long strings of digits.
Accept the spelling "+infinity" in datetime input (Vik Fearing)

Allow parallel application of logical replication (Hou Zhijie, Wang Wei, Amit Kapila)

Allow the STORAGE type to be specified by CREATE TABLE (Teodor Sigaev, Aleksander Alekseev) Previously only ALTER TABLE could control this

Allow subqueries in the FROM clause to omit aliases (Dean Rasheed)

Change date_trunc(unit, timestamptz, time_zone) to be an immutable function (Przemyslaw Sztoch) This allows the creation of expression indexes using this function.
Add functions array_sample() and array_shuffle() (Martin Kalcher)
Add aggregate function ANY_VALUE() which returns any value from a set (Vik Fearing)
Add function random_normal() to supply normally-distributed random numbers (Paul Ramsey)
Add error function erf() and its complement erfc() (Dean Rasheed)

Add libpq option sslcertmode to control transmission of the client certificate (Jacob Champion) The option values are "disable", "allow", and "require".

Add LZ4 and Zstandard compression to pg_dump (Georgios Kokolatos, Justin Pryzby)
Allow pg_dump and pg_basebackup to use "long" mode for compression (Justin Pryzby)

Add support for SSE2 (Streaming SIMD Extensions 2) vector operations on x86-64 architectures (John Naylor) вот это странно, этой поддержки не было, чтоль?
Add support for Advanced SIMD (Single Instruction Multiple Data) (NEON) instructions on ARM architectures (Nathan Bossart)

Require Windows 10 or newer versions (Michael Paquier, Juan José Santamaría Flecha) Previously Windows Vista and Windows XP were supported. Вот это вряд ли можно считать улучшением.


https://www.postgresql.org/docs/16/release-16.html
#databases #postgres

А тем временем вышла 17-я постгре! Много улучшений. Я не понимаю, как у них поднимается рука релизить заведомо слабые решения, а потом течение лет и десятилетий их постепенно допиливать, но хоть так.

https://www.postgresql.org/about/news/postgresql-17-beta-1-released-2865/