Greenplum secrets🎩
699 subscribers
38 photos
8 videos
10 files
67 links
The channel about best practice coding for Greenplum / Канал ΠΎ Ρ‚ΠΎΠΌ ΠΊΠ°ΠΊ ΠΏΠΈΡΠ°Ρ‚ΡŒ ΠΎΠΏΡ‚ΠΈΠΌΠ°Π»ΡŒΠ½Ρ‹ΠΉ ΠΊΠΎΠ΄ Π² Greenplum. by @smartyru
Download Telegram
The channel about best practice coding for Greenplum
Канал ΠΏΠΎ сути Π±ΡƒΠ΄Π΅Ρ‚ Π² Ρ„ΠΎΡ€ΠΌΠ°Ρ‚Π΅ "ΠŸΡ€ΠΎΠ³Ρ€Π°ΠΌΠΌΠΈΡ€ΠΎΠ²Π°Π½ΠΈΠ΅ Greenplum для Ρ‡Π°ΠΉΠ½ΠΈΠΊΠΎΠ²", Π³Π΄Π΅ Π±ΡƒΠ΄Ρƒ Π΄Π΅Π»ΠΈΡ‚ΡŒΡΡ
Ρ€Π΅Ρ†Π΅ΠΏΡ‚Π°ΠΌΠΈ ΠΊΠ°ΠΊ Π½Π΅ стоит ΠΏΠΈΡΠ°Ρ‚ΡŒ ΠΊΠΎΠ΄ Π² Π΄Π°Π½Π½ΠΎΠΉ Π‘Π£Π‘Π” класса MPP ( массивно-ΠΏΠ°Ρ€Π°Π»Π»Π΅Π»ΡŒΠ½Π°Ρ ΠΎΠ±Ρ€Π°Π±ΠΎΡ‚ΠΊΠ°), ΠΈΡ‚Π°ΠΊ ΠΏΠΎΠ΅Ρ…Π°Π»ΠΈ!

The channel is a kind of "Greenplum for Dummies" where author will share pieces of advice
how to create optimal code in MPP database, let's go!
πŸ‘4
Π‘Π΅ΠΊΡ€Π΅Ρ‚ 1 (ΠŸΡ€ΠΎΠ±Π»Π΅ΠΌΠ° пустой Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹)
Π”Π°ΠΆΠ΅ select ΠΈΠ· пустой Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹ с Ρ„ΠΈΠ»ΡŒΡ‚Ρ€ΠΎΠΌ ΠΏΠΎ тяТСлой Ρ‚Π°Π±Π»ΠΈΡ†Π΅ ΠΌΠΎΠΆΠ΅Ρ‚ Π·Π°Π²ΠΈΡΠ½ΡƒΡ‚ΡŒ, Π²Ρ‹Π·Π²Π°Π² спилл (ΡƒΡ‚Π΅Ρ‡ΠΊΠ° памяти)

Secret 1 (Empty table surprise)
Even empty table query filtered by heavy table may run slow and create spill

--bad code:
SELECT t.a, t.b, t.c
FROM foo t
where t.key not in (select pk from big_tbl)

--good code:
with a as
(select pk from big_tbl intersect select key from foo)
SELECT t.a, t.b, t.c
from foo t
where t.key not in (select pk from a)
Π‘Π΅ΠΊΡ€Π΅Ρ‚ 2 (УскоряСм GROUP BY)
ΠžΠ±Ρ‹Ρ‡Π½Ρ‹ΠΉ GROUP BY ΠΏΡ€ΠΈ Π½Π°Π»ΠΈΡ‡ΠΈΠΈ большого числа ΠΌΠ΅Ρ‚Ρ€ΠΈΠΊ ΠΌΠΎΠΆΠ½ΠΎ ΡƒΡΠΊΠΎΡ€ΠΈΡ‚ΡŒ, ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΡ CTE, Ρ‚Π°ΠΊΠΆΠ΅ снизив Ρ€Π°Π·ΠΌΠ΅Ρ€ спилла

Secret 2(Speed up GROUP BY)
Simple GROUP BY operator with a lot of aggregates can be boosted using CTE (cutting spill usage as well)

--sub-optimal code:
select dim1, dim2, dim3, sum(m1), sum(m2), ..., sum(m50)
from foo
group by 1, 2, 3

--good code
with a as
(select dim1, dim2, dim3, sum(m1), sum(m2), ..., sum(m25)
group by 1, 2, 3),
b as
(select dim1, dim2, dim3, sum(m26) m26_agg, sum(m27) m27_agg, ..., sum(m50) m50_agg
group by 1, 2, 3)
select a.*, b.m26_agg, b.m27_agg, ..., b.m50_agg
from a, b
where a.dim(1, 2, 3) = b.dim(1, 2, 3) -- краткая Ρ„ΠΎΡ€ΠΌΠ° записи для join-Π° ΠΏΠΎ всСм измСрСниям / short form of a.dim1 = b.dim1 and ... and a.dim3 = b.dim3
πŸ”₯1
Π‘Π΅ΠΊΡ€Π΅Ρ‚ 3 (ВСздСсущая статистика)
Частой ΠΏΡ€ΠΈΡ‡ΠΈΠ½ΠΎΠΉ зависания запроса ΠΈ/ΠΈΠ»ΠΈ Π³Π΅Π½Π΅Ρ€Π°Ρ†ΠΈΠΈ спилла являСтся отсутствиС статистики ΠΏΠΎ Ρ‚Π°Π±Π»ΠΈΡ†Π°ΠΌ запроса.
Если ваш ETL основан Π½Π° PL/pgSQL функциях, Ρ‚ΠΎ Ρ€Π΅ΠΆΠΈΠΌΠΎΠΌ сбора статистики ( Ссли Π½Π΅ Π·Π°Π΄Π°Π½ Π½Π° ΡƒΡ€ΠΎΠ²Π½Π΅ GUC - Global User Configuration )
Ρ€ΡƒΠ»ΠΈΡ‚ ΠΏΠ°Ρ€Π°ΠΌΠ΅Ρ‚Ρ€ gp_autostats_mode_in_functions, ΠΊΠΎΡ‚ΠΎΡ€Ρ‹ΠΉ Π΄ΠΎΠ»ΠΆΠ΅Π½ Π±Ρ‹Ρ‚ΡŒ установлСн Π² самой Ρ„ΡƒΠ½ΠΊΡ†ΠΈΠΈ ( Π½Π΅ Π΄ΠΎ Π΅Π΅ Π²Ρ‹Π·ΠΎΠ²Π° )

Secret 3 (Omnipresent statistics)
A common reason for a query to hang and/or generate a spill is the lack of statistics on the query tables.
If your ETL is built on PL/pgSQL functions, then the statistics gather mode ( if not set by GUC - Global User Configuration )
is managed by gp_autostats_mode_in_functions parameter which must be set in the function itself ( otherwise it doesn't matter )

CREATE OR REPLACE FUNCTION public.fn_foo()
RETURNS boolean
LANGUAGE plpgsql
AS $function$
begin
set gp_autostats_mode_in_functions = on_no_stats; -- бСзуслловный сбор статистики ΠΏΠΎ Ρ‚Π°Π±Π»ΠΈΡ†Π°ΠΌ, ΠΌΠΎΠ΄ΠΈΡ„ΠΈΡ†ΠΈΡ€ΡƒΠ΅ΠΌΡ‹ΠΌ Π² Ρ„ΡƒΠ½ΠΊΡ†ΠΈΠΈ / always gather stat for all tables modified by function
-- Π”Ρ€ΡƒΠ³ΠΈΠ΅ значСния / other values:
-- none - Π½Π΅ ΡΠΎΠ±ΠΈΡ€Π°Ρ‚ΡŒ статистику / don't gather stat,
-- on_change - ΡΠΎΠ±ΠΈΡ€Π°Ρ‚ΡŒ статистику Ссли число ΠΈΠ·ΠΌΠ΅Π½Π΅Π½Π½Ρ‹Ρ… записСй прСвысило Π·Π°Π΄Π°Π½Π½Ρ‹ΠΉ ΠΏΠΎΡ€ΠΎΠ³ / gather stat if the number of modified records exceeds specified threshold
...
πŸ‘3πŸ”₯1
Π‘Π΅ΠΊΡ€Π΅Ρ‚ 4 (ΠžΡΡ‚ΠΎΡ€ΠΎΠΆΠ½ΠΎ - РСкурсия!)

Π‘Π»Π΅Π΄ΡƒΠ΅Ρ‚ ΠΈΠ·Π±Π΅Π³Π°Ρ‚ΡŒ рСкурсивных запросов , Ρ‚.ΠΊ. join Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹ с Π½Π΅ΠΉ ΠΆΠ΅ нСсовмСстим с ΠΊΠΎΠ½Ρ†Π΅ΠΏΡ†ΠΈΠ΅ΠΉ shared-nothing.
Π˜Π½Ρ‹ΠΌΠΈ словами, такая опСрация ΠΏΡ€ΠΈΠ²ΠΎΠ΄ΠΈΡ‚ ΠΊ Ρ‚ΠΈΡ€Π°ΠΆΠΈΡ€ΠΎΠ²Π°Π½ΠΈΡŽ Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹ Π½Π° ΠΊΠ°ΠΆΠ΄Ρ‹ΠΉ ΡƒΠ·Π΅Π» класСра, Ρ‚.ΠΊ.
Π½Π΅ ΠΌΠΎΠΆΠ΅Ρ‚ Π²Ρ‹ΠΏΠΎΠ»Π½ΡΡ‚ΡŒΡΡ локально (Ρ‚ΠΎΠ»ΡŒΠΊΠΎ Π½Π° своСм) Π²Π²ΠΈΠ΄Ρƒ распрСдСлСния Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹ Ρ‚ΠΎΠ»ΡŒΠΊΠΎ ΠΏΠΎ ΠΎΠ΄Π½ΠΎΠΌΡƒ ΠΈΠ· ΠΊΠ»ΡŽΡ‡Π΅ΠΉ ΠΏΡ€Π΅Π΄ΠΈΠΊΠ°Ρ‚Π° запроса.
Однако, Ссли join-ΠΈΡ‚ΡŒ Ρ‚Π°Π±Π»ΠΈΡ†Ρƒ Π½Π΅ саму Π½Π° сСбя ( Ρ€Π°ΠΏΡ€Π΅Π΄Π΅Π»Π΅Π½Π½ΡƒΡŽ ΠΏΠΎ AGREEMENT_PK Π² ΠΏΡ€ΠΈΠΌΠ΅Ρ€Π΅ Π½ΠΈΠΆΠ΅), Π° с Π΅Π΅ ΠΊΠΎΠΏΠΈΠ΅ΠΉ, распрСдСлСнной ΠΏΠΎ PREVIOUS_AGREEMENT_PK( 2Π΅ ΠΏΠ»Π΅Ρ‡ΠΎ ΠΏΡ€Π΅Π΄ΠΈΠΊΠ°Ρ‚Π°),
Ρ€Π°Π·ΠΌΠ΅Ρ€ спилла ΠΌΠΎΠΆΠ½ΠΎ Π·Π°ΠΌΠ΅Ρ‚Π½ΠΎ ΡΠΎΠΊΡ€Π°Ρ‚ΠΈΡ‚ΡŒ Π½Π° порядок.

Secret 4 (Beware - Recursion!)

Recursive queries should be avoided, because joining a table with itself is incompatible with the shared-nothing concept.
In other words, such an operation leads to replicating the table to each cluster node, because it cannot be performed locally (only on its own) due to the distribution of the table only by one of the keys of the query predicate.
However, If you join the table not to itself (distributed by AGREEMENT_PK in the example below), but with its copy distributed by PREVIOUS_AGREEMENT_PK (the parent key in the predicate),
the size of the spill can be significantly reduced by an order of magnitude.
-- bad code
with recursive
agreement_chain as
(

select
AGREEMENT_PK,
PREVIOUS_AGREEMENT_PK,
counter_rec,
from foo
where is_first_agreement = 'y'
union all
select
t1.AGREEMENT_PK,
t2.AGREEMENT_PK as PREVIOUS_AGREEMENT_PK,
t2.counter_rec + 1,
from foo t1
inner join agreement_chain t2 on t2.AGREEMENT_PK = t1.PREVIOUS_AGREEMENT_PK
and t2.counter_rec < 10 -- ΠΎΠ³Ρ€Π°Π½ΠΈΡ‡Π΅Π½ΠΈΠ΅ ΠΏΠΎ рСкурсии Ρ‡Ρ‚ΠΎΠ±Ρ‹ Π½Π΅ ΡƒΠΉΡ‚ΠΈ Π² бСсконСчный Ρ†ΠΈΠΊΠ» / limitation on recursion to avoid an infinite loop

)

select
AGREEMENT_PK,
PREVIOUS_AGREEMENT_PK
from agreement_chain

-- good code
create table foo_mirror as select * from foo distribued by (PREVIOUS_AGREEMENT_PK);

with recursive
agreement_chain as
(

select
AGREEMENT_PK,
PREVIOUS_AGREEMENT_PK,
counter_rec,
from foo
where is_first_agreement = 'y'
union all
select
t1.AGREEMENT_PK,
t2.AGREEMENT_PK as PREVIOUS_AGREEMENT_PK,
t2.counter_rec + 1,
from foo_mirror t1
inner join agreement_chain t2 on t2.AGREEMENT_PK = t1.PREVIOUS_AGREEMENT_PK
and t2.counter_rec < 10

)

select
AGREEMENT_PK,
PREVIOUS_AGREEMENT_PK
from agreement_chain
Channel photo updated
Π‘Π΅ΠΊΡ€Π΅Ρ‚ 5 (Join Π±Π΅Π· ΠΊΠ»ΡŽΡ‡Π°)

Π•Ρ‰Π΅ ΠΎΠ΄ΠΈΠ½ ΠΏΡ€ΠΈΠΌΠ΅Ρ€ ΠΏΠΎΡ‚Π΅Π½Ρ†ΠΈΠ°Π»ΡŒΠ½ΠΎΠΉ Π±ΠΎΠΌΠ±Ρ‹ - это запросы Π±Π΅Π· '=' Π² ΠΏΡ€Π΅Π΄ΠΈΠΊΠ°Ρ‚Π΅ соСдинСния Ρ‚Π°Π±Π»ΠΈΡ†,
Π² частности Ρ‚Π°ΠΊΠΎΠΉ, Π³Π΄Π΅ ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠ΅Ρ‚ΡΡ опСрация between Π² join-Π΅ Ρ‚Π°Π±Π»ΠΈΡ† Π±Π΅Π· ΠΊΠ»ΡŽΡ‡Π°, которая Π² GP выполняСтся ΠΊΡ€Π°ΠΉΠ½Π΅ Π½Π΅ΠΎΠΏΡ‚ΠΈΠΌΠ°Π»ΡŒΠ½ΠΎ ( Ρ‡Π΅Ρ€Π΅Π· Nested Loop, Π° Π½Π΅ Ρ‡Π΅Ρ€Π΅Π· быстрый Hash Join )

НапримСр, Ссли Π² Ρ‚Π°Π±Π»ΠΈΡ†Π΅ ΠΏΡ€ΠΎΠ΄Π°ΠΆ foo отсутствуСт ΠΊΠ»ΡŽΡ‡ измСрСния dim_periods, прямоС Ρ€Π΅ΡˆΠ΅Π½ΠΈΠ΅ Π² Π»ΠΎΠ± Π²Ρ‹Ρ‚Π°Ρ‰ΠΈΡ‚ΡŒ ΠΏΠ΅Ρ€ΠΈΠΎΠ΄ для ΠΊΠ°ΠΆΠ΄ΠΎΠΉ строки foo - это Π»ΠΎΠ²ΡƒΡˆΠΊΠ°.

Однако, Ссли Π² foo ΡƒΡ€ΠΎΠ²Π΅Π½ΡŒ Π΄Π΅Ρ‚Π°Π»ΠΈΠ·Π°Ρ†ΠΈΠΈ Π΄Π°Π½Π½Ρ‹Ρ… - дСнь,
Ρ‚ΠΎ Π² ΠΎΠ±Ρ‰Π΅ΠΌ случаС число строк Π±ΡƒΠ΄Π΅Ρ‚ Π½Π° порядки большС числа Π΄Π½Π΅ΠΉ, Π² ΠΊΠΎΡ‚ΠΎΡ€Ρ‹Π΅ Π±Ρ‹Π»ΠΈ ΠΏΡ€ΠΎΠ΄Π°ΠΆΠΈ ΠΈ
Π² Ρ‚Π°ΠΊΠΎΠΌ случаС Π»ΠΎΠ²ΡƒΡˆΠΊΠΈ ΠΌΠΎΠΆΠ½ΠΎ ΠΈΠ·Π±Π΅ΠΆΠ°Ρ‚ΡŒ, привСдя запрос ΠΊ Ρ„ΠΎΡ€ΠΌΠ΅ equi-join:

Secret 5(Non-Equi Join)
Another example of a potential bomb is queries without '=' in the table join predicate.
Particularly, one that uses the between operation in a join of tables without a key, which in GP is performed extremely suboptimally (through Nested Loop, and not through fast Hash Join)

For example, if the sales table foo is missing the dim_periods dimension key, a straight forward solution to pull out period for each row of foo is a trap.

However, if foo has a data granularity of day,
then in the general case the number of rows will be orders of magnitude greater than the number of days in which there were sales and
in this case, the pitfall can be avoided by transforming the query in equi-join form:

--bad code (ΠΌΠ΅Ρ‚ΠΎΠ΄ соСдинСния/ join method - Nested Loop )
select foo.*, d.date_from, d.date_to
from foo
join dim_periods d on
foo.dt between d.date_from d.and date_to


--good code ( вмСсто NL ΠΎΠΏΡ‚ΠΈΠΌΠΈΠ·Π°Ρ‚ΠΎΡ€ ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠ΅Ρ‚ Π² ΠΈΡ‚ΠΎΠ³Π΅ Hash Join / Instead of NL optimizer finally uses Hash Join )
with a as
(select distinct dt from foo),
x as ( a.dt, s.* from dim_periods s, a
where a.dt between s.date_from s.and date_to ) -- ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠ΅ΠΌ NL лишь для ΠΌΠ°Π»ΠΎΠΉ Π΄ΠΎΠ»ΠΈ ΠΎΡ‚ foo / use NL path only for tiny % of foo

select foo.*, x.date_from, x.date_to
from foo join x on foo.dt = x.dt
πŸ‘3
Π‘Π΅ΠΊΡ€Π΅Ρ‚ 6 (ΠŸΡ€ΠΎΠ±Π»Π΅ΠΌΠ° пустой Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹, Ρ‡.2)

Π‘Π΅ΠΊΡ€Π΅Ρ‚ 1 ΠΏΠΎΠ²Π΅Π΄Π°Π» Π½Π°ΠΌ, Ρ‡Ρ‚ΠΎ select ΠΈΠ· пустой Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹ ΠΌΠΎΠΆΠ΅Ρ‚ ΠΏΠΎΠ΄Π²ΠΈΡΠ°Ρ‚ΡŒ, создавая спилл.
ΠžΠΊΠ°Π·Ρ‹Π²Π°Π΅Ρ‚ΡΡ, запрос ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΡŽΡ‰ΠΈΠΉ ΠΏΡƒΡΡ‚ΡƒΡŽ Ρ‚Π°Π±Π»ΠΈΡ†Ρƒ Π² качСствС Ρ„ΠΈΠ»ΡŒΡ‚Ρ€Π°, ΠΏΠΎ ΠΊΠΎΡ‚ΠΎΡ€ΠΎΠΉ Π½Π΅Ρ‚ статистики ΠΌΠΎΠΆΠ΅Ρ‚ ΡΠΎΠ·Π΄Π°Ρ‚ΡŒ Ρ‚Π΅ ΠΆΠ΅ ΠΏΡ€ΠΎΠ±Π»Π΅ΠΌΡ‹.

Π’ ΠΏΡ€ΠΈΠΌΠ΅Ρ€Π΅ Π½ΠΈΠΆΠ΅ ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠ΅Ρ‚ΡΡ Ρ€Π΅Π°Π»ΡŒΠ½Ρ‹ΠΉ кСйс Π² ΠΊΡ€ΡƒΠΏΠ½ΠΎΠΌ DWH Π½Π° сотни TB Π³Π΄Π΅ ΠΈΠ΄Π΅Ρ‚ select ΠΈΠ· интСрфСйсной Ρ„ΡƒΠ½ΠΊΡ†ΠΈΠΈ ΠΏΠΎ Π΄ΠΈΠ°ΠΏΠ°Π·ΠΎΠ½Ρƒ Π΄Π°Ρ‚ с Ρ„ΠΈΠ»ΡŒΡ‚Ρ€Π°Ρ†ΠΈΠ΅ΠΉ ΠΏΠΎ пустой Ρ‚Π°Π±Π»ΠΈΡ†Π΅.

Π‘Π±ΠΎΡ€ статистики ΠΏΠΎ пустой Ρ‚Π°Π±Π»ΠΈΡ†Π΅ foo Ρ€Π΅ΡˆΠ°Π΅Ρ‚ ΠΏΡ€ΠΎΠ±Π»Π΅ΠΌΡƒ.
К слову, Π² ΠΈΠ΄Π΅Π°Π»Π΅, Π½Π΅ стоит Π·Π»ΠΎΡƒΠΏΠΎΡ‚Ρ€Π΅Π±Π»ΡΡ‚ΡŒ использованиСм Ρ„ΡƒΠ½ΠΊΡ†ΠΈΠΉ вмСсто Ρ‚Π°Π±Π»ΠΈΡ† Π² запросах, Ρ‚.ΠΊ. Π² этом случаС ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠ΅Ρ‚ΡΡ Legacy optimizer ΠΎΡ‚ Postgres,
Π° Π½Π΅ Π½Π°Ρ‚ΠΈΠ²Π½Ρ‹ΠΉ GPORCA

Secret 6 (Empty table problem, part 2)

Secret 1 told us that select from an empty table can hang, creating a spill.
It turns out that a query using an empty table as a filter for which there are no statistics can also fall into the same problem.

The example below uses a real case in a large DWH for hundreds of TB, where there is a select from an interface function by a range of versions with filtering by an empty table.

Collecting statistics on an empty table foo solves the problem.
By the way, ideally, you should avoid the use of functions instead of tables in queries, because in this case the Legacy optimizer from Postgres is used, and not the native GPORCA.

select deal_rk
from fn_foo( p_dt_from :='2024-09-01', p_dt_to :='2024-09-02')
where 1 = 1
and src_cd = 'IMOEX'
and coalesce(add_info_04, '&&&') not like '%NON_RESIDENT%'
and id in (select risk_contract_id from foo) -- ΠΏΠ»ΠΎΡ…ΠΎΠΉ Ρ„ΠΈΠ»ΡŒΡ‚Ρ€ Ссли Π΄Π°ΠΆΠ΅ Ρ‚Π°Π±Π»ΠΈΡ†Π° пустая, Π½ΠΎ Π±Π΅Π· статистики / bad filter if even the table is empty but without statistics
πŸ‘3
Π‘Π΅ΠΊΡ€Π΅Ρ‚ 7(Π‘ΠΎΠ»ΠΈΠ²Π°Ρ€ Π½Π΅ Π²Ρ‹Π΄Π΅Ρ€ΠΆΠΈΡ‚ Π΄Π²ΠΎΠΈΡ… ΠΈΠ»ΠΈ ΠΏΡ€ΠΎΠ±Π»Π΅ΠΌΠ° пСрСкоса)
На днях словили Π½Π° ΠΏΡ€ΠΎΠ΄Π΅ Π½Π΅ΠΎΡ‡Π΅Π²ΠΈΠ΄Π½ΡƒΡŽ ΠΎΡˆΠΈΠ±ΠΊΡƒ ΠΏΡ€ΠΈ left join 2Ρ… Ρ‚Π°Π±Π»ΠΈΡ† ΡΠΎΠ²Π΅Ρ€ΡˆΠ΅Π½Π½ΠΎ ΠΎΠ±Ρ‹Ρ‡Π½Ρ‹Ρ… для нашСго Π₯Π” Ρ€Π°Π·ΠΌΠ΅Ρ€ΠΎΠ²
ΠŸΡ€ΠΈ этом Π°Π½Π°Π»ΠΎΠ³ΠΈΡ‡Π½Ρ‹ΠΉ запрос с inner join ΠΎΡ‚Ρ€Π°Π±Π°Ρ‚Ρ‹Π²Π°Π΅Ρ‚ Π±Π΅Π· ΠΏΡ€ΠΎΠ±Π»Π΅ΠΌ :

Secret 7 (Bolivar can't bring two or the problem of skew)
Recently we caught a non-obvious error in production when left joining 2 tables of completely normal sizes for our data warehouse.
At the same time, a similar query with inner join works without problems:

-- bad query
select a.*, b.address
from small_tbl a -- 7 ΠΌΠ»Π½ строк ( 7 mln rows )
LEFT JOIN big_tbl b -- 1 ΠΌΠ»Ρ€Π΄ строк ( 1 bln rows )
on a.addr_fk = b.addr_pk
ERROR: Canceling query because of high VMEM usage.

-- friendly query
select a.*, b.address
from small_tbl a JOIN big_tbl b
on a.addr_fk = b.addr_pk

ВскрытиС ΠΏΠΎΠΊΠ°Π·Π°Π»ΠΎ, Ρ‡Ρ‚ΠΎ Π² Ρ‚Π°Π±Π»ΠΈΡ†Π΅ small_tbl ΠΏΠΎΠ»ΠΎΠ²ΠΈΠ½Π° ΠΊΠ»ΡŽΡ‡Π΅ΠΉ оказались NULL + addr_fk являСтся ΠΊΠ»ΡŽΡ‡ΠΎΠΌ распрСдСлСния Π² Ρ‚Π°Π±Π»ΠΈΡ†Π΅.
ΠŸΡ€ΠΎΠ±Π»Π΅ΠΌΡƒ Ρ€Π΅ΡˆΠΈΠ»ΠΈ ΠΌΠ°Π»ΠΎΠΉ ΠΊΡ€ΠΎΠ²ΡŒΡŽ, Π·Π°ΠΌΠ΅Π½ΠΈΠ² left join объСдинСниСм Π΄Π²ΡƒΡ… мноТСств:

The analysis showed that in the small_tbl table half of the keys were NULL + addr_fk is the distribution key in the table. The problem was solved with little effort by replacing left join with a union of two sets:
select a.*, b.address
from small_tbl a
left join big_tbl b
on a.addr_fk = b.addr_pk
where a.addr_fk is not null
union all
select a.*, null
from small_tbl a
where a.addr_fk is null
Π‘Π΅ΠΊΡ€Π΅Ρ‚ 8(Бколько вСсит Ρ‚Π°Π±Π»ΠΈΡ†Π° .. ΠΈΠ»ΠΈ ΠΎ Π΄ΠΎΠ±Ρ€ΠΎΠΌ cross-join Π·Π°ΠΌΠΎΠ»Π²ΠΈΡ‚Π΅ слово!)

Как-Ρ‚ΠΎ наш DBA ΠΏΠΎΠ΄Π³ΠΎΡ‚ΠΎΠ²ΠΈΠ» ΠΎΡ‚Ρ‡Π΅Ρ‚ ΠΎ Ρ€Π°Π·ΠΌΠ΅Ρ€Π΅ Ρ‚Π°Π±Π»ΠΈΡ† ΠΏΠΎ Ρ‚Ρ€Π΅Π±ΡƒΠ΅ΠΌΠΎΠΌΡƒ списку ΠΈ ΠΊΡ€Π°ΠΉΠ½Π΅ удивился.
Оказалось, Ρ‡Ρ‚ΠΎ функция pg_relation_size для вычислСния Ρ€Π°Π·ΠΌΠ΅Ρ€Π° Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹ Π΄Π°Π΅Ρ‚ Ρ€Π°Π·Π½Ρ‹ΠΉ Ρ€Π΅Π·ΡƒΠ»ΡŒΡ‚Π°Ρ‚ Π² зависимости ΠΎΡ‚ контСкста Π΅Π΅ использования.

ΠŸΡ€ΠΎΠ²Π΅Π΄Π΅ΠΌ простой экспСримСнт
Secret 8 (How much does a table weigh... or say a word about a good cross-join!)
Once our DBA prepared a report on the size of tables for the required list and was extremely surprised.
It turned out that the pg_relation_size function for calculating the size of a table gives different results depending on the context of its use.
Let's conduct a simple experiment

create table public.foo   WITH (appendonly=true,orientation=column,compresstype=zstd,compresslevel=1)
as select generate_series(1,1000000) n;

ΠžΠΏΡ€Π΅Π΄Π΅Π»ΠΈΠΌ Ρ€Π°Π·ΠΌΠ΅Ρ€ : Let's determine its size:
SQL> select pg_relation_size('public.foo')

pg_relation_size
------------------------------
3 209 024 Π±Π°ΠΉΡ‚

Π­Ρ‚ΠΎ ΠΊΠΎΡ€Ρ€Π΅ΠΊΡ‚Π½Ρ‹ΠΉ Ρ€Π΅Π·ΡƒΠ»ΡŒΡ‚Π°Ρ‚.
Π‘ΠΎΠ·Π΄Π°Π΄ΠΈΠΌ список ΠΈΠ· 1 элСмСнта с Π½Π°Π·Π²Π°Π½ΠΈΠ΅ΠΌ Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹ Π²Ρ‹ΡˆΠ΅
This is the correct result.
Let's create a list of 1 element with the table name above

create table public.foo_2
as select 'public.foo'::text as tbl_nm;

ΠžΠΏΡ€Π΅Π΄Π΅Π»ΠΈΠΌ Ρ€Π°Π·ΠΌΠ΅Ρ€ Ρ‚Π°Π±Π»ΠΈΡ† списка: Let's determine its size:
SQL> select pg_relation_size(tbl_nm) from public.foo_2

pg_relation_size
------------------------------
4 640 Π±Π°ΠΉΡ‚

Как ΠΆΠ΅ Ρ‚Π°ΠΊ ?
Π”Π΅Π»ΠΎ Π² Ρ‚ΠΎΠΌ, Ρ‡Ρ‚ΠΎ запрос ΠΏΠΎΠΊΠ°Π·Ρ‹Π²Π°Π΅Ρ‚ Ρ€Π°Π·ΠΌΠ΅Ρ€ Π½Π° Ρ‚ΠΎΠΌ сСгмСнтС, Π³Π΄Π΅ Π»Π΅ΠΆΠΈΡ‚ строчка с Π΅Π΅ Π½Π°Π·Π²Π°Π½ΠΈΠ΅ΠΌ Π² public.foo_2.
РСшСниСм Π² Π»ΠΎΠ± Π±ΡƒΠ΄Π΅Ρ‚ join cписка c pg_class
The point is that the query shows the size of the public.foo part on the segment where the line with its name in public.foo_2 is located.
The straightforward solution would be to join the list with pg_class

select pg_relation_size(t.tbl_nm) from
pg_class c, pg_namespace s, public.foo_2 t
where 1=1
and s.oid = c.relnamespace
and s.nspname || '.' || relname = t.tbl_nm

Однако ΠΎΠ½ΠΎ Π½Π΅ самоС ΠΎΠΏΡ‚ΠΈΠΌΠ°Π»ΡŒΠ½ΠΎΠ΅, Ρ‚.ΠΊ. сильно Π½Π°Π³Ρ€ΡƒΠΆΠ°Π΅Ρ‚ ΠΊΠ°Ρ‚Π°Π»ΠΎΠ³.
БущСствуСт Π±ΠΎΠ»Π΅Π΅ элСгантный Π²Π°Ρ€ΠΈΠ°Π½Ρ‚, ΠΊΠΎΡ‚ΠΎΡ€Ρ‹ΠΉ стоит смСло Π±Ρ€Π°Ρ‚ΡŒ Π½Π° Π²ΠΎΠΎΡ€ΡƒΠΆΠ΅Π½ΠΈΠ΅.

However, it is not the most optimal, as it heavily loads the catalog.
There is a more elegant option that is A MUST!

SQL> select sum(pg_relation_size(y.tbl_nm || decode(q.content, q.content, '')))
from public.foo_2 y
cross join gp_dist_random('gp_id') q

sum
------------------------------
3 209 024 Π±Π°ΠΉΡ‚
πŸ‘2