Greenplum secrets🎩

Channel created

09:00

A channel about coding best practices for Greenplum


The channel is essentially "Greenplum Programming for Dummies": I will share recipes
for how not to write code in this MPP (massively parallel processing) database. Let's go!


Secret 1 (Empty table surprise)
Even a select from an empty table, filtered by a heavy table, can hang and cause spill (intermediate results written to disk when they do not fit in memory).

--bad code:
select t.a, t.b, t.c
from foo t
where t.key not in (select pk from big_tbl);

--good code:
with a as
         (select pk from big_tbl intersect select key from foo)
select t.a, t.b, t.c
from foo t
where t.key not in (select pk from a);
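
To see whether a rewrite like this actually helps on a given system, the two variants can be compared with EXPLAIN ANALYZE, which executes the query and reports per-operator statistics. A minimal sketch, assuming the foo and big_tbl tables from the example above; note that the bad variant is executed in full, so this is best tried on a test copy.

```sql
-- Variant 1: filter directly against the heavy table.
explain analyze
select t.a, t.b, t.c
from foo t
where t.key not in (select pk from big_tbl);

-- Variant 2: shrink the filter set to the keys present in foo first.
explain analyze
with a as
         (select pk from big_tbl intersect select key from foo)
select t.a, t.b, t.c
from foo t
where t.key not in (select pk from a);
```

When comparing the two outputs, the things to look at are the total execution time and any workfile (spill) information reported for individual operators.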


Secret 2 (Speed up GROUP BY)
A plain GROUP BY with a large number of aggregates can be sped up by splitting it across CTEs, which also reduces the spill size.

--sub-optimal code:
select dim1, dim2, dim3, sum(m1), sum(m2), ..., sum(m50)
from foo
group by 1, 2, 3

--good code:
with a as
         (select dim1, dim2, dim3, sum(m1), sum(m2), ..., sum(m25)
          from foo
          group by 1, 2, 3),
     b as
         (select dim1, dim2, dim3, sum(m26) m26_agg, sum(m27) m27_agg, ..., sum(m50) m50_agg
          from foo
          group by 1, 2, 3)
select a.*, b.m26_agg, b.m27_agg, ..., b.m50_agg
from a, b
where a.dim1 = b.dim1 and a.dim2 = b.dim2 and a.dim3 = b.dim3 -- join on all dimensions
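
For a concrete, runnable version of the same pattern, here is a sketch with four metrics instead of fifty. The table foo_agg_demo and its columns are hypothetical. The join assumes the dimension columns are NOT NULL, because `=` would silently drop groups whose dimensions contain NULL (`is not distinct from` would keep them).

```sql
-- Hypothetical demo table: two dimensions, four metrics.
create temp table foo_agg_demo (
    dim1 int not null,
    dim2 int not null,
    m1 numeric, m2 numeric, m3 numeric, m4 numeric
) distributed by (dim1);

insert into foo_agg_demo values
    (1, 10, 1, 2, 3, 4),
    (1, 10, 5, 6, 7, 8),
    (2, 20, 1, 1, 1, 1);

-- Each CTE aggregates half of the metrics; the halves are joined back on the dimensions.
with a as
         (select dim1, dim2, sum(m1) as m1_agg, sum(m2) as m2_agg
          from foo_agg_demo
          group by 1, 2),
     b as
         (select dim1, dim2, sum(m3) as m3_agg, sum(m4) as m4_agg
          from foo_agg_demo
          group by 1, 2)
select a.dim1, a.dim2, a.m1_agg, a.m2_agg, b.m3_agg, b.m4_agg
from a
         join b on a.dim1 = b.dim1 and a.dim2 = b.dim2
order by 1, 2;
-- returns (1, 10, 6, 8, 10, 12) and (2, 20, 1, 1, 1, 1)
```

Whether the split pays off depends on the number of metrics and the data volume; on a table this small it only demonstrates that the result matches a single GROUP BY.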


Secret 3 (Omnipresent statistics)
A common reason for a query to hang and/or generate spill is missing statistics on the tables it uses.
If your ETL is built on PL/pgSQL functions, the statistics collection mode (unless already set at the server configuration level, i.e. as a GUC, Grand Unified Configuration, parameter)
is controlled by the gp_autostats_mode_in_functions parameter, which must be set inside the function itself (setting it before the call has no effect).

CREATE OR REPLACE FUNCTION public.fn_foo()
RETURNS boolean
LANGUAGE plpgsql
AS $function$
    begin
      set gp_autostats_mode_in_functions = on_no_stats; -- collect statistics when the function loads data into a table that has no statistics yet
  -- Other values:
  --   none - do not collect statistics,
  --   on_change - collect statistics if the number of modified rows exceeds the threshold (gp_autostats_on_change_threshold)
  ...
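
To check which mode is currently in effect and whether a table already has statistics, something like the following can be run in a session. This is a sketch: public.foo is a placeholder table name, and what counts as "good enough" statistics depends on the workload.

```sql
-- Current autostats settings for the session.
show gp_autostats_mode;
show gp_autostats_mode_in_functions;
show gp_autostats_on_change_threshold;

-- Table-level estimates the planner works with.
select relname, reltuples, relpages
from pg_class
where oid = 'public.foo'::regclass;

-- Column-level statistics; no rows here means none have been collected.
select attname, null_frac, n_distinct
from pg_stats
where schemaname = 'public' and tablename = 'foo';

-- Collect statistics explicitly if they are missing.
analyze public.foo;
```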


Secret 4 (Beware - Recursion!)

Recursive queries should be avoided, because joining a table with itself does not fit the shared-nothing concept.
In other words, such an operation replicates the table to every node of the cluster: the join cannot run locally (each segment using only its own data), because the table is distributed by only one of the two keys in the join predicate.
However, if you join the table (distributed by AGREEMENT_PK in the example below) not with itself but with a copy distributed by PREVIOUS_AGREEMENT_PK (the other side of the predicate),
the spill size can be reduced by about an order of magnitude.


-- bad code
with recursive agreement_chain as
(
    select
        AGREEMENT_PK,
        PREVIOUS_AGREEMENT_PK,
        counter_rec
    from foo
    where is_first_agreement = 'y'
    union all
    select
        t1.AGREEMENT_PK,
        t2.AGREEMENT_PK as PREVIOUS_AGREEMENT_PK,
        t2.counter_rec + 1
    from foo t1
    inner join agreement_chain t2 on t2.AGREEMENT_PK = t1.PREVIOUS_AGREEMENT_PK
                                  and t2.counter_rec < 10 -- recursion limit to avoid an infinite loop
)
select
    AGREEMENT_PK,
    PREVIOUS_AGREEMENT_PK
from agreement_chain;
                
-- good code
create table foo_mirror as select * from foo distributed by (PREVIOUS_AGREEMENT_PK);

with recursive agreement_chain as
(
    select
        AGREEMENT_PK,
        PREVIOUS_AGREEMENT_PK,
        counter_rec
    from foo
    where is_first_agreement = 'y'
    union all
    select
        t1.AGREEMENT_PK,
        t2.AGREEMENT_PK as PREVIOUS_AGREEMENT_PK,
        t2.counter_rec + 1
    from foo_mirror t1
    inner join agreement_chain t2 on t2.AGREEMENT_PK = t1.PREVIOUS_AGREEMENT_PK
                                  and t2.counter_rec < 10
)
select
    AGREEMENT_PK,
    PREVIOUS_AGREEMENT_PK
from agreement_chain;
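
One way to see why the copy helps is to look at the plan of a single join step. The sketch below is not the recursive query itself, only one non-recursive step of it, and it assumes foo is distributed by AGREEMENT_PK and foo_mirror by PREVIOUS_AGREEMENT_PK with matching column types. Under that assumption both sides of the join key are already co-located, so the plan would be expected to show no Broadcast Motion or Redistribute Motion below the join; the plan of the full recursive query may still differ.

```sql
-- Fresh statistics on the copy (see Secret 3).
analyze foo_mirror;

-- One step of the chain: child rows from the copy, parent rows from the original.
explain
select t1.AGREEMENT_PK,
       t2.AGREEMENT_PK as PREVIOUS_AGREEMENT_PK
from foo_mirror t1
         inner join foo t2 on t2.AGREEMENT_PK = t1.PREVIOUS_AGREEMENT_PK;
```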


Secret 5 (Non-Equi Join)

Another potential bomb is a query with no '=' in the join predicate,
in particular one that joins tables without a key using between, which GP executes very inefficiently (via Nested Loop rather than a fast Hash Join).

For example, if the sales table foo has no key to the dim_periods dimension, the straightforward solution of looking up the period for each row of foo is a trap.

However, if foo has day-level granularity,
the number of rows will generally be orders of magnitude greater than the number of distinct days with sales,
and the trap can then be avoided by bringing the query to equi-join form:

--bad code (join method - Nested Loop)
select foo.*, d.date_from, d.date_to
from foo
         join dim_periods d on
    foo.dt between d.date_from and d.date_to


--good code (the optimizer ends up using Hash Join instead of NL)
with a as
(select distinct dt from foo),
x as (select a.dt, s.* from dim_periods s, a
where a.dt between s.date_from and s.date_to) -- NL is used only for a tiny fraction of foo

select foo.*, x.date_from, x.date_to
from foo join x on foo.dt = x.dt
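
The same rewrite on a toy data set, as a self-contained sketch. The tables foo_sales_demo and dim_periods_demo and all their columns are hypothetical stand-ins for foo and dim_periods; the periods are assumed not to overlap, otherwise both the original and the rewritten query return one row per matching period.

```sql
create temp table dim_periods_demo (
    period_id int,
    date_from date,
    date_to   date
) distributed by (period_id);

create temp table foo_sales_demo (
    sale_id int,
    dt      date,
    amount  numeric
) distributed by (sale_id);

insert into dim_periods_demo values
    (1, '2024-01-01', '2024-01-31'),
    (2, '2024-02-01', '2024-02-29');

insert into foo_sales_demo values
    (1, '2024-01-15', 100),
    (2, '2024-01-15', 250),
    (3, '2024-02-10', 75);

-- Step 1: distinct sale dates. Step 2: range join on that small set only.
-- Step 3: equi-join the result back to the fact table.
with a as
         (select distinct dt from foo_sales_demo),
     x as
         (select a.dt, s.date_from, s.date_to
          from dim_periods_demo s
                   join a on a.dt between s.date_from and s.date_to)
select f.sale_id, f.dt, x.date_from, x.date_to
from foo_sales_demo f
         join x on f.dt = x.dt
order by f.sale_id;
-- returns sales 1 and 2 with period 2024-01-01..2024-01-31
-- and sale 3 with period 2024-02-01..2024-02-29
```

Prefixing the final statement with explain shows which join method the optimizer picks; on tables this small the choice is not representative of production volumes.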


Secret 6 (Empty table problem, part 2)

Secret 1 showed that a select from an empty table can hang and create spill.
It turns out that a query using an empty table with no statistics as a filter can cause the same problems.

The example below is a real case from a large DWH of hundreds of TB: a select from an interface function over a date range, filtered by an empty table.

Collecting statistics on the empty table foo solves the problem.
By the way, ideally you should not overuse functions in place of tables in queries, because in that case the legacy Postgres optimizer is used instead of the native GPORCA.

select deal_rk
      from fn_foo( p_dt_from :='2024-09-01', p_dt_to :='2024-09-02')
      where 1 = 1
        and src_cd = 'IMOEX'
        and coalesce(add_info_04, '&&&') not like '%NON_RESIDENT%'
        and id in (select risk_contract_id from foo) -- bad filter even when the table is empty, as long as it has no statistics
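
The fix mentioned above is a single statement. A minimal sketch, assuming foo is the empty filter table from the example; re-checking the plan afterwards is optional and only shows what the optimizer now intends to do.

```sql
-- Collect statistics on the (empty) filter table.
analyze foo;

-- Re-check the plan of the original query.
explain
select deal_rk
from fn_foo(p_dt_from := '2024-09-01', p_dt_to := '2024-09-02')
where src_cd = 'IMOEX'
  and coalesce(add_info_04, '&&&') not like '%NON_RESIDENT%'
  and id in (select risk_contract_id from foo);
```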


Secret 7 (Bolivar cannot carry double, or the skew problem)
The other day we hit a non-obvious error in production on a left join of two tables of perfectly ordinary size for our data warehouse.
Meanwhile, the same query with an inner join runs without problems:

-- bad query
select a.*, b.address
from small_tbl a -- 7 mln rows
LEFT JOIN big_tbl b -- 1 bln rows
on a.addr_fk = b.addr_pk
ERROR: Canceling query because of high VMEM usage.

-- friendly query
select a.*, b.address
from small_tbl a JOIN big_tbl b
on a.addr_fk = b.addr_pk

The post-mortem showed that half of the keys in small_tbl were NULL and that addr_fk is the table's distribution key, so all rows with a NULL key end up on a single segment.
We fixed the problem cheaply by replacing the left join with a union of two sets:

select a.*, b.address
from small_tbl a
left join big_tbl b
on a.addr_fk = b.addr_pk
where a.addr_fk is not null
union all
select a.*, null 
from small_tbl a
where a.addr_fk is null
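
Before rewriting a join this way, it may help to confirm that skew is really the cause. A sketch of two quick checks, assuming the small_tbl table and addr_fk column from the example; gp_segment_id is the system column that identifies the segment holding each row.

```sql
-- Rows per segment: one segment with a far larger count than the rest indicates skew.
select gp_segment_id, count(*) as row_cnt
from small_tbl
group by 1
order by 2 desc;

-- How many rows have a NULL join / distribution key.
select count(*)                  as total_rows,
       count(*) - count(addr_fk) as null_keys
from small_tbl;
```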


Secret 8 (How much does a table weigh... or put in a good word for the humble cross join!)

Our DBA once prepared a report on table sizes for a given list of tables and was very surprised.
It turned out that the pg_relation_size function returns different results depending on the context in which it is used.

Let's run a simple experiment:

create table public.foo   WITH (appendonly=true,orientation=column,compresstype=zstd,compresslevel=1)
as select generate_series(1,1000000) n;

Let's determine its size:

SQL> select pg_relation_size('public.foo')

pg_relation_size
------------------------------
3 209 024 bytes

This is the correct result.
Let's create a list of one element containing the name of the table above:

create table public.foo_2
as select 'public.foo'::text as tbl_nm;

Let's determine the size of the tables in the list:

SQL> select pg_relation_size(tbl_nm) from public.foo_2

pg_relation_size
------------------------------
4 640 bytes

How come?
The point is that the query shows the size of the part of public.foo stored on the segment that holds the row with its name in public.foo_2.
The straightforward solution would be to join the list with pg_class:

select pg_relation_size(t.tbl_nm) from
pg_class c, pg_namespace s, public.foo_2 t
where 1=1
and s.oid = c.relnamespace
and s.nspname || '.' || relname = t.tbl_nm

However, it is not optimal, as it puts a heavy load on the catalog.
There is a more elegant option that is well worth adopting:

SQL> select sum(pg_relation_size(y.tbl_nm || decode(q.content, q.content, '')))
from public.foo_2 y
cross join gp_dist_random('gp_id') q

sum
------------------------------
3 209 024 bytes
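
The same cross join can also be grouped by segment to show how the total is made up. This is a sketch derived from the query above, not something verified on the author's cluster; it assumes the gp_segment_id system column is available on the rows returned by gp_dist_random('gp_id').

```sql
-- Per-segment size of every table listed in public.foo_2.
select q.gp_segment_id,
       sum(pg_relation_size(y.tbl_nm || decode(q.content, q.content, ''))) as bytes_on_segment
from public.foo_2 y
cross join gp_dist_random('gp_id') q
group by 1
order by 1;
```

If the assumption holds, the per-segment values should add up to the total returned by the previous query, and a strongly uneven breakdown would point to the kind of skew described in Secret 7.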
