马上注册,结交更多好友,享用更多功能,让你轻松玩转社区。
您需要 登录 才可以下载或查看,没有账号?立即注册
x
摘要:本文就针对因USING子句的书写方式可能导致MERGE INTO语句的执行不下推的场景,对USING子句的SQL语句进行改写一遍,整个SQL语句可以下推。 本文分享自华为云社区《GaussDB(DWS)运维 -- values子句做MERGE数据源导致SQL执行不下推的改写方案》,作者: 譡里个檔。
现网做实时接入的时候,有的时候会使用MERGE INTO语句实现类似UPSERT的功能。这种场景下MERGE INTO语句的USING部分的数据位VALUES子句,为了后续的SQL语句中描述方便,需要对VALUES子句的输出命名别名。USING子句的书写方式可能导致MERGE INTO语句的执行不下推,本文就针对因此导致的不下推的场景,对USING子句的SQL语句进行改写一遍,整个SQL语句可以下推。
预置条件
- CREATE TABLE t1(name text, id INT) DISTRIBUTE BY HASH(id);
复制代码 原始语句
- MERGE INTO t1 USING (
- SELECT *
- FROM (VALUES ('json', 1), ('sam', 2)) AS val(name, id)
- ) tmp ON (t1.id = tmp.id)
- WHEN MATCHED THEN
- UPDATE SET t1.name = tmp.name
- WHEN NOT MATCHED THEN
- INSERT (name, id) VALUES(tmp.name, tmp.id);
复制代码 SQL语句不下推,导致执行低效
- postgres=# EXPLAIN VERBOSE MERGE INTO t1 USING (
- postgres(# SELECT *
- postgres(# FROM (VALUES ('json', 1), ('sam', 2)) AS val(name, id)
- postgres(# ) tmp ON (t1.id = tmp.id)
- postgres-# WHEN MATCHED THEN
- postgres-# UPDATE SET t1.name = tmp.name
- postgres-# WHEN NOT MATCHED THEN
- postgres-# INSERT (name, id) VALUES(tmp.name, tmp.id);
- QUERY PLAN
- -------------------------------------------------------------------------------------------------------------------------------------------------------------------
- id | operation | E-rows | E-distinct | E-width | E-costs
- ----+-------------------------------------------------------+--------+------------+---------+---------
- 1 | -> Merge on public.t1 | 2 | | 54 | 0.08
- 2 | -> Nested Loop Left Join (3, 4) | 2 | | 54 | 0.08
- 3 | -> Values Scan on "*VALUES*" | 2 | | 36 | 0.03
- 4 | -> Data Node Scan on t1 "_REMOTE_TABLE_QUERY_" | 2 | | 18 | 0.00
- SQL Diagnostic Information
- ------------------------------------------------------------
- SQL is not plan-shipping
- reason: Type of Record in non-real table can not be shipped
- Predicate Information (identified by plan id)
- -------------------------------------------------
- 1 --Merge on public.t1
- Node expr: : $10
- 2 --Nested Loop Left Join (3, 4)
- Join Filter: (t1.id = "*VALUES*".column2)
- Targetlist Information (identified by plan id)
- -----------------------------------------------------------------------------------------------------------------------------------------------------------------
- 1 --Merge on public.t1
- Node/s: All datanodes
- Remote query: UPDATE ONLY public.t1 SET name = $7, id = $8 WHERE t1.ctid = $5 AND t1.xc_node_id = $6
- Node/s: All datanodes
- Remote query: INSERT INTO public.t1 (name, id) VALUES ($9, $10)
- 2 --Nested Loop Left Join (3, 4)
- Output: "*VALUES*".column1, "*VALUES*".column2, t1.name, t1.id, t1.ctid, t1.xc_node_id, "*VALUES*".column1, t1.id, "*VALUES*".column1, "*VALUES*".column2
- 3 --Values Scan on "*VALUES*"
- Output: "*VALUES*".column1, "*VALUES*".column2
- 4 --Data Node Scan on t1 "_REMOTE_TABLE_QUERY_"
- Output: t1.name, t1.id, t1.ctid, t1.xc_node_id
- Node/s: All datanodes
- Remote query: SELECT name, id, ctid, xc_node_id FROM ONLY public.t1 WHERE true
- ====== Query Summary =====
- --------------------------
- Parser runtime: 0.079 ms
- Planner runtime: 1.392 ms
- Unique SQL Id: 1657855173
- (40 rows)
复制代码 改写方案
- MERGE INTO t1 USING (
- WITH val(name, id) AS(
- VALUES ('json', 1), ('sam', 2)
- )
- SELECT * FROM val
- ) tmp ON (t1.id = tmp.id)
- WHEN MATCHED THEN
- UPDATE SET t1.name = tmp.name
- WHEN NOT MATCHED THEN
- INSERT (name, id) VALUES(tmp.name, tmp.id);
复制代码 改写后下推
- postgres=# EXPLAIN VERBOSE MERGE INTO t1 USING (
- postgres(# WITH val(name, id) AS(
- postgres(# VALUES ('json', 1), ('sam', 2)
- postgres(# )
- postgres(# SELECT * FROM val
- postgres(# ) tmp ON (t1.id = tmp.id)
- postgres-# WHEN MATCHED THEN
- postgres-# UPDATE SET t1.name = tmp.name
- postgres-# WHEN NOT MATCHED THEN
- postgres-# INSERT (name, id) VALUES(tmp.name, tmp.id);
- QUERY PLAN
- ------------------------------------------------------------------------------------------------------------------------------------------------------
- id | operation | E-rows | E-distinct | E-memory | E-width | E-costs
- ----+----------------------------------------------+--------+------------+----------+---------+---------
- 1 | -> Streaming (type: GATHER) | 1 | | | 54 | 1.56
- 2 | -> Merge on public.t1 | 2 | | | 54 | 1.15
- 3 | -> Streaming(type: REDISTRIBUTE) | 2 | | 2MB | 54 | 1.15
- 4 | -> Nested Loop Left Join (5, 7) | 2 | | 1MB | 54 | 1.11
- 5 | -> Subquery Scan on tmp | 2 | | 1MB | 36 | 0.08
- 6 | -> Values Scan on "*VALUES*" | 24 | | 1MB | 36 | 0.03
- 7 | -> Seq Scan on public.t1 | 2 | | 1MB | 18 | 1.01
- Predicate Information (identified by plan id)
- ---------------------------------------------
- 4 --Nested Loop Left Join (5, 7)
- Join Filter: (t1.id = tmp.id)
- 5 --Subquery Scan on tmp
- Filter: (Hash By tmp.id)
- Targetlist Information (identified by plan id)
- ----------------------------------------------------------------------------------------------------------------------------------------------------
- 1 --Streaming (type: GATHER)
- Node/s: All datanodes
- 3 --Streaming(type: REDISTRIBUTE)
- Output: tmp.name, tmp.id, t1.name, t1.id, t1.ctid, t1.xc_node_id, tmp.name, tmp.id, (CASE WHEN (t1.ctid IS NULL) THEN tmp.id ELSE t1.id END)
- Distribute Key: (CASE WHEN (t1.ctid IS NULL) THEN tmp.id ELSE t1.id END)
- Spawn on: All datanodes
- Consumer Nodes: All datanodes
- 4 --Nested Loop Left Join (5, 7)
- Output: tmp.name, tmp.id, t1.name, t1.id, t1.ctid, t1.xc_node_id, tmp.name, tmp.id, CASE WHEN (t1.ctid IS NULL) THEN tmp.id ELSE t1.id END
- 5 --Subquery Scan on tmp
- Output: tmp.name, tmp.id
- 6 --Values Scan on "*VALUES*"
- Output: "*VALUES*".column1, "*VALUES*".column2
- 7 --Seq Scan on public.t1
- Output: t1.name, t1.id, t1.ctid, t1.xc_node_id
- Distribute Key: t1.id
- ====== Query Summary =====
- -------------------------------
- System available mem: 3112960KB
- Query Max mem: 3112960KB
- Query estimated mem: 6336KB
- Parser runtime: 0.107 ms
- Planner runtime: 1.185 ms
- Unique SQL Id: 780461632
- (44 rows)
复制代码
点击关注,第一时间了解华为云新鲜技术~
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作! |