[20250103]使用递归实现distinct功能.txt
--//生产系统遇到实际上许多条类似语句,顺便拿此中几个出来,至心不知道开发如何学计算机的。
1.问题提出:
SYS@127.0.0.1:9106/xtdb/xtdb2> @ sql_id c29undaquszs6
-- SQL_ID = c29undaquszs6 come from shared pool
select distinct ritem from routine2;
SYS@127.0.0.1:9106/xtdb/xtdb2> @ sql_id fhz2xwcnx2uyh
-- SQL_ID = fhz2xwcnx2uyh come from shared pool
select distinct rtype from routine2;
SYS@127.0.0.1:9106/xtdb/xtdb2> @ seg2 nis5.routine2 ''
SEG_MB OWNER SEGMENT_NAME SEGMENT_TYPE SEG_TABLESPACE_NAME BLOCKS HDRFIL HDRBLK
------ ----- ------------ ------------ ------------------- ------ ------ ------
1088 NIS5 ROUTINE2 TABLE NIS 139264 21 843528
SYS@127.0.0.1:9106/xtdb/xtdb2> @ desczz nis5.routine2 ritem,rtype
eXtended describe of nis5.routine2
DISPLAY TABLE_NAME OF COLUMN_NAME INFORMATION.
INPUT OWNER.TABLE_NAME
SAMPLE : @ desczz TABLE_NAME column_name1,column_name2
IF NOT INPUT ,USE "" .
Owner Table_Name SAMPLE_SIZE LAST_ANALYZED Col# Column Name Null? Type NUM_DISTINCT Density NUM_NULLS HISTOGRAM NUM_BUCKETS Low_value High_value
----- ---------- ----------- ------------------- ---- ----------- ----- ------------ ------------ -------------- ---------- --------- ----------- -------------------- ----------------------
NIS5 ROUTINE2 4861544 2025-01-01 22:01:25 7 RTYPE VARCHAR2(40) 36 .00000010285 0 FREQUENCY 36 13种呼吸道病原体检测 真菌三项病原体核酸检测
4861544 2025-01-01 22:01:25 8 RITEM VARCHAR2(40) 32 .00000010285 0 FREQUENCY 32 13种呼吸道病原体检测 中性粒细胞
--//表ROUTINE2 1088M,而RTYPE,RITEM的差别值分布为36,32。要优化上面的sql语句分布建立对应索引简直就是资源浪费。
--//而且类似的sql语句另有一大堆,不想贴出来了,这样的项目简直就是豆腐渣中的豆腐渣工程。
--//上该公司网站,我至心不知道这样的产品竟然到处在买,简直就是垃圾,站在优化的角度又是一个豆腐渣工程。
--//要优化它,只能建立索引,还好NUM_NULLS=0,要给字段加not null约束,然后建立相应索引。
--//执行计划可以猜测出来走快速全索引扫描(需要另外排序),或者走全索引扫描(不需要排序).
--//我在想实际上每个值仅仅返回1行,是否可以通过递归实现类似功能,这样可以减少逻辑读,规避快速全索引扫描或者全索引扫描带
--//来的逻辑读,这样表索引应该也不小,即使选择压缩模式。
--//通过例子验证我的想法。
2.情况:
SCOTT@book> @ ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
create table t as select * from all_objects;
create index i_t_owner on t(owner) COMPRESS 1;
--//alter table t modify owner not null;这步不需要。
--//分析略。
3.测试:
SCOTT@book> @ sl all
alter session set statistics_level = all;
Session altered.
SCOTT@book> select distinct owner from t;
OWNER
------------------------------
OWBSYS_AUDIT
MDSYS
CTXSYS
FLOWS_FILES
HR
OLAPSYS
OUTLN
OWBSYS
PUBLIC
APEX_030200
EXFSYS
ORACLE_OCM
SCOTT
SYSTEM
DBSNMP
OE
ORDPLUGINS
ORDSYS
PM
SH
SYSMAN
APPQOSSYS
BI
IX
ORDDATA
XDB
SI_INFORMTN_SCHEMA
SYS
WMSYS
29 rows selected.
SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID g3ywa5u5raj7d, child number 0
-------------------------------------
select distinct owner from t
Plan hash value: 4043955095
--------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 40 (100)| | 29 |00:00:00.04 | 138 | | | |
| 1 | HASH UNIQUE | | 1 | 29 | 174 | 40 (8)| 00:00:01 | 29 |00:00:00.04 | 138 | 5686K| 1858K| 1246K (0)|
| 2 | INDEX FAST FULL SCAN| I_T_OWNER | 1 | 84801 | 496K| 37 (0)| 00:00:01 | 84801 |00:00:00.02 | 138 | | | |
--------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T@SEL$1
--//oracle选择快速全扫描,然后hash unique排序,逻辑读138.
4.使用递归方式:
WITH t_rec (owner) AS (
SELECT MIN (owner) FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.owner) FROM t_rec c WHERE c.owner IS NOT NULL)
SELECT * FROM t_rec;
OWNER
------------------------------
APEX_030200
APPQOSSYS
BI
CTXSYS
DBSNMP
EXFSYS
FLOWS_FILES
HR
IX
MDSYS
OE
OLAPSYS
ORACLE_OCM
ORDDATA
ORDPLUGINS
ORDSYS
OUTLN
OWBSYS
OWBSYS_AUDIT
PM
PUBLIC
SCOTT
SH
SI_INFORMTN_SCHEMA
SYS
SYSMAN
SYSTEM
WMSYS
XDB
30 rows selected.
--//留意这样查询返回一个空行.
SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 46c0waacwrkh6, child number 0
-------------------------------------
WITH t_rec (owner) AS ( SELECT MIN (owner) FROM t UNION ALL SELECT
(SELECT MIN (owner) FROM t WHERE owner > c.owner) FROM t_rec c WHERE
c.owner IS NOT NULL) SELECT * FROM t_rec
Plan hash value: 3372523748
-------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 4 (100)| | 30 |00:00:00.01 | 35 |
| 1 | VIEW | | 1 | 2 | 34 | 4 (0)| 00:00:01 | 30 |00:00:00.01 | 35 |
| 2 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | 1 | | | | | 30 |00:00:00.01 | 35 |
| 3 | SORT AGGREGATE | | 1 | 1 | 6 | | | 1 |00:00:00.01 | 2 |
| 4 | INDEX FULL SCAN (MIN/MAX) | I_T_OWNER | 1 | 1 | 6 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 2 |
| 5 | SORT AGGREGATE | | 29 | 1 | 6 | | | 29 |00:00:00.01 | 33 |
| 6 | FIRST ROW | | 29 | 1 | 6 | 2 (0)| 00:00:01 | 28 |00:00:00.01 | 33 |
|* 7 | INDEX RANGE SCAN (MIN/MAX) | I_T_OWNER | 29 | 1 | 6 | 2 (0)| 00:00:01 | 28 |00:00:00.01 | 33 |
| 8 | RECURSIVE WITH PUMP | | 30 | | | | | 29 |00:00:00.01 | 0 |
-------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SET$1 / T_REC@SEL$4
2 - SET$1
3 - SEL$1
4 - SEL$1 / T@SEL$1
5 - SEL$3
7 - SEL$3 / T@SEL$3
8 - SEL$2
Predicate Information (identified by operation id):
---------------------------------------------------
7 - access("OWNER">:B1)
--//充实使用取最大最小值,减少了逻辑读.我的表不大,逻辑读也就是35.如果返回记录许多的话效果也许没有这么好..
--//排除null应该修改如下:
WITH t_rec (ownerx) AS (
SELECT MIN (owner) ownerx FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM t_rec c WHERE c.ownerx IS NOT NULL)
SELECT * FROM t_rec where ownerx is not null;
--//如果存在空值呢?
SCOTT@book> alter table t modify owner null;
Table altered.
SCOTT@book> update t set owner=null where object_id=2;
1 row updated.
SCOTT@book> commit ;
Commit complete.
SCOTT@book> select distinct owner from t;
...
--//返回30行.
SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID g3ywa5u5raj7d, child number 0
-------------------------------------
select distinct owner from t
Plan hash value: 1793979440
------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 340 (100)| | 30 |00:00:00.04 | 1215 | | | |
| 1 | HASH UNIQUE | | 1 | 29 | 174 | 340 (1)| 00:00:05 | 30 |00:00:00.04 | 1215 | 5686K| 1858K| 1246K (0)|
| 2 | TABLE ACCESS FULL| T | 1 | 84801 | 496K| 338 (1)| 00:00:05 | 84801 |00:00:00.02 | 1215 | | | |
------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T@SEL$1
--//owner索引不再有效,因为存在null值.
WITH t_rec (ownerx) AS (
SELECT MIN (owner) ownerx FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM t_rec c WHERE c.ownerx IS NOT NULL)
SELECT * FROM t_rec where ownerx is not null;
--//执行计划同上.返回29条.
WITH t_rec (ownerx) AS (
SELECT MIN (owner) ownerx FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM t_rec c WHERE c.ownerx IS NOT NULL)
SELECT * FROM t_rec where ownerx is not null
union all
select owner from t where owner is null and rownum=1;
--//如果全表扫描很快满足owner is null and rownum=1的情况,逻辑读不会很高,不行只能建立函数索引,解决这个问题.
Plan hash value: 2493656999
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 342 (100)| | 30 |00:00:00.01 | 38 |
| 1 | UNION-ALL | | 1 | | | | | 30 |00:00:00.01 | 38 |
|* 2 | VIEW | | 1 | 2 | 34 | 4 (0)| 00:00:01 | 29 |00:00:00.01 | 35 |
| 3 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | 1 | | | | | 30 |00:00:00.01 | 35 |
| 4 | SORT AGGREGATE | | 1 | 1 | 6 | | | 1 |00:00:00.01 | 2 |
| 5 | INDEX FULL SCAN (MIN/MAX) | I_T_OWNER | 1 | 1 | 6 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 2 |
| 6 | SORT AGGREGATE | | 29 | 1 | 6 | | | 29 |00:00:00.01 | 33 |
| 7 | FIRST ROW | | 29 | 1 | 6 | 2 (0)| 00:00:01 | 28 |00:00:00.01 | 33 |
|* 8 | INDEX RANGE SCAN (MIN/MAX) | I_T_OWNER | 29 | 1 | 6 | 2 (0)| 00:00:01 | 28 |00:00:00.01 | 33 |
| 9 | RECURSIVE WITH PUMP | | 30 | | | | | 29 |00:00:00.01 | 0 |
|* 10 | COUNT STOPKEY | | 1 | | | | | 1 |00:00:00.01 | 3 |
|* 11 | TABLE ACCESS FULL | T | 1 | 1 | 6 | 338 (1)| 00:00:05 | 1 |00:00:00.01 | 3 |
--------------------------------------------------------------------------------------------------------------------------------------------------
--//最差扫描全部.
Plan hash value: 2493656999
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 342 (100)| | 30 |00:00:00.01 | 1240 |
| 1 | UNION-ALL | | 1 | | | | | 30 |00:00:00.01 | 1240 |
|* 2 | VIEW | | 1 | 2 | 34 | 4 (0)| 00:00:01 | 29 |00:00:00.01 | 35 |
| 3 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | 1 | | | | | 30 |00:00:00.01 | 35 |
| 4 | SORT AGGREGATE | | 1 | 1 | 6 | | | 1 |00:00:00.01 | 2 |
| 5 | INDEX FULL SCAN (MIN/MAX) | I_T_OWNER | 1 | 1 | 6 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 2 |
| 6 | SORT AGGREGATE | | 29 | 1 | 6 | | | 29 |00:00:00.01 | 33 |
| 7 | FIRST ROW | | 29 | 1 | 6 | 2 (0)| 00:00:01 | 28 |00:00:00.01 | 33 |
|* 8 | INDEX RANGE SCAN (MIN/MAX) | I_T_OWNER | 29 | 1 | 6 | 2 (0)| 00:00:01 | 28 |00:00:00.01 | 33 |
| 9 | RECURSIVE WITH PUMP | | 30 | | | | | 29 |00:00:00.01 | 0 |
|* 10 | COUNT STOPKEY | | 1 | | | | | 1 |00:00:00.01 | 1205 |
|* 11 | TABLE ACCESS FULL | T | 1 | 1 | 6 | 338 (1)| 00:00:05 | 1 |00:00:00.01 | 1205 |
--------------------------------------------------------------------------------------------------------------------------------------------------
--//删除原来索引,建立函数索引再测试
SCOTT@book> create index if_t_owner on t(owner,0) COMPRESS 2;
Index created.
SCOTT@book> select distinct owner from t;
...
--//返回30行.
SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID g3ywa5u5raj7d, child number 0
-------------------------------------
select distinct owner from t
Plan hash value: 3764002236
------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 46 (100)| | 30 |00:00:00.04 | 161 | 154 | | | |
| 1 | HASH UNIQUE | | 1 | 29 | 174 | 46 (7)| 00:00:01 | 30 |00:00:00.04 | 161 | 154 | 5686K| 1858K| 1243K (0)|
| 2 | INDEX FAST FULL SCAN| IF_T_OWNER | 1 | 84801 | 496K| 43 (0)| 00:00:01 | 84801 |00:00:00.02 | 161 | 154 | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T@SEL$1
WITH t_rec (ownerx) AS (
SELECT MIN (owner) ownerx FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM t_rec c WHERE c.ownerx IS NOT NULL)
SELECT * FROM t_rec where ownerx is not null
union all
select owner from t where owner is null and rownum=1;
SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID cu7sbwm03bu92, child number 0
-------------------------------------
WITH t_rec (ownerx) AS ( SELECT MIN (owner) ownerx FROM t UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM
t_rec c WHERE c.ownerx IS NOT NULL) SELECT * FROM t_rec where ownerx is
not null union all select owner from t where owner is null and rownum=1
Plan hash value: 995977602
---------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 5 (100)| | 30 |00:00:00.01 | 37 |
| 1 | UNION-ALL | | 1 | | | | | 30 |00:00:00.01 | 37 |
|* 2 | VIEW | | 1 | 2 | 34 | 4 (0)| 00:00:01 | 29 |00:00:00.01 | 35 |
| 3 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | 1 | | | | | 30 |00:00:00.01 | 35 |
| 4 | SORT AGGREGATE | | 1 | 1 | 6 | | | 1 |00:00:00.01 | 2 |
| 5 | INDEX FULL SCAN (MIN/MAX) | IF_T_OWNER | 1 | 1 | 6 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 2 |
| 6 | SORT AGGREGATE | | 29 | 1 | 6 | | | 29 |00:00:00.01 | 33 |
| 7 | FIRST ROW | | 29 | 1 | 6 | 2 (0)| 00:00:01 | 28 |00:00:00.01 | 33 |
|* 8 | INDEX RANGE SCAN (MIN/MAX) | IF_T_OWNER | 29 | 1 | 6 | 2 (0)| 00:00:01 | 28 |00:00:00.01 | 33 |
| 9 | RECURSIVE WITH PUMP | | 30 | | | | | 29 |00:00:00.01 | 0 |
|* 10 | COUNT STOPKEY | | 1 | | | | | 1 |00:00:00.01 | 2 |
|* 11 | INDEX RANGE SCAN | IF_T_OWNER | 1 | 1 | 6 | 1 (0)| 00:00:01 | 1 |00:00:00.01 | 2 |
---------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SET$2
2 - SET$1 / T_REC@SEL$1
3 - SET$1
4 - SEL$2
5 - SEL$2 / T@SEL$2
6 - SEL$4
8 - SEL$4 / T@SEL$4
9 - SEL$3
10 - SEL$5
11 - SEL$5 / T@SEL$5
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OWNERX" IS NOT NULL)
8 - access("OWNER">:B1 AND "OWNER" IS NOT NULL)
10 - filter(ROWNUM=1)
11 - access("OWNER" IS NULL)
3.思索:
--//另有怎么方法可以解决这个问题,物化视图可以吗?感觉这样无法优化,写成group by也许可以。
select owner from t group by owner;
--//有机会测试看看。
--//另外以上代码存在问题,也许数据模型不应该允许生产系统执行这类sql语句。应该有类似应用的数据字典生存这些信息。
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作!更多信息从访问主页:qidao123.com:ToB企服之家,中国第一个企服评测及商务社交产业平台。 |