[20250103]使用递归实现distinct功能.txt

打印 上一主题 下一主题

主题 844|帖子 844|积分 2532

[20250103]使用递归实现distinct功能.txt

--//生产系统遇到实际上许多条类似语句,顺便拿此中几个出来,至心不知道开发如何学计算机的。

1.问题提出:
SYS@127.0.0.1:9106/xtdb/xtdb2> @ sql_id c29undaquszs6
-- SQL_ID = c29undaquszs6 come from shared pool
select distinct ritem from routine2;

SYS@127.0.0.1:9106/xtdb/xtdb2> @ sql_id fhz2xwcnx2uyh
-- SQL_ID = fhz2xwcnx2uyh come from shared pool
select distinct rtype from routine2;

SYS@127.0.0.1:9106/xtdb/xtdb2> @ seg2 nis5.routine2 ''
SEG_MB OWNER SEGMENT_NAME SEGMENT_TYPE SEG_TABLESPACE_NAME BLOCKS HDRFIL HDRBLK
------ ----- ------------ ------------ ------------------- ------ ------ ------
  1088 NIS5  ROUTINE2     TABLE        NIS                 139264     21 843528

SYS@127.0.0.1:9106/xtdb/xtdb2> @ desczz nis5.routine2 ritem,rtype
eXtended describe of nis5.routine2

DISPLAY TABLE_NAME OF COLUMN_NAME INFORMATION.
INPUT   OWNER.TABLE_NAME 
SAMPLE  : @ desczz TABLE_NAME column_name1,column_name2
IF NOT INPUT  ,USE "" .
Owner Table_Name SAMPLE_SIZE LAST_ANALYZED       Col# Column Name Null? Type         NUM_DISTINCT        Density  NUM_NULLS HISTOGRAM NUM_BUCKETS Low_value            High_value
----- ---------- ----------- ------------------- ---- ----------- ----- ------------ ------------ -------------- ---------- --------- ----------- -------------------- ----------------------
NIS5  ROUTINE2       4861544 2025-01-01 22:01:25    7 RTYPE             VARCHAR2(40)           36   .00000010285          0 FREQUENCY          36 13种呼吸道病原体检测 真菌三项病原体核酸检测
                     4861544 2025-01-01 22:01:25    8 RITEM             VARCHAR2(40)           32   .00000010285          0 FREQUENCY          32 13种呼吸道病原体检测 中性粒细胞

--//表ROUTINE2 1088M,而RTYPE,RITEM的差别值分布为36,32。要优化上面的sql语句分布建立对应索引简直就是资源浪费。
--//而且类似的sql语句另有一大堆,不想贴出来了,这样的项目简直就是豆腐渣中的豆腐渣工程。
--//上该公司网站,我至心不知道这样的产品竟然到处在买,简直就是垃圾,站在优化的角度又是一个豆腐渣工程。

--//要优化它,只能建立索引,还好NUM_NULLS=0,要给字段加not null约束,然后建立相应索引。
--//执行计划可以猜测出来走快速全索引扫描(需要另外排序),或者走全索引扫描(不需要排序).
--//我在想实际上每个值仅仅返回1行,是否可以通过递归实现类似功能,这样可以减少逻辑读,规避快速全索引扫描或者全索引扫描带
--//来的逻辑读,这样表索引应该也不小,即使选择压缩模式。
--//通过例子验证我的想法。

2.情况:
SCOTT@book> @ ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

create table t as select * from all_objects;
create index i_t_owner on t(owner) COMPRESS 1;
--//alter table t modify owner not null;这步不需要。
--//分析略。

3.测试:

SCOTT@book> @ sl all
alter session set statistics_level = all;
Session altered.

SCOTT@book> select distinct owner from t;
OWNER
------------------------------
OWBSYS_AUDIT
MDSYS
CTXSYS
FLOWS_FILES
HR
OLAPSYS
OUTLN
OWBSYS
PUBLIC
APEX_030200
EXFSYS
ORACLE_OCM
SCOTT
SYSTEM
DBSNMP
OE
ORDPLUGINS
ORDSYS
PM
SH
SYSMAN
APPQOSSYS
BI
IX
ORDDATA
XDB
SI_INFORMTN_SCHEMA
SYS
WMSYS
29 rows selected.

SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  g3ywa5u5raj7d, child number 0
-------------------------------------
select distinct owner from t
Plan hash value: 4043955095
--------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |      1 |        |       |    40 (100)|          |     29 |00:00:00.04 |     138 |       |       |          |
|   1 |  HASH UNIQUE          |           |      1 |     29 |   174 |    40   (8)| 00:00:01 |     29 |00:00:00.04 |     138 |  5686K|  1858K| 1246K (0)|
|   2 |   INDEX FAST FULL SCAN| I_T_OWNER |      1 |  84801 |   496K|    37   (0)| 00:00:01 |  84801 |00:00:00.02 |     138 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - SEL$1 / T@SEL$1
--//oracle选择快速全扫描,然后hash unique排序,逻辑读138.

4.使用递归方式:
WITH t_rec (owner) AS (
SELECT MIN (owner) FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.owner) FROM t_rec c WHERE c.owner IS NOT NULL)
SELECT * FROM t_rec;

OWNER
------------------------------
APEX_030200
APPQOSSYS
BI
CTXSYS
DBSNMP
EXFSYS
FLOWS_FILES
HR
IX
MDSYS
OE
OLAPSYS
ORACLE_OCM
ORDDATA
ORDPLUGINS
ORDSYS
OUTLN
OWBSYS
OWBSYS_AUDIT
PM
PUBLIC
SCOTT
SH
SI_INFORMTN_SCHEMA
SYS
SYSMAN
SYSTEM
WMSYS
XDB


30 rows selected.
--//留意这样查询返回一个空行.

SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  46c0waacwrkh6, child number 0
-------------------------------------
WITH t_rec (owner) AS ( SELECT MIN (owner) FROM t UNION ALL SELECT
(SELECT MIN (owner) FROM t WHERE owner > c.owner) FROM t_rec c WHERE
c.owner IS NOT NULL) SELECT * FROM t_rec
Plan hash value: 3372523748
-------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |           |      1 |        |       |     4 (100)|          |     30 |00:00:00.01 |      35 |
|   1 |  VIEW                                     |           |      1 |      2 |    34 |     4   (0)| 00:00:01 |     30 |00:00:00.01 |      35 |
|   2 |   UNION ALL (RECURSIVE WITH) BREADTH FIRST|           |      1 |        |       |            |          |     30 |00:00:00.01 |      35 |
|   3 |    SORT AGGREGATE                         |           |      1 |      1 |     6 |            |          |      1 |00:00:00.01 |       2 |
|   4 |     INDEX FULL SCAN (MIN/MAX)             | I_T_OWNER |      1 |      1 |     6 |     2   (0)| 00:00:01 |      1 |00:00:00.01 |       2 |
|   5 |    SORT AGGREGATE                         |           |     29 |      1 |     6 |            |          |     29 |00:00:00.01 |      33 |
|   6 |     FIRST ROW                             |           |     29 |      1 |     6 |     2   (0)| 00:00:01 |     28 |00:00:00.01 |      33 |
|*  7 |      INDEX RANGE SCAN (MIN/MAX)           | I_T_OWNER |     29 |      1 |     6 |     2   (0)| 00:00:01 |     28 |00:00:00.01 |      33 |
|   8 |    RECURSIVE WITH PUMP                    |           |     30 |        |       |            |          |     29 |00:00:00.01 |       0 |
-------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SET$1 / T_REC@SEL$4
   2 - SET$1
   3 - SEL$1
   4 - SEL$1 / T@SEL$1
   5 - SEL$3
   7 - SEL$3 / T@SEL$3
   8 - SEL$2
Predicate Information (identified by operation id):
---------------------------------------------------
   7 - access("OWNER">:B1)
--//充实使用取最大最小值,减少了逻辑读.我的表不大,逻辑读也就是35.如果返回记录许多的话效果也许没有这么好..
--//排除null应该修改如下:

WITH t_rec (ownerx) AS (
SELECT MIN (owner) ownerx FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM t_rec c WHERE c.ownerx IS NOT NULL)
SELECT * FROM t_rec where ownerx is not null;

--//如果存在空值呢?

SCOTT@book> alter table t modify owner  null;
Table altered.

SCOTT@book> update t set owner=null  where object_id=2;
1 row updated.

SCOTT@book> commit ;
Commit complete.

SCOTT@book> select distinct owner from t;
...
--//返回30行.

SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  g3ywa5u5raj7d, child number 0
-------------------------------------
select distinct owner from t
Plan hash value: 1793979440
------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |       |   340 (100)|          |     30 |00:00:00.04 |    1215 |       |       |          |
|   1 |  HASH UNIQUE       |      |      1 |     29 |   174 |   340   (1)| 00:00:05 |     30 |00:00:00.04 |    1215 |  5686K|  1858K| 1246K (0)|
|   2 |   TABLE ACCESS FULL| T    |      1 |  84801 |   496K|   338   (1)| 00:00:05 |  84801 |00:00:00.02 |    1215 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - SEL$1 / T@SEL$1
--//owner索引不再有效,因为存在null值.

WITH t_rec (ownerx) AS (
SELECT MIN (owner) ownerx FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM t_rec c WHERE c.ownerx IS NOT NULL)
SELECT * FROM t_rec where ownerx is not null;
--//执行计划同上.返回29条.

WITH t_rec (ownerx) AS (
SELECT MIN (owner) ownerx FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM t_rec c WHERE c.ownerx IS NOT NULL)
SELECT * FROM t_rec where ownerx is not null
union all
select owner from t where owner is null and rownum=1;

--//如果全表扫描很快满足owner is null and rownum=1的情况,逻辑读不会很高,不行只能建立函数索引,解决这个问题.

Plan hash value: 2493656999
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |           |      1 |        |       |   342 (100)|          |     30 |00:00:00.01 |      38 |
|   1 |  UNION-ALL                                 |           |      1 |        |       |            |          |     30 |00:00:00.01 |      38 |
|*  2 |   VIEW                                     |           |      1 |      2 |    34 |     4   (0)| 00:00:01 |     29 |00:00:00.01 |      35 |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|           |      1 |        |       |            |          |     30 |00:00:00.01 |      35 |
|   4 |     SORT AGGREGATE                         |           |      1 |      1 |     6 |            |          |      1 |00:00:00.01 |       2 |
|   5 |      INDEX FULL SCAN (MIN/MAX)             | I_T_OWNER |      1 |      1 |     6 |     2   (0)| 00:00:01 |      1 |00:00:00.01 |       2 |
|   6 |     SORT AGGREGATE                         |           |     29 |      1 |     6 |            |          |     29 |00:00:00.01 |      33 |
|   7 |      FIRST ROW                             |           |     29 |      1 |     6 |     2   (0)| 00:00:01 |     28 |00:00:00.01 |      33 |
|*  8 |       INDEX RANGE SCAN (MIN/MAX)           | I_T_OWNER |     29 |      1 |     6 |     2   (0)| 00:00:01 |     28 |00:00:00.01 |      33 |
|   9 |     RECURSIVE WITH PUMP                    |           |     30 |        |       |            |          |     29 |00:00:00.01 |       0 |
|* 10 |   COUNT STOPKEY                            |           |      1 |        |       |            |          |      1 |00:00:00.01 |       3 |
|* 11 |    TABLE ACCESS FULL                       | T         |      1 |      1 |     6 |   338   (1)| 00:00:05 |      1 |00:00:00.01 |       3 |
--------------------------------------------------------------------------------------------------------------------------------------------------

--//最差扫描全部.
Plan hash value: 2493656999
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |           |      1 |        |       |   342 (100)|          |     30 |00:00:00.01 |    1240 |
|   1 |  UNION-ALL                                 |           |      1 |        |       |            |          |     30 |00:00:00.01 |    1240 |
|*  2 |   VIEW                                     |           |      1 |      2 |    34 |     4   (0)| 00:00:01 |     29 |00:00:00.01 |      35 |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|           |      1 |        |       |            |          |     30 |00:00:00.01 |      35 |
|   4 |     SORT AGGREGATE                         |           |      1 |      1 |     6 |            |          |      1 |00:00:00.01 |       2 |
|   5 |      INDEX FULL SCAN (MIN/MAX)             | I_T_OWNER |      1 |      1 |     6 |     2   (0)| 00:00:01 |      1 |00:00:00.01 |       2 |
|   6 |     SORT AGGREGATE                         |           |     29 |      1 |     6 |            |          |     29 |00:00:00.01 |      33 |
|   7 |      FIRST ROW                             |           |     29 |      1 |     6 |     2   (0)| 00:00:01 |     28 |00:00:00.01 |      33 |
|*  8 |       INDEX RANGE SCAN (MIN/MAX)           | I_T_OWNER |     29 |      1 |     6 |     2   (0)| 00:00:01 |     28 |00:00:00.01 |      33 |
|   9 |     RECURSIVE WITH PUMP                    |           |     30 |        |       |            |          |     29 |00:00:00.01 |       0 |
|* 10 |   COUNT STOPKEY                            |           |      1 |        |       |            |          |      1 |00:00:00.01 |    1205 |
|* 11 |    TABLE ACCESS FULL                       | T         |      1 |      1 |     6 |   338   (1)| 00:00:05 |      1 |00:00:00.01 |    1205 |
--------------------------------------------------------------------------------------------------------------------------------------------------

--//删除原来索引,建立函数索引再测试
SCOTT@book> create index if_t_owner on t(owner,0) COMPRESS 2;
Index created.

SCOTT@book> select distinct owner from t;
...
--//返回30行.

SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  g3ywa5u5raj7d, child number 0
-------------------------------------
select distinct owner from t
Plan hash value: 3764002236
------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name       | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |            |      1 |        |       |    46 (100)|          |     30 |00:00:00.04 |     161 |    154 |       |       |          |
|   1 |  HASH UNIQUE          |            |      1 |     29 |   174 |    46   (7)| 00:00:01 |     30 |00:00:00.04 |     161 |    154 |  5686K|  1858K| 1243K (0)|
|   2 |   INDEX FAST FULL SCAN| IF_T_OWNER |      1 |  84801 |   496K|    43   (0)| 00:00:01 |  84801 |00:00:00.02 |     161 |    154 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - SEL$1 / T@SEL$1

WITH t_rec (ownerx) AS (
SELECT MIN (owner) ownerx FROM t
UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM t_rec c WHERE c.ownerx IS NOT NULL)
SELECT * FROM t_rec where ownerx is not null
union all
select owner from t where owner is null and rownum=1;
   
SCOTT@book> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  cu7sbwm03bu92, child number 0
-------------------------------------
WITH t_rec (ownerx) AS ( SELECT MIN (owner) ownerx FROM t UNION ALL
SELECT (SELECT MIN (owner) FROM t WHERE owner > c.ownerx) ownerx FROM
t_rec c WHERE c.ownerx IS NOT NULL) SELECT * FROM t_rec where ownerx is
not null union all select owner from t where owner is null and rownum=1
Plan hash value: 995977602
---------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name       | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |            |      1 |        |       |     5 (100)|          |     30 |00:00:00.01 |      37 |
|   1 |  UNION-ALL                                 |            |      1 |        |       |            |          |     30 |00:00:00.01 |      37 |
|*  2 |   VIEW                                     |            |      1 |      2 |    34 |     4   (0)| 00:00:01 |     29 |00:00:00.01 |      35 |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|            |      1 |        |       |            |          |     30 |00:00:00.01 |      35 |
|   4 |     SORT AGGREGATE                         |            |      1 |      1 |     6 |            |          |      1 |00:00:00.01 |       2 |
|   5 |      INDEX FULL SCAN (MIN/MAX)             | IF_T_OWNER |      1 |      1 |     6 |     2   (0)| 00:00:01 |      1 |00:00:00.01 |       2 |
|   6 |     SORT AGGREGATE                         |            |     29 |      1 |     6 |            |          |     29 |00:00:00.01 |      33 |
|   7 |      FIRST ROW                             |            |     29 |      1 |     6 |     2   (0)| 00:00:01 |     28 |00:00:00.01 |      33 |
|*  8 |       INDEX RANGE SCAN (MIN/MAX)           | IF_T_OWNER |     29 |      1 |     6 |     2   (0)| 00:00:01 |     28 |00:00:00.01 |      33 |
|   9 |     RECURSIVE WITH PUMP                    |            |     30 |        |       |            |          |     29 |00:00:00.01 |       0 |
|* 10 |   COUNT STOPKEY                            |            |      1 |        |       |            |          |      1 |00:00:00.01 |       2 |
|* 11 |    INDEX RANGE SCAN                        | IF_T_OWNER |      1 |      1 |     6 |     1   (0)| 00:00:01 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SET$2
   2 - SET$1 / T_REC@SEL$1
   3 - SET$1
   4 - SEL$2
   5 - SEL$2 / T@SEL$2
   6 - SEL$4
   8 - SEL$4 / T@SEL$4
   9 - SEL$3
  10 - SEL$5
  11 - SEL$5 / T@SEL$5
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("OWNERX" IS NOT NULL)
   8 - access("OWNER">:B1 AND "OWNER" IS NOT NULL)
  10 - filter(ROWNUM=1)
  11 - access("OWNER" IS NULL)

3.思索:
--//另有怎么方法可以解决这个问题,物化视图可以吗?感觉这样无法优化,写成group by也许可以。
select owner from t group by owner;
--//有机会测试看看。
--//另外以上代码存在问题,也许数据模型不应该允许生产系统执行这类sql语句。应该有类似应用的数据字典生存这些信息。
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作!更多信息从访问主页:qidao123.com:ToB企服之家,中国第一个企服评测及商务社交产业平台。
回复

使用道具 举报

0 个回复

倒序浏览

快速回复

您需要登录后才可以回帖 登录 or 立即注册

本版积分规则

饭宝

金牌会员
这个人很懒什么都没写!

标签云

快速回复 返回顶部 返回列表