010.count()到底能有多快?
1.案例说明
一个大表的COUNT,究竟能有多快? 除类似物化视图的做法,我们所能做到的极限能有多快?这不是一个真实的案例,而是根据笔者在网上发的一篇帖子整理而来。通过对一条SQL,采用多种方式持续优化过程,表明SQL优化的手段随着优化者掌握的技能增多,其可能存在的手段也在不断增多。
2.数据准备
数据准备的代码如下:
create table t3 as select * from dba_objects;
insert into t3 select * from t3;
select count(*) from t4; <=== 107800000 --数据量有1亿多条
select sum(bytes)/1024/1024 mb from user_segments where segment_name='T3'; <===9477MB --数据对象大小差不多10g
3.使用全表扫描
全表扫描的代码如下( 共用15.71s)
SQL> set autot on
SQL> select count(*) from t3;
COUNT(*)
----------
107800000
Elapsed: 00:00:15.71
Execution Plan
----------------------------------------------------------
Plan hash value: 463314188
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 370K (9)| 01:14:09 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T3 | 5244M| 370K (9)| 01:14:09 |
-------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2061571 consistent gets
1241504 physical reads
0 redo size
528 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
由上可知,全表扫描时间比较长。
4.主键索引
CREATE SEQUENCE seq_t3;
update t3 set id=(seq_t3.nextval);
alter table t3 add constraint pk_t3 primary key (id);
exec dbms_stats.gather_index_stats('lsq', 'pk_t3',estimate_percent =>10)
select count (*) from t3;
SQL> select count (*) from t3;
COUNT(*)
----------
107763441
Elapsed: 00:00:03.44
Execution Plan
----------------------------------------------------------
Plan hash value: 1276951659
-----------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 66164 (1)| 00:13:14 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| PK_T3 | 90M| 66164 (1)| 00:13:14 |
-----------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
241473 consistent gets
241337 physical reads
0 redo size
530 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
通过引入索引,执行计划变成索引快速全扫描,因扫描块数较少,因此耗时也大大减少,共用33秒,快多了。
5.常数索引
create index idx_0 on t3(0);
exec dbms_stats.gather_index_stats('lsq', 'idx_0',estimate_percent =>10)
select count (*) from t3;
SQL> create index idx_0 on t3(0);
Index created.
Elapsed: 00:01:17.84
SQL> delete archivelog all;
SQL> exec dbms_stats.gather_index_stats('lsq', 'idx_0',estimate_percent =>10)
PL/SQL procedure successfully completed.
Elapsed: 00:00:04.72
SQL> select count (*) from t3;
COUNT(*)
----------
107763441
Elapsed: 00:00:02.69
Execution Plan
----------------------------------------------------------
Plan hash value: 3583767517
-----------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 53766 (2)| 00:10:46 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| IDX_0 | 90M| 53766 (2)| 00:10:46 |
-----------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
195725 consistent gets
195589 physical reads
0 redo size
530 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
常数索引在存储密度上要高于普通字段索引,因此扫描块数更少,耗时也更少,共耗时29秒。
6.常数索引压缩
create index idx_0 on t3(0) compress;
exec dbms_stats.gather_index_stats('lsq', 'idx_0',estimate_percent =>10)
select count (*) from t3;
SQL> drop index idx_0;
Index dropped.
Elapsed: 00:00:00.10
SQL>
SQL>
SQL> create index idx_0 on t3(0) compress;
Index created.
Elapsed: 00:01:06.71
SQL> commit;
Commit complete.
Elapsed: 00:00:00.00
SQL> exec dbms_stats.gather_index_stats('lsq', 'idx_0',estimate_percent =>10)
PL/SQL procedure successfully completed.
Elapsed: 00:00:04.47
SQL> select count (*) from t3;
COUNT(*)
----------
107763441
Elapsed: 00:00:03.33
Execution Plan
----------------------------------------------------------
Plan hash value: 3583767517
-----------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 45689 (2)| 00:09:09 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| IDX_0 | 90M| 45689 (2)| 00:09:09 |
-----------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
19 recursive calls
0 db block gets
166006 consistent gets
163937 physical reads
0 redo size
530 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
7.位图索引
create bitmap index idx_status2 on t3(status);
exec dbms_stats.gather_index_stats('lsq', 'idx_status2',estimate_percent =>10)
select count (*) from t3;
SQL> drop index idx_0;
Index dropped.
Elapsed: 00:00:00.09
SQL> create bitmap index idx_status2 on t3(status);
Index created.
Elapsed: 00:00:26.20
SQL> commit;
Commit complete.
Elapsed: 00:00:00.00
SQL> exec dbms_stats.gather_index_stats('lsq', 'idx_status2',estimate_percent =>10)
PL/SQL procedure successfully completed.
Elapsed: 00:00:15.18
SQL> select count (*) from t3;
COUNT(*)
----------
107763441
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
Plan hash value: 2671630479
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2449 (1)| 00:00:30 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | BITMAP CONVERSION COUNT | | 90M| 2449 (1)| 00:00:30 |
| 3 | BITMAP INDEX FAST FULL SCAN| IDX_STATUS2 | | | |
-------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2678 consistent gets
0 physical reads
0 redo size
530 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
8.位图索引加并行
create bitmap index idx_status2 on t3(status) parallel 8;
exec dbms_stats.gather_index_stats('lsq', 'idx_status2',estimate_percent =>10)
select count (*) from t3;
SQL> drop index idx_status2;
Index dropped.
Elapsed: 00:00:00.05
SQL> create bitmap index idx_status2 on t3(status) parallel 8;
Index created.
Elapsed: 00:00:18.12
SQL> exec dbms_stats.gather_index_stats('lsq', 'idx_status2',estimate_percent =>10)
PL/SQL procedure successfully completed.
Elapsed: 00:00:10.52
SQL> select count (*) from t3;
COUNT(*)
----------
107763441
Elapsed: 00:00:00.05
Execution Plan
----------------------------------------------------------
Plan hash value: 1148193717
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name |Rows| Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2514 (1)| 00:00:31 | | | |
| 1 | SORT AGGREGATE | | 1 | | | | | |
| 2 | PX COORDINATOR | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | | |Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | | |Q1,00 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 90M| 2514 (1)| 00:00:31 |Q1,00 | PCWC | |
| 6 | BITMAP CONVERSION COUNT | | 90M| 2514 (1)| 00:00:31 |Q1,00 | PCWP | |
| 7 | BITMAP INDEX FAST FULL | IDX_STATUS2 | | | |Q1,00 | PCWP | |
-------------------------------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
43 recursive calls
0 db block gets
4167 consistent gets
2694 physical reads
0 redo size
530 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
并行技术可以较快执行速度。一致性读有所增加,但并行还是能加快整体运行速度,这种手段耗时0.03秒,竟然又快了不少。
9.分析结论
- 位图索引可以按很高密度存储数据,因此往往比B树索引小很多。前提是在基数比较小的情况下。
- 位图索引是保存空值的,因此可以在count中利用。
- 众所周知,位图索引不太适合OLTP类型数据库。该实例仅为了测试展示。
10.给我们的启示
优化没有止境,对数据库了解越多,你能想到的方法越多就越多。