跳转至

010.count()到底能有多快?

1.案例说明

​ 一个大表的COUNT,究竟能有多快? 除类似物化视图的做法,我们所能做到的极限能有多快?这不是一个真实的案例,而是根据笔者在网上发的一篇帖子整理而来。通过对一条SQL,采用多种方式持续优化过程,表明SQL优化的手段随着优化者掌握的技能增多,其可能存在的手段也在不断增多。

2.数据准备

数据准备的代码如下:

create table t3 as select * from dba_objects;
insert into t3 select * from t3;
select count(*)  from t4;    <=== 107800000  --数据量有1亿多条
select sum(bytes)/1024/1024 mb from user_segments where segment_name='T3'; <===9477MB   --数据对象大小差不多10g

3.使用全表扫描

全表扫描的代码如下( 共用15.71s)

SQL> set autot on
SQL> select count(*)  from t3;
  COUNT(*)
----------
 107800000

Elapsed: 00:00:15.71
Execution Plan
----------------------------------------------------------
Plan hash value: 463314188
-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |   370K  (9)| 01:14:09 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T3   |  5244M|   370K  (9)| 01:14:09 |
-------------------------------------------------------------------
Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
    2061571  consistent gets
    1241504  physical reads
          0  redo size
        528  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

由上可知,全表扫描时间比较长。

4.主键索引

CREATE SEQUENCE seq_t3;
update t3 set id=(seq_t3.nextval);
alter table t3 add constraint pk_t3 primary key (id);
exec dbms_stats.gather_index_stats('lsq', 'pk_t3',estimate_percent =>10)
select count (*)  from t3;
SQL> select count (*)  from t3;

  COUNT(*)
----------
 107763441

Elapsed: 00:00:03.44

Execution Plan
----------------------------------------------------------
Plan hash value: 1276951659

-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 | 66164   (1)| 00:13:14 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| PK_T3 |    90M| 66164   (1)| 00:13:14 |
-----------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
     241473  consistent gets
     241337  physical reads
          0  redo size
        530  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

通过引入索引,执行计划变成索引快速全扫描,因扫描块数较少,因此耗时也大大减少,共用33秒,快多了。

5.常数索引

create index idx_0 on t3(0);
exec dbms_stats.gather_index_stats('lsq', 'idx_0',estimate_percent =>10)
select count (*)  from t3;
SQL> create index idx_0 on t3(0);

Index created.

Elapsed: 00:01:17.84
SQL>  delete archivelog all;

SQL> exec dbms_stats.gather_index_stats('lsq', 'idx_0',estimate_percent =>10)

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.72
SQL> select count (*)  from t3;

  COUNT(*)
----------
 107763441

Elapsed: 00:00:02.69

Execution Plan
----------------------------------------------------------
Plan hash value: 3583767517

-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 | 53766   (2)| 00:10:46 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| IDX_0 |    90M| 53766   (2)| 00:10:46 |
-----------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
     195725  consistent gets
     195589  physical reads
          0  redo size
        530  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

常数索引在存储密度上要高于普通字段索引,因此扫描块数更少,耗时也更少,共耗时29秒。

6.常数索引压缩

create index idx_0 on t3(0) compress;
exec dbms_stats.gather_index_stats('lsq', 'idx_0',estimate_percent =>10)
select count (*)  from t3;
SQL> drop index idx_0;

Index dropped.

Elapsed: 00:00:00.10
SQL> 
SQL> 
SQL> create index idx_0 on t3(0) compress;

Index created.

Elapsed: 00:01:06.71
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00
SQL> exec dbms_stats.gather_index_stats('lsq', 'idx_0',estimate_percent =>10)

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.47
SQL> select count (*)  from t3;

  COUNT(*)
----------
 107763441

Elapsed: 00:00:03.33

Execution Plan
----------------------------------------------------------
Plan hash value: 3583767517

-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 | 45689   (2)| 00:09:09 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| IDX_0 |    90M| 45689   (2)| 00:09:09 |
-----------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
         19  recursive calls
          0  db block gets
     166006  consistent gets
     163937  physical reads
          0  redo size
        530  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

7.位图索引

create bitmap index idx_status2 on t3(status);
exec dbms_stats.gather_index_stats('lsq', 'idx_status2',estimate_percent =>10)
select count (*)  from t3;
SQL> drop index idx_0;
Index dropped.

Elapsed: 00:00:00.09

SQL> create bitmap index idx_status2 on t3(status);

Index created.

Elapsed: 00:00:26.20
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00
SQL> exec dbms_stats.gather_index_stats('lsq', 'idx_status2',estimate_percent =>10)

PL/SQL procedure successfully completed.

Elapsed: 00:00:15.18
SQL> select count (*)  from t3;

  COUNT(*)
----------
 107763441

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 2671630479

-------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |     1 |  2449   (1)| 00:00:30 |
|   1 |  SORT AGGREGATE               |             |     1 |            |          |
|   2 |   BITMAP CONVERSION COUNT     |             |    90M|  2449   (1)| 00:00:30 |
|   3 |    BITMAP INDEX FAST FULL SCAN| IDX_STATUS2 |       |            |          |
-------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2678  consistent gets
          0  physical reads
          0  redo size
        530  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

8.位图索引加并行

create bitmap index idx_status2 on t3(status) parallel 8;
exec dbms_stats.gather_index_stats('lsq', 'idx_status2',estimate_percent =>10)
select count (*)  from t3;
SQL> drop  index idx_status2;

Index dropped.

Elapsed: 00:00:00.05
SQL> create bitmap index idx_status2 on t3(status) parallel 8;

Index created.

Elapsed: 00:00:18.12
SQL> exec dbms_stats.gather_index_stats('lsq', 'idx_status2',estimate_percent =>10)

PL/SQL procedure successfully completed.

Elapsed: 00:00:10.52
SQL> select count (*)  from t3;

  COUNT(*)
----------
 107763441

Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
Plan hash value: 1148193717

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        |Rows| Cost (%CPU)| Time     |  TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |  1 |  2514   (1)| 00:00:31 |      |      |            |
|   1 |  SORT AGGREGATE               |             |  1 |            |          |      |      |            |
|   2 |   PX COORDINATOR              |             |    |            |          |      |      |            |
|   3 |    PX SEND QC (RANDOM)        | :TQ10000    |  1 |            |          |Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE            |             |  1 |            |          |Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR        |             | 90M|  2514   (1)| 00:00:31 |Q1,00 | PCWC |            |
|   6 |       BITMAP CONVERSION COUNT |             | 90M|  2514   (1)| 00:00:31 |Q1,00 | PCWP |            |
|   7 |        BITMAP INDEX FAST FULL | IDX_STATUS2 |    |            |          |Q1,00 | PCWP |            |
-------------------------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
         43  recursive calls
          0  db block gets
       4167  consistent gets
       2694  physical reads
          0  redo size
        530  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

并行技术可以较快执行速度。一致性读有所增加,但并行还是能加快整体运行速度,这种手段耗时0.03秒,竟然又快了不少。

9.分析结论

  • 位图索引可以按很高密度存储数据,因此往往比B树索引小很多。前提是在基数比较小的情况下。
  • 位图索引是保存空值的,因此可以在count中利用。
  • 众所周知,位图索引不太适合OLTP类型数据库。该实例仅为了测试展示。

10.给我们的启示

优化没有止境,对数据库了解越多,你能想到的方法越多就越多。