oracle 11g在统计信息收集方面增加了扩展统计信息的特性,它可以收集一个表中相关列上的统计信息,也可以收集函数表达式上的 统计信息.使选择率,成本的估计更加准确,也更容易走正确的执行计划.在相关列上收集统计信息,好处还是很明显的.例如两列在逻辑 上有一定的关系,但如果只是对这两个列单独做统计信息的收集,根据多条件的选择率计算{(A AND B的选择率为:OPSEL[a]*OPSEL[b]); (A OR B 的选择率为:OPSEL[a]+OPSEL[b]-OPSEL[a]OPSEL[b]);(NOT A的选择率为:1-OPSEL[a])},估算出来的选择率就可能偏差很大.
以下测试: DB Version:11.2.0.4 ----产生测试数据 drop table scott.test01 purge; create table scott.test01 as select * from dba_objects;
--把object_name 更新为和object_type一样,用于测试. update scott.test01 set object_name=object_type; commit;
1.收集单列统计信息,查看执行计划 --收集单列统计信息 begin dbms_stats.gather_table_stats('scott','test01'); end; --查看表的行数 select table_name,num_rows from dba_tables where owner = 'SCOTT' and table_name = 'TEST01'; /* TABLE_NAME NUM_ROWS TEST01 87212 */ --产生语句的执行计划 explain plan for select * from scott.test01 where object_name='INDEX' and object_type='INDEX';
SELECT lpad(' ', 2 * (LEVEL - 1)) || operation operation, options, object_name, cardinality, bytes, io_cost, cpu_cost, cost, time FROM plan_table START WITH id = 0 CONNECT BY PRIOR id = parent_id; /* OPERATION OPTIONS OBJECT_NAME CARDINALITY BYTES IO_COST CPU_COST COST TIME SELECT STATEMENT 41 3362 347 35338490 348 5 TABLE ACCESS FULL TEST01 41 3362 347 35338490 348 5 */ 这里可以看到,估算的返回行数是41,显然和实际相差很远 rollback;
--行数估算 select rpad(column_name, 30, ' ') column_name, rpad(num_distinct, 8, ' ') num_distinct, rpad(utl_raw.cast_to_varchar2(low_value), 15, ' ') low_value, rpad(utl_raw.cast_to_varchar2(high_value), 10, ' ') high_value, rpad(num_nulls, 8, ' ') num_nulls, rpad(avg_col_len, 6, ' ') avg_col_len, rpad(density, 20, ' ') density, histogram from dba_tab_col_statistics where owner = 'SCOTT' and table_name = 'TEST01' and column_name in ('OBJECT_NAME','OBJECT_TYPE'); /* COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE NULLABLE NUM_NULLS AVG_COL_LEN DENSITY HISTOGRAM OBJECT_NAME 46 CLUSTER XML SCHEMA Y 0 9 .0217391304347826 NONE OBJECT_TYPE 46 CLUSTER XML SCHEMA Y 0 9 .0217391304347826 NONE */ 估算的返回行数是41,是由两个列的density相乘再乘以表的行数得到,.0217391304347826*.0217391304347826*87212=41.2155009451796=41 2.收集多列扩展统计信息,查看执行计划 --收集多列扩展统计信息 begin dbms_stats.gather_table_stats('scott','test01',method_opt =>'for columns (object_name,object_type)'); end;
--产生语句的执行计划 explain plan for select * from scott.test01 where object_name='INDEX' and object_type='INDEX';
SELECT lpad(' ', 2 * (LEVEL - 1)) || operation operation, options, object_name, cardinality, bytes, io_cost, cpu_cost, cost, time FROM plan_table START WITH id = 0 CONNECT BY PRIOR id = parent_id; /* OPERATION OPTIONS OBJECT_NAME CARDINALITY BYTES IO_COST CPU_COST COST TIME SELECT STATEMENT 5303 498482 347 36285951 348 5 TABLE ACCESS FULL TEST01 5303 498482 347 36285951 348 5 */ 这里可以看到,估算的返回行数是5303,已经基本上和实际返回行数相近.