VMCD.ORG » 11g nested loop faster than 10g

在pythian看到了一个测试，是否证明了oracle 11g is x% faster than 10g?(oracle官方的说法）

First, you have to create and fill a table to run your query :

create table gark
(id1 number not null,
id2 number not null,
id3 number not null);

begin
for i in 1..100000 loop
insert into gark(id1, id2, id3)
values (i, i, i);
end loop;
commit;
end;
/

create unique index gark_idx on gark(id1, id3);

begin
dbms_stats.gather_table_stats(
user,
‘GARK’,
cascade=>true,
estimate_percent=>100,
method_opt=>’FOR ALL COLUMNS SIZE 254′,
no_invalidate=> false);
end;
/
Then, run the query below on a 10g database :

set autotrace traceonly

select /*+ use_nl(A B) */count(a.id3)
from gark a, gark b
where a.id1=b.id2;

Execution Plan
—————————-
Plan hash value: 3137705415

——————————————————–
| Id | Operation | Name | Rows | Cost |
——————————————————–
| 0 | SELECT STATEMENT | | 1 | 100K |
| 1 | SORT AGGREGATE | | 1 | |
| 2 | NESTED LOOPS | | 100K| 100K |
| 3 | TABLE ACCESS FULL| GARK | 100K| 65 |
|* 4 | INDEX RANGE SCAN | GARK_IDX | 1 | 1 |
——————————————————–

Predicate Information (identified by operation id):
—————————————————

4 – access(“A”.”ID1″=”B”.”ID2″)

Statistics
———————————————————-
0 recursive calls
0 db block gets
100556 consistent gets
0 physical reads
0 redo size
415 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Run the same query on 11g (There is no trick here, the plan is the right plan, I’ve check it with dbms_xplan.display_cursor) :

set autotrace traceonly

select /*+ use_nl(A B) */count(a.id3)
from gark a, gark b
where a.id1=b.id2;

Execution Plan
———————————————————-
Plan hash value: 3137705415

——————————————————-
| Id | Operation | Name | Rows | Cost |
——————————————————-
| 0 | SELECT STATEMENT | | 1 | 100K |
| 1 | SORT AGGREGATE | | 1 | |
| 2 | NESTED LOOPS | | 100K| 100K |
| 3 | TABLE ACCESS FULL| GARK | 100K| 105 |
|* 4 | INDEX RANGE SCAN | GARK_IDX | 1 | 11 |
——————————————————-

Predicate Information (identified by operation id):
—————————————————

4 – access(“A”.”ID1″=”B”.”ID2″)

Statistics
———————————————————-
0 recursive calls
0 db block gets
3373 consistent gets
0 physical reads
0 redo size
422 bytes sent via SQL*Net to client
415 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
If you know how the NL algorithm works in 10g, it’s quite easy to understand how it has been modified. The good news with that is that it won’t only speed up queries with optimal plans but also speed up queries with sub optimal plans that are using Nested Loops.

这里所谓的算法是什么算法，我们通过查找oracle文档，发现11g nested loop 与 10g 的区别：

Original and New Implementation for Nested Loop Joins

Oracle Database 11g Release 1 (11.1) introduces a new implementation for nested loop joins. As a result, execution plans that include nested loops might appear different than they did in previous releases of Oracle Database. Both the new implementation and the original implementation for nested loop joins are possible in Oracle Database 11g Release 1 (11.1). So, when analyzing execution plans, it is important to understand that the number of NESTED LOOPS join row sources might be different.

Original Implementation for Nested Loop Joins

Consider the following query:

SELECT e.first_name, e.last_name, e.salary, d.department_name
FROM hr.employees e, hr.departments d
WHERE d.department_name IN (‘Marketing’, ‘Sales’)
AND e.department_id = d.department_id;
before Oracle Database 11g Release 1 (11.1), the execution plan for this query might appear similar to the following execution plan:

————————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————————————-
| 0 | SELECT STATEMENT | | 19 | 722 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 10 | 220 | 1 (0)| 00:00:01 |
| 2 | NESTED LOOPS | | 19 | 722 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | DEPARTMENTS | 2 | 32 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 10 | | 0 (0)| 00:00:01 |
————————————————————————————————-

Predicate Information (identified by operation id):
—————————————————
3 – filter(“D”.”DEPARTMENT_NAME”=’Marketing’ OR “D”.”DEPARTMENT_NAME”=’Sales’)
4 – access(“E”.”DEPARTMENT_ID”=”D”.”DEPARTMENT_ID”)
In this example, the outer side of the join consists of a scan of the hr.departments table that returns the rows that match the condition department_name IN (‘Marketing’, ‘Sales’). The inner loop retrieves the employees in the hr.employees table that are associated with those departments.

New Implementation for Nested Loop Joins
Oracle Database 11g Release 1 (11.1) introduces a new implementation for nested loop joins to reduce overall latency for physical I/O. When an index or a table block is not in the buffer cache and is needed to process the join, a physical I/O is required. In Oracle Database 11g Release 1 (11.1), Oracle Database can batch multiple physical I/O requests and process them using a vector I/O instead of processing them one at a time. As part of the new implementation for nested loop joins, two NESTED LOOPS join row sources might appear in the execution plan where only one would have appeared in prior releases. In such cases, Oracle Database allocates one NESTED LOOPS join row source to join the values from the table on the outer side of the join with the index on the inner side. A second row source is allocated to join the result of the first join, which includes the rowids stored in the index, with the table on the inner side of the join.

Consider the query in “Original Implementation for Nested Loop Joins”. In Oracle Database 11g Release 1 (11.1), with the new implementation for nested loop joins, the execution plan for this query might appear similar to the following execution plan:

————————————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost(%CPU)| Time |
————————————————————————————————-
| 0 | SELECT STATEMENT | | 19 | 722 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 19 | 722 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | DEPARTMENTS | 2 | 32 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 10 | | 0 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 10 | 220 | 1 (0)| 00:00:01 |
————————————————————————————————-

Predicate Information (identified by operation id):
—————————————————
3 – filter(“D”.”DEPARTMENT_NAME”=’Marketing’ OR “D”.”DEPARTMENT_NAME”=’Sales’)
4 – access(“E”.”DEPARTMENT_ID”=”D”.”DEPARTMENT_ID”)
In this case, the rows from the hr.departments table constitute the outer side of the first join. The inner side of the first join is the index emp_department_ix. The results of the first join constitute the outer side of the second join, which has the hr.employees table as its inner side.

There are cases where a second join row source is not allocated, and the execution plan looks the same as it did in prior releases. The following list describes such cases:

All of the columns needed from the inner side of the join are present in the index, and there is no table access required. In this case, Oracle Database allocates only one join row source.

The order of the rows returned might be different than it was in previous releases. Hence, when Oracle Database tries to preserve a specific ordering of the rows, for example to eliminate the need for an ORDER BY sort, Oracle Database might use the original implementation for nested loop joins.

The OPTIMIZER_FEATURES_ENABLE initialization parameter is set to a release before Oracle Database 11g Release 1 (11.1). In this case, Oracle Database uses the original implementation for nested loop joins.

在11g中，oracle通过得到的rowid 批量的从磁盘上得到数据，速度>>10g 时候nested loop 的行为。

再回到上面的测试：在11g中 oracle在取得数据时候采用了不同的方法（Oracle Database can batch multiple physical I/O requests and process them using a vector I/O instead of processing them one at a time）使用multiple physical I/O取代了一次返回一行的行为。