VMCD.ORG » How to create very big index

how to create very big index

我们总会碰到各种case 需要建立索引有些索引大至1T,这些索引的创建是个非常棘手的问题,oracle 11g 在建立partition index时,给出了不错的解决方案

Let’s now re-create the index as a globally partitioned index, with one partition defined to contain all values of interest and another partition defined to contain the vast number of processed values. Initially, the index is created in an unusable state so no segments and no storage is allocated to any of the partitions

在创建partition index 时候使用unusable 可以不立刻分配storage和segment空间,从而解决了空间分配这个问题：

通过Richard Foote提供的例子,我们可以很好的看出效果：

SQL> create table bowie_stuff (id number, processed varchar2(10));

Table created.

SQL> insert into bowie_stuff select rownum, ‘YES’ from dual connect by level <= 1000000; 1000000 rows created. SQL> commit;

Commit complete.

SQL> update bowie_stuff set processed = ‘NO’ where id in (999990, 999992, 999994, 999996, 999998);

5 rows updated.

SQL> commit;

Commit complete.

SQL> create index bowie_stuff_i on bowie_stuff(processed) pctfree 0;

Index created.

SQL> select index_name, leaf_blocks from dba_indexes where index_name = ‘BOWIE_STUFF_I’;

INDEX_NAME LEAF_BLOCKS

—————————— ———–

BOWIE_STUFF_I 1877

SQL> select segment_name, blocks from dba_segments where segment_name = ‘BOWIE_STUFF_I’;

SEGMENT_NAME BLOCKS

——————– ———-

BOWIE_STUFF_I 1920

建立了一个normal index 分配了1920个块

——————————————————

SQL> drop index bowie_stuff_i;

Index dropped.

SQL> create index bowie_stuff_i on bowie_stuff(processed)

2 global partition by range (processed)

3 (partition not_processed_part values less than (‘YES’),

4 partition processed_part values less than (MAXVALUE))

5 unusable;

Index created.

SQL> alter index bowie_stuff_i rebuild partition not_processed_part;

Index altered.

SQL> select index_name, partition_name, leaf_blocks from dba_ind_partitions where index_name = ‘BOWIE_STUFF_I’;

INDEX_NAME PARTITION_NAME LEAF_BLOCKS

——————– ——————– ———–

BOWIE_STUFF_I PROCESSED_PART 0

BOWIE_STUFF_I NOT_PROCESSED_PART 1

SQL> select segment_name, partition_name, blocks from dba_segments where segment_name = ‘BOWIE_STUFF_I’;

SEGMENT_NAME PARTITION_NAME BLOCKS

——————– ——————– ———-

BOWIE_STUFF_I NOT_PROCESSED_PART 8

这里仅仅分配了 8个block

根据Richard Foote 的分析，在index create 完成之后alter index xx unusable 在10g,11g 也是截然不同的：

Starting with a simple Oracle 10g example, we create a table and associated index:

SQL> create table bowie as select rownum id, ‘BOWIE’ name from dual connect by level <= 1000000; Table created. SQL> create index bowie_id_i on bowie(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’BOWIE’, cascade=> true, estimate_percent=> null, method_opt=> ‘FOR ALL COLUMNS SIZE 1’);

PL/SQL procedure successfully completed.

If we now make the index unusable:

view sourceprint?

SQL> alter index bowie_id_i unusable;

Index altered.

SQL> select index_name, blevel, leaf_blocks, num_rows, status, dropped from dba_indexes where index_name = ‘BOWIE_ID_I’;

INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS STATUS DRO

———- ———- ———– ———- ——– —

BOWIE_ID_I 2 2226 1000000 UNUSABLE NO

SQL> select segment_name, bytes, blocks, extents from dba_segments where segment_name = ‘BOWIE_ID_I’;

SEGMENT_NAME BYTES BLOCKS EXTENTS

———— ———- ———- ———-

BOWIE_ID_I 18874368 2304 18

当index 置于unusable时候 blocks 依然存在

Using the same demo as before but running Oracle11g R2:

SQL> create table bowie as select rownum id, ‘BOWIE’ name from dual connect by level <= 1000000; Table created. SQL> create index bowie_id_i on bowie(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’BOWIE’, cascade=> true, estimate_percent=> null, method_opt=> ‘FOR ALL COLUMNS SIZE 1’);

PL/SQL procedure successfully completed.

SQL> alter index bowie_id_i unusable;

Index altered.

SQL> select index_name, blevel, leaf_blocks, num_rows, status, dropped from dba_indexes where index_name = ‘BOWIE_ID_I’;

INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS STATUS DRO

———- ———- ———– ———- ——– —

BOWIE_ID_I 2 2226 1000000 UNUSABLE NO

SQL> select segment_name, bytes, blocks, extents from dba_segments where segment_name = ‘BOWIE_ID_I’;

no rows selected

在11gR2中 alter index unusable 发现index 所占用的blocks 完全消失了

以下是分区表的测试：

SQL> CREATE TABLE big_album_sales(id number, album_id number, country_id number,

release_date date, total_sales number) PARTITION BY RANGE (release_date)

(PARTITION ALBUMS_2007 VALUES LESS THAN (TO_DATE(’01-JAN-2008′, ‘DD-MON-YYYY’)),

PARTITION ALBUMS_2008 VALUES LESS THAN (TO_DATE(’01-JAN-2009′, ‘DD-MON-YYYY’)),

PARTITION ALBUMS_2009 VALUES LESS THAN (TO_DATE(’01-JAN-2010′, ‘DD-MON-YYYY’)),

PARTITION ALBUMS_2010 VALUES LESS THAN (MAXVALUE));

Table created.

SQL> INSERT INTO big_album_sales SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2000), ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 1000000; 1000000 rows created. SQL> commit;

Commit complete.

We first create a Non-Partitioned Index:

SQL> CREATE INDEX big_album_tot_sales_i ON big_album_sales(total_sales);

Index created.

Next a Global Partitioned Index:

SQL> CREATE INDEX big_album_country_id_i ON big_album_sales(country_id)

GLOBAL PARTITION BY RANGE (country_id)

(PARTITION TS1 VALUES LESS THAN (26),

PARTITION TS2 VALUES LESS THAN (51),

PARTITION TS3 VALUES LESS THAN (76),

PARTITION TS4 VALUES LESS THAN (MAXVALUE));

Index created.

Finally, a Local Partitioned index:

SQL> CREATE INDEX big_album_album_id_i ON big_album_sales(album_id) local;

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=> ‘BOWIE’, tabname=> ‘BIG_ALBUM_SALES’, estimate_percent=> null, method_opt=> ‘FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

If we now split the last table partition, this will effectively make the:

1) Non-Partitioned Unusable
2) All partitions of the Global Partitioned index unusable
3) Just the last 2 partitions of the Local Partitioned Index unusable

SQL> ALTER TABLE big_album_sales SPLIT PARTITION ALBUMS_2010

AT (TO_DATE(’01-JAN-2011’, ‘DD-MON-YYYY’))

INTO (PARTITION ALBUMS_2010, PARTITION ALBUMS_2011);

Table altered.

SQL> select index_name, status from dba_indexes where table_name = ‘BIG_ALBUM_SALES’;

INDEX_NAME STATUS

———————— ——–

BIG_ALBUM_TOT_SALES_I UNUSABLE

BIG_ALBUM_COUNTRY_ID_I N/A

BIG_ALBUM_ALBUM_ID_I N/A

SQL> select index_name, partition_name, status, leaf_blocks from dba_ind_partitions where index_name like ‘BIG_ALBUM_%’;

INDEX_NAME PARTITION_NAME STATUS LEAF_BLOCKS

———————– ————– ——– ———–

BIG_ALBUM_ALBUM_ID_I ALBUMS_2007 USABLE 807

BIG_ALBUM_ALBUM_ID_I ALBUMS_2008 USABLE 381

BIG_ALBUM_ALBUM_ID_I ALBUMS_2009 USABLE 383

BIG_ALBUM_ALBUM_ID_I ALBUMS_2010 UNUSABLE

BIG_ALBUM_ALBUM_ID_I ALBUMS_2011 UNUSABLE

BIG_ALBUM_COUNTRY_ID_I TS1 UNUSABLE 629

BIG_ALBUM_COUNTRY_ID_I TS2 UNUSABLE 629

BIG_ALBUM_COUNTRY_ID_I TS3 UNUSABLE 629

BIG_ALBUM_COUNTRY_ID_I TS4 UNUSABLE 629

SQL> select segment_name, partition_name, bytes, blocks from dba_segments where segment_name like ‘BIG_ALBUM_%’ and segment_type like ‘INDEX%’;

SEGMENT_NAME PARTITION_NAME BYTES BLOCKS

——————— ————– ——– ——

BIG_ALBUM_ALBUM_ID_I ALBUMS_2007 7340032 896

BIG_ALBUM_ALBUM_ID_I ALBUMS_2008 3145728 384

BIG_ALBUM_ALBUM_ID_I ALBUMS_2009 4194304 512

BIG_ALBUM_TOT_SALES_I 23068672 2816

可以发现 BIG_ALBUM_ALBUM_ID_I 状态为UNUSABLE的 blocks已经消失，BIG_ALBUM_COUNTRY_ID_I 由于所有partition都变成了 UNUSABLE 所以整个segment占用的blocks都被取消
BIG_ALBUM_TOT_SALES_I由于是normal index blocks 依然存在，这跟上面的heap表测试出现了矛盾 heap table 在index unusable 时候 blocks 将被drop掉，而在分区表中blocks依然存在！