ORA600[13011]表與索引數據邏輯錯誤分析及解決

1、問題概述sql

一、數據庫環境:數據庫

Oracle Database 11.2.0.3.0 for Oracle Linux Server release 6.4RAC,虛擬機bash

 

2、巡檢時發現某數據庫alert.log日誌報ORA-00600[13011]錯誤,報錯頻繁,雖然未致使數據庫宕機,但已影響業務,報錯以下:session

[oracle@NODE1 trace]$ grep -i ora-00600 alert*.log|grep 13011 | sort -u
ORA-00600: : [13011], [321401], [33682485], [24], [33682485], [3], [], [], [], [], [], [] -- Tue Feb 06 00:07:53 開始報錯
ORA-00600: : [13011], [321401], [33682485], [27], [33682485], [3], [], [], [], [], [], []

trace文件中信息:oracle

從這裏能夠看出是XXXXXMIN.XXX_XX_XX_XXX_OLD表進行DELEDE操做致使該錯誤發生app

Dump continued from file: /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_ora_19795.trc
ORA-00600: [13011], [321401], [33682485], [27], [33682485], [3], [], [], [], [], [], []
========= Dump for incident 49853 (ORA 600 [13011]) ========
*** 2018-02-06 09:37:44.987
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=b6nmg0fpy3smf) -----
 delete from "XXXXXMIN"."XXX_XX_XX_XXX_OLD" where "AX_ID" = :1


2、問題分析ide

一、MOS關於ORA 600 [13011]的描述:this

Format: ORA-600 [13013] [a] [b] {c} [d] [e] [f] Arg 
 [a] Passcount Arg 
 [b] Data Object number Arg
 {c} Tablespace Decimal Relative DBA (RDBA) of block containing the row to be updated Arg 
 [d] Row Slot number Arg 
 [e] Decimal RDBA of block being updated (Typically same as {c}) Arg 
 [f] Code

參考《 New and Improved: ORA-600 [13013] "Unable to get a Stable set of Records" (文檔 ID 1438920.1)》和《ORA-600 [13013] "Unable to get a Stable set of Records" (文檔 ID 28185.1)》文章。該報錯是因爲對某個表執行DML操做,該表對應的某個索引損壞致使的,解決的辦法是找出操做的表和受損的索引,重建索引便可。spa


二、查找報錯對象日誌

根據ORA-00600 [13011], [321401], [33682485], [27], [33682485], [3]報錯代碼,查找報錯對象:

select dbms_utility.data_block_address_file(33682485) rfile,dbms_utility.data_block_address_block(33682485) blocks from dual;

     RFILE     BLOCKS
---------- ----------
         8     128053         
select owner, segment_name, segment_type, tablespace_name, a.partition_name from dba_extents a where file_id = 8 and 128053 between block_id and block_id + blocks - 1;

OWNER     SEGMENT_NAME   SEGMENT_TYPE       TABLESPACE_NAME  PARTITION_NAME
------------ ------------------------ ------------------ ---------------- --------------
XXXXXMIN     XXX_XX_XX_XXX_OLD      TABLE              XXX

trace文件中信息:

BH (0xf60ee308) file#: 8 rdba: 0x0201f435 (8/128053) class: 1 ba: 0xf6c96000 --其對象XXXXXMIN.XXX_XX_XX_XXXXX_OLD與查詢一致
  set: 12 pool: 3 bsz: 8192 bsi: 0 sflg: 1 pwc: 0,25
  dbwrid: 0 obj: 321401 objn: 321401 tsn: 8 afn: 8 hint: f
  hash: [0x13ef9fd78,0x13ef9fd78] lru: [0xc900efb0,0xaf13f128]
  ckptq: [NULL] fileq: [NULL] objq: [0x132d5a950,0x132d5a950] objaq: [0x132d5a940,0x132d5a940]
  st: XCURRENT md: NULL fpin: 'kddwh01: kdddel' tch: 1 le: 0xcb0e3ee8
  flags: remote_transfered
  LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
  buffer tsn: 8 rdba: 0x0201f435 (8/128053)  --與查詢一致,其對象爲XXXXXMIN.XXX_XX_XX_XXXXX_OLD
  scn: 0x0001.084d4f80 seq: 0x01 flg: 0x06 tail: 0x4f800601
  frmt: 0x02 chkval: 0x538d type: 0x06=trans data
Hex dump of block: st=0, typ_found=1


三、分析異常表

analyze table xxxxxmin.xxx_xx_xx_xxxxx_old validate structure cascade;
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file –-根據文檔 1499.1查找trace文件

OERR: ORA-1499 table/Index Cross Reference Failure - see trace file [ID 1499.1]

Error ORA-1499 is produced by statement "ANALIZE TABLE|CLUSTER VALIDATE STRUCTURE CASCADE" to report an inconsistency between a table or a cluster and its index where an index key value is not found in the index or vice versa.
The content of the trace file has:
: tsn: rdba: 
description: 
"row not found in index" 
"Table/Index row count mismatch"
"row mismatch in index dba"
"Table row count/Bitmap index bit count mismatch"
"kdavls: kdcchk returns %d when checking cluster dba 0x%08lx objn %d\n" 
tsn: Tablespace Number where the INDEX is stored.
rdba: Relative data block address of the INDEX segment header.

根據文檔 1499.1查找trace文件未找到相應的報錯。看來與文檔描述的狀況不一樣,需進一步分析。


四、根據ROWID分析

經過前面的分析知道ORA-600 [13013]該報錯是因爲表與索引之間的邏輯數據不一致致使。查詢明確關聯的索引:

select owner,index_name,index_type from dba_indexes where table_name='XXX_XX_XX_XXXXX_OLD' and owner='XXXXXMIN';

OWNER              INDEX_NAME           INDEX_TYPE
------------------------------------------- ---------------------------
XXXXXMIN           PK_XXX_XX_XX_XXXXX       NORMAL

--索引建立語句以下:

DBMS_METADATA.GET_DDL(UPPER('INDEX'),UPPER('PK_XXX_XX_XX_XXXXX'),UPPER('XXXXXMIN'))
-----------------------------------------------------------------------------------------------------------
CREATE UNIQUE INDEX "XXXXXMIN"."PK_XXX_XX_XX_XXXXX" ON "XXXXXMIN"."XXX_XX_XX_XXXXX_OLD" ("AX_ID", "BX_ID")

根據"XXXXXMIN"."PK_XXX_XX_XX_XXXXX"索引的建立語句,該索引爲B樹索引,它是基於二叉樹的,由分支塊和葉子塊組成,包括每一個索引列的值和行所對應的ROWID


經過下面的語句查詢出全表掃描時和索引掃描時存在差別的行:

select /*+ INDEX_FFS(t pk_xxx_xx_xx_xxx) */ rowid, 
  2  dbms_rowid.ROWID_RELATIVE_FNO(rowid) relative_fno, 
  3  dbms_rowid.ROWID_BLOCK_NUMBER(rowid) block   
  4  from XXXXXMIN.XXX_XX_XX_XXXXX_OLD t  where (t.AX_ID is not null or BX_ID is not null)   
  5  minus   
  6  select /*+ FULL(t1)*/ rowid, 
  7  dbms_rowid.ROWID_RELATIVE_FNO(rowid) relative_fno, 
  8  dbms_rowid.ROWID_BLOCK_NUMBER(rowid) block from XXXXXMIN.XXX_XX_XX_XXXXX_OLD t1;

查詢結果以下:

ROWID              RELATIVE_FNO      BLOCK
------------------ ------------ ----------
AABOd5AAIAAAfQ1AAP            8     128053
AABOd5AAIAAAfQ1AAQ            8     128053
AABOd5AAIAAAfQ1AAR            8     128053
AABOd5AAIAAAfQ1AAY            8     128053
AABOd5AAIAAAfQ1AAZ            8     128053
AABOd5AAIAAAfQ1AAa            8     128053
AABOd5AAIAAAfQ1AAb            8     128053
AABOd5AAIAAAfQ1AAc            8     128053
AABOd5AAIAAAfQ1AAd            8     128053
AABOd5AAIAAAfQ1AAe            8     128053
AABOd5AAIAAAfQ1AAf            8     128053
AABOd5AAIAAAfQ1AAg            8     128053
AABOd5AAIAAAfQ1AAq            8     128053
AABOd5AAIAAAfQ1AAr            8     128053
AABOd5AAIAAAfQ1AAs            8     128053
15 rows selected.


五、驗證該表全表掃描與索引掃描時存在差別行

根據下面語句找出差別的數據:

select e.*,rowid from XXXXXMIN.XXX_XX_XX_XXX_OLD e where e.rowid > (select min(x.rowid) from XXXXXMIN.XXX_XX_XX_XXX_OLD x where x.AX_ID=e.AX_ID and x.BX_ID=e.BX_ID);

     AX_ID      BX_ID COMMITED_XXXXX UNCOMMITED_XXXXX UNSHARED_XXXXX UPDATED_T ROWID
---------- ---------- -------------- ---------------- -------------- --------- ------------------
      ****         **     **********  ***************     ********** ********* AABOd5AAIAAAzAPAAM
      ****         **     **********  ***************     ********** ********* AABOd5AAIAAAzAPAAN
      ****         **     **********  ***************     ********** ********* AABOd5AAIAAAzAPAAP
      ****         **     **********  ***************     ********** ********* AABOd5AAIAAAzAPAAL
      ****         **     **********  ***************     ********** ********* AABOd5AAIAAAzAPAAQ
      ****         **     **********  ***************     ********** ********* AABOd5AAIAABFRCACA
      ****         **     **********  ***************     ********** ********* AABOd5AAIAABFRCACl
      ****         **     **********  ***************     ********** ********* AABOd5AAIAABFRCACk
      ****         **     **********  ***************     ********** ********* AABOd5AAIAAAzAPAAB
      ****         **     **********  ***************     ********** ********* AABOd5AAIAAAzAPAAE
      ****         **     **********  ***************     ********** ********* AABOd5AAIAABFRCACC
      ****         **     **********  ***************     ********** ********* AABOd5AAIAABFRCACm
      ****         **     **********  ***************     ********** ********* AABOd5AAIAAAzAPAAD
      ****         **     **********  ***************     ********** ********* AABOd5AAIAABFRCACB
      ****         **     **********  ***************     ********** ********* AABOd5AAIAAAzAPAAO
15 rows selected.

取其中一條數據來驗證走全表掃描和索引掃描時的差別

--SQL執行計劃經過索引掃描查詢的數據

SQL> alter session set statistics_level=all;  
Session altered.
SQL> select e.*,rownum,rowid from XXXXXMIN.XXX_XX_XX_XXX_OLD e where e.AX_ID=**** and e.BX_ID=**;
     AX_ID      BX_ID COMMITED_XXXXX UNCOMMITED_XXXXX UNSHARED_XXXXX UPDATED_T     ROWNUM ROWID
---------- ---------- -------------- ---------------- -------------- --------- ---------- ------------------
     ****      **     **********               *             * *********          * AABOd5AAIAAAGcaABR
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  cy48jvzrnuv22, child number 1
-------------------------------------
select e.*,rownum,rowid from XXXXXMIN.XXX_XX_XX_XXX_OLD e where e.AX_ID=**** and e.BX_ID=**
Plan hash value: 1022151449  
--------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  COUNT                       |                     |      1 |        |      1 |00:00:00.01 |       3 |
|   2 |   TABLE ACCESS BY INDEX ROWID| XXX_XX_XX_XXX_OLD |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  3 |    INDEX UNIQUE SCAN         | PK_XXX_XX_XX_XXX  |      1 |      1 |      1 |00:00:00.01 |       2 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("E"."AX_ID"=**** AND "E"."BX_ID"=**)
21 rows selected.

--SQL執行計劃經過全表查詢的數據

SQL> select /*+ full(e) */ e.*,rownum,rowid from XXXXXMIN.XXX_XX_XX_XXX_OLD e where e.AX_ID=**** and e.BX_ID=**;
     AX_ID      BX_ID COMMITED_XXXXX UNCOMMITED_XXXXX UNSHARED_XXXXX UPDATED_T     ROWNUM ROWID
---------- ---------- -------------- ---------------- -------------- --------- ---------- ------------------
     ****      **     **********               *             * *********          *  AABOd5AAIAABFRCACk
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  14vbv6bu472ty, child number 1
-------------------------------------
select /*+ full(e) */ e.*,rownum,rowid from XXXXXMIN.XXX_XX_XX_XXX_OLD e where e.AX_ID=**** and e.BX_ID=**
 
Plan hash value: 3364144674
----------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                     |      1 |        |      1 |00:00:00.01 |      68 |
|   1 |  COUNT             |                     |      1 |        |      1 |00:00:00.01 |      68 |
|*  2 |   TABLE ACCESS FULL| XXX_XX_XX_XXX_OLD |      1 |      1 |      1 |00:00:00.01 |      68 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("E"."AX_ID"=**** AND "E"."BX_ID"=**))
20 rows selected.

經過對比走全表掃描和索引掃描時存在差別

SQL> select e.*,rownum,rowid from XXXXXMIN.XXX_XX_XX_XXX_OLD e where e.AX_ID=**** and e.BX_ID=**;

     AX_ID      BX_ID COMMITED_XXXXX UNCOMMITED_XXXXX UNSHARED_XXXXX UPDATED_T     ROWNUM ROWID
---------- ---------- -------------- ---------------- -------------- --------- ---------- ------------------
      ****      **     **********               *             * *********          *  AABOd5AAIAAAGcaABR
      
SQL> select /*+ full(e) */ e.*,rownum,rowid from XXXXXMIN.XXX_XX_XX_XXX_OLD e where e.AX_ID=**** and e.BX_ID=**;

     AX_ID      BX_ID COMMITED_XXXXX UNCOMMITED_XXXXX UNSHARED_XXXXX UPDATED_T     ROWNUM ROWID
---------- ---------- -------------- ---------------- -------------- --------- ---------- ------------------
        ****      **   **********               *           * *********          *    AABOd5AAIAABFRCACk

小結:在執行delete from "XXXXXMIN"."XXX_XX_XX_XXXXX_OLD" where "AX_ID" = :1時,該SQL的執行計劃是走索引掃描,由於表與索引之間的邏輯數據不一致(索引列的值和行所對應的ROWID組成的索引數據與表數據不一致),在變量「:1」的值恰好是異常的值時,致使了ORA 600 [13011]的報錯。


3、解決方案

1、重建XXXXXMIN.PK_XXX_XX_XX_XXXXX索引

XXXXXMIN.PK_XXX_XX_XX_XXXXX"AX_ID", "BX_ID"列的聯合主鍵索引而且AX_ID列做爲"XXXXXMIN"."XXX_VX" ("ID")的關聯外鍵,BX_ID列做爲 "XXXXXMIN"."XXX_DATAXXXXX" ("ID")的關聯外鍵。因此爲避免對業務形成影響,使用ONLINE在線重建的方式重建XXXXXMIN.PK_XXX_XX_XX_XXXXX索引。


ALTER INDEX XXXXXMIN.PK_XXX_XX_XX_XXX REBUILD ONLINE;


二、驗證重建索引後的數據

select /*+ INDEX_FFS(t pk_xxx_xx_xx_xxx) */ rowid, 
  2  dbms_rowid.ROWID_RELATIVE_FNO(rowid) relative_fno, 
  3  dbms_rowid.ROWID_BLOCK_NUMBER(rowid) block   
  4  from XXXXXMIN.XXX_XX_XX_XXXXX_OLD t  where (t.AX_ID is not null or BX_ID is not null)   
  5  minus   
  6  select /*+ FULL(t1)*/ rowid, 
  7  dbms_rowid.ROWID_RELATIVE_FNO(rowid) relative_fno, 
  8  dbms_rowid.ROWID_BLOCK_NUMBER(rowid) block from XXXXXMIN.XXX_XX_XX_XXXXX_OLD t1;
  
  no rows selected