ASM Fast Mirror Resync - Example To Simulate Transient Disk Failure And Restore Disk - qjoycn

ASM Fast Mirror Resync - Example To Simulate Transient Disk Failure And Restore Disk

Oracle Server - Enterprise Edition - Version: 11.1.0.6 to 11.2.0.1.0 - Release: 11.1 to 11.2

This note discusses the new 11g ASM feature called ASM Fast Mirror Resync and walks through an example of how it works. We simulate a transient disk failure and recover the disk before the disk repair time expires.

ASM Fast Mirror Resync

ASM fast resync keeps track of pending changes to extents on an OFFLINE disk during an outage. The extents are resynced when the disk is brought back online or replaced.

By default, ASM drops a disk shortly after it is taken offline. You can set the DISK_REPAIR_TIME attribute to prevent this operation by specifying a time interval to repair the disk and bring it back online. The default DISK_REPAIR_TIME attribute value of 3.6h should be adequate for most environments.

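-- Note on this paragraph: by default, ASM drops a disk soon after it is taken offline. The DISK_REPAIR_TIME attribute sets a protection window, 3.6 hours by default; within that window the disk is not dropped even though it is offline.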

The elapsed time (since the disk was set to OFFLINE mode) is incremented only when the disk group containing the offline disks is mounted. The REPAIR_TIMER column of V$ASM_DISK shows the amount of time left (in seconds) before an offline disk is dropped. After the specified time has elapsed, ASM drops the disk.
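
A minimal sketch of how the remaining time could be checked from the ASM instance. It assumes the MODE_STATUS column of V$ASM_DISK is used to pick out the offline disks; the hours_left alias is only illustrative.

```sql
-- List offline disks and the time left before ASM drops them.
-- REPAIR_TIMER is in seconds; hours_left is derived for readability.
select group_number,
       name,
       mode_status,
       repair_timer,
       round(repair_timer / 3600, 2) as hours_left
  from v$asm_disk
 where mode_status = 'OFFLINE';
```

With the default of 3.6h, a disk that has just gone offline starts at 12960 seconds (3.6 x 3600).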

You can override this attribute for individual disks with an ALTER DISKGROUP ... OFFLINE DISK statement that includes the DROP AFTER clause.
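
As an illustration, such an override might look like the statement below. The disk group name dg, the disk name dg_0000, and the 30-minute value are placeholders, not taken from a real system.

```sql
-- Take one disk offline and give it its own repair window,
-- overriding the disk group's DISK_REPAIR_TIME for this disk only.
alter diskgroup dg offline disk dg_0000 drop after 30m;
```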

If an ALTER DISKGROUP SET ATTRIBUTE DISK_REPAIR_TIME is issued on a disk group that has disks that are currently offline, the new attribute value applies only to those disks that are not currently in OFFLINE mode.

A disk that is in OFFLINE mode cannot be dropped with an ALTER DISKGROUP DROP DISK statement; an error is returned if attempted. If for some reason the disk needs to be dropped (such as the disk cannot be repaired) before the repair time has expired, a disk can be dropped immediately by issuing a second OFFLINE statement with a DROP AFTER clause specifying 0h or 0m.
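
A sketch of the immediate-drop case described above, assuming a disk group named dg with an already offline disk named dg_0000 (both names are placeholders):

```sql
-- The disk is already offline; a second OFFLINE with a zero
-- repair window asks ASM to drop it without waiting for the timer.
alter diskgroup dg offline disk dg_0000 drop after 0m;
```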

You can use ALTER DISKGROUP to set the DISK_REPAIR_TIME attribute to a specified hour or minute value, such as 4.5 hours or 270 minutes. For example:

alter diskgroup dg set attribute 'disk_repair_time' = '4.5h';
alter diskgroup dg set attribute 'disk_repair_time' = '270m';

After you repair the disk, run the SQL statement ALTER DISKGROUP ... ONLINE DISK. This statement brings the repaired disk back online and enables writes to it so that no new writes are missed. It also starts a procedure that copies all of the extents marked as stale from their redundant copies.
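
For illustration, the online step could take either of the two forms below. The names dg and dg_0000 are placeholders.

```sql
-- Bring a single repaired disk back online (starts the resync).
alter diskgroup dg online disk dg_0000;

-- Or bring every offline disk in the disk group back online.
alter diskgroup dg online all;
```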

If a disk goes offline while the ASM instance is in rolling upgrade mode, the disk remains offline until the rolling upgrade has ended, and the timer for dropping the disk is stopped until the ASM cluster is out of rolling upgrade mode. See "ASM Rolling Upgrade".

Note: To use this feature, the disk group compatibility attributes must be set to 11.1 or higher.
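
A short sketch of how the compatibility settings might be checked and raised on an existing disk group. The disk group name dg is a placeholder.

```sql
-- Check the current compatibility settings of each disk group.
select name, compatibility, database_compatibility
  from v$asm_diskgroup;

-- Raise them on an existing disk group if needed.
-- Compatibility attributes can be advanced but not reverted.
alter diskgroup dg set attribute 'compatible.asm'   = '11.1';
alter diskgroup dg set attribute 'compatible.rdbms' = '11.1';
```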

The example below simulates a transient disk failure and recovers the disk before the disk repair time expires.

SQL> create diskgroup dgnm11gasm disk '/dev/raw/raw1','/dev/raw/raw2'
attribute 'compatible.rdbms'='11.1','compatible.asm'='11.1';
Diskgroup created.

SQL> select group_number,name from v$asm_diskgroup where group_number=1;

GROUP_NUMBER NAME
------------ --------------------
           1 DGNM11GASM

SQL> select name,value from v$asm_attribute where group_number=1;

NAME                 VALUE
-------------------- --------------------
disk_repair_time     3.6h
au_size              1048576
compatible.asm       11.1.0.0.0
compatible.rdbms     11.1.0.0.0

The default disk repair time is 3.6 hours.

Connect to DB Instance

SQL> create tablespace test datafile '+DGNM11GASM' size 20m;
Tablespace created.

Shutdown the DB Instance
Dismount the ASM Diskgroup

SQL> alter diskgroup DGNM11GASM dismount;
Diskgroup altered.

Change the permission of /dev/raw/raw1 to simulate the disk loss

[root@11g ~]# chown root.root /dev/raw/raw1
[root@11g ~]# ls -ltr /dev/raw/raw1
crw-rw---- 1 root root 162, 1 Jul 8 01:47 /dev/raw/raw1

SQL> alter diskgroup dgnm11gasm mount;
alter diskgroup dgnm11gasm mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "0" is missing

With Oracle Database 11g, ASM fails to mount a disk group if any disks or failure groups are missing at mount time. You need to mount the disk group with the FORCE option.

Disk groups mounted with the FORCE option will have one or more disks offline if they were not available at the time of the mount.

SQL> alter diskgroup dgnm11gasm mount force;
Diskgroup altered.

SQL> select path,name,repair_timer from v$asm_disk where group_number=1;

PATH            NAME                 REPAIR_TIMER
--------------- -------------------- ------------
                DGNM11GASM_0000             12960
/dev/raw/raw2   DGNM11GASM_0001                 0

Because /dev/raw/raw1 was unavailable at mount time, disk DGNM11GASM_0000 is offline with an empty PATH, and its REPAIR_TIMER shows 12960 seconds (3.6 hours) remaining. You must take corrective action to restore the device before DISK_REPAIR_TIME expires.

Connect to DB Instance and add new datafile to the tablespace.

SQL> alter tablespace test add datafile '+DGNM11GASM' size 20m;
Tablespace altered.

Because only one disk is available in the normal-redundancy disk group, the new extents have no mirror copy until the lost disk is made accessible to the oracle user again and brought online with ALTER DISKGROUP ... ONLINE DISK, or until a new disk is added to the disk group.

Restore the ownership of /dev/raw/raw1 and bring the disk back online

[root@11g ~]# chown oracle.dba /dev/raw/raw1

SQL> alter diskgroup dgnm11gasm online disk DGNM11GASM_0000;
Diskgroup altered.

SQL> select group_number,operation,state,power from v$asm_operation;

GROUP_NUMBER OPERA STAT      POWER
------------ ----- ---- ----------
           1 ONLIN RUN           1

The ONLIN row in V$ASM_OPERATION is the fast resync in progress: ASM copies only the extents that changed while the disk was offline. Afterwards both disks are members of the disk group again:

SQL> select path,header_status,mount_status from v$asm_disk where group_number=1;

PATH            HEADER_STATU MOUNT_S
--------------- ------------ -------
/dev/raw/raw2   MEMBER       CACHED
/dev/raw/raw1   MEMBER       CACHED
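
As a follow-up check, the queries below sketch one way to confirm that the resync has finished. They assume the disk group is still group number 1, as in this example, and use the MODE_STATUS column of V$ASM_DISK, which the original walkthrough does not query.

```sql
-- No rows here means no resync or rebalance is still running.
select group_number, operation, state from v$asm_operation;

-- Both disks should report ONLINE, and REPAIR_TIMER should be back to 0.
select name, mode_status, state, repair_timer
  from v$asm_disk
 where group_number = 1;
```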

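In short: in a disk group with redundancy, when one disk fails the two mirrored disks become inconsistent. Once the failed disk is repaired, Oracle ASM Fast Mirror Resync automatically resynchronizes the data between the two disks.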

From Oracle

-------------------------------------------------------------------------------------------------------

Blog： http://blog.csdn.net/tianlesoftware

Email: dvd.dba@gmail.com

2011-04-21 21:52