Maintaining the Physical Disks of Storage Servers


See Also:

Oracle Maximum Availability Architecture (MAA) website at http://www.oracle.com/goto/maa for additional information about maintenance best practices

About System Disks and Data Disks

The first two disks of storage servers are system disks. Storage server system software resides on a portion of each of the system disks. These portions on both system disks are referred to as the system area. The non-system area of the system disks, referred to as data partitions, is used for normal data storage. All other disks in a storage server are called data disks.

Monitoring the Status of Physical Disks

You can monitor a physical disk by checking its attributes with the CellCLI LIST PHYSICALDISK command. For example, a physical disk with a status of failed or warning - predictive failure has problems and probably must be replaced. The disk firmware maintains the error counters and marks a drive with predictive failure status when internal thresholds are exceeded. The drive, not the server software, determines whether it needs replacement.
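For example, to review the current status of every physical disk, you can run a command like the following. The attribute list is illustrative; adjust it to the attributes you want to monitor.

CellCLI> LIST PHYSICALDISK ATTRIBUTES name, status, slotNumber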

The following list identifies the storage server physical disk statuses.

Physical Disk Status for Storage Servers

  • normal
  • normal - dropped for replacement
  • normal - confinedOnline
  • normal - confinedOnline - dropped for replacement
  • not present
  • failed
  • failed - dropped for replacement
  • failed - rejected due to incorrect disk model
  • failed - rejected due to incorrect disk model - dropped for replacement
  • failed - rejected due to wrong slot
  • failed - rejected due to wrong slot - dropped for replacement
  • warning - confinedOnline
  • warning - confinedOnline - dropped for replacement
  • warning - peer failure
  • warning - poor performance
  • warning - poor performance - dropped for replacement
  • warning - poor performance, write-through caching
  • warning - predictive failure, poor performance
  • warning - predictive failure, poor performance - dropped for replacement
  • warning - predictive failure, write-through caching
  • warning - predictive failure
  • warning - predictive failure - dropped for replacement
  • warning - predictive failure, poor performance, write-through caching
  • warning - write-through caching

What Happens When Disk Errors Occur?

Oracle ASM performs bad extent repair for read errors caused by hardware errors. The disks stay online, and no alerts are sent.

When a disk fails:

  • The Oracle ASM disks associated with the grid disks on the failed drive are dropped automatically with the FORCE option, and then an Oracle ASM rebalance restores data redundancy.

  • The blue LED and the amber LED are turned on for the drive, indicating that disk replacement can proceed. The drive LED stays on solid. See "LED Status Descriptions" for information about LED status lights during predictive failure and poor performance.

  • The server generates an alert, which includes specific instructions for replacing the disk. If you configured the system for alert notifications, then the alert is sent by email to the designated address.

When a disk has a faulty status, such as warning - predictive failure:

  • The Oracle ASM disks associated with the grid disks on the physical drive are dropped automatically.

  • An Oracle ASM rebalance relocates the data from the predictively failed disk to other disks.

  • The blue LED is turned on for the drive, indicating that disk replacement can proceed.

When Oracle ASM gets a read error on a physically addressed metadata block, it does not have mirroring for the block, so it takes the following actions:

  • Oracle ASM takes the disk offline.

  • Oracle ASM drops the disk with the FORCE option.

  • The storage server software sends an alert stating that the disk can be replaced.

About Detecting Underperforming Disks

ASR automatically identifies and removes a poorly performing disk from the active configuration. Recovery Appliance then runs a set of performance tests. When CELLSRV detects poor disk performance, the cell disk status changes to normal - confinedOnline, and the physical disk status changes to warning - confinedOnline. Table 13-2 describes the conditions that trigger disk confinement:

Table 13-2 Alerts Indicating Poor Disk Performance

Alert Code              Cause
----------------------  -------------------------------------------------------
CD_PERF_HANG            Disk stopped responding
CD_PERF_SLOW_ABS        High service time threshold (slow disk)
CD_PERF_SLOW_RLTV       High relative service time threshold (slow disk)
CD_PERF_SLOW_LAT_WT     High latency on writes
CD_PERF_SLOW_LAT_RD     High latency on reads
CD_PERF_SLOW_LAT_RW     High latency on reads and writes
CD_PERF_SLOW_LAT_ERR    Frequent very high absolute latency on individual I/Os
CD_PERF_IOERR           I/O errors

If the problem is temporary and the disk passes the tests, then it is brought back into the configuration. If the disk does not pass the tests, then it is marked as poor performance, and ASR submits a service request to replace the disk. If possible, Oracle ASM takes the grid disks offline for testing. Otherwise, the cell disk status stays at normal - confinedOnline until the disks can be taken offline safely. See "Removing an Underperforming Physical Disk".
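For example, to check whether any hard disks are currently confined, you can list their statuses with a command like the following and look for the confinedOnline values. This is a sketch; the attribute list is illustrative.

CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk ATTRIBUTES name, status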

The disk status change is recorded in the server alert history:

MESSAGE ID date_time info "Hard disk entered confinement status. The LUN
 n_m changed status to warning - confinedOnline. CellDisk changed status to normal
 - confinedOnline. Status: WARNING - CONFINEDONLINE  Manufacturer: name  Model
 Number: model  Size: size  Serial Number: serial_number  Firmware: fw_release 
 Slot Number: m  Cell Disk: cell_disk_name  Grid Disk: grid disk 1, grid disk 2
     .
     .
     .
Reason for confinement: threshold for service time exceeded"

These messages are entered in the storage cell alert log:

CDHS: Mark cd health state change cell_disk_name  with newState HEALTH_BAD_
ONLINE pending HEALTH_BAD_ONLINE ongoing INVALID cur HEALTH_GOOD
Celldisk entering CONFINE ACTIVE state with cause CD_PERF_SLOW_ABS activeForced: 0
inactiveForced: 0 trigger HistoryFail: 0, forceTestOutcome: 0 testFail: 0
global conf related state: numHDsConf: 1 numFDsConf: 0 numHDsHung: 0 numFDsHung: 0
     .
     .
     .

About Rebalancing the Data

After you replace the physical disk, you must re-create the grid disks and cell disks that existed on the previous disk in that slot. If those grid disks were part of an Oracle ASM disk group, then add them back to the disk group and rebalance the data, based on the disk group redundancy and the ASM_POWER_LIMIT parameter.

Oracle ASM rebalance occurs when dropping or adding a disk. To check the status of the rebalance:

  • Did the rebalance operation run successfully?

    Check the Oracle ASM alert logs.

  • Is the rebalance operation currently running?

    Check the GV$ASM_OPERATION view.

  • Did the rebalance operation fail?

    Check the ERROR column of the V$ASM_OPERATION view.

If the failed physical disk contained Oracle ASM disks from multiple disk groups, then you can run the rebalance operations for those disk groups on different Oracle ASM instances in the same cluster. Each Oracle ASM instance can run only one rebalance operation at a time. If all Oracle ASM instances are busy, then the remaining rebalance operations are queued.
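For example, the following query, run from SQL*Plus on an Oracle ASM instance, shows any rebalance operations that are currently running, their estimated time to completion, and any error code. The column selection is illustrative.

SQL> SELECT inst_id, operation, state, power, sofar, est_work, est_minutes, error_code
     FROM gv$asm_operation;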

Monitoring Hard Disk Controller Write-Through Cache Mode

The hard disk controller on each storage server periodically performs a discharge and charge of the controller battery. During the operation, the write cache policy changes from write-back caching to write-through caching. Write-through cache mode is slower than write-back cache mode. However, write-back cache mode risks data loss if the storage server loses power or fails. The operation occurs every three months, for example, at 01:00 on the 17th day of January, April, July and October.

This example shows an informational alert that a storage server generates about the status of the caching mode for its logical drives:

HDD disk controller battery on disk controller at adapter 0 is going into a learn
cycle. This is a normal maintenance activity that occurs quarterly and runs for
approximately 1 to 12 hours. The disk controller cache might go into WriteThrough
caching mode during the learn cycle. Disk write throughput might be temporarily
lower during this time. The message is informational only, no action is required.

Use the following commands to manage changes to the periodic write cache policy:

  • To change the start time for the learn cycle, use a command like the following example:

    CellCLI> ALTER CELL bbuLearnCycleTime="2013-01-22T02:00:00-08:00"
    

    The time reverts to the default learn cycle time after the cycle completes.

  • To see the time for the next learn cycle:

    CellCLI> LIST CELL ATTRIBUTES bbuLearnCycleTime
    
  • To view the status of the battery:

    # /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0
    
    BBU status for Adapter: 0
     
    BatteryType: iBBU08
    Voltage: 3721 mV
    Current: 541 mA
    Temperature: 43 C
     
    BBU Firmware Status:
    Charging Status : Charging
    Voltage : OK
    Temperature : OK
    Learn Cycle Requested : No
    Learn Cycle Active : No
    Learn Cycle Status : OK
    Learn Cycle Timeout : No
    I2c Errors Detected : No
    Battery Pack Missing : No
    Battery Replacement required : No
    Remaining Capacity Low : Yes
    Periodic Learn Required : No
    Transparent Learn : No
     
    Battery state:
     
    GasGuageStatus:
    Fully Discharged : No
    Fully Charged : No
    Discharging : No
    Initialized : No
    Remaining Time Alarm : Yes
    Remaining Capacity Alarm: No
    Discharge Terminated : No
    Over Temperature : No
    Charging Terminated : No
    Over Charged : No
     
    Relative State of Charge: 7 %
    Charger System State: 1
    Charger System Ctrl: 0
    Charging current: 541 mA
    Absolute state of charge: 0 %
    Max Error: 0 %
     
    Exit Code: 0x00
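  • To check the current cache policy of the controller logical drives (for example, to confirm whether they are in WriteThrough mode during a learn cycle), query the controller. This is a sketch that assumes the standard MegaCli -LDGetProp option is available in your MegaCli release:

    # /opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -LAll -a0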

Replacing a Failed Physical Disk

A physical disk outage can reduce performance and data redundancy. Therefore, you should replace a failed disk with a new disk as soon as possible.

To replace a disk when it fails:

  1. Determine which disk failed.
    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status=failed DETAIL
    
             name:                   28:5
             deviceId:               21
             diskType:               HardDisk
             enclosureDeviceId:      28
             errMediaCount:          0
             errOtherCount:          0
             foreignState:           false
             luns:                   0_5
             makeModel:              "SEAGATE ST360057SSUN600G"
             physicalFirmware:       0705
             physicalInterface:      sas
             physicalSerial:         A01BC2
             physicalSize:           558.9109999993816G
             slotNumber:             5
             status:                 failed
    

    The slot number shows the location of the disk, and the status shows that the disk failed.

  2. Ensure that the blue "OK to Remove" LED on the disk is lit, before you remove the disk.
  3. Replace the physical disk on the storage server and wait three minutes. The physical disk is hot pluggable, and you can replace it with the power on.
  4. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=28:5 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.

  5. Verify that the firmware is correct:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    

    You can also check the ms-odl.trc file to confirm that the firmware was updated and the logical unit number (LUN) was rebuilt.

  6. Re-create the grid disks and cell disks that existed on the previous disk in that slot. See "About Rebalancing the Data".
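    The exact commands depend on your original disk layout. As an illustrative sketch only, re-creating the cell disk and grid disks on the replacement drive and adding a grid disk back to an Oracle ASM disk group might look like the following. The DATA prefix, the DATA disk group name, the grid disk name DATA_CD_05_cell01, and cell_ip_address are hypothetical placeholders.

    CellCLI> CREATE CELLDISK ALL HARDDISK
    CellCLI> CREATE GRIDDISK ALL HARDDISK PREFIX=DATA

    SQL> ALTER DISKGROUP DATA ADD DISK 'o/cell_ip_address/DATA_CD_05_cell01'
         REBALANCE POWER 4;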


Replacing a Faulty Physical Disk

You might need to replace a physical disk because its status is warning - predictive failure. This status indicates that the physical disk will fail soon, and you should replace it at the earliest opportunity.

If the drive fails before you replace it, then see "Replacing a Failed Physical Disk".

To replace a disk before it fails:

  1. Identify the faulty disk:
    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status= \
            "warning - predictive failure" DETAIL
    
             name:                   28:3
             deviceId:               19
             diskType:               HardDisk
             enclosureDeviceId:      28
             errMediaCount:          0
             errOtherCount:          0
             foreignState:           false
             luns:                   0_3
             makeModel:              "SEAGATE ST360057SSUN600G"
             physicalFirmware:       0705
             physicalInterface:      sas
             physicalSerial:         E07L8E
             physicalSize:           558.9109999993816G
             slotNumber:             3
             status:                 warning - predictive failure
    

    In the sample output from the previous command, the slot number shows the location of the disk, and the status shows that the disk is expected to fail.

  2. Ensure that the blue "OK to Remove" LED on the disk is lit, before you remove the disk.
  3. Wait while the affected Oracle ASM disks are dropped. To check the status, query the V$ASM_DISK_STAT view on the Oracle ASM instance (see the example query after this procedure).

    Caution:

    The disks in the first two slots are system disks, which store the operating system and the Recovery Appliance storage server software. One system disk must be in working condition for the server to operate.

    Before replacing the other system disk, wait until ALTER CELL VALIDATE CONFIGURATION shows no RAID mdadm errors. This output indicates that the system disk resynchronization is complete.

    See Also:

    Oracle Database Reference for information about querying the V$ASM_DISK_STAT view

  4. Replace the physical disk on the storage server and wait three minutes. The physical disk is hot pluggable, and you can replace it when the power is on.
  5. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=28:3 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.

  6. Verify that the firmware is correct:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    
  7. Re-create the grid disks and cell disks that existed on the previous disk in that slot. See "About Rebalancing the Data".
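For step 3, a query like the following, run from SQL*Plus on the Oracle ASM instance, shows whether the Oracle ASM disks affected by the faulty physical disk have been dropped. The column selection is illustrative.

SQL> SELECT name, mount_status, header_status, mode_status, state
     FROM v$asm_disk_stat;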


Removing an Underperforming Physical Disk

A bad physical disk can degrade the performance of other good disks. You should remove the bad disk from the system.

To remove a physical disk after identifying the bad disk:

  1. Illuminate the physical drive service LED to identify the drive to be replaced:
    cellcli -e 'alter physicaldisk disk_name serviceled on'
    

    In the preceding command, disk_name is the name of the physical disk to be replaced, such as 20:2.

  2. Identify all grid disks on the bad disk, and direct Oracle ASM to stop using them:
    ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name
    
  3. Ensure that the blue "OK to Remove" LED on the disk is lit.
  4. Query the V$ASM_DISK_STAT view to ensure that the Oracle ASM disks affected by the bad disk were dropped successfully.
  5. Remove the bad disk.

    An alert is sent when the disk is removed.

  6. When a new disk is available, install it in the system. The cell disks and grid disks are created automatically on the new physical disk.
  7. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=20:2 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.
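If you turned on the drive service LED in step 1, you can turn it off again after the replacement. This assumes that the SERVICELED option shown in step 1 also accepts OFF in your CellCLI release:

cellcli -e 'alter physicaldisk disk_name serviceled off'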

Moving All Drives from One Storage Server to Another

You might need to move all drives from one storage server to another storage server. This situation might occur when a chassis-level component fails, such as a motherboard or Oracle ILOM, or when you are troubleshooting a hardware problem.

To move the drives between storage servers:

  1. Back up the following files and directories:
    • /etc/hosts

    • /etc/modprobe.conf

    • /etc/sysconfig/network

    • /etc/sysconfig/network-scripts

  2. Inactivate all grid disks and shut down the storage server. See "Shutting Down a Storage Server".
  3. Ensure that the Oracle ASM disk_repair_time attribute is set long enough, so that Oracle ASM does not drop the disks before you can activate the grid disks in the new storage server (see the example at the end of this procedure).
  4. Move the physical disks, flash disks, disk controller, and USB flash drive from the original storage server to the new storage server.

    Caution:

    • Ensure that the first two disks, which are the system disks, are installed in the first two slots of the new storage server. Otherwise, the storage server will not function properly.

    • Ensure that the flash cards are installed in the same PCIe slots as in the original storage server.

  5. Power on the new storage server. You can either use the service processor interface or press the power button.
  6. Log in to the console using the service processor.
  7. Check the following files and directories. Restore corrupt files from the backups that you made in step 1.
    • /etc/hosts

    • /etc/modprobe.conf

    • /etc/sysconfig/network

    • /etc/sysconfig/network-scripts

  8. Use the ifconfig command to retrieve the new MAC addresses for eth0, eth1, eth2, and eth3. This example shows that the eth0 MAC address (HWaddr) is 00:14:4F:CA:D9:AE.
    # ifconfig eth0
    eth0      Link encap:Ethernet  HWaddr 00:14:4F:CA:D9:AE
              inet addr:10.204.74.184  Bcast:10.204.75.255  Mask:255.255.252.0
              inet6 addr: fe80::214:4fff:feca:d9ae/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:141455 errors:0 dropped:0 overruns:0 frame:0
              TX packets:6340 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:9578692 (9.1 MiB)  TX bytes:1042156 (1017.7 KiB)
              Memory:f8c60000-f8c80000
    
  9. In the /etc/sysconfig/network-scripts directory, edit the following files to change HWADDR to the value returned in step 8:
    • ifcfg-eth0
    • ifcfg-eth1
    • ifcfg-eth2
    • ifcfg-eth3

    The following example shows the edited ifcfg-eth0 file:

    #### DO NOT REMOVE THESE LINES ####
    #### %GENERATED BY CELL% ####
    DEVICE=eth0
    BOOTPROTO=static
    ONBOOT=yes
    IPADDR=10.204.74.184
    NETMASK=255.255.252.0
    NETWORK=10.204.72.0
    BROADCAST=10.204.75.255
    GATEWAY=10.204.72.1
    HOTPLUG=no
    IPV6INIT=no
    HWADDR=00:14:4F:CA:D9:AE
    
  10. Restart the storage server.
  11. Activate the grid disks:
    CellCLI> ALTER GRIDDISK ALL ACTIVE
    

    If the Oracle ASM disks were not dropped, then they go online automatically and start being used.

  12. Validate the configuration:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    
  13. Activate Oracle ILOM for ASR.
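For step 3, you can check and, if necessary, raise the disk_repair_time attribute from SQL*Plus on an Oracle ASM instance. This is a sketch; the DATA disk group name and the 8.5h value are illustrative.

SQL> SELECT dg.name, a.value
     FROM v$asm_diskgroup dg, v$asm_attribute a
     WHERE dg.group_number = a.group_number AND a.name = 'disk_repair_time';

SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'disk_repair_time' = '8.5h';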

Removing and Replacing the Same Physical Disk

If you remove the wrong physical disk and then reinsert it, Recovery Appliance automatically adds the disk back into the Oracle ASM disk group and resynchronizes its data.

Note:

When replacing a faulty or failed disk, look for a lit LED on the disk. The LED is lit to help you locate the bad disk.

Reenabling a Rejected Physical Disk

Recovery Appliance rejects a physical disk when it is in the wrong slot.

Caution:

Reenabling a physical disk removes all data stored on it.

  • To reenable a rejected physical disk, replace hard_disk_name and hard_disk_id with the appropriate values in this command:

    CellCLI> ALTER PHYSICALDISK hard_disk_name/hard_disk_id reenable force
    Physical disk hard_disk_name/hard_disk_id  was reenabled.