Maintaining the Physical Disks of Storage Servers


See Also:

Oracle Maximum Availability Architecture (MAA) website at http://www.oracle.com/goto/maa for additional information about maintenance best practices

About System Disks and Data Disks

The first two disks of storage servers are system disks. Storage server system software resides on a portion of each of the system disks. These portions on both system disks are referred to as the system area. The non-system area of the system disks, referred to as data partitions, is used for normal data storage. All other disks in a storage server are called data disks.

Monitoring the Status of Physical Disks

You can monitor a physical disk by checking its attributes with the CellCLI LIST PHYSICALDISK command. For example, a physical disk with a status of failed or warning - predictive failure has problems and probably must be replaced. The disk firmware maintains the error counters and marks a drive with predictive failure status when internal thresholds are exceeded. The drive, not the server software, determines whether it needs replacement.
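For example, to review the current status of every physical disk, you can run a command like the following. The attribute list is illustrative; adjust it to the attributes you want to monitor.

CellCLI> LIST PHYSICALDISK ATTRIBUTES name, status, slotNumber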

The following list identifies the storage server physical disk statuses.

Physical Disk Status for Storage Servers

  • normal
  • normal - dropped for replacement
  • normal - confinedOnline
  • normal - confinedOnline - dropped for replacement
  • not present
  • failed
  • failed - dropped for replacement
  • failed - rejected due to incorrect disk model
  • failed - rejected due to incorrect disk model - dropped for replacement
  • failed - rejected due to wrong slot
  • failed - rejected due to wrong slot - dropped for replacement
  • warning - confinedOnline
  • warning - confinedOnline - dropped for replacement
  • warning - peer failure
  • warning - poor performance
  • warning - poor performance - dropped for replacement
  • warning - poor performance, write-through caching
  • warning - predictive failure, poor performance
  • warning - predictive failure, poor performance - dropped for replacement
  • warning - predictive failure, write-through caching
  • warning - predictive failure
  • warning - predictive failure - dropped for replacement
  • warning - predictive failure, poor performance, write-through caching
  • warning - write-through caching

What Happens When Disk Errors Occur?

Oracle ASM performs bad extent repair for read errors caused by hardware errors. The disks stay online, and no alerts are sent.

When a disk fails:

  • The Oracle ASM disks associated with the grid disks on the failed drive are dropped automatically with the FORCE option, and then an Oracle ASM rebalance restores data redundancy.

  • The blue LED and the amber LED are turned on for the drive, indicating that disk replacement can proceed. The drive LED stays on solid. See "LED Status Descriptions" for information about LED status lights during predictive failure and poor performance.

  • The server generates an alert, which includes specific instructions for replacing the disk. If you configured the system for alert notifications, then the alert is sent by email to the designated address.

When a disk has a faulty status, such as warning - predictive failure:

  • The Oracle ASM disks associated with the grid disks on the physical drive are dropped automatically.

  • An Oracle ASM rebalance relocates the data from the predictively failed disk to other disks.

  • The blue LED is turned on for the drive, indicating that disk replacement can proceed.

When Oracle ASM gets a read error on a physically addressed metadata block, it does not have mirroring for the block, so it takes the following actions:

  • Oracle ASM takes the disk offline.

  • Oracle ASM drops the disk with the FORCE option.

  • The storage server software sends an alert stating that the disk can be replaced.

About Detecting Underperforming Disks

ASR automatically identifies and removes a poorly performing disk from the active configuration. Recovery Appliance then runs a set of performance tests. When CELLSRV detects poor disk performance, the cell disk status changes to normal - confinedOnline, and the physical disk status changes to warning - confinedOnline. Table 13-2 describes the conditions that trigger disk confinement:

Table 13-2 Alerts Indicating Poor Disk Performance

Alert Code              Cause
----------------------  -------------------------------------------------------
CD_PERF_HANG            Disk stopped responding
CD_PERF_SLOW_ABS        High service time threshold (slow disk)
CD_PERF_SLOW_RLTV       High relative service time threshold (slow disk)
CD_PERF_SLOW_LAT_WT     High latency on writes
CD_PERF_SLOW_LAT_RD     High latency on reads
CD_PERF_SLOW_LAT_RW     High latency on reads and writes
CD_PERF_SLOW_LAT_ERR    Frequent very high absolute latency on individual I/Os
CD_PERF_IOERR           I/O errors

If the problem is temporary and the disk passes the tests, then it is brought back into the configuration. If the disk does not pass the tests, then it is marked as poor performance, and ASR submits a service request to replace the disk. If possible, Oracle ASM takes the grid disks offline for testing. Otherwise, the cell disk status stays at normal - confinedOnline until the disks can be taken offline safely. See "Removing an Underperforming Physical Disk".
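For example, to check whether any hard disks are currently confined, you can list their statuses with a command like the following and look for the confinedOnline values. This is a sketch; the attribute list is illustrative.

CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk ATTRIBUTES name, status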

The disk status change is recorded in the server alert history:

MESSAGE ID date_time info "Hard disk entered confinement status. The LUN
 n_m changed status to warning - confinedOnline. CellDisk changed status to normal
 - confinedOnline. Status: WARNING - CONFINEDONLINE  Manufacturer: name  Model
 Number: model  Size: size  Serial Number: serial_number  Firmware: fw_release 
 Slot Number: m  Cell Disk: cell_disk_name  Grid Disk: grid disk 1, grid disk 2
     .
     .
     .
Reason for confinement: threshold for service time exceeded"

These messages are entered in the storage cell alert log:

CDHS: Mark cd health state change cell_disk_name  with newState HEALTH_BAD_
ONLINE pending HEALTH_BAD_ONLINE ongoing INVALID cur HEALTH_GOOD
Celldisk entering CONFINE ACTIVE state with cause CD_PERF_SLOW_ABS activeForced: 0
inactiveForced: 0 trigger HistoryFail: 0, forceTestOutcome: 0 testFail: 0
global conf related state: numHDsConf: 1 numFDsConf: 0 numHDsHung: 0 numFDsHung: 0
     .
     .
     .

About Rebalancing the Data

After you replace the physical disk, you must re-create the grid disks and cell disks that existed on the previous disk in that slot. If those grid disks were part of an Oracle ASM disk group, then add them back to the disk group and rebalance the data, based on the disk group redundancy and the ASM_POWER_LIMIT parameter.

Oracle ASM rebalance occurs when dropping or adding a disk. To check the status of the rebalance:

  • Did the rebalance operation run successfully?

    Check the Oracle ASM alert logs.

  • Is the rebalance operation currently running?

    Check the GV$ASM_OPERATION view.

  • Did the rebalance operation fail?

    Check the ERROR column of the V$ASM_OPERATION view.

If the failed physical disk contained Oracle ASM disks from multiple disk groups, then you can run the rebalance operations for those disk groups on different Oracle ASM instances in the same cluster. Each Oracle ASM instance can run only one rebalance operation at a time. If all Oracle ASM instances are busy, then the remaining rebalance operations are queued.
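For example, the following query, run from SQL*Plus on an Oracle ASM instance, shows any rebalance operations that are currently running, their estimated time to completion, and any error code. The column selection is illustrative.

SQL> SELECT inst_id, operation, state, power, sofar, est_work, est_minutes, error_code
     FROM gv$asm_operation;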

Monitoring Hard Disk Controller Write-Through Cache Mode

The hard disk controller on each storage server periodically performs a discharge and charge of the controller battery. During the operation, the write cache policy changes from write-back caching to write-through caching. Write-through cache mode is slower than write-back cache mode. However, write-back cache mode risks data loss if the storage server loses power or fails. The operation occurs every three months, for example, at 01:00 on the 17th day of January, April, July and October.

This example shows an informational alert that a storage server generates about the status of the caching mode for its logical drives:

HDD disk controller battery on disk controller at adapter 0 is going into a learn
cycle. This is a normal maintenance activity that occurs quarterly and runs for
approximately 1 to 12 hours. The disk controller cache might go into WriteThrough
caching mode during the learn cycle. Disk write throughput might be temporarily
lower during this time. The message is informational only, no action is required.

Use the following commands to manage changes to the periodic write cache policy:

  • To change the start time for the learn cycle, use a command like the following example:

    CellCLI> ALTER CELL bbuLearnCycleTime="2013-01-22T02:00:00-08:00"
    

    The time reverts to the default learn cycle time after the cycle completes.

  • To see the time for the next learn cycle:

    CellCLI> LIST CELL ATTRIBUTES bbuLearnCycleTime
    
  • To view the status of the battery:

    # /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0
    
    BBU status for Adapter: 0
     
    BatteryType: iBBU08
    Voltage: 3721 mV
    Current: 541 mA
    Temperature: 43 C
     
    BBU Firmware Status:
    Charging Status : Charging
    Voltage : OK
    Temperature : OK
    Learn Cycle Requested : No
    Learn Cycle Active : No
    Learn Cycle Status : OK
    Learn Cycle Timeout : No
    I2c Errors Detected : No
    Battery Pack Missing : No
    Battery Replacement required : No
    Remaining Capacity Low : Yes
    Periodic Learn Required : No
    Transparent Learn : No
     
    Battery state:
     
    GasGuageStatus:
    Fully Discharged : No
    Fully Charged : No
    Discharging : No
    Initialized : No
    Remaining Time Alarm : Yes
    Remaining Capacity Alarm: No
    Discharge Terminated : No
    Over Temperature : No
    Charging Terminated : No
    Over Charged : No
     
    Relative State of Charge: 7 %
    Charger System State: 1
    Charger System Ctrl: 0
    Charging current: 541 mA
    Absolute state of charge: 0 %
    Max Error: 0 %
     
    Exit Code: 0x00
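  • To check the current cache policy of the controller logical drives (for example, to confirm whether they are in WriteThrough mode during a learn cycle), query the controller. This is a sketch that assumes the standard MegaCli -LDGetProp option is available in your MegaCli release:

    # /opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -LAll -a0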

Replacing a Failed Physical Disk

A physical disk outage can reduce performance and data redundancy. Therefore, you should replace a failed disk with a new disk as soon as possible.

To replace a disk when it fails:

  1. Determine which disk failed.
    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status=failed DETAIL
    
             name:                   28:5
             deviceId:               21
             diskType:               HardDisk
             enclosureDeviceId:      28
             errMediaCount:          0
             errOtherCount:          0
             foreignState:           false
             luns:                   0_5
             makeModel:              "SEAGATE ST360057SSUN600G"
             physicalFirmware:       0705
             physicalInterface:      sas
             physicalSerial:         A01BC2
             physicalSize:           558.9109999993816G
             slotNumber:             5
             status:                 failed
    

    The slot number shows the location of the disk, and the status shows that the disk failed.

  2. Ensure that the blue "OK to Remove" LED on the disk is lit, before you remove the disk.
  3. Replace the physical disk on the storage server and wait three minutes. The physical disk is hot pluggable, and you can replace it with the power on.
  4. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=28:5 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.

  5. Verify that the firmware is correct:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    

    You can also check the ms-odl.trc file to confirm that the firmware was updated and the logical unit number (LUN) was rebuilt.

  6. Re-create the grid disks and cell disks that existed on the previous disk in that slot. See "About Rebalancing the Data".
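    The exact commands depend on your original disk layout. As an illustrative sketch only, re-creating the cell disk and grid disks on the replacement drive and adding a grid disk back to an Oracle ASM disk group might look like the following. The DATA prefix, the DATA disk group name, the grid disk name DATA_CD_05_cell01, and cell_ip_address are hypothetical placeholders.

    CellCLI> CREATE CELLDISK ALL HARDDISK
    CellCLI> CREATE GRIDDISK ALL HARDDISK PREFIX=DATA

    SQL> ALTER DISKGROUP DATA ADD DISK 'o/cell_ip_address/DATA_CD_05_cell01'
         REBALANCE POWER 4;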


Replacing a Faulty Physical Disk

You might need to replace a physical disk because its status is warning - predictive failure. This status indicates that the physical disk will fail soon, and you should replace it at the earliest opportunity.

If the drive fails before you replace it, then see "Replacing a Failed Physical Disk".

To replace a disk before it fails:

  1. Identify the faulty disk:
    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status= \
            "warning - predictive failure" DETAIL
    
             name:                   28:3
             deviceId:               19
             diskType:               HardDisk
             enclosureDeviceId:      28
             errMediaCount:          0
             errOtherCount:          0
             foreignState:           false
             luns:                   0_3
             makeModel:              "SEAGATE ST360057SSUN600G"
             physicalFirmware:       0705
             physicalInterface:      sas
             physicalSerial:         E07L8E
             physicalSize:           558.9109999993816G
             slotNumber:             3
             status:                 warning - predictive failure
    

    In the sample output from the previous command, the slot number shows the location of the disk, and the status shows that the disk is expected to fail.

  2. Ensure that the blue "OK to Remove" LED on the disk is lit, before you remove the disk.
  3. Wait while the affected Oracle ASM disks are dropped. To check the status, query the V$ASM_DISK_STAT view on the Oracle ASM instance (see the example query after this procedure).

    Caution:

    The disks in the first two slots are system disks, which store the operating system and the Recovery Appliance storage server software. One system disk must be in working condition for the server to operate.

    Before replacing the other system disk, wait until ALTER CELL VALIDATE CONFIGURATION shows no RAID mdadm errors. This output indicates that the system disk resynchronization is complete.

    See Also:

    Oracle Database Reference for information about querying the V$ASM_DISK_STAT view

  4. Replace the physical disk on the storage server and wait three minutes. The physical disk is hot pluggable, and you can replace it when the power is on.
  5. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=28:3 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.

  6. Verify that the firmware is correct:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    
  7. Re-create the grid disks and cell disks that existed on the previous disk in that slot. See "About Rebalancing the Data".
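For step 3, a query like the following, run from SQL*Plus on the Oracle ASM instance, shows whether the Oracle ASM disks affected by the faulty physical disk have been dropped. The column selection is illustrative.

SQL> SELECT name, mount_status, header_status, mode_status, state
     FROM v$asm_disk_stat;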


Removing an Underperforming Physical Disk

A bad physical disk can degrade the performance of other good disks. You should remove the bad disk from the system.

To remove a physical disk after identifying the bad disk:

  1. Illuminate the physical drive service LED to identify the drive to be replaced:
    cellcli -e 'alter physicaldisk disk_name serviceled on'
    

    In the preceding command, disk_name is the name of the physical disk to be replaced, such as 20:2.

  2. Identify all grid disks on the bad disk, and direct Oracle ASM to stop using them:
    ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name
    
  3. Ensure that the blue "OK to Remove" LED on the disk is lit.
  4. Query the V$ASM_DISK_STAT view to ensure that the Oracle ASM disks affected by the bad disk were dropped successfully.
  5. Remove the bad disk.

    An alert is sent when the disk is removed.

  6. When a new disk is available, install it in the system. The cell disks and grid disks are created automatically on the new physical disk.
  7. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=20:2 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.
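If you turned on the drive service LED in step 1, you can turn it off again after the replacement. This assumes that the SERVICELED option shown in step 1 also accepts OFF in your CellCLI release:

cellcli -e 'alter physicaldisk disk_name serviceled off'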

Moving All Drives from One Storage Server to Another

You might need to move all drives from one storage server to another storage server. This situation might occur when a chassis-level component fails, such as a motherboard or Oracle ILOM, or when you are troubleshooting a hardware problem.

To move the drives between storage servers:

  1. Back up the following files and directories:
    • /etc/hosts

    • /etc/modprobe.conf

    • /etc/sysconfig/network

    • /etc/sysconfig/network-scripts

  2. Inactivate all grid disks and shut down the storage server. See "Shutting Down a Storage Server".
  3. Ensure that the Oracle ASM disk_repair_time attribute is set long enough, so that Oracle ASM does not drop the disks before you can activate the grid disks in the new storage server (see the example at the end of this procedure).
  4. Move the physical disks, flash disks, disk controller, and USB flash drive from the original storage server to the new storage server.

    Caution:

    • Ensure that the first two disks, which are the system disks, are installed in the first two slots of the new storage server. Otherwise, the storage server will not function properly.

    • Ensure that the flash cards are installed in the same PCIe slots as in the original storage server.

  5. Power on the new storage server. You can either use the service processor interface or press the power button.
  6. Log in to the console using the service processor.
  7. Check the following files and directories. Restore corrupt files from the backups that you made in step 1.
    • /etc/hosts

    • /etc/modprobe.conf

    • /etc/sysconfig/network

    • /etc/sysconfig/network-scripts

  8. Use the ifconfig command to retrieve the new MAC addresses for eth0, eth1, eth2, and eth3. This example shows that the eth0 MAC address (HWaddr) is 00:14:4F:CA:D9:AE.
    # ifconfig eth0
    eth0      Link encap:Ethernet  HWaddr 00:14:4F:CA:D9:AE
              inet addr:10.204.74.184  Bcast:10.204.75.255  Mask:255.255.252.0
              inet6 addr: fe80::214:4fff:feca:d9ae/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:141455 errors:0 dropped:0 overruns:0 frame:0
              TX packets:6340 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:9578692 (9.1 MiB)  TX bytes:1042156 (1017.7 KiB)
              Memory:f8c60000-f8c80000
    
  9. In the /etc/sysconfig/network-scripts directory, edit the following files to change HWADDR to the value returned in step 8:
    • ifcfg-eth0
    • ifcfg-eth1
    • ifcfg-eth2
    • ifcfg-eth3

    The following example shows the edited ifcfg-eth0 file:

    #### DO NOT REMOVE THESE LINES ####
    #### %GENERATED BY CELL% ####
    DEVICE=eth0
    BOOTPROTO=static
    ONBOOT=yes
    IPADDR=10.204.74.184
    NETMASK=255.255.252.0
    NETWORK=10.204.72.0
    BROADCAST=10.204.75.255
    GATEWAY=10.204.72.1
    HOTPLUG=no
    IPV6INIT=no
    HWADDR=00:14:4F:CA:D9:AE
    
  10. Restart the storage server.
  11. Activate the grid disks:
    CellCLI> ALTER GRIDDISK ALL ACTIVE
    

    If the Oracle ASM disks were not dropped, then they go online automatically and start being used.

  12. Validate the configuration:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    
  13. Activate Oracle ILOM for ASR.
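For step 3, you can check and, if necessary, raise the disk_repair_time attribute from SQL*Plus on an Oracle ASM instance. This is a sketch; the DATA disk group name and the 8.5h value are illustrative.

SQL> SELECT dg.name, a.value
     FROM v$asm_diskgroup dg, v$asm_attribute a
     WHERE dg.group_number = a.group_number AND a.name = 'disk_repair_time';

SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'disk_repair_time' = '8.5h';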

Removing and Replacing the Same Physical Disk

If you remove the wrong physical disk and then reinsert it, Recovery Appliance automatically adds the disk back into the Oracle ASM disk group and resynchronizes its data.

Note:

When replacing a faulty or failed disk, look for a lit LED on the disk. The LED is lit to help you locate the bad disk.

Reenabling a Rejected Physical Disk

Recovery Appliance rejects a physical disk when it is in the wrong slot.

Caution:

Reenabling a physical disk removes all data stored on it.

  • To reenable a rejected physical disk, replace hard_disk_name and hard_disk_id with the appropriate values in this command:

    CellCLI> ALTER PHYSICALDISK hard_disk_name/hard_disk_id reenable force
    Physical disk hard_disk_name/hard_disk_id  was reenabled.