Maintaining the Flash Disks of Storage Servers
This section describes how to perform maintenance on flash disks. It contains the following topics:
About the Flash Disks
Recovery Appliance mirrors data across storage servers, and sends write operations to at least two storage servers. If a flash card in one storage server has problems, then Recovery Appliance services the read and write operations using the mirrored data in another storage server. Service is not interrupted.
If a flash card fails, then the storage server software identifies the data in the flash cache by reading the data from the surviving mirror. It then writes the data to the server with the failed flash card. When the failure occurs, the software saves the location of the data lost in the failed flash cache. Resilvering then replaces the lost data with the mirrored copy. During resilvering, the grid disk status is ACTIVE -- RESILVERING WORKING.
Each storage server has four PCIe cards. Each card has four flash disks (FDOMs) for a total of 16 flash disks. The four PCIe cards are located in PCI slot numbers 1, 2, 4, and 5.
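As a quick check of the layout described above, the following Python sketch enumerates the flash disks expected on one storage server. The FLASH_<slot>_<fdom> naming pattern and the 0 to 3 FDOM numbering are assumptions inferred from the sample output in this section (for example, FLASH_5_0 through FLASH_5_3), and the function name is illustrative only.

```python
# PCI slots that hold the flash cards, as stated in this section.
FLASH_PCI_SLOTS = (1, 2, 4, 5)
# Assumption: the four FDOMs on each card are numbered 0 through 3.
FDOMS_PER_CARD = 4


def expected_flash_disk_names():
    """Return the flash disk names expected on one storage server."""
    return [
        f"FLASH_{slot}_{fdom}"
        for slot in FLASH_PCI_SLOTS
        for fdom in range(FDOMS_PER_CARD)
    ]


names = expected_flash_disk_names()
print(len(names))            # 4 cards x 4 FDOMs
print(names[0], names[-1])   # first and last expected names
```

Comparing this list with the names that LIST PHYSICALDISK reports is one way to notice a flash disk that is missing from the inventory altogether.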
To identify a failed flash disk, use the following command:
CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS=failed DETAIL
name: FLASH_5_3
diskType: FlashDisk
luns: 5_3
makeModel: "Sun Flash Accelerator F40 PCIe Card"
physicalFirmware: TI35
physicalInsertTime: 2012-07-13T15:40:59-07:00
physicalSerial: 5L002X4P
physicalSize: 93.13225793838501G
slotNumber: "PCI Slot: 5; FDOM: 3"
status: failed
The card name and slotNumber attributes show the PCI slot and the FDOM number.
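If you capture the DETAIL output in a script, the PCI slot and FDOM number can be extracted from the slotNumber attribute. The following Python sketch parses the sample record shown above. The helper names are hypothetical, and the parsing assumes the "attribute: value" layout of that sample.

```python
import re

# Sample record copied from the LIST PHYSICALDISK ... DETAIL output above.
SAMPLE_DETAIL = """\
name: FLASH_5_3
diskType: FlashDisk
luns: 5_3
makeModel: "Sun Flash Accelerator F40 PCIe Card"
physicalFirmware: TI35
physicalInsertTime: 2012-07-13T15:40:59-07:00
physicalSerial: 5L002X4P
physicalSize: 93.13225793838501G
slotNumber: "PCI Slot: 5; FDOM: 3"
status: failed
"""


def parse_detail(text):
    """Turn 'attribute: value' lines into a dict, splitting on the first colon."""
    record = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        record[key.strip()] = value.strip().strip('"')
    return record


def slot_and_fdom(slot_number):
    """Extract (PCI slot, FDOM) from a slotNumber value."""
    match = re.search(r"PCI Slot:\s*(\d+);\s*FDOM:\s*(\d+)", slot_number)
    if match is None:
        raise ValueError(f"unrecognized slotNumber: {slot_number!r}")
    return int(match.group(1)), int(match.group(2))


disk = parse_detail(SAMPLE_DETAIL)
print(disk["name"], slot_and_fdom(disk["slotNumber"]))
```

Splitting on the first colon only keeps values that contain colons themselves, such as physicalInsertTime, intact.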
When the server software detects a failure, it generates an alert that indicates that the flash disk, and the LUN on it, failed. The alert message includes the PCI slot number of the flash card and the exact FDOM number. These numbers uniquely identify the field replaceable unit (FRU). If you configured the system for alert notification, then the alert is sent to the designated address in an email message.
A flash disk outage can reduce performance and data redundancy. Replace the failed disk at the earliest opportunity. If the flash disk is used for flash cache, then the effective cache size for the server is reduced. If the flash disk is used for flash log, then the flash log is disabled on the disk, thus reducing the effective flash log size. If the flash disk is used for grid disks, then the Oracle ASM disks associated with them are automatically dropped with the FORCE option from the Oracle ASM disk group, and an Oracle ASM rebalance starts to restore the data redundancy.
See Also:
- "Parts for Storage Servers" for part number information and a link to the service guide
- Oracle Database Reference for information about the V$ASM_OPERATION view
- Sun Flash Accelerator F80 PCIe Card User's Guide
Faulty Status Indicators
The following status indicators generate an alert. The alert includes specific instructions for replacing the flash disk. If you configured the system for alert notifications, then the alerts are sent by email message to the designated address.
- warning - peer failure
  One of the flash disks on the same Sun Flash Accelerator PCIe card failed or has a problem. For example, if FLASH_5_3 fails, then FLASH_5_0, FLASH_5_1, and FLASH_5_2 have peer failure status:
  CellCLI> LIST PHYSICALDISK
           36:0            L45F3A          normal
           36:1            L45WAE          normal
           36:2            L45WQW          normal
           . . .
           FLASH_5_0       5L0034XM        warning - peer failure
           FLASH_5_1       5L0034JE        warning - peer failure
           FLASH_5_2       5L002WJH        warning - peer failure
           FLASH_5_3       5L002X4P        failed
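The relationship between a failed FDOM and its peers can be checked programmatically. The following Python sketch groups flash disks by PCI slot using the sample listing above. It assumes whitespace-separated name, serial number, and status columns, and that the slot is the middle part of the FLASH_<slot>_<fdom> name. The function names are hypothetical.

```python
# Rows taken from the LIST PHYSICALDISK sample above (elided rows omitted).
SAMPLE_LIST = """\
36:0        L45F3A      normal
36:1        L45WAE      normal
36:2        L45WQW      normal
FLASH_5_0   5L0034XM    warning - peer failure
FLASH_5_1   5L0034JE    warning - peer failure
FLASH_5_2   5L002WJH    warning - peer failure
FLASH_5_3   5L002X4P    failed
"""


def parse_list(text):
    """Yield (name, serial, status) tuples from the listing."""
    for line in text.splitlines():
        parts = line.split(None, 2)  # the status itself may contain spaces
        if len(parts) == 3:
            yield parts[0], parts[1], parts[2].strip()


def failed_cards(rows):
    """Map each PCI slot that has a failed FDOM to its failed and peer disks."""
    flash = [(name, status) for name, _, status in rows if name.startswith("FLASH_")]
    cards = {}
    for name, status in flash:
        if status == "failed":
            slot = name.split("_")[1]
            cards.setdefault(slot, {"failed": [], "peers": []})["failed"].append(name)
    for name, status in flash:
        slot = name.split("_")[1]
        if slot in cards and status == "warning - peer failure":
            cards[slot]["peers"].append(name)
    return cards


print(failed_cards(parse_list(SAMPLE_LIST)))
```

Only the disk reported as failed identifies the FRU; the peers are listed to show which other FDOMs share the affected card.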
- warning - predictive failure
  The flash disk will fail soon, and should be replaced at the earliest opportunity. If the flash disk is used for flash cache, then it continues to be used as flash cache. If the flash disk is used for grid disks, then the Oracle ASM disks associated with these grid disks are automatically dropped, and Oracle ASM rebalance relocates the data from the predictively failed disk to other disks.
  When a flash disk has predictive failure status, its data is copied. If the flash disk is used for write back flash cache, then the data is flushed from the flash disks to the grid disks.
- warning - poor performance
  The flash disk demonstrates extremely poor performance, and should be replaced at the earliest opportunity. If the flash disk is used for flash cache, then flash cache is dropped from this disk, thus reducing the effective flash cache size for the storage server. If the flash disk is used for grid disks, then the Oracle ASM disks associated with the grid disks on this flash disk are automatically dropped with the FORCE option, if possible. If DROP...FORCE cannot succeed because of offline partners, then the grid disks are dropped normally, and Oracle ASM rebalance relocates the data from the poor performance disk to the other disks.
- warning - write-through caching
  The capacitors used to support data cache on the PCIe card failed, and the card should be replaced as soon as possible.
Identifying Flash Disks in Poor Health
To identify a flash disk with a particular health status, use the LIST PHYSICALDISK command. This example queries for the warning - predictive failure status:
CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS= \
         'warning - predictive failure' DETAIL
name: FLASH_5_3
diskType: FlashDisk
luns: 5_3
makeModel: "Sun Flash Accelerator F40 PCIe Card"
physicalFirmware: TI35
physicalInsertTime: 2012-07-13T15:40:59-07:00
physicalSerial: 5L002X4P
physicalSize: 93.13225793838501G
slotNumber: "PCI Slot: 1; FDOM: 2"
status: warning - predictive failure
Identifying Underperforming Flash Disks
ASR automatically identifies and removes a poorly performing disk from the active configuration. Recovery Appliance then runs a set of performance tests. When CELLSRV detects poor disk performance, the cell disk status changes to normal - confinedOnline, and the physical disk status changes to warning - confinedOnline. Table 13-2 describes the conditions that trigger disk confinement. The conditions are the same for both physical and flash disks.
If the problem is temporary and the disk passes the tests, then it is brought back into the configuration. If the disk does not pass the tests, then it is marked poor performance, and ASR submits a service request to replace the disk. If possible, Oracle ASM takes the grid disks offline for testing. Otherwise, the cell disk status stays at normal - confinedOnline until the disks can be taken offline safely.
The disk status change is recorded in the server alert history:
MESSAGE ID date_time info "Hard disk entered confinement status. The LUN
n_m changed status to warning - confinedOnline. CellDisk changed status to
normal - confinedOnline.
Status: WARNING - CONFINEDONLINE
Manufacturer: name
Model Number: model
Size: size
Serial Number: serial_number
Firmware: fw_release
Slot Number: m
Cell Disk: cell_disk_name
Grid Disk: grid disk 1, grid disk 2 ...
Reason for confinement: threshold for service time exceeded"
These messages are entered in the storage cell alert log:
CDHS: Mark cd health state change cell_disk_name with newState HEALTH_BAD_ONLINE pending HEALTH_BAD_ONLINE ongoing INVALID cur HEALTH_GOOD
Celldisk entering CONFINE ACTIVE state with cause CD_PERF_SLOW_ABS activeForced: 0
inactiveForced: 0 trigger HistoryFail: 0, forceTestOutcome: 0 testFail: 0
global conf related state: numHDsConf: 1 numFDsConf: 0 numHDsHung: 0 numFDsHung: 0
.
.
.
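When scanning the alert log with a script, the counters on the "global conf related state" line can be pulled out with a regular expression. The following Python sketch does this for the sample line above. Reading numHDsConf and numFDsConf as the number of confined hard disks and flash disks is an assumption based only on the counter names, and the function name is hypothetical.

```python
import re

# The counter line from the alert log excerpt above.
LOG_LINE = (
    "global conf related state: numHDsConf: 1 numFDsConf: 0 "
    "numHDsHung: 0 numFDsHung: 0"
)


def confinement_counters(line):
    """Return the num* counters found on the line as a dict of integers."""
    return {
        name: int(value)
        for name, value in re.findall(r"(num\w+):\s*(\d+)", line)
    }


counters = confinement_counters(LOG_LINE)
print(counters)
```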
When Is It Safe to Replace a Faulty Flash Disk?
When the server software detects a predictive or peer failure in a flash disk used for write back flash cache, and only one FDOM is bad, then the server software resilvers the data on the bad FDOM, and flushes the data on the other three FDOMs. If there are valid grid disks, then the server software initiates an Oracle ASM rebalance of the disks. You cannot replace the bad disk until the tasks are completed and an alert indicates that the disk is ready.
An alert is sent when the Oracle ASM disks are dropped, and you can safely replace the flash disk. If the flash disk is used for write-back flash cache, then wait until none of the grid disks are cached by the flash disk.
Replacing a Failed Flash Disk
Caution:
The PCIe cards are not hot pluggable; you must power down a storage server before replacing the flash disks or cards.
Before you perform the following procedure, shut down the server. See "Shutting Down a Storage Server".
To replace a failed flash disk:
See Also:
- "Parts for Storage Servers" for part numbers and links to the service guide
- Oracle Database Reference for information about the V$ASM_OPERATION view
- Sun Flash Accelerator F80 PCIe Card User's Guide
Replacing a Faulty Flash Disk
Caution:
The PCIe cards are not hot pluggable; you must power down a storage server before replacing the flash disks or cards.
Before you perform the following procedure, review the "When Is It Safe to Replace a Faulty Flash Disk?" topic.
To replace a faulty flash disk:
The system automatically uses the new flash disk, as follows:
- If the flash disk is used for flash cache, then the effective cache size increases.
- If the flash disk is used for grid disks, then the grid disks are re-created on the new flash disk.
- If the grid disks were part of an Oracle ASM disk group, then they are added back to the disk group. The data is rebalanced on them, based on the disk group redundancy and the ASM_POWER_LIMIT parameter.
Removing an Underperforming Flash Disk
A bad flash disk can degrade the performance of other good flash disks. You should remove a bad flash disk. See "Identifying Underperforming Flash Disks".
To remove an underperforming flash disk:
1. If the flash disk is used for flash cache:
   a. Ensure that data not synchronized with the disk (dirty data) is flushed from flash cache to the grid disks:
      CellCLI> ALTER FLASHCACHE ... FLUSH
   b. Disable the flash cache and create a new one. Do not include the bad flash disk when creating the flash cache.
      CellCLI> DROP FLASHCACHE
      CellCLI> CREATE FLASHCACHE CELLDISK='fd1,fd2,fd3,fd4, ...'
2. If the flash disk is used for grid disks, then direct Oracle ASM to stop using the bad disk immediately:
   SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name FORCE
   Offline partners might cause the DROP command with the FORCE option to fail. If the previous command fails, do one of the following:
   - Restore Oracle ASM data redundancy by correcting the other server or disk failures. Then retry the DROP...FORCE command.
   - Direct Oracle ASM to rebalance the data off the bad disk:
     SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name NOFORCE
3. Wait until the Oracle ASM disks associated with the bad flash disk are dropped successfully. The storage server software automatically sends an alert when it is safe to replace the flash disk.
4. Stop the services:
   CellCLI> ALTER CELL SHUTDOWN SERVICES ALL
   The preceding command checks whether any disks are offline, in predictive failure status, or must be copied to their mirrors. If Oracle ASM redundancy is intact, then the command takes the grid disks offline in Oracle ASM, and stops the services.
   The following error indicates that stopping the services might cause redundancy problems and force a disk group to dismount:
   Stopping the RS, CELLSRV, and MS services...
   The SHUTDOWN of ALL services was not successful.
   CELL-01548: Unable to shut down CELLSRV because disk group DATA, RECO may be
   forced to dismount due to reduced redundancy.
   Getting the state of CELLSRV services... running
   Getting the state of MS services... running
   Getting the state of RS services... running
   If this error occurs, then restore Oracle ASM disk group redundancy. Retry the command when the status is normal for all disks.
5. Shut down the server. See "Shutting Down a Storage Server".
6. Remove the bad flash disk, and replace it with a new flash disk.
7. Power up the server. The services are started automatically. As part of the server startup, all grid disks are automatically brought online in Oracle ASM.
8. Add the new flash disk to flash cache:
   CellCLI> DROP FLASHCACHE
   CellCLI> CREATE FLASHCACHE ALL
9. Verify that all grid disks are online:
   CellCLI> LIST GRIDDISK ATTRIBUTES asmmodestatus
   Wait until asmmodestatus shows ONLINE or UNUSED for all grid disks.
The flash disks are added as follows:
- If the flash disk is used for grid disks, then the grid disks are re-created on the new flash disk.
- If these grid disks were part of an Oracle ASM disk group and DROP...FORCE was used in Step 2, then they are added back to the disk group and the data is rebalanced based on the disk group redundancy and the ASM_POWER_LIMIT parameter.
- If DROP...NOFORCE was used in Step 2, then you must manually add the grid disks back to the Oracle ASM disk group.
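The final verification in the procedure above (waiting until asmmodestatus shows ONLINE or UNUSED for every grid disk) can be scripted against captured CellCLI output. The following Python sketch is one minimal way to do that. It assumes the status is the last whitespace-separated column on each line, which holds whether or not the name attribute is also listed. The grid disk names and the SYNCING value in the sample are illustrative and not taken from this section.

```python
# States that the procedure treats as "done" for a grid disk.
READY_STATES = {"ONLINE", "UNUSED"}

# Hypothetical captured output; names and states are for illustration only.
SAMPLE_OUTPUT = """\
DATA_FD_00_cel01    ONLINE
DATA_FD_01_cel01    SYNCING
RECO_FD_00_cel01    UNUSED
"""


def disks_not_ready(output):
    """Return the lines whose last column is neither ONLINE nor UNUSED."""
    pending = []
    for line in output.splitlines():
        fields = line.split()
        if fields and fields[-1].upper() not in READY_STATES:
            pending.append(line.strip())
    return pending


pending = disks_not_ready(SAMPLE_OUTPUT)
if pending:
    print("still waiting on:", pending)
else:
    print("all grid disks ready")
```

A wrapper script could rerun the LIST GRIDDISK command and this check at intervals until the returned list is empty.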