B.9 fortified reliability and robustness – Accusys ExaRAID GUI User Manual

Page 278



• Online SMART disk cloning

When a hard disk in a disk group fails, the RAID enters the degraded state,
which means lower performance and a higher risk of data loss or RAID
corruption. When a hard disk is likely to become faulty or unhealthy, for
example when the number of bad sectors on a physical disk exceeds a threshold
or the disk reports a SMART warning, the controller copies all data from that
disk to a spare disk while the array stays online. Moreover, should the source
disk fail during the cloning, the controller will start rebuilding onto the
clone disk, and the rebuild will skip the sectors that have already been
cloned. Disk cloning has proven to be one of the most effective ways to
prevent RAID degradation.
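
The clone-then-rebuild fallback described above can be sketched roughly as
follows. This is only an illustration, not the controller firmware: the names
clone_with_fallback, rebuild_block, and fail_at are hypothetical, disks are
modeled as Python lists of integer blocks, and single XOR parity (RAID 5
style) is assumed for the rebuild step.

```python
from functools import reduce
from operator import xor


def rebuild_block(peers, index):
    """Reconstruct one block of the lost disk by XOR-ing the same block
    on every surviving disk (assumes single XOR parity)."""
    return reduce(xor, (disk[index] for disk in peers))


def clone_with_fallback(source, peers, fail_at=None):
    """Copy blocks from the suspect disk to a spare.

    If the source disk fails at block `fail_at` (a simulated failure),
    the remaining blocks are rebuilt from the peer disks, and the blocks
    that were already cloned are skipped.
    Returns the spare contents and the number of blocks that were cloned.
    """
    spare = [None] * len(source)
    cloned = 0
    for index, block in enumerate(source):
        if index == fail_at:  # hypothetical failure point of the source disk
            break
        spare[index] = block
        cloned += 1
    # Rebuild only the part the clone did not reach.
    for index in range(cloned, len(spare)):
        spare[index] = rebuild_block(peers, index)
    return spare, cloned


# Two data disks and one parity disk; disk0 is the one being cloned.
disk0 = [1, 2, 3, 4]
disk1 = [5, 6, 7, 8]
parity = [a ^ b for a, b in zip(disk0, disk1)]
spare, cloned = clone_with_fallback(disk0, [disk1, parity], fail_at=2)
```

In this sketch the first two blocks come from the clone and the last two from
the parity rebuild, so the spare ends up with the full contents of disk0
without repeating work that the clone already finished.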

• Transaction log and auto parity recovery

The ability of parity-based data protection to rebuild data relies on the
consistency of parity and data. However, this consistency might be lost
if the system shuts down improperly while write commands are still
incomplete. To maintain consistency, the controller keeps logs of write
commands in the NVRAM, and when the controller is restarted, the parity
affected by the incomplete writes is automatically recovered.
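
A minimal sketch of this idea is shown below. It is not the actual firmware
logic: the Stripe and Controller classes, the nvram_log set, and the
crash_before_parity flag are hypothetical names used for illustration, a
Python set stands in for the NVRAM, and XOR parity is assumed.

```python
from functools import reduce
from operator import xor


class Stripe:
    """One stripe: data blocks plus a single XOR parity block."""

    def __init__(self, data):
        self.data = list(data)
        self.parity = reduce(xor, self.data)

    def consistent(self):
        return self.parity == reduce(xor, self.data)


class Controller:
    def __init__(self, stripes):
        self.stripes = stripes
        self.nvram_log = set()  # stands in for the log kept in NVRAM

    def write(self, stripe_id, slot, value, crash_before_parity=False):
        self.nvram_log.add(stripe_id)  # log the write before touching disks
        stripe = self.stripes[stripe_id]
        stripe.data[slot] = value
        if crash_before_parity:
            return  # simulated power loss: parity is stale, log entry stays
        stripe.parity = reduce(xor, stripe.data)
        self.nvram_log.discard(stripe_id)  # write completed, drop the entry

    def recover(self):
        """On restart, recompute parity only for stripes still in the log."""
        recovered = sorted(self.nvram_log)
        for stripe_id in recovered:
            stripe = self.stripes[stripe_id]
            stripe.parity = reduce(xor, stripe.data)
        self.nvram_log.clear()
        return recovered


controller = Controller([Stripe([1, 2, 3]), Stripe([4, 5, 6])])
controller.write(0, 1, 9)                            # completes normally
controller.write(1, 0, 7, crash_before_parity=True)  # interrupted write
repaired = controller.recover()                      # run at restart
```

Because only the logged stripes are repaired, recovery in this sketch does
not have to scan the whole array for parity mismatches after an improper
shutdown.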

• Battery backup protection

The controller delays writes to the disk drives and caches the data in
memory to optimize performance, but this also introduces risk because the
data in the cache will be lost permanently if the system is not powered off
properly. The battery backup module retains the data in the cache memory
during abnormal power loss, and when the system is restarted, the data in
the cache memory is flushed to the disk drives. As the amount of installed
cache memory grows, losing its contents could cause unrecoverable damage
to applications.
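
The difference the battery makes can be illustrated with a small model of a
write-back cache. This is a sketch under stated assumptions only: the
WriteBackCache class and its methods are hypothetical, and dictionaries keyed
by block address stand in for the cache memory and the disk drives.

```python
class WriteBackCache:
    """Toy write-back cache: writes are held in memory and flushed later."""

    def __init__(self, battery_backed):
        self.battery_backed = battery_backed
        self.dirty = {}  # cached writes not yet on disk
        self.disk = {}   # stands in for the disk drives

    def write(self, lba, data):
        self.dirty[lba] = data  # acknowledged before it reaches the disk

    def flush(self):
        """Write all cached data to disk and return how many blocks moved."""
        flushed = len(self.dirty)
        self.disk.update(self.dirty)
        self.dirty.clear()
        return flushed

    def power_loss(self):
        # Without a battery the cache memory loses its contents.
        if not self.battery_backed:
            self.dirty.clear()

    def restart(self):
        # Whatever survived in the cache is flushed on restart.
        return self.flush()


protected = WriteBackCache(battery_backed=True)
protected.write(0, "a")
protected.write(1, "b")
protected.power_loss()
saved = protected.restart()    # both cached writes reach the disk

unprotected = WriteBackCache(battery_backed=False)
unprotected.write(0, "a")
unprotected.power_loss()
lost = unprotected.restart()   # nothing is left to flush
```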

B.9 Fortified Reliability and Robustness

The mission of a RAID controller is to protect user data not only from disk
drive failure but also from any hazard that might cause data loss or system
downtime. Both the hardware and the firmware of the RAID controller
incorporate advanced mechanisms to fortify data reliability and to ensure
system robustness. These designs are derived from more than a decade of field
experience in all kinds of real-world environments, dealing with host
computers, disk drives, and hardware components. One of the best parts of
the design is that the administrator can use the online utilities provided
by the firmware to solve problems without calling the vendor for service.
