Disk mirroring

In data storage, disk mirroring is the replication of logical disk volumes onto separate physical hard disks in real time to ensure continuous availability. It is most commonly used in RAID 1. A mirrored volume is a single logical volume whose contents are kept identical across two or more physical copies.
In a disaster recovery context, mirroring data over long distance is referred to as storage replication. Depending on the technologies used, replication can be performed synchronously, asynchronously, semi-synchronously, or point-in-time. Replication is enabled via microcode on the disk array controller or via server software. It is typically a proprietary solution, not compatible between various data storage device vendors.
Mirroring is typically only synchronous. Synchronous writing typically achieves a recovery point objective (RPO) of zero lost data. Asynchronous replication can achieve an RPO of just a few seconds while the remaining methodologies provide an RPO of a few minutes to perhaps several hours.
Disk mirroring differs from file shadowing, which operates at the file level, and from disk snapshots, in which the data image is never re-synchronized with its origin.
Overview
Typically, mirroring is provided either in hardware solutions such as disk arrays, or in software within the operating system (such as Linux mdadm and device mapper).[1][2] Additionally, file systems like Btrfs or ZFS provide integrated data mirroring.[3][4] Btrfs and ZFS offer further benefits: both maintain data and metadata integrity checksums, which lets them detect corrupted copies of blocks and repair them from the intact mirrored copy.[5]
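To illustrate the self-healing behaviour described above, here is a minimal Python sketch (not taken from any particular file system; the block-device objects and checksum choice are assumptions): each mirrored copy is verified against a stored checksum on read, and a corrupted copy is rewritten from a verified one.

```python
import hashlib

def checksum(block: bytes) -> str:
    # Stand-in content checksum; Btrfs and ZFS use their own algorithms and
    # store the checksum in metadata rather than computing it like this.
    return hashlib.sha256(block).hexdigest()

def self_healing_read(mirrors, offset, length, expected_sum):
    """Read one logical block from a set of mirrored copies, repairing bad ones.

    `mirrors` is a list of hypothetical device objects exposing
    read(offset, length) and write(offset, data).
    """
    good_data = None
    bad_devices = []
    for dev in mirrors:
        data = dev.read(offset, length)
        if checksum(data) == expected_sum:
            good_data = data          # first verified copy is returned
        else:
            bad_devices.append(dev)   # remember copies that failed verification
    if good_data is None:
        raise IOError("all mirrored copies failed checksum verification")
    for dev in bad_devices:
        dev.write(offset, good_data)  # repair the bad copy from a good one
    return good_data
```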
There are several scenarios for what happens when a disk fails. In a hot swap system, the system itself typically diagnoses the disk failure and signals it. Sophisticated systems may automatically activate a hot standby disk and use the remaining active disk to copy live data onto it. Alternatively, a new disk is installed and the data is copied to it. In less sophisticated systems, the system is operated on the remaining disk until a spare disk can be installed.
The copying of data from one side of a mirror pair to another is called rebuilding or, less commonly, resilvering.[6]
Mirroring can be performed site to site over rapid data links, such as fibre-optic links, which over distances of about 500 m can maintain adequate performance to support real-time mirroring. Over longer distances or slower links, mirrors are maintained with an asynchronous copying system. For remote disaster recovery systems, this mirroring may not be done by integrated systems but simply by additional applications on the primary and secondary machines.
Additional benefits
In addition to providing a redundant copy of the data in case of hardware failure, disk mirroring can allow each disk to be accessed separately for reading purposes. Under certain circumstances this can significantly improve performance, as the system can choose, for each read, the disk that can seek most quickly to the required data. This is especially significant where several tasks compete for data on the same disk, since it reduces thrashing (where switching between tasks takes up more time than the tasks themselves). It is an important consideration in hardware configurations that frequently access the data on the disk.
See also
- Business continuance volume – EMC Corporation's term for redundant copies of data in a disk array
- Disk cloning
- Distributed Replicated Block Device (DRBD)
- Mirror site
- Stable storage
References
[edit]- ^ "ANNOUNCE: mdadm 3.3 - A tools for managing md Soft RAID under Linux". gmane.org. 2013-09-03. Archived from the original on 2014-08-21. Retrieved 2013-11-20.
- ^ "Logical Volume Manager Administration". Appendix A. The Device Mapper. Red Hat. Retrieved 2013-09-29.
- ^ "Using Btrfs with Multiple Devices". kernel.org. 2013-11-07. Retrieved 2013-11-20.
- ^ "Actually it's a n-way mirror". c0t0d0s0.org. 2013-09-04. Archived from the original on 2013-09-14. Retrieved 2013-11-20.
- ^ McPherson, Amanda (22 June 2009). "A Conversation with Chris Mason on BTRfs: the next generation file system for Linux". Linux Foundation. Archived from the original on 27 June 2012. Retrieved 2013-11-22.
- ^ "Why Is It Called 'Resilvering'?". The Lone SysAdmin. 23 March 2012. Retrieved 2013-09-19.
Fundamentals
Definition and Purpose
Disk mirroring is a data storage technique that duplicates data across two or more physical disks in real time, creating identical copies to enhance system reliability.[1] This method, also known as RAID 1, ensures that every write operation is simultaneously applied to all mirrored disks, maintaining data consistency without interruption to ongoing processes.[4] The primary purpose of disk mirroring is to provide fault tolerance and high availability by protecting against data loss from single-point failures, such as disk crashes or hardware malfunctions.[1] By keeping exact replicas of data sets readily accessible, it allows for seamless failover to a surviving disk, minimizing downtime in critical environments like enterprise servers and financial systems.[3] Unlike periodic backups, which capture snapshots at scheduled intervals and require restoration time, disk mirroring delivers real-time redundancy for immediate access to current data.[1] This distinction makes mirroring suitable for applications demanding continuous operation rather than long-term archival protection. Disk mirroring was first conceptualized in the 1970s within fault-tolerant computing systems, notably through innovations by Tandem Computers, which introduced commercial implementations in 1977 to support non-stop transaction processing.[3]

Basic Mechanism
Disk mirroring maintains identical copies of data across multiple physical disks configured as a mirror set, where each disk in the set holds a complete duplicate of the stored information. In a fundamental duplex configuration, two disks form the mirror set, with each serving as an identical copy of the data. This setup ensures that data blocks are synchronized without striping or interleaving, relying instead on direct replication to provide redundancy.[6] The write process in disk mirroring begins when a host system issues a write request for a specific data block. The storage controller then simultaneously directs the write operation to all disks in the mirror set, replicating the block on each one in parallel to maintain consistency. For instance, in a two-disk duplex setup, the controller sends the identical data and address to both disks, completing the operation only after confirmation from all involved disks, thereby ensuring no divergence in content. This simultaneous replication step is the core of mirroring's redundancy mechanism.[7][8] Read operations retrieve data from any available disk within the mirror set, allowing flexibility in access. In normal conditions, the controller may distribute read requests across the disks to balance load and improve performance. In a simple two-disk topology, this means the read can come from either disk, with the choice often based on proximity or current workload to optimize response times. Mirror sets can scale to triplex configurations with three disks or more, where writes propagate to all members and reads draw from any, further enhancing availability through additional replicas.[6][8]
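As a rough illustration of the write and read paths just described, the following Python sketch models a mirror set (a toy model, not any vendor's implementation; the dictionary-backed disks and round-robin read policy are assumptions): a write returns only after every member holds the block, while a read may be served by any member.

```python
import itertools

class MirrorSet:
    """A toy model of an n-way mirror set (RAID 1)."""

    def __init__(self, disks):
        self.disks = list(disks)  # every member holds a complete copy
        self._rr = itertools.cycle(range(len(self.disks)))

    def write(self, block_addr, data):
        # Replicate the block to every member; return only once all hold it,
        # so the copies never diverge.
        for disk in self.disks:
            disk[block_addr] = data

    def read(self, block_addr):
        # Any member can satisfy the read; simple round-robin stands in for a
        # real controller's load- or seek-aware choice of disk.
        return self.disks[next(self._rr)][block_addr]

# Usage: plain dicts stand in for physical disks in a two-way (duplex) set.
mirror = MirrorSet([{}, {}])
mirror.write(42, b"payload")
assert mirror.read(42) == b"payload"
```

Implementation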
Hardware Approaches
Hardware-based disk mirroring, commonly implemented through RAID level 1 (RAID 1), relies on dedicated controllers that manage data replication at the firmware level, ensuring that writes to one disk are simultaneously duplicated to a mirror disk without involving the host operating system.[9] These controllers, often in the form of PCIe cards or integrated RAID-on-Chip (ROC) solutions on motherboards, feature their own processors and memory to handle all mirroring operations independently.[10] Dedicated hardware RAID 1 cards, such as LSI MegaRAID or Broadcom Tri-Mode controllers, offload mirroring tasks from the host CPU, resulting in lower overhead and allowing the processor to focus on application workloads.[11] This offloading enables faster I/O processing, as the controller can access mirrored data from either disk concurrently, maintaining performance during reads while providing redundancy.[9] In integrated setups, motherboard RAID controllers like those in Cisco UCS B-Series servers support RAID 1 volumes with 2 disks, configured via BIOS utilities for seamless firmware-level management.[9] Enterprise storage arrays exemplify advanced hardware mirroring, with systems from Dell EMC using PERC controllers to duplicate data across physical disks in RAID 1 configurations for high-availability environments.[12] Similarly, NetApp E-Series arrays incorporate dual controllers per enclosure to facilitate hardware-managed mirroring, ensuring data replication across drives in fault-tolerant setups.[13] A key feature of these hardware approaches is support for hot-swappable drives, where failed disks can be replaced without system interruption, as the controller automatically rebuilds the mirror using background resynchronization.[10]

Software Approaches
Software-based disk mirroring implements redundancy by duplicating data writes across multiple storage devices at the operating system or application level, typically without relying on specialized hardware controllers. This approach leverages kernel drivers or modules to intercept input/output (I/O) operations and replicate them to mirror devices, enabling fault tolerance on commodity hardware. Unlike hardware mirroring, software methods offer greater configurability, as they can be dynamically adjusted through administrative tools without physical reconfiguration.[14] In operating system-level implementations, software mirroring is often managed through built-in utilities that create virtual devices aggregating physical disks. For instance, in Linux, the mdadm tool configures RAID 1 arrays using the Multiple Devices (md) driver, which writes identical data to paired disks for redundancy while allowing reads from either for improved performance.[15] Similarly, ZFS provides native mirroring within its storage pools, where vdevs (virtual devices) can be configured as mirrors to ensure data integrity across multiple disks, integrating seamlessly with its copy-on-write mechanism.[16] On Windows, Dynamic Disks support mirroring via the Disk Management console, converting basic disks to dynamic volumes that duplicate data across partners for fault tolerance.[17] These OS-level tools operate by hooking into the block I/O layer; for example, Linux's md driver uses kernel modules to intercept write requests and synchronously propagate them to all mirrors before acknowledging completion.[18] Application-level mirroring extends redundancy to specific workloads, often at the file system or database layer. The Logical Volume Manager (LVM) in Linux enables mirrored logical volumes by spanning physical extents across devices and replicating data through the device-mapper framework, providing flexibility for resizing and snapshotting alongside mirroring.[19] A key advantage of software mirroring is its adaptability in virtualized environments, where hypervisors like VMware vSphere or Microsoft Hyper-V can provision mirrored storage pools on virtual disks, facilitating live migration and resource pooling without hardware dependencies.[20] This contrasts with hardware offloading, which may limit options in shared virtual infrastructures but offers lower CPU overhead for high-throughput scenarios.
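As a concrete illustration of the OS-level tools mentioned above, the sketch below wraps the standard mdadm and ZFS command-line utilities from Python (the device names are placeholders, the two commands are independent alternatives rather than steps of one procedure, and both require root privileges):

```python
import subprocess

def run(cmd):
    # Print and execute an administrative command.
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Linux md RAID 1: assemble /dev/md0 as a mirror of two placeholder partitions.
run(["mdadm", "--create", "/dev/md0",
     "--level=1", "--raid-devices=2",
     "/dev/sdb1", "/dev/sdc1"])

# ZFS alternative: create a pool whose single vdev is a two-disk mirror
# (shown on different placeholder disks).
run(["zpool", "create", "tank", "mirror", "/dev/sdd", "/dev/sde"])
```

Operational Aspects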
Synchronization Processes
Initial synchronization in disk mirroring, also known as the initial mirror or rebuild process, involves copying all data from the primary disk to the secondary disk during setup to establish redundancy. This process ensures that both disks contain identical copies before the mirror becomes operational, typically initiated when creating a new mirrored array with one pre-existing disk containing data. In Linux's device-mapper RAID implementation, this is triggered by the "sync" parameter during array creation, resulting in a full data transfer that populates the secondary disk.[21] Ongoing resynchronization maintains data consistency between mirrored disks by addressing minor discrepancies that may arise from events such as power losses or transient errors, without requiring a full rebuild. This process can be periodic, checking for inconsistencies at predefined intervals, or event-triggered, such as after an unclean shutdown where partial writes might leave mirrors out of alignment. Research on software RAID resynchronization has proposed mechanisms like journal-guided verification to target only affected blocks, replaying outstanding write intentions from the file system's journal to repair inconsistencies efficiently. For instance, after a crash, the system scans the journal to identify and verify modified regions, rewriting data only where discrepancies exist between mirrors.[22] Bitmap tracking enhances resynchronization efficiency by maintaining metadata that records which disk blocks have changed since the last synchronization point, allowing the system to skip unchanged regions during alignment. This write-intent bitmap divides the disk into fixed-size chunks (e.g., 4 KB regions) and sets bits for modified areas, enabling targeted updates rather than scanning the entire volume. In RAID 1 implementations, such as those using file system integration, the bitmap can be combined with discard notifications or interval trees for tracking unused blocks, further optimizing the process by avoiding unnecessary reads and writes on free space. Internal bitmaps in Linux dm-raid, for example, are stored on the disks themselves and managed by a background daemon that updates them during normal operations.[23][21] The duration of resynchronization is proportional to the disk size and the rate of data changes, with full initial syncs on large drives often taking several hours due to the need to copy terabytes of data sequentially. Bitmap-assisted resyncs significantly reduce this time by limiting operations to altered blocks; for example, in a 2 TiB RAID 1 array with 43% utilization, traditional full sync might require 5 hours, while bitmap-optimized methods can complete in about 2 hours by skipping unused areas. In journal-guided approaches for ongoing resyncs, times scale with the journal size rather than the full array, dropping from minutes to seconds for small change sets in multi-gigabyte arrays.[23][22]
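The write-intent bitmap idea can be sketched as follows (Python, purely illustrative; the chunk size and the device objects are assumptions, not the md or dm-raid on-disk format): writes first set the bit covering the affected region, and a resynchronization after an unclean shutdown copies only the chunks whose bits are set.

```python
CHUNK = 4096  # bytes covered by one bitmap bit (assumed region size)

class WriteIntentBitmap:
    def __init__(self):
        self.dirty = set()  # indices of chunks with writes in flight

    def mark(self, offset, length):
        # Set the bit(s) covering the written byte range before issuing the write.
        for chunk in range(offset // CHUNK, (offset + length - 1) // CHUNK + 1):
            self.dirty.add(chunk)

    def clear(self, chunk):
        self.dirty.discard(chunk)  # cleared once both mirrors are known to match

def resync(primary, secondary, bitmap):
    # After an unclean shutdown, copy only the flagged chunks from the primary
    # to the secondary instead of re-reading the entire volume.
    for chunk in sorted(bitmap.dirty):
        data = primary.read(chunk * CHUNK, CHUNK)
        secondary.write(chunk * CHUNK, data)
        bitmap.clear(chunk)
```

Failure Handling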
Failure detection in disk mirroring systems relies on multiple monitoring mechanisms to identify disk issues before or during failure. Self-Monitoring, Analysis, and Reporting Technology (SMART) attributes, such as reallocated sector count and error rates, provide predictive indicators of impending drive failure by tracking internal diagnostics like read/write errors and spin-up retries.[24] In RAID implementations, controllers or software layers detect failures through I/O error logs, such as "No such device or address" messages, and validate data integrity using checksums to identify silent corruption or read failures on mirrored copies.[25] Hardware controllers generate alerts for persistent read/write errors, marking the affected disk as failed and notifying administrators via system events or lights-out management interfaces.[26] Upon detecting a failure in a mirrored pair, the failover process automatically redirects all I/O operations to the surviving mirror, ensuring continued read/write access without interruption in well-configured setups. This transparent switchover is managed by the RAID controller or software stack, which degrades the array to a non-redundant state but maintains data availability as long as the remaining disk functions.[27] For example, in Linux LVM RAID 1, the volume remains operational on the healthy device, with policies like raid_fault_policy=allocate attempting to reallocate resources dynamically.[25]
After replacing the failed disk, recovery involves inserting the new drive and initiating a rebuild, where data is copied from the active mirror to resynchronize the array and restore redundancy. This process, often called resynchronization, can be full (copying all data) or optimized (syncing only changed regions tracked by the driver), and progress is monitored through tools showing sync percentages.[25] In Solaris Volume Manager, for instance, the metasync command handles this, logging events and allowing resumption after interruptions.[27]
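A minimal sketch of the degrade-and-rebuild cycle described above (Python, illustrative; the device objects and error type are assumptions, and no real RAID stack exposes this interface): a read falls back to a surviving member when one disk errors out, and a replacement disk is rebuilt by copying every block from the healthy member.

```python
class DegradedMirror:
    def __init__(self, disks):
        self.disks = list(disks)  # healthy members of the mirror set
        self.failed = []          # members removed after persistent I/O errors

    def read(self, block_addr):
        # Try members in turn; an error degrades the array, but the read still
        # succeeds as long as one surviving copy remains.
        for disk in list(self.disks):
            try:
                return disk.read(block_addr)
            except IOError:
                self.disks.remove(disk)
                self.failed.append(disk)  # mark the member as failed
        raise IOError("no surviving mirror copy")

    def rebuild(self, new_disk, total_blocks):
        # Resynchronize a replacement disk from a surviving member, then add it
        # back to the mirror set to restore redundancy.
        source = self.disks[0]
        for block in range(total_blocks):
            new_disk.write(block, source.read(block))
        self.disks.append(new_disk)
```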
In multi-mirror configurations like RAID 10, which stripes data across multiple mirrored pairs, the system can tolerate multiple simultaneous disk failures without data loss, provided no two failed drives belong to the same mirror pair; failure of both drives in any single pair results in total data unavailability due to the absence of parity reconstruction.[28]
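The survivability rule for RAID 10 can be checked with a short sketch (Python, illustrative; the disk labels are placeholders): a set of failures is survivable exactly when no mirror pair has lost both of its members.

```python
def raid10_survives(pairs, failed):
    """pairs: list of (disk, disk) mirror pairs; failed: set of failed disk IDs."""
    return all(not (a in failed and b in failed) for a, b in pairs)

pairs = [("d0", "d1"), ("d2", "d3"), ("d4", "d5")]
print(raid10_survives(pairs, {"d0", "d2"}))  # True: failures hit different pairs
print(raid10_survives(pairs, {"d2", "d3"}))  # False: both members of one pair lost
```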
