Multi-chassis link aggregation group
A multi-chassis link aggregation group (MLAG or MC-LAG) is a type of link aggregation group (LAG) whose constituent ports terminate on separate chassis, primarily to provide redundancy in the event one of the chassis fails. The IEEE 802.1AX-2008 industry standard for link aggregation does not mention MC-LAG, but does not preclude it. Implementations vary by vendor; notably, the protocol for coordination between chassis is proprietary.
Background
A LAG is a method of inverse multiplexing over multiple Ethernet links, thereby increasing bandwidth and providing redundancy. It is defined by the IEEE 802.1AX-2008 standard, which states, "Link Aggregation allows one or more links to be aggregated together to form a Link Aggregation Group, such that a MAC client can treat the Link Aggregation Group as if it were a single link."[1] This layer-2 transparency is achieved by the LAG using a single MAC address for all the device's ports in the LAG group. A LAG can be configured as either static or dynamic. Dynamic LAG uses a peer-to-peer protocol, called the Link Aggregation Control Protocol (LACP), for control; LACP is also defined in the 802.1AX-2008 standard.
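As an illustration of the load-sharing behavior described above, the following minimal Python sketch hashes a flow's header fields onto one member link, so frames within a conversation keep their order while different flows spread across the group. It is a simplification: real switches hash in hardware over configurable header fields, and the function and link names here are hypothetical.

```python
import zlib

def select_member_link(flow: tuple, active_links: list[str]) -> str:
    """Pick one active member link for a flow by hashing its header fields.

    Pinning a whole flow to one link preserves frame order within that
    conversation; distinct flows spread across the group.
    """
    digest = zlib.crc32(repr(flow).encode())
    return active_links[digest % len(active_links)]

links = ["eth1", "eth2", "eth3", "eth4"]
flow = ("10.0.0.1", "10.0.0.2", 49152, 443)  # src IP, dst IP, src port, dst port
chosen = select_member_link(flow, links)
print(chosen)  # the same flow always maps to the same link

# On a link failure the member leaves the active set and flows rehash onto
# the survivors: redundancy at the cost of some per-flow placement churn.
links.remove(chosen)
print(select_member_link(flow, links))
```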
Multi-chassis
MC-LAG adds node-level redundancy to the link-level redundancy that a normal LAG provides. It allows two or more nodes to share a common LAG endpoint: the multiple nodes present a single logical LAG to the remote end. MC-LAG implementations are vendor-specific, but cooperating chassis remain externally compliant with the IEEE 802.1AX-2008 standard.[2] Nodes in an MC-LAG cluster communicate to synchronize state and to negotiate automatic switchovers in the event of failure. Some implementations also support administrator-initiated switchovers.
The accompanying diagram shows four configurations:

- Switches A and B are each configured to group four discrete links (indicated in green) into a single logical link with four times the bandwidth. Standard LACP ensures that if any one of the links goes down, traffic is distributed among the remaining three.
- Switch A is replaced by two chassis, switches A1 and A2. They communicate between themselves using a proprietary protocol and are thereby able to masquerade as a single virtual switch A running a shared instance of LACP (see the sketch after this list). Switch B is not aware that it is connected to more than one chassis.
- Switch B is also replaced by two chassis, B1 and B2. If these switches are from a different vendor, they may use a different proprietary protocol between themselves, but virtual switches A and B still communicate using standard LACP.
- Crossing two links to form an X makes no difference logically, any more than crossing links in a normal LAG would, but physically it provides much improved fault tolerance: if any one of the switches fails, LACP reconfigures paths in as little as a few seconds, and operation continues with paths between all sources and destinations, albeit with degraded bandwidth.
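To make the masquerade in the second configuration concrete, here is a minimal Python sketch of how switch B groups its ports by the (system ID, key) pair received in LACPDUs; because chassis A1 and A2 advertise the same virtual system ID, all four links land in one LAG. The field names are a hypothetical simplification of the IEEE 802.1AX LACPDU layout.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class ActorInfo:
    """Subset of LACPDU actor fields relevant to aggregation (hypothetical model)."""
    system_id: str  # system identifier advertised by the sender
    key: int        # operational key; same (system_id, key) means "may aggregate"

def group_into_lags(received: dict[str, ActorInfo]) -> dict[tuple, list[str]]:
    """Group switch B's local ports into LAGs keyed by the partner's identity."""
    lags: dict[tuple, list[str]] = defaultdict(list)
    for local_port, actor in received.items():
        lags[(actor.system_id, actor.key)].append(local_port)
    return dict(lags)

# A1 and A2 both advertise the shared virtual system ID of "switch A".
shared_id = "02:00:00:00:00:0a"
pdus = {
    "b1": ActorInfo(shared_id, key=10),  # link to A1
    "b2": ActorInfo(shared_id, key=10),  # link to A1
    "b3": ActorInfo(shared_id, key=10),  # link to A2
    "b4": ActorInfo(shared_id, key=10),  # link to A2
}
print(group_into_lags(pdus))  # all four ports form a single logical LAG
```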
Implementations
The following table lists known vendor implementations of MC-LAG, all of which are proprietary.
| Vendor | Implementation Name |
|---|---|
| ADVA Optical Networking | MC-LAG |
| Arista Networks | MLAG |
| Aruba Networks (formerly HP ProCurve) | Distributed Trunking under Intelligent Resilient Framework switch clustering technology |
| Avaya | Distributed Split Multi-Link Trunking |
| Ruckus Networks (formerly Brocade) | Multi-Chassis Trunking |
| Ciena | MC-LAG |
| Cisco Catalyst 6500 | Multichassis EtherChannel (MEC) with Virtual Switching System (VSS) |
| Cisco Catalyst 3750 (and similar) | Cross-Stack EtherChannel |
| Cisco Catalyst 9000 | StackWise Virtual |
| Cisco Nexus | Virtual PortChannel (vPC), where a PortChannel is a regular LAG |
| Cisco IOS XR | mLACP (Multichassis Link Aggregation Control Protocol) |
| Cumulus Networks | MLAG (formerly CLAG) |
| Dell Networking (formerly Force10 Networks, formerly nCore) | DNOS6.x Virtual Port Channel (vPC) or Virtual Link Trunking |
| Edgecore Networks | MLAG[3] |
| Extreme Networks | MLAG (Multi Switch Link Aggregation Group) |
| Ericsson | MC-LAG (Multi Chassis Link Aggregation Group) |
| FS | MLAG |
| Fortinet | MC-LAG (Multi Chassis Link Aggregation Group) |
| H3C | Distributed Resilient Network Interconnect |
| Huawei | M-LAG |
| Juniper | MC-LAG |
| Lenovo Networking (formerly IBM) | vLAG |
| Mellanox Technologies | MLAG |
| MikroTik | MLAG[4] |
| NEC | MC-LAG (OpenFlow to traditional network) |
| Nocsys | MLAG |
| Netgear | MLAG |
| Nokia (formerly Alcatel-Lucent) | MC-LAG |
| Nortel | Split multi-link trunking |
| Nuage Networks (from Nokia) | MC-LAG, including MCS (Multi-chassis Sync) |
| Plexxi (now Aruba Networks) | vLAG |
| Pluribus Networks (now Arista Networks) | vLAG |
| UniFi | MC-LAG[5] |
| ZTE | MC-LAG |
Alternatives
Link aggregation is superior to the Spanning Tree Protocol (STP) in that load can be shared across all links during normal operation, whereas STP must disable some links to prevent loops. STP can also incur a delay when recovering from failure, while link aggregation typically recovers quickly.
IEEE 802.1aq (Shortest Path Bridging) is an alternative to MC-LAG that can be used for complex networks.[6]
TRILL (TRansparent Interconnection of Lots of Links) allows Ethernet to use an arbitrary topology and enables per-flow pair-wise load splitting by way of Dijkstra's algorithm, without configuration or user intervention.
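As a rough illustration of the shortest-path computation that TRILL's link-state control plane performs, here is a textbook Dijkstra sketch in Python over a hypothetical four-bridge topology; it is not TRILL's actual implementation, which computes paths from IS-IS link-state data.

```python
import heapq

def dijkstra(graph: dict[str, dict[str, int]], source: str) -> dict[str, int]:
    """Textbook Dijkstra: shortest-path cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, already relaxed via a shorter path
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Hypothetical four-bridge topology with unit link costs.
topology = {
    "rb1": {"rb2": 1, "rb3": 1},
    "rb2": {"rb1": 1, "rb4": 1},
    "rb3": {"rb1": 1, "rb4": 1},
    "rb4": {"rb2": 1, "rb3": 1},
}
print(dijkstra(topology, "rb1"))  # {'rb1': 0, 'rb2': 1, 'rb3': 1, 'rb4': 2}
```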
References
[edit]- ^ IEEE. IEEE 802.1AX-2008. IEEE.
- ^ Bhagat, Amit N. "Multichassis Link Aggregation Group". Google Knowledge Base. Retrieved 15 March 2012.
- ^ "Aviz offers Networking 3.0". Retrieved 2025-04-06.
- ^ "Multi-chassis Link Aggregation Group". MikroTik.
- ^ "Enterprise Aggregate Switches". UniFi.
- ^ Mike Fratto (2011-03-07). "When MLAG Is Good Enough". Network Computing.
Fundamentals
Link Aggregation Overview
Link aggregation, also known as port trunking or link bundling, combines multiple parallel full-duplex point-to-point physical links operating at the same speed into a single logical link known as a Link Aggregation Group (LAG). This technique treats the aggregated links as one interface, providing increased aggregate bandwidth beyond that of a single link and redundancy in case of individual link failures.[4]

The primary standard governing link aggregation is IEEE 802.1AX (formerly IEEE 802.3ad), which defines the Link Aggregation Control Protocol (LACP) for dynamic negotiation and management of LAGs. LACP operates by exchanging Link Aggregation Control Protocol Data Units (LACPDUs) between systems to automatically configure and maintain the aggregation. Devices can operate in active mode, where they initiate LACP negotiation by periodically sending LACPDUs, or passive mode, where they respond only to received LACPDUs without initiating exchanges. Key parameters include actor values (the local system's identifiers, such as System ID, Key, Port Priority, and Port Number) and partner values (the corresponding remote system's parameters), which ensure compatibility and proper port selection during negotiation.[4][5]

In single-chassis setups, link aggregation enhances performance through load balancing, where traffic is distributed across member links using hashing algorithms that consider frame header information such as source/destination IP addresses, MAC addresses, or TCP/UDP ports to select the outgoing physical link. This prevents any single link from becoming a bottleneck while maintaining frame order within conversations. It also provides fault tolerance: link failures are detected via LACP timeouts or loss of LACPDUs, triggering rapid failover (typically within milliseconds for link-down events) that redistributes traffic over the remaining active links without disrupting the logical interface.[6][4]

The basic operational flow begins with LACP-enabled ports exchanging LACPDUs to advertise capabilities and parameters, followed by the selection of compatible ports based on matching actor and partner information. Selected ports are then aggregated into the LAG, enabling frame collection and distribution functions to treat the group as a unified link. Multi-chassis extensions evolve naturally from this foundation, providing redundancy across multiple devices for larger-scale deployments.[4]
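To make the port-selection step concrete, the following minimal Python sketch checks whether two local ports may join the same LAG based on the actor and partner identities exchanged in LACPDUs. The structures are hypothetical simplifications, not the 802.1AX state machines.

```python
from dataclasses import dataclass

@dataclass
class PortState:
    """Simplified per-port LACP view (hypothetical; not the full 802.1AX state)."""
    actor_system: str    # local system ID advertised in our LACPDUs
    actor_key: int       # local operational key
    partner_system: str  # system ID learned from the peer's LACPDUs
    partner_key: int     # key learned from the peer's LACPDUs

def can_aggregate(a: PortState, b: PortState) -> bool:
    """Two local ports may join the same LAG only if they belong to the same
    local aggregation (same actor system/key) and their peers agree (same
    partner system/key), as learned from exchanged LACPDUs."""
    return (a.actor_system, a.actor_key) == (b.actor_system, b.actor_key) and \
           (a.partner_system, a.partner_key) == (b.partner_system, b.partner_key)

p1 = PortState("aa:aa", 5, "bb:bb", 7)
p2 = PortState("aa:aa", 5, "bb:bb", 7)
p3 = PortState("aa:aa", 5, "cc:cc", 7)  # different partner: belongs in a separate LAG
print(can_aggregate(p1, p2), can_aggregate(p1, p3))  # True False
```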
Rationale for Multi-Chassis Extension

Traditional single-chassis link aggregation groups (LAGs) are confined to ports within a single network device, creating a single point of failure: a chassis malfunction or maintenance downtime disrupts the entire aggregated link bundle.[7] This limitation restricts scalability for redundancy, as extending connections across multiple devices typically requires the Spanning Tree Protocol (STP) to prevent loops, resulting in blocked redundant ports that halve available bandwidth and introduce convergence delays during failures.[8][1] Multi-chassis link aggregation group (MLAG) addresses these issues by enabling active-active topologies across paired switches, allowing full utilization of aggregated bandwidth without STP-induced blocking or reconvergence delays.[9]

In data center environments, MLAG supports top-of-rack (ToR) redundancy in which servers are dual-homed to two separate switches via LACP-bonded interfaces, ensuring continuous connectivity and load balancing during switch failures or upgrades without traffic interruption.[7] This configuration eliminates the downtime risks associated with single-chassis maintenance, providing seamless failover and enhanced reliability for high-availability server clusters.[8]

MLAG emerged in the early 2000s as a set of vendor-proprietary solutions to meet growing data center demands for non-blocking Layer 2 networks, extending protocols like LACP beyond single devices to support logical multi-homing.[10] A key requirement is presenting the aggregated links as a single logical entity to downstream devices, achieved through inter-chassis synchronization of control planes and MAC addresses, which prevents traffic blackholing by ensuring consistent forwarding states across peers.[1][7] Without this unified view, asymmetric traffic paths could lead to packet loss or loops, underscoring MLAG's role in enabling loop-free, resilient multi-device aggregation.[8]

Technical Architecture
MLAG Components and Topology
A multi-chassis link aggregation group (MC-LAG) fundamentally consists of two peer chassis, typically switches, that operate in tandem to provide redundancy and load balancing. These peers are interconnected via an inter-chassis link (ICL), which serves as a dedicated pathway for both control-plane synchronization and data-plane forwarding between them. Client devices, such as servers or downstream switches, form logical link aggregation groups (LAGs) that span both peer chassis, enabling active-active utilization of links without creating loops in the network topology.[11][12]

The ICL is typically implemented as a high-bandwidth aggregated Ethernet link, often itself a LAG, to handle substantial traffic volumes and ensure resilience. It plays a crucial role in forwarding unknown unicast, multicast, and broadcast traffic that arrives on one peer destined for the other, while also synchronizing forwarding database (FDB) entries, such as MAC address tables, across the peers to maintain consistent Layer 2 forwarding behavior. This setup prevents traffic blackholing during link failures and supports load balancing by allowing traffic to hash across multiple paths.[11][12][2]

MC-LAG topologies support diverse network architectures. Dual-homed connections are the most common, with end devices such as servers connecting redundantly to both peer chassis via member links of a single LAG. In cascaded MC-LAG configurations, often used in spine-leaf fabrics, MC-LAG pairs at the leaf layer interconnect with upstream spine switches, enabling scalable east-west traffic flow without spanning tree protocol (STP) blocking. Peer-link configurations further optimize these setups by directing intra-MC-LAG domain traffic efficiently, treating the pair as a unified entity for north-south and east-west communications.[11][12]

From a logical perspective, the client device perceives the MC-LAG as a single LAG endpoint with aggregated bandwidth, oblivious to the distribution across two physical chassis, which facilitates seamless failover and hashing-based load distribution. Physically, however, the setup involves separate cabling from the client to each chassis, with the ICL bridging them to ensure synchronized operations and redundant paths. Extensions to the Link Aggregation Control Protocol (LACP), such as multi-chassis LACP (mLACP), enable negotiation of these distributed LAGs across peers.[11][12][2]
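As a rough illustration of the FDB synchronization carried over the ICL, here is a toy Python sketch, with invented message handling and port names (real implementations use vendor protocols such as ICCP), in which a MAC address learned on one peer is installed on the other so both forward consistently.

```python
class MlagPeer:
    """Toy model of one MC-LAG peer's MAC table (hypothetical, for illustration)."""

    def __init__(self, name: str):
        self.name = name
        self.fdb: dict[str, str] = {}  # MAC address -> outgoing port
        self.peer = None               # set to the other MlagPeer after pairing

    def learn(self, mac: str, port: str) -> None:
        """Learn a MAC locally, then sync it over the ICL so the peer
        points the same MAC at its own member of the shared MLAG port."""
        self.fdb[mac] = port
        if self.peer is not None:
            self.peer.fdb.setdefault(mac, port)  # same logical MLAG port ID

# Two peers joined by an ICL; "mlag1" names the shared logical LAG port.
a, b = MlagPeer("A1"), MlagPeer("A2")
a.peer, b.peer = b, a
a.learn("00:11:22:33:44:55", "mlag1")
print(b.fdb)  # {'00:11:22:33:44:55': 'mlag1'}: consistent forwarding on both peers
```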
Synchronization and Control Protocols

In multi-chassis link aggregation groups (MC-LAGs), control plane synchronization ensures that peer devices maintain consistent operational states, including configuration, roles, and failure detection, primarily through vendor-proprietary protocols transmitted over the inter-chassis link (ICL) or dedicated Layer 3 paths. Synchronization protocols are vendor-specific, with no overarching IEEE standard; examples include Juniper's Inter-Chassis Control Protocol (ICCP) over TCP for exchanging control information and Arista's TCP-based peer communication for state replication. These mechanisms support heartbeat detection, often using Bidirectional Forwarding Detection (BFD), to identify peer failures and trigger failover, typically achieving convergence in 1-10 seconds depending on configuration.[11][12][2]

Data plane handling in MC-LAGs focuses on synchronizing forwarding information to avoid loops and duplicates, particularly through MAC address learning synchronization across peers. Vendor implementations use dedicated messages over the ICL to propagate MAC address updates (learned or released) between peers, ensuring consistent forwarding tables without excessive network flooding. Orphan port management, for links not part of the MC-LAG, is achieved by synchronizing port states via control messages, allowing isolated ports to remain operational while preventing them from forwarding MC-LAG traffic during peer disruptions.[11][12][13]

LACP extensions in MC-LAG environments modify standard IEEE 802.1AX parameters to present the peers as a single logical device, including the use of a shared system ID and actor key derived from a common MC-LAG domain identifier. In implementations like Cisco's mLACP, peers exchange configuration and state information to ensure identical actor keys and system IDs are advertised in LACP Data Units (LACPDUs) to downstream devices, supporting timeout handling for split-brain scenarios where ICL failure could isolate peers. During such timeouts, state machines enforce consistent port blocking or unblocking based on peer connectivity checks, preventing bidirectional forwarding loops.[2][11]

Failure modes in MC-LAG synchronization prioritize rapid recovery while maintaining traffic integrity. Most implementations use active-active models where both peers forward traffic symmetrically until a failure. Upon detecting peer downtime via missed heartbeats, the surviving peer isolates orphan devices and reroutes traffic using local ports, flushing remote MAC entries to avoid blackholing. Active-standby modes are optional and vendor-specific, with failover based on priorities or configuration, and with mechanisms ensuring orphan-device isolation by blocking non-ICL ports during downtime to prevent asymmetric routing. These approaches, supported by redundant ICL paths, achieve sub-second to few-second convergence in controlled environments.[11][12][13]
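The heartbeat and split-brain behavior described above can be sketched as follows in Python; the timer values and decision strings are hypothetical, and real implementations rely on protocols such as BFD with much finer-grained timers.

```python
import time

HEARTBEAT_INTERVAL = 1.0  # seconds between keepalives (hypothetical default)
DEAD_INTERVAL = 3.0       # missed-heartbeat window before declaring the peer down

class PeerMonitor:
    """Track peer liveness from received heartbeats."""

    def __init__(self):
        self.last_heartbeat = time.monotonic()

    def on_heartbeat(self) -> None:
        self.last_heartbeat = time.monotonic()

    def peer_alive(self) -> bool:
        return time.monotonic() - self.last_heartbeat < DEAD_INTERVAL

def reconcile(monitor: PeerMonitor, icl_up: bool) -> str:
    """Decide the local role. The subtle case is a down ICL with a live peer
    (split brain): one peer must stop offering MLAG links to avoid loops."""
    if monitor.peer_alive() and icl_up:
        return "forward on all MLAG ports (active-active)"
    if not monitor.peer_alive():
        return "peer down: take over, flush remote MACs, keep local ports up"
    return "split brain: secondary blocks its MLAG ports until the ICL recovers"

m = PeerMonitor()
print(reconcile(m, icl_up=True))   # healthy: both peers forward
print(reconcile(m, icl_up=False))  # ICL down but peer recently seen: split brain
```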
Implementation Details

Vendor-Specific Variants
Multi-chassis link aggregation group (MLAG) implementations vary across vendors, with each introducing proprietary extensions to standard LACP (IEEE 802.3ad) for enhanced redundancy and synchronization in dual-chassis topologies. Cisco's virtual PortChannel (vPC) technology, launched in 2009 for the Nexus 5000 Series switches, extends LACP through multi-chassis LACP (mLACP), incorporating system priority mechanisms for peer device election to ensure consistent link handling across chassis.[14] This approach supports active-active forwarding in data center environments, where downstream devices perceive a single logical port channel connected to two physical switches.[15]

Juniper Networks' MC-LAG, integrated into the Junos OS since the early 2010s, relies on the Inter-Chassis Control Protocol (ICCP) for control-plane synchronization between peer devices, enabling state sharing for MAC addresses, ARP entries, and LACP parameters.[16] ICCP operates over TCP/IP to maintain forwarding consistency and prevent loops, with recommendations for inter-chassis link (ICL) sizing based on traffic volume to avoid bottlenecks, typically 20-40% of aggregate client bandwidth. This implementation emphasizes carrier-grade reliability in enterprise and service provider networks.

Arista Networks' MLAG uses an Inter-Switch Link (MLAG-ISL) for peer connectivity, supporting low-latency environments common in cloud data centers through rapid failover and minimal overhead.[12] For Layer 3 gateway redundancy, Arista employs Virtual ARP (VARP), an active-active protocol that shares a virtual IP and MAC address across MLAG peers without requiring protocol elections, reducing convergence time compared to traditional VRRP.[17]

Other vendors offer specialized MLAG variants tailored to open or enterprise ecosystems. Nvidia's Cumulus Linux implements MLAG via the clagd daemon, which detects peers using Link Layer Discovery Protocol (LLDP) advertisements and manages bond states to support LACP across white-box switches in disaggregated fabrics.[3] MikroTik introduced MLAG support in RouterOS v7 (beta 7.1, June 2021), enabling LACP bonds across two devices via ICCP-like synchronization and dedicated peer ports, primarily for cost-effective SMB deployments. Ubiquiti's MC-LAG, available on UniFi ECS-Aggregation switches since 2024, facilitates redundancy for high-density 25G/100G uplinks in campus networks, pairing two switches to present a unified LAG to downstream devices.[18]

Over time, MLAG has evolved toward standards-based alternatives like EVPN multihoming for broader scalability, yet proprietary elements persist in vendor implementations to address split-brain scenarios through custom keepalive and election logic, ensuring reliability in non-standard topologies as of 2025.[19][20]

Configuration and Deployment Best Practices
Configuring a multi-chassis link aggregation group (MLAG) begins with enabling the feature on both peer switches, which must be identical models running compatible software versions to ensure synchronization.[3][12][21] The inter-chassis link (ICL), also known as the peer-link, is configured as a link aggregation group (LAG) using multiple high-bandwidth interfaces, such as at least two 10-Gbps ports spanning different line cards for redundancy.[12][21] For client-facing bundles, Link Aggregation Control Protocol (LACP) is enabled with matching system IDs and keys across peers to form multi-chassis LAGs, often assigned unique MLAG IDs (e.g., 1-65535).[3][12] Heartbeat intervals for peer communication are tuned to 1-3 seconds, balancing detection speed with stability; for example, Cisco's default is 1 second, while Arista recommends 2.5 seconds or up to 10 seconds on certain platforms.[21][12] Vendor-specific commands vary, such as Arista's mlag configuration or Cumulus Linux's clag-id assignment.[12][3]
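The peer-matching requirements above lend themselves to an automated consistency check. The following Python sketch is hypothetical (the parameter names are invented for illustration); real platforms run equivalent checks internally, such as the consistency checks Cisco Nexus performs for vPC.

```python
from dataclasses import dataclass

@dataclass
class MlagPeerConfig:
    """Hypothetical subset of per-peer MLAG settings that must align."""
    software_version: str
    mlag_domain_id: str
    lacp_system_id: str       # shared virtual system ID advertised downstream
    heartbeat_interval_s: float

def check_consistency(a: MlagPeerConfig, b: MlagPeerConfig) -> list[str]:
    """Return human-readable mismatches that would block MLAG formation."""
    problems = []
    for field in ("software_version", "mlag_domain_id",
                  "lacp_system_id", "heartbeat_interval_s"):
        va, vb = getattr(a, field), getattr(b, field)
        if va != vb:
            problems.append(f"{field}: {va!r} != {vb!r}")
    return problems

peer1 = MlagPeerConfig("10.4.2", "dc1-pod3", "02:1c:73:00:00:01", 1.0)
peer2 = MlagPeerConfig("10.4.2", "dc1-pod3", "02:1c:73:00:00:01", 2.5)
print(check_consistency(peer1, peer2))  # flags the heartbeat-interval mismatch
```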
Best practices emphasize provisioning the ICL with bandwidth equivalent to 20-40% of the aggregate client link capacity to handle failover traffic without oversubscription, as seen in recommendations to allocate half the single-connected bandwidth in Cumulus Linux deployments.[3][21] Dedicated VLANs or interfaces are used for control traffic, such as VLAN 4094 in Arista or management interfaces in Cisco, to isolate heartbeats and synchronization from data flows.[12][21] Continuous monitoring for asymmetric routing is essential, achieved through commands like show mlag in Arista or show vpc in Cisco, to detect and correct imbalances that could degrade performance.[12][21] Consistent Spanning Tree Protocol (STP) settings across peers, including enabling BPDU Guard, prevent loops in hybrid environments.[3][1]
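As a worked example of the ICL sizing guideline above (the 20-40% figure comes from this section; the port counts and speeds below are hypothetical):

```python
# Hypothetical leaf pair: 48 dual-homed client ports at 25 Gbps each per switch.
client_ports = 48
port_speed_gbps = 25
aggregate_client_gbps = client_ports * port_speed_gbps  # 1200 Gbps per switch

# Guideline: provision the ICL at 20-40% of aggregate client capacity.
icl_low = 0.20 * aggregate_client_gbps   # 240 Gbps
icl_high = 0.40 * aggregate_client_gbps  # 480 Gbps

# e.g., met by an ICL LAG of three to five 100 Gbps links between the peers.
print(f"ICL target: {icl_low:.0f}-{icl_high:.0f} Gbps")
```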
Deployment considerations include rigorous testing of failover scenarios using traffic generators to simulate link failures and verify sub-second convergence, ensuring no packet loss beyond 50 milliseconds in optimized setups.[21][11] In leaf-spine architectures, MLAG scales by deploying it at the leaf layer for dual-homing servers, supporting up to 128,000 virtual members with enhanced convergence features in platforms like Juniper's EX Series.[11] Integration with software-defined networking (SDN) controllers, such as those using NVUE in Cumulus Linux, automates MLAG provisioning alongside VXLAN overlays for dynamic scaling.[3]
Troubleshooting common issues like ICL congestion, which can lead to temporary loops or blackholing, involves monitoring utilization with tools like show interfaces and applying quality of service (QoS) policies to prioritize control traffic or adjust load-balancing hashing algorithms (e.g., source-destination IP and L4 ports).[21][12] In open-source implementations, recent enhancements in Cumulus Linux 5.14 (2024-2025 releases) for EVPN-MLAG hybrids introduce Ethernet Segment Identifiers (ESIs) and DF election for all-active redundancy, resolving legacy MLAG limitations in Clos fabrics through commands like nv set evpn multihoming enable on.[22] Logs such as /var/log/clagd.log in Cumulus or ICCP status in Juniper help diagnose synchronization failures.[3][11]
