Multi-chassis link aggregation group
from Wikipedia

A multi-chassis link aggregation group (MLAG or MC-LAG) is a type of link aggregation group (LAG) with constituent ports that terminate on separate chassis, primarily for the purpose of providing redundancy in the event one of the chassis fails. The IEEE 802.1AX-2008 industry standard for link aggregation does not mention MC-LAG, but does not preclude it. Its implementation varies by vendor; notably, the protocol for coordination between chassis is proprietary.

Background


A LAG is a method of inverse multiplexing over multiple Ethernet links, thereby increasing bandwidth and providing redundancy. It is defined by the IEEE 802.1AX-2008 standard, which states, "Link Aggregation allows one or more links to be aggregated together to form a Link Aggregation Group, such that a MAC client can treat the Link Aggregation Group as if it were a single link."[1] This layer 2 transparency is achieved by the LAG using a single MAC address for all of the device's ports in the LAG. A LAG can be configured as either static or dynamic. Dynamic LAG uses a peer-to-peer protocol, the Link Aggregation Control Protocol (LACP), for control; LACP is also defined within the 802.1AX-2008 standard.
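The single-logical-link behaviour described above is usually realized by hashing frame header fields onto a member port. The Python sketch below illustrates the idea only; real switches use simpler XOR/CRC-style hashes in hardware over vendor-selected fields rather than SHA-256, and the function name and field choice here are illustrative assumptions.

```python
import hashlib

def select_member_link(src_mac: str, dst_mac: str, src_ip: str, dst_ip: str,
                       num_links: int) -> int:
    """Pick a LAG member link by hashing frame header fields.

    Frames of the same conversation always hash to the same link, which
    preserves per-flow ordering while spreading load across the bundle.
    """
    key = f"{src_mac}|{dst_mac}|{src_ip}|{dst_ip}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_links

# Example: a four-link LAG; each flow maps deterministically to one of links 0-3.
print(select_member_link("00:aa:00:00:00:01", "00:bb:00:00:00:02",
                         "10.0.0.1", "10.0.0.2", 4))
```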

Multi-chassis


MC-LAG adds node-level redundancy to the link-level redundancy that a normal LAG provides. This allows two or more nodes to share a common LAG endpoint: the multiple nodes present a single logical LAG to the remote end. MC-LAG implementations are vendor-specific, but cooperating chassis remain externally compliant with the IEEE 802.1AX-2008 standard.[2] Nodes in an MC-LAG cluster communicate to synchronize state and negotiate automatic switchovers in the event of failure. Some implementations also support administrator-initiated switchovers.

The diagram here shows four configurations:

Illustration comparing LAG to high-availability MLAG
  1. Switches A and B are each configured to group four discrete links (indicated in green) into a single logical link with four times the bandwidth. Standard LACP ensures that if any one of the links goes down, traffic is distributed among the remaining three.
  2. Switch A is replaced by two chassis, switches A1 and A2. They communicate between themselves using a proprietary protocol and are thereby able to masquerade as a single virtual switch A running a shared instance of LACP. Switch B is not aware that it is connected to more than one chassis.
  3. Switch B is also replaced by two chassis B1 and B2. If these switches are from a different vendor, they may use a different proprietary protocol between themselves. But virtual switches A and B still communicate using LACP.
  4. Crossing two links to form an X makes no difference logically, any more than crossing links in a normal LAG would. Physically, however, it provides much improved fault tolerance: if any one of the switches fails, LACP reconfigures paths in as little as a few seconds, and operation continues with paths existing between all sources and destinations, albeit with degraded bandwidth (a small connectivity check illustrating this is sketched below).
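A minimal, hypothetical model of configuration 4 makes the fault-tolerance argument concrete: whichever single chassis fails, two of the four crossed links survive, so virtual switch A can still reach virtual switch B.

```python
# Hypothetical model of configuration 4: virtual switch A (chassis A1, A2) and
# virtual switch B (chassis B1, B2) joined by four crossed links.
links = {("A1", "B1"), ("A1", "B2"), ("A2", "B1"), ("A2", "B2")}

def surviving_links(failed_chassis: str) -> set:
    """Links still usable after a single chassis fails."""
    return {link for link in links if failed_chassis not in link}

for failed in ("A1", "A2", "B1", "B2"):
    remaining = surviving_links(failed)
    # Any single failure leaves two of the four links, so A and B stay
    # connected, albeit at half the original bandwidth.
    print(failed, "->", sorted(remaining))
```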

Implementations


The following table lists known vendor implementations of MC-LAG, all of which are proprietary.

Vendor: Implementation name
ADVA Optical Networking: MC-LAG
Arista Networks: MLAG
Aruba Networks (formerly HP ProCurve): Distributed Trunking, under the Intelligent Resilient Framework switch clustering technology
Avaya: Distributed Split Multi-Link Trunking
Ruckus Networks (formerly Brocade): Multi-Chassis Trunking
Ciena: MC-LAG
Cisco Catalyst 6500: Multichassis EtherChannel (MEC) with Virtual Switching System (VSS)
Cisco Catalyst 3750 (and similar): Cross-Stack EtherChannel
Cisco Catalyst 9000: StackWise Virtual
Cisco Nexus: Virtual PortChannel (vPC), where a PortChannel is a regular LAG
Cisco IOS XR: mLACP (Multichassis Link Aggregation Control Protocol)
Cumulus Networks: MLAG (formerly CLAG)
Dell Networking (formerly Force10 Networks, formerly nCore): DNOS 6.x Virtual Port Channel (vPC) or Virtual Link Trunking
Edgecore Networks: MLAG[3]
Extreme Networks: MLAG (Multi Switch Link Aggregation Group)
Ericsson: MC-LAG (Multi Chassis Link Aggregation Group)
FS: MLAG
Fortinet: MC-LAG (Multi Chassis Link Aggregation Group)
H3C: Distributed Resilient Network Interconnect
Huawei: M-LAG
Juniper: MC-LAG
Lenovo Networking (formerly IBM): vLAG
Mellanox Technologies: MLAG
MikroTik: MLAG[4]
NEC: MC-LAG (OpenFlow to traditional network)
Nocsys: MLAG
Netgear: MLAG
Nokia (formerly Alcatel-Lucent): MC-LAG
Nortel: Split Multi-Link Trunking
Nuage Networks (from Nokia): MC-LAG, including MCS (Multi-chassis Sync)
Plexxi (now Aruba Networks): vLAG
Pluribus Networks (now Arista Networks): vLAG
UniFi: MC-LAG[5]
ZTE: MC-LAG

Alternatives


A link aggregation configuration is superior to the Spanning Tree Protocol because the load can be shared across all links during normal operation, whereas Spanning Tree Protocol must disable some links to prevent loops. Spanning Tree Protocol can also be slow to recover from a failure, while link aggregation typically recovers quickly.

IEEE 802.1aq (Shortest Path Bridging) is an alternative to MC-LAG that can be used for complex networks.[6]

TRILL (TRansparent Interconnection of Lots of Links) allows Ethernet to use an arbitrary topology, and enables per-flow pair-wise load splitting by way of Dijkstra's algorithm without configuration or user intervention.

References

from Grokipedia
A multi-chassis link aggregation group (MC-LAG) is a networking technology that extends traditional link aggregation, defined by the IEEE 802.3ad standard, to span multiple physical network switches or chassis, allowing a client device (such as a server or another switch) to form a single logical aggregated link with two peer devices for enhanced redundancy and bandwidth. This configuration enables active-active forwarding across the links, distributing traffic for load balancing while ensuring fast failover if one chassis or link fails, without relying on the Spanning Tree Protocol (STP) to manage loops.

MC-LAG implementations typically build on the Link Aggregation Control Protocol (LACP) from IEEE 802.3ad, which negotiates and manages the bundled links, but incorporate vendor-specific extensions for inter-chassis coordination. Peer chassis communicate via dedicated protocols, such as Juniper's Inter-Chassis Control Protocol (ICCP) or equivalent mechanisms, to synchronize link states, MAC tables, and forwarding decisions, ensuring consistent operation and preventing network disruptions. Although not defined in IEEE standards, which focus on single-device aggregation, MC-LAG is widely supported in enterprise and data center environments to provide redundant connectivity for dual-homed devices, supporting loop-free Layer 2 topologies and improving overall network resilience.

Vendor-specific variants include Juniper Networks' MC-LAG for EX Series and QFX switches, Cisco's multi-chassis LACP (mLACP) for IOS-based platforms, and NVIDIA's MLAG for Cumulus Linux, each tailored to specific hardware while adhering to core LACP principles for interoperability in dual-homed setups. These solutions address limitations of traditional single-chassis LAGs by enabling higher throughput, sub-second convergence times, and protection against single points of failure in critical infrastructures such as data centers and campus networks.

Fundamentals

Link aggregation, also known as port trunking or link bundling, involves combining multiple parallel full-duplex point-to-point physical links operating at the same speed into a single logical link known as a link aggregation group (LAG). This technique treats the aggregated links as one interface, providing increased aggregate bandwidth beyond that of a single link and redundancy in case of individual link failures. The primary standard governing link aggregation is IEEE 802.1AX (formerly IEEE 802.3ad), which defines the Link Aggregation Control Protocol (LACP) for dynamic negotiation and management of LAGs.

LACP operates by exchanging Link Aggregation Control Protocol Data Units (LACPDUs) between systems to automatically configure and maintain the aggregation. Devices can operate in active mode, where they initiate LACP by periodically sending LACPDUs, or passive mode, where they respond only to received LACPDUs without initiating exchanges. Key parameters include actor values (the local system's identifiers such as System ID, Key, Port Priority, and Port Number) and partner values (the corresponding remote system's parameters), which ensure compatibility and proper port selection during aggregation.

In single-chassis setups, link aggregation enhances performance through load balancing, where traffic is distributed across member links using hashing algorithms that consider frame header information such as source/destination IP addresses, MAC addresses, or TCP/UDP ports to select the outgoing physical link. This prevents any single link from becoming a bottleneck while maintaining frame order within conversations. It also provides fault tolerance by detecting link failures via LACP timeouts or loss of LACPDUs, triggering rapid failover (typically within milliseconds for link-down events) to redistribute traffic over the remaining active links without disrupting the logical interface.

The basic operational flow begins with LACP-enabled ports exchanging LACPDUs to advertise capabilities and parameters, followed by the selection of compatible ports based on matching actor and partner information. Selected ports are then aggregated into the LAG, enabling frame collection and distribution functions to treat the group as a unified link. Multi-chassis extensions evolve naturally from this foundation, enabling aggregation across multiple devices for larger-scale deployments.
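As a rough illustration of the actor/partner matching described above (not the actual 802.1AX state machines), the sketch below selects the local ports whose partner information reports a consistent remote system ID and key. Field names are simplified, and pairing local and partner records by index is an assumption made for brevity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LacpInfo:
    """Simplified subset of the actor/partner fields carried in an LACPDU."""
    system_id: str    # system priority plus MAC of the aggregating system
    key: int          # operational key; ports sharing a key may aggregate
    port_priority: int
    port_number: int

def may_aggregate(local_ports: list[LacpInfo],
                  partner_info: list[LacpInfo]) -> list[int]:
    """Return local port numbers that may join one LAG.

    Only ports whose partner reports the same remote system ID and key as the
    first partner record are selected, mirroring the idea that every member of
    a LAG must terminate on the same (logical) remote system.
    """
    if not partner_info:
        return []
    reference = (partner_info[0].system_id, partner_info[0].key)
    return [local.port_number
            for local, partner in zip(local_ports, partner_info)
            if (partner.system_id, partner.key) == reference]
```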

Rationale for Multi-Chassis Extension

Traditional single-chassis link aggregation groups (LAGs) are confined to ports within a single network device, creating a single point of failure where a chassis malfunction or maintenance downtime disrupts the entire aggregated link bundle. This limitation restricts scalability for redundancy, as extending connections across multiple devices typically requires the Spanning Tree Protocol (STP) to prevent loops, resulting in blocked redundant ports that halve available bandwidth and introduce convergence delays during failures. Multi-chassis link aggregation group (MLAG) addresses these issues by enabling active-active topologies across paired switches, allowing full utilization of aggregated bandwidth without STP-induced blocking or reconvergence delays.

In data center environments, MLAG supports top-of-rack (ToR) redundancy in which servers are dual-homed to two separate switches via LACP-bonded interfaces, ensuring continuous connectivity and load balancing during switch failures or upgrades without traffic interruption. This configuration eliminates the downtime risks associated with single-chassis maintenance, providing seamless failover and enhanced reliability for high-availability server clusters.

MLAG emerged in the late 2000s and early 2010s as a set of vendor-proprietary solutions to meet growing demands for non-blocking Layer 2 networks, extending protocols like LACP beyond single devices to support logical multi-homing. A key requirement is presenting the aggregated links as a single logical entity to downstream devices, achieved through inter-chassis synchronization of control planes and MAC addresses, which prevents traffic blackholing by ensuring consistent forwarding states across peers. Without this unified view, asymmetric traffic paths could lead to blackholing or loops, underscoring MLAG's role in enabling loop-free, resilient multi-device aggregation.

Technical Architecture

MLAG Components and Topology

A multi-chassis link aggregation group (MC-LAG) fundamentally consists of two peer devices, typically switches, that operate in tandem to provide redundancy and load balancing. These peers are interconnected via an inter-chassis link (ICL), which serves as a dedicated pathway for both control plane synchronization and data plane forwarding between them. Client devices, such as servers or downstream switches, form logical link aggregation groups (LAGs) that span both peer chassis, enabling active-active utilization of links without creating loops in the network topology.

The ICL is typically implemented as a high-bandwidth aggregated Ethernet link, often itself a LAG, to handle substantial traffic volumes and ensure resilience. It plays a crucial role in forwarding unknown unicast, multicast, and broadcast traffic that arrives on one peer but is destined for the other, while also synchronizing forwarding database (FDB) entries, such as MAC address tables, across the peers to maintain consistent Layer 2 forwarding behavior. This setup prevents blackholing during link failures and supports load balancing by allowing traffic to hash across multiple paths.

MC-LAG topologies are designed to support diverse network architectures, with dual-homed connections being the most common, where end devices like servers connect redundantly to both peer chassis via member links of a single LAG. In cascaded MC-LAG configurations, often used in spine-leaf fabrics, MC-LAG pairs at the leaf layer interconnect with upstream spine switches, enabling scalable east-west traffic flow without Spanning Tree Protocol (STP) blocking. Peer-link configurations further optimize these setups by directing intra-MC-LAG domain traffic efficiently, treating the pair as a unified entity for north-south and east-west communications.

From a logical perspective, the client device perceives the MC-LAG as a single LAG endpoint with aggregated bandwidth, oblivious to the distribution across two physical chassis, which facilitates seamless failover and hashing-based load distribution. Physically, however, the setup involves separate cabling from the client to each chassis, with the ICL bridging them to ensure synchronized operation and redundant paths. Extensions to the Link Aggregation Control Protocol (LACP), such as multi-chassis LACP (mLACP), enable negotiation of these distributed LAGs across peers.
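The FDB synchronization role of the ICL can be pictured with a small, hypothetical sketch: when one peer learns a MAC address, it informs the other peer, which installs the entry pointing at the shared MC-LAG bundle (or toward the ICL) so both chassis forward consistently. The class and interface names here are illustrative, not any vendor's protocol or message format.

```python
# Hypothetical sketch of FDB (MAC table) synchronization between MC-LAG peers.
# Real implementations exchange vendor-specific messages over the ICL; here a
# peer simply mirrors learn events so both chassis hold consistent entries.

class McLagPeer:
    def __init__(self, name: str):
        self.name = name
        self.fdb: dict[str, str] = {}   # MAC address -> outgoing interface
        self.peer: "McLagPeer | None" = None

    def learn(self, mac: str, interface: str, from_peer: bool = False):
        self.fdb[mac] = interface
        if not from_peer and self.peer is not None:
            # Propagate over the ICL; the peer points the MAC at the shared
            # MC-LAG bundle (or the ICL for orphan ports) rather than the
            # originating peer's local port.
            self.peer.learn(mac, f"mclag-via-{self.name}", from_peer=True)

a, b = McLagPeer("A1"), McLagPeer("A2")
a.peer, b.peer = b, a
a.learn("00:11:22:33:44:55", "ae0")   # learned locally on A1
print(b.fdb)                          # A2 now also knows where to forward
```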

Synchronization and Control Protocols

In multi-chassis link aggregation groups (MC-LAGs), control plane synchronization ensures that peer devices maintain consistent operational states, including configuration, roles, and failure detection, primarily through vendor-proprietary protocols transmitted over the inter-chassis link (ICL) or dedicated Layer 3 paths. Synchronization protocols are vendor-specific, with no overarching IEEE standard; examples include Juniper's Inter-Chassis Control Protocol (ICCP) over TCP for exchanging control information and Arista's TCP-based peer communication for state replication. These mechanisms support heartbeat detection, often using Bidirectional Forwarding Detection (BFD), to identify peer failures and trigger failover, typically achieving convergence in 1-10 seconds depending on configuration.

Data plane handling in MC-LAGs focuses on synchronizing forwarding information to avoid loops and duplicates, particularly through MAC learning across peers. Vendor implementations use dedicated messages over the ICL to propagate MAC address updates (learned or released) between peers, ensuring consistent forwarding tables without excessive network flooding. Orphan port management, for links that are not part of the MC-LAG, is achieved by synchronizing port states via control messages, allowing isolated ports to remain operational while preventing them from forwarding MC-LAG traffic during peer disruptions.

LACP extensions in MC-LAG environments modify standard IEEE 802.1AX parameters to present the peers as a single logical device, including the use of a shared system ID and key derived from a common MC-LAG domain identifier. In implementations like Cisco's mLACP, peers exchange configuration and state information to ensure identical keys and system IDs are advertised in LACP Data Units (LACPDUs) to downstream devices, supporting timeout handling for scenarios where ICL failure could isolate peers. During such timeouts, state machines enforce consistent port blocking or unblocking based on peer connectivity checks, preventing bidirectional forwarding loops.

Failure handling in MC-LAG prioritizes rapid recovery while maintaining consistency, with most implementations using active-active models where both peers forward symmetrically until a failure occurs. Upon detecting peer loss via missed heartbeats, the surviving peer isolates the failed device and reroutes traffic over its local ports, flushing remote MAC entries to avoid blackholing. Active-standby modes are optional and vendor-specific, with role selection based on priorities or configuration, and with mechanisms ensuring device isolation by blocking non-ICL ports during split-brain conditions to prevent asymmetric forwarding. These approaches, supported by redundant ICL paths, achieve sub-second to few-second convergence in controlled environments.
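The heartbeat-driven failure handling described above can be summarized in a short, hedged sketch. The interval, miss threshold, and the policy names returned here are illustrative assumptions; actual timers and split-brain policies are vendor-specific.

```python
import time

HEARTBEAT_INTERVAL = 1.0   # seconds; vendors typically allow roughly 1-3 s
MISS_THRESHOLD = 3         # missed hellos before declaring the peer down

class PeerMonitor:
    """Hedged sketch of heartbeat-driven failover logic in an MC-LAG peer."""

    def __init__(self):
        self.last_heard = time.monotonic()
        self.icl_up = True

    def on_heartbeat(self):
        self.last_heard = time.monotonic()

    def evaluate(self) -> str:
        missed = (time.monotonic() - self.last_heard) / HEARTBEAT_INTERVAL
        if missed < MISS_THRESHOLD:
            return "forward-active-active"
        if self.icl_up:
            # Heartbeats lost but the ICL is alive: treat as a control-plane
            # fault and raise an alarm rather than reconverging immediately.
            return "hold-and-alarm"
        # Heartbeats and ICL both lost: possible split brain. A common policy
        # is for the secondary peer to block its MC-LAG member ports while the
        # primary keeps forwarding, avoiding duplicate or looping traffic.
        return "isolate-if-secondary"
```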

Implementation Details

Vendor-Specific Variants

Multi-chassis link aggregation group (MLAG) implementations vary across vendors, with each introducing proprietary extensions to standard LACP (IEEE 802.3ad) for enhanced redundancy and synchronization in dual-chassis topologies. Cisco's virtual PortChannel (vPC) technology, launched in 2009 for the Nexus 5000 Series switches, extends LACP through multi-chassis LACP (mLACP), incorporating system priority mechanisms for peer device election to ensure consistent link handling across chassis. This approach supports active-active forwarding in data center environments, where downstream devices perceive a single logical port channel connected to two physical switches.

Juniper Networks' MC-LAG, integrated into Junos OS since the early 2010s, relies on the Inter-Chassis Control Protocol (ICCP) for control-plane synchronization between peer devices, enabling state sharing for MAC addresses, ARP entries, and LACP parameters. ICCP operates over TCP/IP to maintain forwarding consistency and prevent loops, with recommendations for inter-chassis link (ICL) sizing based on traffic volume to avoid bottlenecks, typically 20-40% of aggregate client bandwidth. This implementation emphasizes carrier-grade reliability in enterprise and service provider networks.

Arista Networks' MLAG uses an Inter-Switch Link (MLAG-ISL) for peer connectivity, suiting low-latency environments common in cloud data centers through rapid failover and minimal overhead. For Layer 3 gateway redundancy, Arista employs Virtual ARP (VARP), an active-active mechanism that shares a virtual IP and MAC address across MLAG peers without requiring protocol elections, reducing convergence time compared to traditional VRRP.

Other vendors offer specialized MLAG variants tailored to open or enterprise ecosystems. NVIDIA's Cumulus Linux implements MLAG via the clagd daemon, which detects peers using Link Layer Discovery Protocol (LLDP) advertisements and manages bond states to support LACP across white-box switches in disaggregated fabrics. MikroTik introduced MLAG support in RouterOS v7 (beta 7.1, June 2021), enabling LACP bonds across two devices via ICCP-like synchronization and dedicated peer ports, primarily for cost-effective SMB deployments. Ubiquiti's MC-LAG, available on UniFi ECS-Aggregation switches since 2024, provides redundancy for high-density 25G/100G uplinks in campus networks, pairing two switches to present a unified LAG to downstream devices.

Over time, MLAG has evolved toward standards-based alternatives like EVPN for broader interoperability, yet proprietary elements persist in vendor implementations to address failure scenarios through custom synchronization and election logic, ensuring reliability in non-standard topologies as of 2025.

Configuration and Deployment Best Practices

Configuring a multi-chassis link aggregation group (MLAG) begins with enabling the feature on both peer switches, which must be identical models running compatible software versions to ensure interoperability. The inter-chassis link (ICL), also known as the peer link, is configured as a link aggregation group (LAG) using multiple high-bandwidth interfaces, such as at least two 10-Gbps ports spanning different line cards for redundancy. For client-facing bundles, the Link Aggregation Control Protocol (LACP) is enabled with matching system IDs and keys across peers to form multi-chassis LAGs, often assigned unique MLAG IDs (e.g., 1-65535). Heartbeat intervals for peer communication are tuned to 1-3 seconds, balancing detection speed with stability; for example, Cisco's default is 1 second, while Arista recommends 2.5 seconds or up to 10 seconds on certain platforms. Vendor-specific commands vary, such as Arista's mlag configuration mode or Cumulus Linux's clag-id assignment.

Best practices emphasize provisioning the ICL with bandwidth equivalent to 20-40% of the aggregate client link capacity to handle failover traffic without oversubscription, in line with recommendations to allocate half the single-connected bandwidth in Cumulus Linux deployments. Dedicated VLANs or interfaces are used for control traffic, such as VLAN 4094 in Arista or management interfaces in Cisco, to isolate heartbeats and synchronization from data flows. Continuous monitoring for asymmetric forwarding is essential, achieved through commands like show mlag in Arista or show vpc in Cisco, to detect and correct imbalances that could degrade performance. Consistent Spanning Tree Protocol (STP) settings across peers, including enabling BPDU Guard, help prevent loops in hybrid environments.

Deployment considerations include rigorous testing of failover scenarios using traffic generators to simulate link failures and verify sub-second convergence, ensuring no traffic loss beyond 50 milliseconds in optimized setups. In leaf-spine architectures, MLAG scales by deploying it at the leaf layer for dual-homing servers, supporting up to 128,000 virtual members with enhanced convergence features in platforms like Juniper's EX Series. Integration with software-defined networking (SDN) controllers, such as those using NVUE in Cumulus Linux, automates MLAG provisioning alongside VXLAN overlays for dynamic scaling.

Troubleshooting common issues like ICL congestion, which can lead to temporary loops or blackholing, involves monitoring utilization with tools like show interfaces and applying quality of service (QoS) policies to prioritize control traffic or adjusting load-balancing hashing algorithms (e.g., source-destination IP and Layer 4 ports). In open-source implementations, recent enhancements in Cumulus Linux 5.14 (2024-2025 releases) for EVPN-MLAG hybrids introduce Ethernet Segment Identifiers (ESIs) and designated forwarder (DF) election for all-active redundancy, addressing legacy MLAG limitations in Clos fabrics through commands like nv set evpn multihoming enable on. Logs such as /var/log/clagd.log in Cumulus Linux or ICCP status output in Junos OS help diagnose failures.
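Because so many of these settings must match exactly on both peers, a simple pre-deployment consistency check is a common automation step. The sketch below is hypothetical (the field names are not any vendor's schema) and also applies the 20-40% ICL sizing rule of thumb mentioned above.

```python
# Hypothetical pre-deployment check that the two peers' MLAG-relevant settings
# match; keys are illustrative, not any vendor's actual configuration schema.

REQUIRED_MATCH = ("lacp_system_id", "mlag_domain_id", "vlans", "stp_mode")

def consistency_errors(peer_a: dict, peer_b: dict) -> list[str]:
    errors = [f"{key}: {peer_a.get(key)!r} != {peer_b.get(key)!r}"
              for key in REQUIRED_MATCH
              if peer_a.get(key) != peer_b.get(key)]
    # Rough ICL sizing rule of thumb from the text: 20-40% of client bandwidth.
    client_bw = peer_a.get("client_bandwidth_gbps", 0)
    icl_bw = peer_a.get("icl_bandwidth_gbps", 0)
    if client_bw and icl_bw < 0.2 * client_bw:
        errors.append(f"ICL {icl_bw} Gbps is under 20% of client {client_bw} Gbps")
    return errors

# Example: flag a VLAN mismatch and an undersized ICL before deployment.
print(consistency_errors(
    {"lacp_system_id": "00:1c:73:aa:bb:cc", "mlag_domain_id": 1,
     "vlans": [10, 20], "stp_mode": "mstp",
     "client_bandwidth_gbps": 400, "icl_bandwidth_gbps": 40},
    {"lacp_system_id": "00:1c:73:aa:bb:cc", "mlag_domain_id": 1,
     "vlans": [10], "stp_mode": "mstp"}))
```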

Benefits and Challenges

Key Advantages

Multi-chassis link aggregation group (MC-LAG) offers enhanced redundancy by enabling active-active forwarding across multiple chassis, allowing traffic to continue uninterrupted during chassis or link failures without relying on the Spanning Tree Protocol (STP). This setup achieves sub-second failover times through mechanisms like backup liveness detection and Bidirectional Forwarding Detection (BFD), resulting in near-zero traffic disruption in properly configured environments.

MC-LAG maximizes bandwidth utilization by aggregating and actively using all member links across both chassis, eliminating the port-blocking limitations of traditional STP deployments. For instance, connecting a client device via two 40 Gbps links to separate switches in an MC-LAG pair provides an effective 80 Gbps of bidirectional bandwidth, with hash-based load balancing distributing flows to prevent hotspots and ensure even utilization.

The technology simplifies operations by presenting a single logical switch to upstream and downstream devices, streamlining Layer 2 protocol interactions such as LACP and reducing configuration complexity in data center environments. This logical unification supports non-blocking topologies optimized for east-west traffic patterns, where server-to-server communication predominates, without the overhead of managing redundant paths individually.

Quantitative benefits include significantly faster Layer 2 convergence compared to traditional LACP combined with STP, as MC-LAG avoids STP's reconvergence delays, often in the range of seconds, while providing node-level redundancy. Vendor implementations support high-density environments, including 400 Gbps Ethernet interfaces on modern platforms as of 2025, enabling scalable deployments for demanding workloads.

Limitations and Mitigation Strategies

Multi-chassis link aggregation groups (MC-LAGs) introduce significant configuration complexity due to the need for precise state and configuration synchronization between peer switches, including the setup of inter-chassis links (ICLs) and heartbeat mechanisms to maintain operational consistency. This overhead arises from the requirement to configure identical policies, VLANs, and forwarding tables across both chassis, often manually or through vendor-specific scripts, increasing the risk of misconfiguration in large deployments. A critical dependency on ICL reliability exacerbates this complexity, as the ICL serves as the primary conduit for control plane synchronization and data forwarding during failover scenarios; failure or partition of the ICL can lead to a split-brain condition in which both peers independently forward traffic, causing duplicate MAC addresses, loops, or blackholing. In such split-brain situations, heartbeat failures, typically detected via protocols like Bidirectional Forwarding Detection (BFD), may not immediately isolate the faulty peer, potentially resulting in network instability until manual intervention.

Scalability in MC-LAG deployments is inherently limited, with most implementations restricted to exactly two peer switches and lacking native support for multi-peer configurations that could extend across larger clusters. This pairwise limitation stems from the design of the inter-chassis synchronization protocols, which do not scale efficiently beyond dual-chassis setups without introducing additional complexity. Furthermore, ICLs can become bottlenecks in high-throughput environments, as all synchronization and failover traffic between peers must traverse these links, constraining overall fabric capacity in scenarios with dense server connectivity or heavy east-west flows.

Interoperability challenges further compound these issues, as MC-LAG implementations often rely on proprietary extensions to the standard Link Aggregation Control Protocol (LACP), leading to vendor lock-in and limited cross-vendor compatibility despite partial adherence to IEEE 802.1AX. For instance, features like ICL management and split-brain prevention vary significantly between vendors, complicating multi-vendor environments. Recent efforts in open networking, such as the integration of MC-LAG support in the Software for Open Networking in the Cloud (SONiC) distribution, aim to address this through standardized models, with updates in 2025 enhancing compatibility across white-box hardware from multiple suppliers.

To mitigate these limitations, deploying redundant ICLs configured in active-active mode distributes load and provides alternate paths, reducing the risk of single-link failures triggering split-brain events; vendors such as Arista recommend building the ICL from at least two interfaces to ensure sufficient bandwidth and reliability for the deployment scale. Software-defined monitoring tools that leverage model-driven telemetry over structured data models enable automated validation of peer states and ICL health, allowing proactive detection of inconsistencies without manual intervention. For scalability beyond two chassis, hybrid approaches combining MC-LAG with BGP-EVPN extend redundancy to multi-device fabrics, using Ethernet Segment Identifiers (ESIs) to synchronize MAC learning and avoid ICL bottlenecks in larger topologies.

Alternatives

Device Stacking Technologies

Device stacking technologies enable the interconnection of multiple physical network switches through proprietary high-speed stacking links, allowing them to operate as a single logical device with a unified control plane. For example, Cisco's StackWise technology uses dedicated stack ports to form a ring topology, providing up to 480 Gbps of stacking bandwidth across interconnected switches. Similarly, Juniper Networks' Virtual Chassis configuration links switches via Virtual Chassis Ports (VCPs), typically using high-speed Ethernet interfaces, to create a cohesive switching fabric managed as one entity.

Key features of stacking include master election for centralized control, where one switch acts as the active master handling stack-wide decisions while the others function as members; support for hitless software upgrades through mechanisms like Stateful Switchover (SSO); and unified management via a single IP address and configuration interface. These systems support link aggregation groups (LAGs), but because the entire stack behaves as a single logical chassis, such groups are not multi-chassis aggregations in the MC-LAG sense. Juniper's implementation adds redundant Routing Engines for resiliency, enabling features like Graceful Routing Engine Switchover (GRES) to maintain operations during failures.

Stacking is commonly deployed for redundancy and simplification at the access layer in campus networks, where switches in wiring closets can be interconnected to provide port expansion and resiliency without complex configurations. This contrasts with multi-chassis link aggregation group (MLAG) approaches, which are more oriented toward data center environments requiring redundancy across independent chassis. Modern evolutions in stacking, such as Juniper's Virtual Chassis supporting up to 10 units in certain EX Series configurations, enhance scalability while maintaining a single-logical-device model. However, these technologies are inherently limited by the need for physical proximity due to stacking cable lengths, typically up to a few meters, and lack native support for active-active client LAGs spanning non-adjacent or non-stacked peers, restricting their use in geographically dispersed setups.
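Master election rules differ by vendor, but a generic pattern is to prefer the highest configured priority and break ties with the lowest MAC address. The sketch below illustrates only that generic pattern, not any specific product's algorithm.

```python
# Generic illustration of stack master election (not any vendor's exact rules):
# prefer the highest configured priority, break ties with the lowest MAC.

def elect_master(members: list[dict]) -> dict:
    """members: [{'name': ..., 'priority': int, 'mac': 'aa:bb:...'}, ...]"""
    return min(members, key=lambda m: (-m["priority"], m["mac"]))

stack = [
    {"name": "sw1", "priority": 10, "mac": "00:1a:00:00:00:02"},
    {"name": "sw2", "priority": 15, "mac": "00:1a:00:00:00:09"},
    {"name": "sw3", "priority": 15, "mac": "00:1a:00:00:00:01"},
]
print(elect_master(stack)["name"])   # sw3: highest priority, lowest MAC
```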

Distributed Control Plane Protocols

Distributed control plane protocols provide scalable alternatives to traditional multi-chassis link aggregation group (MLAG) designs by enabling redundancy and load balancing across multiple devices in a fabric without relying on proprietary inter-chassis links. Ethernet VPN (EVPN), standardized in RFC 7432, employs the Border Gateway Protocol (BGP) as its control plane and Virtual Extensible LAN (VXLAN) as its data plane to support multi-homing, where endpoints connect redundantly to any number of provider edge devices. This approach uses an Ethernet Segment Identifier (ESI) to represent a multi-homed Ethernet segment, allowing link aggregation group (LAG)-like behavior across distributed devices in a standards-based manner. Unlike MLAG, which is typically limited to two chassis, EVPN's ESI mechanism scales to larger fabrics by synchronizing state via BGP updates, eliminating the need for direct chassis-to-chassis synchronization.

Key features of EVPN include anycast gateways for distributed Layer 3 forwarding, where multiple devices share the same IP and MAC addresses to provide seamless mobility and redundancy for hosts. Symmetric Integrated Routing and Bridging (IRB) enables efficient Layer 3 routing within the overlay by using the same next hop for both bridging and routing, reducing head-end replication. MAC and IP address learning occurs through BGP EVPN route type 2 (MAC/IP advertisement routes), which carry host reachability information including ARP suppression to minimize flooding, while type 5 routes (IP prefix routes) advertise subnets for inter-subnet routing across data centers. These capabilities support fabric-wide scalability beyond MLAG's dual-chassis constraint, with EVPN deployments in cloud-scale data centers emerging in the early 2010s and becoming widespread by the mid-2010s for multi-tenant virtualization.

EVPN's underlay/overlay separation further decouples the control plane from physical topology, using an IP fabric (often BGP or OSPF) as the underlay to reach VXLAN endpoints, thus avoiding MLAG's dependency on dedicated inter-chassis links (ICLs) and enhancing resilience in large-scale environments. By 2025, EVPN-VXLAN has evolved to support 800G Ethernet interfaces in spine-leaf architectures, accommodating hyperscale demands for AI and high-throughput workloads without architectural overhauls. This separation allows independent scaling of the underlay for raw connectivity and the overlay for virtualized services, as outlined in RFC 8365 for network virtualization overlays.

Other protocols offering fabric-wide redundancy include Transparent Interconnection of Lots of Links (TRILL), defined in RFC 6325, which uses Intermediate System to Intermediate System (IS-IS) link-state routing to provide multipath forwarding and eliminate loops in Ethernet fabrics. Similarly, Shortest Path Bridging (SPB), standardized as IEEE 802.1aq, extends IS-IS to Ethernet for shortest-path multipath redundancy, supporting VLAN-agnostic forwarding across mesh topologies. However, both TRILL and SPB have seen limited adoption as direct MLAG replacements compared to EVPN, primarily due to the latter's integration with modern overlay technologies like VXLAN in data center environments.
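The ESI-based multi-homing idea can be illustrated with a simplified data structure: two leaf switches advertise EVPN type-2 routes for the same host on the same Ethernet segment, and remote devices treat the two next hops as one multi-homed segment. The fields and values below are illustrative only, not a full encoding of the route format.

```python
from dataclasses import dataclass

@dataclass
class EvpnMacIpRoute:
    """Simplified view of an EVPN type-2 (MAC/IP advertisement) route."""
    esi: str          # Ethernet Segment Identifier shared by the multi-homed leaves
    mac: str
    ip: str | None
    vni: int          # VXLAN network identifier
    next_hop: str     # VTEP address of the advertising device

# Two leaves advertising the same host on the same Ethernet segment: remote
# devices see one segment (the ESI) reachable via two next hops, giving
# LAG-like multi-homing without an inter-chassis link.
routes = [
    EvpnMacIpRoute("00:11:11:11:11:11:11:11:11:11", "00:50:56:aa:bb:cc",
                   "10.1.1.10", 10100, "192.0.2.1"),
    EvpnMacIpRoute("00:11:11:11:11:11:11:11:11:11", "00:50:56:aa:bb:cc",
                   "10.1.1.10", 10100, "192.0.2.2"),
]
next_hops = {r.next_hop for r in routes if r.esi == routes[0].esi}
print(sorted(next_hops))   # both VTEPs are valid paths to the host
```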
