Advanced eXtensible Interface
from Wikipedia

The Advanced eXtensible Interface (AXI) is an on-chip communication bus protocol and is part of the Advanced Microcontroller Bus Architecture specification (AMBA).[1][2] AXI is royalty-free and its specification is freely available from ARM.

AMBA AXI specifies many optional signals, which can be included depending on the specific requirements of the design,[3] making AXI a versatile bus for numerous applications.

While the communication over an AXI bus is between a single initiator and a single target, the specification includes detailed descriptions and signals to include N:M interconnects, able to extend the bus to topologies with multiple initiators and targets.[4]

AXI3 was introduced in 2003 with the AMBA3 specification. In 2010, a new revision of AMBA, AMBA4, defined the AXI4, AXI4-Lite and AXI4-Stream protocols. AMBA AXI4, AXI4-Lite and AXI4-Stream have been adopted by Xilinx and many of its partners as a main communication bus in their products.[5][6] AMBA5 with AXI5 was released in 2022, adding atomicity, data protection, and cache operations. A new ACE (AXI Coherency Extension) is specified.[7]

Thread IDs

Thread IDs allow a single initiator port to support multiple threads. Each thread has in-order access to the AXI address space, but transactions with different thread IDs issued from the same initiator port may complete out of order with respect to each other. For instance, when one thread ID is blocked by a slow peripheral, another thread ID may continue independently of the first. As another example, one thread on a CPU may be assigned a thread ID for its memory accesses through a particular initiator port, such as read addr1, write addr1, read addr1; this sequence completes in order because every transaction carries the same initiator port thread ID. Another thread running on the CPU may be assigned a different initiator port thread ID; its memory accesses are likewise in order, but may be intermixed with the first thread ID's transactions.[8]
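The ordering rule described above can be illustrated with a small check. This is only a sketch: the function name and the (thread ID, tag) representation are hypothetical and are not part of the AXI specification.

```python
from collections import defaultdict


def respects_id_ordering(issued, completed):
    """Return True if transactions sharing a thread ID complete in issue order.

    issued / completed are lists of (thread_id, tag) tuples. The tag is a
    hypothetical label used only to tell transactions apart in this model.
    Transactions with different thread IDs may complete in any relative order.
    """
    issue_order = defaultdict(list)
    for thread_id, tag in issued:
        issue_order[thread_id].append(tag)

    completion_order = defaultdict(list)
    for thread_id, tag in completed:
        completion_order[thread_id].append(tag)

    # Per thread ID, the completion sequence must equal the issue sequence.
    return issue_order == completion_order


issued = [(0, "t0"), (1, "u0"), (0, "t1"), (1, "u1")]
# Thread 1 overtakes thread 0: allowed, each thread is still in order.
print(respects_id_ordering(issued, [(1, "u0"), (1, "u1"), (0, "t0"), (0, "t1")]))
# Thread 0 completes t1 before t0: not allowed.
print(respects_id_ordering(issued, [(0, "t1"), (0, "t0"), (1, "u0"), (1, "u1")]))
```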

Thread IDs on an initiator port are not globally defined. An AXI switch with multiple initiator ports therefore prefixes the initiator port index to the thread ID internally and presents this concatenated thread ID to the target device. When the transaction returns, the switch uses the prefix to locate the initiator port of origin and then strips the prefix. This is why the target port thread ID is wider in bits than the initiator port thread ID.[9]
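A minimal sketch of this ID widening, assuming the port index is simply placed in the bits above the initiator-side ID. The function names and the `id_width` parameter are illustrative; real interconnects may encode the prefix differently.

```python
def to_target_id(port_index, initiator_id, id_width):
    """Prefix the initiator port index onto the initiator-side thread ID."""
    if not 0 <= initiator_id < (1 << id_width):
        raise ValueError("initiator_id does not fit in id_width bits")
    return (port_index << id_width) | initiator_id


def from_target_id(target_id, id_width):
    """Split a target-side ID back into (port_index, initiator_id)."""
    port_index = target_id >> id_width
    initiator_id = target_id & ((1 << id_width) - 1)
    return port_index, initiator_id


# Port 2, 4-bit initiator ID 0b0101 -> target-side ID 0b10_0101
wide_id = to_target_id(2, 0b0101, 4)
print(bin(wide_id), from_target_id(wide_id, 4))
```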

An AXI-Lite bus is an AXI bus that supports only a single thread ID per initiator. It is typically used for an endpoint that only needs to communicate with a single initiator device at a time, for example a simple peripheral such as a UART. In contrast, a CPU can initiate transactions to multiple peripherals and address spaces at a time, and supports more than one thread ID on its AXI initiator ports and AXI target ports. This is why a CPU typically supports a full-specification AXI bus. A typical example of a front-side AXI switch would include a full-specification AXI initiator connected to a CPU initiator, and several AXI-Lite targets connected to the AXI switch from different peripheral devices.[10]

(Additionally, the AXI-Lite bus is restricted to only support transaction lengths of a single data word per transaction.)

Handshake

Basic handshake mechanism of the AMBA AXI protocol. In this example, the destination entity waits for a high VALID to assert its own READY.

AXI defines a basic handshake mechanism consisting of an xVALID and an xREADY signal.[11] The xVALID signal is driven by the source to inform the destination entity that the payload on the channel is valid and can be read from that clock cycle onwards. Similarly, the xREADY signal is driven by the receiving entity to indicate that it is prepared to receive data.

When both the xVALID and xREADY signals are high in the same clock cycle, the data payload is considered transferred and the source can either provide a new data payload, by keeping xVALID high, or terminate the transmission, by de-asserting xVALID. An individual data transfer, that is, a clock cycle in which both xVALID and xREADY are high, is called a "beat".

Two main rules are defined for the control of these signals:

  • A source must not wait for a high xREADY to assert xVALID.
  • Once xVALID is asserted, a source must maintain the assertion until a handshake occurs.

Thanks to this handshake mechanism, both the source and the destination can control the flow of data, throttling the speed if needed.
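The handshake rules can be sketched as a cycle-by-cycle model of a single channel. The function and argument names below are hypothetical; the model assumes the source always has its next payload available and never waits for xREADY before asserting xVALID.

```python
def simulate_channel(payloads, ready_pattern):
    """Cycle-by-cycle model of one xVALID/xREADY channel.

    payloads:      items the source wants to send, in order.
    ready_pattern: list of booleans, the destination's xREADY in each cycle.
    Returns a list of (cycle, payload) for every beat that transferred.
    """
    beats = []
    index = 0
    for cycle, ready in enumerate(ready_pattern):
        # The source asserts xVALID whenever it has data, without waiting for xREADY.
        valid = index < len(payloads)
        if valid and ready:
            # Handshake: both signals high in the same cycle, so one beat transfers.
            beats.append((cycle, payloads[index]))
            index += 1
        # If valid and not ready, xVALID stays high and the payload is held.
    return beats


# The destination stalls in cycles 0 and 3, throttling the transfer.
print(simulate_channel(["a", "b", "c"], [False, True, True, False, True]))
```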

Channels

In the AXI specification, five channels are described:[12]

  • Read Address channel (AR)
  • Read Data channel (R)
  • Write Address channel (AW)
  • Write Data channel (W)
  • Write Response channel (B)

Other than some basic ordering rules,[13] each channel is independent of the others and has its own pair of xVALID/xREADY handshake signals.[14]

Figure: AXI Read Address and Read Data channels.
Figure: AXI Write Address, Write Data and Write Response channels.

AXI

Signals

Signals of the Read and Write Address channels

| Signal description | Write Address channel | Read Address channel |
|---|---|---|
| Address ID, to identify multiple streams over a single channel | AWID | ARID |
| Address of the first beat of the burst | AWADDR | ARADDR |
| Number of beats inside the burst | AWLEN[nb 1] | ARLEN[nb 1] |
| Size of each beat | AWSIZE | ARSIZE |
| Type of the burst | AWBURST | ARBURST |
| Lock type, to provide atomic operations | AWLOCK[nb 1] | ARLOCK[nb 1] |
| Memory type, how the transaction has to progress through the system | AWCACHE | ARCACHE |
| Protection type: privilege, security level and data/instruction access | AWPROT | ARPROT |
| Quality of service of the transaction | AWQOS[nb 2] | ARQOS[nb 2] |
| Region identifier, to access multiple logical interfaces from a single physical one | AWREGION[nb 2] | ARREGION[nb 2] |
| User-defined data | AWUSER[nb 2] | ARUSER[nb 2] |
| xVALID handshake signal | AWVALID | ARVALID |
| xREADY handshake signal | AWREADY | ARREADY |

Signals of the Read and Write Data channels

| Signal description | Write Data channel | Read Data channel |
|---|---|---|
| Data ID, to identify multiple streams over a single channel | WID[nb 3] | RID |
| Read/Write data | WDATA | RDATA |
| Read response, to specify the status of the current RDATA signal | | RRESP |
| Byte strobe, to indicate which bytes of the WDATA signal are valid | WSTRB | |
| Last beat identifier | WLAST | RLAST |
| User-defined data | WUSER[nb 2] | RUSER[nb 2] |
| xVALID handshake signal | WVALID | RVALID |
| xREADY handshake signal | WREADY | RREADY |

Signals of the Write Response channel

| Signal description | Write Response channel |
|---|---|
| Write response ID, to identify multiple streams over a single channel | BID |
| Write response, to specify the status of the burst | BRESP |
| User-defined data | BUSER[nb 2] |
| xVALID handshake signal | BVALID |
| xREADY handshake signal | BREADY |

[15]

  [nb 1] Different behavior between AXI3 and AXI4
  [nb 2] Available only with AXI4
  [nb 3] Available only with AXI3

Bursts

Example of FIXED, INCR and WRAP bursts

AXI is a burst-based protocol,[16] meaning that there may be multiple data transfers (or beats) for a single request. This is useful when a large amount of data has to be transferred from or to a specific pattern of addresses. In AXI, bursts can be of three types, selected by the signals ARBURST (for reads) or AWBURST (for writes):[17]

  • FIXED
  • INCR
  • WRAP

In FIXED bursts, each beat within the transfer has the same address. This is useful for repeated access at the same memory location, such as when reading or writing a FIFO.

In INCR bursts, on the other hand, each beat has an address equal to the previous one plus the transfer size. This burst type is commonly used to read or write sequential memory areas.

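WRAP bursts are similar to INCR bursts, as each beat has an address equal to the previous one plus the transfer size. However, with WRAP bursts, if the address of the current beat reaches the "Higher Address boundary", it is reset to the "Wrap boundary":

$$\text{Wrap boundary} = \left\lfloor \frac{\text{StartAddress}}{\text{NumberBytes} \times \text{BurstLength}} \right\rfloor \times (\text{NumberBytes} \times \text{BurstLength})$$

$$\text{Higher Address boundary} = \text{Wrap boundary} + (\text{NumberBytes} \times \text{BurstLength})$$

with NumberBytes the number of bytes in each beat ($2^{\text{AxSIZE}}$) and BurstLength the number of beats in the burst (AxLEN + 1).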
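The three burst types can be compared with a short address generator. This is a simplified sketch: it assumes the start address is aligned to the beat size, and it uses the usual definition of the wrap boundary (the start address rounded down to a multiple of the total burst size in bytes). The function and parameter names are illustrative.

```python
def burst_addresses(start, axsize, axlen, burst):
    """Return the address of every beat in a burst.

    start:  start address, assumed aligned to the beat size
    axsize: AxSIZE field, bytes per beat = 2 ** axsize
    axlen:  AxLEN field, number of beats = axlen + 1
    burst:  "FIXED", "INCR" or "WRAP"
    """
    number_bytes = 1 << axsize
    burst_length = axlen + 1

    if burst == "FIXED":
        # Every beat targets the same address (e.g. a FIFO).
        return [start] * burst_length
    if burst == "INCR":
        # Each beat is one transfer size above the previous one.
        return [start + i * number_bytes for i in range(burst_length)]
    if burst == "WRAP":
        total = number_bytes * burst_length
        wrap_boundary = (start // total) * total
        higher_boundary = wrap_boundary + total
        addresses, addr = [], start
        for _ in range(burst_length):
            addresses.append(addr)
            addr += number_bytes
            if addr == higher_boundary:
                addr = wrap_boundary  # wrap back to the lower boundary
        return addresses
    raise ValueError(f"unknown burst type: {burst}")


# Four 4-byte beats starting at 0x8
for kind in ("FIXED", "INCR", "WRAP"):
    print(kind, [hex(a) for a in burst_addresses(0x8, 2, 3, kind)])
```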

Transactions

Reads

Example of an AXI read transaction. The initiator requests 4 beats (ARLEN + 1[18]) of 4 Bytes each starting from address 0x0 with INCR type. The target returns 0x10 for address 0x0, 0x11 for address 0x4, 0x12 for address 0x8 and 0x13 for address 0xc, all with the OKAY status. Only the most relevant signals are shown here.

To start a read transaction, the initiator has to provide on the Read address channel:

  • the start address on ARADDR
  • the burst type, either FIXED, INCR or WRAP, on ARBURST (if present)
  • the burst length on ARLEN (if present).

Additionally, the other auxiliary signals, if present, are used to define more specific transfers.

After the usual ARVALID/ARREADY handshake, the target has to provide on the Read data channel:

  • the data corresponding to the specified address(es) on RDATA
  • the status of each beat on RRESP

plus any other optional signals. Each beat of the target's response is transferred with an RVALID/RREADY handshake and, on the last beat, the target has to assert RLAST to indicate that no more beats will follow without a new read request.
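The read sequence can be sketched from the target's side as follows. The memory is modelled as a plain dictionary and only INCR bursts are handled; the function name and the tuple layout are hypothetical. OKAY is encoded as 0b00, as in the AXI response encoding.

```python
OKAY = 0b00  # RRESP encoding for a successful beat


def read_burst(memory, araddr, arlen, arsize):
    """Model the Read data channel beats for one INCR read request.

    Returns one (rdata, rresp, rlast) tuple per beat; rlast is True only
    on the final beat of the burst.
    """
    beats = arlen + 1          # ARLEN encodes the number of beats minus one
    step = 1 << arsize         # bytes per beat
    response = []
    for i in range(beats):
        addr = araddr + i * step
        response.append((memory[addr], OKAY, i == beats - 1))
    return response


# Same values as the example figure: 4 beats of 4 bytes from address 0x0.
memory = {0x0: 0x10, 0x4: 0x11, 0x8: 0x12, 0xC: 0x13}
for rdata, rresp, rlast in read_burst(memory, araddr=0x0, arlen=3, arsize=2):
    print(hex(rdata), rresp, rlast)
```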

Writes

Example of an AXI write transaction. The initiator drives 4 beats (AWLEN + 1[18]) of 4 Bytes each starting from address 0x0 with INCR type, writing 0x10 for address 0x0, 0x11 for address 0x4, 0x12 for address 0x8 and 0x13 for address 0xc. The target returns 'OKAY' as write response for the whole transaction. Only the most relevant signals are shown here.

To start a write operation, the initiator has to provide both the address information and the data information.

The address information is provided over the Write address channel, in a similar manner to a read operation:

  • the start address has to be provided on AWADDR
  • the burst type, either FIXED, INCR or WRAP, on AWBURST (if present)
  • the burst length on AWLEN (if present)

and, if present, all the optional signals.

The initiator also has to provide the data related to the specified address(es) on the Write data channel:

  • the data on WDATA
  • the "strobe" bits on WSTRB (if present), which conditionally mark the individual WDATA bytes as "valid" or "invalid"

As in the read path, WLAST has to be asserted by the initiator on the last data word.

After both the address and the data transfers have completed, the target has to report the status of the write back to the initiator over the Write response channel, by returning the result on the BRESP signal.
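The effect of the WSTRB strobe bits on a single write beat can be sketched as a byte-lane merge. The helper below is illustrative only; it assumes strobe bit n qualifies byte lane n, that is, bits 8n+7 down to 8n of WDATA.

```python
def apply_write_beat(old_word, wdata, wstrb, bytes_per_word=4):
    """Merge WDATA into an existing word, updating only the strobed byte lanes."""
    result = old_word
    for lane in range(bytes_per_word):
        if (wstrb >> lane) & 1:
            mask = 0xFF << (8 * lane)
            # Replace this byte lane with the corresponding WDATA byte.
            result = (result & ~mask) | (wdata & mask)
    return result


# Only byte lanes 0 and 2 are marked valid, so lanes 1 and 3 keep their old value.
print(hex(apply_write_beat(0xAABBCCDD, 0x11223344, 0b0101)))
```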

Subsets

AXI-Lite

AXI4-Lite is a subset of the AXI4 protocol, providing a register-like structure with reduced features and complexity.[19] Notable differences are:

  • all bursts consist of exactly 1 beat
  • all data accesses use the full data bus width, which can be either 32 or 64 bits

AXI4-Lite removes part of the AXI4 signals but follows the AXI4 specification for the remaining ones. Being a subset of AXI4, AXI4-Lite transactions are fully compatible with AXI4 devices, permitting the interoperability between AXI4-Lite initiators and AXI4 targets without additional conversion logic.[20]
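The two restrictions listed above can be expressed as a small predicate on the AXI4 transaction attributes. This is a sketch with hypothetical names; it checks only burst length and beat size, not the other signals that AXI4-Lite omits.

```python
def is_axi4_lite_compatible(axlen, axsize, data_width_bits):
    """Check whether a transaction obeys the AXI4-Lite burst and size rules.

    axlen:           AxLEN field (beats minus one); must be 0 for a single beat
    axsize:          AxSIZE field; bits per beat = 8 * 2 ** axsize
    data_width_bits: width of the data bus, 32 or 64 for AXI4-Lite
    """
    single_beat = axlen == 0
    full_width = (8 << axsize) == data_width_bits
    return data_width_bits in (32, 64) and single_beat and full_width


print(is_axi4_lite_compatible(axlen=0, axsize=2, data_width_bits=32))  # single full-width beat
print(is_axi4_lite_compatible(axlen=3, axsize=2, data_width_bits=32))  # 4-beat burst
```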

Signals

| Write address channel | Write data channel | Write response channel | Read address channel | Read data channel |
|---|---|---|---|---|
| AWVALID | WVALID | BVALID | ARVALID | RVALID |
| AWREADY | WREADY | BREADY | ARREADY | RREADY |
| AWADDR | WDATA | BRESP | ARADDR | RDATA |
| AWPROT | WSTRB | | ARPROT | RRESP |

[21]

AXI-Stream

AXI4-Stream is a simplified, lightweight bus protocol designed specifically for high-speed streaming data applications. It supports only unidirectional data flow, without the need for addressing or complex handshaking. An AXI Stream is similar to an AXI write data channel, with some important differences on how the data is arranged:

  • no bursts, instead data is packed into packets, frames and data streams
  • no limit on the data length which may be continuous
  • data width can be any integer number of bytes

The AXI5-Stream protocol introduces wake-up signaling and signal protection using parity.

A single AXI Stream transmitter can drive multiple streams which may be interleaved but reordering is not permitted.

| Signal | Source | Width | Description |
|---|---|---|---|
| ACLK | Clock | 1 | ACLK is a global clock signal. All signals are sampled on the rising edge of ACLK. |
| ARESETn | Reset | 1 | ARESETn is a global reset signal. |
| TVALID | Transmitter | 1 | TVALID indicates the Transmitter is driving a valid transfer. A transfer takes place when both TVALID and TREADY are asserted. |
| TREADY | Receiver | 1 | TREADY indicates that a Receiver can accept a transfer. |
| TDATA | Transmitter | TDATA_WIDTH | TDATA is the primary payload used to provide the data that is passing across the interface. TDATA_WIDTH must be an integer number of bytes and is recommended to be 8, 16, 32, 64, 128, 256, 512 or 1024 bits. |
| TSTRB | Transmitter | TDATA_WIDTH/8 | TSTRB is the byte qualifier that indicates whether the content of the associated byte of TDATA is processed as a data byte or a position byte. |
| TKEEP | Transmitter | TDATA_WIDTH/8 | TKEEP is the byte qualifier that indicates whether the content of the associated byte of TDATA is processed as part of the data stream. |
| TLAST | Transmitter | 1 | TLAST indicates the boundary of a packet. |
| TID | Transmitter | TID_WIDTH | TID is a data stream identifier. TID_WIDTH is recommended to be no more than 8. |
| TDEST | Transmitter | TDEST_WIDTH | TDEST provides routing information for the data stream. TDEST_WIDTH is recommended to be no more than 8. |
| TUSER | Transmitter | TUSER_WIDTH | TUSER is user-defined sideband information that can be transmitted along the data stream. TUSER_WIDTH is recommended to be an integer multiple of TDATA_WIDTH/8. |
| TWAKEUP | Transmitter | 1 | TWAKEUP identifies any activity associated with the AXI-Stream interface. |
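How TKEEP and TLAST shape the received data can be sketched with a small packet collector. This is a simplified model with hypothetical names: it takes only the transfers for which TVALID and TREADY were both high, treats every kept byte as a data byte (TSTRB is ignored), and assumes byte lane n occupies bits 8n+7 down to 8n of TDATA.

```python
def collect_packets(transfers, data_bytes):
    """Group accepted AXI4-Stream transfers into packets.

    transfers:  iterable of (tdata, tkeep, tlast) tuples, one per accepted transfer
    data_bytes: TDATA width in bytes
    Returns a list of bytes objects, one per packet.
    """
    packets = []
    current = bytearray()
    for tdata, tkeep, tlast in transfers:
        for lane in range(data_bytes):
            if (tkeep >> lane) & 1:
                # Keep only the bytes that TKEEP marks as part of the stream.
                current.append((tdata >> (8 * lane)) & 0xFF)
        if tlast:
            # TLAST marks the packet boundary.
            packets.append(bytes(current))
            current = bytearray()
    return packets


transfers = [
    (0x44332211, 0b1111, False),  # four bytes, packet continues
    (0x00006655, 0b0011, True),   # two bytes kept, end of first packet
    (0x000000AA, 0b0001, True),   # single-byte second packet
]
print(collect_packets(transfers, data_bytes=4))
```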

from Grokipedia
The Advanced eXtensible Interface (AXI) is a high-performance, synchronous communication protocol developed by Arm as part of the Advanced Microcontroller Bus Architecture (AMBA) specification, enabling efficient on-chip interconnects for data transfer between master and subordinate components in system-on-chip (SoC) designs.[1] Introduced in 2003 with AMBA 3, AXI replaced earlier protocols like AHB to support higher bandwidth and frequency requirements in complex SoCs, and it has since become a foundational standard shipped in billions of devices worldwide.[2] The protocol is royalty-free and openly specified by Arm, promoting widespread adoption across industries including mobile, automotive, and data center computing.[1]

AXI's architecture emphasizes scalability and efficiency through five independent, unidirectional channels: write address (AW), write data (W), write response (B), read address (AR), and read data (R), which allow concurrent read and write operations without interference.[1] In AXI4 and later, it supports burst-based transactions up to 256 beats, out-of-order processing via identifiers (IDs), and optional quality-of-service (QoS) signals for prioritizing traffic in multi-master environments.[1] Key variants include AXI3 (the original), AXI4 (introduced in 2010 for enhanced low-latency support and simplified bursts), AXI4-Lite (a simplified version for control registers), AXI4-Stream (for high-speed streaming data), and AXI5 (part of AMBA 5 in 2017, adding features like user signals and improved coherency for chiplet-based designs).[2]

Notable for its flexibility, AXI facilitates the integration of intellectual property (IP) blocks such as processors, memory controllers, and peripherals, while enabling high-frequency operation, often exceeding 1 GHz, in modern FPGAs and ASICs from vendors like AMD and Intel. Its design minimizes latency through full handshaking mechanisms (valid-ready protocol) and supports atomic operations for reliable data handling in concurrent systems.[1] Overall, AXI remains integral to Arm-based architectures, evolving to meet demands in AI, 5G, and edge computing applications.[2]

Overview

Definition and Purpose

The Advanced eXtensible Interface (AXI) is a high-performance on-chip communication protocol defined as part of ARM's AMBA (Advanced Microcontroller Bus Architecture) specification family, originating with AMBA 3 in 2003 and evolving through subsequent versions.[1] It serves as a point-to-point interface specification that facilitates communication between master and subordinate components, such as processors and memory or peripherals, promoting modularity and reusability in complex SoC architectures.[3]

The primary purposes of AXI include enabling high-bandwidth and low-latency data transfers to meet the demands of modern embedded systems, while offering scalability to support multiple concurrent masters and subordinates without performance bottlenecks.[1] Additionally, its extensible nature allows for custom implementations and adaptations to specific design requirements, ensuring compatibility across diverse hardware ecosystems.[3] This makes AXI particularly suited for resource-constrained environments where efficient resource sharing and interconnect optimization are critical.

At its core, AXI incorporates features such as separate channels for read and write operations, support for out-of-order transaction completion to maximize throughput, and burst lengths of up to 256 beats to handle large data transfers efficiently.[1][4] These elements, underpinned by a handshake protocol for reliable signaling, enable robust and flexible communication in high-frequency systems.[1]

AXI finds widespread application in connecting processors for tasks like IoT and networking, interfacing with peripherals such as controllers in multi-processor setups, and linking accelerators for high-speed data processing in advanced SoCs.[3]

History and Versions

The Advanced eXtensible Interface (AXI) protocol originated in 2003 as part of the AMBA 3 specification released by ARM, marking the introduction of AXI3 as a high-performance on-chip interconnect designed for high-frequency system-on-chip (SoC) designs. AXI3 was developed to address limitations in earlier AMBA protocols like the Advanced High-performance Bus (AHB), enabling more efficient burst transfers and pipelined operations to support the growing demands of complex integrated circuits.[5]

In 2010, ARM released AMBA 4, which introduced AXI4 as an evolution of AXI3, incorporating enhancements such as quality-of-service (QoS) signaling for traffic prioritization and low-power interface features to optimize energy efficiency in multi-master environments. Alongside AXI4, the specification defined AXI4-Lite in 2010 as a simplified subset for low-bandwidth, single-transaction register accesses, reducing complexity for peripheral interfaces. The same year saw the introduction of AXI4-Stream, tailored for high-throughput, unidirectional streaming data transfers without address mapping, facilitating applications like video processing and networking.[6][7]

In 2017, AMBA 5 introduced AXI5, which added features such as user signals, improved error handling, and enhanced coherency support for chiplet-based designs. Post-2010, AXI development has included both major updates like AXI5 and minor revisions for enhanced compatibility and clarity, such as updates to the ordering model in later issues of the specification. These iterations ensure backward compatibility while addressing refinements in protocol behavior.

AXI has seen widespread adoption in ARM Cortex processor families, serving as the standard interconnect for SoC designs, and in field-programmable gate arrays (FPGAs) from vendors like AMD and Intel, where it underpins IP integration and high-bandwidth communication.[8][9][10][11] The evolution of AXI was driven by the escalating complexity of SoCs, necessitating support for concurrent, pipelined operations across multiple masters and subordinates to achieve higher bandwidth and scalability in embedded systems.[2]

Core Mechanisms

Handshake Protocol

The Advanced eXtensible Interface (AXI) utilizes a two-way handshake protocol based on VALID and READY signals to synchronize data transfers between a source and a destination, ensuring reliable communication without combinatorial paths. The source asserts VALID to indicate that stable and valid information is present on the associated bus lines, while the destination asserts READY to signal its acceptance capability. A transfer completes only on the rising clock edge when both signals are high simultaneously, guaranteeing that the source does not proceed until the destination is prepared.[12]

This mechanism inherently supports backpressure, enabling the destination to deassert READY at any time to halt incoming transfers temporarily, which prevents data overflow and allows for adaptive flow control in systems with differing processing rates. Once READY is deasserted, the source must hold VALID asserted and keep the data stable until READY is reasserted, ensuring no information is lost during pauses.[12]

In terms of timing, the protocol is designed for single-cycle assertions in full-speed scenarios, where both VALID and READY can align within one clock period to maximize throughput, and it applies uniformly to address, data, and response phases. READY may be asserted prior to or concurrently with VALID, but once VALID is driven high, it remains so until the handshake succeeds, promoting predictable synchronous behavior across all channels.[12]

Error handling within the handshake relies on protocol compliance rather than dedicated error signals, with any deviations, such as invalid timings, leading to undefined behavior that implementations must avoid through design verification. Transaction-level errors are conveyed via response codes in later phases, but the core handshake ensures all intended transfers either complete successfully or are paused without corruption.[12]

This VALID/READY mechanism is used in AXI4 and AXI5 protocols. The September 2025 Issue L update to the AMBA AXI Protocol Specification introduces the AXI-L variant, which employs a credit-based transport mechanism for improved efficiency in high-performance designs.[13]

For a simple transfer cycle illustration, the sequence unfolds as follows:
  • Cycle 1: Source asserts VALID and drives data; if destination asserts READY in the same cycle, transfer occurs on the clock edge, and both signals may deassert afterward.
  • Cycle 2 (if backpressure applied): Source keeps VALID high and data stable; destination deasserts READY to delay.
  • Cycle 3: Destination asserts READY; transfer completes on the clock edge, resolving the handshake.
This flow demonstrates the protocol's robustness for point-to-point synchronization.[12]

Thread Identifiers

In the Advanced eXtensible Interface (AXI) protocol, thread identifiers (primarily AWID and ARID) serve as tags assigned by a master to distinguish between multiple concurrent transactions, enabling the management of outstanding requests and out-of-order responses across read and write channels.[14] These identifiers allow interconnects to route responses correctly back to the originating master, supporting efficient concurrency in systems with multiple initiators.[14]

AWID, used on the write address channel, and ARID, on the read address channel, are typically implemented as 4-bit fields, permitting up to 16 unique threads or transaction streams per master.[14] In AXI3, the WID signal on the write data channel matches the corresponding AWID to link data beats to their address phase, facilitating potential write data interleaving; however, AXI4 simplifies this by removing WID entirely, assuming write data follows the address in strict order without separate identification.[14] For responses, slaves echo the master's identifier as RID on the read data channel (matching ARID) and BID on the write response channel (matching AWID), ensuring accurate transaction completion even when responses arrive out of sequence.[14]

The use of thread identifiers provides key benefits, such as enabling interleaving of read transactions from multiple masters in AXI4 (with write interleaving unsupported), and allowing complex interconnects to handle up to 256 outstanding transactions through wider ID fields and burst mechanisms like INCR.[14] This concurrency improves system throughput in multi-master environments, such as SoCs with processors and peripherals.[14] However, limitations include the requirement that IDs remain unique within each master (interconnects may append bits for global uniqueness) and the absence of dynamic allocation, necessitating static assignment at design time.[14]

Channel Structure

The Advanced eXtensible Interface (AXI) protocol employs five independent channels to separate address, data, and response information, enabling efficient handling of read and write transactions in system-on-chip designs. These channels are the write address channel (AW), which conveys addressing information for write operations; the write data channel (W), which transfers the actual write data; the write response channel (B), which provides completion status for writes; the read address channel (AR), which carries addressing for read operations; and the read data channel (R), which delivers read data along with responses. This separation allows for modular and scalable interconnects by isolating different phases of transactions.[1]

The channels operate with unidirectional flows to optimize data movement: the AW, W, and AR channels transmit information from master components to subordinate (slave) components, while the B and R channels flow in the reverse direction from slaves to masters. This design ensures that data and control signals do not interfere, supporting simultaneous read and write activities without contention on shared paths. Each channel applies a handshake mechanism for synchronization, further enhancing reliability in asynchronous environments.[1]

By decoupling the channels, AXI facilitates pipelining and concurrent operations, where address issuance can proceed independently of data transfer or response handling, thereby improving overall throughput in high-frequency systems. This independence permits the insertion of pipeline stages or buffers within individual channels to balance latency and performance without affecting others. In practice, such decoupling is crucial for managing variable transaction latencies in complex interconnect fabrics.[1]

Channel widths in AXI are configurable to match system requirements. The data channels (W and R) support widths from 8 to 1024 bits, commonly 32, 64, or 128 bits for high-bandwidth applications. The address channels (AW and AR) have widths determined by the address width parameter (typically 32 or 64 bits for the AWADDR and ARADDR signals) plus control signals such as length, size, and burst type (adding approximately 30-40 bits). The response channel (B) and the control portions of the R channel are narrower, based on the thread ID width (typically 4 bits) plus 2 bits for the response code and optional user signals, often totaling 6-16 bits. This flexibility in sizing allows adaptation to diverse IP cores.

In the interconnect, these channels play a pivotal role by enabling non-blocking routing through multiplexers and arbiters, where transactions can be queued and forwarded independently to prevent stalls across the bus fabric.[1]

AXI4 Protocol

Interface Signals

The AXI4 interface employs a set of signals organized into five channels to facilitate read and write transactions between master and subordinate components, with all channels operating synchronously on the rising edge of a global clock and using VALID/READY handshaking for reliable data transfer. These signals enable high-bandwidth, low-latency communication while supporting features like burst transfers, out-of-order responses, and optional user-defined extensions. The protocol defines precise widths and behaviors for each signal to ensure interoperability in system-on-chip designs.[1]

Write Address Channel

The write address channel transfers address and control information from the master to the subordinate for initiating write bursts. This unidirectional channel uses the following key signals:
| Signal | Direction | Width | Description |
|---|---|---|---|
| AWID | Master to Subordinate | Up to 16 bits | Unique transaction identifier for the write burst, enabling out-of-order responses and ordering rules.[15] |
| AWADDR | Master to Subordinate | 32 or 64 bits | Specifies the start address of the first transfer in a write burst transaction.[1] |
| AWLEN | Master to Subordinate | 8 bits | Indicates the burst length, supporting 1 to 256 transfers in AXI4 for incremental bursts.[1] |
| AWSIZE | Master to Subordinate | 3 bits | Defines the size of each transfer in the burst, from 1 byte up to 128 bytes (2^7).[1] |
| AWBURST | Master to Subordinate | 2 bits | Specifies the burst type: FIXED (no address change), INCR (incrementing), or WRAP (wrapping around a boundary).[1] |
| AWLOCK | Master to Subordinate | 1 bit | Indicates normal (0) or exclusive (1) access; locked transactions not supported in AXI4.[16] |
| AWCACHE | Master to Subordinate | 4 bits | Indicates memory type attributes for caching and buffering behavior (e.g., non-cacheable, write-back).[1] |
| AWPROT | Master to Subordinate | 3 bits | Encodes protection information: privilege level, security state, and data/instruction access type.[1] |
| AWREGION | Master to Subordinate | 4 bits | Address region identifier used for decoding or routing in multi-region interconnects.[15] |
| AWQOS | Master to Subordinate | 4 bits | Provides quality-of-service priority to manage traffic in congested interconnects (AXI4-specific).[1] |
| AWVALID | Master to Subordinate | 1 bit | Asserted by the master to indicate that the address and control signals are valid and stable.[1] |
| AWREADY | Subordinate to Master | 1 bit | Asserted by the subordinate to acknowledge receipt and readiness for the address channel transfer.[1] |
| AWUSER | Master to Subordinate | Up to 32 bits (configurable) | Optional user-defined signal for custom extensions, propagated unchanged through the interconnect.[1] |
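To make the field encodings concrete, the sketch below derives the total burst size from AWLEN and AWSIZE and decodes the three AWPROT bits. The helper names are hypothetical; the bit meanings assumed here follow the AXI encoding (bit 0 privileged, bit 1 non-secure, bit 2 instruction access).

```python
def burst_bytes(axlen, axsize):
    """Total bytes moved by a burst: (AxLEN + 1) beats of 2 ** AxSIZE bytes each."""
    return (axlen + 1) << axsize


def decode_prot(axprot):
    """Decode the 3-bit AxPROT field into named attributes."""
    return {
        "privileged": bool(axprot & 0b001),   # bit 0: 0 = unprivileged, 1 = privileged
        "non_secure": bool(axprot & 0b010),   # bit 1: 0 = secure, 1 = non-secure
        "instruction": bool(axprot & 0b100),  # bit 2: 0 = data, 1 = instruction
    }


print(burst_bytes(axlen=3, axsize=2))  # 4 beats of 4 bytes
print(decode_prot(0b010))              # unprivileged, non-secure, data access
```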

Write Data Channel

The write data channel conveys the actual data payloads from the master to the subordinate, accompanying the address channel for burst writes. It includes strobe signals to handle byte-level granularity.
| Signal | Direction | Width | Description |
|---|---|---|---|
| WDATA | Master to Subordinate | 32, 64, 128, 256, 512, or 1024 bits | Carries the write data for each transfer in the burst, aligned to the data bus width.[1] |
| WSTRB | Master to Subordinate | WDATA width / 8 bits | Byte strobes indicating which byte lanes of WDATA contain valid data (one bit per byte).[1] |
| WLAST | Master to Subordinate | 1 bit | Asserted to signal the last transfer in the write burst sequence.[1] |
| WVALID | Master to Subordinate | 1 bit | Indicates that valid write data and strobes are available on the channel.[1] |
| WREADY | Subordinate to Master | 1 bit | Signifies the subordinate's readiness to accept the write data transfer.[1] |
| WUSER | Master to Subordinate | Up to 32 bits (configurable) | Optional user-defined signal for additional per-beat information, routed with the data.[1] |

Write Response Channel

The write response channel delivers completion status from the subordinate back to the master after a write transaction, allowing out-of-order handling via identifiers.
| Signal | Direction | Width | Description |
|---|---|---|---|
| BID | Subordinate to Master | Up to 16 bits | Transaction identifier matching the AWID to associate the response with the originating write address.[1] |
| BRESP | Subordinate to Master | 2 bits | Encodes the write response: OKAY (success), EXOKAY (exclusive okay), SLVERR (slave error), or DECERR (decode error).[1] |
| BVALID | Subordinate to Master | 1 bit | Asserted to indicate a valid write response is available.[1] |
| BREADY | Master to Subordinate | 1 bit | Asserted by the master to indicate that it can accept the response.[1] |
| BUSER | Subordinate to Master | Up to 32 bits (configurable) | Optional user-defined response signal for custom status information.[1] |
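The 2-bit response field can be decoded with a small lookup. The numeric values used here are the AXI response encodings (OKAY 0b00, EXOKAY 0b01, SLVERR 0b10, DECERR 0b11); the helper names are illustrative.

```python
RESP_NAMES = {
    0b00: "OKAY",    # normal access succeeded
    0b01: "EXOKAY",  # exclusive access succeeded
    0b10: "SLVERR",  # subordinate signalled an error
    0b11: "DECERR",  # no subordinate at the address (decode error)
}


def decode_resp(resp):
    """Return the name of a 2-bit BRESP/RRESP value."""
    return RESP_NAMES[resp & 0b11]


def is_error(resp):
    """Both error responses have the upper bit set."""
    return bool(resp & 0b10)


for code in range(4):
    print(code, decode_resp(code), is_error(code))
```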

Read Address Channel

The read address channel mirrors the write address channel but is dedicated to read transactions, sending control signals from the master to the subordinate. It includes an identifier for out-of-order responses.
| Signal | Direction | Width | Description |
|---|---|---|---|
| ARID | Master to Subordinate | Up to 16 bits | Unique identifier for the read transaction group, enabling out-of-order delivery.[1] |
| ARADDR | Master to Subordinate | 32 or 64 bits | Start address for the first transfer in a read burst.[1] |
| ARLEN | Master to Subordinate | 8 bits | Burst length, indicating 1 to 256 transfers.[1] |
| ARSIZE | Master to Subordinate | 3 bits | Transfer size per beat in the read burst.[1] |
| ARBURST | Master to Subordinate | 2 bits | Burst type: FIXED, INCR, or WRAP.[1] |
| ARLOCK | Master to Subordinate | 1 bit | Indicates normal (0) or exclusive (1) access; locked transactions not supported in AXI4.[16] |
| ARCACHE | Master to Subordinate | 4 bits | Cache attributes for the read memory type.[1] |
| ARPROT | Master to Subordinate | 3 bits | Protection encoding for privilege, security, and access type.[1] |
| ARREGION | Master to Subordinate | 4 bits | Address region identifier used for decoding or routing in multi-region interconnects.[17] |
| ARQOS | Master to Subordinate | 4 bits | QoS identifier for read traffic prioritization.[1] |
| ARVALID | Master to Subordinate | 1 bit | Signals valid read address and controls from the master.[1] |
| ARREADY | Subordinate to Master | 1 bit | Indicates subordinate readiness to process the read address.[1] |
| ARUSER | Master to Subordinate | Up to 32 bits (configurable) | Optional user-defined signal for read address extensions.[1] |
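
The length and size fields are encoded rather than stored directly. The following Python sketch decodes raw field values under the encodings described in this article (length minus one in the 8-bit AxLEN field, log2 of the byte count in the 3-bit AxSIZE field); the function name `decode_burst_fields` is hypothetical.

```python
def decode_burst_fields(axlen: int, axsize: int) -> tuple[int, int, int]:
    """Return (beats, bytes_per_beat, max_bytes) for raw AxLEN/AxSIZE values."""
    if not 0 <= axlen <= 255:
        raise ValueError("AxLEN is an 8-bit field")
    if not 0 <= axsize <= 7:
        raise ValueError("AxSIZE is a 3-bit field")
    beats = axlen + 1               # field holds the burst length minus one
    bytes_per_beat = 1 << axsize    # field holds log2 of the bytes per beat
    return beats, bytes_per_beat, beats * bytes_per_beat


print(decode_burst_fields(15, 2))  # (16, 4, 64)
```

The third value is an upper bound on the bytes moved: an unaligned start address reduces the number of useful bytes in the first beat.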

Read Data Channel

The read data channel returns data and status from the subordinate to the master, supporting interleaved responses from multiple transactions.
| Signal | Direction | Width | Description |
|---|---|---|---|
| RID | Subordinate to Master | Up to 16 bits | Identifier matching the ARID to identify the source transaction.[1] |
| RDATA | Subordinate to Master | 32, 64, 128, 256, 512, or 1024 bits | Read data returned for each beat in the burst.[1] |
| RRESP | Subordinate to Master | 2 bits | Response status per read transfer: OKAY, EXOKAY, SLVERR, or DECERR.[1] |
| RLAST | Subordinate to Master | 1 bit | Marks the final transfer in the read burst.[1] |
| RVALID | Subordinate to Master | 1 bit | Indicates valid read data and response availability.[1] |
| RREADY | Master to Subordinate | 1 bit | Master's acknowledgment of readiness to receive read data.[1] |
| RUSER | Subordinate to Master | Up to 32 bits (configurable) | Optional user-defined signal accompanying read data.[1] |

Clock and Reset Signals

Global timing and initialization are managed by two fundamental signals shared across the interface.
| Signal | Direction | Width | Description |
|---|---|---|---|
| ACLK | Clock source to all components | 1 bit | Clock signal; all AXI4 signals are sampled on its rising edge for synchronous operation.[1] |
| ARESETn | Reset source to all components | 1 bit | Active-low reset; it may be asserted asynchronously but must be deasserted synchronously to ACLK.[1] |

Burst Transfers

In the AXI4 protocol, burst transfers enable efficient data movement by allowing multiple beats (individual data transfers) within a single transaction, reducing overhead compared to single-beat operations.[12] The burst type is specified by the AxBURST signals on the address channel, which determine how the address evolves across beats. There are three supported burst types: FIXED, where the address remains constant for all beats, typically used for accessing FIFO-like structures; INCR, where the address increments by the transfer size for each beat, suitable for linear memory accesses; and WRAP, where the address increments until it reaches the upper wrap boundary and then wraps back to the lower wrap boundary, typically used for cache line fills.[12]

Key parameters governing bursts include the length and size fields on the address channel. The AWLEN (for write bursts) and ARLEN (for read bursts) signals encode the number of beats, with values from 0 to 255 representing burst lengths of 1 to 256 beats, respectively; however, AXI4 permits up to 256 beats only for INCR bursts, while FIXED and WRAP are limited to 16 beats.[12] The AWSIZE and ARSIZE signals specify the bytes transferred per beat, supporting sizes of 1, 2, 4, 8, 16, 32, 64, or 128 bytes (encoded as 0 to 7), which must not exceed the data bus width.[12]

Address calculation for bursts follows deterministic rules based on the type. For both INCR and WRAP bursts, the first beat uses the start address, and the address for each later beat number $ N $ (counting from 0) is the start address aligned down to the transfer size plus $ N \times $ transfer size. INCR bursts may begin at an unaligned address, whereas WRAP bursts require the start address to be aligned to the transfer size.[12] In WRAP bursts, the address wraps within a region of $ 2^{(\log_2(\text{size}) + \log_2(\text{length}))} $ bytes (that is, size × length), where length must be a power of 2 (2, 4, 8, or 16); the lower wrap boundary is the start address rounded down to a multiple of this region size.[12] FIXED bursts use the same address for every beat, with no increment.[12] Independently of the burst type, a burst must not cross a 4 KB address boundary.[12]

AXI4 supports unaligned transfers: for writes, the WSTRB byte strobes indicate which bytes within a beat are valid, and for reads the master discards the byte lanes it did not request, so partial data transfers need no address adjustment.[12] A transaction consists of exactly one burst, and a burst cannot be terminated early once it has started; every beat must complete.[12]

Burst transfers are subject to constraints that ensure compatibility with system architecture. The 256-beat limit for INCR bursts and the 4 KB boundary rule together bound the address span of a single burst.[12] The protocol does not guarantee a fixed latency per beat: a beat completes only when VALID and READY are both asserted, so either side can insert wait states, and the achieved throughput depends on the pipeline stages between address issuance and data completion.[12]
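
The address rules for the three burst types can be sketched in a few lines of Python. This is a minimal model under stated assumptions rather than a reference implementation: the function names `burst_addresses` and `crosses_4k` are hypothetical, the burst type is passed as a string, the WRAP case wraps to the lower wrap boundary (the start address rounded down to a multiple of size × length), and the 4 KB check covers INCR bursts only.

```python
def burst_addresses(start: int, beats: int, size_bytes: int, burst: str) -> list[int]:
    """Return the address of every beat for a FIXED, INCR, or WRAP burst."""
    if burst == "FIXED":
        return [start] * beats
    aligned = (start // size_bytes) * size_bytes   # start aligned down to the transfer size
    if burst == "INCR":
        # First beat uses the (possibly unaligned) start address; later beats are aligned.
        return [start] + [aligned + n * size_bytes for n in range(1, beats)]
    if burst == "WRAP":
        if beats not in (2, 4, 8, 16):
            raise ValueError("WRAP bursts must have 2, 4, 8, or 16 beats")
        if start != aligned:
            raise ValueError("WRAP bursts must start at an aligned address")
        span = size_bytes * beats                  # bytes covered by the wrap region
        lower = (start // span) * span             # lower wrap boundary
        return [lower + (start - lower + n * size_bytes) % span for n in range(beats)]
    raise ValueError(f"unknown burst type: {burst}")


def crosses_4k(start: int, beats: int, size_bytes: int) -> bool:
    """Return True if an INCR burst would cross a 4 KB address boundary."""
    aligned = (start // size_bytes) * size_bytes
    last_byte = aligned + beats * size_bytes - 1
    return start // 4096 != last_byte // 4096


print([hex(a) for a in burst_addresses(0x1008, 4, 4, "WRAP")])
# ['0x1008', '0x100c', '0x1000', '0x1004']
print(crosses_4k(0xFF8, 4, 4))  # True
```

In the WRAP example, the four-beat burst of 4-byte transfers covers a 16-byte region starting at 0x1000, so the third beat wraps from 0x1010 back to 0x1000.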

Read Transactions

In the AXI4 protocol, a read transaction is initiated by the master sending address information to the slave on the read address (AR) channel. This includes the starting address (ARADDR), burst length (ARLEN, specifying up to 256 transfers), burst type (ARBURST), and a transaction identifier (ARID) to support multiple outstanding requests. The slave then responds on the read data (R) channel, providing the requested data (RDATA), a response status (RRESP), the matching transaction ID (RID), and a last signal (RLAST) to indicate the end of the burst.[18]

The protocol permits out-of-order completion of read transactions to improve performance in systems with multiple masters and slaves. A master can issue several read addresses on the AR channel, using distinct ARID values, before receiving any data. Transactions with the same ID must return their data in the order in which their addresses were issued, while transactions with different IDs (threads) may complete in any order relative to one another. The number of outstanding read transactions is implementation-dependent and is limited by the slave's buffering and reordering depth.[19]

Response information for read transactions is conveyed via the RRESP signal on the R channel, which indicates the outcome of the access. The possible codes are OKAY for a normal successful access or an exclusive access failure, EXOKAY for a successful exclusive access, SLVERR for an error detected by the slave during the transaction, and DECERR for a decode error, typically generated by the interconnect when no slave exists at the requested address. Each beat of the R channel carries one RRESP value, applying to the corresponding RDATA.[18]

During the data phase of a read transaction, the slave transfers data on successive beats of the R channel, with the width matching the interface (up to 1024 bits). The RLAST signal is asserted on the final beat to mark the burst completion, enabling the master to detect the end without relying solely on ARLEN. Responses from different threads (distinct IDs) may interleave on the R channel, but all beats for a given burst must maintain order within their thread to preserve data integrity.[18]

Atomic operations in AXI4 are supported through exclusive accesses, which a master requests on the read side with the ARLOCK signal on the AR channel. The master first performs a read with ARLOCK set to 1 and later issues an exclusive write to the same location using the same ID; the slave monitors the address and returns EXOKAY on the write if the location has not been modified in the meantime, or OKAY if the exclusive write failed, enabling atomic operations like test-and-set.[16]
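
The per-ID ordering rule for reads can be expressed as a simple check on a trace. The Python sketch below is illustrative: the function name `respects_id_ordering` and the representation of a transaction as an `(arid, tag)` pair are assumptions made for the example, with the tag standing in for whatever a testbench uses to tell transactions apart.

```python
from collections import defaultdict, deque


def respects_id_ordering(issued, completed):
    """Check that transactions sharing an ID complete in the order they were issued.

    issued, completed: lists of (arid, tag) pairs in address-issue order and
    completion order respectively. Different IDs may complete in any relative order.
    """
    pending = defaultdict(deque)
    for arid, tag in issued:
        pending[arid].append(tag)
    for arid, tag in completed:
        # The oldest outstanding transaction for this ID must finish first.
        if not pending[arid] or pending[arid][0] != tag:
            return False
        pending[arid].popleft()
    return True


issued = [(0, "A"), (1, "B"), (0, "C")]
print(respects_id_ordering(issued, [(1, "B"), (0, "A"), (0, "C")]))  # True
print(respects_id_ordering(issued, [(0, "C"), (0, "A"), (1, "B")]))  # False
```

In the first trace the transaction with ID 1 overtakes both ID 0 transactions, which is allowed; the second trace is rejected because C completes before A although both carry ID 0.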

Write Transactions

In the AXI4 protocol, write transactions follow a multi-phase structure involving three dedicated channels to facilitate efficient data transfer from a master to a slave. The write address (AW) channel carries the starting address, burst length, size, and other control information for the transaction. The write data (W) channel carries the actual data beats, and the write response (B) channel carries the slave's acknowledgment of the transaction's completion. A complete write transaction requires successful handshakes on all three channels, ensuring that the address is accepted, all data is transferred, and a response is received before the transaction ends.[1]

The ordering rules for AXI4 write transactions do not fix the relative order of the address and data transfers: write data may be accepted before, together with, or after the corresponding write address. The write response on the B channel, however, must follow both the address handshake and the last data transfer of the burst. A master can have several write transactions outstanding, for example issuing the address of one burst on the AW channel while data for a previous burst is still being sent on the W channel. Responses to transactions with different AWID values may return out of order, but transactions sharing the same ID must complete in order.[1]

Write acceptance occurs independently on each channel via a two-way handshake mechanism, where the master asserts VALID signals and the slave responds with READY signals. A slave can accept an address on the AW channel without immediately accepting the associated data on the W channel, providing flexibility for buffering and pipelining. During data transfer on the W channel, the WSTRB signal serves as byte enables, with one bit per byte lane indicating which bytes within the data word are valid and should be written; for example, on a 64-bit bus, eight strobe bits allow selective writing of individual bytes without affecting others. The last data beat in a burst is marked by the WLAST signal, signaling the end of the data phase.[1]

The response phase utilizes the B channel to return a single response per burst, carrying the transaction ID (BID, matching the original AWID) and a response status (BRESP) such as OKAY for successful completion, SLVERR for slave errors, DECERR for decode errors, or EXOKAY for exclusive accesses. No data is returned on the B channel; it solely provides acknowledgment and error information to the master, which must accept it via a VALID/READY handshake. This decouples the response from the data flow, allowing slaves to process writes asynchronously.[1]

Unlike AXI3, AXI4 does not support write data interleaving: the WID signal was removed, so the master must send write data in the same order in which it issued the write addresses, and all beats of one burst remain consecutive on the W channel. Concurrency comes instead from multiple outstanding transactions, provided the slave supports sufficient outstanding transaction depth, with responses on the B channel free to be reordered between different IDs. This design allows concurrent handling of multiple writes while maintaining order for related operations.[1]
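
The independent handshakes on the AW and W channels can be illustrated with a small cycle-based model. The Python sketch below is hypothetical and simplified: the function name `simulate_write` is invented, the master is assumed to hold AWVALID and WVALID high until each transfer is accepted, and the response is assumed to become available one cycle after both the address and the last data beat have been accepted (a modelling choice for a registered response, not a protocol requirement).

```python
def simulate_write(awready, wready, beats):
    """Return (aw_cycle, last_w_cycle, earliest_b_cycle) for one write burst.

    awready, wready: per-cycle READY values driven by the subordinate.
    beats: number of data beats in the burst.
    """
    aw_cycle = None        # cycle in which AWVALID and AWREADY were both high
    last_w_cycle = None    # cycle in which the WLAST beat was accepted
    w_done = 0
    for cycle, (awr, wr) in enumerate(zip(awready, wready)):
        if aw_cycle is None and awr:
            aw_cycle = cycle
        if w_done < beats and wr:
            w_done += 1
            if w_done == beats:
                last_w_cycle = cycle
        if aw_cycle is not None and last_w_cycle is not None:
            # The response can only follow both the address and the last data beat.
            return aw_cycle, last_w_cycle, max(aw_cycle, last_w_cycle) + 1
    raise RuntimeError("write did not complete within the given cycles")


# Address accepted immediately; the data channel stalls in cycles 0 and 3.
print(simulate_write([True] * 5, [False, True, True, False, True], beats=3))  # (0, 4, 5)
```

Each READY pattern affects only its own channel, which is why the subordinate can accept the address at once while throttling the data beats.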

AXI4-Lite Variant

Signal Simplifications

AXI4-Lite introduces significant signal simplifications compared to the full AXI4 protocol to support low-complexity peripherals, such as simple control registers, by eliminating features unnecessary for basic memory-mapped accesses.[14] This variant targets applications where advanced capabilities like multi-beat bursts or out-of-order transactions are not required, reducing hardware overhead while preserving core functionality.[20]

Key omissions in AXI4-Lite include burst-related signals such as AWLEN and ARLEN, so every transaction is a single beat with no support for longer bursts.[14] Transaction identifiers (AWID and ARID) are also removed, mandating in-order execution without the need for routing or reordering.[14] Additionally, signals for cache attributes (AWCACHE and ARCACHE), quality of service (AxQOS), locking (AWLOCK and ARLOCK), and user-defined signals are entirely absent, further streamlining the interface for straightforward operations.[14]

The retained core signals focus on essential address, data, and response elements, enabling basic read and write transactions. For write channels, these include AWADDR for the target address, AWPROT for protection attributes, WDATA for the write payload, WSTRB for write strobes, and BRESP for the response status.[14] Read channels retain ARADDR for the address, ARPROT for protection attributes, RDATA for the read data, and RRESP for the response.[14] All channels employ the standard VALID and READY handshake protocol to manage data flow, ensuring reliable point-to-point communication without additional complexity.[14]

AXI4-Lite maintains the five-channel structure of AXI4 (write address, write data, write response, read address, and read data) but with simplified payloads that exclude fields like burst length.[14] For instance, the write address channel payload consists solely of the address and control signals without length or ID attributes.[14] This design keeps the protocol familiar while minimizing signal count. Address buses in AXI4-Lite are typically 32 or 64 bits wide (implementation-defined), and the data bus is restricted to a width of 32 or 64 bits.[14] As a subset of AXI4, AXI4-Lite interoperates with full AXI4 components: a Lite master can drive a full AXI4 subordinate directly, whereas connecting a full AXI4 master to a Lite subordinate generally requires the interconnect or a bridge to handle transaction IDs and multi-beat bursts.[14]

Transaction Handling

The AXI4-Lite protocol is designed exclusively for single-beat transactions, meaning each read or write operation involves only one address and one corresponding data transfer without support for bursts or multiple beats. This simplification eliminates the need for burst length signals and transaction IDs, ensuring all transactions execute in order, so responses from the subordinate (slave) return in the same order in which the manager (master) issued the requests.[1]

In a write transaction, the manager presents the target address on the write address channel by asserting AWVALID and waits for the subordinate's acknowledgment (AWREADY) via the standard VALID/READY handshake. The manager also presents the write data on the write data channel by asserting WVALID, together with the data payload and write strobes that indicate the active bytes, and waits for WREADY from the subordinate. The address and data handshakes are independent and may complete in either order or in the same cycle. The transaction completes when the subordinate issues the write response on the write response channel (BVALID) with a response code, which the manager acknowledges via BREADY.[1]

Read transactions follow a similar two-phase process: the manager asserts ARVALID on the read address channel with the target address and receives ARREADY from the subordinate to complete the address phase. The subordinate then provides the read data on the read data channel (RVALID), including the data and a response code, which the manager accepts with RREADY to finalize the transaction. This streamlined flow supports efficient, low-latency access without the complexity of out-of-order handling.[1]

Error handling in AXI4-Lite mirrors the response codes of the full AXI4 protocol but excludes exclusive access features, limiting responses to three values: OKAY (00b) for successful normal access, SLVERR (10b) indicating an error detected by the subordinate, such as an access it cannot perform, and DECERR (11b) for decode errors, such as an address that maps to no subordinate. The EXOKAY (01b) response is not supported, as AXI4-Lite does not implement exclusive read or write operations. These codes are conveyed in the BRESP signal for writes and RRESP for reads, allowing the manager to detect and respond to issues appropriately.[1]

AXI4-Lite is particularly suited for use cases involving simple peripheral devices that require register accesses, such as UARTs, timers, or GPIO controllers, where high-performance bursts are unnecessary. Its reduced signal set and single-beat nature result in lower gate count and simpler implementation in hardware, making it ideal for resource-constrained systems while maintaining compatibility with broader AMBA ecosystems.[21]
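
A behavioural model helps to show how the three AXI4-Lite response codes arise in practice. The Python sketch below is hypothetical throughout: the class `LiteRegisterBlock`, its 32-bit register layout, and the policy of answering writes to read-only registers with SLVERR are assumptions for illustration, and the model folds the address decoding that an interconnect would normally perform (the usual source of DECERR) into the same object.

```python
OKAY, SLVERR, DECERR = 0b00, 0b10, 0b11  # response encodings quoted above


class LiteRegisterBlock:
    """Toy model of a 32-bit AXI4-Lite register block plus its address decoder."""

    def __init__(self, base, num_regs, read_only=()):
        self.base = base
        self.regs = [0] * num_regs
        self.read_only = set(read_only)   # register indices that reject writes

    def _index(self, addr):
        offset = addr - self.base
        if offset < 0 or offset >= 4 * len(self.regs):
            return None                   # address decodes to nothing
        return offset // 4

    def write(self, addr, wdata, wstrb=0b1111):
        """Single-beat write; returns the BRESP value."""
        idx = self._index(addr)
        if idx is None:
            return DECERR
        if idx in self.read_only:
            return SLVERR
        value = self.regs[idx]
        for lane in range(4):             # update only the strobed byte lanes
            if (wstrb >> lane) & 1:
                mask = 0xFF << (8 * lane)
                value = (value & ~mask) | (wdata & mask)
        self.regs[idx] = value
        return OKAY

    def read(self, addr):
        """Single-beat read; returns (RDATA, RRESP)."""
        idx = self._index(addr)
        if idx is None:
            return 0, DECERR
        return self.regs[idx], OKAY


block = LiteRegisterBlock(0x4000_0000, num_regs=4, read_only={3})
print(block.write(0x4000_0004, 0xDEADBEEF))   # 0 (OKAY)
print(block.write(0x4000_000C, 1))            # 2 (SLVERR)
print(block.read(0x5000_0000))                # (0, 3) (DECERR)
```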

AXI4-Stream Protocol

Streaming Interface

The AXI4-Stream protocol provides a standard interface for unidirectional, point-to-point data transfers between components in an AMBA-based system, enabling efficient streaming without the need for address information. Defined in the AMBA 4 specification released in 2010,[22] it supports the exchange of arbitrary data streams from a source (master) to a sink (slave), facilitating applications such as video processing, networking, and signal processing where continuous data flow is prioritized over memory-mapped access.[23] Unlike memory-mapped protocols, AXI4-Stream eliminates address channels entirely, focusing solely on data payload and control signals to simplify routing and reduce latency in high-throughput scenarios.[23]

The core signals of the AXI4-Stream interface include TDATA, which carries the primary payload data with a configurable width in integer multiples of bytes; TVALID, asserted by the source to indicate that the TDATA and associated sideband signals are valid for transfer; and TREADY, asserted by the sink to indicate acceptance of the transfer.[23] The transfer completes only when both TVALID and TREADY are high in the same clock cycle, implementing a simple two-way handshake mechanism similar to that in AXI4.[23] Additional control signals encompass TLAST, which denotes the end of a packet; TKEEP and TSTRB for byte-level qualifiers, where TKEEP indicates which bytes in TDATA are part of the stream (asserted for bytes to be kept or transmitted, deasserted for null bytes) and TSTRB distinguishes data bytes (asserted) from position bytes (deasserted), which mark a position in the stream but carry no valid data; as well as optional signals like TID for stream identification, TDEST for routing destinations, and TUSER for user-defined sideband information.[23]

In AXI4-Stream, data is organized into packets, each comprising a variable-length sequence of transfer beats concluded by a beat with TLAST asserted, allowing flexible packet sizes without predefined burst lengths.[23] This packet-based structure enables the protocol to handle streams of differing lengths and types efficiently, with the source responsible for generating TLAST and the sink for preserving it to maintain packet integrity.[23]

Flow control is managed through backpressure: the sink can deassert TREADY to pause transfers when it cannot accept data, preventing overflow while the source holds the current beat until the sink is ready.[23] Support for concurrent streams is provided via the TID signal, which allows multiplexing multiple independent streams over the same interface, with a recommended maximum width of 8 bits.[23]

The interface operates synchronously on a single clock domain using ACLK, where all signals are sampled on the rising edge, ensuring predictable timing in the system.[23] Reset is handled via ARESETn, an active-low signal that initializes the interface and must be asserted long enough to propagate through the clock domain.[23] This clocking and reset scheme aligns with other AMBA protocols, promoting reusability across designs.[23]
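
Backpressure can be illustrated with a cycle-based model of a single packet. In the Python sketch below, the function name `stream_transfer` and the list-based representation of beats and TREADY values are assumptions for the example; the source is assumed to keep TVALID asserted and the current beat stable until the sink accepts it.

```python
def stream_transfer(beats, tready_pattern):
    """Return (cycle, tdata, tlast) for every accepted transfer.

    beats: list of (tdata, tlast) pairs the source wants to send, in order.
    tready_pattern: per-cycle TREADY values driven by the sink.
    """
    accepted = []
    index = 0
    for cycle, tready in enumerate(tready_pattern):
        if index == len(beats):
            break                     # nothing left to send, TVALID would drop
        if tready:                    # TVALID is high, so TREADY completes the transfer
            tdata, tlast = beats[index]
            accepted.append((cycle, tdata, tlast))
            index += 1
        # Otherwise the source simply holds the same beat for the next cycle.
    return accepted


packet = [("A", False), ("B", False), ("C", True)]
print(stream_transfer(packet, [True, False, False, True, True]))
# [(0, 'A', False), (3, 'B', False), (4, 'C', True)]
```

The sink stalls the stream in cycles 1 and 2, so beat B is held unchanged until cycle 3, and the packet ends with the beat that carries TLAST.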

Data Transfer Mechanics

In the AXI4-Stream protocol, data transfer occurs through a point-to-point, unidirectional handshake mechanism between a source (transmitter) and a sink (receiver). The source asserts the TVALID signal alongside valid TDATA to indicate that transfer information is available, while the sink asserts TREADY to signal its readiness to accept the data. A successful transfer takes place only when both TVALID and TREADY are asserted in the same clock cycle, enabling flexible flow control where either signal can be asserted first or simultaneously.[24]

Packets in AXI4-Stream are delineated by the TLAST signal, which the source asserts to mark the end of a packet and deasserts for ongoing transfers within the packet. To handle partial or selective byte transfers, the optional TKEEP signal indicates which bytes in TDATA are to be processed (asserted for transmitted bytes, deasserted for null bytes), while TSTRB specifies whether a kept byte is a data byte (asserted) or a position byte (deasserted) that marks a position in the stream without carrying valid data. Sideband signals like TUSER provide user-defined metadata on a per-beat basis, allowing additional control information, for example to support scatter-gather operations, without altering the core data stream.[24]

Stream ordering is strictly maintained within a single logical stream to ensure predictability, with the protocol prohibiting reordering of transfers that share the same TID and TDEST values. However, interleaving is permitted across multiple independent streams identified by distinct TID values, and it can occur on a per-transfer basis rather than only at the packet boundaries marked by TLAST. Structurally, the interface resembles a single AXI4 channel with its VALID/READY handshake but omits address signals, focusing solely on continuous, addressless data pipelines.[24]

AXI4-Stream is particularly suited for applications involving high-throughput, unidirectional data flows, such as direct memory access (DMA) controllers, video processing pipelines, and network interface units, where it facilitates efficient byte streams, continuous aligned or unaligned data, and sparse transfers without memory addressing overhead. Optional extensions like the TDEST signal enable routing decisions in interconnect fabrics by identifying destination endpoints, typically using up to 8 bits for flexibility in multi-sink environments. Error handling relies on protocol compliance and optional parity protection; when a violation such as a parity error is detected in the AXI5-Stream extensions, the receiver may terminate, propagate, or correct the affected transfers based on system requirements, though the base AXI4-Stream does not mandate specific error responses.[24]
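
The byte qualifiers can be made concrete with a small classifier. The Python sketch below assumes the commonly documented TKEEP/TSTRB combinations (both high for a data byte, TKEEP high with TSTRB low for a position byte, both low for a null byte, and TKEEP low with TSTRB high reserved); the function name `classify_bytes` is hypothetical, and bit n of each qualifier is taken to describe byte lane n of TDATA.

```python
def classify_bytes(tkeep: int, tstrb: int, width_bytes: int) -> list[str]:
    """Classify each TDATA byte lane from its TKEEP and TSTRB bits."""
    kinds = []
    for lane in range(width_bytes):
        keep = (tkeep >> lane) & 1
        strb = (tstrb >> lane) & 1
        if keep and strb:
            kinds.append("data")        # valid payload byte
        elif keep:
            kinds.append("position")    # holds a place in the stream, no valid data
        elif not strb:
            kinds.append("null")        # may be dropped from the stream
        else:
            kinds.append("reserved")    # TKEEP low with TSTRB high must not be used
    return kinds


# 4-byte TDATA: lanes 0-1 carry data, lane 2 is a position byte, lane 3 is null.
print(classify_bytes(0b0111, 0b0011, 4))  # ['data', 'data', 'position', 'null']
```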