Macroblock
The macroblock is a processing unit in image and video compression formats based on linear block transforms, typically the discrete cosine transform (DCT). A macroblock typically consists of 16×16 samples, and is further subdivided into transform blocks, and may be further subdivided into prediction blocks. Formats which are based on macroblocks include JPEG, where they are called MCU blocks, H.261, MPEG-1 Part 2, H.262/MPEG-2 Part 2, H.263, MPEG-4 Part 2, and H.264/MPEG-4 AVC.[1][2][3][4] In H.265/HEVC, the macroblock as a basic processing unit has been replaced by the coding tree unit.[5]
Technical details
Transform blocks
A macroblock is divided into transform blocks, which serve as input to the linear block transform, e.g. the DCT. In H.261, the first video codec to use macroblocks, transform blocks have a fixed size of 8×8 samples.[1] In the YCbCr color space with 4:2:0 chroma subsampling, a 16×16 macroblock consists of 16×16 luma (Y) samples and 8×8 samples for each of the two chroma components (Cb and Cr). These samples are split into four Y blocks, one Cb block and one Cr block. This design is also used in JPEG and most other macroblock-based video codecs with a fixed transform block size, such as MPEG-1 Part 2 and H.262/MPEG-2 Part 2. In other chroma subsampling formats, e.g. 4:0:0, 4:2:2, or 4:4:4, the number of chroma samples in a macroblock will be smaller or larger, and the grouping of chroma samples into blocks will differ accordingly.
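The sample counts described above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a 16×16 luma block and a fixed 8×8 transform block; the names `CHROMA_BLOCK_SIZE` and `macroblock_layout` are hypothetical and chosen only for this illustration.

```python
# Size (width, height) of each chroma component inside one 16x16 macroblock.
# These follow from the subsampling ratios; 4:0:0 carries no chroma at all.
CHROMA_BLOCK_SIZE = {
    "4:0:0": (0, 0),
    "4:2:0": (8, 8),
    "4:2:2": (8, 16),
    "4:4:4": (16, 16),
}

MB_SIZE = 16         # luma width and height of a macroblock
TRANSFORM_SIZE = 8   # fixed transform block size, as in H.261 or MPEG-2


def macroblock_layout(chroma_format):
    """Return sample and 8x8 block counts for one macroblock."""
    chroma_w, chroma_h = CHROMA_BLOCK_SIZE[chroma_format]
    luma_samples = MB_SIZE * MB_SIZE
    chroma_samples = 2 * chroma_w * chroma_h  # Cb plus Cr
    block_area = TRANSFORM_SIZE * TRANSFORM_SIZE
    return {
        "luma_samples": luma_samples,
        "chroma_samples": chroma_samples,
        "total_samples": luma_samples + chroma_samples,
        "luma_blocks": luma_samples // block_area,
        "chroma_blocks": chroma_samples // block_area,
    }


for fmt in CHROMA_BLOCK_SIZE:
    print(fmt, macroblock_layout(fmt))
```

For 4:2:0 this gives four luma blocks and two chroma blocks, matching the four Y, one Cb and one Cr blocks described above.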
In more modern macroblock-based video coding standards such as H.263 and H.264/AVC, transform blocks can be of sizes other than 8×8 samples. For instance, in the H.264/AVC Main profile, the transform block size is 4×4.[4] In the H.264/AVC High profile, the transform block size can be either 4×4 or 8×8, adapted on a per-macroblock basis.[4]
Prediction blocks
Distinct from the division into transform blocks, a macroblock can be split into prediction blocks. In early standards such as H.261, MPEG-1 Part 2, and H.262/MPEG-2 Part 2, motion compensation is performed with one motion vector per macroblock.[1][2] In more modern standards such as H.264/AVC, a macroblock can be split into multiple variable-sized prediction blocks, called partitions.[4] In an inter-predicted macroblock in H.264/AVC, a separate motion vector is specified for each partition.[4] Correspondingly, in an intra-predicted macroblock, where samples are predicted by extrapolating from the edges of neighboring blocks, the predicted direction is specified on a per-partition basis.[4] In H.264/AVC, prediction partition size ranges from 4×4 to 16×16 samples for both inter-prediction (motion compensation) and intra-prediction.[4]
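To make the idea of one motion vector per macroblock concrete, here is a minimal sketch of an exhaustive block-matching search. It assumes the sum of absolute differences (SAD) as the matching cost and a small square search window; the standards do not prescribe an encoder search strategy, so the functions `sad` and `best_motion_vector` and their parameters are hypothetical illustrations, not part of any codec.

```python
def sad(cur, ref, bx, by, dx, dy, size=16):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, search=4, size=16):
    """Full search over a (2*search+1)^2 window; returns (dx, dy, cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]


def texture(x, y):
    # Synthetic test pattern so that the demo needs no image file.
    return (3 * x * x + 5 * y * y + 7 * x * y) % 251


# The current frame is the reference shifted by 3 samples right and 2 up.
ref_frame = [[texture(x, y) for x in range(48)] for y in range(48)]
cur_frame = [[texture(x - 3, y + 2) for x in range(48)] for y in range(48)]
print(best_motion_vector(cur_frame, ref_frame, 16, 16))
```

With H.264/AVC-style partitions, the same search would be run once per partition instead of once per macroblock, which costs more bits for vectors but can lower the residual energy.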
Bitstream representation
A possible bitstream representation of a macroblock in a video codec which uses motion compensation and transform coding is given below.[6] It is similar to the format used in H.261.[1]
+------+------+-------+--------+-----+----+----+--------+
| ADDR | TYPE | QUANT | VECTOR | CBP | b0 | b1 | ... b5 |
+------+------+-------+--------+-----+----+----+--------+
- ADDR: address of the macroblock in the image
- TYPE: identifies the type of macroblock (intra frame, inter frame, bi-directional inter frame)
- QUANT: quantizer value, used to vary the quantization step size
- VECTOR: motion vector
- CBP: Coded Block Pattern, a bit mask indicating which blocks have coefficients present
- bN: the blocks b0 to b5 (4 Y, 1 Cr, 1 Cb)
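The CBP field in the list above can be sketched as a six-bit mask with one bit per block. This is a minimal illustration, assuming that b0 maps to the most significant bit and that a block counts as coded when any quantized coefficient is non-zero; the bit order and the helper names `coded_block_pattern` and `coded_blocks` are assumptions made for this example, not taken from a specific standard.

```python
NUM_BLOCKS = 6  # b0..b5: four luma blocks and two chroma blocks


def coded_block_pattern(blocks):
    """Build a 6-bit mask from six lists of quantized coefficients.

    Assumption for this sketch: block b0 is the most significant bit.
    """
    cbp = 0
    for index, coeffs in enumerate(blocks):
        if any(c != 0 for c in coeffs):
            cbp |= 1 << (NUM_BLOCKS - 1 - index)
    return cbp


def coded_blocks(cbp):
    """Return the indices of the blocks whose coefficients are present."""
    return [i for i in range(NUM_BLOCKS) if cbp & (1 << (NUM_BLOCKS - 1 - i))]


# Example: only b0 and b5 carry non-zero coefficients.
blocks = [[0] * 64 for _ in range(NUM_BLOCKS)]
blocks[0][0] = 5
blocks[5][3] = -1
cbp = coded_block_pattern(blocks)
print(format(cbp, "06b"), coded_blocks(cbp))
```

A decoder reading such a mask would parse coefficient data only for the flagged blocks and treat the remaining blocks as all-zero residuals.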
Macroblocking
The term macroblocking is commonly used to refer to block coding artifacts.
References
- [1] ITU-T (March 1993). "Video codec for audiovisual services at p x 64 kbit/s". Retrieved 2013-04-28.
- [2] ITU-T (February 2012). "Advanced video coding for generic audiovisual services". Retrieved 2013-04-28.
- [3] ITU-T (January 2005). "Video coding for low bit rate communication". Retrieved 2013-04-28.
- [4] ITU-T (April 2013). "Information technology – Generic coding of moving pictures and associated audio information: Video". Retrieved 2013-04-28.
- [5] G.J. Sullivan; J.-R. Ohm; W.-J. Han; T. Wiegand (2012-05-25). "Overview of the High Efficiency Video Coding (HEVC) Standard" (PDF). IEEE Transactions on Circuits and Systems for Video Technology. Retrieved 2013-04-26.
- [6] Marshall, Dave (2001-04-10). "Intra Frame Coding". Multimedia Module No: CM0340. Retrieved 2014-02-13.
Macroblock
Fundamentals
Definition and Purpose
A macroblock serves as the fundamental processing unit in block-based video codecs, such as those defined in the ITU-T H.261 and H.264 standards. It typically comprises a 16×16 array of luma samples, along with associated chroma samples, such as two 8×8 arrays for the Cb and Cr components in 4:2:0 color sampling formats. This structure allows the macroblock to represent a compact spatial region within a video frame, facilitating localized analysis and manipulation of pixel data during compression.
The primary purpose of the macroblock is to enable efficient spatial and temporal compression by grouping pixels into discrete units suitable for motion estimation, intra- and inter-prediction, and transform coding. In motion estimation, for instance, the macroblock is matched against reference blocks from previous or future frames to compute motion vectors, exploiting temporal redundancy across video sequences. Similarly, spatial prediction within the macroblock leverages adjacent pixel correlations to minimize residual data, which is then transformed (e.g., via discrete cosine transform) and quantized to further reduce bitrate while preserving essential visual information. This block-based approach originated in early standards like H.261 for videoconferencing and has been refined in subsequent codecs to achieve higher compression ratios.
Key benefits of using macroblocks include simplified computational complexity in encoding and decoding pipelines, as operations are confined to fixed-size blocks rather than processing the entire frame holistically, which optimizes hardware and software implementations. This partitioning enhances overall compression efficiency by allowing adaptive techniques, such as variable block partitioning for better motion compensation accuracy, leading to improved video quality at lower bitrates without excessive computational overhead. In standard-definition video (e.g., 720×480 resolution), a single macroblock might cover a small detail like part of a face or a uniform background patch, demonstrating its role in balancing detail preservation with data reduction.
Historical Development
The macroblock concept emerged in the late 1980s amid the development of early block-based video codecs, addressing the need for efficient compression in bandwidth-limited telecommunication applications. It was first formalized in the ITU-T H.261 standard, ratified in 1990 for video telephony over ISDN lines at bitrates of p × 64 kbit/s, with p ranging from 1 to 30 (64 to 1920 kbit/s). H.261 specified the macroblock as a 16×16 luma block accompanied by corresponding 8×8 chroma blocks, serving as the fundamental unit for differential inter-frame coding through motion compensation and the discrete cosine transform. This structure set the template for subsequent standards, enabling practical digital video transmission in resource-constrained environments.[8][9]
Subsequent adoption expanded the macroblock's role across storage and broadcast media. The ISO/IEC MPEG-1 standard, released in 1993, incorporated H.261's 16×16 macroblock framework for CD-ROM-based video at up to about 1.5 Mbit/s, introducing bi-directional prediction to enhance temporal redundancy reduction for resolutions like CIF and SIF. This was followed by MPEG-2 (ITU-T H.262), standardized in 1995 through joint ITU-T and MPEG collaboration, which retained the fixed 16×16 macroblock while adding interlaced-scan support for digital television and DVD applications at 2–20 Mbit/s. These milestones reflected the computational limits of the era's hardware, prioritizing algorithms that balanced efficiency with feasible real-time processing on 1990s-era processors.[9]
The early 2000s brought evolutionary refinements driven by rising demands for internet streaming and higher resolutions, amid persistent bandwidth constraints. The H.264/AVC standard, finalized in 2003 by the Joint Video Team of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), preserved the 16×16 macroblock as the processing unit but introduced variable-size partitions down to 4×4 for more adaptive motion compensation, achieving roughly double the compression efficiency of prior standards. Building on this, the High Efficiency Video Coding (HEVC, ITU-T H.265) standard, approved in 2013, shifted from fixed macroblocks to larger Coding Tree Units (CTUs) up to 64×64 with recursive adaptive subdivisions, optimizing for HD and 4K video while further reducing bitrate needs by about 50% compared to H.264 at equivalent quality.[6][10][11]
Technical Specifications
Macroblock Structure
A macroblock serves as the fundamental processing unit in video compression standards like H.264/AVC, comprising 256 luma samples arranged in a 16×16 grid, accompanied by chroma samples in the YCbCr color space. In the prevalent 4:2:0 chroma subsampling format, which is widely used for standard-definition and high-definition video, the macroblock includes two 8×8 blocks, one for the blue-difference (Cb) component and one for the red-difference (Cr) component, resulting in 128 chroma samples overall (64 per component). This structure totals 384 samples per macroblock, calculated as: 16×16 + 2 × (8×8) = 256 + 128 = 384.
The YCbCr color space separates luminance (Y) from chrominance (Cb and Cr), enabling efficient compression by exploiting human visual sensitivity to brightness over color details. H.264/AVC supports multiple subsampling ratios to accommodate varying applications: 4:2:0, where chroma is subsampled by a factor of 2 both horizontally and vertically, so each chroma component has one quarter as many samples as luma (common in consumer video); 4:2:2, with horizontal chroma subsampling by a factor of 2 (used in professional broadcast and editing workflows); and 4:4:4, preserving full chroma resolution (suited for high-fidelity graphics or medical imaging).[12] For instance, in 4:2:2 format, each macroblock features 256 luma samples alongside 256 chroma samples (two 8×16 blocks of 128 samples each for Cb and Cr), resulting in 512 samples total. In 4:4:4 format, the macroblock includes two 16×16 blocks for Cb and Cr, each with 256 samples (512 chroma samples total, 768 samples per macroblock).[12]
Macroblocks tile the video frame contiguously without overlap, ensuring complete coverage of the picture area. To facilitate this alignment, frame dimensions in luma samples are typically padded during preprocessing to multiples of 16 in both width and height, avoiding partial macroblocks at the edges.[13]
In progressive video, all samples within a macroblock are processed uniformly as a single spatial unit. For interlaced video, however, macroblock-adaptive frame-field (MBAFF) coding lets the encoder choose, for each vertically adjacent pair of macroblocks, between frame coding and field coding; in field mode, the top-field and bottom-field lines of the pair are coded as separate macroblocks, which copes better with motion between the two fields.
Subdivisions and Blocks
In video coding standards such as H.264/AVC, a macroblock is subdivided into smaller blocks to enable more flexible and efficient processing for prediction and transformation.[12] These subdivisions allow the encoder to adapt to varying content characteristics, such as using larger blocks for uniform areas and smaller ones for detailed regions like edges.[12] For inter prediction, macroblocks can be partitioned into rectangular blocks including 16×16, 16×8, 8×16, and 8×8 sizes, with the 8×8 partitions further divisible into 8×4, 4×8, or 4×4 sub-partitions to refine motion compensation.[12] Intra prediction, in contrast, operates on square blocks of 4×4 or 16×16 within the macroblock (and additionally 8×8 in the High profiles), facilitating directional spatial prediction.[12] Transform blocks in H.264 are square and applied to the residual data after prediction, typically using 4×4 or 8×8 integer transforms akin to the discrete cosine transform (DCT).[12] These block sizes balance computational efficiency with compression performance, with 4×4 blocks capturing high-frequency details in complex textures and 8×8 blocks handling smoother areas more effectively.[12] The choice of subdivision is determined by the encoder's rate-distortion optimization, which selects partitions that minimize bitrate for a given quality level, often resulting in finer splits for high-motion or textured content.[12]
Building on H.264, the HEVC (H.265) standard evolves the macroblock concept into larger coding tree units (CTUs) of up to 64×64 luma samples, which are recursively partitioned using a quad-tree structure into coding units (CUs) as small as 8×8, with prediction and transform blocks as small as 4×4.[14] This hierarchical approach allows for greater adaptability, where the CUs derived from CTU splits serve as the basis for prediction blocks ranging from 64×64 down to 4×4 (the 4×4 size applies to intra prediction only), including non-square options like 16×8 for motion compensation in irregular motion patterns.[14] Transform blocks in HEVC extend to larger square sizes up to 32×32, also using DCT-like operations on residuals, enabling better energy compaction in high-resolution videos while maintaining finer granularity for detailed areas.[14] The quad-tree partitioning promotes content-adaptive decisions, such as deeper splits around object edges to preserve sharpness without excessive bitrate overhead.[14]
| Standard | Prediction Block Examples | Transform Block Sizes |
|---|---|---|
| H.264/AVC | 16×16, 16×8, 8×16, 8×8, 4×4 | 4×4, 8×8 |
| HEVC (H.265) | 64×64 to 4×4 (including rectangles like 16×8) | 4×4 to 32×32 |
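The recursive quad-tree subdivision described for HEVC can be illustrated with a short sketch. Real encoders decide splits through rate-distortion optimization; the version below substitutes a simple sample-variance threshold as a stand-in, so the splitting rule, the default `threshold`, the `min_size` of 8, and the function names are all assumptions made for illustration only.

```python
def variance(frame, x, y, size):
    """Sample variance of the size x size block whose top-left is (x, y)."""
    values = [frame[y + j][x + i] for j in range(size) for i in range(size)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def quadtree_partition(frame, x, y, size, min_size=8, threshold=100.0):
    """Split a block into four quadrants while it looks 'busy'.

    Returns a list of (x, y, size) leaf blocks. The variance test is a
    placeholder for an encoder's rate-distortion decision.
    """
    if size <= min_size or variance(frame, x, y, size) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for off_y in (0, half):
        for off_x in (0, half):
            leaves += quadtree_partition(
                frame, x + off_x, y + off_y, half, min_size, threshold
            )
    return leaves


# A flat 64x64 block with one bright 8x8 patch in the top-left corner.
ctu = [[0] * 64 for _ in range(64)]
for row in range(8):
    for col in range(8):
        ctu[row][col] = 255

for leaf in quadtree_partition(ctu, 0, 0, 64):
    print(leaf)
```

In this example the flat regions stay as large leaves while the area around the bright patch is split down to the minimum size, mirroring the content-adaptive behavior described above.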
