MPEG-4
from Wikipedia

MPEG-4 is a group of international standards for the compression of digital audio and visual data, multimedia systems, and file storage formats. It was originally introduced in late 1998 as a group of audio and video coding formats and related technology agreed upon by the ISO/IEC Moving Picture Experts Group (MPEG) (ISO/IEC JTC 1/SC29/WG11) under the formal standard ISO/IEC 14496 – Coding of audio-visual objects. Uses of MPEG-4 include compression of audiovisual data for Internet video and CD distribution, voice (telephone, videophone) and broadcast television applications. The MPEG-4 standard was developed by a group led by Touradj Ebrahimi (later the JPEG president) and Fernando Pereira.[1]

Background

MPEG-4 absorbs many of the features of MPEG-1 and MPEG-2 and other related standards, adding new features such as (extended) VRML support for 3D rendering, object-oriented composite files (including audio, video and VRML objects), support for externally specified digital rights management and various types of interactivity. AAC (Advanced Audio Coding) was standardized as an adjunct to MPEG-2 (as Part 7) before MPEG-4 was issued.

MPEG-4 is still an evolving standard and is divided into a number of parts. Companies promoting MPEG-4 compatibility do not always clearly state which "part" level compatibility they are referring to. The key parts to be aware of are MPEG-4 Part 2 (including the Advanced Simple Profile, used by codecs such as DivX, Xvid, Nero Digital and 3ivx, and by QuickTime 6) and MPEG-4 Part 10 (MPEG-4 AVC/H.264 or Advanced Video Coding, used by the x264 encoder, Nero Digital AVC, QuickTime 7, Flash Video, and high-definition video media like Blu-ray Disc).

Most of the features included in MPEG-4 are left to individual developers to decide whether or not to implement. This means that there are probably no complete implementations of the entire MPEG-4 set of standards. To deal with this, the standard includes the concept of "profiles" and "levels", allowing a specific set of capabilities to be defined in a manner appropriate for a subset of applications.

Initially, MPEG-4 was aimed primarily at low-bit-rate video communications; however, its scope as a multimedia coding standard was later expanded. MPEG-4 is efficient across a variety of bit rates ranging from a few kilobits per second to tens of megabits per second. MPEG-4 provides the following functions:

  • Improved coding efficiency over MPEG-2[2]
  • Ability to encode mixed media data (video, audio, speech)
  • Error resilience to enable robust transmission
  • Ability to interact with the audio-visual scene generated at the receiver

Overview

MPEG-4 provides a series of technologies for developers, service providers and end users:

  • MPEG-4 enables different software and hardware developers to create multimedia objects with greater adaptability and flexibility, improving the quality of services and technologies such as digital television, animated graphics, the World Wide Web and their extensions.
  • Data network providers can use MPEG-4 for data transparency: with the help of standard procedures, MPEG-4 data can be interpreted and transformed into other signal types compatible with any available network.
  • The MPEG-4 format provides end users with a wide range of interaction with animated objects.
  • MPEG-4 standardizes digital rights management signaling, known in the MPEG community as Intellectual Property Management and Protection (IPMP).

The MPEG-4 format can perform various functions, among them the following:

  • Multiplexing and synchronizing data associated with media objects so that they can be transported efficiently over network channels (a minimal sketch follows this list).
  • Interaction with the audio-visual scene composed at the receiver side.
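
A rough sketch of the first function: the snippet below (plain Python; the AccessUnit and multiplex names are invented for illustration, not part of the standard) interleaves two timestamped object streams into one decode-order sequence, which is the essence of systems-layer multiplexing:

```python
import heapq
from dataclasses import dataclass

@dataclass
class AccessUnit:
    dts: int        # decode timestamp on a shared clock (90 kHz is common in MPEG transport)
    stream_id: str  # which media object this unit belongs to
    payload: bytes  # the coded data itself

def multiplex(*streams):
    """Interleave per-object streams (each already in decode order) by timestamp."""
    return heapq.merge(*streams, key=lambda au: au.dts)

video = [AccessUnit(0, "video", b"I-VOP"), AccessUnit(3000, "video", b"P-VOP")]
audio = [AccessUnit(0, "audio", b"frame0"), AccessUnit(1920, "audio", b"frame1")]

for au in multiplex(video, audio):
    print(au.dts, au.stream_id)   # 0 video, 0 audio, 1920 audio, 3000 video
```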

Profiles and Levels

MPEG-4 provides a large and rich set of tools for encoding. Subsets of the MPEG-4 tool set have been provided for use in specific applications. These subsets, called 'Profiles', limit the size of the tool set a decoder is required to implement.[3] In order to restrict computational complexity, one or more 'Levels' are set for each Profile.[3] A Profile and Level combination (a toy capability check is sketched after this list) allows:[3]

  • A codec builder to implement only the subset of the standard needed, while maintaining interworking with other MPEG-4 devices that implement the same combination.[3]
  • Checking whether MPEG-4 devices comply with the standard, referred to as conformance testing.[3]
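
As a toy illustration of such a Profile and Level check: the names and numbers below are placeholders, not values from the standard, which instead signals a binary profile_and_level_indication code in the bitstream.

```python
# Illustrative decoder capabilities: highest level implemented per profile.
MAX_LEVEL = {"Simple": 3, "Advanced Simple": 5}

def can_decode(profile: str, level: int) -> bool:
    """Levels bound complexity, so supporting level N implies all levels below it."""
    return level <= MAX_LEVEL.get(profile, 0)

print(can_decode("Simple", 2))            # True: within this decoder's capability
print(can_decode("Advanced Simple", 6))   # False: level exceeds what is implemented
print(can_decode("Core", 1))              # False: profile not implemented at all
```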

MPEG-4 Parts

MPEG-4 consists of several standards—termed "parts"—including the following (each part covers a certain aspect of the whole specification):

MPEG-4 parts[4][5] (for each part: number; first public release, latest edition, latest amendment):

  • Part 1 – Systems (ISO/IEC 14496-1;[6] 1999, latest 2010,[7] amended 2014[8]): Describes synchronization and multiplexing of video and audio; for example, the MPEG-4 file format version 1 (obsoleted by version 2, defined in MPEG-4 Part 14). The functionality of a transport protocol stack for transmitting and/or storing content complying with ISO/IEC 14496 is not within the scope of 14496-1; only the interface to this layer (DMIF) is considered. Transport of MPEG-4 content is defined elsewhere, e.g. in MPEG-2 Transport Stream and the RTP Audio Video Profiles.[9][10][11][12][13]
  • Part 2 – Visual (ISO/IEC 14496-2;[14] 1999, latest 2004,[15] amended 2009): A compression format for visual data (video, still textures, synthetic images, etc.). Contains many profiles, including the Advanced Simple Profile (ASP) and the Simple Profile (SP).
  • Part 3 – Audio (ISO/IEC 14496-3;[16] 1999, latest 2009,[17] amended 2017[18]): A set of compression formats for perceptual coding of audio signals, including some variations of Advanced Audio Coding (AAC) as well as other audio/speech coding formats and tools such as Audio Lossless Coding (ALS), Scalable Lossless Coding (SLS), Structured Audio, Text-To-Speech Interface (TTSI), HVXC and CELP.
  • Part 4 – Conformance testing (ISO/IEC 14496-4;[19] 2000, latest 2004,[20] amended 2016): Describes procedures for testing conformance to other parts of the standard.
  • Part 5 – Reference software (ISO/IEC 14496-5;[21] 2000, latest 2001,[22] amended 2017): Provides reference software for demonstrating and clarifying the other parts of the standard.
  • Part 6 – Delivery Multimedia Integration Framework (DMIF) (ISO/IEC 14496-6;[23] 1999, latest 2000[24]).
  • Part 7 – Optimized reference software for coding of audio-visual objects (ISO/IEC TR 14496-7;[25] 2002, latest 2004[26]): Provides examples of how to make improved implementations (e.g., in relation to Part 5).
  • Part 8 – Carriage of ISO/IEC 14496 contents over IP networks (ISO/IEC 14496-8;[27] 2004[28]): Specifies a method to carry MPEG-4 content on IP networks, including guidelines to design RTP payload formats, usage rules of SDP to transport ISO/IEC 14496-1-related information, MIME type definitions, and analysis of RTP security and multicasting.
  • Part 9 – Reference hardware description (ISO/IEC TR 14496-9;[29] 2004, latest 2009[30]): Provides hardware designs for demonstrating how to implement the other parts of the standard.
  • Part 10 – Advanced Video Coding (AVC) (ISO/IEC 14496-10;[31] 2003, latest 2014,[32] amended 2016[33]): A compression format for video signals, technically identical to the ITU-T H.264 standard.
  • Part 11 – Scene description and application engine (ISO/IEC 14496-11;[34] 2005, latest 2015[35]): Can be used for rich, interactive content with multiple profiles, including 2D and 3D versions. Part 11 revised MPEG-4 Part 1 (ISO/IEC 14496-1:2001) and two amendments to it. It gives a system-level description of an application engine (delivery, lifecycle, format and behaviour of downloadable Java byte-code applications), the Binary Format for Scene (BIFS), and the Extensible MPEG-4 Textual (XMT) format, a textual representation of MPEG-4 multimedia content using XML.[35] (Also known as BIFS, XMT, MPEG-J;[36] MPEG-J was defined in MPEG-4 Part 21.)
  • Part 12 – ISO base media file format (ISO/IEC 14496-12;[37] 2004, latest 2015,[38] amended 2017[39]): A file format for storing time-based media content; a general format forming the basis for a number of more specific file formats (e.g. 3GP, Motion JPEG 2000, MPEG-4 Part 14). Technically identical to ISO/IEC 15444-12 (JPEG 2000 image coding system – Part 12).
  • Part 13 – Intellectual Property Management and Protection (IPMP) Extensions (ISO/IEC 14496-13;[40] 2004[41]): Revised an amendment to MPEG-4 Part 1 (ISO/IEC 14496-1:2001/Amd 3:2004). Specifies common IPMP processing; syntax and semantics for the carriage of IPMP tools in the bit stream; IPMP information carriage; mutual authentication for IPMP tools; and a list of registration authorities required to support the amended specifications (e.g. CISAC). Defined because of the lack of interoperability between different protection mechanisms (different DRM systems) for protecting and distributing copyrighted digital content such as music or video.[42][43][44][45][46][47][48][49][50]
  • Part 14 – MP4 file format (ISO/IEC 14496-14;[51] 2003,[52] amended 2010[53]): Also known as "MPEG-4 file format version 2"; the designated container file format for MPEG-4 content, based on Part 12. It revises and completely replaces Clause 13 of ISO/IEC 14496-1 (MPEG-4 Part 1: Systems), in which the MPEG-4 file format was previously specified.
  • Part 15 – Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format (ISO/IEC 14496-15;[54] 2004, latest 2022,[55] amended 2023[56]): For storage of Part 10 video. Based on Part 12, but also allows storage in other file formats.
  • Part 16 – Animation Framework eXtension (AFX) (ISO/IEC 14496-16;[57] 2004, latest 2011,[58] amended 2016[59]): Specifies the AFX model for representing 3D graphics content. MPEG-4 is extended with higher-level synthetic objects for specifying geometry, texture and animation, and with dedicated compression algorithms.
  • Part 17 – Streaming text format (ISO/IEC 14496-17;[60] 2006[61]): Timed Text subtitle format.
  • Part 18 – Font compression and streaming (ISO/IEC 14496-18;[62] 2004,[63] amended 2014): For the Open Font Format defined in Part 22.
  • Part 19 – Synthesized texture stream (ISO/IEC 14496-19;[64] 2004[65]): Synthesized texture streams, used for creating very-low-bitrate synthetic video clips.
  • Part 20 – Lightweight Application Scene Representation (LASeR) and Simple Aggregation Format (SAF) (ISO/IEC 14496-20;[66] 2006, latest 2008,[67] amended 2010): LASeR requirements (compression efficiency, code and memory footprint) are fulfilled by building upon the existing Scalable Vector Graphics (SVG) format defined by the World Wide Web Consortium.[68]
  • Part 21 – MPEG-J Graphics Framework eXtensions (GFX) (ISO/IEC 14496-21;[69] 2006[70]): Describes a lightweight programmatic environment for advanced interactive multimedia applications: a framework that marries a subset of the MPEG standard Java application environment (MPEG-J) with a Java API.[36][70][71][72] (At "FCD" stage in July 2005, FDIS in January 2006, published as an ISO standard on 2006-11-22.)
  • Part 22 – Open Font Format (ISO/IEC 14496-22;[73] 2007, latest 2015,[74] amended 2017): Based on, and technically equivalent to, the OpenType version 1.4 font format specification.[75][76] Reached "CD" stage in July 2005; published as an ISO standard in 2007.
  • Part 23 – Symbolic Music Representation (SMR) (ISO/IEC 14496-23;[77] 2008[78]): Reached "FCD" stage in October 2006; published as an ISO standard on 2008-01-28.
  • Part 24 – Audio and systems interaction (ISO/IEC TR 14496-24;[79] 2008[80]): Describes the desired joint behavior of the MPEG-4 file format and MPEG-4 Audio.
  • Part 25 – 3D Graphics Compression Model (ISO/IEC 14496-25;[81] 2009, latest 2011[82]): Defines a model for connecting 3D graphics compression tools defined in MPEG-4 standards to graphics primitives defined in any other standard or specification.
  • Part 26 – Audio Conformance (ISO/IEC 14496-26;[83] 2010,[84] amended 2016).
  • Part 27 – 3D Graphics conformance (ISO/IEC 14496-27;[85] 2009,[86] amended 2015[87]): Summarizes the requirements, cross-references them to characteristics, and defines how conformance with them can be tested; gives guidelines on constructing tests to verify decoder conformance.
  • Part 28 – Composite font representation (ISO/IEC 14496-28;[88] 2012[89]).
  • Part 29 – Web video coding (ISO/IEC 14496-29;[90] 2014, latest 2015): Text derived from Part 10 (ISO/IEC 14496-10). Web video coding is compatible with the Constrained Baseline Profile of ISO/IEC 14496-10 (the subset specified in Annex A for Constrained Baseline is normative; all remaining parts are informative).
  • Part 30 – Timed text and other visual overlays in ISO base media file format (ISO/IEC 14496-30;[91] 2014): Describes the carriage of some forms of timed text and subtitle streams in files based on ISO/IEC 14496-12: W3C Timed Text Markup Language 1.0 and W3C WebVTT (Web Video Text Tracks). Documenting these forms does not preclude other definitions of carriage of timed text or subtitles; see, for example, 3GPP Timed Text (3GPP TS 26.245).
  • Part 31 – Video Coding for Browsers (ISO/IEC 14496-31;[92] under development as of 2018-05): Video Coding for Browsers (VCB), a video compression technology intended for use within World Wide Web browsers.
  • Part 32 – Conformance and reference software (ISO/IEC CD 14496-32;[93] under development).
  • Part 33 – Internet video coding (ISO/IEC FDIS 14496-33;[94] under development).

Profiles are also defined within the individual "parts", so an implementation of a part is ordinarily not an implementation of an entire part.
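
To make Parts 12 and 14 from the table above more concrete: a file in the ISO base media file format is a sequence of "boxes", each headed by a 32-bit big-endian size and a four-character type, with a 64-bit escape size when the size field is 1. The following minimal reader walks that outer structure; it is a sketch, not a validating parser.

```python
import struct

def iter_boxes(data: bytes, offset: int = 0, end: int = None):
    """Yield (type, payload) for each top-level box in an ISO BMFF byte string."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:    # 64-bit "largesize" follows the type field
            size, = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the file
            size = end - offset
        yield box_type.decode("ascii"), data[offset + header : offset + size]
        offset += size

# A minimal, hand-built 'ftyp' box: size=16, type='ftyp', brand 'mp42', version 0.
ftyp = struct.pack(">I4s4sI", 16, b"ftyp", b"mp42", 0)
for box_type, payload in iter_boxes(ftyp):
    print(box_type, len(payload))   # ftyp 8
```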

MPEG-1, MPEG-2, MPEG-7 and MPEG-21 are other suites of MPEG standards.

Licensing

MPEG-4 contains patented technologies, the use of which requires licensing in countries that acknowledge software algorithm patents. Over two dozen companies claim to have patents covering MPEG-4. MPEG LA[95] licenses patents required for MPEG-4 Part 2 Visual from a wide range of companies (audio is licensed separately) and lists all of its licensors and licensees on its site. New licenses for MPEG-4 Systems patents are under development;[96] no new licenses are being offered while holders of the old MPEG-4 Systems license remain covered under the terms of that license for the patents listed.[97]

The majority of patents used for the MPEG-4 Visual format are held by three Japanese companies: Mitsubishi Electric (255 patents), Hitachi (206 patents), and Panasonic (200 patents).

from Grokipedia
MPEG-4, formally known as ISO/IEC 14496, is a suite of international standards developed by the Moving Picture Experts Group (MPEG) for the compression, delivery, and management of multimedia content across fixed and mobile networks. It enables the representation of audio-visual scenes as compositions of objects, supporting interactive and scalable applications. The standard emphasizes object-based coding, allowing individual elements like video objects, audio streams, and graphics to be manipulated independently for enhanced interactivity and efficiency.

The development of MPEG-4 built upon the successes of earlier MPEG standards, such as MPEG-1 for digital storage media and MPEG-2 for broadcast television, with work beginning in the mid-1990s to address emerging needs for networked and mobile multimedia. The first core parts of the standard were published in early 1999, marking its formal ratification as ISO/IEC 14496 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Subsequent amendments and additional parts have been released over the years, with ongoing updates to incorporate advancements like improved compression algorithms.

MPEG-4 comprises over 30 parts, each addressing specific aspects of multimedia handling; key components include Part 1 (Systems) for scene description and delivery, Part 2 (Visual) for video object compression, Part 3 (Audio) for speech and general audio coding, Part 10 (Advanced Video Coding, or AVC/H.264) for high-efficiency video, and Part 14 (MP4 file format) for storing timed media streams. These parts support a range of profiles and levels tailored to applications, from low-bitrate mobile streaming to high-definition broadcasting.

Notable features of MPEG-4 include its support for 2D and 3D graphics, synthetic content generation, text rendering, and the Binary Format for Scenes (BIFS) to enable dynamic scene composition and user interaction. It provides superior compression efficiency compared to prior standards, facilitating bandwidth savings in diverse environments. Applications encompass digital television, video-on-demand services, mobile devices, surveillance systems, and web-based delivery. The MP4 file format, in particular, has become ubiquitous for storing and streaming video files across platforms.

History and Development

Origins and Goals

The Moving Picture Experts Group (MPEG) was established in 1988 as a working group (WG11) under the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) Joint Technical Committee 1, Subcommittee 29 (ISO/IEC JTC1/SC29), initially focused on developing standards for the compression of moving pictures and associated audio. Building on the success of MPEG-1, which targeted storage on digital media such as compact discs, and MPEG-2, designed for broadcasting and DVD, the committee expanded its scope with MPEG-4 to encompass broader multimedia applications beyond traditional video and audio coding. In July 1993, MPEG issued a call for requirements to define the objectives of a new standard aimed at addressing emerging needs in multimedia communication and delivery, particularly over heterogeneous networks.

The primary goals included achieving improved compression efficiency, targeting approximately 50% better performance than MPEG-2 in terms of bitrate reduction for equivalent subjective quality, to enable efficient transmission and storage. Additional objectives encompassed support for interactivity, allowing users to manipulate individual elements within scenes, and content-based manipulation, facilitating access and editing of specific objects rather than entire frames; these features were essential for applications over low-bitrate channels, such as mobile networks operating at bitrates as low as 10 kbit/s.

A core emphasis of the MPEG-4 goals was object-based coding, which treats audiovisual data as composable objects, either natural (e.g., captured video) or synthetic (e.g., computer-generated graphics), to enable scalable representation and universal access across diverse devices and network conditions. This approach supported robustness in error-prone environments and adaptability to varying bandwidths, promoting content delivery to a wide range of users from desktop computers to portable devices.

Key milestones included the issuance of the call for proposals in July 1995, soliciting technologies aligned with the defined requirements, followed by the evaluation and selection of core technologies at the MPEG meeting in January 1996, marking the integration of proposals into the initial verification model.

Standardization Timeline

The standardization process for MPEG-4, designated as ISO/IEC 14496, began in July 1993 under the Moving Picture Experts Group (MPEG), part of ISO/IEC JTC 1/SC 29, with the call for proposals issued in July 1995, a working draft released in November 1996, and the committee draft published in late 1997. The effort culminated in the finalization of the initial standard in late 1998, with formal publication of the first editions in 1999.

Phase 1 of MPEG-4 development, covering the period from 1996 to 1999, concentrated on establishing core tools for 2D and 3D natural/synthetic models, facial animation parameters, and basic video and audio coding functionalities. This phase resulted in the first editions of Parts 1 through 3 of ISO/IEC 14496 published as International Standards in December 1999: Part 1 (Systems) for multiplexing and synchronization, Part 2 (Visual) for object-based video coding, and Part 3 (Audio) for advanced audio compression, with Part 4 (Conformance Testing) for verification procedures following in 2000.

Phase 2, extending from 2000 to 2002, expanded the standard with advanced features such as fine granularity scalability, error resilience, and support for higher-resolution content, incorporating amendments to existing parts and introducing new ones. This included Parts 5 (Reference Software), 6 (Delivery Multimedia Integration Framework), 7 (Optimized Reference Software), 8 (Carriage over IP Networks), and 9 (Reference Hardware Description). A major milestone was Part 10 (Advanced Video Coding, or AVC), developed in collaboration with the ITU-T Video Coding Experts Group as H.264, which was finalized and published in May 2003 as the first edition of ISO/IEC 14496-10.

Following Phase 2, subsequent development phases from 2004 onward addressed emerging needs in multimedia delivery and 3D content, leading to Parts 11 through 30. Notable additions included Part 12 (ISO Base Media File Format), with its first edition published in 2004, providing a foundational structure for media storage shared with other standards, and Part 16 (Animation Framework eXtension, or AFX) in February 2004, enabling advanced 3D graphics and animation tools. The MPEG group has continued maintenance, with amendments and new editions incorporating improvements; for instance, updates to Part 10 for enhanced compression were incorporated through editions up to 2023. As of 2025, MPEG continues maintenance of MPEG-4 parts, with recent updates to file formats and integration with modern standards. This ongoing collaboration between ISO, IEC, and ITU-T ensures MPEG-4's adaptability across fixed and mobile environments.

Technical Overview

Core Principles

MPEG-4 adopts an object-based representation for scenes, treating media as discrete, reusable objects such as video objects, audio streams, or graphics elements, each described by metadata for properties like shape, texture, and temporal behavior. These objects are composed hierarchically within a scene graph, allowing independent manipulation, synchronization, and delivery, which enables applications like content editing, selective transmission, and user interaction without decoding the entire stream. This approach contrasts with frame-based methods in prior standards by emphasizing semantic structure over pixel-level processing, facilitating scalability across diverse devices and networks.

Central to this architecture is the Binary Format for Scenes (BIFS), a compact syntax derived from VRML that describes the spatiotemporal organization of objects in 2D or 3D scenes using nodes for grouping, positioning, transformations, and event handling. BIFS supports dynamic scene updates through command streams, enabling real-time interactivity such as object selection or animation, while its binary encoding ensures efficient storage and transmission. This scene description mechanism promotes content reusability and adaptability, allowing scenes to scale from low-bitrate mobile viewing to high-resolution immersive experiences.

The Delivery Multimedia Integration Framework (DMIF) serves as a unified framework for accessing and delivering MPEG-4 content across heterogeneous environments, including networks, storage media, and broadcasts, via a standardized session protocol and application interface. DMIF handles resource negotiation, quality-of-service management, and synchronization of object streams, insulating applications from underlying transport specifics like IP or MPEG-2 transport streams. This framework ensures seamless integration of delivery, supporting both pull (interactive) and push (broadcast) modes.

MPEG-4 inherently supports hybrid natural and synthetic content by integrating compressed streams of real-world media, such as video and audio, with generated elements like 2D/3D graphics and facial animation parameters within the same scene framework. This unification allows for blended audiovisual experiences, where synthetic objects can overlay or interact with natural ones, enhancing applications in virtual environments, gaming, and augmented communication. The design accommodates bitrates from low (e.g., 2-5 kbit/s for speech) to high (up to 10 Mbit/s for video), prioritizing efficient coding for mixed content types.

To ensure robustness, particularly in error-prone channels like wireless networks, MPEG-4 incorporates error resilience features at the systems level, including resynchronization markers and scalable object hierarchies that permit graceful degradation. These mechanisms, combined with BIFS updates for error recovery, maintain scene integrity without full retransmission, supporting reliable operation at bitrates below 64 kbit/s.
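
To give a feel for the object and scene-graph model, the sketch below composes a scene from a natural video object and a synthetic caption in plain Python. It is purely illustrative: real BIFS scenes are binary-encoded VRML-style node trees, and every class and field name here is invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Illustrative scene-graph node; stands in for a BIFS grouping node."""
    name: str
    children: list = field(default_factory=list)

@dataclass
class Transform(Node):
    translation: tuple = (0.0, 0.0, 0.0)  # positions its subtree in the scene

@dataclass
class MediaObject(Node):
    stream_id: int = 0  # elementary stream carrying this object's coded data

# Compose a scene: a video object and a synthetic caption, positioned independently,
# so either can be moved, replaced, or dropped without touching the other's stream.
scene = Node("Group", children=[
    Transform("VideoPos", translation=(0.0, 0.0, 0.0),
              children=[MediaObject("NewsVideo", stream_id=3)]),
    Transform("CaptionPos", translation=(0.0, -0.8, 0.0),
              children=[MediaObject("Caption", stream_id=5)]),
])

def walk(node, depth=0):
    print("  " * depth + node.name)
    for child in node.children:
        walk(child, depth + 1)

walk(scene)
```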

Key Innovations

MPEG-4 introduced content-based interactivity as a core advancement, enabling the manipulation of individual audiovisual objects within a scene, such as selecting, editing, or scaling specific video objects, without requiring re-encoding of the entire multimedia stream. This object-based approach allows for applications like user-driven content customization, where elements can be interacted with independently, distinguishing it from frame-based standards like MPEG-2.

The standard's scalability features represent another major innovation, supporting temporal scalability through layered coding to adjust frame rates, spatial scalability for varying resolutions via enhancement layers, and quality (SNR) scalability using fine granularity scalability (FGS) techniques that enable progressive refinement of video quality. These mechanisms facilitate adaptive streaming over heterogeneous networks, allowing bitstreams to be tailored to fluctuating bandwidth conditions without full re-transmission; a toy layer-selection sketch appears at the end of this subsection.

Universal multimedia access was achieved through tools optimized for low-bitrate delivery, supporting rates as low as 5 kbit/s for basic video while enabling high-quality rendering on diverse devices ranging from mobile phones to high-definition displays. This versatility ensures accessibility across varying computational resources and network capabilities, promoting widespread adoption in early internet and mobile applications.

MPEG-4 integrated synthetic media by defining Facial Animation Parameters (FAP) and Body Animation Parameters (BAP) in Parts 1 and 2, providing a parametric framework for animating 3D face and body models with minimal data overhead: 68 FAPs control deformations of 84 facial feature points to produce expressions, while 196 BAPs define joint rotations and movements for the body model. These parameters enable realistic synthesis of virtual characters, blending seamlessly with natural video for hybrid content creation.

Hybrid coding in MPEG-4 combines traditional block-based transform coding with wavelet-based methods, particularly in the Synthetic and Natural Hybrid Coding (SNHC) tools, to enhance efficiency for textured and composite scenes. This approach leverages discrete wavelet transforms for scalable texture representation, achieving better compression ratios than the purely block-based DCT coding in MPEG-2, especially for irregular or synthetic-natural hybrid imagery.
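
As promised above, here is a toy illustration of layered scalability. The layer names, rates, and ordering below are invented for the example, and real FGS bitstreams allow much finer truncation than whole-layer granularity; the point is only that a receiver keeps the longest prefix of layers its channel can carry.

```python
# Hypothetical layered bitstream: a base layer plus enhancement layers, each
# adding frame rate, resolution, or SNR refinement at extra bitrate cost.
LAYERS = [
    {"name": "base",         "kbps": 64,  "fps": 7.5, "res": "QCIF"},
    {"name": "temporal-enh", "kbps": 64,  "fps": 15,  "res": "QCIF"},
    {"name": "spatial-enh",  "kbps": 256, "fps": 15,  "res": "CIF"},
    {"name": "snr-enh",      "kbps": 128, "fps": 15,  "res": "CIF"},
]

def select_layers(available_kbps: float):
    """Keep the prefix of layers whose cumulative rate fits the channel."""
    chosen, total = [], 0
    for layer in LAYERS:
        if total + layer["kbps"] > available_kbps:
            break  # each layer depends on the ones before it, so stop here
        chosen.append(layer["name"])
        total += layer["kbps"]
    return chosen, total

print(select_layers(200))  # (['base', 'temporal-enh'], 128)
print(select_layers(600))  # all four layers, 512 kbit/s total
```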

Video Compression


MPEG-4 encompasses multiple video compression standards defined in ISO/IEC 14496, with Parts 2 and 10 being central to its video coding capabilities. Part 2, known as MPEG-4 Visual, provides a flexible toolkit for compressing rectangular frame-based video, supporting bit rates ranging from 5 kbit/s to over 1 Gbit/s. It accommodates progressive and interlaced formats, resolutions from sub-QCIF (128×96) to 4096×4096 pixels, and color subsampling options such as 4:2:0, 4:2:2, and 4:4:4. This part builds on earlier MPEG technologies by introducing object-based coding, allowing video objects to be independently encoded and manipulated, which enhances interactivity in applications.

The core of MPEG-4 Visual employs a hybrid coding framework that integrates motion-compensated prediction with discrete cosine transform (DCT) coding and quantization. Intra-coded video object planes (I-VOPs) use spatial prediction within the frame, while predictive (P-VOPs) and bidirectional (B-VOPs) planes leverage temporal prediction from reference frames. Innovations include quarter-pixel motion vector accuracy for smoother motion representation, global motion compensation for camera panning effects, and variable block sizes (e.g., 16×16 or 8×8 macroblocks) to adapt to content complexity. Additional tools support spatial (resolution), temporal (frame rate), and quality (SNR) scalability, enabling adaptive streaming, as well as error resilience features like resynchronization markers, data partitioning, and reversible variable-length coding (RVLC) for transmission over unreliable networks. Profiles such as Simple, Advanced Simple, and Core provide tailored constraints for applications from mobile video to broadcast.

MPEG-4 Part 10, or Advanced Video Coding (AVC), also standardized as H.264, represents a major advancement in video compression efficiency within the MPEG-4 family. Developed jointly by MPEG and ITU-T's Video Coding Experts Group (VCEG) from 2001 onward, it achieves approximately 50% better compression than MPEG-2 at equivalent quality levels, supporting high-definition video at bit rates as low as 1 Mbit/s for 720p content. AVC uses an enhanced hybrid approach with more sophisticated intra and inter prediction modes: intra prediction employs directional modes (e.g., 9 for 4×4 blocks), while inter prediction supports multiple reference frames, weighted prediction, and sub-pixel accuracy up to 1/4 pixel. The transform stage applies a 4×4 DCT approximation for reduced complexity, followed by context-adaptive binary arithmetic coding (CABAC) or context-adaptive variable-length coding (CAVLC) for entropy coding, which significantly improves rate-distortion performance.

Key innovations in AVC include in-loop deblocking filters that reduce blocking artifacts and improve prediction accuracy, and flexible block partitioning (from 16×16 down to 4×4) for efficient handling of diverse video content. It supports scalability through extensions such as SVC (Scalable Video Coding, in Part 10 amendments) and multiview coding for stereoscopic video. Profiles such as Baseline (for low-latency applications like video conferencing), Main (for broadcast), and High (for HD/4K with 8×8 transforms) ensure broad applicability, from mobile devices to professional cinema. AVC's widespread adoption stems from its balance of compression efficiency, up to twice that of MPEG-4 Visual, and computational feasibility, powering formats like Blu-ray discs and streaming services.

Beyond core 2D video, MPEG-4 includes specialized compression for synthetic content, such as Face and Body Animation (Parts 1 and 2), which encodes facial expressions and body poses at low bit rates (2–3 kbit/s for faces, up to 40 kbit/s for bodies) using parameter-based models.
3D Mesh Coding (Part 16) achieves 30:1 to 40:1 compression ratios for triangular mesh models through wavelet decomposition and compact coding of mesh geometry and connectivity. These extensions enable immersive applications such as virtual environments and animation, integrating seamlessly with the Binary Format for Scenes (BIFS) in MPEG-4 systems. Overall, MPEG-4 video compression prioritizes versatility, enabling content creation, delivery, and interaction across diverse platforms.
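
To ground the hybrid motion-compensation-plus-transform loop described earlier in this section, here is a small numerical sketch in Python with NumPy. The motion vector, block position, and quantizer step are illustrative, and a real encoder would perform an actual motion search and entropy-code the result; the point is only that, when prediction succeeds, the transformed residual quantizes to almost nothing.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis: row k is a(k) * cos((2i + 1) * k * pi / (2n))."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.cos((2 * i + 1) * k * np.pi / (2 * n)) * np.sqrt(2 / n)
    c[0, :] = np.sqrt(1 / n)
    return c

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, (16, 16)).astype(float)  # previous decoded frame
current = np.roll(reference, shift=(0, 2), axis=(0, 1))   # same scene, moved 2 px right

# Motion compensation: the encoder searches the reference for the best match.
# Here the motion is known by construction, so the "search result" is assumed.
mv = (0, -2)  # best match sits 2 px to the left in the reference frame
cur = current[4:12, 4:12]                                  # 8x8 block to code
pred = reference[4 + mv[0]:12 + mv[0], 4 + mv[1]:12 + mv[1]]
residual = cur - pred                                      # what prediction missed

C = dct_matrix()
coeffs = C @ residual @ C.T          # 2-D DCT concentrates residual energy
quantized = np.round(coeffs / 16)    # uniform quantizer, step 16 (illustrative)
print(int(np.abs(quantized).sum()))  # 0: the motion vector explains everything
```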
