Extent (file systems)
In computing, an extent is a contiguous area of storage reserved for a file in a file system, represented as a range of block numbers, or tracks on count key data devices. A file can consist of zero or more extents; one file fragment requires one extent. The direct benefit is in storing each range compactly as two numbers, instead of canonically storing every block number in the range.[1] Also, extent allocation results in less file fragmentation.
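The compaction described above can be illustrated with a short sketch. The function name `blocks_to_extents` and the sample block numbers are hypothetical, chosen only for illustration; the code collapses an ordered list of block numbers into (start, length) pairs, producing one pair per file fragment.

```python
from typing import Iterable, List, Tuple

Extent = Tuple[int, int]  # (start_block, length_in_blocks)


def blocks_to_extents(blocks: Iterable[int]) -> List[Extent]:
    """Collapse a file's ordered block numbers into (start, length) extents."""
    extents: List[Extent] = []
    for block in blocks:
        # Extend the last extent when this block directly follows it.
        if extents and extents[-1][0] + extents[-1][1] == block:
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Otherwise a new fragment begins, which needs a new extent.
            extents.append((block, 1))
    return extents


# Seven block numbers in three fragments become three extents.
file_blocks = [100, 101, 102, 103, 240, 241, 500]
print(blocks_to_extents(file_blocks))  # [(100, 4), (240, 2), (500, 1)]
```

A fully contiguous file of any length reduces to a single pair, which is where the space saving comes from.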
Extent-based file systems can also eliminate most of the metadata overhead of large files that would traditionally be taken up by the block-allocation tree. However, because this metadata is small compared to the amount of stored data at any file size, even though it makes up a large portion of the metadata for large files, the overall benefits in storage efficiency and performance are slight.[2]
In order to resist fragmentation, several extent-based file systems do allocate-on-flush. Many modern fault-tolerant file systems also do copy-on-write, although that increases fragmentation. As a similar design, the CP/M file system uses extents as well, but those do not correspond to the definition given above. CP/M's extents appear contiguously as a single block in the combined directory/allocation table, and they do not necessarily correspond to a contiguous data area on disk.
IBM OS/360 and successors allocate files in multiples of disk tracks or cylinders. Files could originally have up to 16 extents, but this restriction has since been lifted. The initial allocation size, and the size of additional extents to be allocated if required, are specified by the user via Job Control Language. The system attempts to allocate the initial size as a contiguous area, although this may be split if contiguous space is not available.
Adoption
The systems supporting file system extents include the following:
- APFS – Apple File System
- ASM – Automatic Storage Management – Oracle's database-oriented file system
- BFS – BeOS, Zeta and Haiku operating systems
- Btrfs – Extent-based copy-on-write (COW) file system for Linux
- EFS – Extent File System – SGI's first-generation file system for IRIX
- ext4 – Linux file system (when the configuration enables extents – the default in Linux since version 2.6.23)
- Files-11 – OpenVMS file system
- HFS and HFS Plus – Hierarchical File System – Apple Macintosh file systems
- High Performance File System (HPFS) – on OS/2, eComStation and ArcaOS
- IceFS – IceFileSystem – optional file system for MorphOS
- JFS – Journaled File System – used by AIX, OS/2/eComStation/ArcaOS and Linux operating systems
- ISO 9660 – Extent-based file system for optical disc media
- MPE File System – the file system of the Multi-Programming Executive operating system.
- NTFS – used by Windows
- OCFS2 – Oracle Cluster File System – a shared-disk file system for Linux
- Reiser4 – Linux file system (in "extents" mode)
- SINTRAN III – file system used by early computer company Norsk Data
- UDF – Universal Disk Format – standard for optical media
- VERITAS File System – enabled via the pre-allocation API and CLI
- XFS – SGI's second-generation file system for IRIX and Linux
Adoption outside of file systems includes the following:
- Microsoft SQL Server – uses 64 KB extents, each consisting of eight 8 KB pages.[3]
- Oracle Database groups blocks into extents and extents into segments.[4]
References
- ^ "Understanding Ext4 (part1): Extents". 2010-12-20. Archived from the original on 2015-02-03. Retrieved 2015-02-02.
What's really a departure for EXT4 however, is the use of extents rather than the old, inefficient indirect block mechanism used by earlier Unix file systems (e.g. EXT2 and EXT3) for tracking file content. Extents are similar to cluster runs in the NTFS file system; essentially, they specify an initial block address and the number of blocks that make up the extent. A file that is fragmented will have multiple extents, but EXT4 tries very hard to keep files contiguous.
- ^ "Ext4 Disk Layout". 2015-01-26. Retrieved 2015-02-02.
If flex_bg is enabled, it is possible to allocate very large files with a single extent, at a considerable reduction in metadata block use, and some improvement in disk efficiency.
- ^ "Pages and Extents Architecture Guide - SQL Server". learn.microsoft.com. Microsoft. 12 June 2024. Retrieved 18 December 2024.
- ^ "Oracle Database 23ai Technical Architecture". docs.oracle.com. Oracle Corporation. Retrieved 18 December 2024.
External links
- Getting to know the Solaris filesystem, Part 1: Allocation and storage strategy – a comparison of block-based and extent-based allocation
Fundamentals
Definition
In file systems, storage is divided into fixed-size allocation units, such as blocks or clusters, which represent the smallest addressable portions of disk space used to hold file data.[8] These units enable efficient management of data placement on secondary storage devices like hard disks.[8] An extent is a contiguous sequence of these blocks or sectors allocated to a single file, represented compactly by two integers: the starting logical block address and the length (number of blocks).[8] This structure allows the file system to describe large, continuous portions of a file's data with minimal metadata overhead, as opposed to listing each individual block.[8] For instance, a 1 MB file using 4 KB blocks could be allocated as one extent comprising 256 contiguous blocks, specified by its starting block number and length of 256.[8] In cases of fragmentation, the same file might instead span multiple non-contiguous extents, each described similarly but requiring additional metadata to link them.[8] This approach contrasts with non-contiguous methods like linked or indexed allocation, where blocks are not required to be adjacent.[8]

Comparison to Other Allocation Methods
In linked allocation, disk blocks for a file are organized as a linked list, where each block contains a pointer to the next block in the sequence, with the file's metadata storing only the location of the first block.[9] This method avoids external fragmentation since blocks can be allocated anywhere available, but sequential reads can incur a seek for nearly every block because the chain must be followed block by block, and random access to a given block requires reading the entire preceding chain.[9] Additionally, it suffers from reliability issues, as corruption in a single pointer can render the rest of the file inaccessible.[10]

Indexed allocation addresses some of linked allocation's drawbacks by using a dedicated index block that holds pointers to all data blocks of the file, enabling direct random access to any block without traversal.[11] The file's metadata points to this index block, and space is allocated as needed up to the index's capacity.[11] While it eliminates external fragmentation and supports efficient growth for small to medium files, the method is limited by the fixed size of the index block, which can hold only a finite number of pointers. Large files therefore require multi-level indexing, which adds further overhead and indirection.[9] For very small files, it wastes space since the entire index block must be allocated regardless of usage.[9]

Extents improve upon these approaches by grouping multiple contiguous disk blocks into a single logical unit, represented in metadata by just the starting block and length, rather than individual pointers.[9] A file can consist of several such extents, allowing non-contiguous storage while minimizing metadata overhead. For instance, a 1 GB file divided into 100 MB extents needs only about ten entries, whereas an indexed scheme with 4 KB blocks would need roughly 262,000 block pointers.[9] This reduces the space and access cost for metadata, enhances sequential read performance due to locality, and mitigates the index size limitations of pure indexed allocation, though it still risks external fragmentation if extents cannot be placed adjacently.[10]

| Method | Pros | Cons | Use Cases |
|---|---|---|---|
| Linked | No external fragmentation; simple growth without relocation.[9] | Poor random access; high sequential seek overhead; reliability risks from pointer corruption.[10] | Systems prioritizing simplicity over performance, such as early embedded or sequential-only workloads.[11] |
| Indexed | Supports random access; no external fragmentation; efficient growth for files fitting within index size.[11] | High metadata overhead; limited file size without multi-level extensions; space waste for small files.[9] | General-purpose file systems needing balanced access patterns, like medium-sized files in multi-user environments.[9] |
| Extents | Low metadata overhead (one entry per group); good sequential locality; scales to large files with fewer entries.[9] | Potential external fragmentation; growth limited by maximum extents per file.[10] | Large-file storage in modern systems, such as databases or media files where sequential access dominates.[9] |
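The metadata difference between indexed and extent-based allocation can be estimated with simple arithmetic. The sketch below is illustrative only: the 4-byte pointer size and 12-byte extent entry size are assumptions (real file systems differ), and the helper names are hypothetical.

```python
KIB, MIB, GIB = 1024, 1024**2, 1024**3


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def indexed_metadata(file_size: int, block_size: int = 4 * KIB,
                     pointer_bytes: int = 4) -> tuple:
    """Return (pointer count, metadata bytes) for one pointer per block.

    pointer_bytes=4 is an assumed size; index-block overhead is ignored.
    """
    pointers = ceil_div(file_size, block_size)
    return pointers, pointers * pointer_bytes


def extent_metadata(file_size: int, extent_size: int,
                    entry_bytes: int = 12) -> tuple:
    """Return (entry count, metadata bytes) for equally sized extents.

    entry_bytes=12 is an assumed size for one (start, length) record.
    """
    entries = ceil_div(file_size, extent_size)
    return entries, entries * entry_bytes


print(indexed_metadata(1 * GIB))            # (262144, 1048576)
print(extent_metadata(1 * GIB, 100 * MIB))  # (11, 132)
```

Under these assumptions, a 1 GiB file needs about 1 MiB of block pointers but only 132 bytes of extent entries. The gap narrows as a file becomes more fragmented, since every fragment needs its own extent.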
History
Origins in Early File Systems
The concept of extents in file systems emerged as a response to the challenges of managing storage on early computing hardware, where contiguous allocation was often handled in systems like the Compatible Time-Sharing System (CTSS) and Multics. In CTSS, introduced in 1961, files were allocated contiguously on disk tracks by the system based on user size requests, with symbolic referencing via names and logical module numbers, which limited flexibility and increased administrative overhead on tape and early drum storage.[12] Multics, developed starting in 1965 as an evolution of CTSS, employed segments (logically contiguous units in its hierarchical file system) with automatic allocation, though early designs faced fragmentation challenges on direct-access devices, highlighting the need for automated mechanisms to handle non-contiguous storage efficiently.[13]

Extents were first formalized in the IBM System/360 operating systems, with OS/360 announced alongside the System/360 hardware in 1964 to address mainframe storage efficiency. In OS/360, an extent is defined as a contiguous area of direct-access storage allocated to a data set, allowing files to span multiple non-contiguous regions when a single contiguous area was unavailable due to fragmentation.[14] Initially, data sets were limited to 16 extents per volume, allocated via the SPACE parameter in job control statements, which helped minimize seek times and optimize access on early disks like the IBM 2311.[15] This limit was later increased in successor systems, but the original design prioritized reducing directory overhead on constrained media by grouping blocks into larger units rather than tracking each individually.[14] The primary motivation was to support growing data sets for business applications, such as payroll processing, on volumes with limited capacity, where full contiguous allocation often failed due to prior usage patterns.[16]

A notable variant appeared in CP/M, the Control Program for Microcomputers, introduced in 1974 by Gary Kildall for 8-bit microprocessors. In CP/M's file system, extents consist of 16 KB units composed of 128-byte logical records, with each 32-byte directory entry describing one extent's allocation blocks. CP/M differs from modern block-based systems in that it lacks a separate on-disk bitmap and instead derives free space from directory scans.[17] Early versions like CP/M 1.4 supported up to 32 extents per file on floppy disks, expanding to 512 in CP/M 2.2 by 1979, to accommodate larger files while keeping the fixed 64-entry directory compact.[17][18] This approach was driven by the era's small, removable media constraints, aiming to reduce metadata size and enable efficient sequential access without the fragmentation issues of per-block pointers prevalent in even smaller hobbyist systems.[19]

Evolution in Modern File Systems
In the late 20th century, file system extents evolved from the fixed limits of early implementations, such as the 16 extents per volume in IBM's OS/360, to designs supporting thousands of dynamically allocated extents, enabling efficient handling of large, fragmented files without excessive metadata overhead.[20] This shift was driven by increasing storage capacities and the need for scalability, as seen in the Linux ext family, where ext2 (introduced in 1993) relied on block lists limited to 12 direct pointers plus indirect blocks, while ext4 (released in 2008) introduced an extent tree structure allowing files to span up to 16 terabytes with potentially thousands of extents per inode through multi-level indexing.[21] Similarly, database systems like Oracle, which adopted extents in the 1980s for tablespace allocation, transitioned to locally managed extents in version 8i (1999) and beyond, automatically sizing and scaling allocations to minimize administrative overhead in petabyte-scale environments.[22]

Post-1990s advancements integrated extents with journaling mechanisms to enhance fault tolerance, particularly in recovery scenarios following crashes or power failures. Journaling file systems like ext3 (2001) and JFS2 logged metadata changes, including block and extent allocations, to enable rapid replay and reconstruction of file structures, reducing recovery times from hours to seconds.[7] In ext4, extents are journaled alongside data or metadata, ensuring atomic updates and consistency during replays, which proved crucial for enterprise reliability as storage volumes grew.[23] This integration addressed the fragmentation risks in non-journaled systems, allowing extents to be reliably extended or split without data loss in fault scenarios.

Adaptations for solid-state drives (SSDs) emerged in the 2000s as flash memory proliferated, with extents facilitating wear-leveling and TRIM operations to optimize endurance and performance. Because an extent already describes a contiguous block range, free ranges are easy to identify for TRIM commands, which notify the SSD controller of deallocated space so that it can be garbage-collected, reducing write amplification, a concern since SSDs became mainstream around 2006.[24] File systems like ext4 added discard (TRIM) support in kernel 2.6.33 (2010), leveraging extent metadata to batch notifications of freed extents, thus aligning allocation strategies with SSD internals like flash translation layers.[25]

Key milestones include the ext2-to-ext4 progression, which marked a paradigm shift toward extent-based allocation for Linux, and Oracle's evolution to uniform and auto-allocated extents for database scaling. Post-2010, extents influenced cloud storage abstractions in distributed systems, such as the Container File System (CFS) proposed in 2019, where large files are split into distributed extents across nodes for scalable, fault-tolerant storage in containerized environments.[26]

Technical Implementation
Allocation and Management
In extent-based file systems, the allocation process begins by scanning structures that track free space, such as bitmaps or lists of free extents, to locate a contiguous run of free blocks that meets or exceeds the requested size for a new or expanding file.[27] This scan identifies suitable ranges of free blocks, prioritizing contiguity because an extent is by definition a contiguous storage unit. Greedy algorithms are commonly employed for efficiency: the first-fit approach allocates the first contiguous free space encountered that is large enough, while the best-fit strategy selects the smallest such space to minimize the leftover free fragment.[28]

Deallocation reverses this by returning freed extents to the free space pool and immediately merging them with any adjacent free extents, consolidating smaller fragments into larger contiguous runs to combat external fragmentation and improve future allocation performance.[27]

Several techniques optimize extent handling during operations. Allocate-on-flush delays extent allocation until data is flushed from cache to disk, enabling the system to batch writes and select larger contiguous runs, thereby minimizing fragmentation from scattered small allocations.[29] Conversely, copy-on-write protocols, often used for snapshots and versioning, redirect modifications to new extents rather than altering originals, which preserves data integrity but proliferates small, non-contiguous extents over time and exacerbates fragmentation.[30]

File growth, such as through appends, is managed by attempting to extend the trailing extent if adjacent free space allows; if not, the system allocates an additional extent, so the file's metadata then describes it as multiple extents. Preallocation reserves a substantial contiguous extent upfront for anticipated large-file growth, avoiding repeated small allocations and associated fragmentation.
The total size of an extent is determined by multiplying the underlying block size by the extent's length in blocks:

$$\text{extent size} = \text{block size} \times \text{length in blocks}$$

This formula ensures precise mapping of logical file data to physical storage.[31]
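The allocation and deallocation steps described in this section can be sketched with a toy free-extent list. This is a minimal illustration under simplifying assumptions, not how any particular file system implements it: the class `FreeExtentList`, its method names, and the 100-block disk are all hypothetical, and free space is kept in a plain sorted list rather than a bitmap or tree.

```python
from typing import List, Optional, Tuple

Extent = Tuple[int, int]  # (start_block, length_in_blocks)
BLOCK_SIZE = 4096         # assumed block size in bytes


class FreeExtentList:
    """Toy free-space tracker: a sorted list of non-adjacent free extents."""

    def __init__(self, total_blocks: int) -> None:
        self.free: List[Extent] = [(0, total_blocks)]

    def allocate(self, length: int, policy: str = "first") -> Optional[Extent]:
        """Carve `length` contiguous blocks out of one free extent."""
        candidates = [(i, e) for i, e in enumerate(self.free) if e[1] >= length]
        if not candidates:
            return None  # no single free extent is large enough
        if policy == "best":
            # Best fit: the smallest free extent that still fits.
            index, (start, free_len) = min(candidates, key=lambda c: c[1][1])
        else:
            # First fit: the lowest-addressed free extent that fits.
            index, (start, free_len) = candidates[0]
        if free_len == length:
            del self.free[index]
        else:
            self.free[index] = (start + length, free_len - length)
        return (start, length)

    def release(self, extent: Extent) -> None:
        """Return an extent to the pool and merge adjacent free extents."""
        self.free.append(extent)
        self.free.sort()
        merged: List[Extent] = []
        for start, length in self.free:
            if merged and merged[-1][0] + merged[-1][1] == start:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((start, length))
        self.free = merged


disk = FreeExtentList(total_blocks=100)
a = disk.allocate(20)                 # (0, 20)
b = disk.allocate(5)                  # (20, 5)
c = disk.allocate(6)                  # (25, 6)
d = disk.allocate(5)                  # (31, 5)
disk.release(a)
disk.release(c)                       # free: [(0, 20), (25, 6), (36, 64)]
e = disk.allocate(6, policy="best")   # (25, 6): exact-size hole
f = disk.allocate(6, policy="first")  # (0, 6): first hole that fits
disk.release(b)                       # merges with the free run before it
print(e, f, disk.free)                # (25, 6) (0, 6) [(6, 19), (36, 64)]
print(e[1] * BLOCK_SIZE)              # extent size in bytes: 24576
```

In this run, best fit reuses the exact-size hole while first fit splits the larger one, and releasing `b` coalesces with the neighbouring free run. The last line applies the size formula above: block size multiplied by length in blocks.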
Metadata Structures
In file systems that employ extents, metadata structures are designed to efficiently map logical file offsets to physical storage locations while minimizing overhead for large files. These structures typically organize extents in hierarchical trees, such as B-trees or B+-trees, to avoid linear scans and support rapid lookups and updates. For instance, in the ext4 file system, the extent tree is rooted in the inode's i_block array and uses a multi-level structure to represent mappings, enabling efficient handling of fragmented or large files without the limitations of traditional indirect block pointers.[32]

The on-disk format of extent metadata generally consists of headers followed by entries that encode essential attributes. A typical extent entry includes the starting logical block number, the starting physical block, the length of the contiguous run (in blocks), and flags indicating properties like compression, encryption, or preallocation status. In ext4, the extent tree begins with a 12-byte extent header (struct ext4_extent_header) containing a magic number (0xF30A), the number of valid and maximum entries, tree depth, and generation counter. This header is followed by leaf entries (struct ext4_extent, 12 bytes each) for direct mappings or index entries (struct ext4_extent_idx, 12 bytes each) for interior nodes pointing to child blocks. For example, an extent entry might specify ee_block as the starting file block, ee_len as up to 32,768 blocks (128 MiB maximum with 4 KiB blocks), and ee_start (split into high and low parts) as the physical block address.[32]

Multi-level extents extend this format for files exceeding direct inode capacity, using index entries that point to additional extent blocks. In ext4, the tree supports up to five levels, where interior nodes reference lower-level extent blocks, allowing mappings for files up to 16 TiB while keeping the root compact within the 60-byte i_block field (a 12-byte header plus up to four 12-byte entries).

Similarly, in Btrfs, the extent tree (object ID 2) uses a copy-on-write B-tree where EXTENT_ITEM keys encode the extent start (as objectid) and its length (as offset), and the item describes allocated space for data or metadata with fields for reference count, generation, and flags; interior nodes point to child subtrees for hierarchical organization. This structure tracks shared extents via back references, supporting features like snapshots without duplicating mappings.[32][33][34]

Metadata overhead arises primarily from the space required per extent entry, scaling with fragmentation. The total metadata space is approximately the number of extents multiplied by the entry size, typically 12-16 bytes per entry across systems like ext4 (12 bytes per extent or index) and XFS (where B+-tree extent records average similar sizes in allocation group trees). In ext4, leaf nodes include a 12-byte header plus entries, with optional 4-byte checksums, leaving minimal unallocated space in 4 KiB blocks; excessive fragmentation can thus inflate inode-linked metadata blocks. Btrfs extent trees add overhead for reference counts and backrefs but optimize via block groups (e.g., 256 MiB chunks) to batch allocations.[32][33][35]

| Structure | Size (bytes) | Key Fields | Example Usage |
|---|---|---|---|
| ext4_extent_header | 12 | eh_magic (2), eh_entries (2), eh_max (2), eh_depth (2), eh_generation (4) | Root or node header in i_block or extent block |
| ext4_extent (leaf) | 12 | ee_block (4), ee_len (2), ee_start_hi (2), ee_start_lo (4) | Maps 1,000 contiguous blocks: ee_block=0, ee_len=1000, ee_start=physical address |
| btrfs_extent_item | Variable (≥24) | refs (8), generation (8), flags (8), plus inline backrefs; the extent's start and size are carried in the item key | Tracks shared extent from byte offset X, size Y bytes, with refcount Z |
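The ext4 layouts in the table can be decoded with Python's `struct` module. The following is a minimal sketch, not a full ext4 reader: it handles only a leaf node (depth 0), ignores checksums and the special encoding ext4 uses for unwritten extents in ee_len, and runs on a synthetic 60-byte buffer rather than a real inode. The names `parse_leaf_node` and `logical_to_physical` and the sample block addresses are hypothetical.

```python
import struct
from typing import List, NamedTuple, Optional

EXT4_EXT_MAGIC = 0xF30A
# Little-endian layouts matching the field sizes in the table above.
HEADER = struct.Struct("<HHHHI")  # eh_magic, eh_entries, eh_max, eh_depth, eh_generation
LEAF = struct.Struct("<IHHI")     # ee_block, ee_len, ee_start_hi, ee_start_lo


class Extent(NamedTuple):
    logical: int   # first file block covered (ee_block)
    length: int    # number of blocks (ee_len)
    physical: int  # first on-disk block (ee_start_hi:ee_start_lo)


def parse_leaf_node(raw: bytes) -> List[Extent]:
    """Decode the header and leaf entries of a depth-0 extent node."""
    magic, entries, _max, depth, _generation = HEADER.unpack_from(raw, 0)
    if magic != EXT4_EXT_MAGIC:
        raise ValueError("bad extent header magic")
    if depth != 0:
        raise ValueError("interior node: this sketch only reads leaves")
    extents = []
    for i in range(entries):
        offset = HEADER.size + i * LEAF.size
        ee_block, ee_len, start_hi, start_lo = LEAF.unpack_from(raw, offset)
        extents.append(Extent(ee_block, ee_len, (start_hi << 32) | start_lo))
    return extents


def logical_to_physical(extents: List[Extent], block: int) -> Optional[int]:
    """Map a file block number to a disk block, or None for a hole."""
    for ext in extents:
        if ext.logical <= block < ext.logical + ext.length:
            return ext.physical + (block - ext.logical)
    return None


# Synthetic i_block contents: a header plus two extents, padded to 60 bytes.
raw = (HEADER.pack(EXT4_EXT_MAGIC, 2, 4, 0, 0)
       + LEAF.pack(0, 1000, 0, 34816)
       + LEAF.pack(1000, 24, 0, 90000)).ljust(60, b"\0")

extents = parse_leaf_node(raw)
print(extents[0])                          # Extent(logical=0, length=1000, physical=34816)
print(logical_to_physical(extents, 5))     # 34821
print(logical_to_physical(extents, 1010))  # 90010
print(logical_to_physical(extents, 2000))  # None
```

A linear scan is adequate for the four entries that fit in an inode; a deeper tree would instead descend through index entries, which are sorted by logical block and can be binary-searched.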