F2FS
F2FS (Flash-Friendly File System) is a file system developed by Samsung Electronics for the Linux kernel, specifically optimized for NAND flash memory-based storage devices such as SSDs, eMMC, and SD cards. Introduced in Linux kernel version 3.8 on December 20, 2012, it addresses the unique characteristics of flash storage, including erase-before-write operations and limited write endurance, by using an append-only logging scheme to convert random writes into sequential ones. The design of F2FS draws from traditional log-structured file systems like LFS while mitigating their drawbacks, such as the "wandering tree" problem and high garbage collection overhead, through features like a Node Address Table (NAT) for efficient index updates and multi-head logging to separate hot and cold data. Its on-disk layout divides the storage volume into fixed-size segments of 2 MB each, organized into sections and zones that align with the operational units of underlying flash translation layers (FTLs), enabling better garbage collection behavior and reduced write amplification. F2FS supports adaptive logging modes—switching between normal and threaded logging based on storage utilization—to maintain high write performance even as the file system fills up, and it incorporates background cleaning algorithms with greedy or cost-benefit victim selection for efficient garbage collection. In terms of performance, F2FS has demonstrated significant advantages over file systems like EXT4 on flash storage: on mobile devices, it achieves up to 3.1× faster throughput in benchmarks like IOzone and 2× in SQLite workloads, while on server-grade SSDs, it delivers up to 2.5× improvement in random write scenarios. Additional optimizations include fsync acceleration via roll-forward recovery and support for zoned storage, making it suitable for a range of applications from embedded systems to data centers. Developed by a team at Samsung's Memory Business unit, including lead author Jaegeuk Kim, F2FS was presented at the 13th USENIX Conference on File and Storage Technologies (FAST '15) in 2015, where its architecture and empirical results were detailed. Tools for creating, checking, and debugging F2FS volumes are maintained in the official f2fs-tools repository, ensuring ongoing integration and evolution within the Linux ecosystem.

History

Development and Initial Release

F2FS, or Flash-Friendly File System, was initiated by Samsung Electronics in 2012 to overcome the inefficiencies of traditional block-based file systems when used on NAND flash storage devices. Traditional file systems like EXT4, designed for rotating magnetic media, struggle with flash memory's erase-before-write mechanism and the need for wear leveling, leading to increased write amplification and reduced device lifespan. Samsung's development effort focused on creating a file system that aligns with flash characteristics, such as sequential writes and out-of-place updates, to minimize garbage collection overhead and enhance performance on devices like SSDs, eMMC, and SD cards.

The primary development was led by Jaegeuk Kim, a Samsung engineer, who submitted the initial patch series to the Linux kernel mailing list on October 5, 2012. Kim's team at Samsung addressed known limitations in earlier log-structured file systems, including the snowball effect of the wandering tree problem and excessive cleaning costs, by introducing configurable on-disk layouts and adaptive allocation strategies. Early prototypes were tested internally on Samsung's NAND flash-based devices to validate optimizations for real-world mobile and embedded storage scenarios.

F2FS was merged into the mainline Linux kernel on December 20, 2012, via commit a13eea6bd9ee62ceacfc5243d54c84396bc86cb4, marking its official integration as part of the 3.8 release cycle. Linux 3.8, including F2FS support, was released on February 18, 2013. This initial release introduced core features like checkpointing for crash recovery and support for extended attributes, establishing F2FS as an open-source alternative tailored for flash-centric workloads.

Subsequent Enhancements

Following its initial integration into the Linux kernel in version 3.8, F2FS received inline data support in Linux 3.14, allowing small files to be stored directly within inode blocks to reduce metadata overhead and improve access times for tiny files. This enhancement was part of broader refactoring efforts to optimize bio operations and inline features.

In Linux 4.2, F2FS added native file-based encryption support via the fscrypt framework, enabling per-directory encryption keys for secure storage on flash devices while maintaining compatibility with Android's requirements. This feature addressed growing needs for data protection in mobile environments without significant performance penalties.

Transparent compression capabilities were introduced in Linux 5.6 with support for the LZO and LZ4 algorithms, allowing optional on-the-fly compression to enhance storage efficiency on NAND flash. Zstd compression was added shortly after in Linux 5.7, providing higher compression ratios for better space savings in resource-constrained systems.

Linux 6.7 brought support for 16K block sizes, aligning F2FS with the larger page sizes common in modern ARM-based mobile SoCs and enabling improved I/O throughput. Concurrently, the maximum file size for 16K-block configurations was adjusted to 16 TB to accommodate crypto data unit compatibility and prevent overflows in addressing schemes. The Linux 6.18 series includes further performance refinements, such as tunable checkpoint modes and controls to reduce latency during metadata updates, along with improved fragmentation handling via node block prefetching, efficient FUA write merging, and prioritized block allocation for hot data, yielding measurable gains in write-heavy workloads.

Throughout its evolution, F2FS has benefited from contributions by companies such as Google, Huawei, and Oppo, focusing on mobile-specific tweaks such as quota support and block size adaptations for Android devices.

Design Principles

Log-Structured Approach

F2FS adopts a log-structured file system (LFS) paradigm, designed specifically for NAND flash storage, where new data versions are appended sequentially to logs rather than written as in-place updates. This approach minimizes the number of erase operations on flash blocks, which are constrained by NAND flash's requirement to erase entire blocks before rewriting. By treating the storage as an append-only log, F2FS avoids the inefficiencies of random writes that would otherwise lead to fragmented updates and increased wear on the medium.

To enhance efficiency, F2FS separates metadata, referred to as nodes, and user data into distinct log streams. Metadata updates are written to node logs, while file data is directed to separate data logs, allowing for independent management and reducing contention during writes. This separation enables optimized handling of different access patterns, converting potential random I/O into sequential writes, which align well with the strengths of NAND flash for high-throughput I/O. F2FS employs adaptive logging modes, switching dynamically between normal logging (a copy-and-compaction scheme) when the filesystem is clean and threaded logging (writing into the holes of dirty segments without foreground cleaning) as utilization increases, to maintain write performance and minimize garbage collection overhead.

File system updates in F2FS are managed through a versioning mechanism, where modifications create new log entries and previous versions are marked as invalid without immediate erasure. Invalidated blocks accumulate until garbage collection reclaims space, ensuring that the system maintains a clean log structure over time. This contrasts with traditional LFS implementations, which often suffer from high cleaning overhead due to excessive copying during log compaction. F2FS mitigates this issue through multi-head logging, which divides logs into hot, warm, and cold categories based on access frequency, thereby isolating frequently updated data to reduce unnecessary overwrites and garbage collection overhead.
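
The core mechanics can be illustrated with a toy model. The following standalone C sketch (all names illustrative, not kernel code) shows how out-of-place logging turns scattered logical updates into strictly sequential physical writes, leaving invalidated blocks behind for later cleaning:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LOG_BLOCKS 512               /* blocks per 2 MB segment at 4 KB */
#define NO_ADDR    UINT32_MAX        /* "unmapped" marker */

struct log {
    uint32_t next;                   /* sequential write head */
    bool     valid[LOG_BLOCKS];      /* per-block validity, as in the SIT */
};

struct mapping {                     /* logical -> physical, like the NAT */
    uint32_t phys[64];
};

/* A logical update never overwrites: it appends at the head and merely
 * invalidates the previous physical location. */
static void write_block(struct log *lg, struct mapping *map, uint32_t logical)
{
    uint32_t old = map->phys[logical];
    if (old != NO_ADDR)
        lg->valid[old] = false;      /* old version awaits garbage collection */
    map->phys[logical] = lg->next;
    lg->valid[lg->next] = true;
    lg->next++;                      /* strictly sequential on the medium */
}

int main(void)
{
    struct log lg = { .next = 0 };
    struct mapping map;
    for (int i = 0; i < 64; i++)
        map.phys[i] = NO_ADDR;

    /* "Random" logical writes 7, 3, 7 land at physical 0, 1, 2. */
    write_block(&lg, &map, 7);
    write_block(&lg, &map, 3);
    write_block(&lg, &map, 7);
    printf("logical 7 -> physical %u; physical 0 still valid: %d\n",
           (unsigned)map.phys[7], (int)lg.valid[0]);   /* 2; 0 */
    return 0;
}
```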

Flash-Specific Optimizations

F2FS incorporates optimizations specifically tailored to NAND flash memory's inherent constraints, including limited program/erase (P/E) cycles, the requirement for erase-before-write operations, and the inefficiency of in-place updates. These adaptations prioritize device endurance by minimizing write amplification and leveraging hardware-level features like the Flash Translation Layer (FTL), while enhancing performance for flash-based workloads such as those in mobile and embedded devices.

A core optimization is hot/cold data classification, which separates frequently accessed "hot" data from archival "cold" data to distribute wear more evenly and reduce erase cycles on high-activity areas. F2FS achieves this through multi-head logging with up to six distinct logs (configurable to 2, 4, or 6), categorizing node and data blocks as hot, warm, or cold based on access patterns and file types—for instance, directory entries and small inline data as hot, regular file data as warm, and multimedia files or blocks migrated during cleaning as cold. This proactive separation aligns writes with flash's sequential preferences, mitigating the "wandering tree" problem in log-structured systems and improving overall lifespan without relying on the FTL alone.

To integrate effectively with the device's FTL, F2FS avoids implementing its own wear-leveling algorithms, instead mapping active logs across different zones that match the FTL's set-associative or fully-associative mapping granularity. This design reduces filesystem-induced garbage collection overhead by distributing updates and allowing the hardware FTL to handle block remapping and wear leveling transparently, which is particularly beneficial for consumer-grade flash like eMMC and UFS.

Out-of-place updates form another fundamental adaptation: all modifications are appended sequentially to new log segments rather than overwriting existing blocks, aligning with NAND flash's out-of-place write semantics. Old data versions are invalidated via structures like the Node Address Table (NAT), enabling efficient reclamation later during garbage collection without immediate erases. Building on its log-structured appending, this approach converts random writes into sequential ones, further optimizing flash throughput.

F2FS supports multi-device configurations for flash arrays, enabling a single filesystem to span multiple block devices through linear concatenation of address spaces without striping or built-in redundancy. This feature, introduced in Linux 4.10, facilitates larger storage pools on multi-flash setups like SSD arrays, with segment allocation adjusted per device to maintain flash alignment.

For the small-block writes prevalent in mobile workloads, F2FS offers inline data and dentry storage within inode blocks for files smaller than 3,692 bytes (approximately 3.6 KB), eliminating separate block allocations and reducing metadata overhead. Additionally, roll-forward recovery during fsync operations writes only the affected data and direct node blocks, minimizing latency and write amplification for the frequent small synchronous updates common in journaling scenarios. These mechanisms ensure efficient handling of flash's page-level programming while preserving atomicity.
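
As a rough illustration of the hot/warm/cold separation described above, the following standalone C sketch maps block kinds to one of six log streams. The categories follow this section's description, while the names and exact rules are simplified assumptions rather than the kernel's implementation:

```c
#include <stdio.h>

/* Temperatures and block kinds; the rules below are simplified
 * assumptions modeled on the classification described in the text. */
enum temp { HOT, WARM, COLD };
enum kind { DIR_DENTRY, DIR_NODE, INLINE_DATA, FILE_DATA, FILE_NODE,
            INDIRECT_NODE, MULTIMEDIA_DATA, GC_MIGRATED_DATA };

struct stream { enum temp temp; int is_node; };

static struct stream classify(enum kind k)
{
    switch (k) {
    case DIR_DENTRY:
    case INLINE_DATA:      return (struct stream){ HOT,  0 };
    case DIR_NODE:         return (struct stream){ HOT,  1 };
    case FILE_DATA:        return (struct stream){ WARM, 0 };
    case FILE_NODE:        return (struct stream){ WARM, 1 };
    case INDIRECT_NODE:    return (struct stream){ COLD, 1 };
    case MULTIMEDIA_DATA:
    case GC_MIGRATED_DATA: return (struct stream){ COLD, 0 };
    }
    return (struct stream){ WARM, 0 };   /* unreachable default */
}

int main(void)
{
    static const char *names[] = { "hot", "warm", "cold" };
    struct stream s = classify(MULTIMEDIA_DATA);
    printf("multimedia data -> %s %s log\n",
           names[s.temp], s.is_node ? "node" : "data");   /* cold data */
    return 0;
}
```

Keeping blocks with similar lifetimes in the same log means whole segments tend to become invalid together, which is exactly what makes later cleaning cheap.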

On-Disk Format

Volume Segmentation

F2FS organizes the storage volume into fixed-size segments to facilitate efficient management tailored to flash characteristics. Each segment is 2 MB in size, comprising 512 blocks of the default 4 KB block size, and serves as the fundamental unit for allocation and garbage collection. Segments are grouped into sections (typically one segment per section by default) and zones (groups of sections), enabling structured handling of metadata and data across the volume. This segmentation supports the log-structured design by allowing sequential writes within segments, minimizing random access patterns that could degrade flash performance.

The volume is partitioned into distinct areas starting from the beginning of the device. The superblock area, located at the partition's start with two redundant copies for reliability, stores essential filesystem parameters such as version information, block size, segment count, and configuration options like zone sizes. Following the superblock is the checkpoint area, which contains checkpoint blocks (CP blocks) that record the filesystem's state for crash recovery, including references to valid metadata bitmaps and summaries. The metadata areas also encompass the Segment Information Table (SIT), which tracks the valid block count and dirty-status bitmap for each segment; the Node Address Table (NAT), which maps node IDs to physical block addresses for all node blocks in the main area; and the Segment Summary Area (SSA), which provides ownership information for data and node blocks within segments to aid in recovery and allocation.

The bulk of the volume constitutes the main area, dedicated to storing actual file data and metadata, and is subdivided into hot, warm, and cold zones to optimize data placement and load distribution. Data segments in the main area hold file contents, while node segments store metadata such as inodes and directory entries; these are allocated preferentially in specific zones based on access patterns to balance wear across flash blocks. The hot zone prioritizes frequently accessed items like directory entries and direct node blocks, the warm zone handles moderately accessed data and indirect nodes, and the cold zone manages infrequently updated or archived data such as multimedia files, ensuring that hot data remains isolated from colder segments to reduce garbage collection overhead. This zoning aligns allocations with the flash translation layer's mapping granularity, promoting even utilization of the underlying storage.

F2FS supports maximum volume sizes of 16 TB when using the default 4 KB block size, limited by 32-bit block addressing (2^32 blocks × 4 KB). Following enhancements in Linux kernel 6.7 for larger page sizes, F2FS accommodates 16 KB block sizes, extending the maximum volume capacity to 64 TB (2^32 blocks × 16 KB).
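
The capacity figures above follow directly from the layout arithmetic; this small C program (a sketch using only the block and segment sizes quoted in this section) reproduces them and shows how a physical block address maps onto a segment:

```c
#include <inttypes.h>
#include <stdio.h>

#define BLOCK_SIZE   4096ULL     /* default block size */
#define BLKS_PER_SEG 512ULL      /* 512 * 4 KB = 2 MB segment */

int main(void)
{
    uint64_t max_blocks = 1ULL << 32;                      /* 32-bit addressing */
    printf("max volume, 4 KB blocks: %" PRIu64 " TB\n",
           (uint64_t)((max_blocks * BLOCK_SIZE) >> 40));   /* 16 */
    printf("max volume, 16 KB blocks: %" PRIu64 " TB\n",
           (uint64_t)((max_blocks * 16384ULL) >> 40));     /* 64 */

    uint64_t blkaddr = 1234567;                            /* a physical block */
    printf("block %" PRIu64 " -> segment %" PRIu64 ", offset %" PRIu64 "\n",
           blkaddr, blkaddr / BLKS_PER_SEG, blkaddr % BLKS_PER_SEG);
    return 0;
}
```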

Checkpoint and Metadata Areas

F2FS employs a checkpointing mechanism to ensure filesystem consistency, particularly in the face of the sudden power losses common in mobile devices. This involves creating periodic snapshots of the filesystem state, which serve as recovery points without requiring full filesystem synchronization operations. The checkpoints utilize shadow paging, where updates are written to new locations while preserving the previous checkpoint as a fallback, enabling quick recovery by simply rolling back to the last valid checkpoint during mount.

For reliability, F2FS maintains dual checkpoint packs, labeled CP #0 and CP #1, stored in the checkpoint area. Each pack contains critical information including orphan inode lists, valid NAT/SIT bitmaps, segment summaries that outline block usage per segment, and journal entries recording recent modifications such as inode changes and directory entries. At mount time, the filesystem scans the checkpoint area to identify and load the most recent valid pack, ensuring atomic updates and minimizing recovery overhead.

Supporting the checkpoint process are dedicated metadata structures that track filesystem state efficiently. The Segment Information Table (SIT) resides in the SIT area and uses bitmaps to record the number of valid and invalid blocks within each 2 MB segment, facilitating informed decisions during garbage collection by identifying segments for cleaning. The Node Address Table (NAT), located in the NAT area, maps logical node numbers to their physical node block addresses in the main area, enabling rapid location of file metadata. Complementing these, the Segment Summary Area (SSA) provides per-block summary entries detailing ownership information for data and node blocks within segments, such as the parent inode number and offset.
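
Mount-time selection between the two packs reduces to a simple rule: take the newest pack that passes validation, otherwise fall back to the other. A minimal C sketch, with illustrative field names standing in for the real checkpoint headers:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct cp_pack {
    uint64_t version;   /* incremented on every checkpoint */
    bool     valid;     /* header/footer versions match, checksum OK */
};

/* Prefer the newest pack that passes validation. */
static const struct cp_pack *pick_checkpoint(const struct cp_pack *a,
                                             const struct cp_pack *b)
{
    if (a->valid && b->valid)
        return a->version > b->version ? a : b;   /* newest consistent state */
    if (a->valid) return a;
    if (b->valid) return b;
    return NULL;                                  /* no valid checkpoint */
}

int main(void)
{
    struct cp_pack cp0 = { .version = 41, .valid = true  };
    struct cp_pack cp1 = { .version = 42, .valid = false };  /* torn write */
    /* cp1 is newer but failed validation, so mount rolls back to cp0. */
    return pick_checkpoint(&cp0, &cp1) == &cp0 ? 0 : 1;
}
```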

Core Data Structures

Node and Index Structures

F2FS utilizes a pointer-based hierarchical node system to index file blocks and metadata, mitigating the wandering tree update problem common in traditional log-structured file systems by isolating node updates. This structure draws from log-structured principles but incorporates multi-level indexing optimized for flash storage, where nodes are stored in dedicated segments of the main area.

The core of this system is the inode, a fixed-size 4 KB block that stores metadata such as permissions, timestamps, ownership, and size, alongside indexing pointers. Each inode includes up to 923 direct pointers to data blocks, two pointers to direct node blocks, two pointers to indirect node blocks, and one pointer to a double indirect node block, enabling efficient access to file contents without excessive indirection for small to medium files. For small files under approximately 3.5 KB, F2FS supports inline storage directly within the inode block via the inline_data mount option, which embeds the file contents to minimize block allocations and I/O operations.

Node blocks fall into three primary types: inodes, direct nodes, and indirect nodes, each designed to chain hierarchically for scalability. A direct node block points to up to 1018 data blocks, providing immediate access for files that fit within this limit. Indirect nodes extend this by pointing to up to 1018 additional node blocks (direct or further indirect), while a double indirect node points to 1018 indirect nodes, allowing the system to address vast file sizes—up to 3.94 TB for 4 KB block sizes or 16 TB for 16 KB block sizes—through a multi-level indirection scheme akin to traditional Unix file systems, but with log-optimized updates to reduce write amplification on flash.

Node addressing relies on the Node Address Table (NAT), a table in the metadata area that maps logical node identifiers to physical block addresses in the main area, ensuring quick lookups without traversing the entire hierarchy. To maintain consistency during log-structured writes, NAT entries include version numbers that are updated only at checkpoints, preventing stale references and enabling atomic metadata operations. Directory entries reference these inodes by ID to link filenames to file metadata and data.
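
The maximum file size quoted above can be derived from the pointer counts; this self-contained C program performs that arithmetic (a sketch of the capacity math only, not of the on-disk structures):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint64_t ptrs = 1018;           /* pointers per node block      */
    uint64_t blocks = 923                 /* direct pointers in the inode */
                    + 2 * ptrs            /* two direct node blocks       */
                    + 2 * ptrs * ptrs     /* two indirect node blocks     */
                    + ptrs * ptrs * ptrs; /* one double-indirect node     */

    printf("max addressable blocks: %llu\n",
           (unsigned long long)blocks);                       /* 1,057,053,439 */
    printf("max file size, 4 KB blocks: %.2f TB\n",
           (double)blocks * 4096.0 / (double)(1ULL << 40));   /* ~3.94 */
    return 0;
}
```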

Directory Organization

F2FS organizes directories through hash-based entries stored in specialized dentry blocks, which are pointed to by the directory's inode. These blocks contain filename hashes, corresponding inode numbers, and the full names, enabling quick mapping from names to file metadata. Each 4 KB dentry block includes a 27-byte bitmap to track valid entries, followed by an array of up to 214 dentry slots (each 11 bytes, holding a 32-bit hash, inode number, name length, and type) and space for filename storage (1712 bytes total, with individual filenames limited to 255 bytes). This structure supports efficient storage and retrieval in flash-optimized environments.

To handle large directories, F2FS implements multi-level hash tables divided into buckets. A filename's hash value identifies the starting bucket, after which the system incrementally scans a fixed number of dentry blocks per level—typically 2 blocks at level 0 and 4 blocks at higher levels—until the target entry is located or the maximum level (configurable up to 6) is exhausted. This hashing mechanism delivers an average lookup performance of O(1), as most operations resolve within the initial scan, with a fallback to linear scanning of the block if collisions occur. In the worst case, especially for directories containing millions of entries, the multi-level traversal results in O(log N) complexity, where N is the total number of entries, ensuring scalable lookups without excessive overhead. The lookup behavior can be tuned via the lookup_mode mount option (e.g., perf for performance-optimized hashing or compat for compatibility with linear scans).

Since Linux 5.3, F2FS has supported case-insensitive name handling through the casefold feature, which normalizes filenames using Unicode casefolding for lookups while preserving the original case on disk; this is enabled during filesystem creation with the -O casefold option to mkfs.f2fs and controlled at mount time.

Hard links and deletions in directories are managed via reference counts stored in the target inodes. Creating a hard link increments the inode's reference count and adds a new dentry entry, while deletion invalidates the entry by clearing the bitmap bit in the dentry block and decrementing the count; the inode and its data are reclaimed only when the count reaches zero during garbage collection. This approach maintains consistency without immediate space reclamation, aligning with F2FS's log-structured design.
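
The level-by-level bucket scan can be sketched as follows; the constants mirror the figures in this section (2 blocks per bucket at level 0, 4 at deeper levels, up to 6 levels), but the hash function and early-exit logic are simplified assumptions:

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_LEVEL 6

static uint32_t buckets_at(unsigned level) { return 1u << level; }
static uint32_t blocks_per(unsigned level) { return level == 0 ? 2 : 4; }

/* Walk the levels, scanning only the one bucket the hash selects at each;
 * real code stops as soon as the dentry bitmap and hash match an entry. */
static void plan_lookup(uint32_t name_hash)
{
    for (unsigned lvl = 0; lvl <= MAX_LEVEL; lvl++) {
        uint32_t bucket = name_hash % buckets_at(lvl);
        printf("level %u: scan %u block(s) of bucket %u of %u\n",
               lvl, blocks_per(lvl), bucket, buckets_at(lvl));
    }
}

int main(void)
{
    plan_lookup(0x9a3fc21u);
    return 0;
}
```

Because the bucket count doubles per level while the per-bucket scan stays bounded, the work grows with the number of levels rather than the number of entries, which is the source of the O(log N) worst case above.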

Allocation and Maintenance

Block Allocation Strategies

F2FS implements block allocation through a multi-log scheme that separates writes into hot, warm, and cold categories to optimize flash storage performance by minimizing write amplification and fragmentation. This separation is managed via six active log areas in the main storage region: hot, warm, and cold logs for both node blocks (metadata) and data blocks. Hot data typically includes directory entries and small or frequently accessed files, warm covers general file content, and cold encompasses less frequently modified blocks such as multimedia files or those migrated during garbage collection.

By default, new or small files are allocated to the hot data log to prioritize quick access and reduce initial fragmentation. As files age or their access frequency decreases—determined by last modification time and usage patterns—subsequent writes shift to the warm or cold logs, ensuring hotter data remains in faster-accessible areas while colder data is isolated to reduce unnecessary overwrites. Allocation occurs through dedicated cursors for each log type—CURSEG_HOT_DATA, CURSEG_WARM_DATA, CURSEG_COLD_DATA, CURSEG_HOT_NODE, CURSEG_WARM_NODE, and CURSEG_COLD_NODE—which track the current segment for writes and advance sequentially within section-aligned boundaries (default 2 MB segments). To select segments for new writes, F2FS uses a cost-based policy that evaluates segment utilization and age, favoring underutilized areas to provide free segments efficiently without excessive cleaning overhead. This approach integrates with the file system's segment zoning layout, where logs are confined to specific zones for parallelism.

For compressed or encrypted files, F2FS prefers inline allocation when data fits within the inode block (up to approximately 3.4 KB), storing compressed clusters directly to avoid extra block overhead. Larger or extent-based files leverage an extent cache—a red-black tree—to map contiguous logical blocks to physical ones, enabling efficient allocation of sequential clusters (default minimum 16 KB) and reducing fragmentation in workloads involving compression algorithms like LZ4 or Zstd. The extent cache can be enhanced with the age_extent_cache mount option, which tracks the update frequency of extents to provide better allocation hints based on access patterns.

Adaptive strategies further enhance allocation for sequential workloads: the default "adaptive" mode switches between append-style LFS logging and threaded logging as filesystem utilization rises, such as when overprovisioned space becomes limited, promoting contiguous block assignment to minimize fragmentation. Experimental modes like "fragment:segment" or "fragment:block" allow controlled scattering of allocations for testing, but production use relies on the extent cache and log separation for robustness.
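
The adaptive switch can be modeled as a simple utilization check. In this hedged C sketch the 5% threshold is an illustrative placeholder, since the kernel derives the actual cutoff from overprovisioning and reserved segments:

```c
#include <stdio.h>

enum log_mode { MODE_LFS, MODE_THREADED };

/* Append-style logging while clean space is plentiful; threaded logging
 * (reusing holes in dirty segments) once free sections run low. */
static enum log_mode pick_mode(unsigned long free_sections,
                               unsigned long total_sections)
{
    if (free_sections * 100 < total_sections * 5)   /* < 5% free */
        return MODE_THREADED;
    return MODE_LFS;
}

int main(void)
{
    printf("%s\n", pick_mode(300, 1000) == MODE_LFS ? "lfs" : "threaded");
    printf("%s\n", pick_mode(30, 1000)  == MODE_LFS ? "lfs" : "threaded");
    return 0;
}
```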

Garbage Collection Process

F2FS employs garbage collection (GC) to reclaim invalid blocks in its log-structured layout, ensuring sufficient free space for ongoing writes while minimizing write amplification on flash storage. The process identifies segments with invalid data, migrates any valid blocks to new locations, and frees the victim segments to make them available again. This is crucial for maintaining performance, as flash devices cannot overwrite blocks in place and require erasure before reuse.

Background GC operates periodically through a kernel thread when the system is idle, proactively cleaning segments to prevent space exhaustion. It selects victim segments using a cost-benefit algorithm that evaluates both the number of valid blocks (utilization) and the segment's age, derived from the last modification time in the Segment Information Table (SIT), to balance efficiency and avoid excessive thrashing. An alternative age-threshold GC (ATGC) algorithm, available via mount option, uses age thresholds for victim selection in background operations, particularly beneficial for workloads with aged data patterns. This approach prioritizes reclaiming segments with high invalid-block ratios while considering the cost of migration, such as the effort to relocate older, potentially colder data. In contrast, foreground (on-demand) GC triggers synchronously when free segments fall below a threshold—typically when insufficient space is available for a VFS operation—potentially blocking I/O until completion. It employs a simpler greedy algorithm, selecting the segment with the fewest valid blocks to minimize latency during urgent reclamation. Victim selection emphasizes cold segments to preserve hot data integrity and reduce unnecessary migrations.

The SIT maintains per-segment validity bitmaps and counts, enabling quick identification of invalid blocks without scanning the entire filesystem. During GC, the Segment Summary Area (SSA) provides block ownership details, facilitating the retrieval of parent node structures for valid data migration. Valid blocks are lazily moved to the page cache and then written to free logs, after which the victim segment is invalidated and queued for reuse after the next checkpoint. This process integrates hot/cold data classification by directing colder data toward segments that are easier to reclaim.

Multi-head logging supports even distribution of writes across hot, warm, and cold zones, reducing localized wear and aiding GC by isolating data lifetimes. F2FS maintains up to six active logs (separate for nodes and data, categorized by temperature), which helps in selecting victims from colder logs during background operations. When free space drops critically low (e.g., below 5% of total segments), the system switches to threaded logging and triggers more aggressive cleaning.

Tuning parameters like the overprovisioning ratio—defaulting to 5% of device capacity—reserve hidden space to buffer against fragmentation and delay GC invocations, enhancing overall efficiency. This ratio can be adjusted during formatting (e.g., via mkfs.f2fs), with values around 4-7% commonly used to optimize for different workloads, allowing more headroom for invalidation before reclamation is needed. Mount options such as background_gc=on (the default) enable periodic cleaning, while gc_merge folds foreground requests into the background thread for better throughput.
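
The two victim-selection policies can be contrasted in a few lines of standalone C. The cost-benefit score below uses the classic LFS formulation (free space gained × age / migration cost) as a stand-in for F2FS's exact bookkeeping, and the segment data is invented for illustration:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct seg_info {
    unsigned valid_blocks;   /* of 512 per segment, from the SIT */
    uint64_t mtime;          /* last modification time, from the SIT */
};

/* Foreground GC: cheapest segment to clean right now. */
static size_t pick_greedy(const struct seg_info *s, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (s[i].valid_blocks < s[best].valid_blocks)
            best = i;
    return best;
}

/* Background GC: weigh reclaimable space and age against migration cost. */
static size_t pick_cost_benefit(const struct seg_info *s, size_t n,
                                uint64_t now)
{
    size_t best = 0;
    double best_score = -1.0;
    for (size_t i = 0; i < n; i++) {
        double u = s[i].valid_blocks / 512.0;        /* utilization */
        double age = (double)(now - s[i].mtime);
        double score = (1.0 - u) * age / (1.0 + u);  /* classic LFS form */
        if (score > best_score) { best_score = score; best = i; }
    }
    return best;
}

int main(void)
{
    struct seg_info segs[3] = {
        { .valid_blocks = 100, .mtime = 900 },   /* young, mostly free */
        { .valid_blocks = 120, .mtime = 10  },   /* old, mostly free   */
        { .valid_blocks = 500, .mtime = 500 },   /* nearly full        */
    };
    printf("greedy picks segment %zu\n", pick_greedy(segs, 3));          /* 0 */
    printf("cost-benefit picks segment %zu\n",
           pick_cost_benefit(segs, 3, 1000));                            /* 1 */
    return 0;
}
```

Note how the cost-benefit policy prefers the older segment even though it holds slightly more valid blocks: aged data is less likely to be invalidated soon, so cleaning it now is a better long-term investment.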

Key Features

Compression and Encryption

F2FS supports transparent compression to reduce storage usage on flash devices, allowing files to be compressed on write and decompressed on read without user intervention. Compression can be enabled on a per-inode or per-directory basis using the chattr +c command, with support for algorithms including LZO for fast compression, LZ4 for a balance of speed and ratio, and Zstd for higher compression ratios. In the default "fs" mode, F2FS automatically compresses eligible data during writeback, while "user" mode permits manual control via ioctls such as F2FS_IOC_COMPRESS_FILE and F2FS_IOC_DECOMPRESS_FILE. Compression is limited to write-once files to avoid the cost of rewriting compressed clusters, with data compressed only if the ratio meets a configurable threshold.

The compression implementation organizes data into clusters, typically starting at 16 KB (configurable via compress_log_size), where each cluster maps to one or more physical blocks depending on the compression efficiency. Compressed clusters are stored as extents in metadata, including details like the compression flag, data length, a checksum for verification (if enabled via compress_chksum), and the compressed payload itself; small extents may be stored inline to minimize overhead. This approach reduces the physical footprint on NAND flash, extending device lifespan by lowering write volumes, though it introduces CPU overhead during compression and decompression operations. Mount options like compress_algorithm=lz4 or compress_extension=ext allow fine-tuning, and unused compressed space can be released explicitly with F2FS_IOC_RELEASE_COMPRESS_BLOCKS to reclaim storage.

F2FS integrates with the kernel's fscrypt framework for encryption, providing transparent protection for file contents and filenames using per-file keys derived from a master key via a key derivation function. The primary cipher is AES-256-XTS for file data encryption, with AES-256-CBC-CTS or AES-256-HCTR2 for filenames to handle variable lengths up to the NAME_MAX of 255 bytes; Adiantum is also supported as a lightweight alternative for hardware without AES acceleration. Encryption support was added in Linux 4.2, enabling per-directory policies that apply to all descendants while preserving full filename lengths and avoiding stacked filesystem layers. Inline encryption is available via the blk-crypto framework when mounted with the inlinecrypt option, offloading cipher operations to compatible storage hardware for reduced CPU load, though this remains optional and depends on device capabilities.

Encrypted files in F2FS store their encryption metadata in inode extended attributes, ensuring secure, efficient access without the penalties of user-space encryption tools, while supporting features like case-insensitive lookups when combined with other policies. This integration prioritizes security for the mobile and embedded environments where F2FS is common, without compromising the filesystem's log-structured design.
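
In "user" mode, compression is driven from user space through the ioctls named above. A hedged C sketch, assuming a kernel recent enough to export these definitions in <linux/f2fs.h>, a file residing on a compression-enabled F2FS mount, and minimal error handling:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/f2fs.h>
#include <linux/types.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file-on-f2fs>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* Ask the filesystem to compress the file's clusters in place. */
    if (ioctl(fd, F2FS_IOC_COMPRESS_FILE) != 0)
        perror("F2FS_IOC_COMPRESS_FILE");

    /* Return the blocks saved by compression to the free space pool. */
    __u64 released = 0;
    if (ioctl(fd, F2FS_IOC_RELEASE_COMPRESS_BLOCKS, &released) != 0)
        perror("F2FS_IOC_RELEASE_COMPRESS_BLOCKS");
    else
        printf("released %llu blocks\n", (unsigned long long)released);

    close(fd);
    return 0;
}
```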

Additional Capabilities

F2FS includes support for project quotas, introduced in Linux kernel version 4.14, which enable disk usage limits for specific projects or user groups to manage storage efficiently. The prjquota mount option activates plain project disk quota accounting, allowing administrators to enforce limits without affecting standard user or group quotas.

Since version 5.4, F2FS has integrated fs-verity, a framework for transparent integrity and authenticity checking of read-only files, enabling verification of file contents against tampering through Merkle tree-based signatures. This feature requires the filesystem to be formatted with the verity option using f2fs-tools version 1.11.0 or later, and it performs on-demand cryptographic checks during reads to ensure integrity.

F2FS provides casefold support starting from Linux kernel version 5.2, facilitating case-insensitive filename handling on an otherwise case-sensitive filesystem, particularly useful for compatibility with applications expecting such behavior. The feature is enabled via the casefold option during mkfs.f2fs formatting and relies on Unicode normalization for consistent lookups, with mount options like lookup_mode controlling performance modes such as perf or compat.

Nanosecond timestamps have been available in F2FS since its introduction in version 3.8, providing high-precision recording of file access, modification, and change times in inode metadata to support applications requiring fine-grained time resolution.

For crash-safe operations, F2FS implements atomic writes, allowing applications to perform failure-atomic updates to files via ioctls, ensuring that either all changes commit or none do in the event of a power loss. Complementing this, orphan inode handling maintains lists of inodes for open but unlinked files in the checkpoint area, enabling recovery and cleanup during mount to prevent space leaks from incomplete operations.

Multi-device support was added in version 4.10, permitting a single F2FS instance to span multiple block devices for enhanced scalability and capacity without relying on external volume managers. Additionally, online resize functionality, available since version 4.14, allows dynamic expansion or shrinking of the filesystem while mounted, using the resize.f2fs tool to adjust segment allocations and metadata structures. F2FS also supports standard POSIX ACLs and extended attributes (xattrs) by default when the relevant kernel configurations are enabled, providing fine-grained access control and custom metadata storage.
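
The atomic-write ioctls mentioned above can be exercised as follows. This is a hedged sketch assuming <linux/f2fs.h> exposes the F2FS_IOC_*_ATOMIC_WRITE definitions, with an illustrative file path and minimal error handling; between start and commit, either all staged writes become durable or none do after a crash:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/f2fs.h>

int main(void)
{
    int fd = open("/data/app.db", O_RDWR);        /* path is illustrative */
    if (fd < 0) { perror("open"); return 1; }

    if (ioctl(fd, F2FS_IOC_START_ATOMIC_WRITE) != 0) {
        perror("start atomic write");
        close(fd);
        return 1;
    }

    const char buf[] = "transactional update";
    if (pwrite(fd, buf, sizeof buf, 0) < 0)       /* staged, not yet durable */
        perror("pwrite");

    if (ioctl(fd, F2FS_IOC_COMMIT_ATOMIC_WRITE) != 0)  /* all-or-nothing */
        perror("commit atomic write");

    close(fd);
    return 0;
}
```

This interface is the basis for failure-atomic database updates on Android, where it lets SQLite-style workloads avoid double-write journaling.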

Adoption

Mobile and Embedded Devices

F2FS gained initial traction in mobile and embedded devices through early adoption by key manufacturers optimizing for flash storage. Motorola implemented F2FS in its smartphones starting with the second-generation Moto G in 2015 for the userdata partition, leveraging the file system's log-structured design to enhance performance on NAND flash. Samsung, as the original developer of F2FS, integrated it into select devices starting in the mid-2010s, with explicit use confirmed in flagships such as the Galaxy Note 10 in 2019 and subsequent models, where it paired with UFS storage to reduce latency and improve I/O efficiency.

Google adopted F2FS for its Nexus 9 tablet in 2014, marking an early endorsement for /data partitions in stock Android hardware. This was followed by the Pixel 3 in 2018, where F2FS replaced EXT4 to capitalize on flash-optimized features like inline data and reduced fragmentation, contributing to faster app loading and system responsiveness. F2FS subsequently became a recommended option in AOSP for the /data partition, supporting file-based encryption and aligning with the shift toward UFS and eMMC storage in mid-range and flagship devices. Other vendors followed suit, including OnePlus with the 3T in 2016 for improved sequential read/write speeds, and, in 2019, the 10 Pro, the first device to pair F2FS with SanDisk's iNAND flash for anti-fragmentation benefits. Huawei incorporated F2FS starting with the P9 in 2016 to optimize flash I/O patterns.

Integration into the Android Open Source Project (AOSP) has driven widespread use of F2FS for user data partitions in many flagship and mid-range Android devices with eMMC or UFS storage by 2025, particularly where its log-structured approach minimizes wear and boosts endurance. As of 2025, F2FS is used in a significant portion of Android devices, estimated to cover hundreds of millions of units annually, according to vendor reports. This preference stems from F2FS's ability to handle out-of-place updates efficiently, reducing latency in the small random-write scenarios common in mobile workloads. F2FS continues to be used in recent Google Pixel and Samsung Galaxy models, enabling superior performance on high-speed UFS storage.

Linux Kernel and Distributions

F2FS was merged into the mainline Linux kernel during the 3.8 merge window in December 2012, enabling native support for the filesystem on compatible hardware. The accompanying user-space tools, including mkfs.f2fs for formatting and fsck.f2fs for consistency checks, are provided by the f2fs-tools package, with initial releases dating back to around 2013.

Major distributions offer varying levels of F2FS integration. Arch Linux provides full support through its kernel and f2fs-tools packages, allowing straightforward installation and use for root or data partitions. Debian has included F2FS kernel support since version 8.0 (Jessie), with f2fs-tools available in its repositories, though the installer lacks native F2FS partitioning options. Gentoo supports F2FS via kernel configuration and the sys-fs/f2fs-tools ebuild, enabling custom builds for flash-optimized setups. Fedora includes F2FS in its kernel and offers f2fs-tools via DNF, though the graphical installer requires terminal commands for creating F2FS partitions. In Ubuntu, F2FS is supported experimentally through manual configuration, as the installer does not natively handle it, requiring post-installation adjustments for root filesystems.

On desktop and server systems, F2FS adoption has grown steadily for SSD and NVMe drives due to its flash optimizations, though EXT4 and Btrfs remain dominant for general-purpose use. It is particularly recommended for environments with heavy flash storage workloads, such as high-write databases or virtual machines on NVMe arrays, where its log-structured design reduces latency. Recent kernel updates have further enhanced F2FS's suitability for modern hardware: Linux 6.18, released in late 2025, introduced improvements including optimized lookup modes, node block readahead, and better inline data handling, benefiting NVMe deployments by reducing overhead in flash-intensive operations. Adoption has also extended to embedded variants, such as user-modified Raspberry Pi images converted via scripts to leverage F2FS for extended SD card lifespan.

F2FS configuration in Linux environments relies on mount options to tailor behavior to the hardware. The "discard" option enables TRIM commands for efficient garbage collection on SSDs and NVMe drives, issuing discards asynchronously during segment cleaning to maintain performance without manual intervention. For compression, the "compress_algorithm=zstd" mount option activates transparent zstd-based file compression, configurable with levels for balancing speed and ratio in space-constrained setups.
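
These mount options can also be supplied programmatically via mount(2); a minimal sketch with placeholder device and mount-point paths (root privileges and an existing F2FS partition are assumed):

```c
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
    /* Equivalent to:
     * mount -t f2fs -o discard,compress_algorithm=zstd /dev/nvme0n1p2 /mnt */
    if (mount("/dev/nvme0n1p2", "/mnt", "f2fs", 0,
              "discard,compress_algorithm=zstd") != 0) {
        perror("mount");
        return 1;
    }
    puts("mounted with async TRIM and zstd compression");
    return 0;
}
```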

Performance

Benchmarks and Comparisons

F2FS demonstrates superior random write performance on flash storage compared to EXT4, particularly in synthetic workloads. In IOzone tests on a mobile system, F2FS achieved up to 3.1 times higher bandwidth for 4 KB random writes by converting them into sequential writes through its log-structured design. On server systems with SSDs, F2FS outperformed EXT4 by 2.5 times in the varmail benchmark, which involves small file operations with frequent fsync calls.

Recent benchmarks on NVMe drives highlight F2FS's strengths in mobile and database workloads. Phoronix tests using Linux 6.11 showed F2FS delivering the fastest results in database operations, making it suitable for small-file database tasks. In Android environments, F2FS reduced application launch times by 20% for the Facebook app and 40% for the Twitter app compared to EXT4, attributed to efficient handling of flash I/O patterns. Post-2023 enhancements, including 16K block size support in Linux 6.7, further improved performance on modern devices with larger page sizes, enabling better alignment with hardware capabilities.

Comparisons with other file systems underscore F2FS's optimizations for NAND flash. F2FS exhibits better endurance than EXT4 on NAND storage due to its low write amplification factor (around 1.02 at high utilization), reducing unnecessary erases and extending device lifespan, and its simpler log-structured approach avoids the heavier metadata management of Btrfs's copy-on-write mechanism. In 2024 NVMe benchmarks on Linux 6.10 and later, F2FS achieved performance parity with EXT4 for sequential reads on PCIe SSDs, though XFS edged ahead by about 20% in some high-throughput scenarios.

F2FS excels in workloads involving small-file writes, such as database operations, where its adaptive logging combines small updates efficiently. However, foreground garbage collection can introduce higher latency in write-intensive scenarios: although the greedy algorithm prioritizes minimizing visible delays, it may still impact latencies by up to several milliseconds under heavy load. In September 2025 benchmarks on Linux 6.17, F2FS was among the fastest file systems for 4K random reads alongside EXT4 and XFS on NVMe SSDs.
Benchmark | Workload | F2FS vs. EXT4 | Source
IOzone | 4 KB random writes (mobile) | 3.1× faster | FAST '15 paper
Varmail | Small files with fsync (SATA SSD) | 2.5× faster | FAST '15 paper
SQLite | Concurrent database writes (NVMe, Linux 6.11) | Fastest among tested file systems | Phoronix
App launch | Facebook/Twitter (Android) | 20-40% faster | FAST '15 paper

Limitations and Improvements

Despite its optimizations for flash storage, F2FS exhibits fragmentation issues on long-running mobile devices, primarily due to out-of-place updates that scatter file extents over time, leading to increased read amplification and performance degradation. This fragmentation is particularly pronounced in consumer scenarios with frequent small writes, such as app updates and media caching. Research in 2024 proposed a host-controller co-design approach to mitigate this by coordinating host- and storage-level operations for proactive extent coalescing, reducing fragmentation by up to 70% in emulated mobile workloads.

F2FS sees limited adoption on desktops compared to EXT4, owing to perceptions of lower maturity in handling diverse workloads and potential data integrity risks under heavy use. Unlike filesystems such as Btrfs, F2FS lacks native support for RAID 5 or 6 configurations, relying instead on underlying software like mdadm, which complicates setup for parity-based arrays. In scenarios involving sudden power failures, F2FS's checkpointing mechanism can result in slower recovery times than traditional journaling filesystems like EXT4, as it may require replaying extensive log segments to achieve consistency, potentially leading to partial data loss if operations fail. Known bugs in earlier versions include occasional garbage collection (GC) stalls during high-load scenarios, where background cleaning operations block foreground I/O, causing latency spikes; these were addressed through performance optimizations in Linux 6.18 and later.

Recent improvements include the introduction of decentralized, epoch-based journaling in F2FS, presented at OSDI 2025, which enables finer-grained consistency guarantees by distributing journal entries across segments, reducing recovery overhead by 40-60% compared to prior checkpointing. Additionally, deduplication extensions via F2DFS, detailed in a 2024 ACM Transactions on Storage paper, integrate hybrid inline and offline deduplication, achieving up to 53% higher space savings on flash workloads without significant performance penalties.

Future directions for F2FS emphasize enhanced support for zoned storage, with ongoing kernel updates improving zone-aware allocation to better handle ZNS SSDs and reduce write amplification. Inline deduplication is also a priority, building on recent hybrid schemes to enable real-time duplicate detection during writes on consumer devices. Efforts continue to address fragmentation through advanced host-storage co-design, aiming for sustained performance in aging mobile environments.
