Reiser4
| Developer(s) | Edward Shishkin and others[1] |
|---|---|
| Full name | Reiser4 |
| Introduced | 2004 with Linux |
| Partition IDs | Apple_UNIX_SVR2 (Apple Partition Map); Basic data partition (GPT) |
| Structures | |
| Directory contents | Dancing B*-tree |
| Limits | |
| Max file size | 8 TiB on x86 |
| Max filename length | 3976 bytes |
| Allowed filename characters | All bytes except NUL and '/' |
| Features | |
| Dates recorded | modification (mtime), metadata change (ctime), access (atime) |
| Date range | 64-bit timestamps[2] |
| Forks | No |
| File system permissions | Unix permissions |
| Transparent compression | Yes |
| Transparent encryption | No |
| Data deduplication | No |
| Other | |
| Supported operating systems | Linux |
| Website | reiser4.wiki.kernel.org |
| Repository | github.com/edward6/reiser4 |
Reiser4 is a computer file system, successor to the ReiserFS file system, developed from scratch by Namesys and sponsored by DARPA as well as Linspire. Reiser4 was named after its former lead developer Hans Reiser. As of 2021, the Reiser4 patch set is still being maintained,[3][4] but according to Phoronix, it is unlikely to be merged into mainline Linux without corporate backing.[5]
Features
Some of the goals of the Reiser4 file system are:
- Atomicity (filesystem operations either complete fully or do not happen at all, so a partially applied operation cannot corrupt the file system)
- Different transaction models: journaling, write-anywhere (copy-on-write), hybrid transaction model[6]
- More efficient journaling through wandering logs
- More efficient support of small files, in terms of disk space and speed through block suballocation
- Liquid items (or virtual keys) – a special format of records in the storage tree, which completely resolves the problem of internal fragmentation
- EOTTL (extents on the twig level) – fully balanced storage tree, meaning that all paths to objects are of equal length
- Faster handling of directories with large numbers of files
- Transparent compression: Lempel-Ziv-Oberhumer (LZO), zlib
- Plugin infrastructure
- Dynamically optimized disk-layout through allocate-on-flush (also called delayed allocation in XFS)
- Delayed actions (tree balancing, compression, block allocation, local defragmentation)
- R and D (Rare and Dense) caches, synchronized at commit time
- Transactions support for user-defined integrity
- Metadata and inline-data checksums[7]
- Mirrors and failover[8]
- Precise discard support[9] with delayed issuing of discard requests for SSD devices[10]
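The atomicity goal in the list above can be illustrated with a toy write-ahead journal. This is only a sketch under stated assumptions: the `Journal` class, the record tuples, and the `replay` function are hypothetical names invented for illustration, and they do not reflect Reiser4's on-disk format or its wandering-log implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Journal:
    """Hypothetical in-memory journal: an append-only list of records."""
    records: list = field(default_factory=list)

    def write(self, txid, block, data):
        # Log the intended update instead of touching the disk directly.
        self.records.append(("write", txid, block, data))

    def commit(self, txid):
        # A commit record marks every earlier write of txid as durable.
        self.records.append(("commit", txid))

def replay(journal, disk):
    """Apply only the writes whose transaction has a commit record."""
    committed = {rec[1] for rec in journal.records if rec[0] == "commit"}
    for rec in journal.records:
        if rec[0] == "write" and rec[1] in committed:
            _, _, block, data = rec
            disk[block] = data
    return disk

# Transaction 1 commits; transaction 2 is cut off by a simulated crash.
journal = Journal()
journal.write(1, "block-a", "new contents")
journal.commit(1)
journal.write(2, "block-b", "half-finished update")

disk = replay(journal, {"block-b": "old contents"})
print(disk)
```

In this model the uncommitted write to `block-b` is ignored on replay, so the block keeps its old contents, which is the all-or-nothing behaviour the list describes.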
Some of the more advanced Reiser4 features (such as user-defined transactions) are also not available because of a lack of a VFS API for them.
At present Reiser4 lacks a few standard file system features, such as an online repacker (similar to the defragmentation utilities provided with other file systems). The creators of Reiser4 say they will implement these later, or sooner if someone pays them to do so.[11]
Performance
Reiser4 uses B*-trees in conjunction with the dancing tree balancing approach, in which underpopulated nodes will not be merged until a flush to disk except under memory pressure or when a transaction completes. Such a system also allows Reiser4 to create files and directories without having to waste time and space through fixed blocks.
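The deferred merging described above can be sketched with a toy model of the leaf level only. Everything here is an assumption made for illustration: the `DancingLeaves` class, the `CAPACITY` of four keys per leaf, and the greedy left-to-right squeeze are hypothetical and much simpler than the real balancing code.

```python
CAPACITY = 4  # hypothetical maximum number of keys per leaf node

class DancingLeaves:
    """Toy model: a sorted run of leaf nodes that is only squeezed at flush time."""

    def __init__(self, leaves):
        self.leaves = [list(leaf) for leaf in leaves]

    def delete(self, key):
        # Deleting never triggers a merge; the leaf may stay underpopulated in memory.
        for leaf in self.leaves:
            if key in leaf:
                leaf.remove(key)
                return True
        return False

    def flush(self):
        # Before "writing", pack neighbouring leaves together while they fit.
        packed = []
        for leaf in self.leaves:
            if not leaf:
                continue
            if packed and len(packed[-1]) + len(leaf) <= CAPACITY:
                packed[-1].extend(leaf)
            else:
                packed.append(list(leaf))
        self.leaves = packed
        return packed

tree = DancingLeaves([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
for key in (2, 3, 6, 7, 8):
    tree.delete(key)
print(len(tree.leaves))  # still three leaves, two of them underpopulated
print(tree.flush())      # merged only now, at flush time
```

The point of the sketch is the timing: the cost of merging is paid once per flush rather than once per deletion.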
Synthetic benchmarks performed by Namesys in 2003 show Reiser4 to be 10 to 15 times faster than ext3, its most serious competitor at the time, when working on files smaller than 1 KiB. Namesys's benchmarks also suggest that it typically delivers twice the performance of ext3 for general-purpose filesystem usage patterns.[12] Other benchmarks from 2006 show Reiser4 being slower on many operations.[13] Benchmarks conducted in 2013 with Linux kernel version 3.10 show that Reiser4 is considerably faster in various tests than the in-kernel filesystems ext4, btrfs and XFS.[14]
Integration with Linux
Reiser4 has patches for Linux 2.6, 3.x, 4.x and 5.x,[15][3] but as of 2019 it has not been merged into the mainline Linux kernel[3] and consequently is still not supported on many Linux distributions; however, its predecessor ReiserFS v3 has been widely adopted. Reiser4 is also available from Andrew Morton's -mm kernel sources, and from the Zen patch set. The Linux kernel developers claim that Reiser4 does not follow the Linux "coding style" by the decision to use its own plugin system,[16] but Hans Reiser suggested the decision was made for political reasons.[17] The latest released Reiser4 kernel patches and tools can be downloaded from the Reiser4 project page at sourceforge.net.[4]
History of Reiser4
Hans Reiser was convicted of murder on April 28, 2008, leaving the future of Reiser4 uncertain. After his arrest, employees of Namesys were assured they would continue to work and that the events would not slow down the software development in the immediate future. In order to afford increasing legal fees, Hans Reiser announced on December 21, 2006, that he was going to sell Namesys;[18] as of March 26, 2008, it had not been sold, although the website was unavailable. During a CNET interview in January 2008, Edward Shishkin, an employee and programmer working for Namesys, said: "Commercial activity of Namesys has stopped." Shishkin and others continued the development of Reiser4,[19] making source code available from Shishkin's web site,[20] later relocated to kernel.org.[21] Since 2008, Namesys employees have received 100% of their sponsored funding from DARPA.[22][23][24]
In 2010, Phoronix wrote that Edward Shishkin was exploring options to get Reiser4 merged into the Linux kernel mainline.[25] As of 2019, the file system is still being updated for new kernel releases, but has not been submitted for merging.[3] In 2015, Michael Larabel mentioned that a merge is unlikely to happen without corporate backing,[26] and in April 2019 he suggested that the main obstacle could be the renaming of Reiser4 so as to avoid any references to Reiser.[3]
Shishkin announced a Reiser5 filesystem on December 31, 2019.[27]
References
- ^ "Credits - Reiser4 FS Wiki". reiser4.wiki.kernel.org. Retrieved 2019-08-05.
- ^ Documentation/filesystems/reiser4.txt from a reiser4-patched kernel source, "By default file in reiser4 have 64-bit timestamps."
- ^ a b c d e Larabel, Michael (2019-04-13). "Reiser4 Brought To The Linux 5.0 Kernel - Phoronix". Phoronix. Retrieved 2019-08-04.
- ^ a b https://reiser4.sourceforge.net/
- ^ "Ten Features You Will Not Find in the Mainline Linux 4.10 Kernel - Phoronix".
- ^ "Reiser4 transaction models". Reiser4 wiki.
- ^ "Reiser4 checksums". Reiser4 wiki.
- ^ "Reiser4 Mirrors and Failover". Reiser4 wiki.
- ^ "Precise Discard". Reiser4 wiki.
- ^ "Reiser4 discard support". Reiser4 wiki.
- ^ Reiser, Hans (2004-09-16). "Re: Benchmark: ext3 vs reiser4 and effects of fragmentation". Namesys, ReiserFS mailing list. Retrieved 2009-10-03.
- ^ Hans Reiser (November 20, 2003). "Benchmarks Of ReiserFS Version 4". Namesys. Archived from the original on September 29, 2007. Retrieved 2014-01-18.
- ^ Justin Piszcz (January 2006). "Benchmarking Filesystems Part II". Retrieved 2006-04-23.
- ^ Michael Larabel (July 31, 2013). "Reiser4 File-System Shows Decent Performance On Linux 3.10". Phoronix. Retrieved 2013-07-31.
- ^ "Reiser4 file system for Linux OS - Browse Files at SourceForge.net". sourceforge.net. Retrieved 2019-08-04.
- ^ "Linux: Why Reiser4 Is Not in the Kernel". Kerneltrap. September 19, 2005. Archived from the original on 2007-04-23.
- ^ Reiser, Hans (21 July 2006). "The "'official' point of view" expressed by kernelnewbies.org regarding reiser4 inclusion". Retrieved 2008-03-01.
- ^ "Murder Suspect Selling Namesys". Wired News. 2006-12-21. Retrieved 2006-12-30.
- ^ Namesys vanishes, but ReiserFS project lives on. http://www.news.com/8301-13580_3-9851703-39.html Archived 2008-09-05 at the Wayback Machine CNet (January 16, 2008). Retrieved on 2008-01-26.
- ^ "Namesys things". Chichkin_i.zelnet.ru. Archived from the original on 2010-03-24. Retrieved 2010-02-08.
- ^ New location of Namesys software Linux Kernel Mailing List post, 2008-08-04
- ^ "Re: we got the DARPA grant to add views to Reiser4". Mail-archive.com. 2004-04-10. Retrieved 2010-02-08.
- ^ "Bug 114785 – reiserfs won't mount with usrquota option". Red Hat Bugzilla.
- ^ "Reports - ext3 or ReiserFS? Hans Reiser Says Red Hat's Move Is Understandable - Red Hat's Decision is Conservative, Not Radical". LinuxPlanet. Archived from the original on 2010-01-22. Retrieved 2010-02-08.
- ^ "Reiser4 May Go For Mainline Inclusion In 2010". Phoronix. 2009-11-10. Retrieved 2010-02-08.
- ^ Michael Larabel (23 February 2015). "KDBUS & Other Features You Won't Find In The Linux 4.0 Kernel". Phoronix.
- ^ "[ANNOUNCE] Reiser5 (Format Release 5.X.Y)". Linux Weekly News. 2019-12-31.
External links
- ReiserFS and Reiser4 wiki
- Current Reiserfs4 patches as Namesys' website is down
- Reiserfs v4 utilities
- Introduction to Reiser4 on kuro5hin
- Reiser4 transaction design document
- Trees in the Reiser4 Filesystem, Part I from Linux Journal
- Trees in the Reiser4 Filesystem, Part II from Linux Journal
- Hans Reiser: The Reiser4 Filesystem Hans Reiser's lecture at Google
- Why Reiser4 is not in the Linux Kernel at kernelnewbies.org and Hans Reiser's response to Kernelnewbies' criticism
- Reiser4 and the Politics of the Kernel by Bruce Byfield on Linux.com
- The Reiser4 Filesystem: Ways In Which Extra Rigor In Scientific Methodology Can Consume Years Of Your Life, And How The Result Can Be So Very Worthwhile - lecture given by Hans Reiser at Stanford University (video archive).
- Reiser4 Gentoo FAQ
- Metztli Reiser4 – a Debian installer including Reiser4
Reiser4
Core Concepts and Architecture
Design Goals and Innovations
Reiser4 was designed with a primary focus on optimizing performance and space efficiency for small files, particularly those under 1 KiB, which are common in metadata-heavy workloads such as email systems and web servers. Traditional file systems often waste significant disk space by allocating full 4 KiB blocks to tiny files, leading to fragmentation and inefficiency. To address this, Reiser4 incorporates block suballocation, allowing multiple small files or file tails to share a single block, achieving approximately 94% space efficiency for small files and minimizing wasted space. This approach reduces fragmentation by enabling contiguous storage where possible through extent pointers, while supporting dynamic allocation without excessive overhead.[10]
A key innovation in Reiser4 is the use of dancing B*-trees for indexing, which provide balanced, adaptive tree structures that optimize for both read and write operations. Unlike traditional B+-trees that rebalance immediately, dancing trees defer balancing until disk flushes, improving caching efficiency by segregating frequently accessed pointers from less frequent data objects, thus doubling read speeds compared to its predecessor Reiser3. This design shifts from rigid, fixed hierarchies to a more flexible object-based storage model, where files are treated as collections of items stored in tree nodes, allowing for semantic layering that separates user-visible naming from underlying performance optimizations. Additionally, Reiser4 emphasizes atomic file operations, ensuring that transactions either complete fully or not at all, which enhances consistency without requiring kernel recompilation for extensibility via plugins.[11][10]
The development of Reiser4 was sponsored by DARPA and Linspire, with the latter aiming to improve desktop performance for everyday workloads involving numerous small files. These sponsorships underscored the file system's goals of robustness and adaptability in diverse environments, from secure systems to consumer applications.[1][12]
Key Data Structures
Reiser4 employs the Dancing B*-tree as its primary indexing structure, a variant of the B+-tree designed for efficient storage and retrieval of filesystem metadata and data. This structure organizes all filesystem objects in a single balanced tree, where internal nodes contain keys and pointers to child nodes, while leaf nodes hold the actual items. The Dancing B*-tree incorporates an adaptive balancing algorithm that defers node merging and splitting until a flush to disk, allowing temporary imbalances in memory to optimize caching and reduce immediate overhead during operations. Node rotation and repacking occur lazily during these flushes, rotating underpopulated or overpopulated nodes to maintain balance without frequent disk I/O, thereby supporting scalability for large numbers of objects through 64-bit object identifiers that enable up to 2^{64} entities.[13][10][3]
Central to Reiser4's object model are object items and stat data, which provide a unified representation for files, directories, and other metadata as extensible objects within the Dancing B*-tree leaves. Object items serve as containers for data or metadata, sized to fit within a single node (typically 4 KiB), and can include extents for file bodies, indirect pointers for larger files, or direct data for small files. Stat data, stored as indivisible static_stat_data items, encapsulate core attributes such as ownership, permissions, timestamps, size, and link count for each object, dynamically allocated on disk as needed. This approach allows objects to integrate plugins for custom behaviors, such as compression or encryption, while maintaining a consistent key-based addressing scheme that sorts items by object ID, type, and offset.[10][3]
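The key-based addressing scheme mentioned above, in which items sort by object ID, type, and offset, can be sketched with a sorted list of composite keys. This is a minimal illustration, not Reiser4's key layout: the `Key` and `ItemStore` names and the numeric type codes `STAT` and `BODY` are hypothetical, and a flat sorted list stands in for the tree.

```python
import bisect
from typing import NamedTuple

class Key(NamedTuple):
    object_id: int
    item_type: int  # hypothetical codes, see STAT and BODY below
    offset: int

STAT, BODY = 0, 1  # assumed ordering: stat data sorts before the file body

class ItemStore:
    """Keeps items ordered by (object_id, item_type, offset)."""

    def __init__(self):
        self._keys = []   # sorted list of Key tuples
        self._items = {}  # Key -> payload

    def insert(self, key, payload):
        if key not in self._items:
            bisect.insort(self._keys, key)
        self._items[key] = payload

    def items_of(self, object_id):
        # All items of one object form a contiguous run in key order.
        lo = bisect.bisect_left(self._keys, Key(object_id, 0, 0))
        hi = bisect.bisect_left(self._keys, Key(object_id + 1, 0, 0))
        return [(k, self._items[k]) for k in self._keys[lo:hi]]

store = ItemStore()
store.insert(Key(7, BODY, 4096), "second extent")
store.insert(Key(3, STAT, 0), "stat data of object 3")
store.insert(Key(7, STAT, 0), "stat data of object 7")
store.insert(Key(7, BODY, 0), "first extent")

for key, payload in store.items_of(7):
    print(key, payload)
```

Because tuples compare field by field, everything belonging to object 7 ends up adjacent, with its stat data first and its body items in offset order, regardless of insertion order.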
File allocation in Reiser4 leverages suballocation within blocks to optimize space for small files, packing up to 16 small objects into a single 4 KiB block through tail packing, where unused space in a block accommodates tails of multiple files. For larger files, allocation uses contiguous extents aligned to 4 KiB boundaries, with pointers managed via item plugins to minimize fragmentation. Copy-on-write mechanisms are integral to the allocation process, particularly through "copy-on-capture" during transactions, where modified blocks are copied before overwriting to ensure atomicity and enable potential snapshot functionality by retaining prior versions in the journal until commit. This suballocation and COW strategy aligns with Reiser4's design goals for efficient handling of small files by reducing wasted space in partially filled blocks.[10][14][3]
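The space saving from tail packing can be estimated with a small calculation. The sketch below is an assumption-laden model rather than Reiser4's allocator: it ignores item headers and tree overhead, requires positive file sizes, and uses a simple first-fit-decreasing heuristic; the function names are hypothetical.

```python
import math

BLOCK_SIZE = 4096  # bytes, matching the 4 KiB blocks discussed above

def blocks_unpacked(sizes):
    """Blocks needed when every file gets its own whole blocks."""
    return sum(math.ceil(size / BLOCK_SIZE) for size in sizes)

def blocks_packed(sizes):
    """Blocks needed when full blocks are kept and tails share blocks."""
    full_blocks = sum(size // BLOCK_SIZE for size in sizes)
    tails = sorted(
        (size % BLOCK_SIZE for size in sizes if size % BLOCK_SIZE),
        reverse=True,
    )
    free = []  # remaining bytes in each shared tail block
    for tail in tails:
        for i, space in enumerate(free):
            if tail <= space:
                free[i] -= tail
                break
        else:
            # No existing shared block has room, so open a new one.
            free.append(BLOCK_SIZE - tail)
    return full_blocks + len(free)

small_files = [200] * 16  # sixteen hypothetical 200-byte files
print(blocks_unpacked(small_files), blocks_packed(small_files))
```

In this simplified model, sixteen 200-byte files occupy sixteen blocks when each gets its own block but a single shared block when packed.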
Directories in Reiser4 are handled through a flat global namespace of objects, where directories appear as specialized views over the Dancing B*-tree rather than separate hierarchical structures, promoting efficiency in namespace management. Directory entries are stored as cmpnd_dir_item objects, linking names to object keys via hash-based lookups facilitated by pluggable hash functions that map directory names to file locations, enabling fast O(1) average-case retrieval even in large directories. This hash-driven approach sorts entries approximately lexicographically within tree nodes, minimizing collisions and supporting efficient traversal without full linear scans.[10][3][15]
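The idea of a pluggable hash function for directory lookups can be sketched as follows. This is an illustration only: the `Directory` class is hypothetical, and CRC-32 and FNV-1a are stand-ins chosen because they are easy to show, not the hash plugins that Reiser4 actually ships.

```python
import zlib
from typing import Callable, Dict, List, Optional, Tuple

HashFn = Callable[[bytes], int]

def crc32_hash(name: bytes) -> int:
    return zlib.crc32(name)

def fnv1a_hash(name: bytes) -> int:
    # 32-bit FNV-1a: XOR each byte in, then multiply by the FNV prime.
    h = 0x811C9DC5
    for byte in name:
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h

class Directory:
    """Maps names to object IDs through a caller-supplied hash function."""

    def __init__(self, hash_fn: HashFn = crc32_hash):
        self._hash = hash_fn
        self._buckets: Dict[int, List[Tuple[bytes, int]]] = {}

    def add(self, name: str, object_id: int) -> None:
        raw = name.encode()
        bucket = self._buckets.setdefault(self._hash(raw), [])
        for i, (existing, _) in enumerate(bucket):
            if existing == raw:
                bucket[i] = (raw, object_id)  # replace an existing entry
                return
        bucket.append((raw, object_id))

    def lookup(self, name: str) -> Optional[int]:
        raw = name.encode()
        # Names are compared inside the bucket, so hash collisions stay correct.
        for existing, object_id in self._buckets.get(self._hash(raw), []):
            if existing == raw:
                return object_id
        return None

docs = Directory(hash_fn=fnv1a_hash)
docs.add("report.txt", 42)
print(docs.lookup("report.txt"), docs.lookup("missing.txt"))
```

Swapping `fnv1a_hash` for `crc32_hash` changes how entries are distributed but not the lookup results, which is the property a pluggable hash relies on.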
Features
Journaling and Atomic Operations
Reiser4 employs an asynchronous journaling mechanism to ensure data integrity and facilitate rapid crash recovery, logging changes in a way that allows operations to proceed without blocking the system. This approach supports two primary modes: metadata journaling, which captures only structural changes to the file system such as directory entries and inode updates, and data journaling, which additionally logs file content modifications for heightened consistency guarantees.[14] These modes operate with ordered or writeback options; in ordered mode, data blocks are written to their final locations before corresponding metadata to prevent partial updates, while writeback mode permits metadata commits ahead of data for better performance, relying on subsequent flushes.[14]
Central to Reiser4's design are atomic operations, implemented through "transcrashes": collections of disk updates grouped into all-or-nothing transactions that either fully succeed or leave the file system unchanged in the event of a failure. For instance, operations like file creation, deletion, or renaming are encapsulated such that partial execution cannot result in inconsistencies, with dependencies between updates enforced to maintain referential integrity.[14] This atomicity extends to unformatted nodes, which represent raw file data blocks not subject to journaling in metadata mode but integrated into transactions when data journaling is enabled, allowing efficient packing without fixed formatting overhead.[14]
Upon system mount following a crash, Reiser4's replay process scans the journal for committed transactions, identified by commit records, and replays them by applying overwrite sets and deallocations, restoring the file system to a consistent state without extensive full-disk checks.[14] Journal transactions are sized to balance atomicity with manageable replay overhead, enabling the use of wandering blocks that are allocated dynamically rather than from a fixed journal area.[16] This mechanism ties briefly into copy-on-write techniques for node modifications, ensuring that in-flight changes do not corrupt the on-disk tree.[14]
Plugins and Extensibility
Reiser4 features a highly modular plugin architecture that promotes extensibility by allowing key file system behaviors to be customized through interchangeable components integrated into the kernel module. This design decouples specific implementations from the core file system logic, enabling developers to add or modify functionality such as data formatting, directory indexing, and allocation strategies without recompiling the kernel or altering the base code. The architecture draws on object-based storage principles, where plugins operate on file system objects to handle diverse operations efficiently.[15]
The system incorporates multiple built-in plugins across several categories, with documentation indicating at least 24 metadata-related components supporting various operations. Item plugins manage the storage and retrieval of file data and metadata, including specialized handlers for directories and indirect pointers that facilitate flexible extent management. Hash plugins provide functions for ordering directory entries to optimize lookups, while policy plugins govern resource allocation decisions, such as whether to employ tail packing for small files or extents for larger ones. Format plugins define on-disk structures, ensuring compatibility and adaptability for different storage scenarios. Examples of formatting options include tails for fragmented data, extents for contiguous blocks, and hybrid smart approaches that adapt based on file characteristics.[15][17][4]
This plugin framework offers significant extensibility benefits, permitting the integration of future enhancements like advanced compression algorithms or custom encryption without requiring kernel modifications, since new plugins can be added via updates to the Reiser4 module. It also supports user-defined metadata by treating plugins as modular objects within the file system namespace, exposing them for programmatic access and customization. Plugins are compiled into the Reiser4 kernel module and user-space tools like libreiser4, with configurations selectable at filesystem creation time using options such as --override for keys or formats. The extensible naming scheme enabled by directory item plugins allows filenames up to 3976 bytes in length, far exceeding typical limits in other systems and accommodating complex or internationalized names.[15][18][4]
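The way interchangeable policy plugins might be selected by name can be sketched with a small registry. This is a user-space illustration, not Reiser4 kernel code: the `register` decorator, the `get_plugin` function, the plugin names, and the 16 KiB threshold in the `smart` policy are all hypothetical choices made for the example.

```python
from typing import Callable, Dict

# Registry keyed by plugin kind, then by plugin name.
_PLUGINS: Dict[str, Dict[str, Callable[[int], str]]] = {}

def register(kind: str, name: str):
    """Decorator that files a function under (kind, name)."""
    def decorator(fn):
        _PLUGINS.setdefault(kind, {})[name] = fn
        return fn
    return decorator

def get_plugin(kind: str, name: str) -> Callable[[int], str]:
    return _PLUGINS[kind][name]

@register("formatting", "tails")
def always_tails(file_size: int) -> str:
    return "tail"

@register("formatting", "extents")
def always_extents(file_size: int) -> str:
    return "extent"

@register("formatting", "smart")
def smart(file_size: int) -> str:
    # Assumed cut-off: pack small files as tails, store large ones as extents.
    return "tail" if file_size < 4 * 4096 else "extent"

policy = get_plugin("formatting", "smart")
print(policy(300), policy(1_000_000))
```

The caller only depends on the plugin's name and signature, so a new formatting policy can be registered without touching the code that uses it.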
Compression and Security
Reiser4 provides transparent data compression through dedicated plugins that integrate seamlessly with its extensible architecture. The filesystem supports LZO (Lempel–Ziv–Oberhumer) and zlib compression algorithms, which can be applied at the file or block level to reduce storage requirements without user intervention.[3] These plugins evaluate data compressibility, typically testing whether compression would yield benefits before applying it, with configurable thresholds to balance performance and space savings.[19] Compression is handled within item plugins, where data is compressed on write and decompressed transparently on read, while avoiding unnecessary copy-on-write operations for unmodified compressed blocks to maintain efficiency.[3]
Compression ratios in Reiser4 vary significantly depending on the data type; for instance, text-heavy or repetitive files achieve higher ratios with zlib, while LZO offers faster processing for less compressible data like binaries.[3] On x86 architectures, Reiser4 supports maximum file sizes of up to 8 TiB, allowing compressed files to scale within these limits while benefiting from the filesystem's efficient packing of small files, which can save approximately 5% in disk space overall.[20]
For security, Reiser4 incorporates access control lists (ACLs) via specialized plugins that enable fine-grained permissions beyond standard POSIX attributes.[3] These plugins treat security attributes as hidden files within the directory structure, allowing modular enforcement of access rules on a per-file basis. Basic encryption is available through external plugins, such as the cryptcompress plugin, which combines compression with per-file encryption using algorithms like AES, applied transparently before data is stored on disk.[1] However, Reiser4 does not provide native full-disk encryption, relying instead on these loadable plugins for optional protection.[5] This plugin-based approach ensures that security features can be enabled selectively without impacting the core filesystem integrity.[3]
Performance
Benchmark Results
Early benchmarks conducted by Namesys in 2003 demonstrated that Reiser4 was 10 to 15 times faster than ext3 for operations on files smaller than 1 KiB, including creation and deletion tasks.[21] These results were obtained on early Linux 2.6 kernels and highlighted Reiser4's efficiency in handling fragmented or numerous small files, though they were based on developer-controlled environments.[2]
In mid-term evaluations from 2006, Reiser4 showed mixed results on Linux 2.6 kernels compared to other file systems like XFS. Sequential write performance lagged, with Reiser4 taking 25.40 seconds to create a 1 GB file versus 15.87 seconds for XFS. However, it excelled in random I/O workloads involving small files; for instance, splitting a 10 MB file into 1000-byte segments completed in 2.95 seconds on Reiser4, outperforming XFS's 4.87 seconds. Similar advantages appeared in tests with 1024-byte and 2048-byte files, where Reiser4 times were 2.61 seconds and 1.55 seconds, respectively, against XFS's 4.01 seconds and 1.95 seconds. These benchmarks, run on a standard PC with a 2.4 GHz Athlon XP processor, underscored Reiser4's strengths in metadata-heavy and small-file random access, while noting higher CPU utilization due to plugin overhead.[22]
More recent benchmarks in 2013 on Linux 3.10 kernels revealed Reiser4 continuing to outperform ext4 and Btrfs in specific areas. File creation tests showed Reiser4 up to twice as fast as ext4 and Btrfs, particularly for initial compile and metadata-intensive operations on an Intel SSD. Directory listing performance was also superior, with Reiser4 completing large-scale listings more efficiently than the in-kernel alternatives. These results, derived from the Phoronix Test Suite on a Lenovo ThinkPad W510 with an Intel Core i7 720QM, confirmed Reiser4's edge in small-file and metadata workloads even on modern hardware, though overall throughput in sequential operations remained competitive but not leading. Factors like plugin architecture were noted to introduce minor overhead in some tests, but without deep analysis.[23][24]
Later benchmarks on Linux 4.17 in 2018 showed Reiser4 lagging behind ext4, XFS, Btrfs, and F2FS in most tests, including FS-Mark file creation and sequential I/O operations, with modern file systems being 2-3 times faster in several workloads. This reflects the lack of ongoing maintenance and optimizations for newer kernel features and hardware.[25]

| Test Category | Reiser4 Time (2006) | XFS Time (2006) | Source |
|---|---|---|---|
| Sequential Write (1 GB file) | 25.40 s | 15.87 s | Linux Gazette |
| Random I/O (1000-byte split) | 2.95 s | 4.87 s | Linux Gazette |
| Random I/O (1024-byte split) | 2.61 s | 4.01 s | Linux Gazette |

| Test Category | Reiser4 Performance (2013) | Comparison to ext4/Btrfs | Source |
|---|---|---|---|
| File Creation | Up to 2x faster | Superior to both | Phoronix |
| Directory Listings | Efficient for large sets | Outperforms both | Phoronix |
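The 2006 timings in the first table above can be turned into relative speed figures with a short calculation. The snippet is only a convenience sketch; the dictionary and function names are hypothetical, and the numbers are copied from that table.

```python
# 2006 timings from the table above: (Reiser4 seconds, XFS seconds).
TIMINGS_2006 = {
    "Sequential write (1 GB file)": (25.40, 15.87),
    "Random I/O (1000-byte split)": (2.95, 4.87),
    "Random I/O (1024-byte split)": (2.61, 4.01),
}

def speedup_over_xfs(reiser4_s: float, xfs_s: float) -> float:
    """XFS time divided by Reiser4 time; values above 1 favour Reiser4."""
    return xfs_s / reiser4_s

for test, (reiser4_s, xfs_s) in TIMINGS_2006.items():
    print(f"{test}: {speedup_over_xfs(reiser4_s, xfs_s):.2f}x")
```

The ratios come out at roughly 0.62 for the sequential write and about 1.65 and 1.54 for the two small-file splits, which matches the mixed picture described in the text.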
