Virtual memory compression
Virtual memory compression is a memory management technique that compresses inactive or less frequently accessed data pages within physical RAM to increase the effective capacity of memory, thereby reducing the need for slower disk swapping and minimizing page faults in virtual memory systems.[1]
By dedicating a portion of RAM to a compressed cache, this approach inserts an intermediate layer between uncompressed active memory and disk storage, where pages are losslessly compressed using algorithms like WKdm or LZO before storage and decompressed only upon access.[1][2] Compression ratios often achieve around 2:1, allowing systems to retain more working sets in RAM while balancing CPU overhead from compression/decompression operations.[3] Adaptive mechanisms, such as dynamic resizing of the compressed pool based on workload locality and cost-benefit analysis, further optimize performance by prioritizing recently used pages via policies like least recently used (LRU).[1]
The primary benefits include substantial reductions in paging costs—typically 20-80% (averaging 40%)—and enhanced system responsiveness, particularly as CPU speeds continue to outpace disk latency improvements, making disk access increasingly costly.[1] However, potential drawbacks involve CPU cycles consumed by compression (mitigated in modern multi-core systems) and challenges in achieving high compression ratios for all data types, which can limit effectiveness in diverse workloads.[1][2]
Research on compressed caching dates back to Paul R. Wilson's proposal in 1990, with foundational studies in 1999 demonstrating its viability through simulations on real workloads.[1] Practical implementations emerged in the 2010s: Microsoft introduced memory compression in Windows 10 in 2015 to preserve data in RAM and reduce hard page faults; macOS added it starting with version 10.9 (Mavericks) in 2013 to compress inactive pages and free up space; and Linux has offered zram—a compressed RAM block device for swap or temporary storage—as a kernel module since around 2011, providing fast I/O with expected 2:1 ratios.[4][5][3] These features have since become standard in commodity systems, evolving with hardware support for efficient decompression.[2]
Fundamentals
Definition and Purpose
Virtual memory compression is a memory management technique in operating systems that compresses inactive or less frequently accessed pages in physical RAM to reduce their storage footprint, thereby freeing up space for active processes and minimizing the need for swapping to slower disk storage.[1] This approach stores compressed pages within a dedicated portion of RAM, creating an intermediate layer in the memory hierarchy that holds data in a denser form without immediate eviction to secondary storage.[1] Unlike traditional paging, which directly evicts pages to disk when memory pressure arises, compression acts as a buffer to retain more data in fast-access memory.[6]
The primary purpose of virtual memory compression is to extend the usable capacity of physical RAM without requiring hardware upgrades, allowing systems to handle larger workloads or more concurrent processes under memory constraints.[1] By reducing paging activity to disk, it mitigates out-of-memory conditions and improves overall system performance, as disk I/O operations are significantly slower than RAM access—often by orders of magnitude due to latency differences.[1] This technique is particularly beneficial in environments where memory is limited relative to demand, such as embedded or resource-constrained systems, enabling better resource utilization and responsiveness.[6]
Virtual memory compression builds upon the foundational concepts of traditional virtual memory, which abstracts physical memory limitations by mapping virtual addresses to physical ones and using paging for overflow.[1] It introduces compression as an intermediate step before full eviction to disk, increasing the effective memory size by allowing more pages to remain resident in RAM through size reduction.[1]
A key aspect of virtual memory compression is its reliance on lossless compression algorithms, which shrink page sizes while preserving all original data, distinguishing it from deduplication—which eliminates redundant copies across pages—and from encryption—which prioritizes data security over size reduction.[1] This focus on pure size reduction ensures that decompressed pages can be restored exactly as they were, maintaining system correctness without altering content semantics.[6]
Core Mechanisms
Virtual memory compression integrates into the operating system's memory management by intercepting pages during the swap-out phase of reclamation, triggered when physical memory pressure rises and free pages drop below configurable thresholds monitored by the kernel's memory management subsystem. This activation occurs through mechanisms such as page fault handlers or direct reclaim paths, where the kernel identifies eligible pages—typically anonymous or clean file-backed ones—for potential compression before they are written to slower storage. In systems like Linux, this is facilitated by APIs such as Frontswap, which hook into the swap subsystem to divert pages from disk I/O.[7][1]
The compression process begins with page selection based on recency or working set analysis, compressing candidate pages in fixed-size blocks, often 4 KB, using lightweight algorithms to minimize CPU overhead. Compressed data is then allocated into a dedicated RAM pool, with metadata structures—such as red-black trees or hash tables—recording the original virtual address, compressed size, and storage location for quick retrieval. This pool operates as a cache layer, dynamically resizing based on available memory and compression efficacy, storing blocks at ratios typically around 2:1 to 3:1 depending on data patterns. Special handling for uniform pages, like zero-filled ones, skips full compression by marking them with minimal metadata, avoiding unnecessary computation.[7][3][1]
Decompression is demand-driven, occurring on-the-fly during page faults when a compressed page is accessed: the kernel retrieves the block from the pool, expands it using the matching decompressor, and faults it back into physical RAM for use. This process supports efficient partial-page access by decompressing only required portions if the underlying storage allows, reducing latency compared to full-page operations. Integration with the virtual memory subsystem involves modifications to allocators like the buddy system or shadow page tables to distinguish compressed from uncompressed regions, enabling transparent mapping without altering application address spaces. The effective memory gain can be modeled as
effective memory ≈ (total RAM − compressed pool size) + (compressed pool size × average compression ratio),
which illustrates how the compressed pool extends usable memory by factoring in the ratio achieved during compression.[1][7]
Error handling ensures reliability by monitoring compression outcomes; if a page fails to compress adequately (e.g., below a threshold ratio) or encounters allocation errors due to pool exhaustion, the system falls back to traditional disk swapping, evicting the uncompressed page to backing storage via least-recently-used policies. Pool limits, such as a maximum percentage of total RAM, trigger evictions of the least valuable compressed entries to maintain balance, with invalidated pages freed immediately to prevent leaks. These safeguards prevent data loss while prioritizing performance under varying loads.[7][3]
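The swap-out and fault paths described above can be illustrated with a minimal user-space sketch; this is not kernel code. Python's zlib stands in for the fast in-kernel compressors (LZ4, LZO, WKdm), and the pool dictionary, the acceptance threshold, and the "disk" dictionary are simplified stand-ins for the kernel's compressed cache, its acceptance ratio, and the backing swap device.
    import os
    import zlib

    PAGE_SIZE = 4096
    MIN_RATIO = 1.2          # assumed acceptance threshold; real systems tune this
    compressed_pool = {}     # page id -> compressed bytes (metadata kept implicitly)
    backing_swap = {}        # stand-in for the disk swap device

    def swap_out(page_id, page):
        """Compress a reclaimed page into the RAM pool, or fall back to 'disk'."""
        blob = zlib.compress(page, level=1)          # fast, low-ratio setting
        if len(blob) * MIN_RATIO <= PAGE_SIZE:       # compressed well enough?
            compressed_pool[page_id] = blob          # stays in RAM, compressed
        else:
            backing_swap[page_id] = page             # incompressible: evict as-is

    def page_fault(page_id):
        """On access, decompress from the pool if present, else read from 'disk'."""
        if page_id in compressed_pool:
            return zlib.decompress(compressed_pool.pop(page_id))
        return backing_swap.pop(page_id)

    # A zero-filled page compresses extremely well; a random page does not.
    swap_out(1, bytes(PAGE_SIZE))
    swap_out(2, os.urandom(PAGE_SIZE))
    assert page_fault(1) == bytes(PAGE_SIZE)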
Types
Swap-Based Compression
Swap-based compression treats compressed memory pages as a virtual swap device residing entirely within RAM, simulating traditional disk swap functionality without any involvement of secondary storage. In this approach, when the operating system needs to evict pages from physical memory due to pressure, it compresses them using a selected algorithm and stores the resulting data in a dedicated in-RAM block device, effectively expanding the available swap space through compression rather than relying on slower disk I/O. This method originated from early efforts to enhance swap efficiency in Linux, where researchers implemented a compressed RAM disk to store swapped pages, achieving average compression ratios exceeding 50% with the LZO algorithm.[8][3]
Key characteristics of swap-based compression include its fully diskless operation, where all compression and storage occur in RAM, eliminating disk latency and wear—making it ideal for systems with solid-state drives (SSDs) prone to degradation from frequent writes or for embedded devices lacking persistent storage. Compression is performed proactively before pages are "swapped" to the virtual device, allowing the system to handle memory pressure more responsively than traditional swapping. The zram module in the Linux kernel—formerly known as compcache in its initial implementations—exemplifies this, creating compressed block devices such as /dev/zram0 that can be formatted and activated as swap space.[3][9]
A primary advantage unique to this type is the complete avoidance of disk access, resulting in significantly reduced latency for swap operations compared to disk-based alternatives; for instance, early benchmarks showed speedups of 1.2 to 2.1 times in application performance under memory stress. Compression ratios typically range from 2:1 to 3:1 for mixed workloads, effectively doubling or tripling the usable swap capacity within the same RAM footprint, though actual ratios depend on data compressibility and the chosen algorithm.[8][3]
Configuration of swap-based compression involves kernel parameters to allocate device size and select algorithms, often managed via sysfs interfaces. For example, the device size is set using the disksize attribute (e.g., 512 MB), while the compression algorithm is chosen from options like LZ4 (default in recent kernels) or LZO to balance speed and ratio; a memory limit can also be imposed via mem_limit to cap RAM usage. These settings allow administrators to tune the system for specific workloads, such as setting the device to half the physical RAM to align with expected 2:1 compression.[3]
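The sysfs writes just described can be scripted; the sketch below assumes an 8 GB machine, an already-created /dev/zram0 (zram module loaded), and root privileges, and it simply mirrors the echo-style commands used elsewhere in this article rather than defining any new interface.
    from pathlib import Path

    DEV = Path("/sys/block/zram0")       # assumes the zram module is loaded
    RAM_BYTES = 8 * 1024**3              # assumed physical RAM of the machine

    # The algorithm is chosen before the device is sized; the device is then
    # set to half of RAM, matching the expected ~2:1 compression ratio above.
    (DEV / "comp_algorithm").write_text("lz4")
    (DEV / "mem_limit").write_text(str(RAM_BYTES // 4))   # optional cap on RAM used
    (DEV / "disksize").write_text(str(RAM_BYTES // 2))
    # Then: mkswap /dev/zram0 && swapon /dev/zram0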
Cache-Based Compression
Cache-based compression integrates compressed pages into the main memory hierarchy by storing them in a dedicated pool within RAM, alongside uncompressed pages in the page cache, which allows for faster access than disk-based alternatives. This approach adds an intermediate level to the virtual memory system, where pages destined for swapping are compressed on-the-fly and retained in RAM if space permits, with decompression performed only upon demand to minimize latency for active workloads. Unlike purely diskless swap-based methods, it can evict pages to disk storage if the compressed pool fills.[1][10]
Key characteristics include seamless integration with the existing page cache, enabling transparent operation without requiring modifications to user-space code. This method achieves higher integration with active memory regions by prioritizing the retention of "hot" pages in uncompressed form while compressing "cold" ones, thereby optimizing overall system responsiveness.[1] A prominent example is zswap in the Linux kernel, which compresses pages before they enter the swap cache, storing them in a RAM-based compressed pool to avoid immediate disk writes. Another example is the WK-class algorithms, developed for compatibility with buddy allocators, which enable efficient allocation and deallocation of variable-sized compressed blocks in virtual memory systems without disrupting standard memory management structures.[10][1]
Unique advantages include partial reduction of swap I/O through background compression, which keeps more pages in RAM and mitigates disk bottlenecks during memory pressure. It also supports prioritization of hot pages by evicting compressed cold pages first, improving hit rates in the active memory pool for workloads with temporal locality.[10][1] Trade-offs involve more complex memory mapping to handle both compressed and uncompressed formats, increasing kernel overhead for page table management. This approach is particularly effective for workloads featuring compressible cold data, such as databases or virtual machines with sparse access patterns, but it may underperform if compression ratios are low, owing to the added CPU cycles for on-demand decompression.[1][10]
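The pool-fills-then-writes-back behavior can be sketched in a few lines of user-space Python; the byte budget, the oldest-first eviction order, and zlib standing in for the kernel's compressors are all simplifying assumptions, not zswap's actual implementation.
    import zlib
    from collections import OrderedDict

    POOL_BUDGET = 64 * 1024      # assumed cap on compressed bytes held in RAM
    pool = OrderedDict()         # page id -> compressed bytes, oldest entries first
    pool_bytes = 0
    backing_swap = {}            # stand-in for the real swap device

    def cache_swap_out(page_id, page):
        """Compress into the RAM cache; write back the coldest entries if over budget."""
        global pool_bytes
        blob = zlib.compress(page, level=1)
        pool[page_id] = blob
        pool_bytes += len(blob)
        while pool_bytes > POOL_BUDGET:                       # pool full
            old_id, old_blob = pool.popitem(last=False)       # oldest entry
            pool_bytes -= len(old_blob)
            backing_swap[old_id] = zlib.decompress(old_blob)  # falls through to disk

    def cache_page_fault(page_id):
        """Serve a fault from the compressed cache if possible, else from 'disk'."""
        global pool_bytes
        if page_id in pool:
            blob = pool.pop(page_id)
            pool_bytes -= len(blob)
            return zlib.decompress(blob)
        return backing_swap.pop(page_id)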
Algorithms and Techniques
Compression Algorithms
Virtual memory compression employs several algorithms optimized for the unique characteristics of memory pages, such as their typical size of 4 KB and the need for rapid compression and decompression to minimize latency in page swaps. The LZ family of algorithms, including LZO and LZ4, is widely adopted due to its balance of speed and efficiency; LZO achieves compression ratios around 2:1 on average for memory data, while LZ4 offers similar ratios of 2:1 to 2.5:1 with even faster performance, compressing at over 500 MB/s and decompressing at speeds exceeding 1 GB/s per core.[1] Zstandard (zstd), a more recent algorithm, is also commonly used in modern systems like Linux zram, providing compression ratios of 2:1 to 4:1 with speeds comparable to LZ4 while offering better ratios for diverse data types.[3] WKdm, a word-based algorithm tailored for memory pages, operates on 32-bit words using a small direct-mapped dictionary to exploit patterns like repeated integers and pointers, yielding average ratios of about 2:1 and up to 3:1 for compressible workloads such as text and code, with compression fast enough for real-time use on modern hardware.[1][11]
Algorithm selection in virtual memory systems prioritizes a trade-off between compression ratio and computational overhead, as higher ratios often cost additional CPU cycles. LZ4 is particularly favored for low-latency scenarios, with decompression latencies under 1 μs per 4 KB page, making it suitable for real-time page access in compressed caches or swap spaces like zram.[11] In contrast, WKdm provides better ratios for structured data at a modest speed cost, compressing about 2.3 times slower than a raw memory copy but still enabling overall system throughput improvements.[11] Zstd balances these trade-offs effectively in contemporary workloads.
The effectiveness of these algorithms is quantified by the compression ratio, defined as
compression ratio = uncompressed size / compressed size.
Compression typically occurs at the block level, treating entire 4 KB pages as units to align with virtual memory paging granularity, which simplifies integration with page tables and reduces fragmentation overhead. For incompressible pages—such as those containing random or encrypted data—algorithms like LZ4 and WKdm either store the data uncompressed to avoid size expansion or flag it for bypassing compression, ensuring no net loss in effective memory usage.[1][12][13]
To optimize performance across diverse workloads, some virtual memory compression schemes incorporate adaptivity, dynamically selecting or tuning algorithms based on page characteristics or system load—for example, applying higher-ratio methods like WKdm to idle pages with more compressible patterns while using faster LZ4 for active ones.[14]
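The ratio and speed trade-offs discussed in this subsection can be probed from user space. The sketch below times compression of a single 4 KB page; Python's zlib merely stands in for LZ4, zstd, or WKdm (all much faster in kernel use), so only the relative behavior of patterned versus random data is meaningful, not the absolute numbers.
    import os
    import time
    import zlib

    PAGE_SIZE = 4096

    def measure(page, level, rounds=10000):
        """Return (compression ratio, MB/s) for one 4 KB page at a given zlib level."""
        blob = zlib.compress(page, level=level)
        ratio = len(page) / len(blob)        # ratio = uncompressed size / compressed size
        start = time.perf_counter()
        for _ in range(rounds):
            zlib.compress(page, level=level)
        elapsed = time.perf_counter() - start
        return ratio, PAGE_SIZE * rounds / elapsed / 1e6

    patterned = b"\x00\x00\x40\x7f" * (PAGE_SIZE // 4)   # pointer-like repeated words
    random_page = os.urandom(PAGE_SIZE)                  # models encrypted/random data
    for name, page in (("patterned", patterned), ("random", random_page)):
        for level in (1, 6):                             # fast vs. higher-ratio setting
            ratio, mbps = measure(page, level)
            print(f"{name:9s} level={level}: {ratio:5.1f}:1 at ~{mbps:.0f} MB/s")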
Page Management Strategies
In virtual memory compression systems, page management strategies govern the selection, storage, and reclamation of pages to optimize the use of a compressed memory pool, effectively extending available RAM without immediate disk I/O. These strategies typically integrate with existing virtual memory hierarchies by treating the compressed pool as an intermediate layer between uncompressed RAM and swap space, using heuristics to balance compression overhead against memory gains. Seminal work on compressed caching highlights the importance of adaptive policies that track page recency and compressibility to decide when to compress pages evicted from main memory.[1]
Page selection for compression relies on heuristics such as least recently used (LRU) ordering based on page age, access frequency tracking via reference bits, or compressibility predictions derived from historical data patterns. For instance, LRU identifies inactive pages likely to remain unused, while access frequency prioritizes less-referenced ones to minimize decompression latency upon reuse; compressibility prediction, often using simple models like last-compression ratios, avoids wasting CPU cycles on incompressible data by estimating potential size reduction before full compression. These methods ensure that only evictable pages—those not in the active working set—are targeted, with predictions achieving up to 98% accuracy in selecting compressible candidates in memory-intensive workloads.[1][2]
Storage of compressed pages occurs in a dedicated pool within physical memory, employing metadata structures such as hash tables or auxiliary page tables for rapid lookup and mapping of variable-sized compressed blocks to fixed virtual pages. To mitigate fragmentation, allocation strategies use contiguous blocks or log-structured buffers that append new compressed pages sequentially, avoiding the need for frequent compaction; page tables are extended with flags indicating compression status, size, and offset in the pool, enabling efficient address translation with minimal overhead (e.g., 64 bytes of metadata per page). This approach supports compression ratios around 2:1 on average for text and code-heavy workloads, doubling effective memory capacity without altering the virtual address space.[1][15][16]
Reclamation begins when the compressed pool reaches capacity, triggering the decompression of selected pages—typically the oldest or least recently accessed via LRU queues—and their relocation to disk swap, thereby freeing space for new compressions. Prioritization favors pages with high compressibility to maintain pool efficiency, decompressing only when necessary to avoid thrashing; in full-pool scenarios, multiple pages may be batched for eviction to amortize I/O costs. Adaptations of traditional policies, such as clock algorithms using reference bits for approximate LRU or FIFO queues in circular buffers, guide this process by scanning pages in a sweeping manner or evicting in arrival order, respectively. If a page's projected compression ratio falls below a threshold (e.g., 2:1), it may bypass the pool and swap directly to disk, preserving resources for more beneficial candidates.[1][15]
Integration of these strategies modifies core virtual memory components, such as zone allocators for reserving compressed regions or slab allocators for metadata, ensuring seamless handling of variable page sizes without disrupting application-visible addressing.
Page table extensions track compressed locations and status bits, allowing the memory management unit to route faults appropriately; this decouples compression from base paging algorithms, enabling dynamic pool sizing (e.g., 10-50% of RAM) based on workload demands. Overall, these mechanisms reduce page faults by 20-80% in simulated environments compared to uncompressed swapping, depending on data locality and compressibility.[16][1]
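A compact sketch of the selection step described in this section, under simplifying assumptions: each candidate page carries a hypothetical LRU timestamp and the ratio it achieved the last time it was compressed, that last ratio serves as the compressibility predictor, and anything predicted below 2:1 bypasses the pool and goes straight to disk.
    from dataclasses import dataclass

    BYPASS_RATIO = 2.0           # assumed threshold below which a page skips the pool

    @dataclass
    class PageInfo:
        page_id: int
        last_access: float       # recency (timestamp or LRU position)
        last_ratio: float        # ratio observed the last time this page was compressed

    def pick_victims(candidates, n):
        """Pick up to n cold pages; split them into compress vs. bypass-to-disk."""
        by_age = sorted(candidates, key=lambda p: p.last_access)   # approximate LRU
        compress, bypass = [], []
        for page in by_age[:n]:
            # Cheap compressibility prediction: reuse the last observed ratio.
            (compress if page.last_ratio >= BYPASS_RATIO else bypass).append(page)
        return compress, bypass

    pages = [PageInfo(1, 10.0, 2.8), PageInfo(2, 5.0, 1.1), PageInfo(3, 7.0, 3.2)]
    to_compress, to_disk = pick_victims(pages, n=2)
    print([p.page_id for p in to_compress], [p.page_id for p in to_disk])  # [3] [2]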
Benefits
Performance Enhancements
Virtual memory compression significantly reduces input/output (I/O) operations by keeping more pages in compressed form within RAM, thereby minimizing disk accesses during memory pressure. In scenarios with high swap activity, such as server workloads, compressed caching has been shown to reduce paging costs by 20% to 80%, averaging approximately 40%, by avoiding costly disk faults that number in the tens of thousands per run.[1]
The technique also benefits from multi-core processor architectures, where compression streams are allocated per CPU core to enable parallel processing of page compressions and decompressions. This parallelism enhances throughput in memory-bound applications; for instance, zswap on multi-core systems can deliver up to 40% performance gains in benchmarks like SPECjbb2005 under heap sizes exceeding physical memory.[17][18] Similar gains from multi-core scaling have been observed in swap-intensive tasks, leveraging modern CPUs' ability to handle concurrent compression threads efficiently.
Improvements are particularly pronounced in workload-specific contexts like desktop and multimedia applications, where compression reduces response times under pressure without excessive overcommitment in virtualized setups. For example, in Citrix Virtual Apps environments, enabling memory compression drops page file usage from over 3% to nearly 0% by curtailing I/O bottlenecks.[19] Latency metrics also highlight the advantage: decompressing a page is significantly faster than servicing a page fault from disk, which typically takes 5–10 milliseconds, effectively boosting system responsiveness. In mobile systems, these I/O and latency reductions contribute to power savings by minimizing disk accesses during app relaunch and multitasking.
Resource Efficiency
Virtual memory compression extends effective RAM capacity by achieving compression ratios typically ranging from 2:1 to 3:1, allowing systems to store more active pages in physical memory without immediate eviction to storage.[20][1] For instance, a system with 4 GB of RAM can effectively behave as if it has 8–12 GB available, enabling sustained operation of memory-intensive workloads under constrained conditions.[1] This extension arises from compressing less frequently accessed pages into a smaller footprint within RAM, thereby delaying or preventing the need for slower disk-based paging.
By prioritizing compressed storage in RAM over traditional swapping, virtual memory compression significantly reduces disk I/O operations, minimizing wear on SSDs and HDDs.[21] In setups without persistent storage, such as diskless embedded configurations, background I/O can approach zero since all swapping occurs within compressed RAM, preserving storage longevity and eliminating mechanical degradation risks associated with frequent writes.[20] This approach is particularly beneficial for flash-based storage, where write cycles are limited, as fewer pages reach the backing device.
In resource-limited environments, virtual memory compression proves essential for embedded devices like IoT systems with less than 1 GB of RAM, where it maximizes available memory for real-time tasks without hardware upgrades.[22] Similarly, in virtualization scenarios, it supports higher virtual machine density on host servers by compressing guest memory pages, allowing more instances to run concurrently on the same physical hardware.[23] These savings stem from the ability to hold compressed data equivalent to a larger uncompressed volume, optimizing overall storage allocation without compromising accessibility.
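A short worked example of the capacity arithmetic, reusing the effective-memory relation given under Core Mechanisms; the 2:1 ratio and the share of RAM devoted to compressed pages are illustrative assumptions, which is also why realistic gains sit below the idealized 2–3x upper bound.
    ram_gb = 4.0
    pool_fraction = 0.75     # assumed share of RAM holding compressed pages
    ratio = 2.0              # assumed average compression ratio

    pool_gb = ram_gb * pool_fraction
    effective_gb = (ram_gb - pool_gb) + pool_gb * ratio
    print(f"{effective_gb:.1f} GB effective from {ram_gb:.0f} GB physical")   # 7.0 GB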
Shortcomings
CPU and Latency Overhead
Virtual memory compression imposes notable computational costs on the CPU for both compressing pages during swap-out and decompressing them upon access, primarily due to the intensive nature of lossless algorithms applied to memory pages. In software-based implementations, this typically results in 5–15% CPU utilization overhead on single-core systems under moderate memory pressure, though the impact diminishes with multi-core scaling; for instance, the LZ4 algorithm, commonly used in zram and zswap, achieves compression speeds exceeding 500 MB/s per core, leading to less than 5% overhead on 8-core configurations during balanced workloads.[24][25] These costs arise from the need to process 4 KB pages in real time, where decompression latency adds 1–5 μs per page, calculated from LZ4's decoder throughput of over 3 GB/s, and can accumulate during high swap activity.[25]
Several factors influence this overhead, including the choice of compression algorithm—fast options like LZ4 prioritize low latency and CPU use at the expense of slightly lower compression ratios (around 2:1), while higher-ratio alternatives such as zstd or lzo-rle offer better space savings but increase processing time by 20–50% in kernel benchmarks.[26] Additionally, thread contention in kernel space exacerbates costs, as compression operations compete with other system tasks in the swapper context, potentially spiking utilization to 10–20% during peak memory pressure when background compression queues fill.[27]
To mitigate these drawbacks, modern systems employ asynchronous compression queues; proposals like the kcompressd mechanism (as of 2025) offload compression from the main kswapd reclaimer thread to dedicated workers, potentially reducing page allocation stalls by over 50% and lowering overall CPU overhead under pressure by allowing parallel processing across cores. Emerging hardware acceleration, such as specialized instructions in modern CPUs, further reduces software overheads.[28][2]
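The latency and utilization figures quoted above follow from straightforward throughput arithmetic; the sketch below plugs in the cited LZ4-class speeds and an assumed swap-out rate to show where the per-page microseconds and the per-core percentages come from.
    PAGE = 4096                    # bytes per page
    DECOMPRESS_BPS = 3e9           # assumed LZ4-class decompression throughput per core
    COMPRESS_BPS = 500e6           # assumed LZ4-class compression throughput per core
    SWAP_OUT_RATE = 20_000         # assumed pages compressed per second under pressure

    print(f"decompressing one page: {PAGE / DECOMPRESS_BPS * 1e6:.1f} us")             # ~1.4 us
    print(f"compression load: {SWAP_OUT_RATE * PAGE / COMPRESS_BPS:.0%} of one core")  # ~16%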
Effectiveness Limitations
Virtual memory compression typically achieves compression ratios of 2:1 to 2.5:1 across common workloads, though these vary based on the underlying algorithms employed, such as LZ-style methods.[1][29] For incompressible data types like multimedia files, encrypted content, or random data patterns, ratios often fall below 1.5:1, rendering the effort less effective.[1]
The effectiveness of compression is highly workload-dependent, performing poorly on active pages with low compressibility—such as those involving real-time media processing or pseudorandom computations—while yielding better results on idle text, code segments, or data exhibiting repetition such as integers and pointers.[1] In cache-based systems, when the compressed pool fills under extreme memory pressure, fallback to disk storage can introduce minor background I/O operations, particularly for pages that resist compression.[27] Real-world benchmarks demonstrate effective memory savings of 20% to 40%, falling short of the full theoretical ratio due to the metadata overhead associated with tracking compressed blocks, which imposes additional storage and access costs.[1][2]
Thrashing and Prioritization Issues
In virtual memory compression systems, thrashing can intensify under memory pressure as frequent compression and decompression cycles emulate the excessive paging of traditional disk-based swapping. When the compressed pool overflows while available RAM is low, pages must be decompressed to make room for new ones, spiking page faults and CPU overhead in a self-reinforcing loop similar to disk thrashing. For instance, in benchmarks with working sets exceeding physical memory, full compression without selectivity can lead to over 1 page fault per 1000 instructions, exacerbating contention.[30]
Prioritization challenges arise because accurately ranking compressed pages for eviction is difficult, as compression obscures access recency and compressibility patterns. Standard LRU mechanisms applied to compressed regions may inefficiently reclaim "cold" pages that could have remained compressed longer, resulting in repeated decompression and recompression loops that waste resources. Adaptive approaches attempt to mitigate this by resizing the compressed cache based on recent usage, but imperfect predictions can still lead to suboptimal selections, particularly when pages vary in compressibility.[1][30]
In low-RAM scenarios, such as systems with less than 4 GB of memory under oversubscription (e.g., 150% utilization), fault rates can double or more compared to uncompressed setups, as decompression demands amplify contention. Solutions like hysteresis thresholds in page management help prevent oscillation by maintaining buffers before resizing the compressed pool, stabilizing behavior during pressure.[30]
The overall impact includes responsiveness drops of 30% or more in worst-case thrashing, with some workloads slowing by 3x due to unchecked cycles, in contrast to traditional swapping's more predictable disk I/O latency. This highlights the need for careful tuning to avoid turning compression into a performance bottleneck rather than a relief.[30]
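A minimal sketch of the hysteresis idea mentioned above: the compressed pool only grows once free memory drops below a low watermark and only shrinks once it rises above a high one, so fluctuations inside the dead band cannot trigger constant resizing. The watermarks and step size are arbitrary illustrative values, not parameters of any real kernel.
    LOW_WATERMARK = 0.10     # grow the pool below 10% free RAM (assumed)
    HIGH_WATERMARK = 0.25    # shrink it only above 25% free RAM (assumed)
    STEP_PAGES = 1024        # resize granularity (assumed)

    def adjust_pool(free_fraction, pool_pages, max_pages):
        """Resize with hysteresis: hold steady while free memory sits between watermarks."""
        if free_fraction < LOW_WATERMARK:
            return min(pool_pages + STEP_PAGES, max_pages)
        if free_fraction > HIGH_WATERMARK:
            return max(pool_pages - STEP_PAGES, 0)
        return pool_pages    # dead band: no oscillation around a single threshold

    pool = 4096
    for free in (0.08, 0.12, 0.20, 0.30):
        pool = adjust_pool(free, pool, max_pages=16384)
        print(free, pool)    # grows at 0.08, holds at 0.12 and 0.20, shrinks at 0.30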
Implementations
Linux Kernel (zram and zswap)
zram is a Linux kernel module that implements a compressed block device residing entirely in RAM, commonly utilized as swap space to avoid disk I/O and enhance memory efficiency on systems with limited physical RAM. Introduced to the mainline kernel in version 3.14 (released in 2014), it compresses data on-the-fly using algorithms like LZ4 or Zstd, allowing a portion of RAM to simulate a larger swap area through compression ratios typically ranging from 2:1 to 4:1 depending on workload.[3] The module creates devices such as /dev/zram0, which can be formatted and activated as swap with commands like mkswap /dev/zram0 followed by swapon /dev/zram0.
Configuration of zram occurs through sysfs interfaces under /sys/block/zram<id>/, enabling fine-tuned control without recompiling the kernel. Key parameters include disksize, which sets the virtual device capacity (e.g., echo 1G > /sys/block/zram0/disksize to allocate a 1 GB compressed swap device), and comp_algorithm, which selects the compression method (e.g., echo lz4 > /sys/block/zram0/comp_algorithm for fast, low-ratio compression suitable for real-time workloads).[3] zram has supported multi-stream compression since kernel version 3.15, with further enhancements in kernels from 6.2 onward allowing parallel operations across CPU cores via up to four concurrent streams, and it benefits from general memory management refinements in Linux 6.16 (released July 2025).[3][31]
zswap functions as a lightweight, compressed RAM-based cache for pages being swapped out, intercepting them before they reach the backing swap device and storing them in a dynamic pool managed by the zsmalloc allocator. Merged into the kernel in version 3.11 (2013), it consolidates similar pages to minimize storage and employs techniques like same-value page detection—storing zero-filled or identical pages without full compression—to achieve inherent deduplication, thereby improving swap efficiency.[10] This frontswap approach evicts least-recently-used entries to disk only when the pool fills, preserving hot pages in compressed form for faster retrieval.
Enabling zswap is straightforward, via the boot parameter zswap.enabled=1 or a runtime toggle with echo 1 > /sys/module/zswap/parameters/enabled, and it integrates seamlessly with existing swap configurations.[10] Tuning options include max_pool_percent (e.g., set to 20 to cap the pool at 20% of system RAM, adjustable via /sys/module/zswap/parameters/max_pool_percent), which balances memory usage against compression benefits, and accept_threshold_percent for controlling refill behavior post-eviction. Ongoing developments as of November 2025 include proposed compression batching improvements with support for hardware-accelerated drivers like Intel IAA.[10][32]
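Once a zram device is active, the ratio it actually achieves can be checked from user space. The sketch below reads /sys/block/zram0/mm_stat, whose leading fields are documented in the kernel's zram ABI as the original data size, compressed data size, and total memory used (in bytes); exact field layout varies somewhat across kernel versions, so treat the parsing as illustrative.
    from pathlib import Path

    def zram_ratio(dev="zram0"):
        """Report the compression ratio an active zram device is actually achieving."""
        fields = Path(f"/sys/block/{dev}/mm_stat").read_text().split()
        orig, compr, used = (int(x) for x in fields[:3])   # bytes, per zram documentation
        if not compr or not used:
            return None
        return orig / compr, orig / used                   # data ratio, ratio incl. overhead

    ratios = zram_ratio()
    if ratios:
        print(f"data ratio {ratios[0]:.2f}:1, including allocator overhead {ratios[1]:.2f}:1")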
Both zram and zswap are tunable for optimal performance in resource-constrained environments, with parameters like zram's streams or zswap's pool limits configurable to match hardware. Benchmarks on low-RAM systems, such as those with 4 GB in Ubuntu 24.04, demonstrate that enabling either can yield up to 2x effective memory extension through compression, significantly reducing swap-induced latency compared to traditional disk swap—though exact gains vary by workload, with zram often preferred for its simplicity in no-disk-swap setups.[24] From 2023 to 2025, kernel enhancements have targeted ARM64 efficiency, including better zsmalloc handling for big-endian and low-memory allocators, making these features standard in VPS deployments where 2:1 compression ratios extend viable RAM for server tasks without additional hardware.[33]
