Stack overflow
In software, a stack overflow occurs when the call stack pointer exceeds the stack bound. The call stack may consist of a limited amount of address space, often determined at the start of the program. The size of the call stack depends on many factors, including the programming language, machine architecture, multi-threading, and amount of available memory. When a program attempts to use more space than is available on the call stack (that is, when it attempts to access memory beyond the call stack's bounds, which is essentially a buffer overflow), the stack is said to overflow, typically resulting in a program crash.[1]
Causes
Infinite recursion
The most common cause of stack overflow is excessively deep or infinite recursion, in which a function calls itself so many times that the space needed to store the variables and information associated with each call exceeds what can fit on the stack.[2]
Here is an example of infinite recursion in C:
int foo() {
return foo();
}
The function foo, when it is invoked, continues to invoke itself, allocating additional space on the stack each time, until the stack overflows, resulting in a segmentation fault.[2] However, some compilers implement tail-call optimization, allowing infinite recursion of a specific sort—tail recursion—to occur without stack overflow. This works because tail-recursion calls do not take up additional stack space.[3]
Some C compiler options will effectively enable tail-call optimization; for example, compiling the above simple program using gcc with -O1 will result in a segmentation fault, but not when using -O2 or -O3, since these optimization levels imply the -foptimize-sibling-calls compiler option.[4] Other languages, such as Scheme, require all implementations to include tail-recursion as part of the language standard.[5]
Very deep recursion
A recursive function that terminates in theory but overflows the call stack in practice can be fixed by transforming the recursion into a loop and storing the function arguments in an explicit stack (rather than using the call stack implicitly). This is always possible, because the class of primitive recursive functions is equivalent to the class of LOOP computable functions. Consider this example in C++-like pseudocode:
Recursive version:

class SomeObject {
public:
    SomeObject* next;
};

void someFunction(SomeObject* argument) {
    if (getCondition()) {
        someFunction(argument->next);
    }
}

Iterative version with an explicit stack:

Stack<SomeObject*> s;
s.push(argument);
while (!s.empty()) {
    argument = s.pop();
    if (getCondition()) {
        s.push(argument->next);
    }
}

A primitive recursive function like the recursive version above can always be transformed into a loop like the iterative one.
A function like the recursive example above would not be a problem in an environment supporting tail-call optimization; however, it is still possible to write a recursive function that overflows the stack even in such environments. Consider the example below of two simple integer exponentiation functions.
Non-tail-recursive version:

int pow(int base, int exp) {
    if (exp > 0) {
        return base * pow(base, exp - 1);
    } else {
        return 1;
    }
}

Tail-recursive version with an accumulator:

int pow(int base, int exp) {
    return pow_accum(base, exp, 1);
}

int pow_accum(int base, int exp, int accum) {
    if (exp > 0) {
        return pow_accum(base, exp - 1, accum * base);
    } else {
        return accum;
    }
}
Both pow(base, exp) functions above compute an equivalent result; however, the first is prone to causing a stack overflow because tail-call optimization is not possible for it. During execution, the stacks for these functions will look like this:
Non-tail-recursive version:

pow(5, 4)
5 * pow(5, 3)
5 * (5 * pow(5, 2))
5 * (5 * (5 * pow(5, 1)))
5 * (5 * (5 * (5 * pow(5, 0))))
5 * (5 * (5 * (5 * 1)))
625

Tail-recursive version:

pow(5, 4)
pow_accum(5, 4, 1)
pow_accum(5, 3, 5)
pow_accum(5, 2, 25)
pow_accum(5, 1, 125)
pow_accum(5, 0, 625)
625
Notice that the non-tail-recursive function must keep exp pending multiplications on its stack, which are performed only when the recursion bottoms out and returns 1. In contrast, the tail-recursive function need only store three integers at any time, and passes an intermediate result to its next invocation. Since no information outside the current function invocation must be retained, a tail-call optimizer can "drop" the prior stack frames, eliminating the possibility of a stack overflow.
Very large stack variables
The other major cause of a stack overflow is an attempt to allocate more memory on the stack than will fit, for example by creating local array variables that are too large. For this reason some authors recommend that arrays larger than a few kilobytes be allocated dynamically instead of as local variables.[6]
An example of a very large stack variable in C:
void foo() {
double x[1048576];
}
On a C implementation with 8-byte double-precision floats, the declared array consumes 8 megabytes; if this is more memory than is available on the stack (as set by thread-creation parameters or operating-system limits), a stack overflow will occur.
Constrained environment
Stack overflows are made worse by anything that reduces the effective stack size of a program. For example, a program run without multiple threads might work fine, but crash as soon as multi-threading is enabled, because most threaded programs have less stack space per thread than a program with no threading support. Because kernels are generally multi-threaded, people new to kernel development are usually discouraged from using recursive algorithms or large stack buffers.[7]
References
1. Burley, James Craig (1991-06-01). "Using and Porting GNU Fortran". Archived from the original on 2012-02-06.
2. "What is the difference between a segmentation fault and a stack overflow?" at Stack Overflow. Archived 2021-09-13 at the Wayback Machine.
3. "An Introduction to Scheme and its Implementation". 1997-02-19. Archived from the original on 2007-08-10.
4. "Using the GNU Compiler Collection (GCC): Optimize Options". Archived from the original on 2017-08-20. Retrieved 2017-08-20.
5. Richard Kelsey; William Clinger; Jonathan Rees; et al. (August 1998). "Revised5 Report on the Algorithmic Language Scheme". Higher-Order and Symbolic Computation. 11 (1): 7–105. doi:10.1023/A:1010051815785. S2CID 14069423. Archived from the original on 2007-01-05. Retrieved 2012-08-09.
6. Feldman, Howard (2005-11-23). "Modern Memory Management, Part 2". Archived from the original on 2012-09-20. Retrieved 2007-08-14.
7. "Kernel Programming Guide: Performance and Stability Tips". Apple Inc. 2014-05-02. Archived from the original on 2014-05-03. Retrieved 2014-05-02.
External links
- The reasons why 64-bit programs require more stack memory. Archived 4 November 2017 at the Wayback Machine.
In Java, for example, the virtual machine reports the condition with a StackOverflowError, explicitly thrown when recursion depth exceeds the virtual machine's configured stack size.[3]
The consequences of a stack overflow can range from immediate program termination with an exception code like 0xC00000FD (STATUS_STACK_OVERFLOW) in Windows to exploitable security vulnerabilities, where attackers overwrite adjacent memory to inject malicious code.[2] To mitigate risks, developers can increase the stack size via compiler flags (e.g., /F in Microsoft Visual C++ to set it larger than the default 1 MB), refactor recursive algorithms into iterative equivalents, or relocate large data structures to the heap using dynamic allocation.[4] Debugging typically involves examining the stack backtrace with tools like the Windows Debugger (k command) or analyzing thread environment blocks (!teb) to inspect stack boundaries and usage.[2]
Beyond its technical significance, the term "stack overflow" also names a prominent online question-and-answer platform for software developers, Stack Overflow, founded in 2008 by Jeff Atwood and Joel Spolsky as a community-driven resource for resolving programming challenges, including those related to stack overflows themselves.[5]
Fundamentals
Definition
A stack overflow is a runtime error in computer programming that occurs when a program's call stack exceeds its predefined memory limit, resulting in undefined behavior such as program crashes, data corruption, or unexpected execution. The call stack, a last-in-first-out data structure, manages function calls, local variables, and return addresses; when it grows beyond the allocated space—often due to deep nesting of function invocations or oversized local allocations—the stack pointer overflows the boundary, triggering the error. This condition is distinct from general memory exhaustion, as it specifically affects the stack segment reserved for execution context.[6][7]

The term "stack overflow" originated in the 1960s with early stack-based computer architectures, notably the Burroughs B5000 system introduced in 1961, where hardware mechanisms explicitly detected and signaled stack boundary violations to prevent unchecked growth. Documentation from the era describes a "Stack Overflow" indicator that activates when the stack register approaches its limit, reflecting the design's emphasis on protected stack operations for languages like ALGOL 60. By the 1970s, as procedural languages such as C gained prominence on systems like Unix, the concept became integral to modern programming practices, where stack management is handled by the operating system or runtime environment.[8][9]

A stack overflow is a type of buffer overflow specific to the call stack. While it shares characteristics with buffer overflows—exceeding allocated memory bounds—it typically results from the exhaustion of the call stack's capacity, such as through excessive recursion, rather than deliberate overwriting of adjacent memory for exploitation, which is more common in general buffer overflows on the stack, heap, or elsewhere.
While stack-based buffer overflows represent a subset where a local buffer on the stack is exceeded, leading to return-address manipulation, the broader stack overflow error emphasizes resource depletion rather than deliberate overwriting for exploitation. This distinction is critical in vulnerability analysis, as stack overflows often manifest as immediate failures rather than subtle security breaches.[7][10]

For illustration, consider a simple recursive function in pseudocode designed to compute the factorial, where excessive depth leads to stack exhaustion:

function factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

factorial(10000) // May trigger stack overflow due to 10,000 nested calls
Call Stack Mechanics
The call stack operates as a last-in, first-out (LIFO) data structure: each function invocation pushes a new stack frame onto the top of the stack, and the frame is popped when the function returns, restoring the previous state.[11] This mechanism manages active subroutines by storing essential execution context, ensuring that nested calls are handled in reverse order of invocation. Push operations occur during function entry, adding the frame's contents, while pop operations deallocate them upon exit, preventing interference between function scopes.[11]

In typical computing environments, the call stack is allocated a fixed or dynamically adjustable portion of memory, in contrast to the heap's more flexible, runtime-managed allocation for dynamic data structures. For instance, many Linux systems default to an 8 MB stack size for the main process (configurable via the ulimit -s command), while new threads default to 2 MB; these can be adjusted via pthread attributes. Windows reserves 1 MB by default for the main thread's stack, with the linker setting this value unless overridden.[12][13][14] These limits provide rapid, contiguous access for stack operations but require careful management to avoid exhaustion, unlike the heap's ability to expand via explicit allocation calls.
A stack frame, or activation record, encapsulates the data for a single function call and typically includes several key components arranged in a specific layout. On x86-64 architectures following the System V ABI, the frame holds the return address (8 bytes, pushed by the CALL instruction), followed by the saved base pointer (RBP, 8 bytes, optionally omitted with frame-pointer optimization), then local variables (variable size, allocated below the frame pointer) and any spilled parameters or temporaries. Parameters are primarily passed in registers (up to six integer arguments in RDI, RSI, RDX, RCX, R8, R9), with excess ones pushed onto the stack above the return address; non-volatile callee-saved registers (e.g., RBX, R12–R15) may also be stored if used.[15] This layout lets the compiler access locals relative to the stack pointer (RSP) or frame pointer (RBP), with a 16-byte alignment requirement for RSP. A simplified representation of a typical x86-64 stack frame is:
Higher addresses
+-------------------+
| Parameter n       | (if >6 params)
+-------------------+
| ...               |
+-------------------+
| Parameter 7       |
+-------------------+
| Return address    | (8 bytes)
+-------------------+
| Saved RBP         | (8 bytes, optional)
+-------------------+
| Local variables   | (variable size)
+-------------------+
| Saved registers   | (if needed, e.g., RBX)
+-------------------+
Lower addresses (RSP points here)
Causes
Infinite Recursion
Infinite recursion represents a fundamental algorithmic error in recursive programming, where a function invokes itself without reaching a terminating base condition, resulting in an endless chain of calls. Each invocation allocates a new stack frame to store local variables, parameters, and return addresses, progressively depleting the call stack's finite memory until overflow occurs. This mechanism ties directly to call-stack growth, as described in the fundamentals of stack mechanics: unchecked frame accumulation exhausts the allocated stack space, typically triggering a runtime error such as StackOverflowError in Java or similar exceptions in other languages.[16] The root cause lies in the design flaw of unbounded recursion, often due to an omitted or incorrectly implemented base case that fails to halt the process. For instance, a naive implementation of a factorial function without termination logic exemplifies this:

def factorial(n):
    return n * factorial(n - 1)
factorial(5) initiates an infinite descent: it calls factorial(4), then factorial(3), and continues without bound, as no condition (e.g., if n <= 1: return 1) stops the recursion. This leads to stack exhaustion after a number of calls proportional to the system's stack limit, often around thousands of frames depending on the environment. Such errors are prevalent in recursive algorithms where progress toward termination is not ensured, including logic flaws in tail recursion—where the recursive call is the final operation but fails to reduce the input appropriately—or in tree traversal routines.[16]
In tree traversals, infinite recursion commonly arises from overlooking base cases for null or leaf nodes, causing the function to recurse on non-existent children indefinitely. For example, a preorder traversal without a null check might repeatedly invoke itself on an empty subtree, mimicking an infinite loop and consuming stack space until failure. These issues highlight the need for explicit termination in recursive designs, particularly in languages without automatic tail-call optimization.
Detecting infinite recursion at compile time remains limited, as termination depends on dynamic runtime conditions like input values and control flow paths, which static analyzers struggle to fully predict without exhaustive path exploration. While simple self-calls without exits can sometimes be flagged, more complex cases involving conditional branches or data-dependent recursion evade compile-time checks, necessitating runtime safeguards.[17]
Historically, during the late 1950s and early 1960s, the pioneering Lisp interpreters introduced recursive evaluation as a cornerstone for symbolic computation in AI applications, but the era's constrained memory and nascent stack management made such logic errors prone to rapid stack exhaustion in practice.[18]
Very Deep Recursion
Very deep recursion occurs when a recursive algorithm terminates correctly but accumulates a finite yet excessively large number of call frames on the stack, due to the scale of the input or the structure of the recursion tree. Unlike infinite recursion, which lacks a base case and runs indefinitely, deep recursion progresses toward termination but overwhelms the stack's limited capacity before completing. This typically arises in algorithms processing large datasets, such as traversing deeply nested structures like unbalanced binary trees or chains of dependencies, where each recursive invocation adds a new frame containing local variables, parameters, and return addresses.[19][20]

A classic example is the naive recursive implementation of the Fibonacci sequence, defined as fib(n) = fib(n-1) + fib(n-2) with base cases fib(0) = 0 and fib(1) = 1. Although the recursion tree branches exponentially in terms of total calls, the maximum stack depth follows the longest path, which is linear at depth approximately n, as each step descends via the fib(n-1) branch until reaching a base case. For large n, such as 10,000, this requires roughly 10,000 stack frames, far exceeding typical limits and causing overflow, even though the computation would terminate if stack space were unlimited.[21][22]
The threshold for overflow depends on the runtime environment's stack size and per-frame overhead. Most systems allocate 1 MB to 8 MB for the stack by default, with each frame consuming 50–100 bytes for essentials like the return address and parameters, allowing 10,000 to 80,000 frames before exhaustion. In such cases, depth scales directly with input size; for instance, processing a linear chain of length m yields m frames, triggering failure for m beyond the limit.[23][24][25]
Optimizations like tail call elimination (TCO) can mitigate deep recursion by reusing the current frame for tail-position calls, preventing stack growth in linear recursions. Scheme implements TCO as a language standard, ensuring proper tail recursion consumes no additional stack space. However, languages like Python do not support TCO, as its creator Guido van Rossum prioritized debuggability via full stack traces over this optimization, leading to explicit recursion limits around 1,000 calls.[26][27]
In real-world applications, recursive quicksort exemplifies this issue on nearly sorted large arrays, where poor pivot selection creates unbalanced partitions, resulting in worst-case recursion depth of O(n) for an array of size n. This can overflow the stack for arrays exceeding 10,000 elements, prompting hybrid implementations that switch to iterative sorting at deep levels.[28][29]
Large Local Variables
In languages such as C and C++, local variables are automatically allocated on the stack within each function's frame upon entry, providing efficient but limited storage for temporary data such as scalars, structures, and arrays. When these locals include large arrays—such as a character buffer declared as char buf[1048576]; for 1 MB—this space is reserved on the stack for every call to the function, multiplying the consumption across simultaneously active invocations and accelerating exhaustion of the stack's finite capacity, typically 1 to 8 MB on many systems. This per-frame bloat contrasts with heap allocation, where memory is dynamically requested and can grow much larger without a fixed per-call bound.[30]
An illustrative example is a function declaring a large fixed-size array such as int matrix[10000][10000];, which, at 4 bytes per integer, consumes roughly 400 MB of stack space in a single invocation—far exceeding default stack limits and causing immediate overflow even without recursion, since the entire array is placed contiguously in the stack frame. Such allocations highlight the risk in performance-critical code where developers assume stack efficiency without considering size constraints. In practice, even smaller arrays, such as a 100 KB buffer allocated in each of a few dozen nested calls, can cumulatively overflow the stack.[31]
Language-specific behaviors further delineate this issue: in C and C++, automatic storage duration mandates stack placement for non-static locals, subject to strict per-frame limits imposed by the operating system or runtime. Conversely, Java's managed environment stores local variable references (e.g., to primitives or objects) in the stack frame's local variable array, but the underlying data for large arrays or objects resides on the heap, subject to garbage collection rather than direct stack pressure; however, excessive local references can still inflate frame size, indirectly contributing to stack overflow under the JVM's per-thread stack limits (defaulting to 1 MB). This heap-oriented design mitigates direct overflow from large locals but requires awareness of overall frame composition.[32][30]
To quantify the risk, total stack usage approximates the product of active frames and average frame size due to locals; for example, 10 nested calls each allocating 1 MB in locals would demand 10 MB, readily overflowing a standard 8 MB stack and triggering segmentation faults or crashes. Common pitfalls arise from neglecting heap alternatives, such as using malloc() in C or new in C++ for sizable data structures, leading developers to inadvertently treat the stack as unbounded for convenience in prototyping or assuming modern systems handle large automatics seamlessly. Addressing this involves static analysis tools or compiler flags to warn on oversized locals, promoting heap use for anything beyond kilobytes.[31]
Constrained Environments
In constrained environments such as embedded systems and resource-limited virtual machines, stack sizes are deliberately minimized to optimize memory usage, making programs susceptible to overflows from relatively modest function calls or local allocations. For instance, in microcontrollers like the 8-bit AVR series used in Arduino platforms, the available SRAM—typically 1–2 KB total—must accommodate both stack and heap, with the stack growing downward from the top of RAM; this setup allows only a few hundred bytes of stack before collision with the heap, so even shallow recursion or a moderate local array (e.g., a 200-byte buffer) can trigger an overflow.

Real-time operating systems (RTOS) like FreeRTOS exemplify this in multitasking embedded applications, where individual task stacks are allocated explicitly and kept small to support multiple concurrent threads within tight memory budgets; the configMINIMAL_STACK_SIZE setting, used for the idle task, is commonly 128 words (512 bytes on 32-bit architectures), providing minimal overhead but leaving scant margin for locals or subcalls in application tasks.[33] Similarly, mobile applications operating under platform-imposed memory caps, such as those on Android or iOS, inherit thread stack limits that prioritize efficiency over depth, exacerbating overflow risks in memory-intensive routines. These constraints arise from hardware limitations, like the 1 KB SRAM in small AVR chips, which inherently restrict stack allocation, or from virtual machine settings, such as Java's -Xss flag, which lets developers cap per-thread stack sizes at low values (e.g., 64 KB or less) for embedded or mobile deployments, overriding platform defaults of around 1 MB.

Developers mitigate these constraints by adjusting stack allocations via compiler flags or linker scripts; in GCC toolchains for embedded targets, the linker script can define symbols like __StackSize or _Min_Stack_Size to reserve explicit stack regions (e.g., 1 KB), passed through options like -Wl,-T,script.ld, ensuring the binary fits hardware bounds without runtime reconfiguration. In IoT devices running early embedded Linux distributions, such as those prototyped in the mid-2000s, undersized default stacks (often 4–8 KB to conserve flash and RAM) led to firmware crashes during packet processing or interrupt handling, as demonstrated in vulnerability analyses where deep call chains overflowed into adjacent memory, halting device operation.[34] In these tight spaces, the impact of large local variables is particularly amplified, rapidly exhausting the available stack.

Consequences
Program Failure Modes
When a program experiences a stack overflow, it typically manifests as an abrupt runtime failure, such as a segmentation fault on Unix-like systems. This occurs when the stack pointer exceeds the allocated stack boundaries, attempting to access invalid memory, and the operating system raises the SIGSEGV signal to terminate the process.[35] On Windows systems, the failure often appears as an access violation exception with code 0xC0000005 or the specific STATUS_STACK_OVERFLOW (0xC00000FD), leading to immediate program termination.[2] In both cases, the program halts without completing its execution.

Operating systems implement mechanisms like stack guard pages to detect overflows before complete exhaustion. In Linux, the user-space stack is allocated via mmap with a guard region at the end toward which it grows; touching memory just beyond the mapped stack normally triggers automatic stack expansion up to the resource limit (RLIMIT_STACK), but exceeding the maximum causes a SIGSEGV signal.[36] Similarly, Windows places guard pages (marked with PAGE_GUARD) adjacent to the current stack top; violating a guard page raises an exception that extends the stack or, if limits are reached, terminates the process.[37] If these protections fail—due to disabled features or an overflow that jumps past the guard region—adjacent memory may be overwritten, corrupting data before termination. According to the C standard, stack overflow invokes undefined behavior, because the language provides no guarantees on stack size or management, potentially resulting in memory overwrites, infinite loops, or other unpredictable outcomes.[38]

For instance, in a recursive function without a base case, the call stack fills with repeated activation records until overflow; a debugger trace might reveal a deeply nested sequence like:

#0 recursive_func (n=10000) at example.c:5
#1 0x0000555555555159 in recursive_func (n=9999) at example.c:3
#2 0x0000555555555159 in recursive_func (n=9998) at example.c:3
...
#10000 0x0000555555555159 in recursive_func (n=0) at example.c:3
#10001 0x000055555555516d in main () at example.c:10
Security Implications
Stack overflows pose significant security risks primarily through buffer overflow vulnerabilities, where excessive data written to a stack-allocated buffer overwrites adjacent memory regions, including critical control data such as return addresses. This allows attackers to hijack the program's control flow by redirecting execution to malicious code, often injected via the overflow payload, enabling arbitrary code execution. For instance, in a typical stack buffer overflow, an attacker supplies input longer than the buffer's capacity, causing the excess data to corrupt the saved return address and potentially the function's frame pointer, facilitating techniques like shellcode injection.[10]

A landmark historical exploitation occurred with the Morris Worm in November 1988, which targeted a stack buffer overflow in the fingerd daemon on VAX systems running 4.3BSD. The worm exploited the unchecked use of the gets() function to read input, overflowing the input buffer to overwrite the return address and execute shellcode that propagated the worm to other machines, ultimately infecting approximately 6,000 hosts or 10% of the Internet at the time. This incident, analyzed in detail by Eugene Spafford, highlighted the potential for stack overflows to enable widespread network compromise.[39]

In modern contexts, stack overflows are exploited using Return-Oriented Programming (ROP), where attackers chain short sequences of existing code fragments ("gadgets") ending in return instructions, rather than injecting new code. By overwriting the return address, an attacker can pivot control flow to these gadgets, achieving Turing-complete execution while bypassing defenses like non-executable memory; this technique was formalized in Hovav Shacham's 2007 work demonstrating ROP's feasibility on x86 architectures. ROP remains prevalent in sophisticated attacks, as it leverages legitimate code to perform malicious operations such as data exfiltration or privilege escalation.
To counter these exploits, mitigations include stack canaries, which insert a random value (the "canary") between local buffers and the return address; an overflow that corrupts the canary triggers a fault before control flow is hijacked, as implemented in compiler protections like StackGuard. Complementing this is Address Space Layout Randomization (ASLR), which randomizes the base addresses of the stack, heap, and libraries, making it harder for attackers to predict return addresses or gadget locations for ROP chains; empirical studies show ASLR reduces exploit success rates by introducing uncertainty in memory layouts.[40][41]

Languages with manual memory management, such as C and C++, are highly susceptible to stack overflows due to the lack of built-in bounds checking on stack-allocated arrays and functions like strcpy(). In contrast, Rust mitigates these risks through its borrow checker, which enforces compile-time rules ensuring memory safety in safe code, preventing buffer overflows by guaranteeing that references do not outlive their data and that bounds are respected, without runtime penalties in most cases.[42] As of 2023, stack-based buffer overflows (CWE-121) contribute to broader buffer overflow weaknesses, which rank prominently in the CWE Top 25 Most Dangerous Software Weaknesses, with the related Out-of-bounds Write (CWE-787) at #1.[43]

Detection and Prevention
Runtime Detection Methods
Runtime detection methods for stack overflows involve dynamic analysis techniques that monitor memory accesses and stack usage during program execution, allowing overflows to be identified as they occur or shortly after. These approaches typically instrument the code or leverage operating-system mechanisms to catch invalid stack accesses, such as buffer overruns or excessive recursion that exceeds allocated stack space.[44]

One prominent tool is AddressSanitizer (ASan), integrated into compilers like GCC and Clang, which detects stack buffer overflows by inserting redzones—guards of poisoned memory—around local variables on the stack. At runtime, ASan instruments memory accesses to check against a shadow memory map; if an access violates a redzone, it immediately reports the error with details including the overflowing instruction, the affected stack frame, and the byte offset. For example, given a vulnerable array access like char buf[10]; buf[15] = 0;, ASan outputs a report such as ==9442== ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffc05cffd18 at pc 0x000000400b29 bp 0x7ffc05cffd00 sp 0x7ffc05cffcf8, followed by a full stack trace pinpointing the source line. This method, introduced in a 2012 USENIX paper, has proven effective in finding hundreds of bugs in large codebases like Chromium.[44][45]
Valgrind's Memcheck tool provides another runtime detection option by simulating all memory operations and tracking valid address ranges, flagging stack overflows as invalid reads or writes beyond allocated stack frames. For more targeted stack overrun detection, Valgrind includes the experimental SGCheck tool, which performs bounds checking specifically on stack and global arrays, reporting overruns with details on the offending access and call chain. These tools are invoked from the command line, for example valgrind --tool=memcheck ./program or valgrind --tool=exp-sgcheck ./program, and are widely used for debugging C and C++ applications. However, Valgrind incurs significant performance overhead, often slowing execution by a factor of 10 to 50 due to its full-system simulation.[46][47]
Operating system features also aid runtime detection; in Linux, stack overflows frequently trigger a SIGSEGV signal when the stack pointer crosses a guard page at the stack's lower boundary, indicating an attempt to access unmapped memory. Programs can install signal handlers to catch SIGSEGV and inspect the faulting address for stack-related issues, though this requires careful implementation to avoid recursion in the handler. Additionally, if core dumps are enabled (via ulimit -c unlimited), the OS generates a core file upon SIGSEGV, which can be analyzed postmortem with GDB to examine the stack backtrace, revealing deep call chains or corrupted frames indicative of overflow. For instance, running gdb ./program core allows commands like bt to display the stack trace and info registers to check the stack pointer. The ulimit utility in shells like Bash can further assist by setting a soft stack size limit (e.g., ulimit -s 8192), forcing a SIGSEGV when the limit is exceeded, which helps detect overflows proactively in resource-constrained environments.[48][49][50]
In Windows, stack overflows raise the STATUS_STACK_OVERFLOW exception (0xC00000FD) when the stack guard page is violated. Programs can use Structured Exception Handling (SEH) with __try/__except blocks or Vectored Exception Handlers to catch and respond to this exception, allowing inspection of the stack state via APIs like GetCurrentThreadStackLimits. Debugging tools such as WinDbg can capture the exception and analyze the stack trace to identify overflow causes.[2]
In embedded systems, GCC's -fstack-usage option generates static .su files during compilation that estimate per-function stack needs. Developers can combine these estimates with runtime limits such as ulimit to verify that actual usage stays within predictions and to trigger a signal when it does not. Despite their utility, these methods introduce overhead: ASan typically doubles execution time and triples memory usage, while Valgrind's overhead is far higher; multithreaded programs may also encounter false positives from race conditions on shared stack data, necessitating careful configuration.[51][45][46]
Mitigation Strategies
Mitigation strategies for stack overflow focus on proactive code design, language and compiler features, and runtime configurations to prevent excessive stack usage. A fundamental design practice is to favor iterative algorithms over recursive ones, particularly for operations like tree traversals or graph searches, where deep recursion risks exhausting the stack. For instance, a recursive depth-first search can be refactored into an iterative version using an explicit stack data structure on the heap, simulating the call stack without accumulating frames.[52] Similarly, large local variables or arrays that could consume significant stack space should be allocated on the heap using dynamic memory management, such as std::vector in C++, which grows as needed without a fixed stack commitment.
Languages supporting tail call optimization (TCO) enable another key approach: restructuring recursive functions so the final call is in tail position, allowing compilers to reuse the current stack frame instead of allocating new ones, thus preventing depth-related overflows in tail-recursive scenarios.[53] Runtime environments can also mitigate risks by increasing the allowable stack size; in Unix-like systems, the ulimit -s command or the setrlimit(RLIMIT_STACK) system call adjusts the maximum stack limit per process, providing more headroom for legitimate deep calls.[54]
At compile time, enabling stack protection flags inserts safeguards against common overflow causes like buffer overruns. In GCC, options such as -fstack-protector-strong add canary values to function frames, detecting and halting execution on stack corruption attempts by verifying guards on function exit.[55] Static analysis tools further aid prevention by scanning for potential issues like unbounded recursion; for example, Coverity examines code paths to identify defects that could lead to excessive stack growth.[56]
Best practices include explicitly limiting recursion depth with a counter parameter that aborts the call if it exceeds a safe threshold such as 1000, catching unintended deep calls early.[57] Additionally, profiling stack usage during development helps quantify risks; GCC's -fstack-usage flag generates per-function reports on maximum stack consumption, guiding optimizations before deployment.[58] These combined techniques ensure robust prevention without relying solely on runtime detection.