Loop-level parallelism
Loop-level parallelism
Main page

Loop-level parallelism

logo
Community Hub0 subscribers
What are your thoughts?
Be the first to start a discussion here.
Be the first to start a discussion here.
Loop-level parallelism

Loop-level parallelism is a form of parallelism in software programming that is concerned with extracting parallel tasks from loops. The opportunity for loop-level parallelism often arises in computing programs where data is stored in random access data structures. Where a sequential program will iterate over the data structure and operate on indices one at a time, a program exploiting loop-level parallelism will use multiple threads or processes which operate on some or all of the indices at the same time. Such parallelism provides a speedup to overall execution time of the program, typically in line with Amdahl's law.

For simple loops, where each iteration is independent of the others, loop-level parallelism can be embarrassingly parallel, as parallelizing only requires assigning a process to handle each iteration. However, many algorithms are designed to run sequentially, and fail when parallel processes race due to dependence within the code. Sequential algorithms are sometimes applicable to parallel contexts with slight modification. Usually, though, they require process synchronization. Synchronization can be either implicit, via message passing, or explicit, via synchronization primitives like semaphores.

Consider the following code operating on a list l, where l.size() == n.

Each iteration of the loop takes the value from the current index of l, and increments it by 10. If statement s1 takes T time to execute, then the loop takes time n * T to execute sequentially, ignoring time taken by loop constructs. Now, consider a system with p processors where p > n. If n threads run in parallel, the time to execute all n steps is reduced to T.

Less simple cases produce inconsistent, i.e. non-serializable outcomes. Consider the following loop operating on the same list l.

Each iteration sets the current index to be the value of the previous plus ten. When run sequentially, each iteration is guaranteed that the previous iteration will already have the correct value. With multiple threads, process scheduling and other considerations prevent the execution order from guaranteeing an iteration will execute only after its dependence is met. It very well may happen before, leading to unexpected results. Serializability can be restored by adding synchronization to preserve the dependence on previous iterations.

There are several types of dependences that can be found within code.

In order to preserve the sequential behaviour of a loop when run in parallel, True Dependence must be preserved. Anti-Dependence and Output Dependence can be dealt with by giving each process its own copy of variables (known as privatization).

See all
User Avatar
No comments yet.