Hubbry Logo
search
logo

Reduction operator

logo
Community Hub0 Subscribers
Write something...
Be the first to start a discussion here.
Be the first to start a discussion here.
See all
Reduction operator

In computer science, the reduction operator is a type of operator that is commonly used in parallel programming to reduce the elements of an array into a single result. Reduction operators are associative and often (but not necessarily) commutative. The reduction of sets of elements is an integral part of programming models such as Map Reduce, where a reduction operator is applied (mapped) to all elements before they are reduced. Other parallel algorithms use reduction operators as primary operations to solve more complex problems. Many reduction operators can be used for broadcasting to distribute data to all processors.

A reduction operator can help break down a task into various partial tasks by calculating partial results which can be used to obtain a final result. It allows certain serial operations to be performed in parallel and the number of steps required for those operations to be reduced. A reduction operator stores the result of the partial tasks into a private copy of the variable. These private copies are then merged into a shared copy at the end.

An operator is a reduction operator if:

These two requirements are satisfied for commutative and associative operators that are applied to all array elements.

Some operators which satisfy these requirements are addition, multiplication, and some logical operators (and, or, etc.).

A reduction operator can be applied in constant time on an input set of vectors with elements each. The result of the operation is the combination of the elements and has to be stored at a specified root processor at the end of the execution. If the result has to be available at every processor after the computation has finished, it is often called Allreduce. An optimal sequential linear-time algorithm for reduction can apply the operator successively from front to back, always replacing two vectors with the result of the operation applied to all its elements, thus creating an instance that has one vector less. It needs steps until only is left. Sequential algorithms can not perform better than linear time, but parallel algorithms leave some space left to optimize.

Suppose we have an array . The sum of this array can be computed serially by sequentially reducing the array into a single sum using the '+' operator. Starting the summation from the beginning of the array yields: Since '+' is both commutative and associative, it is a reduction operator. Therefore this reduction can be performed in parallel using several cores, where each core computes the sum of a subset of the array, and the reduction operator merges the results. Using a binary tree reduction would allow 4 cores to compute , , , and . Then two cores can compute and , and lastly a single core computes . So a total of 4 cores can be used to compute the sum in steps instead of the steps required for the serial version. This parallel binary tree technique computes . Of course the result is the same, but only because of the associativity of the reduction operator. The commutativity of the reduction operator would be important if there were a master core distributing work to several processors, since then the results could arrive back to the master processor in any order. The property of commutativity guarantees that the result will be the same.

IEEE 754-2019 defines 4 kinds of sum reductions and 3 kinds of scaled-product reductions. Because the operations are reduction operators, the standards specifies that "implementations may associate in any order or evaluate in any wider format."

See all
User Avatar
No comments yet.