Reduction Mode Example

This example clarifies how default reduction and running reduction differ. Suppose the extract from each classification is a number from 0 to 10 and the reducer computes the average of these numbers.

Each reducer processes the same extracts in the same order, and the tables below show how the values in the system change as each extract arrives. Note that extract 2 arrives twice.

Default Reduction

Extract ID	Extract Value	Extracts to reducer	Store Value In	Calculation	Store Value Out
1	5	1	nil	5/1	nil
2	3	1, 2	nil	(5+3)/2	nil
2	3	1, 2	nil	(5+3)/2	nil
3	4	1, 2, 3	nil	(5+3+4)/3	nil
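
The rows above can be reproduced with a minimal Python sketch. The function names and the choice to key extracts by ID (so that a repeated extract overwrites its earlier copy rather than being counted twice) are assumptions made for illustration; they are not taken from the actual reducer implementation.

```python
from typing import Dict, List, Tuple


def default_reduce(extracts: Dict[int, float]) -> float:
    """Average over every extract currently held (hypothetical helper)."""
    return sum(extracts.values()) / len(extracts)


def run_default(arrivals: List[Tuple[int, float]]) -> List[float]:
    """Re-reduce from scratch on every arrival, including repeats."""
    extracts: Dict[int, float] = {}
    results: List[float] = []
    for extract_id, value in arrivals:
        # Assumption: a repeated ID replaces the stored extract.
        extracts[extract_id] = value
        # Re-reduction is always triggered and consumes all extracts.
        results.append(default_reduce(extracts))
    return results


# Same arrival order as the table: extract 2 arrives twice.
print(run_default([(1, 5), (2, 3), (2, 3), (3, 4)]))  # [5.0, 4.0, 4.0, 4.0]
```

Every call sums all stored extracts, so the work per arrival grows with the number of extracts.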

Running Reduction

Extract ID	Extract Value	Extracts to reducer	Store Value In	Calculation	Store Value Out	Items in Association
1	5	1	nil	(0*0+5)/(0+1)	1	1
2	3	2	1	(5*1+3)/(1+1)	2	2
2	3	nil	N/A	N/A	2	2
3	4	3	2	(4*2+4)/(2+1)	3	3
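
A comparable sketch for running reduction is shown below. `RunningState`, `running_reduce`, and the `seen_ids` set (standing in for the "Items in Association" column) are hypothetical names chosen for this example, not part of any documented API.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class RunningState:
    average: float = 0.0  # previous value of the reduction
    count: int = 0        # the store: number of extracts folded in so far
    seen_ids: Set[int] = field(default_factory=set)  # extracts already reduced


def running_reduce(state: RunningState, extract_id: int, value: float) -> bool:
    """Fold one extract into the running average; return False if skipped."""
    if extract_id in state.seen_ids:
        # Already reduced: no calculation, and the store is left unchanged.
        return False
    state.average = (state.average * state.count + value) / (state.count + 1)
    state.count += 1
    state.seen_ids.add(extract_id)
    return True


state = RunningState()
history = []
for extract_id, value in [(1, 5), (2, 3), (2, 3), (3, 4)]:
    ran = running_reduce(state, extract_id, value)
    history.append((ran, state.average, state.count))

print(history)
# [(True, 5.0, 1), (True, 4.0, 2), (False, 4.0, 2), (True, 4.0, 3)]
```

Each call touches only the new value and the stored state, so the work per arrival stays constant however many extracts have been seen.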

Points of Note

Note that in default reduction mode, re-reduction is always triggered, even when an extract has already been processed. Each computation in default reduction also consumes all of the extracts: the average is calculated by summing the values of every extract and dividing by the number of extracts.

In running reduction, by contrast, the store keeps a running count of how many extracts the reducer has seen. Given this count and the previous value of the reduction, the reducer can compute the new average from the new value alone: new average = ((old average * old count) + new value) / (old count + 1). The store is then updated with the new count (old count + 1). An extract that has already been reduced, such as the second arrival of extract 2, does not trigger another reduction.
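
To see why the incremental formula agrees with averaging every extract, the short derivation below may help. The symbols are introduced here for illustration only: $x_i$ is the value of the $i$-th distinct extract and $\bar{x}_n$ is the average of the first $n$ values.

```latex
\bar{x}_{n+1}
  = \frac{1}{n+1}\sum_{i=1}^{n+1} x_i
  = \frac{\left(\sum_{i=1}^{n} x_i\right) + x_{n+1}}{n+1}
  = \frac{n\,\bar{x}_n + x_{n+1}}{n+1}
```

For the third extract in the example, $n = 2$, $\bar{x}_2 = 4$ and $x_3 = 4$, which gives $(2 \cdot 4 + 4)/3 = 4$, the same result as $(5 + 3 + 4)/3$ in default reduction.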
