Chapter 3, Understanding MapReduce
Pop quiz – key/value pairs
Q1 | 2
Q2 | 3
Pop quiz – walking through a run of WordCount
Q1 | 1
Q2 | 3
Q3 | 2. Reducer C cannot be used as a combiner: if it were, the final reducer would receive a series of means from the combiners with no knowledge of how many items produced each one, so the overall mean would be impossible to calculate. Reducer D is more subtle. Selecting a maximum or a minimum on its own is safe as a combiner operation, but determining the overall variance between the maximum and minimum values for each key is not. If the combiner that received the maximum value had its other values clustered around it, it would produce a small result, and likewise for the combiner that received the minimum value. These subranges have little value in isolation, and again the final reducer cannot construct the desired result from them.
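
The reasoning in the Q3 answer can be checked with a small simulation. The sketch below is plain Python, not Hadoop code: the two lists stand in for the values of a single key as seen by two hypothetical map tasks, and the names (`split_a`, `split_b`, `mean`, `spread`) are illustrative assumptions. It also shows one common workaround for the mean, which the answer itself does not discuss: having the combiner emit (sum, count) pairs instead of means.

```python
# Hypothetical values for one key, split across two map tasks.
split_a = [1, 2, 3]
split_b = [10, 20]
all_values = split_a + split_b


def mean(values):
    return sum(values) / len(values)


def spread(values):
    # Difference between the maximum and minimum value.
    return max(values) - min(values)


# Reducer C as a combiner: each combiner emits a mean, the reducer averages them.
true_mean = mean(all_values)
mean_of_means = mean([mean(split_a), mean(split_b)])
print(true_mean, mean_of_means)  # the two values differ

# Assumed workaround: combiners emit (sum, count), the reducer adds them up.
partials = [(sum(s), len(s)) for s in (split_a, split_b)]
total, count = map(sum, zip(*partials))
safe_mean = total / count

# Maximum and minimum alone are safe to combine.
combined_max = max(max(split_a), max(split_b))
combined_min = min(min(split_a), min(split_b))

# Reducer D as a combiner: each combiner emits only its own spread.
true_spread = spread(all_values)
sub_spreads = [spread(split_a), spread(split_b)]
print(true_spread, sub_spreads)  # the subranges do not determine the overall value
```

With this data the mean of the per-split means is 8.5 while the true mean is 7.2, and the per-split spreads of 2 and 10 give no way to recover the overall spread of 19. The maximum and minimum, by contrast, survive being combined in stages.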