Correlation

Other capabilities become possible once the schedule is viewed as a probabilistic statement of how the program might unfold. One notable capability is modeling the correlation between activity durations. Two activity durations are positively correlated when both are influenced by the same external force and can be expected to vary in the same direction within their own probability distributions in any consistent scenario.³⁷ Correlation might be positive and fairly strong if, for instance, the same assumption about the maturity of a technology is used to estimate the durations of design, fabrication, and testing activities, or if a contractor’s productivity affects multiple activities that the contractor has bid. If the technology maturity is not known with certainty, it would be consistent to assume that design, fabrication, and testing activities would all be longer or shorter together.

Likewise, if a particular trade is relatively unproductive in the house construction example, we may expect all activities associated with that trade to be delayed to some degree. If correlation between these activity durations is not specified in the simulation, some iterations would have some of the supposedly correlated activities run long while others run short within their respective ranges. This would be inconsistent with the idea that they all react to the same assumptions about technology maturity or trade productivity.

Specifying correlations between related activities ensures that each iteration represents a scenario in which their durations are consistently long or short in their ranges together. Because schedules tend to add durations (given their logical structure), durations that are long together or short together increase the chance that the project as a whole will be very long or very short. Correlation therefore affects the low and high values in the simulation results: the high values are even higher and the low values are even lower, because correlated durations tend to reinforce one another down the schedule paths. In practice, if the organization wants to focus on the 80th percentile, correlation matters; correlation does not matter as much around the mean duration from the simulation.
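The widening effect described above can be illustrated with a minimal simulation sketch. It is not the method used by any particular scheduling tool. It assumes five activities in series, each with a hypothetical three-point estimate of 8, 10, and 15 days modeled as a triangular distribution, and it induces correlation through a shared normal factor (a Gaussian copula with equal pairwise correlation), which is only one of several ways to correlate samples. The function names and all numbers are illustrative assumptions.

```python
import math
import random


def triangular_inverse_cdf(u, low, mode, high):
    """Map a uniform value u in [0, 1] to a triangular(low, mode, high) duration."""
    split = (mode - low) / (high - low)
    if u < split:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))


def simulate_totals(estimates, rho, iterations=20000, seed=1):
    """Return sorted total durations for activities performed in series.

    rho is the assumed correlation (0 <= rho <= 1) between the underlying
    normal variables; the resulting duration correlation is close to, but
    not exactly, rho.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(iterations):
        common = rng.gauss(0.0, 1.0)  # factor shared by every activity
        total = 0.0
        for low, mode, high in estimates:
            own = rng.gauss(0.0, 1.0)  # factor specific to this activity
            z = math.sqrt(rho) * common + math.sqrt(1.0 - rho) * own
            u = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # normal CDF
            total += triangular_inverse_cdf(u, low, mode, high)
        totals.append(total)
    return sorted(totals)


def percentile(sorted_values, p):
    """Return the value at fraction p (0 to 1) of an ascending list."""
    index = min(int(p * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


# Hypothetical three-point estimates in days: (low, most likely, high).
ESTIMATES = [(8, 10, 15)] * 5

independent = simulate_totals(ESTIMATES, rho=0.0)
correlated = simulate_totals(ESTIMATES, rho=0.9)

for label, totals in (("no correlation", independent), ("0.9 correlation", correlated)):
    print(label,
          "P5:", round(percentile(totals, 0.05), 1),
          "P50:", round(percentile(totals, 0.50), 1),
          "P80:", round(percentile(totals, 0.80), 1),
          "P95:", round(percentile(totals, 0.95), 1))
```

Under these assumptions, the two runs should report similar 50th percentiles, while the correlated run reports a lower 5th percentile and higher 80th and 95th percentiles.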

Figure 44 shows the effect of adding correlation between activity durations in the three-point risk simulation for the house construction schedule. In this example, 90 percent correlation was added between activities that are related trades. While the 90 percent correlation is high (correlation is measured between -1.0 and 1.0), there are often no actual data on correlation, so expert judgment is often used to set the correlation coefficients. Assuming this degree of correlation, we get the result shown in figure 44. Notice that the correlation has widened the overall distribution. The 50th percentile is nearly the same in both cases, February 25 without correlation and February 24 with correlation. However, the 80th percentile increases by one workweek, from March 4 to March 9, when correlation is added.

Figure 44: Probability Distribution Results for Risk Analysis with and without Correlation


Using three-point estimates for activity durations requires estimating correlation coefficients, often in the absence of historical data. Setting the coefficients pair by pair in this way often produces inconsistent correlation matrixes. In the risk driver method, assigning a risk to multiple activities causes them to be correlated, because if the risk occurs on one assigned activity during an iteration of the simulation, it occurs on all the assigned activities. If some risks are assigned to one activity but not to another, the correlation between them will be less than 100 percent. Modeling correlation with risk drivers avoids the difficult task of estimating a number of pair-wise correlations.
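As a rough illustration of how risk drivers can induce correlation without any coefficient being specified, the sketch below assigns one shared risk to two activities and, optionally, a second risk to only one of them. It assumes each risk acts as a multiplier on the planned duration and that the same sampled multiplier applies to every activity assigned to the risk within an iteration. The probabilities, impact ranges, planned durations, and function names are all hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def risk_factor(rng, probability, low, high):
    """Duration multiplier: 1.0 if the risk does not occur in this
    iteration, otherwise a value drawn uniformly from [low, high]."""
    if rng.random() < probability:
        return rng.uniform(low, high)
    return 1.0


def simulate(private_risk_on_b, iterations=20000, seed=7):
    """Return the correlation between the durations of activities A and B."""
    rng = random.Random(seed)
    a_durations, b_durations = [], []
    for _ in range(iterations):
        # Shared risk: when it occurs, it occurs on both A and B.
        shared = risk_factor(rng, 0.5, 1.1, 1.4)
        # Optional second risk assigned to activity B only.
        private = risk_factor(rng, 0.3, 1.0, 1.5) if private_risk_on_b else 1.0
        a_durations.append(20.0 * shared)            # planned 20 days
        b_durations.append(30.0 * shared * private)  # planned 30 days
    return pearson(a_durations, b_durations)


shared_only = simulate(private_risk_on_b=False)
shared_plus_private = simulate(private_risk_on_b=True)

print("shared risk only:", round(shared_only, 3))
print("shared risk plus a risk on B only:", round(shared_plus_private, 3))
```

With only the shared risk, the two durations move in lockstep and the computed correlation is essentially 1. Adding the risk that affects only activity B lowers the correlation below 1, and the size of the drop depends on the assumed probabilities and impact ranges.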

³⁷ While durations might vary in opposite directions if they are negatively correlated, this is less common than positive correlation in program management.