Cost Estimating Methods

The three commonly used methods for estimating costs are analogy, engineering build-up, and parametric. An analogy uses the cost of a similar program to estimate the new program costs and adjusts for differences. The engineering build-up method develops the cost estimate at the lowest level of the WBS, one piece at a time, and the sum of the pieces is the program estimate. The parametric method relates cost to one or more technical, performance, cost, or program parameters through a statistical relationship.

The method selected depends on where the program is in its life cycle. Early in the program, definition is limited and costs may not have accrued. Once a program is in production, cost and technical data from the development phase can be used to estimate the remainder of the program. A variety of cost estimating methods will typically be used over the life of a program. Table 8 gives an overview of the strengths, weaknesses, and applications of the three methods.

Table 8: Three Cost Estimating Methods Compared
Method: Analogy
  Strengths:
    • Requires few data
    • Based on actual data
    • Reasonably quick
    • Good audit trail
  Weaknesses:
    • Subjective adjustments
    • Accuracy depends on similarity of items
    • Difficult to assess effect of design change
    • Blind to cost drivers
  Applications:
    • When few data are available
    • Rough-order-of-magnitude estimate
    • Cross-check

Method: Engineering build-up
  Strengths:
    • Easily audited
    • Sensitive to labor rates
    • Tracks vendor quotes
    • Time honored
  Weaknesses:
    • Requires detailed design
    • Slow and laborious
    • Cumbersome
  Applications:
    • Production estimating
    • Software development
    • Negotiations

Method: Parametric
  Strengths:
    • Reasonably quick
    • Encourages discipline
    • Good audit trail
    • Objective, little bias
    • Cost driver visibility
    • Incorporates real-world effects (funding, technical, risk)
  Weaknesses:
    • Lacks detail
    • Model investment
    • Cultural barriers
    • Need to understand model’s behavior
  Applications:
    • Budgetary estimates
    • Design-to-cost trade studies
    • Cross-check
    • Baseline estimate
    • Cost goal allocations

Source: ©2003, MCR, LLC, “Cost Estimating: The Starting Point of EVM.” | GAO-20-195G

Other cost estimating methods include:

  • expert opinion, which relies on subject matter experts to give their opinion on what an element should cost;
  • extrapolating, which uses actual costs and data from prototypes to predict the cost of future elements; and
  • learning curves, which are a common form of extrapolating from actual costs.

The examples that follow are meant to provide an elementary understanding of the estimating methods. For more advanced treatments of these topics, the reader is encouraged to review additional references.

Analogy Cost Estimating Method

An analogy takes into consideration that no new program, no matter how advanced it may be technologically, represents a totally new system. Most new programs evolve from programs already fielded that have had new features added or that simply represent a new combination of existing components. The analogy method uses this concept for estimating new components, subsystems, or total programs. That is, an analogy uses actual costs from a similar program with adjustments to account for differences between the requirements of the existing and new systems. A cost estimator typically uses this method early in a program’s life cycle, when insufficient actual cost data are available for the new program but the technical and program definition is good enough to make the necessary adjustments.

Adjustments should be made as objectively as possible by using factors (sometimes called scaling parameters) that represent differences in size, performance, technology, or complexity. The cost estimator should identify the important cost drivers, determine how the old item relates to the new item, and decide how each cost driver affects the overall cost.

All estimates based on the analogy method should pass the “reasonable person” test—that is, the sources of the analogy and any adjustments must be logical, credible, and acceptable to a reasonable person. In addition, because analogies are one-to-one comparisons, the historical and new systems should have a strong parallel. Table 9 shows how an analogy works.

Table 9: An Example of the Analogy Cost Estimating Method
Parameter    Existing system    New system
Engine       F-100              F-200
Thrust       12,000 lbs         16,000 lbs
Cost         $5.2 million       ?

Cost of the new system, assuming a linear relationship:
(16,000 / 12,000) x $5.2 million = $6.9 million

Source: ICEAA (International Cost Estimating and Analysis). Cost Estimating Body of Knowledge. Vienna, Va.: 2013. | GAO-20-195G

The equation in table 9 assumes a linear relationship between engine cost and the amount of thrust. Note that there should be a compelling scientific or engineering reason why an engine’s cost is directly proportional to its thrust. Without more data (or an expert opinion on engine costs), it is difficult to know what parameters are the true drivers of cost. Therefore, when using the analogy method, it is important that the estimator research and discuss the reasonableness of technical program drivers with program experts to determine whether they are significant cost drivers.
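To make the adjustment repeatable and auditable, the linear scaling in table 9 can be captured in a few lines. The sketch below is illustrative only: the function name is ours, and the assumption that cost scales linearly with thrust must be validated with engine cost experts, as discussed above.

    def analogy_estimate(existing_cost, existing_param, new_param):
        # Scale the existing system's actual cost by the ratio of the new
        # parameter value to the old one. This assumes a linear relationship
        # between the parameter and cost, which requires engineering
        # justification (see the discussion above).
        return existing_cost * (new_param / existing_param)

    # Table 9 example: F-100 engine at $5.2 million and 12,000 lbs thrust,
    # scaled to the F-200's 16,000 lbs of thrust.
    print(analogy_estimate(5.2e6, 12_000, 16_000))  # ~$6.9 million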

Analogy relies a great deal on expert opinion to modify the existing system data to approximate the new system. When possible, the adjustments should be quantitative rather than qualitative, avoiding subjective judgments as much as possible. Even when an analyst is using a more detailed cost estimating technique, an analogy can provide a useful cross-check.

The analogy method has several advantages:

  • It can be used before detailed program requirements are known.
  • If the analogy is strong, the estimate will be defensible.
  • An analogy can be developed quickly and at minimum cost.
  • The tie to historical data is simple enough to be readily understood.

Analogies also have some disadvantages:

  • An analogy relies on a single data point.
  • It is often difficult to find the detailed cost, technical, and program data required for analogies.
  • There is a tendency to be overly subjective about the technical parameter adjustment factors.

The last disadvantage can be best explained with an example. If a cost estimator assumes that a new component will be 20 percent more complex but cannot explain why, the adjustment factor is unacceptable. The complexity must be related to the system’s parameters, such as that the new system will have 20 percent more data processing capacity or will weigh 20 percent more. Case study 14 highlights what can happen when technical parameter assumptions are overly optimistic.

Case Study 14: Cost Estimating Methods, from Space Acquisitions, GAO-07-96

In 2004, decision-makers for the Advanced Extremely High Frequency (AEHF) satellite program relied on the program office cost estimate rather than the independent estimate the Cost Analysis Improvement Group (CAIG) developed to support the production decision. The program office estimated that the system would cost about $6 billion on the assumption that AEHF would have 10 times more capacity than Milstar, the predecessor satellite, at half the cost and weight. However, the CAIG concluded that the program could not deliver more data capacity at half the weight given the state of the technology. In fact, the CAIG believed that to get the desired increase in data rate, the weight would have to increase proportionally. As a result, the CAIG estimated that AEHF would cost $8.7 billion and predicted a $2.7 billion cost overrun.

The CAIG relied on weight data from historical satellites to estimate the program’s cost because it considered weight to be the best cost predictor for military satellite communications. The historical data from the AEHF contractor showed that the weight had more than doubled since the program began and that the majority of the weight growth was in the payload. The Air Force also used weight as a cost predictor, but attributed the weight growth to structural components rather than the more costly payload portion of the satellite. The CAIG stated that major cost growth was inevitable from the program start because historical data showed that it was possible to achieve a weight reduction or an increase in data capacity, but not both at the same time.

Several questions should be asked when the analogy method is used as an estimating technique.

  • What heritage programs and scaling factors were used to create the analogy?
  • Are the analogous data from reliable sources?
  • Did technical experts validate the scaling factor?
  • Can any unusual requirements invalidate the analogy?
  • Are the parameters used to develop an analogous factor similar to the program being estimated?
  • How were adjustments made to account for differences between existing and new systems? Were they logical, credible, and acceptable?

Engineering Build-up Cost Estimating Method

The engineering build-up cost estimating method builds the overall cost estimate by summing or “rolling up” detailed estimates done at lower levels of the WBS. Because the lower-level estimating associated with the build-up method uses industrial engineering principles, it is often referred to as engineering build-up. It is sometimes referred to as a grass-roots or bottom-up estimate.

An engineering build-up estimate is done at the lowest level of detail and consists of labor and materials costs that have overhead and fee added to them. In addition to labor hours, a detailed parts list is required. Once the parts list is in hand, the parts are allocated to the lowest WBS level based on how the work will be accomplished. In addition, quantity and schedule have to be considered for time-phasing the estimate and applying learning curves, if applicable. (Learning curves are discussed later in this chapter and in appendix VII.) Typically, cost estimators work with engineers to develop the detailed estimates. The cost estimator’s focus is to get detailed information from the engineer that is reasonable, complete, and consistent with the program’s ground rules and assumptions. The cost estimator should find additional data to validate the engineer’s estimates.

The underlying assumption of this method is that actual costs are good predictors of future costs. Thus, the engineering build-up method is normally used during the program’s production phase when the program’s configuration is stable and actual cost data are available. It is assumed that data from the development phase can be used to estimate the cost for production. As illustrated in table 10, the build-up method is used when there is enough detailed information about building an item—such as number of hours and number of parts—and the manufacturing process to be used.

Table 10: An Example of the Engineering Build-Up Cost Estimating Method
Problem: Estimate labor hours for the sheet metal element of the inlet nacelle for a new aircraft
  Similar component: F/A-18 inlet nacelle
  Solution: Apply the historical F/A-18 variance to the touch labor effort, then apply a support labor factor to adjust the estimated touch labor hours
  Result: 2,000 hours for the F/A-18 inlet nacelle x 1.2 (variance factor) = 2,400 touch labor hours; 2,400 touch labor hours x 1.48 (1 + support labor factor) = 3,552 total labor hours (touch labor plus support labor) for the new aircraft inlet nacelle sheet metal

Problem: Estimate the labor hour cost for the sheet metal of the inlet nacelle for a new aircraft
  Solution: Apply the manufacturing firm’s average labor hour rate to the labor hours
  Result: Total labor hour cost = total labor hours x average labor hour rate

Source: ICEAA (International Cost Estimating and Analysis). Cost Estimating Body of Knowledge. Vienna, Va.: 2013. | GAO-20-195G
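The arithmetic in table 10 translates directly into a short function. This is a minimal sketch: the function name is ours, and the 1.2 variance factor, 0.48 support labor factor, and $95 labor rate are illustrative values, not standards.

    def build_up_labor_hours(analog_hours, variance_factor, support_factor):
        # Adjust the analogous component's actual touch labor hours by the
        # historical variance, then add support labor as a factor of touch labor.
        touch_hours = analog_hours * variance_factor
        total_hours = touch_hours * (1 + support_factor)
        return touch_hours, total_hours

    touch, total = build_up_labor_hours(2_000, 1.2, 0.48)
    print(touch, total)   # 2400.0 3552.0, matching table 10
    print(total * 95.0)   # total labor cost at a hypothetical $95/hour rate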

The engineering build-up technique has several advantages:

  • the estimator can determine exactly what the estimate includes and whether anything was overlooked,
  • the estimate is tailored to the specific program and manufacturer, and
  • it gives good insight into major cost contributors.

Some disadvantages of the engineering build-up method are that:

  • it can be expensive to implement and it is time consuming,
  • it is not flexible enough to answer what-if questions,
  • new estimates must be built for each alternative,
  • the product specification must be well known and stable,
  • all product and process changes must be reflected in the estimate,
  • small errors can grow into larger errors during the summation, and
  • some elements can be omitted by accident.

As with the analogy method, several questions should be asked regarding engineering build-up to check the accuracy of the estimating technique.

  • Was each WBS cost element defined in enough detail to use this method correctly?
  • Were data adequate to accurately estimate the cost of each WBS element?
  • Did experienced experts help determine a reasonable cost estimate?
  • Was the estimate based on specific quantities that would be ordered at one time, allowing for quantity discounts?
  • Did the estimate account for contractor material handling overhead?
  • Was there a definitive understanding of each WBS cost element’s composition?
  • Were labor rates based on auditable sources? Did they include all applicable overhead, general and administrative costs, and fees? Were they consistent with industry standards?
  • Is a detailed and accurate materials and parts list available?

Parametric Cost Estimating Method

In the parametric method, a statistical relationship is developed between historical costs and program, physical, and performance characteristics. The method is sometimes referred to as a top-down approach. Types of physical characteristics used for parametric estimating include weight, power, and lines of code. Other program and performance characteristics include site deployment plans for information technology installations, maintenance plans, test and evaluation schedules, technical performance measures, and crew size. These are just some examples of potential cost drivers for a particular program.

Sources for these cost drivers are often found in the technical baseline or program technical data. It is important that the attributes used in a parametric estimate be cost drivers of the program. The assumption driving the parametric approach is that the same factors that affected cost in the past will continue to affect costs in the future. This method is often used when little is known about a program except for a few key characteristics like weight, volume, or speed.

Using a parametric method requires access to historical data, which may be difficult to obtain. If these data are available, they can be used to determine the cost drivers and to provide statistical results, and can be adjusted to meet the requirements of the new program. Unlike the analogy method, parametric estimating relies on data from many programs and covers a broader range. Confidence in a parametric estimate’s results depends on how valid the relationships are between cost and the physical attributes or performance characteristics. Using this method, the cost estimator must always present the related statistics, assumptions, and sources for the data.

The goal of parametric estimating is to create a statistically valid cost estimating relationship using historical data. The parametric CER can then be used to estimate the cost of the new program by entering its specific characteristics into the parametric model. CERs established early in a program’s life cycle should be periodically reviewed to make sure they are current and the input range still applies to the new program. In addition, parametric CERs should be well documented, because serious estimating errors can occur if the CER is improperly used.

Parametric techniques can be used in a wide variety of situations ranging from early planning estimates to detailed contract negotiations. It is essential to have an adequate number of relevant data points, and care must be taken to normalize the dataset so that it is consistent and complete. Because parametric relationships are often used early in a program, when the design is not well defined, design changes can easily be reflected in the estimate simply by adjusting the values of the input parameters.

It is important to make sure that the program attributes being estimated fall within (or, at least, not far outside) the CER dataset. For example, if a new software program is expected to contain 1 million software lines of code and the data points for a software CER are based on programs with lines of code ranging from 10,000 to 250,000, it would be inappropriate to use the CER to estimate the new program.

To develop a parametric CER, cost estimators must determine the cost drivers that most influence cost. After studying the technical baseline and analyzing the data through scatterplots and other methods, the cost estimator should verify the selected cost drivers by discussing them with engineers. For example, in software development, the environment (that is, the extent to which the requirements are understood and the strength of the programmers’ skill and experience) is usually a major cost driver. The CER can then be developed with a mathematical expression, which can range from a simple rule of thumb (for example, dollars per pound) to a complex regression equation.

The more simplified CERs include rates, factors, and ratios. A rate uses a parameter to predict cost, using a multiplicative relationship. Since rate is defined to be cost as a function of a parameter, the units for rate are always dollars per parameter unit (e.g., pound or miles per hour). The rate most commonly used in cost estimating is the labor rate, expressed in dollars per hour.

A factor uses the cost of another element to estimate a new cost using a multiplier. Because a factor is defined to be cost as a function of another cost, it is often expressed as a percentage. For example, travel costs may be estimated as 5 percent of program management costs.

A ratio is a function of another parameter and is often used to estimate effort. For example, the cost to build a component could be based on the industry standard of 20 hours per subcomponent.

Rates, factors, and ratios are often the result of simple calculations (like averages) and many times do not include statistics.
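Each of these three simple relationships is a one-line calculation. The sketch below uses hypothetical inputs (the $95 labor rate, $1.5 million program management cost, and 12 subcomponents are ours) solely to show the form of each relationship.

    # Rate: cost as a function of a parameter (dollars per unit)
    labor_cost = 1_200 * 95.0       # 1,200 hours x $95/hour (hypothetical rate)

    # Factor: cost as a function of another cost (a percentage)
    travel_cost = 0.05 * 1_500_000  # 5 percent of program management cost

    # Ratio: effort as a function of another parameter
    build_hours = 20 * 12           # 20 hours per subcomponent x 12 subcomponents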

More complex CERs are developed using regression techniques so that statistical inferences may be drawn. To perform a regression analysis, analysts first determine what relationship exists between cost (the dependent variable) and its various drivers (the independent variables), typically by developing a scatterplot of the data. If the relationship is linear, the data can be fit with a linear regression. If the relationship is not linear and transformation of the data does not produce a linear fit, nonlinear regression can be used. The ultimate goal is to create a fit with the least variation between the data and the regression line. This process helps minimize the statistical error, or uncertainty, associated with the regression equation.
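A minimal sketch of this workflow follows. The seven (workstations, cost) observations are hypothetical, invented only to illustrate the mechanics; a real CER would be fit to normalized historical data.

    import numpy as np

    # Hypothetical observations: workstations installed vs. site activation cost
    nw = np.array([7, 12, 18, 25, 31, 38, 47], dtype=float)
    cost = np.array([270, 400, 560, 740, 900, 1_090, 1_330], dtype=float) * 1_000

    # Inspect a scatterplot first; if the pattern looks linear, fit a line
    slope, intercept = np.polyfit(nw, cost, deg=1)
    residuals = cost - (intercept + slope * nw)  # variation the line fails to explain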

Table 11 contains a parametric cost estimating example.

Table 11: An Example of the Parametric Cost Estimating Method
Program attribute: A cost estimating relationship (CER) for site activation (SA) as a function of the number of workstations (NW)
  Calculation: SA = $82,800 + ($26,500 x NW)

Program attribute: Data range for the CER
  Calculation: 7 to 47 workstations, based on 11 data points

Program attribute: Cost to site activate a program with 40 workstations
  Calculation: $82,800 + ($26,500 x 40) = $1,142,800

Source: ICEAA (International Cost Estimating and Analysis). Cost Estimating Body of Knowledge. Vienna, Va.: 2013. | GAO-20-195G

In table 11, the number of workstations is the cost driver. The equation is linear but has both a fixed component (that is, $82,800) and a variable component (that is, $26,500 x NW).

In addition, the range of the data is from 7 to 47 workstations, so it would be inappropriate to use this CER to estimate the activation cost of a site with as few as 2 or as many as 200 workstations. In fact, at one extreme the CER estimates a cost of $82,800 for zero workstations, which is not logical because no cost should be incurred when no work is required.
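The CER and its range check can be written as a short function. This is a sketch of the table 11 relationship, with a guard reflecting the 7-to-47 workstation dataset.

    def site_activation_cost(num_workstations):
        # Linear CER from table 11: fixed component plus variable component
        if not 7 <= num_workstations <= 47:
            raise ValueError("outside the CER data range of 7-47 workstations")
        return 82_800 + 26_500 * num_workstations

    print(site_activation_cost(40))  # 1,142,800, matching table 11
    # site_activation_cost(200) would raise ValueError: outside the data range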

Although we do not show any CER statistics for this example, the CERs should always be presented with their statistics to enable the cost estimator to understand the level of variation within the data and model its effect with an uncertainty analysis.

The independent variables should be highly correlated to cost and their relationships to cost should be logical. The purpose of the regression is to predict with known accuracy the next real-world occurrence of the dependent variable (the cost), based on knowledge of the independent variable (some physical, operational, or program variable). Once the regression equation is developed, the statistics associated with the relationship must be examined to see if the CER is a sufficiently strong predictor to be used in the estimate. Most statistics can be easily generated with the regression analysis function of spreadsheet software.

Statistical significance is the most important factor for deciding whether the relationship is valid. An independent variable can be considered statistically significant if there is a small probability that its corresponding coefficient is equal to zero, because a coefficient of zero would indicate that the independent variable has no relationship to cost. Thus, it is desirable that the probability that the coefficient is equal to zero be as small as possible. How small is denoted by a predetermined value called the significance level. For example, a significance level of 0.15 means accepting up to a 15 percent probability that the coefficient is actually zero, that is, that the variable is not statistically significant. Statistical significance is assessed both for the regression as a whole and for each regression variable.

Among important regression measures and statistics are R-squared, the F statistic, and the t statistic.

R-squared

The R-squared (R2) value measures the strength of the association between the independent and dependent (or cost) variables. The R2 value ranges between 0 and 1, where 0 indicates that there is no relationship between cost and its independent variable, and 1 means that there is a perfect relationship between them. Thus, the higher the R2, the better. In the example in table 11, an R2 of 91 percent would mean that the number of workstations (NW) explains 91 percent of the variation in site activation costs, indicating that it may be a good cost driver.

F Statistic

The F statistic is used to judge whether the CER as a whole is statistically significant by testing the hypothesis that all of the variables’ coefficients are equal to zero. It is defined as the ratio of the regression’s mean square to its mean square error (also called the residual mean square). The higher the F statistic, the better the regression, but it is the level of significance that is important.

t Statistic

The t statistic is used to judge whether individual coefficients in the equation are statistically significant. It is defined as the ratio of the coefficient’s estimated value to its standard deviation. As with the F statistic, the higher the t statistic is, the better, but it is the level of significance that is important.
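These measures can be read directly from standard regression output. The sketch below assumes the statsmodels package is available and reuses the hypothetical workstation dataset from the earlier sketch; it is illustrative, not a validated CER.

    import numpy as np
    import statsmodels.api as sm

    # Hypothetical observations (same illustrative data as the earlier sketch)
    nw = np.array([7, 12, 18, 25, 31, 38, 47], dtype=float)
    cost = np.array([270, 400, 560, 740, 900, 1_090, 1_330], dtype=float) * 1_000

    model = sm.OLS(cost, sm.add_constant(nw)).fit()
    print(model.rsquared)                # R-squared: share of cost variation explained
    print(model.fvalue, model.f_pvalue)  # F statistic and its significance level
    print(model.tvalues, model.pvalues)  # t statistic and significance per coefficient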

Several questions can be asked regarding the parametric method to check the accuracy of the estimating technique.

  • Is there a valid statistical relationship, or CER, between historical costs and program, physical, and performance characteristics?
  • How logical is the relationship between key cost drivers and cost?
  • Is the CER used to develop the estimate validated and accepted?
  • How old are the data in the CER database? Are they still relevant for the program being estimated?
  • Do the independent variables for the program fall within the CER data range?
  • What is the level of variation in the CER? How well does the CER explain the variation (R2) and how much of the variation does the model not explain?
  • Do any outliers affect the overall fit?
  • How significant is the relationship between cost and its independent variables?
  • How well does the CER predict costs?

The Parametric Method: Further Considerations

The statistics described in the section above are just some of the ways that can be used to validate a CER. Once the measures and statistics have been evaluated, the cost estimator picks the best CER—that is, the one with the least variation and the highest correlation to cost.

The final step in developing the CER is to validate the results to demonstrate that it can predict costs within an acceptable range of accuracy. To do this, analysts use a data set different from the one used to generate the equation and observe whether the results are similar. Again, it is important to apply the CER only to programs whose variables fall within the data range used to develop it. Deviating from the CER variable input range could invalidate the relationship and skew the results. For the CER to be accurate, the new and historical programs should have similar functions, objectives, and program factors, such as acquisition strategy, or results could be misleading. Analysts should question the source of the data underlying the CER. Some CERs may be based on data that are biased by unusual events like a strike, hurricane, or major technical problems that required a lot of rework. To mitigate this risk, it is essential to understand the data the CER is based on and, if possible, to use other historical data to check the validity of the results.

All equations should be checked for common sense to see if the relationship described by the CER is reasonable. This helps avoid the mistake that the relationship adequately describes one system but does not apply to the one being estimated.

Normalizing the data to make them consistent is imperative to obtain good results. All cost data should be converted to a constant base year. In addition, labor and material costs should be broken out separately because they may require different inflation factors to convert them to constant dollars. Moreover, independent variables should be converted into like units for various physical characteristics such as weight, speed, and length.

Historical cost data may have to be adjusted to reflect similar accounting categories, which might be expressed differently from one company to another.

It is important to fully understand all CER modeling assumptions and to examine the reliability of the dataset, including its sources, to see if they are reasonable. Additionally, CERs should be developed with established and enforced policies and procedures that require staff to have proper experience and training to ensure the model’s continued integrity. The procedures should focus on the model’s background and history, identifying key cost drivers and recommending steps for calibrating and developing the estimate. To stay current, parametric models should be continually updated and calibrated.24

There are several advantages to parametric cost estimating, including:

  • Versatility: If the data are available, parametric relationships can be derived at any level, whether system, subsystem, or component. As the design changes, CER inputs can be quickly modified and used to answer what-if questions about design alternatives.
  • Sensitivity: Simply varying input parameters and recording the resulting changes in cost can produce a sensitivity analysis.
  • Statistical output: Parametric relationships derived from statistical analysis generally have both objective measures of validity (statistical significance of each estimated coefficient and of the model as a whole) and a calculated standard error that can be used in risk analysis. This information can provide a confidence level for the estimate, based on the CER’s predictive capability.
  • Objectivity: CERs rely on historical data that provide objective results. This increases the estimate’s defensibility.

Disadvantages to parametric estimating include:

  • Database requirements: The underlying database must be consistent and reliable. It may be time-consuming to normalize the data or to ensure that the data have been normalized correctly, especially if someone outside the estimator’s team developed the CER. Without understanding how the data were normalized, the analyst has to accept the database on faith—sometimes called the black-box syndrome—in which the analyst simply plugs in numbers and accepts the results. Using a CER in this manner can increase the estimate’s risk.
  • Currency: CERs must represent the state of the art; that is, they must be updated to capture the most current cost, technical, and program data.
  • Relevance: Using data outside the CER range may cause errors because the CER loses its predictive ability outside the input range used to develop the CER.
  • Complexity: Complicated CERs, such as nonlinear CERs, may make it difficult for others to readily understand the relationship between cost and its independent variables.

Parametric Cost Models

Many cost estimating models are based on parametric methods. Depending on the model, the underlying database may contain cost, technical, and programmatic data at the system, component, and subcomponent level. Parametric models typically consist of several interrelated CERs. They may involve extensive use of CERs that relate cost to multiple independent non-cost variables. Databases and computer modeling may be used in these types of parametric cost models.

Access to the underlying data of parametric models may be limited because many models are proprietary, meaning the data are not publicly available. Therefore, when the inputs to the parametric models are qualitative, as often happens, they should be objectively assessed. In addition, parameters should be selected to tailor the model to the specific hardware or software product being estimated.

Parametric models are useful for cross-checking the reasonableness of a cost estimate that is derived by other means. As a primary estimating method, parametric models are most appropriate during the engineering concept phase when requirements are still somewhat unclear and no bill of materials exists. When this is the situation, it is imperative that the parametric model is based on historical cost data and that the model is calibrated to those data. To ensure that the model is a good predictor of costs, analysts should demonstrate that it replicates known data to a reasonable degree of accuracy. In addition, the model should demonstrate that the cost-to-non-cost estimating relationships are logical and that the data used for the parametric model can be verified and traced back to source documentation.

Using parametric cost models has several advantages:

  • They can be adjusted to best fit the system, subsystem, or component being estimated.
  • Cost estimates are based on a database of historical data.
  • They can be calibrated to match a specific development environment.

The disadvantages of parametric cost models include:

  • Their results depend on the quality of the underlying database.
  • They require many inputs that may be subjective.
  • Accurate calibration is required for valid results.

Expert Opinion

Expert opinion, also known as engineering judgment, is commonly applied to fill gaps in a relatively detailed WBS when one or more experts are the only qualified source of information, particularly in matters of specific scientific technology. Expert opinion is generally considered overly subjective, but it can still be useful in the absence of data. It is possible to alleviate subjectivity by probing further into the experts’ opinions to determine if they are based on real data. If so, the analyst should attempt to obtain the data and document the sources.

The cost estimator’s interviewing skills are also important for capturing the experts’ knowledge so that the information can be used properly. However, cost estimators should refrain from asking experts to estimate costs for anything outside the bounds of their expertise, and they should validate experts’ credentials before relying on their opinions.

The advantages of using an expert’s opinion are that:

  • It can be used when no historical data are available.
  • It takes minimal time and is easy to implement, once experts are assembled.
  • An expert may give a different perspective or identify facets not previously considered, leading to a better understanding of the program.
  • It can help in cross-checking results of CERs that have been extrapolated to use data significantly beyond the data range.
  • It can be applied in all acquisition phases.

Disadvantages associated with using an expert’s opinion include:

  • its lack of objectivity,
  • the risk that one expert will try to dominate a discussion to sway the group or that the group will succumb to the urge to agree, and
  • its limited accuracy and validity as a primary estimating method.

Because of its subjectivity and lack of supporting documentation, expert opinion should be used sparingly.

Questions to be asked regarding the use of expert opinion as an estimating method include:

  • Have the experts provided estimates within the area of their expertise?
  • Is the opinion supported by quantitative historical data? If so, can these be used instead of opinion?
  • How did the estimate account for the possibility that bias influenced the results?

Other Estimating Methods: Extrapolation from Actual Costs

Extrapolation uses the actual past or current costs of an item to estimate its future costs. There are several variants of extrapolation, including:

  • averages, the most basic variant, which use simple or moving averages of the actual costs of units already produced to predict the cost of future units (see the sketch after this list);
  • learning curves, which account for cost improvement; and
  • estimates at completion, which use actual cost and schedule data to develop estimates with EVM techniques; EACs can be calculated with various techniques to take current performance into account.
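The sketch below illustrates the first and third variants under stated assumptions: the moving-average window and all input values are hypothetical, and the EAC formula shown (actual cost plus remaining work adjusted by the cost performance index) is one common EVM technique among the several noted above.

    def moving_average_forecast(unit_costs, window=3):
        # Predict the next unit's cost as the average of the most recent actuals
        recent = unit_costs[-window:]
        return sum(recent) / len(recent)

    def eac_with_cpi(actual_cost, budget_at_completion, earned_value):
        # EAC = AC + (BAC - EV) / CPI, where CPI = EV / AC
        cpi = earned_value / actual_cost
        return actual_cost + (budget_at_completion - earned_value) / cpi

    print(moving_average_forecast([1_000, 940, 905, 880, 862]))   # ~882.3
    print(eac_with_cpi(actual_cost=4.0e6, budget_at_completion=10.0e6,
                       earned_value=3.6e6))                       # ~11.1e6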

Extrapolation is best suited for estimating follow-on units of the same item when there are actual data from current or past production lots. This method is valid when the product design or manufacturing process has changed little. If major changes have occurred, careful adjustments should be made or another method should be used. When using extrapolation techniques, it is essential to have accurate data at the appropriate level of detail. The cost estimator must ensure that the data have been validated and properly normalized. When such data exist, they form the best basis for cost estimates. Advantages associated with extrapolating from actual costs include their

  • reliance on historical costs to predict future costs;
  • credibility and reliability for estimating costs; and
  • ability to be applied at different levels of data—labor hours, material dollars, and total costs.

The disadvantages associated with extrapolating from actual costs are that:

  • changes in the accounting of actual costs can affect the results,
  • obtaining access to actual costs can be difficult,
  • results will be invalid if the production process or configuration is not stable, and
  • it should not be used for items outside the actual cost data range.

Questions regarding the use of extrapolation as an estimating method follow.

  • Were cost reports used for extrapolation validated as accurate?
  • Was the cost element at least 25 percent complete before using its data to support extrapolation?
  • Were functional experts consulted to validate the reported percentage as complete?
  • Were contractors interviewed to ensure the cost data’s validity?
  • Were recurring and nonrecurring costs separated to avoid double counting?

Other Estimating Methods: Learning Curves

The cost estimating methods discussed in this chapter can identify the cost of a single item. However, a cost estimator may need to determine whether that cost is for the first unit, the average unit, or every unit. Additionally, given the cost for one unit, how should a cost estimator determine the appropriate costs for other units? The answer is in the use of learning curves. Learning curve theory, whose curves are sometimes called progress or improvement curves, is based on the premise that people and organizations learn to do things better and more efficiently when they perform repetitive tasks. A continuous reduction in labor hours from repetitive performance in producing an item often results from more efficient use of resources, employee learning, new equipment and facilities, or improved flow of materials. This improvement can be modeled with a CER that assumes that as the quantity of units produced doubles, the amount of effort declines by a constant percentage.

Workers gain efficiencies in a number of areas as items are repeatedly produced. The most commonly recognized area of improvement is worker learning. Improvement occurs because, as a process is repeated, workers tend to become more physically and mentally adept at it. Supervisors, in addition to realizing these gains, become more efficient in managing their workers as they learn their strengths and weaknesses. Improvements in the work environment also translate into worker and supervisory improvement; studies show that changes in climate, lighting, and general working conditions motivate people to improve.

Cost improvement also results from changes to the production process that optimize the assembly line and the placement of tools and material to help simplify tasks. In the same vein, organizational changes can lead to lower recurring costs, such as instituting just-in-time inventory or centralizing tasks (heat and chemical treatment processes, tool bins, and the like). Another example of organizational change is a manufacturer agreeing to give a vendor preferred status if it is able to limit defective parts to some percentage. The reduction in defective parts can translate into savings in scrap rates, quality control hours, and recurring manufacturing labor, all of which can result in valuable time savings. In general, more complex manufacturing tasks tend to improve faster than simpler tasks: the more steps in a process, the more opportunity there is to learn how to do them better and faster. Conversely, more automated tasks achieve less learning.

In competitive business environments, market forces require suppliers to improve efficiency to survive. As a result, some suppliers may competitively price their initial product release at a loss, with the expectation that future cost improvements will make up the difference. This strategy can also discourage competitors from entering new markets. For the strategy to work, the anticipated improvements must materialize or the supplier may go out of business because of high losses.

Researchers have observed that learning causes a decrease in labor hours per production unit over time, which informed the formulation of the learning curve. The equation Y = AX^b, where A is the first unit’s hours or cost, models the concept of a constant learning curve slope (b) that affects a change in labor hours or cost (Y) given a change in units (X).25 The unit formulation states that as the total quantity doubles, the cost decreases by some fixed percentage.26 Figure 10 illustrates how hours per unit can vary by learning curve.

Figure 10: A Learning Curve

Figure 10 shows how an item’s manufacturing time decreases as its quantity increases. For example, if the learning curve slope is 90 percent and it takes 1,000 hours to produce the first unit, then it will take 900 hours to produce the second unit. Every time the quantity doubles—for example, from 2 to 4, 4 to 8, 8 to 16—the resource requirements will reduce according to the learning curve slope.
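The doubling rule follows directly from the unit formulation. A minimal sketch, using the chapter’s formula b = log(slope)/log(2) and the 90 percent, 1,000-hour example above:

    import math

    def unit_hours(first_unit_hours, unit_number, slope_percent):
        # Unit formulation Y = A * X^b, with b = log(slope) / log(2)
        b = math.log(slope_percent / 100.0) / math.log(2)
        return first_unit_hours * unit_number ** b

    # 90 percent curve, 1,000-hour first unit: each doubling cuts hours 10 percent
    print(round(unit_hours(1_000, 2, 90)))  # 900
    print(round(unit_hours(1_000, 4, 90)))  # 810
    print(round(unit_hours(1_000, 8, 90)))  # 729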

Determining the learning curve slope is important and requires analyzing historical data. If several production lots of an item have been produced, the slope can be derived from the trend in the data. Another way to determine the slope is to look at company history for similar efforts and calculate it from those efforts. The slope can also be derived from an analogous program. The analyst can look at slopes for a particular industry—aircraft, electronics, shipbuilding—sometimes reported in organizational studies, research reports, or estimating handbooks. Slopes can be specific to functional areas such as manufacturing, tooling, and engineering, or they may be composite slopes calculated at the system level, such as aircraft, radar, tanks, or missiles.

The first unit cost might be arrived at by analogy, engineering build-up, a cost estimating relationship, fitting the actual data, or another method. In some cases, the first unit cost is not available: work measurement standards might provide the hours for the 5th unit, or a cost estimating relationship might predict the 100th unit cost. This is not a problem as long as the cost estimator understands which point on the learning curve the unit cost is from and what learning curve slope applies. With this information, the cost estimator can easily solve for the first unit cost using the standard learning curve formula Y = AX^b.
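Solving for the first unit cost is a one-line rearrangement of the same formula (A = Y / X^b). The sketch below is illustrative; the 100th-unit value is hypothetical.

    import math

    def first_unit_cost(known_cost, known_unit, slope_percent):
        # Invert Y = A * X^b to recover the theoretical first unit cost A
        b = math.log(slope_percent / 100.0) / math.log(2)
        return known_cost / known_unit ** b

    # If a CER predicts 500 hours for the 100th unit on a 90 percent curve:
    print(round(first_unit_cost(500, 100, 90)))  # about 1,007 hours for unit one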

Particular care should be taken for early contracts, in which the cost estimator may not yet be familiar enough with program office habits to address the risk accurately (for example, high staff turnover, propensity for scope creep, or excessive schedule delays).

It is reasonable to expect that unit costs decrease not only as more units are produced but also as the production rate increases. This theory accounts for cost reductions that are achieved through economies of scale. Conversely, if the quantity to be produced decreases, unit costs can be expected to increase because certain fixed costs have to be spread over fewer items. The rate at which items can be produced can also be affected by the continuity of production. Production breaks may occur because of program delays (budget or technical), time lapses between initial and follow-on orders, or labor disputes. The effect of production on learning curves is discussed in greater detail in appendix VII.

Because learning can reduce the cost of an item over time, cost estimators should be aware that if multiple units are to be bought from one contractor as part of the program’s acquisition strategy, reduced costs can be anticipated. Thus, knowledge of the acquisition plan is paramount in deciding if learning curve theory can be applied. If so, careful consideration must be given to determining the appropriate learning curve slope for both labor hours and material costs. In addition, learning curves are based on recurring costs, so cost estimators need to separate recurring from nonrecurring costs to avoid skewing the results. Finally, these circumstances should be satisfied before deciding to use learning curves:

  • much manual labor is required to produce the item;
  • the production of items is continuous and, if not, then adjustments are made;
  • the items to be produced require complex processes;
  • technological change is minimal between production lots;
  • the contractor’s business process is being continually improved; and
  • the government program office culture (or environment) is sufficiently known.

Questions regarding the use of learning curves as an estimating method include:

  • How were first unit costs determined? What historical data were used to determine the learning curve slope?
  • Were recurring and nonrecurring costs separated when the learning curve was developed?
  • How were partial units treated?
  • Were production rate effects considered? How were production break effects determined?

  24. CER calibration compares independent data to model output values. For instance, one method of calibration is adjusting CER factors so that the model output is consistent with actual known costs (independent data).

  25. b = log(slope) / log(2)

  26. See appendix VII for a more detailed discussion of the two ways to develop learning curves: unit formulation and cumulative average formulation.