Data Normalization

The purpose of data normalization is to make a given data set consistent with and comparable to other data used in the estimate. Because data can be gathered from a variety of sources, they are often in different forms. They therefore need to be adjusted before being compared or used as a basis for projecting costs. Cost data are adjusted in a process called normalization, which removes the effects of external influences. The objective of data normalization is to improve data consistency so that comparisons and projections are more valid. Data are normalized in several ways: by cost units, sizing units, key groupings, and technology maturity.

Cost Units

Cost units primarily adjust for inflation, so it is important to know the year in which funds were spent. For example, once the effects of inflation are taken into account, an item that cost $100 in 1990 is more expensive than an item that cost $100 in 2005. In addition to inflation, the cost estimator needs to understand what the cost represents. For example, some data may represent only direct labor while other data include overhead and fee. Cost data should also be converted to equivalent units before being used in a data set. That is, costs expressed in thousands, millions, or billions of dollars must be converted to one format, such as all costs expressed in millions of dollars. Costs may also need to be adjusted for currency conversions.
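As a rough illustration of the inflation and unit adjustments described above, the following Python sketch converts a cost to constant base-year dollars and restates amounts in millions. The index values, the PRICE_INDEX table, and the function names are hypothetical and are not taken from any published index; an actual estimate would use an official index appropriate to the item being costed.

```python
# Hypothetical price index (base year 2005 = 1.000). These values are
# illustrative assumptions, not figures from any published inflation index.
PRICE_INDEX = {1990: 0.650, 2005: 1.000}

# Multipliers for the reporting units a data source might use.
UNIT_MULTIPLIER = {"K": 1e3, "M": 1e6, "B": 1e9}


def to_constant_dollars(cost, spend_year, base_year, index=PRICE_INDEX):
    """Restate a cost incurred in spend_year in base_year dollars."""
    return cost * index[base_year] / index[spend_year]


def to_millions(amount, unit):
    """Convert an amount reported in thousands, millions, or billions to millions."""
    return amount * UNIT_MULTIPLIER[unit] / 1e6


# $100 spent in 1990, expressed in 2005 dollars under the assumed index.
cost_1990_in_2005_dollars = to_constant_dollars(100.0, 1990, 2005)

# Costs reported as 2,500 thousand and 1.2 billion, both restated in millions.
costs_in_millions = [to_millions(2500, "K"), to_millions(1.2, "B")]
```

With these assumed index values, $100 spent in 1990 corresponds to roughly $154 in 2005 dollars, which is why the spending year must be known before costs are compared.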

Sizing Units

Sizing units normalize data to common units, such as cost per foot, cost per pound, or dollars per software line of code. When normalizing data for unit size, it is important to define exactly what the unit represents. For example, does a software line of code include carriage returns or comments? Cost estimators should clearly define the sizing metric so that the data can be converted to a common standard before being used in the estimate.
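The sketch below shows how much the definition of a sizing unit can matter, using lines of code as the example. The counting rules, the count_sloc function, and the assumption that comment lines begin with "#" are all hypothetical choices made for illustration; an actual estimate would follow whatever counting standard the program has documented.

```python
def count_sloc(source, count_comments=False, count_blank=False):
    """Count lines of code under an explicitly stated counting rule.

    Assumption for this sketch: a comment line is one whose first
    non-blank character is '#'.
    """
    total = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            if count_blank:
                total += 1
            continue
        if stripped.startswith("#") and not count_comments:
            continue
        total += 1
    return total


def cost_per_sloc(total_cost, source, **rules):
    """Dollars per line of code under the chosen counting rule."""
    return total_cost / count_sloc(source, **rules)


sample = "x = 1\n\n# set y\ny = 2\n"

# The same file yields 2, 3, or 4 lines depending on the definition used.
code_only = count_sloc(sample)
with_comments = count_sloc(sample, count_comments=True)
physical_lines = count_sloc(sample, count_comments=True, count_blank=True)
```

Because the same source yields different counts under different rules, a dollars-per-line figure from one data source is comparable to another only if both used the same definition.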

Key Groupings

Key groupings normalize data by similar missions, characteristics, or operating environments, as well as by cost type or work content. Products with similar mission applications have similar characteristics and traits, as do products with similar operating environments. For example, space systems exhibit different characteristics from those of submarines, and even within space systems the space shuttle has characteristics distinct from those of a satellite. Costs should also be grouped by type. For example, costs should be broken out between recurring and nonrecurring costs or between fixed and variable costs.

Using homogeneous groups normalizes for differences between historical and new program WBS elements in order to achieve content consistency. To do this type of normalization, a cost estimator needs to gather cost data that can be formatted to match the desired WBS element definition. This may require adding or deleting certain items to get a like-for-like comparison. A properly defined WBS dictionary is necessary to avoid inconsistencies.
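One way to picture this kind of content normalization is a small Python sketch that maps a historical program's cost accounts onto the new program's WBS elements and totals them by cost type. The account names, WBS element labels, WBS_MAP table, and dollar figures are all hypothetical; in practice the mapping would come from the WBS dictionary.

```python
from collections import defaultdict

# Hypothetical mapping from historical cost accounts to the new program's
# WBS elements. Accounts with no counterpart are left out of the map.
WBS_MAP = {
    "Airframe structure": "1.1 Structure",
    "Wing assembly": "1.1 Structure",
    "Flight software": "1.2 Software",
}


def group_costs(records, wbs_map=WBS_MAP):
    """Total costs by (WBS element, cost type).

    Each record is (account, cost_type, cost). Records whose account is not
    in the map are dropped so the comparison stays like for like.
    """
    totals = defaultdict(float)
    for account, cost_type, cost in records:
        element = wbs_map.get(account)
        if element is None:
            continue
        totals[(element, cost_type)] += cost
    return dict(totals)


# Illustrative historical records, in millions of dollars.
historical = [
    ("Airframe structure", "nonrecurring", 4.0),
    ("Wing assembly", "recurring", 1.5),
    ("Airframe structure", "recurring", 2.5),
    ("Flight software", "nonrecurring", 3.0),
    ("Ground support equipment", "recurring", 0.75),  # no matching element
]

grouped = group_costs(historical)
```

Here the ground support equipment record is excluded because the assumed new WBS has no matching element, and recurring and nonrecurring costs are kept separate within each element.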

Technology Maturity

Technology normalization is the process of adjusting cost data for productivity improvements resulting from technological advancements that occur over time. In effect, technology normalization is the recognition that technology continually improves, so a cost estimator must make a subjective attempt to measure the effect of this improvement on historical program costs. For instance, an item developed 10 years ago may have been considered state of the art at the time, so its costs would have been higher than normal. Today, that item may be available off the shelf, and its costs would therefore be considerably lower.

Technology normalization therefore depends on the ability to forecast changes in cost due to changes in technology by predicting the timing and degree of change in the technological parameters associated with the design, production, and use of devices.
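The text does not prescribe a formula for this adjustment, so the following Python sketch is only one simple possibility: it assumes that technological progress lowers the cost of a comparable item by a constant percentage each year. The function name, the compounding model, and the 3 percent rate are assumptions chosen for illustration; the rate in particular is the kind of subjective judgment the estimator would have to make and document.

```python
def technology_adjusted_cost(historical_cost, years_elapsed, annual_improvement):
    """Scale a historical cost for assumed technology-driven cost reduction.

    Assumed model: cost falls by a constant fraction (annual_improvement)
    each year, compounded over years_elapsed.
    """
    if not 0.0 <= annual_improvement < 1.0:
        raise ValueError("annual_improvement must be in [0, 1)")
    return historical_cost * (1.0 - annual_improvement) ** years_elapsed


# An item that cost 100 (inflation already removed) ten years ago,
# under an assumed 3 percent annual improvement.
adjusted = technology_adjusted_cost(100.0, 10, 0.03)
```

Under these assumptions the item that cost 100 ten years ago would be carried at about 74 today. The adjustment is separate from inflation, so it should be applied to costs that have already been converted to constant dollars.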