OVERVIEW
Source data on income and employment are inherently incomplete. Due to data disclosures and undercoverage, no one dataset provides enough information to create a complete IMPLAN database. IMPLAN’s Labor Income and Employment data, therefore, come from three primary datasets: BLS Census of Employment and Wages (CEW) data, BEA Regional Economic Accounts (REA) data, and County Business Patterns (CBP) data. For some Special Industries (farm, construction, and government), additional data sources are used that provide either more current data or more geographic or sectoral specificity. See Employment and Labor Income Data for more information about the raw data sources and their coverage.
The primary datasets that are used to estimate IMPLAN employment and labor income data all contain non-disclosed elements. Therefore, in order to complete the IMPLAN database, the non-disclosed values must be estimated and controlled for each regional dataset.
Additionally, the Census of Employment and Wages (CEW) dataset, one of the most important datasets used in IMPLAN database development, excludes proprietors and certain Industries. As described in Employment and Labor Income Data, the CEW data must therefore also be augmented and adjusted. This article describes these methods.
ESTIMATING NON-DISCLOSED VALUES
COUNTY BUSINESS PATTERN ELEMENTS
Beginning with its 2017 data, the Census Bureau no longer provides establishment counts for cases in which the establishment count is fewer than three, as the number of establishments is now considered sensitive. All records with fewer than three establishments are omitted from its tables in order to protect the confidentiality of businesses.
This practice makes a missing record of 1-2 establishments (an existing industry) indistinguishable from a missing record of zero establishments (a non-existent industry). Besides making it impossible to estimate employment and income values for the omitted records, the omission also prevents us from obtaining high-quality estimates for the non-disclosed records with three or more establishments, since we cannot roll up across NAICS levels without values for sibling sectors.
Therefore, beginning with the 2019 IMPLAN data set, we no longer create estimates for non-disclosed CBP data; we use only the disclosed data. This affects our estimates of non-disclosed BLS CEW values in cases where we would otherwise turn to CBP (because there is no disclosed raw CEW value from the previous year) but the CBP value is also not disclosed. In those cases, we now move directly to a projected CEW value from two years back, if possible, followed by state-level and then U.S.-level ratios as the second and third options for obtaining a first estimate for CEW, rather than having CBP as a second option.
CEW DATA
CEW data are current but also come with non-disclosures and, unlike the CBP data, provide only total establishment counts (i.e., they do not provide the number of establishments by employee-size classification) for non-disclosed items. To estimate the non-disclosed elements of the CEW data, a number of methods are used, depending on data availability. The first option is to turn to recent past years' disclosed raw CEW values, applying a growth rate based on state or U.S. data. If no recent past raw data are disclosed for a particular NAICS code in a particular geography, the next option is to turn to the CBP data. If there is a disclosed CBP employment value, we apply the CBP employment-per-establishment ratio to the CEW establishment count, which is always disclosed. If there are no disclosed CBP data available for this particular NAICS code and geography, we apply ratios from the parent NAICS code or geography to the CEW establishment count.
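The order of fallbacks described above can be sketched as a simple cascade. This is a minimal illustration, not IMPLAN's actual code: the function name, the parameter names, and the use of a single employment-per-establishment ratio for the parent NAICS code or geography are all assumptions made for the example.

```python
from typing import Optional


def first_estimate_cew_employment(
    cew_establishments: int,
    prior_cew_employment: Optional[float],
    growth_rate: float,
    cbp_employment: Optional[float],
    cbp_establishments: Optional[int],
    parent_employment_per_establishment: float,
) -> float:
    """Return a first estimate for one non-disclosed CEW employment cell.

    All names are hypothetical; None marks a value that is not disclosed.
    """
    # Option 1: project a recent disclosed raw CEW value forward
    # using a state- or U.S.-based growth rate.
    if prior_cew_employment is not None:
        return prior_cew_employment * (1.0 + growth_rate)
    # Option 2: disclosed CBP employment per establishment,
    # applied to the (always disclosed) CEW establishment count.
    if cbp_employment is not None and cbp_establishments:
        return cbp_employment / cbp_establishments * cew_establishments
    # Option 3: ratio from the parent NAICS code or geography,
    # applied to the CEW establishment count.
    return parent_employment_per_establishment * cew_establishments


# Illustrative values only.
from_history = first_estimate_cew_employment(4, 120.0, 0.25, None, None, 20.0)
from_cbp = first_estimate_cew_employment(4, None, 0.25, 90.0, 3, 20.0)
from_parent = first_estimate_cew_employment(4, None, 0.25, None, None, 20.0)
```

Each call exercises one branch: a projected historical value, a CBP-based ratio, and a parent-based ratio, in that order of preference.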
Values for all lower-level NAICS codes are then controlled to higher-level parent NAICS codes, across all NAICS levels. Adjustments are made only to the elements that are non-disclosed; adjustments are not made to disclosed elements.
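The controlling step can be illustrated with a proportional adjustment that touches only the estimated cells. This is a sketch under stated assumptions: the sector codes are placeholders, the scaling is a simple proportional one, and the residual (parent minus disclosed children) is assumed to be non-negative. The actual procedure may differ.

```python
def control_children_to_parent(parent_total, children, disclosed):
    """Scale only the non-disclosed (estimated) children so that
    all children sum to the parent total.

    children:  dict of sector code -> value (disclosed or first estimate)
    disclosed: set of sector codes whose values are disclosed and stay fixed
    """
    fixed = sum(v for k, v in children.items() if k in disclosed)
    estimated = sum(v for k, v in children.items() if k not in disclosed)
    if estimated == 0:
        # Nothing to adjust: every child is disclosed (or estimated as zero).
        return dict(children)
    factor = (parent_total - fixed) / estimated
    return {
        k: (v if k in disclosed else v * factor)
        for k, v in children.items()
    }


# Hypothetical parent with three children; only "child_A" is disclosed.
controlled = control_children_to_parent(
    parent_total=100.0,
    children={"child_A": 40.0, "child_B": 30.0, "child_C": 50.0},
    disclosed={"child_A"},
)
```

Here the disclosed child keeps its value of 40, and the two estimated children are scaled by the same factor so that the three values sum to the parent's 100.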
BEA REA ESTIMATES
BEA Regional Economic Accounts (REA) data are integral to the IMPLAN data creation process because they cover all Industries and are one of the few sources of Proprietor Employment and Income data, as well as Employee Compensation. U.S. 3-digit REA employment and income data are reported without non-disclosures; however, the state and county level data do have non-disclosed elements. Estimates of non-disclosed state values are made while ensuring that the state values add up to the U.S. values and that the individual state sectors also sum to the more aggregated state sectors.
Disclosing Wage and Salary Income
- The first estimate is the corresponding state level CEW income/employment ratio times the state W&S employment derived above.
- After plugging in the initial estimates, the state values are RASed using the U.S. as controls for the row values and the 2-digit State REA values as the column control.
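RAS is an iterative biproportional scaling technique: rows are scaled to match the row controls, then columns to match the column controls, and the two steps repeat until both sets of totals are met. The sketch below is a generic, pure-Python RAS on a small matrix whose rows are detailed sectors (controlled to U.S. totals) and whose columns are states (controlled to 2-digit state totals). The layout, names, and numbers are assumptions for illustration, and this simple version rescales every cell rather than holding disclosed cells fixed.

```python
def ras(matrix, row_targets, col_targets, iterations=200, tol=1e-9):
    """Generic RAS (biproportional scaling) of a non-negative matrix."""
    m = [row[:] for row in matrix]  # work on a copy
    for _ in range(iterations):
        # Row step: scale each row to its control total.
        for i, target in enumerate(row_targets):
            s = sum(m[i])
            if s > 0:
                m[i] = [v * target / s for v in m[i]]
        # Column step: scale each column to its control total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in m)
            if s > 0:
                for row in m:
                    row[j] *= target / s
        # Stop once the row totals still hold after the column step.
        if all(abs(sum(m[i]) - row_targets[i]) < tol for i in range(len(m))):
            break
    return m


# Illustrative values: 2 detailed sectors (rows) x 2 states (columns).
initial = [[10.0, 20.0], [30.0, 40.0]]
us_sector_totals = [40.0, 60.0]      # row controls
state_2digit_totals = [45.0, 55.0]   # column controls
balanced = ras(initial, us_sector_totals, state_2digit_totals)
```

Note that the row controls and column controls must sum to the same grand total (100 in this example) for the procedure to converge on both.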
Disclosing REA Employee Compensation and W&S Income
- State and County 6-digit CEW income data are aggregated to the REA sectoring scheme (the estimation of CEW data is described above).
- To get our first estimate of Employee Compensation (EC), we either project a historical disclosed REA value for that county and industry or apply the state's ratio of REA EC to CEW W&S Income to the county's CEW W&S Income.
- To get our first estimate of W&S Income, we apply the state ratio of W&S Income to EC to the county's EC value.
- These estimates are then adjusted as necessary to ensure that more-detailed sectors sum to their more-aggregate parent sectors.
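The ratio-based first estimates in the list above amount to two multiplications. The sketch below shows only the state-ratio option (not the projection of a historical disclosed REA value) and omits the final parent-child adjustment. Function names and numbers are hypothetical.

```python
def estimate_county_ec(county_cew_ws, state_rea_ec, state_cew_ws):
    """First estimate of county Employee Compensation (EC):
    the state's ratio of REA EC to CEW W&S Income, applied to
    the county's CEW W&S Income."""
    return county_cew_ws * state_rea_ec / state_cew_ws


def estimate_county_ws(county_ec, state_rea_ws, state_rea_ec):
    """First estimate of county W&S Income:
    the state's ratio of W&S Income to EC, applied to the county's EC."""
    return county_ec * state_rea_ws / state_rea_ec


# Illustrative values for one REA sector in one county.
county_ec = estimate_county_ec(
    county_cew_ws=50.0, state_rea_ec=1200.0, state_cew_ws=1000.0
)
county_ws = estimate_county_ws(
    county_ec=county_ec, state_rea_ws=1020.0, state_rea_ec=1200.0
)
```

With these inputs the state EC-to-CEW ratio is 1.2, giving a county EC of 60, and the state W&S-to-EC ratio is 0.85, giving a county W&S Income of 51.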
Disclosing REA Proprietor Income
- If Labor Income is disclosed, we simply subtract W&S Income from Labor Income to get Proprietor Income. Otherwise, the method depends on the parent sector. If the parent sector's Proprietor Income is non-disclosed, we estimate each child sector's Proprietor Income by applying the state's ratio of that sector's Proprietor Income to Total EC for all sectors to the county's Total EC value. If the parent sector's Proprietor Income is disclosed and the child sector has the opposite sign from the parent sector, we apply the state children's Proprietor Income-to-EC ratios to the county children's EC values. If the parent sector's Proprietor Income is disclosed and the child sector has the same sign as the parent sector, we subtract the disclosed children's values from the parent's value and distribute the leftover to the non-disclosed child sectors. That distribution is based on the state ratio of each child's Proprietor Income to the state's leftover parent value (that is, the state parent less the same child sectors that were subtracted from the county parent).
- These estimates are then adjusted as necessary to ensure that more-detailed Industries sum to their more-aggregate parent sectors.
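The last case above (parent disclosed, child with the same sign as the parent) is the least obvious, so a small sketch may help. It covers only that case, does no sign checking, and assumes the state children sum to the state parent. All names and numbers are hypothetical.

```python
def distribute_leftover(county_parent, county_children,
                        state_parent, state_children):
    """Fill non-disclosed county child values from the parent's leftover.

    county_children: dict of child sector -> disclosed value, or None
                     if the value is not disclosed
    state_children:  dict of child sector -> state Proprietor Income
    """
    disclosed = {k: v for k, v in county_children.items() if v is not None}
    # County parent less the disclosed children.
    county_leftover = county_parent - sum(disclosed.values())
    # State parent less the same child sectors subtracted above.
    state_leftover = state_parent - sum(state_children[k] for k in disclosed)
    result = dict(disclosed)
    for k, v in county_children.items():
        if v is None:
            share = state_children[k] / state_leftover
            result[k] = county_leftover * share
    return result


# Illustrative values: child "a" is disclosed, "b" and "c" are not.
filled = distribute_leftover(
    county_parent=100.0,
    county_children={"a": 40.0, "b": None, "c": None},
    state_parent=1000.0,
    state_children={"a": 500.0, "b": 300.0, "c": 200.0},
)
```

The county leftover of 60 is split 300:200 between the two non-disclosed children, mirroring their shares of the state's leftover of 500.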
Estimating Proprietor Employment
The BEA’s proprietor estimates were the sum of three types of proprietors: sole proprietors, limited partners, and general partners. Proprietor employment is first estimated at the national level, with distinct data sources and methods used for farm and non-farm industries.
National Estimates
The following ten-step process is used to generate national estimates of farm proprietor employment. The steps are largely based on the BEA’s published methodology for creating farm proprietor employment:
- Calculate the number of non-corporate farms for the United States. This is derived as the product of the National Agricultural Statistics Service’s (NASS) number of all farms and the Agricultural Resource Management Survey’s (ARMS) ratio of the number of non-corporate farms to all farms.
- Calculate the number of sole-proprietor farms for the United States. This is derived as the product of the number of non-corporate farms (step 1) and the ARMS ratio of the number of sole-proprietor farms to non-corporate farms.
- Calculate the inflation-adjusted number of sole-proprietor farms for the United States. This is derived as the product of the number of sole-proprietor farms (step 2) and the ratio of the number of sole-proprietor farms over the inflation-adjusted threshold to all sole-proprietor farms. The inflation-adjusted threshold comes from using the change in NASS Agricultural Prices for all farm commodities between the present data year and the year in which the USDA first set a minimum threshold for income needed to be considered a farm (1977).
- Further adjust the number of sole-proprietor farms to remove farms with hired managers, following BEA practices. The new number of sole-proprietor farms is derived as the product of the inflation-adjusted number of sole-proprietor farms (step 3) and the ratio of hired manager operators to total operators from the U.S. Census of Agriculture.
- Calculate the number of farm sole-proprietors. This is derived as the product of the number of sole-proprietor farms adjusted for hired managers and inflation (step 4) and the ratio of the number of operators to sole-proprietor farms from the Census of Agriculture.
- Calculate the number of partnership farms in the United States. This is derived as the product of the number of non-corporate farms (step 1) and the ARMS ratio of the number of partnership farms to non-corporate farms.
- Calculate the inflation-adjusted number of partnership farms for the United States. This is derived as the product of the number of partnership farms (step 6) and the ARMS ratio of the number of partnership farms over the inflation-adjusted threshold to all partnership farms.
- Further adjust the number of partnership farms to remove farms with hired managers. This is derived as the product of the adjusted number of partnership farms (step 7) and the ratio of hired manager operators to total operators from the U.S. Census of Agriculture.
- Calculate the number of farm partners. This is derived as the product of the adjusted number of partnership farms (step 8) and the ratio of the number of farm partners to partnership farms from the Census of Agriculture.
- Calculate total farm proprietors. This is derived as the sum of the number of sole-proprietors (step 5) and the number of farm partners (step 9).
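Because every step above is a product of the previous result and a ratio, the whole chain can be written as a short function. The sketch below takes each ratio as an input and applies it exactly as the steps describe. The parameter names are hypothetical, and the numbers are chosen for easy arithmetic rather than realism.

```python
def farm_proprietors(
    all_farms,                      # NASS number of all farms
    noncorp_share,                  # ARMS: non-corporate farms / all farms
    sole_prop_share,                # ARMS: sole-proprietor / non-corporate farms
    sole_over_threshold_share,      # sole-proprietor farms over the threshold
    partnership_share,              # ARMS: partnership / non-corporate farms
    partnership_over_threshold_share,
    manager_ratio,                  # Census of Agriculture ratio (steps 4, 8)
    operators_per_sole_farm,        # operators per sole-proprietor farm
    partners_per_partnership_farm,  # partners per partnership farm
):
    noncorp = all_farms * noncorp_share                        # step 1
    sole = noncorp * sole_prop_share                           # step 2
    sole = sole * sole_over_threshold_share                    # step 3
    sole = sole * manager_ratio                                # step 4
    sole_proprietors = sole * operators_per_sole_farm          # step 5
    partnership = noncorp * partnership_share                  # step 6
    partnership = partnership * partnership_over_threshold_share  # step 7
    partnership = partnership * manager_ratio                  # step 8
    partners = partnership * partners_per_partnership_farm     # step 9
    return sole_proprietors + partners                         # step 10


# Illustrative inputs only.
total_farm_proprietors = farm_proprietors(
    all_farms=1000.0,
    noncorp_share=0.9,
    sole_prop_share=0.8,
    sole_over_threshold_share=0.5,
    partnership_share=0.1,
    partnership_over_threshold_share=0.5,
    manager_ratio=0.5,
    operators_per_sole_farm=1.5,
    partners_per_partnership_farm=2.0,
)
```

With these inputs the sole-proprietor branch yields 270 proprietors and the partnership branch yields 45 partners, for a total of 315.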
Estimating the number of non-farm proprietors involves two data tables from the Internal Revenue Service’s (IRS) Statistics of Income (SOI) program: Schedule C data cover sole proprietorships, while K1 data cover limited partnerships. Because the IRS data exclude general partnerships, they are not used directly; rather, they are used to calculate growth rates that are applied to previous-year full proprietor counts from IMPLAN. They also serve as minimums for non-farm proprietor employment. The growth rates are constrained to fall within state- and sector-specific historical ranges observed in the BEA REA data. IMPLAN’s fully-disclosed CEW data serve as a back-up source for calculating growth rates if the IRS data are not published in the necessary timeframe.
Of the three proprietor types in the BEA's estimates, the Schedule C data match the sole proprietorships, but the K1 data include only limited partnerships (general partnerships are excluded). The BEA had special data files that allowed it to estimate the number of general partnerships from the number of limited partnerships by industry, but those data were considered confidential and were not available for public use. By projecting a count that includes all three types of proprietors based on the growth in just the first two of those types, we circumvent the need for these special tabulations and obtain a consistent count of proprietors.
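The non-farm projection can be sketched as a bounded growth rate applied to the prior-year count, with the IRS total as a floor. This is one plausible reading, not IMPLAN's implementation: the text does not spell out how the IRS data serve as minimums, so treating the current-year IRS total (Schedule C plus K1) as the floor is an assumption, as are all names and numbers.

```python
def project_nonfarm_proprietors(
    prior_implan_count,  # previous-year full proprietor count (all three types)
    irs_prior,           # prior-year IRS total (Schedule C + K1)
    irs_current,         # current-year IRS total (Schedule C + K1)
    min_growth,          # lower bound of the historical range (as a ratio)
    max_growth,          # upper bound of the historical range (as a ratio)
):
    # Growth observed in the two proprietor types the IRS data cover.
    growth = irs_current / irs_prior
    # Constrain growth to the state- and sector-specific historical range.
    growth = max(min_growth, min(max_growth, growth))
    projected = prior_implan_count * growth
    # Assumption: the current IRS total acts as a minimum.
    return max(projected, irs_current)


# Illustrative values: IRS growth of 10% is capped at 5%.
capped = project_nonfarm_proprietors(1000.0, 800.0, 880.0, 0.95, 1.05)
# Illustrative values: the projection falls below the IRS total,
# so the floor applies.
floored = project_nonfarm_proprietors(500.0, 800.0, 880.0, 0.95, 1.05)
```

The first call shows the growth cap binding; the second shows the IRS minimum binding.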
The BEA proprietor income tables, which still exist, reveal cases where new proprietor employment should exist in a given industry and geography or where previously-existent proprietor employment should no longer exist in a given industry and geography.
As with all IMPLAN data, various geographic and industry hierarchy controls are imposed, and various limits and quality control checks are put into effect.
State and County Estimates
National level estimates are initially distributed to states and counties based on prior-year distributions, with the following adjustments:
- New estimates are limited in size to conform to state-specific growth rate caps based on historical BEA REA Proprietor Employment data.
- BEA REA Proprietor Income data are used to account for any new or lost sectors in each state and county.
- As with other IMPLAN data elements, Proprietor Employment estimates are controlled geographically and according to parent-children industry relationships.
IMPLAN-SECTOR EMPLOYMENT AND INCOME DATA
With a complete disclosed set of 3-digit REA Employee Compensation (EC) data, it is now possible to distribute the data to the IMPLAN Industries using the disclosed CEW data. These values are then controlled to BEA NIPA control totals.
Distributing Employee Compensation to IMPLAN Industries
- The 3-digit adjusted REA EC is distributed to IMPLAN Industries based on the CEW data that have been aggregated to the IMPLAN Industry scheme.
- State estimates are forced to sum to the national value.
- County data are then forced to sum to the corresponding state values.
- A proportion of some Industries’ activity (employment, output, income, etc.) gets reclassified into other Industries. This follows the BEA's Redefinitions practice and is designed to reassign products from producing industries in which they are secondary products to the industries where those products are primary.
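The first three steps in the list above (distribute by CEW shares, then force states to the nation and counties to states) are both proportional operations. The sketch below illustrates them with hypothetical industry and state labels and invented numbers. The reclassification step is not shown.

```python
def distribute_by_shares(total, weights):
    """Split a control total across items in proportion to their weights
    (here, CEW values aggregated to the IMPLAN Industry scheme)."""
    weight_sum = sum(weights.values())
    return {k: total * w / weight_sum for k, w in weights.items()}


def force_to_control(values, control):
    """Scale a set of values proportionally so they sum to a control total."""
    current = sum(values.values())
    return {k: v * control / current for k, v in values.items()}


# Distribute one 3-digit REA EC value to two hypothetical IMPLAN Industries.
industry_ec = distribute_by_shares(
    total=500.0, weights={"industry_A": 100.0, "industry_B": 300.0}
)

# Force hypothetical state estimates for one Industry to the national value.
state_ec = force_to_control(
    values={"state_1": 120.0, "state_2": 80.0}, control=250.0
)
```

The same `force_to_control` helper could then be applied to county values within each state, using the state value as the control.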
Distributing Proprietor Employment and Income to IMPLAN Industries
Proprietor Employment and Income estimates in the more aggregate BEA REA sectoring scheme are distributed to IMPLAN Industries based on Wage and Salary Income or Employee Compensation, depending on the Industry.
Special Considerations for Distribution
For sectors not fully covered by CEW (e.g., farm sectors, rail transportation, and military) and sectors that are not NAICS-based (e.g., construction), other distributors are used (see Data Sources for Select Industries: Farm, Construction, Rail Transportation, and Government).
ACCOUNTING FOR UNDERCOVERAGE IN BLS CEW DATA
The CEW data provide an incredible level of detail by industry (6-digit NAICS), geography (county level), and ownership (Private, Federal Government, State Government, and Local Government), and they always disclose establishment counts.
However, as described in Employment and Labor Income Data, the CEW data must be augmented and adjusted for the following reasons:
- Non-disclosures. In these cases, the CEW data will contain an establishment count but will not reveal the associated employment or income values.
- Undercoverage. The CEW data include only workers covered by State unemployment insurance (UI) laws and Federal workers covered by the Unemployment Compensation for Federal Employees (UCFE) program. Excluded from CEW Private employment are proprietors, certain farm and domestic workers exempted from having to report employment data, and railroad workers covered by the railroad unemployment insurance system. Excluded from CEW Federal Government employment are elected officials in the executive or legislative branch, members of the armed forces or the Commissioned Corps of the National Oceanic and Atmospheric Administration, individuals serving on a temporary basis in case of fire, storm, earthquake, or other similar emergency, and individuals employed under a Federal relief program to relieve them from unemployment. Excluded from CEW State and Local Government employment are elected officials, members of a legislative body or members of the judiciary, members of the state National Guard or Air National Guard, and employees serving on a temporary basis in case of fire, storm, snow, earthquake, flood or similar declared emergency.
This section describes the two different approaches that IMPLAN employs to account for exclusions and undercoverage and discusses the Industries that fall under each approach.
APPROACH 1: ADJUST THE CEW DATA USING REA DATA
The first approach to account for undercoverage by CEW is to use the Bureau of Economic Analysis’ Regional Economic Accounts (REA) data to provide a compensating boost to the CEW data. The BEA REA data provide full coverage of the income of all employees and proprietors in all industries, but they are lagged one year relative to the CEW data and do not have nearly the same level of industry detail. Thus, the BEA REA data are not suitable as a replacement for the CEW data, but they can be used in conjunction with the CEW data to determine the magnitude of undercoverage in each industry and to calculate the appropriate adjustment ratios. The adjustment ratios are applied to the fully-disclosed CEW data.
This approach is used for the following IMPLAN Industries:
- Support activities for agriculture and forestry
- Commercial fishing
- Junior colleges, colleges, universities, and professional schools
- Grantmaking, giving, and social advocacy organizations
- Private households
- State Government Other Services
- Local Government Other Services
- Federal Government Non-Military
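The Approach 1 adjustment can be sketched as a ratio of REA to CEW values at the REA sector level, applied to the more detailed CEW values. This is a minimal illustration with hypothetical names and numbers. It assumes the ratio is computed for a year in which both sources are available, which the text does not specify given the one-year REA lag.

```python
def undercoverage_ratio(rea_value, cew_value_at_rea_level):
    """Adjustment ratio for one REA sector: full-coverage REA value
    over the CEW value aggregated to the same sector."""
    return rea_value / cew_value_at_rea_level


def adjust_cew(cew_detail, ratio):
    """Apply the sector's adjustment ratio to each detailed CEW value."""
    return {k: v * ratio for k, v in cew_detail.items()}


# Illustrative values: REA shows 10% more income than CEW for this sector.
ratio = undercoverage_ratio(rea_value=1100.0, cew_value_at_rea_level=1000.0)
adjusted = adjust_cew({"detail_A": 600.0, "detail_B": 400.0}, ratio)
```

The detailed CEW values keep their relative proportions; only their level is raised to reflect the workers that CEW does not cover.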
APPROACH 2: USE A DIFFERENT DATA SOURCE
The second approach to account for undercoverage by the CEW is to use a different data source altogether. The IMPLAN Industries that fall into this category are the farm, construction, rail transportation, and government Industries. Read more about the data sources and estimation processes for these Industries in Data Sources for Select Industries: Farm, Construction, Rail Transportation, and Government.
ADDITIONAL RESOURCES
RELATED ARTICLES
Employment and Labor Income Data Sources
Data Sources for Select Industries: Farm, Construction, Railroad, and Government
Written July 2, 2024
Updated October 21, 2025