Accounting for Non-Disclosures, Exclusions, and Undercoverage in Employment Data

OVERVIEW

Source data on income and employment are inherently incomplete. Because of non-disclosures and undercoverage, no single dataset provides enough information to create a complete IMPLAN database. IMPLAN’s Labor Income and Employment data, therefore, come from three primary datasets: BLS Census of Employment and Wages (CEW) data, BEA Regional Economic Accounts (REA) data, and County Business Patterns (CBP) data. For some Special Industries (farm, construction, and government), additional data sources are used that provide either more current data or more geographic or sectoral specificity. See Employment and Labor Income Data for more information about the raw data sources and their coverage.

The primary datasets that are used to estimate IMPLAN employment and labor income data all contain non-disclosed elements. Therefore, in order to complete the IMPLAN database, the non-disclosed values must be estimated and controlled for each regional dataset. 

Additionally, the Census of Employment and Wages (CEW) dataset, which is one of the most important datasets used in IMPLAN database development, excludes proprietors and certain Industries. As described in Employment and Labor Income Data, the CEW data must therefore also be augmented and adjusted. This article describes these methods.

ESTIMATING NON-DISCLOSED VALUES

COUNTY BUSINESS PATTERN ELEMENTS

Beginning with its 2017 data, the Census Bureau no longer provides establishment counts for cases in which the establishment count is fewer than three, because the number of establishments is now considered sensitive. Under this policy, all records with fewer than three establishments are omitted from the published tables to protect the confidentiality of businesses.

This practice makes a missing record of 1-2 establishments (an existing industry) indistinguishable from a missing record of zero establishments (a non-existent industry). Omitting records with fewer than three establishments not only makes it impossible to estimate their employment and income values, it also makes it impossible for us to obtain high-quality estimates for the non-disclosed records with three or more establishments, since we cannot roll up across NAICS levels without values for sibling sectors.

Therefore, beginning with the 2019 IMPLAN data set, we no longer create estimates for non-disclosed CBP data; we use only the disclosed data. This affects our estimates of non-disclosed BLS CEW values in those cases in which we need to turn to CBP (because no raw CEW value is disclosed for the previous year) and in which the CBP value is also non-disclosed. In those cases, we now move directly to a projected CEW value from two years back, if possible, and then to state-level or U.S.-level ratios as the second and third options for obtaining a first estimate for CEW, rather than having CBP as the second option.

CEW DATA

CEW data are current but also come with non-disclosures and, unlike the CBP data, provide only total establishment counts (i.e., they do not provide the number of establishments by employee-size classification) for non-disclosed items. To estimate the non-disclosed elements of the CEW data, a number of methods are used, depending on data availability. The first option is to turn to recent past years' disclosed raw CEW values, applying a growth rate based on state or U.S. data. If no recent past raw data are disclosed for a particular NAICS code in a particular geography, the next option is to turn to the CBP data. If there is a disclosed CBP employment value, we apply the CBP employment-per-establishment ratio to the CEW establishment count, which is always disclosed. If no disclosed CBP data are available for this particular NAICS code and geography, we apply ratios from the parent NAICS code or geography to the CEW establishment count.
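
The order of fallbacks described above can be sketched as follows. This is a minimal illustration only, not IMPLAN's implementation: the function name, the parameter names, and the use of a single growth rate and a single parent ratio are all assumptions made for the example.

```python
from typing import Optional


def first_estimate_cew_employment(
    cew_establishments: float,
    prior_cew_employment: Optional[float],
    growth_rate: float,
    cbp_employment: Optional[float],
    cbp_establishments: Optional[float],
    parent_employment_per_establishment: float,
) -> float:
    """Return a first estimate for a non-disclosed CEW employment value.

    All names are hypothetical; None marks a value that is not disclosed.
    """
    # Option 1: project a recent disclosed raw CEW value forward
    # with a state- or U.S.-based growth rate.
    if prior_cew_employment is not None:
        return prior_cew_employment * (1.0 + growth_rate)

    # Option 2: disclosed CBP employment per establishment, applied to
    # the CEW establishment count (which is always disclosed).
    if cbp_employment is not None and cbp_establishments:
        return cbp_employment / cbp_establishments * cew_establishments

    # Option 3: ratio from the parent NAICS code or geography,
    # applied to the CEW establishment count.
    return parent_employment_per_establishment * cew_establishments
```

For instance, with no prior CEW value, a disclosed CBP record of 30 employees in 6 establishments, and 4 CEW establishments, the sketch returns 30 / 6 * 4 = 20 employees.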

Values for all lower-level NAICS codes are then controlled to higher-level parent NAICS codes, across all NAICS levels. Adjustments are made only to the elements that are non-disclosed; adjustments are not made to disclosed elements.
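
As an illustration of controlling child values to a parent while leaving disclosed elements untouched, the sketch below scales only the estimated (non-disclosed) children so that all children sum to the parent. Proportional scaling is an assumption made for this example; the article does not specify the exact adjustment rule, and the function name is hypothetical.

```python
def control_to_parent(parent_total, disclosed, estimates):
    """Scale non-disclosed estimates so all children sum to the parent.

    disclosed: dict of NAICS code -> disclosed value (never adjusted).
    estimates: dict of NAICS code -> first estimate for a non-disclosed value.
    Returns a new dict of adjusted estimates.
    """
    # Whatever the disclosed children do not account for must be
    # absorbed by the non-disclosed children.
    residual = parent_total - sum(disclosed.values())
    estimate_sum = sum(estimates.values())
    if estimate_sum == 0:
        # Nothing to scale; return the estimates unchanged.
        return dict(estimates)
    scale = residual / estimate_sum
    return {code: value * scale for code, value in estimates.items()}
```

With a parent total of 100, one disclosed child of 60, and first estimates of 20 and 60 for two non-disclosed children, the residual of 40 is split in the same 1:3 proportion, giving 10 and 30.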

REA STATE ADJUSTMENTS

BEA Regional Economic Accounts (REA) data are integral to the IMPLAN data creation process because they cover all Industries and are one of the few sources of Proprietor Employment and Income data, as well as Employee Compensation. U.S. 3-digit REA employment and income data are reported without non-disclosures; however, the state and county level data do have non-disclosed elements. Estimates of non-disclosed state values are made while ensuring that the state values add up to the U.S. values and that the individual state sectors also sum to the more aggregated state sectors.

Disclosing Wage and Salary Employment (SA27 Tables)

  1. State REA values are matched to corresponding CEW employment values. The CEW problem sectors (farm, railroad, and military) are not a problem with REA data as there are few non-disclosures in these sectors at the state level in REA.
  2. After plugging in the initial estimates, state values are RASed using U.S. values as controls for the row values and the 1-digit State REA values as the column control.
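
RAS is an iterative biproportional scaling procedure: rows are scaled to match row controls, then columns are scaled to match column controls, and the two steps repeat until both sets of controls are met. The sketch below is a generic, minimal version. The matrix layout (rows as sectors controlled to U.S. values, columns as states controlled to aggregate state values) is an assumption for illustration, and the sketch scales every cell rather than holding any cells fixed.

```python
def ras(matrix, row_targets, col_targets, max_iter=1000, tol=1e-9):
    """Biproportionally scale a non-negative matrix to row and column controls.

    The row and column targets must have the same grand total.
    Returns a new matrix; the input is not modified.
    """
    m = [list(row) for row in matrix]
    for _ in range(max_iter):
        # Row step: scale each row to its control total.
        for i, target in enumerate(row_targets):
            row_sum = sum(m[i])
            if row_sum > 0:
                m[i] = [v * target / row_sum for v in m[i]]
        # Column step: scale each column to its control total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in m)
            if col_sum > 0:
                factor = target / col_sum
                for row in m:
                    row[j] *= factor
        # Columns now match; stop once the rows also match within tolerance.
        gap = max(abs(sum(m[i]) - t) for i, t in enumerate(row_targets))
        if gap < tol:
            break
    return m
```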

Disclosing Wage and Salary Income

  1. The first estimate is the corresponding state level CEW income/employment ratio times the state W&S employment derived above.
  2. After plugging in the initial estimates, the state values are RASed using the U.S. as controls for the row values and the 2-digit State REA values as the column control.

Disclosing Total Employment (SA25 Tables)

The four-component BEA Gross State Product data are a state-level source of information that tells us whether there is any self-employment income in the state.

  1. If no proprietor employment is reported, then total employment is equal to wage and salary employment, the value of which is derived in the previous step 'Disclosing Wage and Salary Employment'.
  2. In some cases, a single 3-digit non-disclosure remains within a 2-digit group; its value can be derived by subtracting all disclosed 3-digit data from the 2-digit control value. Conversely, there may be proprietor employment and no corresponding wage and salary employment. For sectors for which both wage and salary employment and proprietor employment are non-disclosed, the first estimate is based on U.S. proprietor employment to wage and salary employment ratios for that sector.
  3. Initial estimates are controlled to known totals at various stages in the process.
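
The subtraction case in step 2 can be shown with a short sketch. The function name and the use of None to mark a non-disclosed value are assumptions made for the example.

```python
def fill_single_nondisclosure(control_total, children):
    """Fill the only non-disclosed child by subtraction from the control.

    children: dict of sector -> value, with None for non-disclosed values.
    If there is not exactly one non-disclosed child, the data are
    returned unchanged, since subtraction alone cannot resolve them.
    """
    missing = [sector for sector, value in children.items() if value is None]
    if len(missing) != 1:
        return dict(children)
    known = sum(value for value in children.values() if value is not None)
    filled = dict(children)
    filled[missing[0]] = control_total - known
    return filled
```

For example, a 2-digit control of 100 with disclosed 3-digit values of 40 and 25 leaves 35 for the single non-disclosed 3-digit sector.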

REA COUNTY ESTIMATES

Disclosing REA Employee Compensation and W&S Income

  1. State and County 6-digit CEW income data are aggregated to the REA sectoring scheme (the estimation of CEW data is described above).
  2. To get our first estimate of Employee Compensation (EC), we either project a historical disclosed REA value for that county and industry or apply the state's ratio of REA EC to CEW W&S Income to the county's CEW W&S Income.
  3. To get our first estimate of W&S Income, we apply the state ratio of W&S Income to EC to the county's EC value.
  4. These estimates are then adjusted as necessary to ensure that more-detailed sectors sum to their more-aggregate parent sectors.
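
Steps 2 and 3 are ratio calculations, sketched below under stated assumptions. The function and parameter names are hypothetical, and preferring the projected historical REA value whenever one is available is an assumption; the article lists the two options without stating a priority.

```python
def county_ec_first_estimate(county_cew_wages, state_rea_ec, state_cew_wages,
                             projected_rea_ec=None):
    """First estimate of county Employee Compensation (EC)."""
    # Assumed priority: use a projected historical disclosed REA value
    # for the county and industry when one exists.
    if projected_rea_ec is not None:
        return projected_rea_ec
    # Otherwise apply the state's ratio of REA EC to CEW W&S Income
    # to the county's CEW W&S Income.
    return county_cew_wages * state_rea_ec / state_cew_wages


def county_ws_income_first_estimate(county_ec, state_ws_income, state_ec):
    """First estimate of county W&S Income from the county EC value."""
    # Apply the state's ratio of W&S Income to EC to the county's EC.
    return county_ec * state_ws_income / state_ec
```

With county CEW wages of 200, a state REA EC of 1,200, and state CEW wages of 1,000, the EC estimate is 200 * 1.2 = 240; applying the state W&S-to-EC ratio of 1,000 / 1,200 to that value gives a W&S Income estimate of 200.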

Disclosing REA Wage & Salary Employment

  1. State and County 6-digit CEW employment data are aggregated to the REA sectoring scheme (the estimation of CEW data is described above).
  2. To get our first estimate of W&S Employment we apply either the county's ratio of CEW Employment to CEW W&S Income to the county's W&S Income estimate or the state's CEW W&S Employment to CEW W&S Income ratio to the county's W&S Income estimate.
  3. These estimates are then adjusted as necessary to ensure that more-detailed Industries sum to their more-aggregate parent sectors.

Disclosing REA Proprietor Employment

  1. If Total Employment is disclosed, we simply subtract W&S Employment from Total Employment to get Proprietor Employment. Otherwise, if the parent sector's Proprietor Employment is non-disclosed, we estimate the child sectors' Proprietor Employment by applying the state's ratio of Proprietor Employment to EC for that sector to the county's EC value. If the parent sector's Proprietor Employment is disclosed, we distribute its value to its child sectors based on the state proportions of those child sectors to the same parent.
  2. These estimates are then adjusted as necessary to ensure that more-detailed Industries sum to their more-aggregate parent sectors.
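
The three cases in step 1 can be sketched as a single function. All names are hypothetical and None marks a non-disclosed county value; the subsequent adjustment to parent sectors in step 2 is not shown.

```python
def proprietor_employment_estimate(
    county_ws_employment,
    county_ec,
    state_prop_emp_to_ec,
    state_child_share_of_parent,
    county_total_employment=None,
    county_parent_prop_employment=None,
):
    """First estimate of county Proprietor Employment for one child sector."""
    # Case 1: Total Employment is disclosed, so subtract W&S Employment.
    if county_total_employment is not None:
        return county_total_employment - county_ws_employment
    # Case 2: the parent's Proprietor Employment is non-disclosed, so apply
    # the state's Proprietor Employment-to-EC ratio to the county's EC.
    if county_parent_prop_employment is None:
        return state_prop_emp_to_ec * county_ec
    # Case 3: the parent's Proprietor Employment is disclosed, so distribute
    # it using the child's share of the same parent at the state level.
    return county_parent_prop_employment * state_child_share_of_parent
```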

Disclosing REA Proprietor Income

  1. If Labor Income is disclosed, we simply subtract W&S Income from Labor Income to get Proprietor Income. Otherwise, if the parent sector's Proprietor Income is non-disclosed, we estimate the child sectors' Proprietor Income by applying the state's ratio of Proprietor Income for that sector to Total EC for all sectors to the county's Total EC value. If the parent sector's Proprietor Income is disclosed and the child sector has the opposite sign from the parent sector, we apply the state children's Proprietor Income-to-EC ratios to the county children's EC values. If the parent sector's Proprietor Income is disclosed and the child sector has the same sign as the parent sector, we subtract the disclosed children's values from the parent's value and then distribute the leftover to the non-disclosed child sectors based on the state ratio of that child's Proprietor Income to the state's corresponding leftover (that is, the state parent less the same child sectors that were subtracted from the county parent).
  2. These estimates are then adjusted as necessary to ensure that more-detailed Industries sum to their more-aggregate parent sectors.

REA EMPLOYMENT AND INCOME DATA

With a complete disclosed set of 3-digit REA income (national income being adjusted to NIPA) and employment data, it is now possible to distribute data to the 546 IMPLAN Industries using the disclosed CEW data.

Distributing W&S Employment and Employee Compensation to IMPLAN Industries

  1. The 3-digit adjusted REA W&S Employment is distributed to IMPLAN Industries based on the CEW data that has been aggregated to the IMPLAN Industry scheme.
  2. State estimates are forced to sum to the national value.
  3. County data are then forced to sum to the corresponding state values.
  4. A proportion of some Industries’ activity (employment, output, income, etc.) gets reclassified into other Industries. This follows the BEA's “redefinitions” practice and is designed to reassign products from producing industries in which they are secondary products to the industries where those products are primary.

Distributing Proprietor Employment and Income to IMPLAN Industries

BEA REA Proprietor Employment and Income estimates are distributed to IMPLAN Industries based on Wage and Salary Income or Employee Compensation, depending on the Industry.

Special Considerations for Distribution

  1. Construction industries are not defined by 6-digit NAICS in IMPLAN but rather by Census construction type categories. CEW does not fully cover farm industries. Therefore, other distributors are used (see Special Industries) instead of CEW data.
  2. CEW also does not cover Railroad transportation or Federal Military, but there is a one-to-one correspondence between 3-digit REA data and IMPLAN Industries for these, so no distribution is necessary.
  3. There are industries, at the county level, for which 3-digit REA income and employment are available but there are no corresponding 6-digit CEW data (indicating self-employment in the industry for the county but no wage and salary workers). For these cases, the 3-digit REA data are distributed to industries based on the state distribution. If this distribution assigns less than 0.4 employees to a particular industry, then that piece of the distribution is added to the largest component of the distribution. Finally, a sector with less than 0.25 employees before redefinitions and balancing is zeroed out.
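
The 0.4-employee rule in item 3 can be sketched as follows. The function name and the representation of the state distribution as shares that sum to one are assumptions for the example; the later zeroing-out of sectors below 0.25 employees is a separate step and is not shown.

```python
def distribute_with_floor(rea_total, state_shares, floor=0.4):
    """Distribute a 3-digit REA value to industries using state shares.

    Any piece smaller than `floor` is folded into the largest component
    of the distribution, so the total is preserved.
    """
    allocation = {ind: rea_total * share for ind, share in state_shares.items()}
    largest = max(allocation, key=allocation.get)
    for industry in allocation:
        if industry != largest and allocation[industry] < floor:
            # Move the too-small piece to the largest component.
            allocation[largest] += allocation[industry]
            allocation[industry] = 0.0
    return allocation
```

With 10 employees and state shares of 0.75, 0.21875, and 0.03125, the initial pieces are 7.5, 2.1875, and 0.3125; the last falls below 0.4 and is added to the largest piece, giving 7.8125, 2.1875, and 0.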

ACCOUNTING FOR UNDERCOVERAGE IN BLS' CEW DATA

The CEW data provide an exceptional level of industry detail (6-digit NAICS), geographic detail (county-level), and ownership detail (Private, Federal Government, State Government, and Local Government), and always disclose establishment counts.

However, as described in Employment and Labor Income Data, the CEW data must be augmented and adjusted for the following reasons:

  1. Non-disclosures. In these cases, the CEW data will contain an establishment count but will not reveal the associated employment or income values. 
  2. Undercoverage. The CEW data include only workers covered by State unemployment insurance (UI) laws and Federal workers covered by the Unemployment Compensation for Federal Employees (UCFE) program. CEW excludes proprietors, certain farm and domestic workers for whom employment data need not be reported, railroad workers covered by the railroad unemployment insurance system, elected officials in the executive or legislative branch, members of the armed forces or the Commissioned Corps of the National Oceanic and Atmospheric Administration, individuals serving on a temporary basis in case of fire, storm, earthquake, or other similar emergency, individuals employed under a Federal relief program to relieve them from unemployment, and employees of certain national security agencies, which are excluded for security reasons.

This section describes the two different approaches that IMPLAN employs to account for exclusions and undercoverage and discusses the Industries that fall under each approach. 

APPROACH 1: ADJUST THE CEW DATA USING REA DATA

The first approach to account for undercoverage by CEW is to use the Bureau of Economic Analysis’ Regional Economic Accounts (REA) data to provide a compensating boost to the CEW data. While the BEA REA data provide full coverage of all employees in all industries, the BEA REA data are lagged one year relative to the CEW data and do not have nearly the same level of industry detail; thus, the BEA REA data are not suitable as a replacement for the CEW data, but can be used in conjunction with the CEW data to determine the magnitude of undercoverage in each industry and to calculate the appropriate adjustment ratios accordingly. The adjustment ratios are applied to the fully-disclosed CEW data. 
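
One way to picture the adjustment is as a ratio of the REA value to the CEW value for the same industry scope, applied to the detailed CEW values. The article does not give the exact formula, so the ratio definition below, along with the function names, is an assumption made purely for illustration.

```python
def undercoverage_ratio(rea_value, cew_value_same_scope):
    """Assumed adjustment ratio: REA value over CEW value for the same scope.

    A ratio above 1 indicates that CEW covers less than REA.
    """
    return rea_value / cew_value_same_scope


def adjust_cew(cew_values, ratio):
    """Apply one adjustment ratio to each detailed, fully-disclosed CEW value."""
    return {industry: value * ratio for industry, value in cew_values.items()}
```

For example, if REA reports 1,250 employees where CEW reports 1,000 for the same scope, the ratio is 1.25, and detailed CEW values of 80 and 20 become 100 and 25.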

This approach is used for the following IMPLAN (546 Unaggregated) Industries: 

  • 19 - Support activities for agriculture and forestry
  • 17 - Commercial fishing
  • 481 - Junior colleges, colleges, universities, and professional schools 
  • 522 - Grantmaking, giving, and social advocacy organizations 
  • 525 - Private households 

APPROACH 2: USE A DIFFERENT DATA SOURCE

The second approach to account for undercoverage by the CEW is to use a different data source altogether and not use CEW data for these Industries. The IMPLAN Industries that fall into this category are the Farm, Construction, Rail transportation, and Government Industries. Read more about the data sources and estimation processes for these Industries in Special Industries in IMPLAN.

ADDITIONAL RESOURCES

BLS CEW Data

RELATED ARTICLES

Employment and Labor Income Data Sources

Special Industries in IMPLAN: Farm, Construction, Railroad, and Government

 

Written July 2, 2024