Estimating Non-Disclosed CEW Values

A Brief Introduction to the QCEW Data Series

The Bureau of Labor Statistics’ (BLS) Quarterly Census of Employment and Wages (QCEW) is a quarterly count of employment and wages reported by employers. The QCEW covers more than 95 percent of U.S. jobs and is available at the county, state, and national levels, by NAICS[1] industry, for all levels of NAICS detail. The primary source for the QCEW is administrative data from state unemployment insurance (UI) programs. These data are supplemented by data from two BLS surveys: the Annual Refiling Survey (ARS) and the Multiple Worksite Report (MWR). Before publication, BLS and state workforce agencies review and enhance the QCEW data, correcting errors, imputing for nonresponse, and confirming and annotating unusual movements. The BLS also publishes an annual version of the CEW data, which is the version used by IMPLAN. These data are current to the IMPLAN data year (no lag).

The BLS publishes CEW data for 5 ownership types: Total, Federal Government, State Government, Local Government, and Private (ownership codes 0, 1, 2, 3, and 5, respectively). IMPLAN makes use of all ownership types. The government-owned and privately-owned CEW data are estimated separately by IMPLAN, as will be described in detail later in this article. The table below shows the raw 2020 annual CEW data for Denver County, CO for NAICS code 10 (All industries), for each of the 5 ownership types.

State FIPS | County FIPS | Ownership Code | NAICS Code | NAICS Title | Year | Disclosure Code | Establishments | Employment | Wages
08 | 031 | 0 | 10 | Total, All industries | 2020 |  | 36,207 | 91,419 | 40,287,746,658
08 | 031 | 1 | 10 | Total, All industries | 2020 |  | 110 | 13,161 | 1,212,748,433
08 | 031 | 2 | 10 | Total, All industries | 2020 |  | 49 | 15,411 | 1,034,001,938
08 | 031 | 3 | 10 | Total, All industries | 2020 |  | 83 | 40,282 | 2,781,496,532
08 | 031 | 5 | 10 | Total, All industries | 2020 |  | 35,965 | 422,564 | 35,259,499,755

 

Exclusions, Undercoverage, Non-Disclosures, & Oddities

Exclusions and Undercovered Industries

Major exclusions from UI coverage (and thus the BLS CEW data) include most agricultural workers on small farms, all members of the Armed Forces, elected officials in most states, most employees of railroads, some domestic workers, most student workers at schools, and employees of certain small nonprofit organizations (primarily religious organizations). 

According to the Bureau of Economic Analysis (BEA), the following sectors are also undercovered in the CEW data: Support activities for agriculture and forestry; Shellfish fishing; Finfish fishing; Religious organizations; Colleges, universities, and professional schools (student employees specifically), and Private households. While non-military and non-elected government employees can choose to be covered by Social Security or their own programs, they still have government-provided unemployment insurance, which is what allows them to show up in CEW.

IMPLAN uses other data sources to adjust for the exclusions and undercoverage in the CEW data when developing the full IMPLAN data set. However, IMPLAN’s CEW data product does not include any such adjustments. IMPLAN’s CEW data contain estimates for all non-disclosed records and have full NAICS roll-up (described in more detail later in this article) but are not otherwise adjusted.

The CEW data cover wage and salary employees only; there are no proprietors in the CEW data sets. IMPLAN obtains proprietor data (in addition to several other data elements) from the BEA’s Regional Economic Accounts tables. Please note that while proprietors themselves are not included in CEW, if those proprietors have employees, those employees are indeed included in the CEW data.

Non-disclosures

In accordance with the BLS confidentiality policy, data reported under a promise of confidentiality are published in a way that protects the identifiable information of respondents. As such, the BLS withholds the publication of UI-covered employment and wage data for any industry level when necessary to protect the identity of employers. Totals at the industry level for the states and the nation include the non-disclosed data suppressed within the detailed tables without revealing those data. QCEW confidentiality concepts and practices are largely based on the Statistical Policy Working Paper 22 developed by the Federal Committee on Statistical Methodology.

The great news is that even when the BLS does not publish the employment or wages for a given geography and NAICS code, it always reveals the number of establishments in every NAICS code and geography (with a few exceptions, which are discussed in the next sub-section of this article). This not only tells us which industries exist in which geographies, but also helps us develop quality estimates of the non-disclosed employment and wage values. Non-disclosed records are marked with a disclosure code of “N” in the raw CEW data, as shown in the table below, which contains more data for Denver County, CO; note that the number of establishments is still reported for these records. The main value IMPLAN adds to the raw CEW data is to provide estimates for all non-disclosed records. We provide these estimates using a prioritized hierarchy of data and techniques, as described in detail later in this article.

State FIPS | County FIPS | Ownership Code | NAICS Code | NAICS Title | Year | Disclosure Code | Establishments | Employment | Wages
08 | 031 | 5 | 31-33 | Manufacturing | 2020 |  | 879 | 19,949 | 1,294,348,649
08 | 031 | 5 | 311 | Food Manufacturing | 2020 |  | 144 | 4,764 | 311,428,643
08 | 031 | 5 | 3111 | Animal Food Manufacturing | 2020 |  | 5 | 401 | 35,051,790
08 | 031 | 5 | 31111 | Animal Food Manufacturing | 2020 |  | 5 | 401 | 35,051,790
08 | 031 | 5 | 311111 | Dog and Cat Food Manufacturing | 2020 | N | 4 | - | -
08 | 031 | 5 | 311119 | Other Animal Food Manufacturing | 2020 | N | 1 | - | -

Other Oddities

In almost all cases, if an industry exists in a geography, the BLS will report the number of establishments, even if it does not report the number of employees or their wages. However, there are a few cases where the BLS will mark a record as non-disclosed (giving it a disclosure code of “N”), which suggests the existence of the industry in the geography, while also reporting an establishment count of 0, which is an apparent contradiction. This phenomenon is typically an artifact of the annualization of the quarterly data. If an establishment in a given industry and geography existed for only a single quarter of a year, then the annual value for that industry in that geography would be 0.25, which then rounds down to 0.

A disclosure code of “N” implies existence, since any industries that do not exist in a given geography will not have records in the raw CEW data (i.e., they will not appear in the table at all). IMPLAN therefore assigns an establishment count of 0.25 to these records and then proceeds as normal with the estimation of the associated employment and wages. An example of records with a disclosure code of “N” (which implies existence) and an establishment count of 0 is shown below for Denver County, CO.

State FIPS | County FIPS | Ownership Code | NAICS Code | NAICS Title | Year | Disclosure Code | Establishments | Employment | Wages
08 | 031 | 5 | 21232 | Sand, Gravel, Clay, and Ceramic and Refractory Minerals Mining and Quarrying | 2020 | N | 0 | 0 | 0
08 | 031 | 5 | 212322 | Industrial Sand Mining | 2020 | N | 0 | 0 | 0
08 | 031 | 5 | 333999 | All Other Miscellaneous General Purpose Machinery Manufacturing | 2020 | N | 0 | 0 | 0
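The annualization artifact and the 0.25 convention described above can be illustrated with a short sketch. The function names are hypothetical and are not taken from IMPLAN's code; the sketch assumes only that the annual establishment count is an average of quarterly counts that is then rounded to a whole number.

```python
def annual_establishments(quarterly_counts):
    """Average the quarterly establishment counts (assumed annualization rule)."""
    return sum(quarterly_counts) / len(quarterly_counts)


def effective_establishments(disclosure_code, published_count):
    """Return the establishment count to use when estimating employment.

    A record flagged "N" exists by definition, so a published count of 0
    is replaced with 0.25 (one establishment present for one quarter).
    """
    if disclosure_code == "N" and published_count == 0:
        return 0.25
    return published_count


raw_average = annual_establishments([1, 0, 0, 0])  # present in one quarter only
published = round(raw_average)                      # rounds down to 0
print(raw_average, published, effective_establishments("N", published))
```

Records that are disclosed, or that already carry a nonzero establishment count, pass through unchanged.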

A more troublesome practice adopted by the BLS is to place some records into County FIPS 999, which is not a real County FIPS code in any state. Without any information on the true county to which these records belong, these records are not used by IMPLAN and are not included in IMPLAN’s CEW data. An example of these records for Colorado is shown in the table below.

State FIPS | County FIPS | Ownership Code | NAICS Code | NAICS Title | Year | Disclosure Code | Establishments | Employment | Wages
08 | 999 | 5 | 213 | Support Activities for Mining | 2020 |  | 12 | 45 | 4,245,724
08 | 999 | 5 | 2131 | Support Activities for Mining | 2020 |  | 12 | 45 | 4,245,724
08 | 999 | 5 | 21311 | Support Activities for Mining | 2020 |  | 12 | 45 | 4,245,724
08 | 999 | 5 | 213111 | Drilling Oil and Gas Wells | 2020 |  | 4 | 20 | 1,646,707
08 | 999 | 5 | 213112 | Support Activities for Oil and Gas Operations | 2020 |  | 8 | 25 | 2,599,017
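As a rough illustration of how records in County FIPS 999 might be set aside, the sketch below separates them from records that carry a real county code. The field names and the record layout are assumptions made for this example; the employment values are taken from the tables in this article.

```python
PLACEHOLDER_COUNTY = "999"  # not a real County FIPS code in any state

records = [
    {"state_fips": "08", "county_fips": "999", "naics": "213111", "employment": 20},
    {"state_fips": "08", "county_fips": "999", "naics": "213112", "employment": 25},
    {"state_fips": "08", "county_fips": "031", "naics": "311", "employment": 4764},
]


def split_by_county_validity(rows):
    """Separate rows with a real county from rows parked in County FIPS 999."""
    usable, unassigned = [], []
    for row in rows:
        if row["county_fips"] == PLACEHOLDER_COUNTY:
            unassigned.append(row)
        else:
            usable.append(row)
    return usable, unassigned


usable, unassigned = split_by_county_validity(records)
unassigned_employment = sum(row["employment"] for row in unassigned)
print(len(usable), "usable record;", unassigned_employment, "employees with no known county")
```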

A similar practice adopted by the BLS is to place some records into NAICS code 99 (and its 3-, 4-, 5-, and 6-digit child NAICS codes of 999, 9999, 99999, and 999999), which is not a real NAICS code and does not contain any information as to the real industries to which the records belong. Because the values for these 99 NAICS codes are included in the total values for “All industries” (NAICS code 10), it is necessary for IMPLAN to estimate values for any non-disclosed records for these 99 NAICS codes to ensure full NAICS roll-up without overestimating the non-disclosed values for other, non-99, NAICS codes. Thus, while IMPLAN does not make use of the estimates for the 99 NAICS codes in the development of the full IMPLAN data sets, IMPLAN does provide estimates for the 99 NAICS codes in its CEW data set. An example of these records is shown for Denver County, CO in the table below. In this example, all the values for these 99 NAICS codes are fully disclosed (note the absence of a disclosure code) and thus did not need to be estimated by IMPLAN; however, in many cases, these values are not disclosed and IMPLAN must make estimates for them using the same data and methods used for estimating the non-disclosed values of regular NAICS codes. High-quality estimates for these 99 NAICS codes help ensure high-quality estimates for the other non-disclosed NAICS codes.

State FIPS | County FIPS | Ownership Code | NAICS Code | NAICS Title | Year | Disclosure Code | Establishments | Employment | Wages
08 | 031 | 5 | 99 | Unclassified | 2020 |  | 19 | 35 | 3,028,272
08 | 031 | 5 | 999 | Unclassified | 2020 |  | 19 | 35 | 3,028,272
08 | 031 | 5 | 9999 | Unclassified | 2020 |  | 19 | 35 | 3,028,272
08 | 031 | 5 | 99999 | Unclassified | 2020 |  | 19 | 35 | 3,028,272
08 | 031 | 5 | 999999 | Unclassified | 2020 |  | 19 | 35 | 3,028,272

 

How IMPLAN Estimates Non-Disclosed Values in the BLS CEW Data

A hierarchy of techniques for assigning a first estimate

This section describes in near full detail the procedures used to obtain a first estimate of non-disclosed CEW employment values. Wage estimates are based on the final employment estimates, using wages-per-employment estimates from prior years’ data or from higher geographic levels.

Raw CEW Data from One Year Prior

If the current year’s CEW value for a given NAICS code and geography is non-disclosed, the first place we turn to is the previous year’s raw CEW data. If it is disclosed, we multiply the current establishment count (which is always disclosed) by 0.9 * the previous year’s employment-per-establishment ratio. Multiplying by 0.9 is done to provide a more conservative first estimate, under the assumption that the current average establishment size is slightly below the previous year’s average establishment size. 
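A minimal sketch of this first-choice estimate follows. The function name and the input values are hypothetical; only the formula (current establishments times 0.9 times the prior year's employment-per-establishment ratio) comes from the text.

```python
CONSERVATIVE_FACTOR = 0.9  # scaling applied to the borrowed ratio, per the text


def estimate_from_prior_year(current_establishments, prior_employment,
                             prior_establishments):
    """First estimate of non-disclosed employment from last year's disclosed record."""
    if prior_establishments <= 0:
        raise ValueError("prior-year establishment count must be positive")
    prior_ratio = prior_employment / prior_establishments
    return current_establishments * CONSERVATIVE_FACTOR * prior_ratio


# Hypothetical inputs: 4 establishments this year; last year's disclosed
# record shows 120 employees across 5 establishments (24 per establishment).
estimate = estimate_from_prior_year(4, 120, 5)
print(round(estimate, 2))
```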

Raw Census Bureau County Business Patterns (CBP) Data (Lagged One Year)

If the previous year’s CEW value is also non-disclosed, the second option is to turn to the CBP data. Like the CEW data, the CBP data cover wage and salary employees only. However, the CBP data do not include government-owned establishments, so they are used only for the estimation of the private-ownership CEW data. IMPLAN does not make use of the CBP wages data since the CBP wages values are two years lagged relative to the IMPLAN data year, while the CBP employment values are only one year lagged relative to the IMPLAN data year.

If the CBP value for the same geography and NAICS code is disclosed[2], then we multiply the lagged CBP employment-per-establishment ratio by the current CEW establishment count for that geography and NAICS code (which is always disclosed). 

Raw CEW Data from Two Years Prior

If both the previous year’s raw CEW value and the (lagged) raw CBP value are non-disclosed, we look to the raw CEW data from two years prior. If it is disclosed, we multiply the current establishment count (which is always disclosed) by 0.9 * the employment-per-establishment ratio from two years prior. Multiplying by 0.9 is done to provide a more conservative first estimate, under the assumption that the current average establishment size is slightly below the average establishment size from two years prior.

Raw CEW Data from a Higher Geographic Level

If all 3 previous options (the previous year’s raw CEW value, the (lagged) raw CBP value, and the raw CEW value from two years prior) are non-disclosed, our final option is to multiply the current establishment count for the geography in question by 0.9 times the parent geography’s current employment-per-establishment ratio. Multiplying by 0.9 is done to provide a more conservative first estimate, under the assumption that the child geography’s average establishment size is slightly below the parent geography’s average establishment size. For states, the parent geography is always the U.S. For counties, the first-choice parent geography is the state, but if the state’s value is also non-disclosed, then the U.S. becomes the parent geography for the county (the U.S. data are fully disclosed for the private ownership code).
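The four-step hierarchy can be sketched as a single fallback chain. This is an illustration under stated assumptions, not IMPLAN's implementation: the function names, the record layout (a dict with "employment" and "establishments" keys, or None when a source is non-disclosed), and the source labels are all hypothetical. The 0.9 factor is applied to the CEW-based ratios only, as in the text.

```python
CEW_FACTOR = 0.9  # conservative scaling applied to borrowed CEW ratios


def ratio(record):
    """Employment per establishment, or None if the record is unusable."""
    if not record or record.get("employment") is None or not record.get("establishments"):
        return None
    return record["employment"] / record["establishments"]


def first_estimate(current_establishments, cew_prior1=None, cbp_lagged=None,
                   cew_prior2=None, parents=(), private=True):
    """Walk the hierarchy in priority order; return (estimate, source label)."""
    candidates = [("CEW t-1", cew_prior1, CEW_FACTOR)]
    if private:
        # CBP excludes government establishments, so it is skipped for them.
        candidates.append(("CBP t-1", cbp_lagged, 1.0))
    candidates.append(("CEW t-2", cew_prior2, CEW_FACTOR))
    # parents is ordered by preference, e.g. the state first, then the U.S.
    candidates.extend(("parent: " + name, rec, CEW_FACTOR) for name, rec in parents)
    for label, record, factor in candidates:
        r = ratio(record)
        if r is not None:
            return current_establishments * factor * r, label
    return None, None


# Values from the Denver County example later in this article.
est_311111, src_311111 = first_estimate(
    4, cbp_lagged={"employment": 235, "establishments": 3})
est_311119, src_311119 = first_estimate(
    1, parents=[("state", {"employment": 538, "establishments": 34})])
print(round(est_311111, 2), src_311111)
print(round(est_311119, 2), src_311119)
```

Returning the source label alongside the estimate is a design choice of this sketch; it makes it easy to audit which rung of the hierarchy produced each value.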

NAICS Roll-Up

NAICS roll-up (that is, ensuring that all 6-digit NAICS codes’ values sum to their 5-digit parent code’s value, that all 5-digit NAICS codes’ values sum to their 4-digit parent code’s value, and so forth up to the total for all industries) is achieved through a series of bottom-up and top-down adjustment routines. Note that in this process, no disclosed CEW value is adjusted in any way; only the estimates for non-disclosed values are adjusted to achieve NAICS roll-up. There are a few special nuances, but the main gist is as follows:

  • Bottom-up: If a parent sector is non-disclosed, set the parent equal to the larger of our initial estimate for the parent or the sum of all the children's values (whether they were disclosed or estimated).
  • Top-down: Subtract any disclosed child sector values from the parent sector’s value (whether the parent sector’s value was disclosed or estimated) to yield a “leftover” value to be distributed to the non-disclosed child sectors. Distribute the leftover value to the non-disclosed child sectors proportionately (i.e., according to each of their proportions of the sum of them all). Stated another way, adjust each non-disclosed child sector’s value proportionately to its share of the sum of all non-disclosed child sector values. A simple example will be shown later in this article.
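The two adjustment directions can be sketched as follows. This is a simplified illustration that ignores the special nuances mentioned above; the function names, the sector labels, and the numbers are hypothetical, and the handling of an all-zero set of estimates is an assumption of this sketch.

```python
def bottom_up(parent_estimate, child_values):
    """For a non-disclosed parent: the larger of its estimate and its children's sum."""
    return max(parent_estimate, sum(child_values))


def top_down(parent_value, disclosed_children, estimated_children):
    """Rescale estimated children so that all children sum to the parent.

    disclosed_children: dict of sector -> published value (never changed)
    estimated_children: dict of sector -> first estimate (rescaled)
    """
    leftover = parent_value - sum(disclosed_children.values())
    total_estimated = sum(estimated_children.values())
    if total_estimated == 0:
        return dict(estimated_children)  # assumption: nothing to apportion against
    return {sector: leftover * value / total_estimated
            for sector, value in estimated_children.items()}


# Hypothetical parent of 100 with one disclosed child (40) and two estimates.
adjusted = top_down(100, {"A": 40}, {"B": 30, "C": 10})
print(adjusted)                       # B and C now share the leftover 60 at 3:1
print(bottom_up(70, [40, 30, 10]))    # children sum to 80, so the parent becomes 80
```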

Geographic Roll-Up

IMPLAN’s CEW data do not roll up geographically at every NAICS level for various reasons, chief among them the BLS’ use of the fake County FIPS 999 and the fake NAICS code 99, as well as the existence of records with a disclosure code of “N” but with establishment counts of 0. These same issues prevent the values for the four ownership types from summing to the values for ownership code 0 for all NAICS codes.

However, IMPLAN does employ a number of strategies to ensure some level of geographic and cross-ownership-type concordance where possible. Among these strategies:

  • Creating minimum values for states, for each NAICS code, calculated by summing the disclosed values across all of the counties in that state, for that NAICS code.
  • Constraining the estimate of any non-disclosed value for NAICS code 10 (All industries) to the value for Ownership Code 0 less any other disclosed values for NAICS code 10 for any other ownership types.
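The two strategies above amount to a floor and a ceiling, sketched below. The function names and the county values are hypothetical. The second example reuses the Denver County NAICS 10 wages from the first table in this article and pretends, purely for illustration, that the private-ownership value were non-disclosed.

```python
def state_minimum(county_values):
    """Floor for a state's value in one NAICS code: sum of disclosed county values.

    Non-disclosed county values are represented as None and skipped.
    """
    return sum(v for v in county_values if v is not None)


def all_industries_cap(total_ownership_value, other_disclosed_values):
    """Ceiling for a non-disclosed NAICS 10 value of one ownership type."""
    return total_ownership_value - sum(other_disclosed_values)


# Hypothetical county employment values; None marks a non-disclosed county.
floor = state_minimum([120, None, 45])

# Denver County, NAICS 10 wages: total less federal, state, and local.
cap = all_industries_cap(
    40_287_746_658,
    [1_212_748_433, 1_034_001_938, 2_781_496_532],
)
print(floor, cap)
```

In this example the ceiling equals the private-ownership wages actually published for Denver County (35,259,499,755), because all four ownership types are disclosed there.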

Examples

A very simple case

State FIPS | County FIPS | Ownership Code | NAICS Code | NAICS Title | Year | Disclosure Code | Establishments | Employment
08 | 031 | 5 | 31-33 | Manufacturing | 2020 |  | 879 | 19,949
08 | 031 | 5 | 311 | Food Manufacturing | 2020 |  | 144 | 4,764
08 | 031 | 5 | 3111 | Animal Food Manufacturing | 2020 |  | 5 | 401
08 | 031 | 5 | 31111 | Animal Food Manufacturing | 2020 |  | 5 | 401
08 | 031 | 5 | 311111 | Dog and Cat Food Manufacturing | 2020 | N | 4 | -
08 | 031 | 5 | 311119 | Other Animal Food Manufacturing | 2020 | N | 1 | -

Consider the very straightforward case shown in the table above. In this case, we need to generate estimates for NAICS codes 311111 and 311119 for Denver County, CO in the 2020 CEW data set, and then adjust those first estimates so that they sum to their parent sector’s value, which in this case is disclosed, making this a very simple case.

  1. The 2019 raw CEW data for this county, ownership code, and two NAICS codes were also non-disclosed, so we are not able to use our first choice in the hierarchy.
  2. The 2019 CBP data set had data for one of these two NAICS codes for this county, as shown below. Therefore, our first estimate for NAICS code 311111 is 235 / 3 * 4 = 313.33.

State FIPS | County FIPS | NAICS Code | Disclosed | Employment | Establishments
08 | 031 | 311111 | True | 235 | 3

  3. The 2018 raw CEW data were non-disclosed for this county and NAICS code 311119; therefore, we were not able to use our third choice in the hierarchy for this NAICS code.
  4. The state-level 2020 CEW data for NAICS code 311119 and ownership code 5 were fully disclosed, as shown below; therefore, our first estimate for this NAICS code and ownership code 5 for Denver County is 0.9 * 538 / 34 * 1 = 14.24.

State FIPS | County FIPS | Ownership Code | NAICS Code | NAICS Title | Year | Disclosure Code | Establishments | Employment
08 | 000 | 5 | 311119 | Other Animal Food Manufacturing | 2020 |  | 34 | 538

  5. NAICS roll-up: Since these are the only two child sectors of the parent sector, we don’t need to subtract any disclosed child sector values from the parent sector’s value to get a leftover value to distribute to the two non-disclosed child sectors; rather, we simply need to distribute the parent sector’s entire disclosed value (401) to these two non-disclosed child sectors proportionate to their first estimates. The sum of our first estimates for these two child sectors is 327.57, while we need them to sum to 401; thus, we’ll need to adjust them upward proportionately. First, we’ll calculate each estimate’s proportion of that sum: 313.33 / 327.57 = 0.9565 and 14.24 / 327.57 = 0.0435. We’ll then multiply the parent sector’s value by these proportions to obtain the final values for the two non-disclosed child sectors:
    • 0.9565 * 401 = 383.57
    • 0.0435 * 401 = 17.43
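The arithmetic in this example can be checked with a few lines of Python. The variable names are illustrative; all of the numbers come from the steps above. The script carries full precision through the calculation rather than rounding the proportions first, and it arrives at the same final values to two decimal places.

```python
parent_value = 401  # disclosed employment for the parent sector, NAICS 31111

first_estimates = {
    "311111": 235 / 3 * 4,         # lagged CBP ratio x current CEW establishments
    "311119": 0.9 * 538 / 34 * 1,  # 0.9 x state CEW ratio x current CEW establishments
}

# Distribute the parent's value in proportion to the first estimates.
total = sum(first_estimates.values())
final_values = {code: parent_value * value / total
                for code, value in first_estimates.items()}

for code, value in final_values.items():
    print(code, round(value, 2))
# 311111 383.57
# 311119 17.43
```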

A Very Complicated Case

The case of manufacturing in Teller County, CO is a much trickier one, with nearly every manufacturing sector in the county being non-disclosed, including the 2-digit super parent sector! In this case, the NAICS roll-up task is much more challenging, with the use of both directions (bottom-up and top-down) being necessary, as well as the need to consider the values of all the other 2-digit NAICS codes, including NAICS code 99. Replicating such a complicated case in Excel can require over 100 rows of data; thus, we will not present the entire replication process for this case here in this article; we include the case here to illustrate just how complicated the process can become.

State FIPS | County FIPS | Ownership Code | NAICS Code | NAICS Title | Year | Disclosure Code | Establishments | Employment
08 | 119 | 5 | 31-33 | Manufacturing | 2020 | N | 17 | 0
08 | 119 | 5 | 311 | Food Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 3119 | Other Food Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 31199 | All Other Food Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 311999 | All Other Miscellaneous Food Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 312 | Beverage and Tobacco Product Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 3121 | Beverage Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 31211 | Soft Drink and Ice Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 312112 | Bottled Water Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 315 | Apparel Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 3159 | Apparel Accessories and Other Apparel Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 31599 | Apparel Accessories and Other Apparel Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 315990 | Apparel Accessories and Other Apparel Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 321 | Wood Product Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 3211 | Sawmills and Wood Preservation | 2020 | N | 1 | 0
08 | 119 | 5 | 32111 | Sawmills and Wood Preservation | 2020 | N | 1 | 0
08 | 119 | 5 | 321113 | Sawmills | 2020 | N | 1 | 0
08 | 119 | 5 | 325 | Chemical Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 3254 | Pharmaceutical and Medicine Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 32541 | Pharmaceutical and Medicine Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 325411 | Medicinal and Botanical Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 327 | Nonmetallic Mineral Product Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 3273 | Cement and Concrete Product Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 32732 | Ready-Mix Concrete Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 327320 | Ready-Mix Concrete Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 332 | Fabricated Metal Product Manufacturing | 2020 |  | 7 | 39
08 | 119 | 5 | 3322 | Cutlery and Handtool Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 33221 | Cutlery and Handtool Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 332215 | Metal Kitchen Cookware, Utensil, Cutlery, and Flatware (except Precious) Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 3324 | Boiler, Tank, and Shipping Container Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 33242 | Metal Tank (Heavy Gauge) Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 332420 | Metal Tank (Heavy Gauge) Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 3327 | Machine Shops; Turned Product; and Screw, Nut, and Bolt Manufacturing | 2020 |  | 4 | 27
08 | 119 | 5 | 33271 | Machine Shops | 2020 |  | 4 | 27
08 | 119 | 5 | 332710 | Machine Shops | 2020 |  | 4 | 27
08 | 119 | 5 | 333 | Machinery Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 3336 | Engine, Turbine, and Power Transmission Equipment Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 33361 | Engine, Turbine, and Power Transmission Equipment Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 333618 | Other Engine Equipment Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 334 | Computer and Electronic Product Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 3344 | Semiconductor and Other Electronic Component Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 33441 | Semiconductor and Other Electronic Component Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 334418 | Printed Circuit Assembly (Electronic Assembly) Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 3346 | Manufacturing and Reproducing Magnetic and Optical Media | 2020 | N | 1 | 0
08 | 119 | 5 | 33461 | Manufacturing and Reproducing Magnetic and Optical Media | 2020 | N | 1 | 0
08 | 119 | 5 | 334613 | Blank Magnetic and Optical Recording Media Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 337 | Furniture and Related Product Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 3371 | Household and Institutional Furniture and Kitchen Cabinet Manufacturing | 2020 | N | 2 | 0
08 | 119 | 5 | 33711 | Wood Kitchen Cabinet and Countertop Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 337110 | Wood Kitchen Cabinet and Countertop Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 33712 | Household and Institutional Furniture Manufacturing | 2020 | N | 1 | 0
08 | 119 | 5 | 337124 | Metal Household Furniture Manufacturing | 2020 | N | 1 | 0

 

Written December 28, 2021

 

[1] https://www.census.gov/NAICS/

[2] Starting with the 2017 CBP data, the Census Bureau changed its confidentiality policies and no longer publishes records (not even establishment counts) for any industry in any geography that has fewer than 3 establishments. This new practice makes a missing record of 1-2 establishments (an existing industry) indistinguishable from a missing record of zero establishments (a non-existent industry). The omission of records with fewer than three establishments, in addition to making it impossible to estimate their employment and income values, also makes it impossible for us to obtain high-quality estimates for the non-disclosed records with three or more establishments, since we are not able to roll up across NAICS levels without having values for sibling sectors. Therefore, beginning with the 2017 CBP data set, IMPLAN no longer estimates values for the non-disclosed CBP data. IMPLAN now uses only the fully disclosed values from CBP and uses other methods where raw CBP data are not available, as described in the main text.