Replicating BLS employment levels with IPUMS CPS

Hi, I am trying to replicate BLS employment levels for specific industries. I used this table (https://usa.ipums.org/usa/volii/indcr…) to create a dummy variable for the NAICS categories. I want to compare the 2014 employment numbers in my data to this table provided by BLS: https://www.bls.gov/opub/ee/2015/cps/…. This is my code in Stata:

tab NAICS if year == 2014 [iw=wtfinl]

NAICS is the dummy variable I created from IND for the NAICS industry codes. Although I get the same percentage distribution across industries as BLS does, the levels I get are far too large. For example, for mining, the frequency I get with my data is 13,833,783, but the BLS employment level for mining in 2014 is 1,088,000. Any idea why this might be happening?

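I think the issue may be that you are summing weighted frequencies across all of the monthly samples in 2014. WTFINL is scaled so that each monthly sample on its own represents the population for that month, so pooling twelve months gives totals roughly twelve times too large. I believe the BLS report you linked gives annual averages. Try computing an annual average; I think your numbers will be much closer.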
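One way to compute the annual average is sketched below. It assumes the extract contains all twelve monthly samples for 2014 and that NAICS, year, and wtfinl are the variables from the question; wt_avg is a hypothetical name introduced only for this illustration. As a rough check on the arithmetic, 13,833,783 / 12 is about 1.15 million, which is the same order of magnitude as the BLS figure.

```stata
* Sketch only: annual-average weighted counts by industry for 2014.
* Assumption: all 12 monthly samples for 2014 are pooled in the data.
* wt_avg is a hypothetical variable name, not part of the IPUMS extract.
generate double wt_avg = wtfinl / 12

* Each monthly sample now contributes 1/12 of its weighted total,
* so the tabulated frequencies are averages over the 12 months.
tabulate NAICS if year == 2014 [iw = wt_avg]
```

If some months are missing from the extract, the divisor should be the number of months actually present rather than 12.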

Thank you, that worked! Before you answered, I tried what CEPR suggests for their CPS ORG data (here: http://ceprdata.org/cps-uniform-data-…), which is to generate a new weight by taking wtfinl, dividing it by 12, and rounding to the nearest integer. Would that also work?

Thank you!

Yes, that should also work. The frequencies from the two methods will differ slightly because of the rounding, but the difference should be very small.
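For reference, the rounded-weight approach described in the question might look like the sketch below. It follows only the description given in this thread (divide wtfinl by 12 and round to the nearest integer); wt12 is a hypothetical variable name, and whether this matches CEPR's own code exactly is an assumption.

```stata
* Sketch only: integer weight built from wtfinl, as described in the question.
* wt12 is a hypothetical variable name.
* round(x) with one argument rounds to the nearest integer.
generate long wt12 = round(wtfinl / 12)

* Integer weights can be used as frequency weights.
tabulate NAICS if year == 2014 [fw = wt12]
```

Because each observation's weight is rounded before summing, the totals can differ slightly from those obtained with the unrounded wtfinl / 12.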