Apologies, and thank you in advance, for the basic question. I see several related questions, but I am not confident that I am understanding the documentation correctly.
I am using PERWT values to create a ratio of male vs. female workers in a given metro area for a specific OCC job category. For example, 167 males vs. 22 females who work as ‘construction managers’ in Seattle-Tacoma-Bellevue. This is similar to the approach described in this previous post.
My specific question for this part of the analysis is: other than the warnings provided in the link above, such as small sample size warnings, are there any nuances/caveats that I should be aware of when creating ratios of workers by sex in this manner? Does this ratio give an adequate snapshot of the sex composition of workers for that metro area and job title?
Next, I understand that in order to create standard deviations for the above ratio metric, I need to use REPWTP as described here, where the statistic I am interested in is either the # of male workers or the # of female workers for a given metro+job title combination. My presumption here is that the standard deviation would be the same for male and female workers in the same metro+job combination. I would then be able to use this point estimate and standard deviation in a z-test. Does this methodology align with the recommended use of PERWT and REPWTP?
For context, I am using ACS 2022 data, ‘SEX’, ‘MET2013’, and ‘OCC’. I have since added REPWTP as I realized from reading previous posts that this was not included in extracts beyond PERWT.
Thank you again for the assistance, IPUMS is a wonderful org!
The person weight PERWT in IPUMS USA adjusts estimates so that they are nationally representative, accounting for the differential probability of each individual being included in the ACS sample. The ACS sampling design uses a complex geography-based method to select respondent households, so in order to estimate empirically derived standard errors, you must also use replicate weights. Replicate weights generally increase the standard error of an estimate. Your understanding of how to use PERWT and the replicate weights seems correct.
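As a rough illustration of the workflow discussed above, here is a minimal Python sketch of a replicate-weight standard error. It assumes the extract has already been filtered to one metro area and one OCC code, that each record is a plain dict holding SEX (1 = male, 2 = female), PERWT, and the 80 replicate weights REPWTP1 through REPWTP80, and that the ACS successive-difference formula SE = sqrt(4/80 * sum of squared deviations) applies. The function names are hypothetical, not part of any IPUMS tool. Each statistic (male count, female count, or male share) is passed in separately and gets its own standard error.

```python
import math

N_REPLICATES = 80  # the ACS supplies REPWTP1..REPWTP80


def weighted_count(rows, weight_key, sex_code):
    """Sum one weight column over the records with the given SEX code."""
    return sum(r[weight_key] for r in rows if r["SEX"] == sex_code)


def male_share(rows, weight_key):
    """Weighted share of male workers (SEX == 1) among all records."""
    males = weighted_count(rows, weight_key, 1)
    females = weighted_count(rows, weight_key, 2)
    return males / (males + females)


def replicate_se(full_estimate, replicate_estimates):
    """ACS-style standard error: sqrt(4/80 * sum((rep - full)^2))."""
    if len(replicate_estimates) != N_REPLICATES:
        raise ValueError("expected 80 replicate estimates")
    squared = sum((rep - full_estimate) ** 2 for rep in replicate_estimates)
    return math.sqrt(4.0 / N_REPLICATES * squared)


def estimate_with_se(rows, statistic):
    """Evaluate `statistic` with PERWT, then once per replicate weight."""
    full = statistic(rows, "PERWT")
    reps = [statistic(rows, f"REPWTP{r}") for r in range(1, N_REPLICATES + 1)]
    return full, replicate_se(full, reps)
```

For example, `estimate_with_se(rows, male_share)` would return the weighted male share and its standard error, while `estimate_with_se(rows, lambda rs, w: weighted_count(rs, w, 1))` would do the same for the male count.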
While IPUMS User Support can provide guidance on IPUMS data and documentation, we cannot provide analytical advice, which seems to be what you are most interested in. As you have noted, small sample sizes can pose an issue and result in large standard errors.
Note that metropolitan areas are not identified explicitly in the public use ACS microdata. The smallest identifiable geographic area in the public use microdata is the public use microdata area (PUMA). IPUMS geographers are able to use PUMA to identify the metro area of residence for some metro areas. However, MET2013 includes match errors, and does not identify all metro areas. You can read more about this in the description of MET2013.
You may be interested in using PWMET13 instead of MET2013, depending on the goal of your analysis. PWMET13 reports the metro area of work for each working respondent. This variable is subject to the same identification limitations as MET2013.
You may also want to look into IPUMS NHGIS, which provides summary data from the U.S. decennial census and American Community Survey, aggregated at a variety of geographic levels, including metropolitan statistical area. Table B24010, Sex by Occupation for the Civilian Employed Population 16 Years and Over, would likely be the most relevant table for you. The summary data provided by IPUMS NHGIS are not subject to geography identification limitations like the microdata. This brief video tutorial walks through how to get started with the NHGIS data finder.