Is county data available in IHIS? Is cancer screening in IPUMS-USA?

Hello, I cannot seem to find a variable for county in the IHIS data.

I can see county as a variable in IPUMS-USA, but that lacks the cancer screening data that IHIS has.

Is there a way to get county data and cancer screening data?

The combined small area estimates site combines BRFSS and NHIS to provide county-level data for recent years, but I am looking for a longer time frame. I cannot seem to find raw NHIS data that has both county and cancer screening.

Please help. Thank you

One of the major limitations of the IHIS data is the geographic detail available in the public use data. In many years, the data identify only census regions and, in some years, a few large metropolitan statistical areas. Researchers can access more geographic detail and add it to an IHIS data extract by working with the staff of the NCHS Research Data Center (RDC). If your research proposal is approved, you can access restricted data (including geographic identifiers) through on-site analysis at an NCHS or Census Research Data Center, via remote access, or with the paid assistance of NCHS RDC staff.

Stuff like that does not fall into your lap. IPUMS staff are just too polite to tell you that :). Also, you may want to distinguish the different data products a bit better. While IPUMS staff make great efforts to unify the interfaces, the underlying data sets are so different in their concepts that there is no real way to combine, say, CPS and NHIS.

Public-use NHIS mostly has only the Census region (Northeast, South, Midwest, West). Data below that level do exist but, as IPUMS staff indicated, are only available in a restricted data access environment. More detailed geography isn’t released, to protect the statistical confidentiality of health data and prevent disclosure.

The county variable that you may have seen in ACS is actually suppressed as well (https://usa.ipums.org/usa-action/vari…). In the case of ACS, you can recover some of the counties from the available public use geography variable called PUMA, the public use microdata area (https://usa.ipums.org/usa/volii/bound…), using GIS tools like MCDC geocorr, but that is unlikely to move your work on cancer screening forward. See above on disparate products.
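To make the PUMA-to-county step concrete, here is a minimal sketch of how a crosswalk with allocation factors can be used to spread weighted person counts from PUMAs to counties. Everything in it is an assumption for illustration: the PUMA codes, the county labels, the weights, and the column name `afact` (taken here to mean the share of a PUMA's population living in a given county) are made up, and a real crosswalk file would need to be checked against its own documentation.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (puma, county, afact).
# afact is assumed to be the share of the PUMA's population that lives
# in that county, so the factors for each PUMA sum to 1.
crosswalk = [
    ("00100", "A", 1.0),   # PUMA 00100 lies entirely in county A
    ("00200", "A", 0.25),  # PUMA 00200 straddles counties A and B
    ("00200", "B", 0.75),
]

# Hypothetical person records: (puma, person_weight).
persons = [("00100", 100.0), ("00200", 80.0), ("00200", 120.0)]


def allocate_to_counties(persons, crosswalk):
    """Spread each person's weight across counties in proportion to afact."""
    shares = defaultdict(list)
    for puma, county, afact in crosswalk:
        shares[puma].append((county, afact))

    totals = defaultdict(float)
    for puma, weight in persons:
        for county, afact in shares[puma]:
            totals[county] += weight * afact
    return dict(totals)


county_totals = allocate_to_counties(persons, crosswalk)
print(county_totals)
```

This only produces approximate county totals for counties that can be pieced together from PUMAs. It does nothing for NHIS, which has no PUMA variable in the public use files.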

BRFSS data indeed have county variables, too, but again these variables are only available in a restricted data access environment.

The combined NHIS+BRFSS data project (http://sae.cancer.gov) was a very cool piece of research for its time that required pooling top guns like Raghunathan and Schenker. These days, I would give a good Bayesian stats post-doc 3-4 months to reproduce everything. I am glad they have updated their SAEs with the newer data; thank you for making me look there again. This is a huge update they have performed between summer 2016, when I last looked there, and now (March 2017). I am sure they have done their best in combining the available data, so my guess is that better data that could be combined in a meaningful way do not exist. Of course, their work was also done behind the wall of an RDC.

Note that the SAE is a sophisticated statistical product. It goes way above and beyond proportions of smokers estimated from survey data, and also invokes statistical modeling with county-level covariates (side panel at https://sae.cancer.gov/nhis-brfss/met…). Again, stuff like that won’t fall into your lap from the IPUMS.org website, as it only deals with raw data. You need to fiddle with those statistical models quite a bit, and this work requires knowledge of generalized linear mixed models, analysis of complex survey data with weights and cluster samples, outlier diagnostics, and model selection (at least that’s my experience in building models like that). And standard errors are just impossible to get when all of these factors are combined, as there are no real methodology papers that combine all these aspects (and Rao’s 2nd edition of the SAE bible does not have that, either).
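To illustrate the gap between a plain survey proportion and a model-based small area estimate, here is a toy sketch. It is not the NCI method: it swaps in a basic area-level (Fay-Herriot style) shrinkage, where the county's direct estimate is pulled toward a regression prediction from county-level covariates. The function names, the data, and both variances are hypothetical, and the variances are treated as known, which sidesteps exactly the hard parts (complex design, model fitting, standard errors) described above.

```python
def weighted_proportion(y, w):
    """Direct survey estimate: weighted share of respondents with y == 1."""
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)


def shrinkage_estimate(direct, var_direct, synthetic, var_model):
    """Fay-Herriot style composite estimate for one area.

    direct     -- direct survey estimate for the area
    var_direct -- sampling variance of the direct estimate (assumed known)
    synthetic  -- prediction from a regression on area-level covariates
    var_model  -- variance of the area random effect (assumed known)
    """
    # gamma near 1 trusts the survey; gamma near 0 trusts the model.
    gamma = var_model / (var_model + var_direct)
    return gamma * direct + (1.0 - gamma) * synthetic


# Hypothetical county sample: screened (1) or not (0), with survey weights.
screened = [1, 0, 1, 1]
weights = [10.0, 30.0, 20.0, 40.0]

direct = weighted_proportion(screened, weights)

# Hypothetical inputs: a noisy direct estimate and a model prediction of 0.5.
composite = shrinkage_estimate(direct, var_direct=0.04,
                               synthetic=0.5, var_model=0.01)
print(direct, composite)
```

With a small county sample the sampling variance dominates, so the composite sits much closer to the model prediction than to the direct estimate. That is why the choice of covariates and model matters so much for estimates like these.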