I was trying to understand the relationship between the NHGIS data and the IPUMS USA and CPS data. Are all of the data variables from USA and CPS also available in the NHGIS datasets?
I was reading through the documentation, and I understand that the metadata for the IPUMS USA and CPS datasets are more detailed, while the NHGIS data draws its codebook from a different source. However, I was not sure whether all of the data in the USA and CPS datasets are also available through the NHGIS extracts. Could someone please clarify?
Thank you.
I will provide you with some information about the three data collections you are interested in: IPUMS USA, IPUMS CPS, and IPUMS NHGIS. These are three distinct data collections. While some of the source data for IPUMS USA and IPUMS NHGIS are the same, there is no overlap between data offered on IPUMS USA and IPUMS CPS, or between data offered on IPUMS CPS and IPUMS NHGIS.
IPUMS CPS provides microdata from the Current Population Survey from 1962 to 2024. The Current Population Survey is a monthly panel survey of U.S. households. Microdata are individual-level data, meaning each row or observation is a household or person, and each column is a variable that gives information about that person or household. You can read more general information about IPUMS CPS here.
IPUMS USA provides microdata from the U.S. decennial census from 1790 to 2020 and the American Community Survey from 2001 to 2022. You can read more general information about IPUMS USA here and view a list of samples available from IPUMS USA here. There is no overlap between the CPS and the data provided on IPUMS USA; the CPS is distinct from the ACS and the decennial census. However, IPUMS does make an effort to use common variable names in IPUMS USA and IPUMS CPS.
IPUMS NHGIS provides summary data from the U.S. decennial census, the American Community Survey, and other data sources such as Vital Statistics and the Census of Agriculture. IPUMS NHGIS does not provide any data from the CPS. The data available range from 1790 to 2022. Summary data are different from microdata. Each row is a specific geographic area, such as a particular county, state, or school district. Each column is a variable that gives information about that particular area, such as its population or the number of households within different income brackets. There are no common variable names between IPUMS USA and IPUMS NHGIS because the data are structured differently (NHGIS has no person-level variables). Summary data are available at very fine geographic levels, which is not true of microdata: privacy considerations prevent person-level data from identifying very small geographic areas.
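To make the structural difference concrete, here is a minimal Python sketch that collapses a toy person-level table (microdata) into a one-row-per-area table (summary data). Everything in it is an assumption for illustration: the values are made up, and the column names (`person_id`, `county`, `age`, `weight`) and the function `summarize_by_area` are hypothetical, not actual IPUMS variable names or tools.

```python
from collections import defaultdict

# Toy microdata: one row per person. Values and column names are invented
# for illustration and are not actual IPUMS variables.
microdata = [
    {"person_id": 1, "county": "A", "age": 34, "weight": 100},
    {"person_id": 2, "county": "A", "age": 61, "weight": 150},
    {"person_id": 3, "county": "B", "age": 12, "weight": 120},
    {"person_id": 4, "county": "B", "age": 45, "weight": 80},
    {"person_id": 5, "county": "B", "age": 70, "weight": 50},
]


def summarize_by_area(rows):
    """Collapse person-level rows into one row per geographic area."""
    totals = defaultdict(lambda: {"population": 0, "population_65_plus": 0})
    for row in rows:
        area = totals[row["county"]]
        # Each sampled person stands in for `weight` people in the population.
        area["population"] += row["weight"]
        if row["age"] >= 65:
            area["population_65_plus"] += row["weight"]
    # Summary data: one row per area, one column per aggregate count.
    return [{"county": county, **counts} for county, counts in sorted(totals.items())]


summary = summarize_by_area(microdata)
for row in summary:
    print(row)
```

The person-level columns (`person_id`, `age`) disappear in the result and only area-level counts remain, which is why the two kinds of data do not share variable names. In practice you generally cannot rebuild fine-grained summary tables this way from public microdata, because the microdata do not identify very small geographic areas.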
@Isabel_Pastoor Thank you so much for this detailed response. Yes, this is exactly what I was looking for. It confirms what I was thinking, and it helps to have that confirmation.
Thanks again.