NYC Poverty data and ACS comparison


I downloaded the NYC Poverty data from:

It uses the same identifier as ACS, so serialno in the NYC Poverty data corresponds to cbserial in ACS. After merging the data (using year + identifier), I realized that many observations in the NYC data cannot be found in the ACS data.

As far as I know, they only used ACS sources, so I wonder whether they used some ACS households that are NOT released on the IPUMS ACS website but are perhaps available in restricted-use data. Or am I confused about this common identifier? (For the families that do match, I checked and everything is the same.)

OK, I have now realized that ACS uses a different numbering system for households starting in 2017. Once you adjust for that, it can be merged with the NYC Poverty data directly.

It sounds like you’ve already found a solution to your problem, but in case not, here are my notes on how to do this merge.

I was able to create a 1:1 merge between the 2018 NYCgov Poverty Measure Data and the 2018 ACS IPUMS sample at both the household and person level. To merge at the household level, I created a variable equivalent to CBSERIAL in the NYCgov dataset by adding “20180” to the beginning of SERIALNO (while maintaining leading zeros); this should result in a 13-digit numeric variable. Also make sure to keep one record per household by subsetting the ACS dataset to CBPERNUM==1 and the NYC dataset to SPORDER==1. Merging at the person level required creating a new person-level serial variable in both datasets by concatenating CBSERIAL and CBPERNUM (for the ACS dataset) and CBSERIAL and SPORDER (for the NYC dataset). The tricky part with these concatenations is ensuring that leading zeros are kept, so that the result is a 15-digit unique identifier.
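In case it helps, here is a rough Python sketch of the key construction described above. It is only an illustration: the function names are made up, and the padding widths (8 digits for SERIALNO, 2 digits for the person number) are assumptions inferred from the 13-digit and 15-digit totals, so please check them against your own extract. Keys are kept as strings so that leading zeros cannot be dropped.

```python
def make_cbserial(serialno, year=2018, width=8):
    """Build a 13-digit CBSERIAL-style key from an NYCgov SERIALNO.

    Assumption: SERIALNO is zero-padded to `width` digits, so that
    "20180" + SERIALNO comes out at 13 digits for a 4-digit year.
    """
    digits = str(serialno).strip()
    if not digits.isdigit() or len(digits) > width:
        raise ValueError(f"unexpected SERIALNO value: {serialno!r}")
    # Restore any leading zeros lost when the file was read as numeric.
    return f"{year}0{digits.zfill(width)}"


def make_person_id(cbserial, person_number):
    """Append a 2-digit person number (CBPERNUM or SPORDER) to CBSERIAL."""
    return f"{cbserial}{int(person_number):02d}"


# Hypothetical values, for illustration only.
household_key = make_cbserial("1234")           # NYCgov side
person_key = make_person_id(household_key, 3)   # SPORDER == 3
print(household_key, person_key)
```

On the ACS side, the same `make_person_id` helper would be called with CBSERIAL and CBPERNUM, so that both datasets end up with an identically formatted person key.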

If you aren’t able to create the merge with the fix you mentioned or using these tips, please provide more information, including your merge code and the scale of differences in the numbers you are seeing, so that I can help troubleshoot the issue.
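One quick way to summarize the scale of the differences is to count how many keys appear in both files and how many appear in only one. The sketch below assumes you have already built comparable string keys in each dataset; `match_report` is a hypothetical helper name and the keys shown are made-up values, not real records.

```python
from collections import Counter


def match_report(acs_keys, nyc_keys):
    """Count matched and unmatched merge keys, plus duplicated keys."""
    acs_counts, nyc_counts = Counter(acs_keys), Counter(nyc_keys)
    acs, nyc = set(acs_counts), set(nyc_counts)
    return {
        "matched": len(acs & nyc),
        "acs_only": len(acs - nyc),
        "nyc_only": len(nyc - acs),
        # Duplicates would break a 1:1 merge, so report them as well.
        "acs_duplicates": sum(1 for n in acs_counts.values() if n > 1),
        "nyc_duplicates": sum(1 for n in nyc_counts.values() if n > 1),
    }


# Made-up keys, for illustration only.
acs_keys = ["2018000000001", "2018000000002", "2018000000003"]
nyc_keys = ["2018000000002", "2018000000003", "2018000000004"]
report = match_report(acs_keys, nyc_keys)
print(report)
```

A large `nyc_only` count would suggest the keys are still formatted differently (for example, lost leading zeros) rather than that the households are missing from the public ACS file.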
