Hi there - I’m working with 1900 Census data and struggling with the CHBORN variable. All the codebooks indicate that there should be a first category of “N/A” coded as “00”; however, when I download data, I never get this category. The first code is “1”, which according to the codebook should be “no children”.
I’m working with the 100% file, so I’d expect the count for the N/A category to be greater than zero. I’ve used the IPUMS online table generator (which uses the 1900 5% sample), and it does yield an N/A category, which seems to confirm my suspicion that there should be some N/As (coded 00) in the 100% 1900 download.
I am subsetting the 100% sample by SEX (2), AGE (15-49), RACE (1,2), STATEICP (several states), LINK1900 (1), and LINK1910 (1). I have tried removing all of these filters, but I still do not get the N/A (00) category for CHBORN. I’ve also explored whether this discrepancy can be explained by other variables, like MARST, but no luck.
I’ve downloaded the data in different formats to make sure there wasn’t an error with how I was reading the data into RStudio. No difference between .dat and .csv downloads.
I originally noticed this because I’m also working with the 1910 Census, which does include an N/A category (coded “00”), which made me worry something wasn’t matching up between these years.
So the online table generator shows that N/As are part of the 1900 data (at least in the 5% sample), but they are missing from my 100% download. Any thoughts on why this is happening?
My only hunch comes from the data quality flag QCHBORN: a very large number of records in the “no children” (1) category are coded as QCHBORN=4 (hot deck allocation by IPUMS). Could these be the missing N/As?
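A minimal sketch of the cross-tabulation behind this hunch, assuming a .csv extract with CHBORN and QCHBORN columns. It is written in Python rather than R, the function name `tabulate_codes` is hypothetical, and the rows are made up for illustration (they are not real IPUMS data). Codes are parsed as integers so that “00” and “0” count as the same category however the file was read in.

```python
import csv
import io
from collections import Counter


def tabulate_codes(csv_file, var="CHBORN", flag="QCHBORN"):
    """Count rows by (var, flag), parsing codes as integers so '00' and '0' match."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        counts[(int(row[var]), int(row[flag]))] += 1
    return counts


# Made-up rows for illustration only, not real IPUMS data.
sample = io.StringIO("CHBORN,QCHBORN\n00,0\n01,4\n01,4\n01,0\n03,0\n")
counts = tabulate_codes(sample)

# Print one line per (CHBORN, QCHBORN) combination that occurs in the data.
for (chborn, qflag), n in sorted(counts.items()):
    print(f"CHBORN={chborn:02d} QCHBORN={qflag} n={n}")
```

If code 00 never appears in the output for the real extract while code 01 is concentrated under QCHBORN=4, that would be consistent with the hunch above, though it would not confirm it.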
Any advice would be much appreciated - thank you!