My results from counting lines by year and then summing are very different from this, and I do not know why. All the samples I have checked have 25,404,662 lines.
The only thing that I have been able to think of is that maybe it is a product of sample selection. All my samples include the IPUMS-CPS automatically selected default samples, and also three Supplement Topics: Education, Fertility and Marriage, and Voter.
Does the Latino oversample increase the number of lines (above the numbers given on the web page you reference)?
Does the use of the IPUMS-CPS default samples that are automatically added from months other than March increase the number of lines?
Does the addition of Supplement Topics to the sample increase the number of lines?
If not, do you have any other suggestions as to why my results might be different, or diagnostic tests you would recommend?
If the difference is the result of sample choices such as those above, would I be correct in assuming that the proper response is to make them go away with some merger procedure? None of these other samples involve the addition of any more households or people, correct? Though I suppose there could be persons in sampled households who are not present in either March. Do you have code for such a merger already available, or should I write it myself based on Drew, Flood and Warren?
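
On the diagnostic question, one simple check would be to tabulate lines by year and month rather than by year alone, so that any lines coming from non-March samples show up separately. Below is a minimal sketch of that tabulation. It assumes the extract has been exported to CSV and contains the variables YEAR and MONTH; the sample rows and the function names are made up for illustration and are not taken from any IPUMS tool.

```python
import csv
import io
from collections import Counter


def count_by_year_month(rows):
    """Count lines for each (YEAR, MONTH) pair.

    `rows` is any iterable of dict-like records with YEAR and MONTH keys
    (assumed column names for the exported extract).
    """
    counts = Counter()
    for row in rows:
        counts[(int(row["YEAR"]), int(row["MONTH"]))] += 1
    return counts


def count_by_year(year_month_counts):
    """Collapse the (YEAR, MONTH) counts to one total per year."""
    totals = Counter()
    for (year, _month), n in year_month_counts.items():
        totals[year] += n
    return totals


# Made-up rows standing in for a real extract file.
sample_csv = """YEAR,MONTH,CPSIDP
2000,3,1
2000,3,2
2000,11,1
2001,3,3
"""

rows = csv.DictReader(io.StringIO(sample_csv))
by_year_month = count_by_year_month(rows)
by_year = count_by_year(by_year_month)
total = sum(by_year.values())

for (year, month), n in sorted(by_year_month.items()):
    print(year, month, n)
print("total lines:", total)
```

If the March-only counts from a table like this match the published numbers while the yearly totals do not, that would point to the extra monthly samples as the source of the difference.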