For ACS data, I use BPL > 99 as the condition to select immigrants and (AGE - YRSUSA1) to get their age at immigration. However, I get a lot of negative values, such as -6. How could this be? Is there a more accurate way to do this?
I recommend taking a look at this post, as this issue has been brought up in the past. In short, YRSUSA1 in the ACS is derived from the year of immigration (YRIMMIG), which can easily cause a difference of 1 year depending on the respondent's age and when they responded to the ACS. The other cases are likely related to aggregation: the oldest years of entry are binned together, and this binning is not reflected in the labels for YRSUSA1. In the 2019 ACS, for example, everyone who entered the US in 1929 or earlier has their year of entry set to 1929. You can see how these years of entry are binned by taking a look at US2019A_YOEP and the other USXXXXA_YOEP variables.
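If it helps to see how many cases fall into each group, here is a minimal Python sketch. It assumes you have AGE and YRSUSA1 available as pairs of integers; the function names, the 1-year tolerance (taken from the 1-year difference described above), and the sample values are all made up for illustration and are not real ACS records.

```python
from collections import Counter


def age_at_immigration(age, yrsusa1):
    """Implied age at immigration: AGE minus years in the US (YRSUSA1)."""
    return age - yrsusa1


def classify(diff, tolerance=1):
    """Label an implied age at immigration.

    tolerance is an assumed cutoff: negatives no larger than this are
    treated as the expected 1-year timing difference.
    """
    if diff >= 0:
        return "ok"
    if diff >= -tolerance:
        return "within_tolerance"
    return "inconsistent"


def summarize(records, tolerance=1):
    """Count how many (AGE, YRSUSA1) pairs fall into each label."""
    counts = Counter()
    for age, yrsusa1 in records:
        diff = age_at_immigration(age, yrsusa1)
        counts[classify(diff, tolerance)] += 1
    return counts


# Hypothetical (AGE, YRSUSA1) pairs, not taken from any ACS sample.
sample = [(40, 15), (30, 31), (90, 96), (25, 25)]
summary = summarize(sample)
print(dict(summary))
```

How you treat the cases labelled "inconsistent" (drop them, recode them, or keep them) is a judgment call that depends on your analysis; the sketch only counts them.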
Thanks for your reply. I counted the values below -2, and there are only a few hundred of them, so the error is acceptable.