I’ve been trying to compute summary statistics on income across different years.
However, I ran into several issues:
For 1950, a net loss is recorded as “-1” in INCBUSFM, INCOTHER, and INCTOT regardless of the actual amount of the loss. For instance, even if INCWAGE was 550, INCTOT is recorded as -1 whenever the net loss in INCBUSFM is larger than INCWAGE, so the actual amount of the loss is unknown.
In some cases, for instance in 1950, INCTOT does not match the sum of INCWAGE, INCBUSFM, and INCOTHER. As far as I can tell, this is because the values involved are codes rather than the actual amounts of INCWAGE, INCBUSFM, and INCOTHER. This is a problem when we want to work out what percentage of INCTOT each of INCWAGE, INCBUSFM, and INCOTHER accounts for.
Each year also has a top code, such as $10,000 for 1950, which means that even if INCWAGE, INCBUSFM, or INCOTHER actually exceeded $10,000, it is recorded as 10,000, which doesn’t reflect the true value.
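To make the three cases concrete, here is a minimal sketch of how such records could be flagged. It assumes the extract is loaded into a pandas DataFrame with the columns named above; the -1 loss code and the 10,000 top code are the 1950 values described above, the helper name `flag_income_issues` is hypothetical, and the sample rows are made up for illustration (any N/A codes in a real extract would need to be dropped first, which is not handled here).

```python
import pandas as pd

NET_LOSS_CODE = -1       # 1950 code for a net loss, as described above
TOP_CODE_1950 = 10_000   # 1950 top code, as described above
COMPONENTS = ["INCWAGE", "INCBUSFM", "INCOTHER"]


def flag_income_issues(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: add one boolean flag per issue."""
    out = df.copy()
    cols = COMPONENTS + ["INCTOT"]
    # Net loss: some income field holds the loss code instead of an amount.
    out["has_loss_code"] = out[cols].eq(NET_LOSS_CODE).any(axis=1)
    # Top code: some income field sits at (or above) the top code.
    out["has_top_code"] = out[cols].ge(TOP_CODE_1950).any(axis=1)
    # Mismatch: INCTOT differs from the sum of the recorded components.
    out["sum_mismatch"] = out[COMPONENTS].sum(axis=1).ne(out["INCTOT"])
    return out


# Made-up rows for illustration only, not real records.
sample = pd.DataFrame({
    "YEAR": [1950, 1950, 1950],
    "INCWAGE": [550, 3000, 10000],
    "INCBUSFM": [-1, 0, 0],
    "INCOTHER": [0, 200, 0],
    "INCTOT": [-1, 3200, 10000],
})
flagged = flag_income_issues(sample)

# Component shares of INCTOT, restricted to rows with none of the flags set.
problem = flagged[["has_loss_code", "has_top_code", "sum_mismatch"]].any(axis=1)
clean = flagged[~problem]
shares = clean[COMPONENTS].div(clean["INCTOT"], axis=0)
print(flagged[["has_loss_code", "has_top_code", "sum_mismatch"]])
print(shares)
```

Restricting the shares to unflagged rows avoids dividing by a coded INCTOT, but it also drops exactly the losses and high incomes in question, so this only shows how many records are affected rather than solving the problem. Rows where INCTOT is 0 would also need separate handling.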
These are the limitations I’ve found while working with the data, and I was wondering whether there are any alternatives or workarounds for these issues.
Thanks for your help.