Creating a Lorenz Curve Using IPUMS Data

I am looking into analyzing the income difference within racial groups, but I am unsure how to get the data I need to create a Lorenz curve (and from that, the Gini coefficient). Any help would be appreciated!


Look under the “Person” tab at the top of the page. There, you’ll find sections for income and race.

A Lorenz curve can be built at either the household level or the individual level. For the individual level, use the variable INCTOT, which captures an individual’s total income; the associated weight is PERWT. For the household level, use HHINCOME, which contains the sum of all individuals’ incomes in the household; the associated weight is HHWT. For household-level analysis, keep only one row of data per household (for example, keep only rows with PERNUM=1).
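
As a rough illustration of the two setups, here is a minimal Python sketch. The rows are made-up values, not real IPUMS data, and the helper names (person_level, household_level) are hypothetical. It also assumes 9999999 is the N/A code for INCTOT and HHINCOME, which you should confirm in the codebook for your extract.

```python
# Made-up example rows; in practice these would be read from your IPUMS extract.
NA_CODE = 9999999  # assumed N/A code for INCTOT / HHINCOME; verify in the codebook

rows = [
    {"SERIAL": 1, "PERNUM": 1, "INCTOT": 40000, "PERWT": 100, "HHINCOME": 65000, "HHWT": 95},
    {"SERIAL": 1, "PERNUM": 2, "INCTOT": 25000, "PERWT": 110, "HHINCOME": 65000, "HHWT": 95},
    {"SERIAL": 1, "PERNUM": 3, "INCTOT": NA_CODE, "PERWT": 105, "HHINCOME": 65000, "HHWT": 95},
    {"SERIAL": 2, "PERNUM": 1, "INCTOT": 30000, "PERWT": 80, "HHINCOME": 30000, "HHWT": 80},
]


def person_level(records):
    """Return (income, weight) pairs for individuals: INCTOT with PERWT."""
    return [
        (r["INCTOT"], r["PERWT"])
        for r in records
        if r["INCTOT"] != NA_CODE  # drop persons with no reported income universe
    ]


def household_level(records):
    """Return (income, weight) pairs for households: HHINCOME with HHWT.

    Keeps only the PERNUM == 1 row so each household is counted once.
    """
    return [
        (r["HHINCOME"], r["HHWT"])
        for r in records
        if r["PERNUM"] == 1 and r["HHINCOME"] != NA_CODE
    ]


persons = person_level(rows)
households = household_level(rows)
```

To look within a racial group, you would filter the rows on the race variable before calling either helper.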

Quantiles can be a little tricky with weighted data, so it is easiest to use a command that accepts weights directly. For example, see the Stata module -lorenz-. I imagine there are similar commands in other statistical packages.
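
If you would rather compute it by hand, the weighted version is not too bad. Below is a minimal Python sketch, not the method used by -lorenz-: it sorts by income, accumulates weighted population and income shares, and gets the Gini coefficient from the area under the piecewise-linear Lorenz curve (trapezoid rule). The function names are my own, and it assumes total weighted income is positive; negative incomes, which INCTOT allows, will pull the curve below zero.

```python
def weighted_lorenz(incomes, weights):
    """Return (pop_share, inc_share) points of the weighted Lorenz curve.

    Both lists start at 0.0 and end at 1.0. Assumes positive weights and a
    positive weighted income total.
    """
    pairs = sorted(zip(incomes, weights))  # poorest first
    total_w = sum(w for _, w in pairs)
    total_y = sum(y * w for y, w in pairs)
    pop_share, inc_share = [0.0], [0.0]
    cum_w = 0.0
    cum_y = 0.0
    for y, w in pairs:
        cum_w += w          # weighted count of units so far
        cum_y += y * w      # weighted income held by those units
        pop_share.append(cum_w / total_w)
        inc_share.append(cum_y / total_y)
    return pop_share, inc_share


def gini_from_lorenz(pop_share, inc_share):
    """Gini = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule."""
    area = 0.0
    for i in range(1, len(pop_share)):
        width = pop_share[i] - pop_share[i - 1]
        area += width * (inc_share[i] + inc_share[i - 1]) / 2.0
    return 1.0 - 2.0 * area


# Toy example: a weight of 2 on the first unit stands for two identical units.
pop, inc = weighted_lorenz([1, 3], [2, 1])
gini = gini_from_lorenz(pop, inc)
```

Plotting pop against inc gives the Lorenz curve itself; the 45-degree line is perfect equality.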

I would also just note two things:

  1. Sometimes Lorenz curves are constructed based on wealth, sometimes income. The ACS will only be able to measure income.

  2. Income in the ACS is topcoded. This can substantially affect your estimate of the Lorenz curve, especially when a large fraction of total income is earned at the very top of the distribution. Because topcoding understates the highest incomes, the estimated Gini coefficient will be biased downward.
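
To get a feel for the topcoding point, here is a small Python sketch with made-up incomes. It uses a simple cap as a stand-in for topcoding; the actual ACS procedure may differ, and the cap value and function names are purely illustrative.

```python
def gini(incomes):
    """Unweighted Gini from the piecewise-linear Lorenz curve (trapezoid rule)."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    area = 0.0
    prev_share = 0.0
    cum = 0.0
    for y in ordered:
        cum += y
        share = cum / total
        area += (1.0 / n) * (share + prev_share) / 2.0
        prev_share = share
    return 1.0 - 2.0 * area


def cap_incomes(incomes, cap):
    """Replace every income above `cap` with `cap` (a crude topcode)."""
    return [min(y, cap) for y in incomes]


raw = [10, 20, 30, 40, 1000]        # made-up incomes with one very high earner
capped = cap_incomes(raw, 100)      # hypothetical topcode at 100

gini_raw = gini(raw)
gini_capped = gini(capped)
print(round(gini_raw, 3), round(gini_capped, 3))
```

In this toy case the capped data give a noticeably lower Gini than the raw data, which is the direction of bias described in point 2.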