Us2023a_pincp and problems with numeric variables and labels on stata

emma_canelli · November 20, 2025, 3:10pm

I am using Stata for a university project, and one of the main variables I intend to use is us2023a_pincp. I downloaded the variable and converted the string values into numeric values, but Stata now treats the real income as the label of each observation, while what should be the label is treated as the income value. For example, if I ask Stata to sum two incomes, the program sums 1863+6512, while the real incomes are 22000+45000. In the data browser, the real incomes are shown as 7-digit values (such as 0002000 or 0011000), and what should be the label is used as the value in operations. The maximum shown in a summary table is 21942, which is labeled BBBBBBB, meaning “N/A”, further evidence that labels and values are inverted. How can I solve this? It says to use ADJINC, but I have no idea what that is and cannot find help for it.

Ivan_Strahof · November 25, 2025, 5:10pm

I was unable to replicate your issue with summing US2023A_PINCP and did not observe any labels and values being inverted. I hope that explaining my process and sharing suggestions below will help you troubleshoot the issue. You may also find it useful to review our Data Training Exercises, which include sample problems and code that is intended to familiarize users with IPUMS data formats. We also offer short online tutorials that walk through the essentials of using IPUMS USA.

To replicate your issue, I first downloaded your extract #5 as well as the corresponding Stata .do file (right-click on STATA under the Command Files column for this extract and select “Save link as…” to save the .do file). The .do file is required to load a fixed-width (.dat) data extract into Stata. Note that you may also request your extract as a Stata .dta file (rather than a fixed-width file), which can be opened directly in Stata without a .do file. This option is available on the Extract Request page (under Data Format) immediately before you submit your extract for processing.
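As a rough sketch of the two loading routes described above: the file names below are hypothetical (they assume the extract files are called usa_00005), so substitute the names of your own files. Only one of the two options is needed.

```stata
* Hypothetical file names: replace usa_00005 with the name of your own extract files.

* Option 1: fixed-width extract. Run the .do file from the folder that also holds the .dat file.
do usa_00005.do

* Option 2: extract requested in Stata format. Open the .dta file directly.
use usa_00005.dta, clear

* Either way, check how the income variable was stored (string or numeric).
describe us2023a_pincp
```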

Your extract #5 contains the 2023 American Community Survey (ACS) individual-level microdata where each observation is a person enumerated in this ACS sample. Each observation includes data for some variables that are automatically added to your extract and others that you have individually selected. This includes the variable US2023A_PINCP, which is the source variable released by the Census Bureau that reports total personal income in the 2023 ACS.

IPUMS USA offers the harmonized variable INCTOT, which integrates all of the sample-specific source variables reporting total personal income into a single variable. If you were to have multiple ACS samples in your extract (e.g., the 2022 and 2023 ACS), adding the variable INCTOT is essentially adding both US2023A_PINCP and US2022A_PINCP; the former is used for INCTOT for 2023 survey participants and the latter for 2022 participants. Additionally, INCTOT applies consistent coding across the integrated source variables. For example, 9999999 is always used to code N/A cases, rather than mnemonics such as BBBBBBB that can change between samples (this is noted in the codes tab for INCTOT). While the source variables are helpful for tracking down changes in reporting by the Census Bureau, we recommend that researchers use the harmonized variables for data analyses.
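Because INCTOT stores N/A as the number 9999999, that code needs to be set to missing before computing sums or means. The following is a minimal sketch on made-up toy data, not on the actual extract; check the codes tab for INCTOT for any other special codes that apply to your samples.

```stata
* Toy data standing in for an extract that contains INCTOT (values are made up).
clear
input long inctot
21000
45000
9999999
end

* Recode the N/A code to Stata's missing value so it is excluded from statistics.
replace inctot = . if inctot == 9999999

* The summary now uses only the two valid incomes.
summarize inctot
```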

After executing the .do file to load the data into Stata, I destringed the variable US2023A_PINCP with the code:

destring us2023a_pincp, generate(us2023a_newp) force

I then summed the new variable across household SERIAL by running:

bysort serial: egen hh_income = total(us2023a_newp)

The incomes appear to have been correctly summed. In the screenshot below, the household with SERIAL 357812 shows that the personal income values 21,000 and 45,000 summed to 66,000.

Stata screenshot with summed household income
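One possible explanation for the symptoms in the original question, offered only as a guess since the thread does not say which command was used, is that the strings were converted with encode rather than destring. The sketch below uses made-up toy values to show the difference: encode numbers the distinct strings 1, 2, 3, ... in sorted order and keeps the original strings as value labels, whereas destring converts the digits themselves.

```stata
* Toy string data mimicking the 7-character source variable (values are made up).
clear
input str7 us2023a_pincp
"0021000"
"0045000"
"BBBBBBB"
end

* encode: stores arbitrary integer codes and attaches the original strings as labels.
encode us2023a_pincp, generate(pincp_encoded)

* destring: converts the digits to numbers; force sets non-numeric strings such as BBBBBBB to missing.
destring us2023a_pincp, generate(pincp_num) force

* nolabel shows the stored values rather than the labels, which makes the difference visible.
list us2023a_pincp pincp_encoded pincp_num, nolabel
```

If the encoded variable shows small integer codes while the destrung variable shows the actual incomes, regenerating the numeric variable with destring should resolve the problem.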

ADJINC is an adjustment factor offered by the Census Bureau as an optional method for adjusting income values to be more comparable. In 2023, the adjustment factor was 1.019518. Our user note on standardization of income variables discusses how to apply this factor, as well as the potential benefits and downsides of this method.
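As an illustration only, the sketch below multiplies toy income values by the 2023 factor quoted above. It assumes the adjustment is applied by simple multiplication and hard-codes the factor rather than reading it from an ADJINC variable; whether the adjustment is appropriate for a given analysis is covered in the user note.

```stata
* Toy numeric incomes standing in for the destrung variable (values are made up).
clear
input long us2023a_newp
21000
45000
end

* Multiply by the 2023 adjustment factor quoted above; double avoids float rounding.
generate double pincp_adj = us2023a_newp * 1.019518
format pincp_adj %12.2f

list us2023a_newp pincp_adj
```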
