Matching the data on the website (Analyzed data online) and the downloaded one (browse and select data)

Andres_Vasquez · June 4, 2020, 6:04pm

Hello, I'm new to the website and I'm working on a migration analysis. I used the Analyze data section, selected the 1980 sample and the BPLD variable (detailed birthplace), and got a number. Then I used the Browse and select data section to download the same information in CSV format. I followed the whole procedure, but to my surprise the values didn't match the ones from the online analysis.
I'd appreciate some help explaining that.

Matthew_Bombyk · June 5, 2020, 9:04pm

Did you use weights for your analysis when you downloaded the data? You need to weight your analysis using PERWT in order to replicate the results in the online analysis system. If you did, can you please post the number you found from both the online system and the extract that you downloaded?

Andres_Vasquez · June 6, 2020, 5:50pm

Sure. I multiplied PERWT by PERNUM and then summed all those results to get the total population of the USA, so I could check it against the number in the Analyze data section. This way I got 32.361.600 instead of 226.862.400, which is the number in the Analyze data section and looks more reasonable for the total US population.

Andres_Vasquez · June 6, 2020, 6:55pm

My question is whether that multiplication is correct. Also, I just noticed that when I open the file, Excel tells me it couldn't load all the data because it is over the limit of 1.048.576 rows. Any idea how I can solve this?

Matthew_Bombyk · June 9, 2020, 5:25pm

You shouldn’t multiply PERWT by PERNUM. To check the population total, just sum PERWT. To get a weighted estimate (for example average age) you multiply the weight by the variable of interest, for example: PERWT*AGE. I’ll also note that when calculating averages in Excel, you’ll need to divide the weights by the population total (the sum of the weights). Statistical software packages will do this for you automatically.
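As a rough illustration of the arithmetic described above, here is a minimal Python sketch. The three records and the helper names `weighted_total` and `weighted_mean` are invented for this example; only PERWT and AGE come from the thread.

```python
# Toy person records; the values are invented for illustration only.
rows = [
    {"PERWT": 100, "AGE": 30},
    {"PERWT": 200, "AGE": 40},
    {"PERWT": 100, "AGE": 50},
]


def weighted_total(rows):
    """Population estimate: simply the sum of the person weights."""
    return sum(r["PERWT"] for r in rows)


def weighted_mean(rows, var):
    """Weighted average: sum(PERWT * var) divided by sum(PERWT)."""
    return sum(r["PERWT"] * r[var] for r in rows) / weighted_total(rows)


population = weighted_total(rows)
mean_age = weighted_mean(rows, "AGE")
print(population, mean_age)
```

Note that PERNUM never enters the calculation: it only numbers the persons within a household, so multiplying by it does not produce a meaningful total.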

I think your population total is so small because you are only able to load about 1/10 of the dataset, so only about 10% of the people are being counted. There are a number of ways to overcome this limitation. Here are a few:

Use a statistical package such as R (free) or Stata instead of Excel.
Use PowerPivot in Excel (this turns Excel into a full-featured database and can load much more data).
Take a random sample of the data when making your extract in IPUMS. More details on this are available here, and I’ve also attached a screenshot showing where you can choose this. For your particular sample, I think a density of .45 would be sufficiently small.
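If you would rather not install a statistical package, one more possibility is to stream the CSV one row at a time so the row limit never applies. The sketch below uses only Python's standard `csv` module. It assumes the extract has a header row containing a PERWT column, and the function name `sum_perwt` and the file name in the comment are hypothetical.

```python
import csv
import io


def sum_perwt(lines, weight_col="PERWT"):
    """Sum the weight column row by row, without loading the whole file.

    `lines` is any iterable of CSV text lines, such as an open file.
    Returns (weighted_total, number_of_rows_read).
    """
    total = 0.0
    n_rows = 0
    for record in csv.DictReader(lines):
        total += float(record[weight_col])
        n_rows += 1
    return total, n_rows


# Small in-memory stand-in for a real extract (values invented).
sample = io.StringIO("PERNUM,PERWT,AGE\n1,100,30\n2,200,40\n1,100,50\n")
print(sum_perwt(sample))

# With a real extract you would pass an open file instead, for example:
# with open("usa_extract.csv", newline="") as f:  # hypothetical file name
#     print(sum_perwt(f))
```

Because only one row is held in memory at a time, this approach should work for files far larger than Excel can open, though it will be slower than a dedicated statistical package.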

Andres_Vasquez · June 10, 2020, 4:14am

Thanks for your support, I couldn't have done it without your help. I understood your point about the PERWT variable and used the Power Pivot tool, and the data finally matched.
