Missing observations

Hello, all. My name is Michael Sikivie and I’m new to this forum.

I put together a data extract using IPUMS. All I’ve done is run the do file that reads the data file IPUMS gave me, cps_00002.dat.gz, into Stata. There are so many missing observations that I think something must have gone wrong reading in the data. I’m mostly interested in the linking variables, which apparently aren’t in the CPS data itself, but even those are mostly missing. cpsidp is 99.98% missing with only 771 observations, and marbasecidp is 99.99% missing. It seems like those variables should be about 0% missing, since they are used to link households across months of the monthly CPS and to the ASEC.

I don’t know if this is relevant, but I altered one line of the do file to make it work. At the end of the “quietly infix” line I replaced “using cps_00002.dat” with “using cps_00002.dat.gz”, because that was the name of the file I downloaded from IPUMS. Before I changed that line, Stata gave an error and did not read in any data.

Any help is appreciated.

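It sounds like you are trying to read a compressed (.gz) file in Stata; please correct me if I’m wrong. A data file needs to be decompressed before it can be read into Stata (7zip is a good decompression program if you need one). Note that changing the .do file to point to the .dat.gz file instead of the .dat file will not decompress the data. Stata may be able to read the file, but it will treat the compressed bytes as if they were the fixed-width text, so it will not read the data correctly, which is consistent with the missing values you are seeing. Once you have decompressed the file, change the “using” line back to “using cps_00002.dat” and rerun the do file.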
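
If you would rather script the decompression step than use a GUI tool, here is a minimal sketch in Python (not Stata), using only the standard library. The helper name gunzip is hypothetical, and the sketch assumes cps_00002.dat.gz sits in the current working directory; adjust the path to wherever you saved the extract.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dst=None):
    """Decompress a .gz file and return the path of the decompressed copy.

    If dst is not given, the trailing ".gz" is dropped from src,
    so "cps_00002.dat.gz" becomes "cps_00002.dat".
    """
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix("")
    # Stream the contents so a large extract is never held fully in memory.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


if __name__ == "__main__":
    # Assumed location of the extract; change this to match your download.
    extract = Path("cps_00002.dat.gz")
    if extract.exists():
        print(gunzip(extract))
```

The original .gz file is left in place, and the decompressed cps_00002.dat is written next to it, which is the file name the unmodified do file expects.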