Replicating data



I’m trying to replicate the following data but am having trouble selecting the right variables and samples. Could someone please help me?


I am also having trouble identifying what variables, or data sources for that matter, are used to generate this table. There is information about health insurance coverage in both the CPS (available via IPUMS CPS) and the NHIS (available via IPUMS Health Surveys) data. Both of these data sources could potentially have been used to construct this table. Could you provide more information about where this table comes from? A link to the article would be best, but a citation will work too.


It’s from this paper. Could you help me? Thank you.


Thanks for providing a link to the paper. It is always a bit tricky to figure out the specific data sources behind published tables such as this one, but I think I have an idea of where this information comes from. In general, IPUMS CPS health insurance variables are listed here. Specifically, you’ll likely want to use the information in the PRIVOWN, PRIVDEP, PRIVWHO1, and PRIVWHO2 variables. Using the PRIVWHO1, PRIVWHO2, and LINENO variables, researchers can identify the policyholder(s) whose privately purchased insurance provided coverage for someone coded as “Yes” in PRIVDEP. Finally, the VERIFY variable indicates whether an individual actually had health insurance during the previous calendar year.
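To make the linking step concrete, here is a minimal Python sketch on made-up records. It is an illustration under assumptions, not the paper's method: it assumes PRIVDEP is coded 2 for “Yes”, that PRIVWHO1 and PRIVWHO2 hold the policyholder's LINENO (with 0 meaning none), and that SERIAL identifies the household within a single sample. Please check the codes against the codebook for your extract, and add YEAR (and MONTH) to the key if you pool several samples.

```python
# Hypothetical toy records; a real extract would come from IPUMS CPS.
# ASSUMPTION: PRIVDEP == 2 means "Yes" (covered as a dependent).
# ASSUMPTION: PRIVWHO1/PRIVWHO2 store the policyholder's LINENO, 0 = none.
PRIVDEP_YES = 2

people = [
    {"SERIAL": 1, "LINENO": 1, "PRIVDEP": 1, "PRIVWHO1": 0, "PRIVWHO2": 0},
    {"SERIAL": 1, "LINENO": 2, "PRIVDEP": 1, "PRIVWHO1": 0, "PRIVWHO2": 0},
    {"SERIAL": 1, "LINENO": 3, "PRIVDEP": 2, "PRIVWHO1": 1, "PRIVWHO2": 2},
    {"SERIAL": 2, "LINENO": 1, "PRIVDEP": 1, "PRIVWHO1": 0, "PRIVWHO2": 0},
    {"SERIAL": 2, "LINENO": 2, "PRIVDEP": 2, "PRIVWHO1": 1, "PRIVWHO2": 0},
]


def link_policyholders(records, yes_code=PRIVDEP_YES):
    """Return (SERIAL, dependent LINENO, policyholder LINENO) tuples."""
    # Index every person by household and line number for fast lookup.
    by_key = {(r["SERIAL"], r["LINENO"]): r for r in records}
    links = []
    for r in records:
        if r["PRIVDEP"] != yes_code:
            continue  # not covered as a dependent on a private plan
        for var in ("PRIVWHO1", "PRIVWHO2"):
            holder_line = r[var]
            # Keep the link only if the pointer matches a household member.
            if (r["SERIAL"], holder_line) in by_key:
                links.append((r["SERIAL"], r["LINENO"], holder_line))
    return links


links = link_policyholders(people)
for serial, dependent, holder in links:
    print(f"household {serial}: line {dependent} is covered by line {holder}")
```

With real data, the same lookup lets you attach the policyholder's own characteristics (for example, employment or age) to each dependent before tabulating.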

Note that this is all speculative and based on the limited description of the data given by the authors of the paper linked above. If you need more detailed information, you may have some luck reaching out directly to the authors of the study.