# understanding frequency distribution

I apologize for this basic question, but I am a research librarian and not a statistician. I am trying to determine labor force participation rates from the CPS and I created a table. Could someone kindly point out how to interpret the results? I completed a number of video tutorials, but the frequency distribution still confuses me. I included a screenshot below. Thank you in advance for your time and assistance.

hi. here’s another non-statistician’s partial answer.

the rows list mutually exclusive “attributes” of the subjects, i.e., one is either “At Work”, or “Unemployed, experienced worker”, “NILF, unable to work”, or “NILF, other”. each person is one of these, and *only* one of these. so, if you add up all these rows (for a particular year), you have all the subjects.

the bottom row (“COL TOTAL”) is a check, showing that the previous rows in that column add up to 100%. the rightmost column (“ROW TOTAL”) shows the sum of the previous columns in that row, and expresses it both as a number and as a percentage of the grand total (the sum of the other rows in the rightmost column, which is shown in the bottom right cell of the table).

columns: for your frequency table, the first column (1996) says that, of the population in question (who *are* those people?), none were working, 45.8% were unable to work, while 54.2% were for other reasons “Not In the Labor Force”. those three numbers “exhaust” the population in question in 1996, so they add up to 100%, which is listed in the bottom row (“COL TOTAL”).

rows: if we look at a row, 10, say, “At work”, we see that only in year 2006 was that number greater than zero. in 2006, the share of people at work was 60.4%, again of the specific population you have selected; this percentage corresponds to 7,023.9 people in the population working. in the last column of row 10 (the column labeled “ROW TOTAL”), we see the same 7,023.9 people, and that, *in row 10*, these 7,023.9 people correspond to 14.6% of the 48,136.2 people in the bottom row (“COL TOTAL”).

*that* number, 48,136.2, is the sum of the numbers of the bottom row of the previous five columns. (in 1996, there were 11,668.8 people; in 2000, there were 4,087.5 people; in 2006, there were 11,626.1; in 2010, there were 7,845.6; and in 2016, there were 12,908.1.)
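if you want to check that arithmetic yourself, here is a minimal sketch in python. it uses only the numbers quoted above; the variable names are my own and have nothing to do with the online tool that produced the table.

```python
# column totals (weighted persons) quoted above, keyed by survey year
col_totals = {
    1996: 11668.8,
    2000: 4087.5,
    2006: 11626.1,
    2010: 7845.6,
    2016: 12908.1,
}

# grand total: compare with the 48,136.2 shown in the bottom right cell
grand_total = sum(col_totals.values())

# weighted "At work" count in the 2006 column (row 10)
at_work_2006 = 7023.9

# column percentage: share of the 2006 column total
col_pct = 100 * at_work_2006 / col_totals[2006]

# row total percentage: share of the grand total printed in the table
row_pct = 100 * at_work_2006 / 48136.2

print(f"sum of column totals: {grand_total:,.1f}")
print(f"at work, % of 2006 column: {col_pct:.1f}")
print(f"at work, % of grand total: {row_pct:.1f}")
```

the two percentages come out at 60.4 and 14.6, matching the table. the column totals as displayed sum to 48,136.1 rather than 48,136.2; presumably the table rounds each cell to one decimal place, so the displayed figures can be off by a tenth.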

hope that helps.

Thank you so much. Just for clarification, I am working on a project calculating a jobs-to-population ratio for Hillsborough County (12057). Current literature suggests that the number of people claiming to be unable to work due to disability has increased across all age cohorts. I queried empstat, age(21-24) and disabwrk(2), “Disability limits or prevents work”.

The allocation of cases is 19, so I assume that only 19 people actually answered the question. The numbers in each cell, for example in row 10, are 60 and 7,023.9. Does the 7,023.9 indicate the total number of people who are at work although they are claiming a disability? Since I only have 19 cases, I am thinking I need to expand the sample size.

Any ideas? Does this make sense?

Many many thanks for your help.

If you run the same table without weighting (i.e. select a weight of “none” versus wtsupp), you will see where the 19 cases come from. The 19 valid cases correspond to 19 responses (across the years selected) to the ASEC survey. The weighting variable, wtsupp, allows you to create population-based estimates from the samples. Therefore, for 2006, the percentage of individuals aged 21-34 with a disability that limits or prevents work who are at work is 60%; 7,023 is the weighted number (population value) of individuals who fit the same criteria.
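To make the difference between unweighted cases and weighted estimates concrete, here is a minimal Python sketch. The four records and their weights are invented purely for illustration (they are not CPS data), and `tabulate` is a hypothetical helper, not part of any IPUMS tool. The weight column simply plays the role that wtsupp plays in the real table.

```python
# Hypothetical respondents: (employment status, person weight).
# All values below are made up for illustration; they are NOT CPS data.
records = [
    ("At work", 1500.0),
    ("At work", 2000.0),
    ("NILF, unable to work", 1200.0),
    ("NILF, other", 800.0),
]

def tabulate(rows, use_weight):
    """Sum the weight per status, or count 1 per respondent if unweighted."""
    totals = {}
    for status, weight in rows:
        increment = weight if use_weight else 1.0
        totals[status] = totals.get(status, 0.0) + increment
    return totals

unweighted = tabulate(records, use_weight=False)  # number of survey responses
weighted = tabulate(records, use_weight=True)     # estimated number of people

total_weighted = sum(weighted.values())
pct_at_work = 100 * weighted["At work"] / total_weighted

print("unweighted cases:", unweighted)
print("weighted estimates:", weighted)
print(f"weighted % at work: {pct_at_work:.1f}")
```

In this toy example, two respondents who are at work stand in for an estimated 3,500 people. A handful of sample cases can produce a large population figure in the same way, which is why it helps to look at the unweighted count before relying on an estimate.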

Just a quick note about your age filter: in your original screenshot, you have the criteria as 21-34, however, in your follow-up question, you indicated 21-24 as your desired age range. Age(21-34) will give you 19 unweighted cases, whereas age(21-24) will give you 5 unweighted cases.