Association: Chapter 5 Notes
Q.1) What is categorical or qualitative data? Give two examples of categorical data.
Categorical or Qualitative Data:
A categorical variable is a variable whose measurement scale consists of a set of categories; in other words, a variable that cannot be measured quantitatively. Categorical data are only classified into categories. For example:
- A person may be categorized as beautiful or ugly.
- Recovery from an operation might be rated as “completely recovered”, “nearly recovered”, “only somewhat recovered” and “not at all recovered”.
Q.2) What does the chi-squared test actually test in the case of a single categorical variable?
It is frequently desired to obtain a sample of nominal- or ordinal-scale data and to infer whether the population from which it comes conforms to a specified theoretical distribution. The chi-squared statistic is used as a measure of how far a sample distribution deviates from the theoretical distribution:
$\chi^2 = \sum_{i=1}^{k} (O_i - E_i)^2 / E_i$ with (k-1) degrees of freedom, where $O_i$ is the observed frequency and $E_i$ is the expected frequency in class i if the null hypothesis is true. The $\chi^2$ test involves the following steps:
- Compute the expected frequency of each category by multiplying the hypothesized population proportion by the total sample size, that is,
Expected frequency = population proportion × sample size.
- Divide the squared difference between the observed and expected frequencies of each category by the corresponding expected frequency, then sum the resulting values to obtain the value of the chi-square statistic.
- Compare the computed chi-square with the value in the chi-square table for the appropriate degrees of freedom.
- If the calculated chi-square statistic is larger than the tabulated value, reject the null hypothesis and accept the alternative hypothesis.
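The steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the observed counts and the hypothesized proportions are hypothetical, the function name `chi_square_gof` is made up for this example, and the critical value 7.815 is the tabulated chi-square value for 3 degrees of freedom at the 0.05 level.

```python
def chi_square_gof(observed, proportions):
    """Return (chi-square statistic, degrees of freedom) for one categorical variable."""
    n = sum(observed)
    # Step 1: expected frequency = population proportion x sample size
    expected = [p * n for p in proportions]
    # Step 2: sum of (O - E)^2 / E over all categories
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothetical data: 100 observations in four categories,
# with H0 stating that all four categories are equally likely.
observed = [18, 22, 20, 40]
proportions = [0.25, 0.25, 0.25, 0.25]

statistic, df = chi_square_gof(observed, proportions)

# Steps 3 and 4: compare with the tabulated value (df = 3, alpha = 0.05)
critical_value = 7.815
reject_h0 = statistic > critical_value

print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_h0}")
```

With these hypothetical counts the statistic is larger than the tabulated value, so the null hypothesis of equal proportions would be rejected.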
Q.3) What is the difference between the nominal scale and ordinal scale? Give at least two examples of the data for each scale.
If the levels of a categorical variable do not have a natural ordering, the variable is called nominal. For example:
· Religious affiliation, with categories Muslim, Catholic, Jewish, Protestant, etc.
· Mode of transportation, with categories car, bus, railway, motorcycle, cycle, etc.
For nominal variables, the order in which the categories are listed is irrelevant to the statistical analysis.
Many categorical variables do have ordered levels; such variables are called ordinal. For example:
· The size of a car: small, compact, mid-size, large.
· Attitude towards co-education, ranging from strongly disapprove to approve.
Q.4) What is a two-way classification? Give two examples of bivariate categorical data.
When categorical data are classified according to two variables, the result is called a two-way classification, or bivariate categorical data.
1. To compare the effectiveness of two different teaching methods, say A and B, suppose 200 students are included in a study. 100 randomly selected students are assigned to teaching method A and the remaining 100 to method B. After the results, the 200 students are rated as “excellent”, “good”, “average” or “poor”.
2. A random sample of 200 students is classified according to their grades (A, B and C) and their gender.
Q.6) Can a chi-squared test be done if only percentages are available as the basic data? If yes, explain how.
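No, a chi-squared test cannot be done if only percentages are available, because the statistic is computed from observed and expected frequencies (counts). The percentages would first have to be converted to frequencies, which requires the sample size.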
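A small sketch may help show why percentages alone are not enough. The percentages and sample sizes below are hypothetical, and the function name `chi_square_from_percentages` is made up for this illustration; the point is that identical percentages give very different chi-square values for different sample sizes.

```python
def chi_square_from_percentages(observed_pct, expected_pct, n):
    """Convert percentages to frequencies for a sample of size n, then compute chi-square."""
    observed = [p * n / 100 for p in observed_pct]
    expected = [p * n / 100 for p in expected_pct]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical case: 60% / 40% observed against a hypothesized 50% / 50% split.
small_sample = chi_square_from_percentages([60, 40], [50, 50], n=10)
large_sample = chi_square_from_percentages([60, 40], [50, 50], n=1000)

print(f"n = 10:   chi-square = {small_sample:.1f}")
print(f"n = 1000: chi-square = {large_sample:.1f}")
```

The same percentages lead to a small statistic for the small sample and a very large one for the large sample, so the conclusion of the test depends on the underlying counts.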
Q.7) In a primary school, it is found that 40% of the students are female; 35% of the female students are from rural areas; 60% of the male students are from urban areas. The total number of students in the school is 300. Complete the following 2×2 contingency table and answer the following.
a. Total number of male students.
b. Total number of female students.
c. Total number of students belonging to rural areas.
d. Total number of students belonging to urban areas.
e. Number of female students belonging to urban areas.
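Total number of students in the school = 300
Total number of female students = 40% of 300 = 40/100 × 300 = 120
Total number of male students = 300 - 120 = 180
Number of female students from rural areas = 35/100 × 120 = 42
Number of female students from urban areas = 120 - 42 = 78
Number of male students from urban areas = 60/100 × 180 = 108
Number of male students from rural areas = 180 - 108 = 72
Total number of students belonging to rural areas = 42 + 72 = 114
Total number of students belonging to urban areas = 78 + 108 = 186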
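The arithmetic above can be checked with a short Python sketch that fills in every cell of the 2×2 table from the percentages given in the question. The variable names and the dictionary layout are choices made for this illustration only.

```python
total = 300

# Row totals from the 40% female share
female = 40 * total // 100          # 40% of 300
male = total - female

# Cell counts from the conditional percentages in the question
female_rural = 35 * female // 100   # 35% of female students are rural
female_urban = female - female_rural
male_urban = 60 * male // 100       # 60% of male students are urban
male_rural = male - male_urban

# Completed 2x2 contingency table (rows: gender, columns: area)
table = {
    "Male": {"Rural": male_rural, "Urban": male_urban},
    "Female": {"Rural": female_rural, "Urban": female_urban},
}

# Column totals
rural_total = male_rural + female_rural
urban_total = male_urban + female_urban

for gender, row in table.items():
    print(gender, row["Rural"], row["Urban"], row["Rural"] + row["Urban"])
print("Total", rural_total, urban_total, total)
```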
Q.8) Consider terrorist activity against businesses during the first three quarters of 1985. In Latin America there were 443 bombings, 56 attacks on installations, and 7 assassinations. In Europe during the same time period, there were 101 bombings, 6 attacks on installations, and 10 assassinations. Perform a chi-squared test of independence. Use the 0.05 level of significance.
H0: There is no association between the region and the type of terrorist activity.
H1: There is an association between the region and the type of terrorist activity.
Level of significance:
α = 0.05
Test statistic:
$\chi^2 = \sum_{i}\sum_{j} (O_{ij} - E_{ij})^2 / E_{ij}$
Observed frequencies:
|Region|Bombing|Attacks on installations|Assassinations|Total|
|Latin America|443|56|7|506|
|Europe|101|6|10|117|
|Total|544|62|17|623|
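Since the calculated value of chi-square, 21.252, exceeds the tabulated value 5.991 for (2-1)(3-1) = 2 degrees of freedom at the 0.05 level, it falls in the rejection region. We therefore have sufficient evidence against the null hypothesis and accept H1, i.e. there is an association between the region and the type of terrorist activity.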
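The computation behind this conclusion can be sketched in Python using the counts given in the question. The function name `chi_square_independence` is made up for this illustration; each expected frequency is computed as (row total × column total) / grand total, and 5.991 is the tabulated chi-square value for 2 degrees of freedom at the 0.05 level.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency under H0 (no association)
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Counts from Q.8: rows are Latin America and Europe;
# columns are bombings, attacks on installations, assassinations.
observed = [
    [443, 56, 7],
    [101, 6, 10],
]

statistic, df = chi_square_independence(observed)
critical_value = 5.991  # tabulated value for df = 2, alpha = 0.05

print(f"chi-square = {statistic:.2f}, df = {df}")
print("Reject H0" if statistic > critical_value else "Do not reject H0")
```

The result agrees with the value 21.252 quoted above up to rounding of the expected frequencies.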
Q.15) Competitors in a beauty contest are ranked by two judges in the following order.
First judge 3 1 4 2 5 9 8 7 6
Second judge 5 3 2 1 4 6 9 7 8
Calculate Spearman’s rank correlation coefficient.
Q.16) Gymnasts were ranked by two judges as shown below. Calculate Spearman’s rank correlation coefficient.
First judge 4 5 8 3 1 2 7 6
Second judge 5 4 6 1 2 3 7 8
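The notes give no worked solution for Q.15 and Q.16. A minimal sketch, assuming the usual formula for untied ranks, $r_s = 1 - 6\sum d_i^2 / (n(n^2 - 1))$, where $d_i$ is the difference between the two ranks given to competitor i, is shown below. The function name `spearman_rank_correlation` is made up for this illustration.

```python
def spearman_rank_correlation(ranks_x, ranks_y):
    """Spearman's rank correlation for two lists of untied ranks."""
    n = len(ranks_x)
    # Sum of squared rank differences
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Q.15: beauty contest, 9 competitors
q15_first = [3, 1, 4, 2, 5, 9, 8, 7, 6]
q15_second = [5, 3, 2, 1, 4, 6, 9, 7, 8]
r_q15 = spearman_rank_correlation(q15_first, q15_second)

# Q.16: gymnasts, 8 competitors
q16_first = [4, 5, 8, 3, 1, 2, 7, 6]
q16_second = [5, 4, 6, 1, 2, 3, 7, 8]
r_q16 = spearman_rank_correlation(q16_first, q16_second)

print(f"Q.15: r_s = {r_q15:.4f}")
print(f"Q.16: r_s = {r_q16:.4f}")
```

For Q.15 the squared differences sum to 28, and for Q.16 they sum to 16, giving coefficients of about 0.77 and 0.81 respectively, which indicates fairly strong agreement between the judges in both cases.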