Section 3.5.
In this section we build contingency tables using the command "table".
This command displays one-way, two-way and multi-way tables.
The cells may contain counts, percentages and statistics from a chi-square
test; they may also contain summary statistics of associated variables
(i.e., any variable which is not used as a classification variable),
or the data themselves. The classification variables must
contain integer values between -10000 and +10000 or missing values (*).
We work with the worksheet car.mtw.
MTB > Retrieve 'C:\MTBSEW\INDUST~1\MTW\CAR.MTW'.
First, we code the data; then we find the correlations, the contingency
table and the chi-square statistic. We consider the variables turn diameter
and miles/gallon (city), which are stored in columns c3 and c5.
MTB > code (27:30.6) 1 (30.7:34.2) 2 (34.3:37.8) 3 (37.9:44.0) 4 c3 c7
MTB > code (12:18) 1 (19:24) 2 (25:35) 3 c5 c8
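The effect of the two code commands can be sketched in Python. This is only an illustration under stated assumptions: the function name code_value and the sample values are hypothetical and are not taken from car.mtw, and a value that lies in none of the intervals is returned as None here, which need not match what Minitab does with such a value.

```python
# Illustrative sketch only: mimics the interval coding done by the two
# "code" commands above. Names and sample values are hypothetical.

# (low, high, code) triples copied from the two Minitab commands.
TURN_DIAMETER_CLASSES = [(27.0, 30.6, 1), (30.7, 34.2, 2), (34.3, 37.8, 3), (37.9, 44.0, 4)]
MPG_CITY_CLASSES = [(12, 18, 1), (19, 24, 2), (25, 35, 3)]


def code_value(value, classes):
    """Return the code of the closed interval [low, high] that contains value."""
    for low, high, code in classes:
        if low <= value <= high:
            return code
    return None  # simplification: the value lies in none of the intervals


sample_turn = [28.5, 33.0, 36.1, 40.2]  # made-up turn diameters
sample_mpg = [30, 22, 20, 15]           # made-up city miles/gallon

coded_turn = [code_value(v, TURN_DIAMETER_CLASSES) for v in sample_turn]
coded_mpg = [code_value(v, MPG_CITY_CLASSES) for v in sample_mpg]
print(coded_turn)
print(coded_mpg)
```

Keeping the class limits in a list of triples makes it easy to compare them with the limits typed in the Minitab commands.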
MTB > corre c3 c5
Correlation of Tur_Diam and MPG_City = -0.541
MTB > corre c7 c8
Correlation of C7 and C8 = -0.451
This is the correlation of the coded variables. Both correlations are
negative, and coding the data into classes changes the value only
moderately (from -0.541 to -0.451).
MTB > table c7 c8;
SUBC> chis.
 ROWS: C7     COLUMNS: C8
           1        2        3      ALL
  1        2        0        4        6
  2        4       12       15       31
  3       10       26        6       42
  4       15       15        0       30
ALL       31       53       25      109
CHI-SQUARE = 34.990 WITH D.F. = 6
CELL CONTENTS --
COUNT
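What the table command does with two coded columns can be sketched in Python: it counts how often each pair of codes occurs and adds the marginal totals. The two short columns below are hypothetical and are not the car.mtw data; the variable names simply echo c7 and c8.

```python
from collections import Counter

# Hypothetical coded columns (not the car.mtw data): one entry per car.
c7 = [1, 2, 2, 3, 3, 3, 4, 4]  # coded turn diameter (rows)
c8 = [3, 2, 3, 1, 2, 2, 1, 1]  # coded miles/gallon (columns)

counts = Counter(zip(c7, c8))  # (row code, column code) -> frequency
row_codes = sorted(set(c7))
col_codes = sorted(set(c8))

# One list per row: the cell counts followed by the row total (ALL column).
table = []
for r in row_codes:
    cells = [counts[(r, c)] for c in col_codes]
    table.append(cells + [sum(cells)])

# Last row: the column totals (ALL row), including the grand total.
table.append([sum(row[j] for row in table) for j in range(len(col_codes) + 1)])

for label, row in zip(row_codes + ["ALL"], table):
    print(str(label).rjust(3), *(str(v).rjust(4) for v in row))
```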
The table printed by Minitab can then be edited to show the class limits:
Turn diameter | Miles/gallon city
              | 12-18 | 19-24 | 25-35 | Total |
27-30.6       |     2 |     0 |     4 |     6 |
30.7-34.2     |     4 |    12 |    15 |    31 |
34.3-37.8     |    10 |    26 |     6 |    42 |
37.9-44.0     |    15 |    15 |     0 |    30 |
Total         |    31 |    53 |    25 |   109 |
From this table, we see that as the turn diameter gets bigger,
the miles/gallon tend to get smaller: the larger counts move from the
right-hand columns in the first rows to the left-hand columns in the last rows.
So the two variables are dependent and negatively correlated.
The chi-square statistic measures the dependence between
the two random variables.
If the chi-square statistic is large, then the dependence between
the two random variables is large.
How large the chi-square statistic has to be before we conclude that the two
random variables are dependent depends on k and m, the numbers of rows and
columns of the table (here k = 4 and m = 3), through the degrees of
freedom (k-1)(m-1) = 6.
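How large is large enough can be made concrete with the chi-square distribution with (k-1)(m-1) degrees of freedom, assuming k and m denote the numbers of rows and columns of the table. The sketch below is not Minitab output. It relies on the closed form of the upper tail probability that holds when the number of degrees of freedom is even, and the function name is hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom.

    Uses P(X > x) = exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


k, m = 4, 3             # rows and columns (assumed meaning of k and m)
df = (k - 1) * (m - 1)  # degrees of freedom of the test
p_value = chi2_upper_tail_even_df(34.990, df)
print(df, p_value)

# Sanity check: 12.592 is the commonly tabulated 5% critical value for 6 d.f.
print(chi2_upper_tail_even_df(12.592, 6))
```

For the statistic 34.990 the tail probability is far below 0.001, so the observed table lies well beyond what independence would typically produce.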
Mean-squared contingency: Phi^2 = X^2/N = 34.990/109 = 0.321
Tschuprow index: T = sqrt(Phi^2 / sqrt((k-1)(m-1))) = sqrt(0.321 / sqrt(3*2)) = 0.362
Cramer index: C = sqrt(Phi^2 / min(k-1, m-1)) = sqrt(0.321/2) = 0.401
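The three indices can be recomputed with a few lines of Python as a check on the arithmetic. This is a sketch, not Minitab output. It assumes k = 4 rows and m = 3 columns, and it takes the Tschuprow index in its usual form T = sqrt(Phi^2 / sqrt((k-1)(m-1))).

```python
import math

chi_square = 34.990  # statistic reported by Minitab
n = 109              # total count in the table
k, m = 4, 3          # rows and columns of the contingency table (assumed)

phi_sq = chi_square / n                                       # mean-squared contingency
tschuprow = math.sqrt(phi_sq / math.sqrt((k - 1) * (m - 1)))  # usual definition of T
cramer = math.sqrt(phi_sq / min(k - 1, m - 1))                # Cramer index

print(round(phi_sq, 3), round(tschuprow, 3), round(cramer, 3))
```

Both T and C lie between 0 and 1, so they are easier to compare across tables of different sizes than the chi-square statistic itself.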
We can get the table of expected frequencies by first entering
the columns of the contingency table into worksheet columns,
and then using the command chisquare:
MTB > set c11
DATA> 2 4 10 15
DATA> end
MTB > set c12
DATA> 0 12 26 15
DATA> end
MTB > set c13
DATA> 4 15 6 0
DATA> end
MTB > chis c11 c12 c13
Expected counts are printed below observed counts
      |   C11 |   C12 |   C13 | Total |
    1 |     2 |     0 |     4 |     6 |
      |  1.71 |  2.92 |  1.38 |       |
    2 |     4 |    12 |    15 |    31 |
      |  8.82 | 15.07 |  7.11 |       |
    3 |    10 |    26 |     6 |    42 |
      | 11.94 | 20.42 |  9.63 |       |
    4 |    15 |    15 |     0 |    30 |
      |  8.53 | 14.59 |  6.88 |       |
Total |    31 |    53 |    25 |   109 |
ChiSq = 0.051 + 2.917 + 5.003 +
2.631 + 0.627 + 8.755 +
0.317 + 1.524 + 1.370 +
4.903 + 0.012 + 6.881 = 34.990
df = 6
3 cells with expected counts less than 5.0
We get the same chi-square statistic as before: 34.990.
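The expected counts and the chi-square statistic can also be recomputed directly from the observed counts. The Python sketch below is an independent check rather than a description of how Minitab works internally: each expected count is (row total * column total) / N, and each cell contributes (observed - expected)^2 / expected to the statistic.

```python
# Observed counts: rows are the turn-diameter classes, columns are C11, C12, C13.
observed = [
    [2, 0, 4],
    [4, 12, 15],
    [10, 26, 6],
    [15, 15, 0],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Contribution of each cell to the chi-square statistic.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
chi_square = sum(sum(row) for row in contributions)
df = (len(observed) - 1) * (len(observed[0]) - 1)
small_cells = sum(e < 5.0 for row in expected for e in row)

for exp_row in expected:
    print(" ".join(f"{e:6.2f}" for e in exp_row))
print(f"ChiSq = {chi_square:.3f}, df = {df}, cells with expected count < 5: {small_cells}")
```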
Possible subcommands of the command table are:
MEANS, DATA, TOTPERCENTS, MEDIANS, N,
CHISQUARE, SUMS, NMISS, MISSING, MINIMUMS, PROPORTION,
NOALL, MAXIMUMS, COUNTS, ALL, STDEV, ROWPERCENTS, FREQUENCIES.
