Section 3.5.



In this section we build contingency tables using the command "table".
This command displays one-way, two-way and multi-way tables.
The cells may contain counts, percents and statistics from a chi-square
test; they may also contain summary statistics of associated variables
(i.e., any variable that is not used as a classification variable).
The cells may also contain data. The classification variables must
contain integer values between -10000 and +10000 or missing values (*).

We work with the worksheet car.mtw.
MTB > Retrieve  'C:\MTBSEW\INDUST~1\MTW\CAR.MTW'.

First we code the data; then we find the correlations, the contingency
table and the chi-square statistic. We consider the variables turn diameter
and miles/gallon (city). These variables are c3 and c5.

MTB > code (27:30.6) 1 (30.7:34.2) 2 (34.3:37.8) 3 (37.9:44.0) 4 c3 c7
MTB > code (12:18) 1 (19:24) 2 (25:35) 3 c5 c8
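
For readers who want to see the coding step outside Minitab, the following is a minimal Python sketch of the same idea: each value is replaced by the number of the class it falls in. The helper name code_value and the sample values are hypothetical and are not taken from car.mtw. The sketch also assumes the data are recorded to one decimal, so that no value falls in the gaps between classes (for example between 30.6 and 30.7).

```python
from bisect import bisect_left

# Upper ends of the classes used in the two CODE commands above.
TURN_DIAMETER_UPPER = [30.6, 34.2, 37.8, 44.0]  # codes 1..4, first class starts at 27
MPG_CITY_UPPER = [18, 24, 35]                   # codes 1..3, first class starts at 12


def code_value(x, lower, upper_bounds):
    """Return the 1-based class code of x, or None if x lies outside all classes."""
    if x < lower or x > upper_bounds[-1]:
        return None
    # Index of the first upper bound that is >= x, shifted to start at 1.
    return bisect_left(upper_bounds, x) + 1


# Hypothetical turn diameters, chosen to hit the class boundaries.
sample_diameters = [27.0, 30.6, 30.7, 36.0, 44.0]
codes = [code_value(d, 27, TURN_DIAMETER_UPPER) for d in sample_diameters]
print(codes)
```

Values outside every class come back as None here, which is a choice made for this sketch only.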

MTB > corre c3 c5
Correlation of Tur_Diam and MPG_City = -0.541

MTB > corre c7 c8
Correlation of C7 and C8 = -0.451
This is the correlation of the coded variables. Coding loses some
information, but the two correlations are similar (-0.451 against -0.541).

MTB > table c7 c8;
SUBC> chis.
 
 ROWS: C7     COLUMNS: C8
 
           1        2        3      ALL
  
  1        2        0        4        6
  2        4       12       15       31
  3       10       26        6       42
  4       15       15        0       30
 ALL      31       53       25      109
 
CHI-SQUARE =    34.990   WITH D.F. =    6
  CELL CONTENTS --
                  COUNT

We can then edit this table to get:

                          Miles/gallon city

Turn diameter    12-18    19-24    25-35    Total
27-30.6              2        0        4        6
30.7-34.2            4       12       15       31
34.3-37.8           10       26        6       42
37.9-44.0           15       15        0       30
Total               31       53       25      109


From this table, we see that as the turn diameter gets bigger,
the miles/gallon get smaller.
So the two variables are dependent, and they are negatively correlated.
The chi-square statistic measures the dependence between the two
variables: the larger the statistic, the stronger the evidence of
dependence.
How large the statistic has to be before we declare the two variables
dependent depends on k and m, the numbers of rows and columns of the
table, through the degrees of freedom (k-1)(m-1).
Here k = 4 and m = 3, so there are 6 degrees of freedom, and 34.990 is
well above the 5% critical value of 12.59.
The statistic also grows with the number of observations N, so the
following indices rescale it:
Mean-squared contingency = Phi^2 = X^2/N = 34.99/109 = 0.321
Tschuprow index = T = square-root(Phi^2/square-root((k-1)(m-1))) = square-root(0.321/square-root(3*2)) = 0.362
Cramer index = C = square-root(Phi^2/min(k-1,m-1)) = square-root(0.321/2) = 0.401
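
The three indices can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the chi-square statistic and total count reported by Minitab above. The Tschuprow index is computed with its usual textbook definition, Phi divided by the fourth root of (k-1)(m-1); the variable names are chosen here for illustration.

```python
import math

chi_square = 34.990  # from the Minitab table output
n = 109              # total number of cars in the table
k, m = 4, 3          # rows (turn diameter classes), columns (miles/gallon classes)

# Mean-squared contingency: chi-square rescaled by the sample size.
phi_sq = chi_square / n

# Tschuprow index: Phi^2 divided by sqrt((k-1)(m-1)), then square-rooted.
tschuprow = math.sqrt(phi_sq / math.sqrt((k - 1) * (m - 1)))

# Cramer index: Phi^2 divided by min(k-1, m-1), then square-rooted.
cramer = math.sqrt(phi_sq / min(k - 1, m - 1))

print(f"Phi^2 = {phi_sq:.3f}")
print(f"T     = {tschuprow:.3f}")
print(f"C     = {cramer:.3f}")
```

Both T and C lie between 0 and 1, which makes them easier to compare across tables of different sizes than the raw statistic.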

We can get the table of expected frequencies by first entering
the columns of the contingency table into worksheet columns
and then using the command chisquare:

MTB > set c11
DATA> 2 4 10 15
DATA> end
MTB > set c12 
DATA> 0 12 26 15
DATA> end
MTB > set c13 
DATA> 4 15 6 0
DATA> end
MTB > chis c11 c12 c13

Expected counts are printed below observed counts

           C11      C12      C13    Total
    1        2        0        4        6
          1.71     2.92     1.38

    2        4       12       15       31
          8.82    15.07     7.11

    3       10       26        6       42
         11.94    20.42     9.63

    4       15       15        0       30
          8.53    14.59     6.88

Total       31       53       25      109

ChiSq =  0.051 +  2.917 +  5.003 +
         2.631 +  0.627 +  8.755 +
         0.317 +  1.524 +  1.370 +
         4.903 +  0.012 +  6.881 = 34.990
df = 6
3 cells with expected counts less than 5.0

We get the same chi-square statistic as before: 34.990.

Possible subcommands of the command table are:
MEANS, DATA, TOTPERCENTS, MEDIANS, N, CHISQUARE, SUMS, NMISS, MISSING,
MINIMUMS, PROPORTION, NOALL, MAXIMUMS, COUNTS, ALL, STDEV, ROWPERCENTS,
FREQUENCIES.
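
The expected counts and the chi-square statistic reported by the chisquare command can be reproduced by hand. The Python sketch below, which is not part of the Minitab session, uses the standard rule that each expected count is (row total)(column total)/N and that each cell contributes (observed - expected)^2/expected to the statistic. All variable names are chosen here for illustration.

```python
# Observed counts: rows are the coded turn diameter, columns the coded miles/gallon.
observed = [
    [2, 0, 4],
    [4, 12, 15],
    [10, 26, 6],
    [15, 15, 0],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Contribution of each cell to the chi-square statistic.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

chi_square = sum(sum(row) for row in contributions)
df = (len(observed) - 1) * (len(observed[0]) - 1)
small_cells = sum(e < 5.0 for row in expected for e in row)

for exp_row in expected:
    print("  ".join(f"{e:6.2f}" for e in exp_row))
print(f"ChiSq = {chi_square:.3f}, df = {df}, "
      f"cells with expected count < 5: {small_cells}")
```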

Comments to: Miguel A. Arcones
