Section 3.5.

In this section we build contingency tables using the command "table". This command displays one-way, two-way and multi-way tables. The cells may contain counts, percents and statistics from a chi-square test; they may also contain summary statistics of associated variables (i.e., any variable which is not used as a classification variable). The cells may also contain data. The classification variables must contain integer values between -10000 and +10000 or missing values (*).

We work with the worksheet car.mtw.

MTB > Retrieve 'C:\MTBSEW\INDUST~1\MTW\CAR.MTW'.

First, we code the data; then we find the correlations, the contingency table and the chi-square statistic. We consider the variables turn diameter and miles/gallon (city). These variables are in c3 and c5.

MTB > code (27:30.6) 1 (30.7:34.2) 2 (34.3:37.8) 3 (37.9:44.0) 4 c3 c7
MTB > code (12:18) 1 (19:24) 2 (25:35) 3 c5 c8
MTB > corre c3 c5

Correlation of Tur_Diam and MPG_City = -0.541

MTB > corre c7 c8

Correlation of C7 and C8 = -0.451

This is the correlation of the coded variables. The two correlations are similar.

MTB > table c7 c8;
SUBC> chis.

ROWS: C7     COLUMNS: C8

          1     2     3   ALL
  1       2     0     4     6
  2       4    12    15    31
  3      10    26     6    42
  4      15    15     0    30
ALL      31    53    25   109

CHI-SQUARE = 34.990 WITH D.F. = 6

CELL CONTENTS --
COUNT

Then, we can edit this table to get:

Miles/gallon city

Turn diameter   12-18   19-24   25-35   Total
27-30.6             2       0       4       6
30.7-34.2           4      12      15      31
34.3-37.8          10      26       6      42
37.9-44.0          15      15       0      30
Total              31      53      25     109
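The coding and tabulation steps used to build this table can be sketched outside Minitab as follows. This is a minimal Python illustration, not the Minitab implementation: the intervals are the ones given in the two code commands, but the function names code_value and cross_tab and the five sample cars are hypothetical, and returning uncovered values unchanged is an assumption about how such values should be treated.

```python
from collections import Counter

# Intervals and codes copied from the two "code" commands in this section.
TURN_DIAM_RULES = [((27.0, 30.6), 1), ((30.7, 34.2), 2),
                   ((34.3, 37.8), 3), ((37.9, 44.0), 4)]
MPG_CITY_RULES = [((12, 18), 1), ((19, 24), 2), ((25, 35), 3)]


def code_value(x, rules):
    """Return the code of the first closed interval containing x.

    Assumption: a value covered by no interval is returned unchanged.
    """
    for (low, high), code in rules:
        if low <= x <= high:
            return code
    return x


def cross_tab(row_codes, col_codes):
    """Count how often each (row code, column code) pair occurs."""
    return Counter(zip(row_codes, col_codes))


# Hypothetical sample values, NOT taken from car.mtw.
turn_diam = [28.5, 33.0, 36.1, 40.2, 35.0]
mpg_city = [30, 26, 20, 14, 22]

coded_turn = [code_value(x, TURN_DIAM_RULES) for x in turn_diam]
coded_mpg = [code_value(x, MPG_CITY_RULES) for x in mpg_city]
counts = cross_tab(coded_turn, coded_mpg)

for (row, col), count in sorted(counts.items()):
    print(f"turn diameter class {row}, mpg class {col}: {count}")
```

Each cell of the contingency table is simply the number of cars whose coded pair falls in that row and column; pairs that never occur have count zero.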

From this table, we see that as the turn diameter gets bigger, the miles/gallon get smaller. So, the two variables are dependent and negatively correlated. The chi-square statistic measures the dependence between the two random variables: if the chi-square statistic is large, then the dependence between the two random variables is large. How large the chi-square statistic has to be before we conclude that the two random variables are dependent depends on k and m, the numbers of rows and columns of the table, through the degrees of freedom (k-1)(m-1). Here k = 4 and m = 3, which gives the 6 degrees of freedom reported with the table.

mean-squared contingency = Phi^2 = X^2/N = 34.99/109 = 0.321
Tschuprow index = T = Phi/sqrt((k-1)(m-1)) = sqrt(0.321/(3*2)) = 0.231
Cramer index = C = Phi/sqrt(min(k-1,m-1)) = sqrt(0.321/2) = 0.401

We can get the table of expected frequencies by first entering the columns of the contingency table into worksheet columns, and then using the command chisquare:

MTB > set c11
DATA> 2 4 10 15
DATA> end
MTB > set c12
DATA> 0 12 26 15
DATA> end
MTB > set c13
DATA> 4 15 6 0
DATA> end
MTB > chis c11 c12 c13

Expected counts are printed below observed counts

           C11      C12      C13    Total
    1        2        0        4        6
          1.71     2.92     1.38
    2        4       12       15       31
          8.82    15.07     7.11
    3       10       26        6       42
         11.94    20.42     9.63
    4       15       15        0       30
          8.53    14.59     6.88
Total       31       53       25      109

ChiSq = 0.051 + 2.917 + 5.003 +
        2.631 + 0.627 + 8.755 +
        0.317 + 1.524 + 1.370 +
        4.903 + 0.012 + 6.881 = 34.990
df = 6
3 cells with expected counts less than 5.0

We get the same chi-square statistic as before: 34.990.

Possible subcommands of the command table are: MEANS, DATA, TOTPERCENTS, MEDIANS, N, CHISQUARE, SUMS, NMISS, MISSING, MINIMUMS, PROPORTION, NOALL, MAXIMUMS, COUNTS, ALL, STDEV, ROWPERCENTS, FREQUENCIES.
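The expected counts, the chi-square statistic and the three indices of this section can be cross-checked by hand from the observed counts alone. The sketch below does this in plain Python; it is an independent check rather than what Minitab does internally, the variable names are illustrative, and the Tschuprow and Cramer indices follow the definitions given in this section.

```python
import math

# Observed counts: rows are the coded turn diameter classes (C7),
# columns are the coded miles/gallon classes (C8).
observed = [
    [2, 0, 4],
    [4, 12, 15],
    [10, 26, 6],
    [15, 15, 0],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

k, m = len(observed), len(observed[0])  # numbers of rows and columns
df = (k - 1) * (m - 1)
small_cells = sum(1 for row in expected for e in row if e < 5.0)

# Indices as defined in this section.
phi_sq = chi_square / n                               # mean-squared contingency
tschuprow = math.sqrt(phi_sq / ((k - 1) * (m - 1)))   # Phi / sqrt((k-1)(m-1))
cramer = math.sqrt(phi_sq / min(k - 1, m - 1))        # Phi / sqrt(min(k-1, m-1))

for row in expected:
    print("  ".join(f"{e:6.2f}" for e in row))
print(f"chi-square = {chi_square:.3f}, df = {df}, cells with expected < 5: {small_cells}")
print(f"Phi^2 = {phi_sq:.3f}, T = {tschuprow:.3f}, C = {cramer:.3f}")
```

Working from the cell counts keeps the check independent of the raw worksheet, which is the same idea as entering the table columns into c11, c12 and c13 before running the chisquare command.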

Comments to: Miguel A. Arcones