These sections are devoted to the study of the bootstrap. The bootstrap is a recent statistical method that consists of simulating from the sample distribution. It was introduced in 1979 by Bradley Efron, who is a professor at Stanford. The CD-ROM that comes with the book provides several macros which you can use to do bootstraps.

To find confidence intervals we need the quantiles of a certain distribution. Bootstrap confidence intervals are found by using the distribution of the simulations to find those quantiles. For example, to find confidence intervals for the mean and variance of the Tur_Diam variable in the car file we do

MTB > Retrieve 'C:\ISTAT\CAR.MTW'.
Retrieving worksheet from file: C:\ISTAT\CAR.MTW
Worksheet was saved on 11/25/1995
MTB > let c1=c3
MTB > erase c2-c5
MTB > Execute 'C:\ISTAT\CHA7\BOOTMEAN.MTB' 500.
Executing from file: C:\ISTAT\CHA7\BOOTMEAN.MTB

*****bootmean******************
sample 100 c1 c2;
replace.
let k1=mean(c2)
let k2=stan(c2)
stack k1 c3 c3
stack k2 c4 c4
end
***********************

In this way we draw 500 bootstrap samples, each of size 100 and taken with replacement, from the Tur_Diam variable. Columns c3 and c4 now hold the 500 bootstrap means and the 500 bootstrap standard deviations, respectively. We can use c3 to find the 95% bootstrap confidence interval for the mean. For the variability of Tur_Diam we use c4. Note that BOOTMEAN stores stan(c2), the standard deviation, so the interval obtained from c4 is on the scale of sigma, even though the macro below labels it sigma^2; squaring its endpoints gives the interval for the variance.

Let X(k) denote the k-th smallest of the 500 values in c3. The .025 quantile of c3 is

X((501)(.025)) = X(12.525) = (.475)X(12) + (.525)X(13)

The .975 quantile of c3 is

X((501)(.975)) = X(488.475) = (.525)X(488) + (.475)X(489)

So, we use the macro

************7-2A.MTB**********************
sort c3 c5
let k11=(.475*c5(12))+(0.525*c5(13))
let k12=(.525*c5(488))+(0.475*c5(489))
note
note
note the confidence interval for mu is
print k11 k12
sort c4 c6
let k13=(.475*c6(12))+(0.525*c6(13))
let k14=(.525*c6(488))+(0.475*c6(489))
note
note
note the confidence interval for sigma^2 is
print k13 k14
note
note
note the classical t interval for this variable is
tinter c1
end
**********************************

MTB > Execute 'C:\ISTAT\7-2A.MTB' 1.
Executing from file: C:\ISTAT\7-2A.MTB

the confidence interval for mu is
K11 34.8000
K12 36.1389

the confidence interval for sigma^2 is
K13 2.94498
K14 3.61898

the classical t interval for this variable is

             N      MEAN     STDEV   SE MEAN    95.0 PERCENT C.I.
Num_Cyl    109    35.514     3.321     0.318   ( 34.883,  36.144)

(The row label Num_Cyl is presumably the original name of column c1, which now holds the Tur_Diam values.)

The bootstrap confidence interval for the mean, (34.8000, 36.1389), is very similar to the t interval, (34.883, 36.144). Histograms for these data can be found using standard methods.

To find confidence intervals for the median and the quartiles, we proceed as before, but we use the following macros

****bootmed*******************
sample 100 c1 c2;
replace.
let k3=medi(c2)
stack c3 k3 c3
end
********************

****bootperc****************
sample 100 c1 c2;
replace.
sort c2 c3
let k1=(.75*c3(25))+(.25*c3(26))
let k2=(c3(50)+c3(51))/2
let k3=(.25*c3(75))+(.75*c3(76))
stack c4 k1 c4
stack c5 k2 c5
stack c6 k3 c6
end
***************************

The next macro applies the same idea to a simple linear regression with 16 observations. For each bootstrap sample it recomputes the least-squares slope (k7) and intercept (k8) and stores them in c8 and c7. It appears to assume that k1 and k2 already hold the fitted intercept and slope, that c1 holds the predictor, and that c4 holds the residuals to be resampled.

********bootregr*******************
sample 16 c4 c5;
replace.
let c6=k1+k2*c1+c5
let k3=sum(c1*c1)
let k4=sum(c1)
let k5=sum(c6)
let k6=sum(c1*c6)
let k7=(16*k6-k4*k5)/(16*k3-k4*k4)
let k8=mean(c6)-k7*mean(c1)
stack c7 k8 c7
stack c8 k7 c8
end
***************************

We can also do hypothesis testing. For example, in the car data, we can test Ho: mu=35 versus Ha: mu>35:

MTB > Retrieve 'C:\ISTAT\CAR.MTW'.
Retrieving worksheet from file: C:\ISTAT\CAR.MTW

***********EXAMPLE 2*********************

Next, we find 500 bootstrap samples for the turn-diameter data and find the t-statistic for each bootstrap sample. Each bootstrap t-statistic is centered at the mean of the original data, mean(c1), so that the bootstrap values imitate the distribution of the statistic under Ho.

*****BOOTTEST******************
sample 100 c1 c2;
replace.
let k4=sqrt(100)*(mean(c2)-mean(c1))/stan(c2)
stack k4 c3 c3
end
***********************

MTB > let c1=c3
MTB > erase c2-c6
MTB > Execute 'C:\ISTAT\CHA7\BOOTTEST.MTB' 500.
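Executing from file: C:\ISTAT\CHA7\BOOTTEST.MTB

Now we do the test with the macro 7-2B.MTB. The critical value is the .95 quantile of the 500 bootstrap t-statistics, X((501)(.95)) = X(475.95) = (.05)X(475) + (.95)X(476).

*******************
sort c3 c5
let k11=(.05*c5(475))+(0.95*c5(476))
note
note
note we reject Ho if t(data)>= than
print k11
tinter c1
note
note
note The classical t test is
ttest 35 c1;
alte 1.
end
*******************

MTB > Execute 'C:\ISTAT\7-2B.MTB' 1.
Executing from file: C:\ISTAT\7-2B.MTB

we reject Ho if t(data)>= than
K11 1.73287

             N      MEAN     STDEV   SE MEAN    95.0 PERCENT C.I.
Num_Cyl    109    35.514     3.321     0.318   ( 34.883,  36.144)

The classical t test is

TEST OF MU = 35.000 VS MU G.T. 35.000

             N      MEAN     STDEV   SE MEAN         T   P VALUE
Num_Cyl    109    35.514     3.321     0.318      1.62     0.055

Since the observed t-statistic, 1.62, is smaller than the bootstrap critical value, 1.73287, we do not reject Ho at the 5% level.

To find the bootstrap p-value, we compute the proportion of the sorted bootstrap t-statistics in c5 that exceed the observed value 1.62:

MTB > code (-100:1.62)0 (1.62:100)1 c5 c6
MTB > mean c6
MEAN = 0.066000

The bootstrap p-value of the test is 0.066. The classical p-value of the test is 0.055. Both exceed 0.05, so the two tests lead to the same conclusion.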
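
For readers without Minitab, the percentile interval and the bootstrap test above can be sketched in Python. This is only an illustration under stated assumptions, not a translation of the book's macros: the function names (order_stat_quantile, bootstrap, percentile_interval, bootstrap_t_test) are hypothetical, the data are synthetic because the CAR.MTW worksheet is not reproduced here, and each bootstrap sample has the same size as the data rather than the fixed size 100 used in the macros.

```python
import random
import statistics
from math import sqrt


def order_stat_quantile(values, p):
    """Quantile X((B+1)p), interpolating linearly between order statistics."""
    xs = sorted(values)
    pos = (len(xs) + 1) * p      # e.g. 501 * 0.025 = 12.525 when B = 500
    k = int(pos)                 # 1-based index of the lower order statistic
    frac = pos - k
    if k < 1:
        return xs[0]
    if k >= len(xs):
        return xs[-1]
    return (1 - frac) * xs[k - 1] + frac * xs[k]


def bootstrap(data, stat, reps=500, size=None, rng=None):
    """Apply stat to reps samples drawn with replacement from data."""
    if rng is None:
        rng = random.Random(0)
    if size is None:
        size = len(data)         # assumption: resample size equals the data size
    return [stat(rng.choices(data, k=size)) for _ in range(reps)]


def percentile_interval(boot_values, level=0.95):
    """Bootstrap percentile confidence interval from simulated statistics."""
    alpha = (1 - level) / 2
    return (order_stat_quantile(boot_values, alpha),
            order_stat_quantile(boot_values, 1 - alpha))


def bootstrap_t_test(data, mu0, reps=500, rng=None):
    """One-sided test of Ho: mu = mu0 versus Ha: mu > mu0."""
    n = len(data)
    xbar = statistics.mean(data)
    t_obs = sqrt(n) * (xbar - mu0) / statistics.stdev(data)

    def t_star(sample):
        # Centered at the observed mean, as in the BOOTTEST macro.
        return sqrt(n) * (statistics.mean(sample) - xbar) / statistics.stdev(sample)

    t_boot = bootstrap(data, t_star, reps=reps, rng=rng)
    critical = order_stat_quantile(t_boot, 0.95)
    p_value = sum(t > t_obs for t in t_boot) / reps
    return t_obs, critical, p_value


# Synthetic stand-in for the 109 turn-diameter values (not the real data).
demo_rng = random.Random(1)
demo_data = [demo_rng.gauss(35.5, 3.3) for _ in range(109)]

boot_means = bootstrap(demo_data, statistics.mean, rng=random.Random(2))
boot_sds = bootstrap(demo_data, statistics.stdev, rng=random.Random(3))
print("interval for mu:   ", percentile_interval(boot_means))
print("interval for sigma:", percentile_interval(boot_sds))
print("t, critical value, p-value:",
      bootstrap_t_test(demo_data, 35.0, rng=random.Random(4)))
```

With 500 replications, order_stat_quantile reproduces the interpolation weights used in 7-2A.MTB and 7-2B.MTB, for example (.475)X(12) + (.525)X(13) for the .025 quantile. The numbers printed for the synthetic data will not match the Minitab output above.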