Sections 7.2 and 7.3.



These sections are devoted to the study of the bootstrap.
The bootstrap is a relatively recent statistical method that consists of
simulating from the empirical distribution of the sample, that is,
resampling the observed data with replacement. The bootstrap was introduced
in 1979 by Bradley Efron, a professor at Stanford.

The CD-ROM that comes with the book provides several macros which you
can use to do bootstraps.

To find a confidence interval we need the quantiles of a certain distribution.
Bootstrap confidence intervals use the quantiles of the distribution of the
simulated (bootstrap) statistics in place of theoretical quantiles.

For example, to find confidence intervals for the mean and the standard
deviation of the Tur_Diam variable in the car file, we do

MTB > Retrieve  'C:\ISTAT\CAR.MTW'.
Retrieving worksheet from file: C:\ISTAT\CAR.MTW
Worksheet was saved on 11/25/1995
MTB > let c1=c3
MTB > erase c2-c5
MTB > Execute 'C:\ISTAT\CHA7\BOOTMEAN.MTB' 500.
Executing from file: C:\ISTAT\CHA7\BOOTMEAN.MTB

*****bootmean******************
sample 100 c1 c2;
replace.
let k1=mean(c2)
let k2=stan(c2)
stack k1 c3 c3
stack k2 c4 c4
end
***********************
In this way we obtain 500 bootstrap samples, each of size 100 and drawn
with replacement from the Tur_Diam variable.
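
The resampling loop performed by BOOTMEAN can be sketched in Python as
follows. This is only an illustration under stated assumptions: the data
are a synthetic stand-in generated with random.gauss (not the real car
file), and the names bootstrap_stats, n_boot and size are hypothetical.

```python
import random
import statistics

def bootstrap_stats(data, n_boot=500, size=100, seed=0):
    """Return the bootstrap means and standard deviations of `data`.

    Each of the n_boot resamples has `size` values drawn with replacement,
    mirroring `sample 100 c1 c2; replace.` in the macro.
    """
    rng = random.Random(seed)
    means, sds = [], []
    for _ in range(n_boot):
        resample = rng.choices(data, k=size)     # sampling with replacement
        means.append(statistics.mean(resample))  # plays the role of k1
        sds.append(statistics.stdev(resample))   # plays the role of k2
    return means, sds

# Synthetic stand-in for the Tur_Diam column (assumed values, not the car data).
rng = random.Random(1)
data = [rng.gauss(35.5, 3.3) for _ in range(109)]
means, sds = bootstrap_stats(data)
print(len(means), len(sds))
```

The two returned lists correspond to the columns c3 and c4 of the worksheet.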

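Now columns c3 and c4 hold the bootstrap means and the bootstrap standard
deviations, respectively (one value per bootstrap sample).
We can use c3 to find the 95% bootstrap confidence interval for the mean.
For the spread of Tur_Diam, we use c4. Since BOOTMEAN stores stan(c2), the
interval obtained from c4 is for the standard deviation sigma, even though
the macro below labels it sigma^2.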

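Let X(1) <= X(2) <= ... <= X(500) denote the sorted values of c3. The p-th
quantile is taken at position (500+1)p, interpolating linearly between the
two neighboring order statistics.
The .025 quantile of c3 is
X((501)(.025))=X(12.525)=(.475)X(12)+(.525)X(13)
The .975 quantile of c3 is
X((501)(.975))=X(488.475)=(.525)X(488)+(.475)X(489)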
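
The interpolation rule above can be written as a small Python function.
This is a minimal sketch; the name interpolated_quantile is hypothetical,
and the clamping at the two ends of the sorted list is an assumption for
positions that fall outside 1..n.

```python
import math

def interpolated_quantile(values, p):
    """Quantile at position (n+1)p with linear interpolation.

    With k = floor((n+1)p) and f the fractional part, the result is
    (1-f)*X(k) + f*X(k+1), where X(1) <= ... <= X(n) are the sorted values.
    """
    xs = sorted(values)
    n = len(xs)
    pos = (n + 1) * p
    k = math.floor(pos)
    frac = pos - k
    if k < 1:        # position below the smallest order statistic
        return xs[0]
    if k >= n:       # position at or beyond the largest order statistic
        return xs[-1]
    return (1 - frac) * xs[k - 1] + frac * xs[k]

# Illustrative check with X(i) = i, so each quantile equals its position.
boot = list(range(1, 501))
lower = interpolated_quantile(boot, 0.025)  # about 12.525
upper = interpolated_quantile(boot, 0.975)  # about 488.475
print(lower, upper)
```

Applied to the 500 bootstrap values, the pair (lower, upper) is the 95%
percentile interval.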
So, we use the macro
************7-2A.MTB**********************
sort c3 c5
let k11=(.475*c5(12))+(0.525*c5(13))
let k12=(.525*c5(488))+(0.475*c5(489))
note 
note 
note the confidence interval for mu is
print k11 k12
sort c4 c6
let k13=(.475*c6(12))+(0.525*c6(13))
let k14=(.525*c6(488))+(0.475*c6(489))
note 
note 
note the confidence interval for sigma^2 is
print k13 k14
note 
note 
note the classical t interval for this variable is
tinter c1
end
**********************************

MTB > Execute 'C:\ISTAT\7-2A.MTB' 1.
Executing from file: C:\ISTAT\7-2A.MTB
 
 
the confidence interval for mu is

K11      34.8000
K12      36.1389
 
 
the confidence interval for sigma^2 is

K13      2.94498
K14      3.61898
the classical t interval for this variable is

             N      MEAN    STDEV  SE MEAN   95.0 PERCENT C.I.
Num_Cyl    109    35.514    3.321    0.318  (  34.883,  36.144)

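The bootstrap confidence interval for the mean, (34.800, 36.139), is very
similar to the t interval, (34.883, 36.144). The output row is labeled
Num_Cyl, presumably because c1 kept its original column name after
let c1=c3; the values are those of Tur_Diam.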

We can find histograms for these data using standard methods.

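To find confidence intervals for the median and the quartiles, we
proceed as before, but we use the macros bootmed and bootperc below.
The third macro, bootregr, applies the same idea to simple linear
regression: it appears to resample 16 residuals stored in c4, rebuild the
responses from the fitted intercept k1 and slope k2, and stack the refitted
intercept and slope in c7 and c8.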
****bootmed*******************
sample 100 c1 c2;
replace.
let k3=medi(c2)
stack c3 k3 c3
end
********************

****bootperc****************
sample 100 c1 c2;
replace.
sort c2 c3
let k1=(.75*c3(25))+(.25*c3(26))
let k2=(c3(50)+c3(51))/2
let k3=(.25*c3(75))+(.75*c3(76))
stack c4 k1 c4
stack c5 k2 c5
stack c6 k3 c6
end
***************************
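
The quartile computation in bootperc can be sketched in Python as follows.
The sketch relies on statistics.quantiles (Python 3.8 or later), whose
default "exclusive" method places the i-th quartile at position (n+1)i/4,
which for n=100 gives the same weights as the macro. The data are a
synthetic stand-in and the name bootstrap_quartiles is hypothetical.

```python
import random
import statistics

def bootstrap_quartiles(data, n_boot=500, size=100, seed=0):
    """Return bootstrap first quartiles, medians and third quartiles."""
    rng = random.Random(seed)
    q1s, meds, q3s = [], [], []
    for _ in range(n_boot):
        resample = rng.choices(data, k=size)  # sampling with replacement
        # Default method="exclusive": cut points at positions (n+1)*i/4.
        q1, med, q3 = statistics.quantiles(resample, n=4)
        q1s.append(q1)
        meds.append(med)
        q3s.append(q3)
    return q1s, meds, q3s

# Synthetic stand-in for the Tur_Diam column (assumed values, not the car data).
rng = random.Random(1)
data = [rng.gauss(35.5, 3.3) for _ in range(109)]
q1s, meds, q3s = bootstrap_quartiles(data)
print(len(q1s), len(meds), len(q3s))
```

The three lists play the role of columns c4, c5 and c6 after the macro has
been executed 500 times.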

********bootregr*******************
sample 16 c4 c5;
replace.
let c6=k1+k2*c1+c5
let k3=sum(c1*c1)
let k4=sum(c1)
let k5=sum(c6)
let k6=sum(c1*c6)
let k7=(16*k6-k4*k5)/(16*k3-k4*k4)
let k8=mean(c6)-k7*mean(c1)
stack c7 k8 c7
stack c8 k7 c8
end
***************************
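
The bootregr macro reads like a residual bootstrap for a straight-line fit.
Under that reading, a minimal Python sketch is given below. The data are
synthetic (16 made-up points, not taken from the book), and the names
fit_line and residual_bootstrap are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope, using the same sums as the macro."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = sy / n - slope * sx / n
    return intercept, slope

def residual_bootstrap(x, y, n_boot=500, seed=0):
    """Resample residuals, rebuild the responses, and refit the line."""
    rng = random.Random(seed)
    b0, b1 = fit_line(x, y)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    intercepts, slopes = [], []
    for _ in range(n_boot):
        e_star = rng.choices(residuals, k=len(x))  # with replacement
        y_star = [b0 + b1 * xi + e for xi, e in zip(x, e_star)]
        a, b = fit_line(x, y_star)
        intercepts.append(a)
        slopes.append(b)
    return intercepts, slopes

# Synthetic illustration with 16 points (assumed values).
rng = random.Random(2)
x = list(range(1, 17))
y = [1.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]
intercepts, slopes = residual_bootstrap(x, y)
print(len(intercepts), len(slopes))
```

Percentile intervals for the intercept and the slope can then be read from
the sorted lists, exactly as for the mean.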

We can also do hypothesis testing.
For example, in the car data, we can test Ho: mu=35 versus Ha: mu>35:

MTB > Retrieve  'C:\ISTAT\CAR.MTW'.
Retrieving worksheet from file: C:\ISTAT\CAR.MTW

***********EXAMPLE 2*********************
Next, we find 500 bootstrap samples for the turn-diameter data and
find the t-statistic for each bootstrap sample.

*****BOOTTEST******************
sample 100 c1 c2;
replace.
let k4=sqrt(100)*(mean(c2)-mean(c1))/stan(c2)
stack k4 c3 c3
end
***********************

MTB > let c1=c3
MTB > erase c2-c6
MTB > Execute 'C:\ISTAT\CHA7\BOOTTEST.MTB' 500.
Executing from file: C:\ISTAT\CHA7\BOOTTEST.MTB
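
The statistic stacked by BOOTTEST can be sketched in Python as follows.
Each bootstrap t-statistic is centered at the mean of the original sample,
so the values approximate the distribution of t under Ho. The data are a
synthetic stand-in and the name bootstrap_t_statistics is hypothetical.

```python
import math
import random
import statistics

def bootstrap_t_statistics(data, n_boot=500, size=100, seed=0):
    """Return sqrt(size)*(mean* - mean(data))/sd* for each resample."""
    rng = random.Random(seed)
    center = statistics.mean(data)  # mean(c1) in the macro
    t_values = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=size)  # sampling with replacement
        t = (math.sqrt(size) * (statistics.mean(resample) - center)
             / statistics.stdev(resample))
        t_values.append(t)
    return t_values

# Synthetic stand-in for the Tur_Diam column (assumed values, not the car data).
rng = random.Random(1)
data = [rng.gauss(35.5, 3.3) for _ in range(109)]
t_boot = bootstrap_t_statistics(data)
print(len(t_boot))
```

The upper quantiles of t_boot serve as critical values for the one-sided
test.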
Now we carry out the test with the macro

************7-2B.MTB**********************
sort c3 c5
let k11=(.05*c5(475))+(0.95*c5(476))
note 
note 
note we reject Ho if t(data)>= than
print k11 
tinter c1
note 
note 
note The classical t test is
ttest 35 c1;
alte 1. 
end
*******************

MTB > Execute 'C:\ISTAT\7-2B.MTB' 1.
Executing from file: C:\ISTAT\7-2B.MTB

 
we reject Ho if t(data)>= than

K11      1.73287

             N      MEAN    STDEV  SE MEAN   95.0 PERCENT C.I.
Num_Cyl    109    35.514    3.321    0.318  (  34.883,  36.144)

 
The classical t test is

TEST OF MU = 35.000 VS MU G.T. 35.000

             N      MEAN    STDEV   SE MEAN        T    P VALUE
Num_Cyl    109    35.514    3.321     0.318     1.62      0.055

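To find the bootstrap p-value, we compute the proportion of the bootstrap
t-statistics (sorted in c5) that exceed the observed value t=1.62: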
MTB > code (-100:1.62)0 (1.62:100)1 c5 c6
MTB > mean c6
   MEAN    =    0.066000

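The bootstrap p-value of the test is 0.066.
The classical p-value of the test is 0.055.
Since the observed t=1.62 is below the bootstrap critical value 1.73287,
we do not reject Ho at the 5% level; both p-values exceed 0.05 and lead
to the same conclusion.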
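
The p-value computation amounts to counting how many bootstrap t-statistics
are at least as large as the observed one. A minimal Python sketch follows;
the name bootstrap_p_value is hypothetical and the ten t-values are made up
for illustration (they are not the contents of c5).

```python
def bootstrap_p_value(t_boot, t_obs):
    """One-sided p-value: fraction of bootstrap t-values >= t_obs."""
    exceed = sum(1 for t in t_boot if t >= t_obs)
    return exceed / len(t_boot)

# Made-up illustrative values (assumed, not the real bootstrap output).
t_boot = [-1.9, -0.7, -0.2, 0.1, 0.4, 0.9, 1.3, 1.7, 2.1, 2.6]
p = bootstrap_p_value(t_boot, 1.62)  # 3 of the 10 values are >= 1.62
print(p)
```

With the 500 values of c5 in place of the made-up list, the same count
divided by 500 gives the bootstrap p-value.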

Comments to: Miguel A. Arcones