Math 147 - Final Exam

                Page 1  Page 2  Page 3  Page 4  Page 5  Page 6  Page 7  Page 8  Total
Perfect score     12      15      18      15      25      25      35      30     175
Your score

Math 147 - Final Exam - Fall 1997

Name: ______________________________________ Section Number: ______________

Show ALL necessary work. Solutions with no supporting work where it is necessary will receive NO credit. NEATNESS COUNTS. All tables required for completion of problems are provided at the end of the test. P-values computed from incomplete tables should be presented as an interval. Carry out all your calculations to at least two digits to the right of the decimal point. When you are finished, hand in the exam to your T.A. Good Luck.

For questions 1 - 20 circle the answer which best completes the sentence or answers the question. (3 pts each)

1. When performing tests of significance,

(a) P-values are computed assuming the null hypothesis is correct.
(b) small P-values generally support the null hypothesis.
(c) a test statistic measures the difference between the data and what is expected on the null hypothesis.
(d) both (a) and (c) are true.
(e) both (b) and (c) are true.

2. When working with sample surveys it is important to remember that

(a) the population parameter is likely to change from sample to sample.
(b) non-respondents may be very different from respondents.
(c) hand-picking samples according to key characteristics to resemble the population was shown to be a good method for reducing sampling bias.
(d) probability methods allow the interviewer to choose the subjects in a survey.
(e) none of the above are true.

3. When drawing at random from a box with replacement,

(a) the probability histogram for the contents of the box (when put into standard units) will follow the normal curve if the number of draws is large.
(b) the probability histogram for the average of the draws (when put into standard units) will follow the normal curve if the number of draws is large.
(c) if the number of draws is quadrupled then the expected value for the average quadruples.
(d) if the number of draws is quadrupled then the expected value for the average doubles.
(e) both (a) and (d) are true.

4. Which of the following statements is true?

(a) The inter-quartile range is a measurement of the center of a set of data.
(b) In a histogram with a long left tail, more than 50% of the data lies to the left of the average.
(c) The median of a large data set is usually strongly affected by the presence of outliers.
(d) Both (b) and (c) are true.
(e) None of the above statements are true.

5. A fair coin is tossed 100 times.

(a) the expected percentage of heads is the same as the expected percentage of tails.
(b) the SE for the percentage of heads in the first 25 tosses is the same as the SE for the percentage of tails in the last 25 tosses.
(c) the expected difference between the number of heads in the first 50 tosses and the number of tails in the last 50 tosses is 0.
(d) both (b) and (c) are true.
(e) all of the above statements are true.

6. Company A claims their batteries last an average of 10.23 hours in most electronic devices. Company B claims this is false, and decides to test the time to failure for 12 batteries from company A. What test of significance should they use?

(a) A χ²-test with 11 degrees of freedom.
(b) A two sample z-test for difference in averages.
(c) A z-test using the normal curve.
(d) A t-test using the Student’s curve with 11 degrees of freedom.
(e) None of the above tests should be used.

7. When studying the relationship between two variables, we should remember

(a) the correlation coefficient has the same units as the dependent variable.
(b) using the regression line far away from the average of the independent variable (in terms of standard units) is an example of the regression fallacy.
(c) associated with each increase of one SD of the independent variable is an increase of r SDs of the dependent variable.
(d) the regression line for y on x estimates the average value for x corresponding to each value of y.
(e) all of the above are true.

8. One hundred draws were made from a box with unknown contents. A 95% confidence interval for the average of the tickets in the box was then computed. It is true that

(a) the SD of the box was estimated with the SD of the population.
(b) a 68% confidence interval computed from the same data will be shorter (from endpoint to endpoint) than the 95% confidence interval.
(c) if the number of draws is quadrupled, and a new 95% confidence interval is computed, we expect the length of the new interval to be about one-quarter that of the original 95% confidence interval.
(d) both (b) and (c) are true.
(e) all of the above statements are true.

9. The event A has a 40% chance of happening. The event B has an 80% chance of happening. If the chance that both events happen together is 32% then

(a) the two events are mutually exclusive.
(b) the two events are dependent.
(c) the two events are independent.
(d) both (a) and (b) are true.
(e) no conclusion can be drawn because there isn’t enough information.

10. A machine part was weighed 36 times. These averaged out to be 780 grams, and their SD was 12 grams. It was then discovered that a 20 gram weight was on the scale during each of the measurements. We can conclude

(a) a 95% confidence interval for the average of the 36 weighings of the part is 760 ± 4 grams.
(b) the true weight of the part is 760 grams.
(c) 95% of the weighings were in the range 780 ± 4 grams.
(d) a 95% confidence interval for the weight of the part is 760 ± 4 grams.
(e) both (c) and (d) are true.

11. If we know we should be using Student’s curve for computing a P-value, then

(a) the SD of the box is unknown.
(b) the number of observations is small.
(c) SD⁺ should be used as our estimate of the SD of the box.
(d) both (b) and (c) are true.
(e) all of the above statements are true.

12. In Mendel’s experiment, garden peas of pure green strain (the recessive gene) were bred with strains of pure yellow (the dominant gene), to produce first generation hybrids. These hybrids were then bred to produce second generation hybrids. If we examine four second generation hybrids at random,

(a) we expect 3 of them to be yellow.
(b) the probability that exactly one of the four examined hybrids is yellow is 75%.
(c) if three of the four examined plants are green, the chance of the fourth being yellow increases.
(d) both (a) and (b) are true.
(e) both (a) and (c) are true.

13. Homoscedasticity in a regression setup

(a) means the regression effect is due to something other than the spread around the SD line.
(b) indicates the data had a correlation coefficient very close to 1 or -1.
(c) means the scatter diagram has a generally elliptical (football) shape.
(d) both (a) and (b) are true.
(e) both (a) and (c) are true.

14. A group of doctors has decided to test the effectiveness of a particular surgery on lengthening the life-span of people with a type of terminal illness. They have a total of twenty patients with the disorder, and they perform the surgery on the 10 of them they think have the best chances for surviving the surgery. The two groups were then compared. Their study

(a) is likely to overestimate the value of the surgery in helping people with this illness.
(b) contained a treatment and a control group.
(c) relied on the use of historical controls.
(d) both (a) and (b) are true.
(e) all of the above statements are true.

15. In randomized controlled experiments,

(a) we expect the treatment group to be similar to the control group.
(b) the investigators decide who will be treated and who will not.
(c) a placebo must be used.
(d) confounding factors are harder to correct for.
(e) none of the above statements are true.

16. A certain SAT Exam has an average score of 600 and an SD of 100. What score best approximates the 20th percentile?

17. When drawing at random with replacement from a box of tickets, increasing the sample size (number of draws) by a factor of 9

(a) multiplies the SE for the average by a factor of 3.
(b) divides the SE for the average by a factor of 9.
(c) multiplies the SE for the average by a factor of 9.
(d) divides the SE for the average by a factor of 3.
(e) has no effect on the SE for the average.

18. When performing a test of significance, a box model is set up, the null and alternative hypotheses are formulated, and

(a) the P-value gives the probability that the null hypothesis is true.
(b) the Student’s curve is used if the SD of the box is known.
(c) a χ²-test is used if the null hypothesis is that the data came from a particular chance model.
(d) the P-value gives the probability of getting a test statistic very close to the null hypothesis.
(e) both (b) and (c) are true.

19. In a Gauss model for measurement error, which of the following are true?

(a) The SD of the error box is zero.
(b) The average of many repeated measurements is used as an estimate of the average of the error box.
(c) The SD of many repeated measurements is used as an estimate of the SD of the error box.
(d) Both (b) and (c) are true.
(e) All of the above statements are true.

20. In a situation modeled by drawing at random from a box, which of the following is true?

(a) The SD for the box says how far an individual draw is from the average of the box.
(b) The SE for sample average says how far the sample average is from the average of the box.
(c) The SE for sample average says how far an individual draw is from the average of the box.
(d) The SD for the box says how far the sample average is from the average of the box.
(e) Both (a) and (b) are true.

21. A fair die is rolled 4 times. Answer the following questions about the outcome of the rolls.

(a) (4 points) What is the probability all the rolls show 3 or more spots?

(b) (5 points) What is the probability of getting at least one roll with 6 spots in the four rolls?

(c) (7 points) What is the probability that the sum of the number of spots on the first roll and the fourth roll is 9?

(d) (3 points) Consider the following two sequences of rolls: (I) 3, 3, 3, 3 (II) 1, 3, 4, 6

Which of the following is correct? (circle only one)

(1) Sequence (I) is more likely.

(2) Sequence (II) is more likely.

(3) Both sequences are equally likely.

(e) (6 points) If the same fair die were rolled a total of seven times, what is the probability that exactly four of the rolls show five spots? (Give your answer as a percentage.)
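For checking hand calculations on the die-rolling parts of question 21, a minimal Python sketch follows. It assumes a fair six-sided die and independent rolls, as the question states; the variable names are illustrative only. Exact fractions are used so the results can be compared with work done by hand.

```python
from fractions import Fraction
from itertools import product
from math import comb

FACES = range(1, 7)  # faces of a fair six-sided die

# All four rolls show 3 or more spots: 4 favorable faces out of 6 on each roll.
p_all_three_or_more = Fraction(4, 6) ** 4

# At least one 6 in four rolls: complement of "no 6 on any roll".
p_at_least_one_six = 1 - Fraction(5, 6) ** 4

# First and fourth rolls sum to 9: enumerate the 36 equally likely pairs.
# The second and third rolls do not affect this event.
pairs = list(product(FACES, repeat=2))
p_sum_nine = Fraction(sum(1 for a, b in pairs if a + b == 9), len(pairs))

# Any one specific ordered sequence of four rolls has the same chance.
p_specific_sequence = Fraction(1, 6) ** 4

# Exactly four fives in seven rolls: binomial formula.
p_four_fives = comb(7, 4) * Fraction(1, 6) ** 4 * Fraction(5, 6) ** 3

print(p_all_three_or_more, p_at_least_one_six, p_sum_nine)
print(p_specific_sequence, f"{float(p_four_fives):.2%}")
```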

22. Ukraine has long kept records of its wheat harvests for famine prediction. Each year, the number of sunny days before August has been recorded, as well as the size of the final harvest at the end of September. The following data has been gathered over the last 103 years.

Sunny days (before August): Ave. = 100 SD = 25

Crop size (tons of wheat): Ave. = 1500 SD = 200

The scatter diagram is football shaped and r = 0.55

(a) (7 points) What is the equation of the regression line for predicting the yearly harvest from the number of sunny days?

(b) (4 points) If this year the number of sunny days before August was 127, what would your estimate of the size of this year’s wheat crop be?

years with 127 sunny days before August, what percentage are famine years?

(d) (6 points) If a particular year had a final harvest of 1530 tons of wheat, what would you use for your estimate of the number of sunny days before August that year?
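The two regression lines used in question 22 can be sketched as follows. This is a minimal illustration built only from the summary statistics given in the question; the function names are hypothetical. Note that predicting sunny days from harvest uses the regression line of x on y, which is not the same as solving the y-on-x line for x.

```python
# Summary statistics from the question.
avg_x, sd_x = 100.0, 25.0      # sunny days before August
avg_y, sd_y = 1500.0, 200.0    # crop size, tons of wheat
r = 0.55

# Regression line of harvest on sunny days: slope = r * SD_y / SD_x,
# and the line passes through the point of averages.
slope = r * sd_y / sd_x
intercept = avg_y - slope * avg_x

def predict_harvest(sunny_days):
    """Estimated harvest (tons) for a given number of sunny days."""
    return intercept + slope * sunny_days

def predict_sunny_days(harvest):
    """Estimated sunny days for a given harvest (regression of x on y)."""
    return avg_x + r * sd_x / sd_y * (harvest - avg_y)

print(f"harvest = {intercept:.1f} + {slope:.2f} * (sunny days)")
print(predict_harvest(127), predict_sunny_days(1530))
```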

23. (10 points) A box contains 18 red marbles and 6 green marbles. If we make 400 draws from this box with replacement, what is the approximate probability that we draw exactly 103 green marbles? (Your answer must be given as a percentage.)
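One way to check a hand calculation for the marble question is to compare the normal approximation (with a continuity correction, so "exactly 103" becomes the interval from 102.5 to 103.5) against the exact binomial probability. The sketch below assumes Python 3.8 or later for `statistics.NormalDist` and `math.comb`; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n = 400            # number of draws, with replacement
p = 6 / 24         # chance of green on a single draw
k = 103            # number of greens of interest

expected = n * p                 # expected number of greens
se = sqrt(n * p * (1 - p))       # SE for the number of greens

# Convert the endpoints 102.5 and 103.5 to standard units.
z_low = (k - 0.5 - expected) / se
z_high = (k + 0.5 - expected) / se

std_normal = NormalDist()
approx = std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Exact binomial probability, for comparison.
exact = comb(n, k) * p**k * (1 - p) ** (n - k)

print(f"normal approximation: {approx:.2%}, exact binomial: {exact:.2%}")
```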

24. A friend of yours wants to test your newfound statistical knowledge, so she prepares a box containing tickets labeled 1, 3, and 5. The proportions of these tickets, however, are unknown to you. So you make 400 draws at random from the box with replacement and obtain 90 1’s, 98 3’s and 212 5’s.

(a) (10 points) Find a 99.7% confidence interval for the percentage of ones in the box.

(b) (15 points) Your friend claims to have put 10 1’s, 10 3’s and 20 5’s into the box. Test if the draws you made support her claim. Clearly label your test statistic as well as the P-value for your test. Also state your conclusion in non-statistical language.
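The arithmetic for both parts of question 24 can be sketched as below. Two assumptions are made here: the 99.7% interval is taken as the sample percentage plus or minus 3 SEs, and the χ² P-value uses the closed form exp(-χ²/2), which holds only for 2 degrees of freedom. On the exam itself the P-value would be read from the table provided.

```python
from math import exp, sqrt

counts = {1: 90, 3: 98, 5: 212}    # observed draws by ticket label
n = sum(counts.values())

# Confidence interval for the percentage of 1's in the box.
frac_ones = counts[1] / n
se_pct = sqrt(frac_ones * (1 - frac_ones)) / sqrt(n) * 100
ci = (frac_ones * 100 - 3 * se_pct, frac_ones * 100 + 3 * se_pct)

# Chi-square test of the claimed box composition.
claimed = {1: 10, 3: 10, 5: 20}
total_claimed = sum(claimed.values())
expected = {label: n * c / total_claimed for label, c in claimed.items()}
chi2 = sum((counts[k] - expected[k]) ** 2 / expected[k] for k in counts)
df = len(counts) - 1

# Right-tail area under the chi-square curve; valid for df == 2 only.
p_value = exp(-chi2 / 2)

print(f"interval: {ci[0]:.2f}% to {ci[1]:.2f}%")
print(f"chi-square = {chi2:.2f}, df = {df}, P = {p_value:.1%}")
```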

25. (15 points) A computer company guarantees that their monitors have an average life of 5000 hours. A consumer activist group has tested the life of 10 of these monitors. (Such a small number were tested because of their limited budget.) The results showed an average lifetime of 4941 hours with an SD of 100 hours. Test if the data supports the activists’ claim that the monitors have an average life of less than 5000 hours. Be sure to state the null and alternative hypotheses (start with a box model), the test statistic, and the P-value. Also state your conclusion in non-statistical language.
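The test statistic for question 25 can be sketched as below. This assumes the reported SD of 100 hours was computed with divisor n, so that SD⁺ is obtained by multiplying by sqrt(n / (n - 1)); if the reported figure is already SD⁺, that correction step should be skipped. Variable names are illustrative.

```python
from math import sqrt

n = 10                  # monitors tested
sample_avg = 4941.0     # observed average lifetime, hours
sample_sd = 100.0       # reported SD, assumed to use divisor n
null_avg = 5000.0       # average lifetime under the null hypothesis

# SD+ corrects the SD for the small sample size.
sd_plus = sqrt(n / (n - 1)) * sample_sd

# SE for the average of the n draws.
se_avg = sd_plus / sqrt(n)

# t-statistic and its degrees of freedom.
t_stat = (sample_avg - null_avg) / se_avg
df = n - 1

print(f"SD+ = {sd_plus:.2f}, SE = {se_avg:.2f}, t = {t_stat:.2f}, df = {df}")
```

The P-value is then the area to the left of t under Student's curve with these degrees of freedom, read from the table supplied with the exam and reported as an interval if the table is incomplete.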

26. (15 points) Doctors know that calcium is an important nutrient for human development. However, the mechanism for its absorption into the body is still in question. To test the effectiveness of a particular fat in aiding absorption of calcium, 200 people were randomly divided into two groups of size 100. Both groups ate exactly the same meal, except the treatment group’s meal had 20 mg of the fat added to it. One hour after eating, the control group had an average serum calcium level of 388.44 ppm with an SD of 5.5 ppm, while the treatment group had an average serum calcium level of 391.00 ppm with an SD of 7.5 ppm. Perform a two sample z-test to test the hypothesis that the average serum calcium is the same for each group vs. the one-sided alternative that the group with the added fat has a higher level. Clearly label the z-statistic and the P-value. State your conclusion about the effectiveness of the fat in aiding calcium absorption in non-statistical language.
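The two-sample z-test in question 26 can be sketched as below, assuming Python 3.8 or later for `statistics.NormalDist`. The sketch treats the two groups as independent samples and uses each group's SD as given; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the question (serum calcium, ppm).
n_control, avg_control, sd_control = 100, 388.44, 5.5
n_treat, avg_treat, sd_treat = 100, 391.00, 7.5

# SE for each group's average.
se_control = sd_control / sqrt(n_control)
se_treat = sd_treat / sqrt(n_treat)

# SE for the difference of the two averages.
se_diff = sqrt(se_control**2 + se_treat**2)

# z-statistic: observed difference minus the expected difference (0 under
# the null hypothesis), divided by the SE for the difference.
z = (avg_treat - avg_control) / se_diff

# One-sided P-value: area to the right of z under the normal curve.
p_value = 1 - NormalDist().cdf(z)

print(f"SE of difference = {se_diff:.3f}, z = {z:.2f}, P = {p_value:.2%}")
```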
