Honors Introduction to Statistics
Practice Questions for Exam 2
1. The Admissions Office has developed a new 10-minute video to send to prospective students to extol the virtues of attending
(a) Carefully describe an example of a statistical experiment that could be applied to this situation. Give explicit instructions on what the 12 students should do, and be sure to indicate how randomization is used as part of your experiment.
(b) What specific question would you ask to measure a response variable in this experiment?
(c) Would you classify your response variable as categorical or quantitative?
(d) Would you classify the experiment you have described as a randomized comparative experiment, a matched pairs design, or something else? Explain briefly. Why is your type of design the best choice?
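The randomization step in part (a) can be sketched in code. This is only one possible design, assuming a randomized comparative experiment in which 6 of the 12 students watch the video and 6 do not; the student labels, the group sizes, and the function name `randomize` are illustrative assumptions, not part of the question.

```python
import random

# Hypothetical labels for the 12 prospective students (assumption).
students = [f"student_{i:02d}" for i in range(1, 13)]


def randomize(units, seed=None):
    """Shuffle the units and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# A fixed seed is used here only so the assignment can be reproduced.
video_group, control_group = randomize(students, seed=2024)
print("Watch the video:", sorted(video_group))
print("Do not watch the video:", sorted(control_group))
```

Because chance alone decides which students land in each group, systematic differences between the groups (other than the video) are unlikely.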

2. The age distribution of students at City U is modeled by the distribution shown to the right.
(a) Approximate the median student age on the graph based on the distribution. Explain how you made your approximation.
(b) Do you expect the mean student age to be higher or lower than the median? Explain briefly. Approximate the mean student age, based on the distribution.
(c) If we took random samples of size 5 from the student population, computed the average age within the sample, and looked at the distribution of these averages, would you expect the mean for the new distribution to be larger than, smaller than, or the same as, the mean you estimated in Part (b)? Explain briefly.
(d) If we took random samples of size 5 from the student population, computed the average age within the sample, and looked at the distribution of these averages, would you expect the standard deviation for the new distribution to be larger than, smaller than, or the same as, the standard deviation of the original distribution shown above? Explain briefly.
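Parts (c) and (d) can be explored with a small simulation. The City U distribution itself is not reproduced here, so the sketch below substitutes a made-up right-skewed population (age 18 plus an exponential tail); that population, the seed, and the number of repetitions are all assumptions chosen only for illustration.

```python
import random
import statistics
from math import sqrt

rng = random.Random(0)

# Hypothetical right-skewed "age" population (assumption, not the City U data).
population = [18 + rng.expovariate(1 / 5) for _ in range(50_000)]


def sample_means(pop, n, reps, rng):
    """Means of `reps` simple random samples of size `n` drawn from `pop`."""
    return [statistics.fmean(rng.sample(pop, n)) for _ in range(reps)]


means = sample_means(population, n=5, reps=10_000, rng=rng)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

print("population mean:", round(pop_mean, 2), " mean of sample means:", round(mean_of_means, 2))
print("population sd:", round(pop_sd, 2), " sd of sample means:", round(sd_of_means, 2))
print("population sd / sqrt(5):", round(pop_sd / sqrt(5), 2))
```

In runs like this, the sample means center on the population mean while their spread shrinks by roughly a factor of the square root of the sample size.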
3. Despite the difficulties, it is sometimes possible to build a strong case for causation in the absence of experiments. The evidence that smoking causes lung cancer is about as strong as non-experimental evidence can be. What criteria are necessary to suggest causation when we cannot do an experiment?
4. You are interested in determining the level of student support for student government activities. Create a question that is clearly biased, and one that is (to the extent possible) not biased. Briefly explain how you expect responses to the two questions to differ.
5. A study of education followed a large group of fifth-grade children to see how many years of school they eventually completed. Let X be the highest year of school that a randomly chosen fifth grader completes. (Students who go on to college are included in the outcome X = 12.) The study found the following probability distribution for X.
| Years | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| Probability | 0.010 | 0.007 | 0.007 | 0.013 | 0.032 | 0.068 | 0.070 | 0.041 | 0.752 |
(a) Carefully explain how you know this is a legitimate probability distribution.
(b) What percent of fifth graders eventually finished 12th grade?
(c) Explain what P(X = 4) = 0.010 means in terms of children completing school.
(d) Find P(X 6).
(e) Find the probability that a randomly chosen 5th grader finishes 12th grade, given that the student finished 9th grade.
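The checks behind parts (a) and (e) can be sketched as follows. The probabilities are taken from the table in the question; reading "finished 9th grade" as the event X ≥ 9 is an assumption, and the helper name `prob` is illustrative.

```python
from fractions import Fraction

# Probabilities from the table, stored as exact thousandths to avoid rounding.
thousandths = {4: 10, 5: 7, 6: 7, 7: 13, 8: 32, 9: 68, 10: 70, 11: 41, 12: 752}
pmf = {years: Fraction(k, 1000) for years, k in thousandths.items()}


def prob(event):
    """Total probability of the outcomes x for which event(x) is true."""
    return sum((p for x, p in pmf.items() if event(x)), Fraction(0))


# (a) Legitimate: every probability lies in [0, 1] and they sum to exactly 1.
is_legitimate = all(0 <= p <= 1 for p in pmf.values()) and sum(pmf.values()) == 1

# (e) Assumption: "finished 9th grade" is interpreted as X >= 9.
p_12_given_9 = prob(lambda x: x == 12) / prob(lambda x: x >= 9)

print("legitimate distribution:", is_legitimate)
print("P(X = 12 | X >= 9) =", p_12_given_9, "≈", round(float(p_12_given_9), 4))
```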
6. Generate two random numbers between 0 and 1 and take Y to be their sum. The sum Y can take any value between 0 and 2. The density curve looks like a triangle with base from 0 to 2.
(a) Sketch a graph of the density curve.
(b) What is the height of the triangle? How do you know?
(c) What is the probability that Y is less than 1? (Shade the area that represents the probability on your density curve, then find that area.)
(d) What is the probability that Y is less than 0.5? (Again, shade the corresponding area on your density curve.)
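The areas in Question 6 can be checked numerically. The sketch below assumes the triangle peaks at y = 1 with height 1 (the height that makes the total area equal 1) and approximates areas with a midpoint rule; the function names are illustrative.

```python
def density(y):
    """Triangular density for the sum of two uniform(0, 1) numbers."""
    if 0 <= y <= 1:
        return y
    if 1 < y <= 2:
        return 2 - y
    return 0.0


def area_below(t, steps=10_000):
    """Midpoint-rule approximation of the area under the density from 0 to t."""
    width = t / steps
    return sum(density((i + 0.5) * width) for i in range(steps)) * width


total_area = area_below(2)      # should be 1 for a legitimate density curve
p_less_1 = area_below(1)        # left half of the triangle
p_less_half = area_below(0.5)   # small triangle with base 0.5 and height 0.5

print(round(total_area, 6), round(p_less_1, 6), round(p_less_half, 6))
```

The same values follow from the triangle area formula, one half times base times height, applied to each shaded region.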
7. Tetrahedral dice are shaped like pyramids, with 4 triangular faces, each of which is an equilateral triangle (all sides have the same length). Assume each die has sides labeled 1, 2, 3 and 4. When you roll a tetrahedral die, you “roll” the number on the down face.
(a) Give a probability model for rolling two such dice.
(b) What is the probability the sum of the down-faces is 5?
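One way to build the probability model for Question 7 is to enumerate every ordered pair of down-faces. The sketch assumes the two dice are fair and rolled independently, so all 16 pairs are equally likely.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4]

# Sample space: all ordered pairs (first die, second die).
outcomes = list(product(faces, repeat=2))
p_each = Fraction(1, len(outcomes))  # assumes fair, independent dice

# Add up the probability of every pair whose sum is 5.
p_sum_5 = sum((p_each for a, b in outcomes if a + b == 5), Fraction(0))

print("number of outcomes:", len(outcomes))
print("P(sum = 5) =", p_sum_5)
```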
8. A bottling company uses a filling machine to fill glass bottles with beer. The bottles are supposed to contain 300 ml. In fact, contents vary according to a normal distribution with mean ml and standard deviation ml.
a. What is the probability that an individual bottle contains less than 295 ml?
b. What is the probability that the mean contents of the bottles in a six-pack is less than 295 ml?
c. What important result guarantees the difference between the previous two probabilities?
9. The carapace lengths (in mm) of 15 mature gopher tortoises randomly selected from the preserve in Abacoa are shown below.
320 295 284 303 315 308 303 305
272 315 291 294 276 318 278
a. Examine these data for shape, center, spread, and outliers.
b. We are making three assumptions in our use of inference right now. List those three assumptions and discuss the degree to which each is or is not met in this situation.
c. Assuming that the standard deviation of carapace lengths of all mature gopher tortoises in the preserve is σ = 16 mm, give a 95% confidence interval for the mean carapace length of all mature gopher tortoises in the preserve. Write a complete sentence interpreting the meaning of your interval. (Your sentence should say something about tortoises!)
d. Estimate the sample size you would need to compute a 95% confidence interval with a margin of error less than 3 mm.
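The arithmetic for parts c and d can be sketched as below, assuming the usual z interval for a mean with known population standard deviation (16 mm, as given in part c). The variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist, fmean

# Carapace lengths (mm) from the question.
lengths = [320, 295, 284, 303, 315, 308, 303, 305,
           272, 315, 291, 294, 276, 318, 278]

sigma = 16.0                          # population standard deviation from part c
z_star = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence

# Part c: sample mean plus or minus z* times sigma / sqrt(n).
x_bar = fmean(lengths)
margin = z_star * sigma / sqrt(len(lengths))
interval = (x_bar - margin, x_bar + margin)

# Part d: smallest n with z* * sigma / sqrt(n) below the target margin.
target_margin = 3.0
n_needed = ceil((z_star * sigma / target_margin) ** 2)

print("sample mean:", round(x_bar, 2))
print("95% interval:", tuple(round(v, 2) for v in interval))
print("sample size for margin under 3 mm:", n_needed)
```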
10. A social psychologist reports: “In our sample, ethnocentrism was significantly higher (P < 0.05) among church attenders than among non-attenders.” Explain what this means in language understandable to someone who knows no statistics. Do not use the word “significance” in your answer.
11. A random number generator is supposed to produce random numbers that are uniformly distributed on the interval from 0 to 1. If this is true, the numbers generated come from a population with mean μ = 0.5 and standard deviation σ = 0.2887. Unfortunately, producing a good random number generator is quite difficult, and it is well known that many such generators are not particularly random. You decide to test Excel’s random number generator by generating 100 random numbers between 0 and 1. You want to perform a hypothesis test to decide if Excel generates truly random numbers by looking at the mean from your sample.
a. State your hypotheses.
b. Suppose the mean of the 100 numbers generated by Excel is . Calculate the value of the test statistic. Find the p-value for the test.
c. Is the result significant at the 5% level? At the 1% level?
d. What can you conclude (or not conclude) based on your test? (Your answer should say something about random numbers!)
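The test-statistic calculation in part b can be sketched as a two-sided z test. A uniform distribution on 0 to 1 has mean 0.5 and standard deviation √(1/12) ≈ 0.2887. The sample mean used below (0.45) is a purely hypothetical value chosen for illustration, not the value from the question, and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 0.5             # mean of a uniform(0, 1) population
sigma = sqrt(1 / 12)  # standard deviation of a uniform(0, 1) population
n = 100               # number of random numbers generated


def two_sided_z_test(x_bar, mu0, sigma, n):
    """Return the z statistic and the two-sided p-value for H0: mu = mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


x_bar_example = 0.45  # hypothetical sample mean (assumption)
z, p_value = two_sided_z_test(x_bar_example, mu0, sigma, n)

print("z =", round(z, 3), " p-value =", round(p_value, 4))
print("significant at 5%:", p_value < 0.05, " significant at 1%:", p_value < 0.01)
```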
12. True or False
________ The probability of an event can be described as the proportion of times the event occurs in many repeated trials of a random phenomenon.
________ Two events are independent when they cannot occur together.
________ If we compute two confidence intervals, an 80% confidence interval and a 90% confidence interval, based on the same sample, the 80% confidence interval will be narrower.
________ The most important assumption in using techniques of inference is that our samples are SRSs.
________ Significance tests can tell us if the observed effect was likely due to chance.
13. Your mail-order company advertises that it ships 90% of its orders within three working days. You select an SRS of 100 of the 5000 orders received in the past week for an audit. The audit reveals that 86 of these orders were shipped on time.
a. Explain why we expect the number of on-time shipments in an SRS of size 100 to obey a binomial distribution. What are the relevant parameters?
b. If the company really ships 90% of its orders on time, what is the probability that 86 or fewer in an SRS of 100 orders are shipped on time? (Use the normal approximation to the binomial distribution.)
c. A critic says, “You claim 90%, but in your sample the on-time percentage is only 86%. So the 90% claim is wrong.” Explain in simple language why your probability calculation in (b) shows that the result of the sample does not refute the 90% claim.
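The calculation in part b can be sketched as follows, treating the count of on-time orders as binomial with n = 100 and p = 0.9 (reasonable because the sample is a small fraction of the 5000 orders). The exact binomial sum and the continuity-corrected version are included only for comparison; they are not required by the question.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.90
mean = n * p                # expected number of on-time orders
sd = sqrt(n * p * (1 - p))  # standard deviation of the count

# Normal approximation to P(X <= 86), without and with continuity correction.
p_normal = NormalDist(mean, sd).cdf(86)
p_normal_cc = NormalDist(mean, sd).cdf(86.5)

# Exact binomial probability, for comparison.
p_exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, 87))

print("normal approximation:", round(p_normal, 4))
print("with continuity correction:", round(p_normal_cc, 4))
print("exact binomial:", round(p_exact, 4))
```

All three versions give a probability that is far from negligible, which is the point the critic in part c overlooks.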
14. Compute the following probabilities, based on a standard deck of 52 cards (no jokers).
a. Draw one card. What is the probability you draw a spade?
b. Draw one card. What is the probability you draw a jack given that you draw a face card?
c. Draw two cards. What is the probability that your second card is a spade, given that the first card you drew was a spade?
d. Draw two cards. What is the probability that your second card is a spade, given that the first card you drew was a heart?
e. Draw two cards. What is the probability you draw two cards in the same suit?
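The probabilities in Question 14 can be checked by counting equally likely outcomes. The sketch assumes cards are drawn without replacement and that "face card" means jack, queen, or king; the helper names `prob` and `cond_prob` are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))        # 52 (rank, suit) pairs
two_cards = list(permutations(deck, 2))   # ordered draws without replacement


def prob(event, space):
    """Fraction of equally likely outcomes in `space` that satisfy `event`."""
    favorable = sum(1 for outcome in space if event(outcome))
    return Fraction(favorable, len(space))


def cond_prob(event, given, space):
    """P(event | given): restrict the sample space to `given`, then count."""
    reduced = [outcome for outcome in space if given(outcome)]
    return prob(event, reduced)


face_ranks = {"J", "Q", "K"}  # assumption: aces are not face cards

p_a = prob(lambda c: c[1] == "spades", deck)
p_b = cond_prob(lambda c: c[0] == "J", lambda c: c[0] in face_ranks, deck)
p_c = cond_prob(lambda d: d[1][1] == "spades", lambda d: d[0][1] == "spades", two_cards)
p_d = cond_prob(lambda d: d[1][1] == "spades", lambda d: d[0][1] == "hearts", two_cards)
p_e = prob(lambda d: d[0][1] == d[1][1], two_cards)

for label, value in zip("abcde", [p_a, p_b, p_c, p_d, p_e]):
    print(label, value)
```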