STA 2023 Practice Exam 1
1. When applying for financial aid, City U students and their families must report household income (as computed for tax purposes). Family incomes, in thousands of dollars, for a group of 34 incoming students are shown below.
22.1 24.5 25.0 29.3 31.2 39.8 40.0 41.0 44.2 45.6
46.7 48.8 49.1 50.2 50.4 51.3 54.1 57.5 59.5 62.1
64.0 64.0 68.3 68.9 70.1 74.4 75.4 80.0 81.5 86.9
98.8 110.3 129.5 191.2
(a) Make a histogram or stemplot of these data. If you choose a histogram, be sure to specify your classes. If you choose a stemplot, be sure to explain what your stems and leaves represent.
(b) Describe the overall shape of the distribution. Is it roughly symmetric, skewed to the right, or skewed to the left? Are there any outliers?
(c) Would the five-number summary or the mean and standard deviation give a better brief summary for this distribution? Explain your choice. Calculate the summary statistics that you choose.
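For checking hand calculations in part (c), here is a minimal Python sketch of a five-number summary. It assumes the common introductory convention in which the quartiles are the medians of the lower and upper halves of the sorted data; other software may use a slightly different quartile rule. The function name `five_number_summary` is illustrative and not part of the exam.

```python
from statistics import median

# Family incomes in thousands of dollars, copied from Problem 1.
incomes = [
    22.1, 24.5, 25.0, 29.3, 31.2, 39.8, 40.0, 41.0, 44.2, 45.6,
    46.7, 48.8, 49.1, 50.2, 50.4, 51.3, 54.1, 57.5, 59.5, 62.1,
    64.0, 64.0, 68.3, 68.9, 70.1, 74.4, 75.4, 80.0, 81.5, 86.9,
    98.8, 110.3, 129.5, 191.2,
]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves of the sorted data, leaving out the overall median when the
    number of observations is odd.
    """
    xs = sorted(values)
    n = len(xs)
    lower = xs[: n // 2]          # observations below the median position
    upper = xs[(n + 1) // 2 :]    # observations above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


summary = five_number_summary(incomes)
print(summary)
```

Comparing the distance from the median to each end of the summary is one quick way to see how the largest incomes stretch the upper tail.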
2. The table below summarizes the accept/reject decisions which City U has made for a sample of n = 3000 applicants, broken down by the type of high school attended.
|        | Public | Private | Parochial |
| Accept | 1254   | 336     | 180       |
| Reject | 1026   | 144     | 60        |
(a) What is the acceptance rate (as a %) among all City U applicants? ___________________________
(b) What proportion of City U applicants are not from a public high school? ___________________________
(c) Find the conditional distribution of acceptance and rejection within each of the high school types. (That is, find the acceptance and rejection rates for students who attended public high schools. Then do the same for private high schools and again for parochial schools.) Summarize the results in a table and with a bar chart.
(d) If there were no relationship between the type of school and the admissions decision, what count would you expect in the cell for students accepted from public high schools?
(e) In a sentence or two, summarize any relationship that you see in these data between the admission decision and the type of high school.
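The calculations in parts (c) and (d) can be sketched in Python as follows. This is only an illustration of the arithmetic: the dictionary layout and the helper name `conditional_distribution` are assumptions made for the example, and the expected count uses the usual rule (row total × column total ÷ grand total) for a two-way table with no association.

```python
# Counts copied from the table in Problem 2.
counts = {
    "Public":    {"Accept": 1254, "Reject": 1026},
    "Private":   {"Accept": 336,  "Reject": 144},
    "Parochial": {"Accept": 180,  "Reject": 60},
}


def conditional_distribution(cell_counts):
    """Convert the counts for one school type into proportions."""
    total = sum(cell_counts.values())
    return {decision: count / total for decision, count in cell_counts.items()}


# Part (c): acceptance and rejection rates within each school type.
conditional = {school: conditional_distribution(c) for school, c in counts.items()}

# Part (d): expected count under "no relationship".
grand_total = sum(sum(c.values()) for c in counts.values())
accept_total = sum(c["Accept"] for c in counts.values())
public_total = sum(counts["Public"].values())
expected_public_accept = public_total * accept_total / grand_total

for school, dist in conditional.items():
    print(school, {k: round(v, 3) for k, v in dist.items()})
print("Expected accepted from public schools:", expected_public_accept)
```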
3. City U has a special relationship with an inner city high school that encourages students to apply for admission. Below are the Verbal SAT scores from an SRS of 10 applicants from that school.
510 430 600 540 420 380 620 520 490 540
(a) Find the sample mean and standard deviation for these SAT scores.
(b) Find the interquartile range for these data. [Recall that the interquartile range is the difference between the third and first quartiles.]
(c) Use the 1.5*IQR criterion to decide if the minimum score of 380 is unusually low, given the other values in this distribution. Carefully justify your decision. [Recall that the 1.5*IQR criterion says that an observation is an outlier if it falls more than 1.5*IQR above the third quartile or below the first quartile.]
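A short Python sketch of the calculations in this problem, for checking work done by hand. It assumes the sample standard deviation (dividing by n - 1) and quartiles taken as the medians of the lower and upper halves of the sorted scores; the variable names are illustrative.

```python
from statistics import mean, median, stdev

# Verbal SAT scores copied from Problem 3.
scores = [510, 430, 600, 540, 420, 380, 620, 520, 490, 540]

# Part (a): sample mean and sample standard deviation (n - 1 denominator).
sample_mean = mean(scores)
sample_sd = stdev(scores)

# Part (b): quartiles as medians of the lower and upper halves (assumed convention).
xs = sorted(scores)
n = len(xs)
q1 = median(xs[: n // 2])
q3 = median(xs[(n + 1) // 2 :])
iqr = q3 - q1

# Part (c): the 1.5*IQR fences; a value outside them is flagged as an outlier.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
min_is_outlier = min(scores) < lower_fence

print(sample_mean, round(sample_sd, 2))
print(q1, q3, iqr, lower_fence, upper_fence, min_is_outlier)
```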
4. Suppose that all City U applicants are required to submit a high school grade average (on a 100 point scale). Past experience shows that these averages follow a normal distribution with a mean of 83.0 and a standard deviation of 6.0 points.
(a) What proportion of City U applicants should have a high school average below 80? Find the appropriate z-score and use a standard normal table.
(b) The admissions office would like to designate students in the top 10% of the high school grade distribution for a "fast track" admissions decision. How high would a student's high school average need to be in order to make it into this special decision group? Your work should include the relevant z-score and the relationship between the z-score and your answer.
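Table lookups for this problem can be cross-checked with Python's standard library. The sketch below uses `statistics.NormalDist` (available in Python 3.8 and later) in place of a printed standard normal table, so its values may differ from table answers in the last decimal place; the variable names are illustrative.

```python
from statistics import NormalDist

MEAN, SD = 83.0, 6.0
grades = NormalDist(mu=MEAN, sigma=SD)

# Part (a): standardize 80, then find the area to its left.
z_80 = (80 - MEAN) / SD
p_below_80 = grades.cdf(80)

# Part (b): the top 10% begins at the 90th percentile.
# Find the z-score with area 0.90 to its left, then unstandardize: x = mean + z * sd.
z_top10 = NormalDist().inv_cdf(0.90)
cutoff = MEAN + z_top10 * SD

print(z_80, round(p_below_80, 4))
print(round(z_top10, 2), round(cutoff, 2))
```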
5. (16 points) City U is noted for having a top-ranked water polo team. In order to attract the best quality players, the school is quite generous in awarding scholarships to students on the team to help defray the $18,000 tuition bill. Suppose that the boxplot below reflects the size of the scholarships awarded to the 15 current water poloists. All scholarships are in multiples of $1,000.

Determine whether each of the statements below is VALID (definitely true), INVALID (definitely false), or UNDETERMINED (could be true or false). Explain your reasoning in each case.
(a) __________________ At least 4 of the water polo players are on full scholarships.
(b) __________________ There is at least one player with a $12,000 scholarship.
(c) __________________ None of the 15 swimmers has a scholarship worth exactly $10,000.
(d) Circle the value below which is the most reasonable estimate for the sample mean of the water polo scholarships. Briefly explain your reasoning.
$9,000    $13,500    $16,000    $18,000
6. Trying to determine the best number of students to accept is a tricky admissions decision. City U officials must assume that some students will reject an offer from City U in order to attend another school. If too few students are accepted, they may end up with too small an incoming class, but accepting too many students may jeopardize City U's rating in college guidebooks. Here are several years' data on the number of students accepted and the number who later enrolled.
| Year | Accepted | Enrolled |
| 1996 | 2440     | 611      |
| 1997 | 2800     | 708      |
| 1998 | 2720     | 637      |
| 1999 | 2360     | 584      |
| 2000 | 2660     | 614      |
| 2001 | 2620     | 625      |
(a) Find the correlation between the number of students accepted and the number that enrolled.
(b) Which variable should be the explanatory variable, and which should be the response? Explain. Find the least squares regression line which best fits these 6 data points.
(c) Write a sentence that interprets what the value of the estimated slope of this regression line tells us about accepted and enrolled students. Be as specific as possible.
(d) If City U accepts 2500 students in 2002, how many would you expect to enroll?
(e) What is the residual for 1998? Write a sentence interpreting the value of the residual.
(f) Find the value of r^2 for this model and interpret it as a percentage. Be as specific as possible. Your statement should relate to City U admissions.
(g) Sketch a time plot of the accepted data and another of the enrolled data. Do your time plots reveal any strong trend in the number of students accepted or enrolled from year to year?
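The numerical parts of this problem can be checked with a few lines of Python. The sketch below computes the correlation and least squares line directly from the sums of squares rather than with a statistics package, and it assumes the number accepted is the explanatory variable and the number enrolled is the response. The names `predict`, `predicted_2002`, and `residual_1998` are illustrative.

```python
from math import sqrt

# Data copied from the table in Problem 6.
years = [1996, 1997, 1998, 1999, 2000, 2001]
accepted = [2440, 2800, 2720, 2360, 2660, 2620]   # explanatory variable x (assumed)
enrolled = [611, 708, 637, 584, 614, 625]         # response variable y (assumed)

n = len(accepted)
x_bar = sum(accepted) / n
y_bar = sum(enrolled) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - x_bar) ** 2 for x in accepted)
syy = sum((y - y_bar) ** 2 for y in enrolled)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(accepted, enrolled))

r = sxy / sqrt(sxx * syy)            # part (a): correlation
slope = sxy / sxx                    # part (b): least squares slope
intercept = y_bar - slope * x_bar    # the line passes through (x_bar, y_bar)


def predict(num_accepted):
    """Predicted enrollment from the fitted line."""
    return intercept + slope * num_accepted


predicted_2002 = predict(2500)                            # part (d)
i = years.index(1998)
residual_1998 = enrolled[i] - predict(accepted[i])        # part (e): observed - predicted
r_squared = r ** 2                                        # part (f)

print(round(r, 3), round(slope, 4), round(intercept, 2))
print(round(predicted_2002, 1), round(residual_1998, 1), round(r_squared, 3))
```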

7. The age distribution of students at City U is modeled by the distribution shown to the right.
(a) Approximate the median student age on the graph based on the distribution.
(b) Do you expect the mean student age to be higher or lower than the median? Explain briefly. Approximate the mean student age, based on the distribution.
9. Explain or define the following terms as they relate to linear regression:
(a) Influential observations
(b) Residual
10. Overweight parents tend to have overweight children. The results of a study of Mexican American girls aged 9 to 12 years are typical. The investigators measured body mass index (BMI), a measure of weight relative to height, for both the girls and their mothers. People with high BMI are typically overweight. The correlation between the BMI of daughters and the BMI of their mothers was r = 0.506. The results of this study are confounded. Explain what the confounding is and what you may or may not conclude from the study.
11. The table below shows numbers of flights on time and delayed for two airlines at five airports in one month.
|   | Alaska Airlines |         | America West |         |
|   | On Time         | Delayed | On Time      | Delayed |
|   | 497             | 62      | 694          | 117     |
|   | 221             | 12      | 4840         | 415     |
|   | 212             | 20      | 383          | 65      |
|   | 503             | 102     | 320          | 129     |
|   | 1841            | 305     | 201          | 61      |
(a) What proportion of all Alaska Airlines flights were delayed? What proportion of all America West flights were delayed?
(b) Find the percentage of delayed flights for Alaska Airlines at each of the five airports. You may record your percentages in the table, next to the number of delayed flights. Do the same for America West.
(c) What happens? What is the name of the phenomenon you observe? Explain why it occurs in this situation. (What’s the lurking variable?)
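The comparison asked for in parts (a) through (c) can be sketched in Python. The airport labels are not available here, so each airport is identified only by its row position in the table. The sketch also assumes that the first (on time, delayed) pair in each row belongs to Alaska Airlines and the second to America West, following the order in which part (a) names the airlines.

```python
# One entry per airport, in table order.
# ASSUMPTION: first pair = Alaska Airlines, second pair = America West.
# Each pair is (on time, delayed).
flights = [
    ((497, 62), (694, 117)),
    ((221, 12), (4840, 415)),
    ((212, 20), (383, 65)),
    ((503, 102), (320, 129)),
    ((1841, 305), (201, 61)),
]


def delay_rate(on_time, delayed):
    """Proportion of flights that were delayed."""
    return delayed / (on_time + delayed)


def overall_rate(pairs):
    """Delay rate after pooling all airports together."""
    on_time = sum(p[0] for p in pairs)
    delayed = sum(p[1] for p in pairs)
    return delay_rate(on_time, delayed)


# Part (a): overall delay rates.
alaska_overall = overall_rate([a for a, _ in flights])
west_overall = overall_rate([w for _, w in flights])

# Part (b): delay rates airport by airport.
alaska_by_airport = [delay_rate(*a) for a, _ in flights]
west_by_airport = [delay_rate(*w) for _, w in flights]

print("Overall:", round(alaska_overall, 3), round(west_overall, 3))
for row, (a, w) in enumerate(zip(alaska_by_airport, west_by_airport), start=1):
    print("Row", row, round(a, 3), round(w, 3))
```

Printing the number of flights each airline has in each row alongside these rates can help with the explanation in part (c).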
12. In Professor Friedman’s economics course, the correlation between the students’ total scores prior to the final exam and their final exam scores is r = 0.6. The pre-exam totals for all students in the course have mean 280 and standard deviation 30. The final exam scores have mean 75 and standard deviation 8.
(a) Professor Friedman grades on a curve so that he expects to assign A’s to approximately 15% of his students, B’s to approximately 35%, C’s to approximately 40%, and D’s or F’s to the remaining 10%. Assuming the distribution of pre-final totals is approximately normal, what is the minimum pre-final total a student would need in order to be earning an A before the final exam? A B? A C?
(b) Find the least squares regression line of final exam scores on pre-final total scores for this course.
(c) Explain the meaning of the vertical intercept of your LSR line in the context of Professor Friedman’s class. Is your interpretation reasonable? Why or why not?
(d) Julie’s total before the exam was 300. What does LSR predict for her score on the final exam?
(e) Should we have great confidence in our ability to predict Julie’s final exam score accurately? Explain your answer and justify it statistically.
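Parts (b), (d), and (e) depend only on the summary statistics given in the problem, so they can be checked with a short Python sketch. It uses the standard relationships for a least squares line, slope = r * s_y / s_x and intercept = y_bar - slope * x_bar, with the pre-final total as x and the final exam score as y. The function name `predict_final` is illustrative.

```python
# Summary statistics copied from Problem 12.
r = 0.6
mean_pre, sd_pre = 280.0, 30.0       # pre-final totals (x)
mean_final, sd_final = 75.0, 8.0     # final exam scores (y)

# Part (b): least squares line of final exam score on pre-final total.
slope = r * sd_final / sd_pre
intercept = mean_final - slope * mean_pre


def predict_final(pre_total):
    """Predicted final exam score for a given pre-final total."""
    return intercept + slope * pre_total


# Part (d): prediction for a pre-final total of 300.
julie_prediction = predict_final(300)

# Part (e): fraction of the variation in final exam scores
# accounted for by the linear relationship.
r_squared = r ** 2

print(round(slope, 2), round(intercept, 1))
print(round(julie_prediction, 1), round(r_squared, 2))
```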