
1.
If 2% of the fuses produced are defective, the probability that in a randomly selected sample of six there are two defectives is:
a. (6C2)((.02)**2)((.98)**4) b. ((.02)**4)((.98)**2) c. (6C4)((.02)**4)((.98)**2) d. ((.98)**4)((.02)**2) e. none of the above
Answer: a. (6C2)((.02)**2)((.98)**4)
Use the binomial probability formula.
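The binomial formula in answer (a) can be evaluated numerically. This is a minimal Python sketch; the helper name `binomial_pmf` is an illustrative choice and does not come from the source.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Question 1: exactly two defectives in a sample of six when 2% are defective.
p_two_defective = binomial_pmf(2, 6, 0.02)
print(round(p_two_defective, 5))
```

The result is roughly half a percent, which is plausible for such a low defect rate.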
2.
A company engaged in recruiting wishes to develop a questionnaire that spreads out applicants (shows greater variation in scores than does its standard form). A new, longer form is developed and tested by having 16 randomly selected applicants from the current applicant pool fill it out. (Another group of 16 randomly and independently selected applicants fills out the standard form.)
Results obtained were:
Variance, new form: S**2(1) = 40.2
Variance, standard form: S**2(2) = 12.3
Regarding scores as normally distributed:
a. Test at the 5% level the claim that the variance of the new form is greater than that of the old form.
b. Sketch the relevant F distribution and indicate the rejection region.
Answer: a. F(calculated) = 40.2/12.3 = 3.268 F(critical) = 2.4 at the 95% confidence level with df=15,15.
Since F(calculated) is greater than F(critical), we will reject the null hypothesis that the new variance is less than or equal to the old variance.
b. If available, consult file of graphs and diagrams that could not be computerized for accompanying diagram.
3.
It is desired to see if there is a relationship in tastes for an expensive car and owning a trimaran. A survey of 200 upper upper class potential purchasers of cars and trimarans gave these responses:
                       want            do not want
                       expensive car   expensive car   totals
want trimaran          100             40              140
don't want trimaran    20              40              60
totals                 120             80              200
Specifically, it is desired to test H(0): the desire for a trimaran is independent of a desire for an expensive car vs. H(1): there is a relationship, at level ALPHA = 10%.
The correct table value (cutoff point) to use for this problem is closest to:
a) 2.71 b) 4.61 c) 6.63 d) 9.21 e) 6.25
Answer: a) 2.71 CHISQUARE(critical, df = 1, ALPHA = .10) = 2.70554
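The cutoff can be reproduced without a chi-square table, because a chi-square variable with df = 1 is the square of a standard normal variable. The sketch below uses that fact and also computes the Pearson statistic for the table as an illustration; the question itself asks only for the cutoff, and all variable names are illustrative.

```python
from statistics import NormalDist

# Observed counts from the 2 x 2 table (rows: want / don't want trimaran).
observed = [[100, 40], [20, 40]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson chi-square statistic: sum of (observed - expected)**2 / expected.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

# With df = 1 the upper 10% point of chi-square equals the square of the
# upper 5% point of the standard normal distribution.
critical = NormalDist().inv_cdf(1 - 0.10 / 2) ** 2

print(round(critical, 3), round(chi_square, 2))
```

Under these counts the statistic is far above the cutoff, so the hypothesis of independence would be rejected at the 10% level.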
4.
The thickness of the individual cards produced by a certain playing card manufacturer is normally distributed with mean = 0.01 inches and variance = 0.000052. What is the probability that a deck of 52 cards is more than 0.65 inches in thickness?
A. .001 B. .006 C. .023 D. .036 E. .067 F. .087 G. .159 H. .184
Answer: B. .006
MU (deck) = 52 * .01 = .52 Var(deck) = 52 * .000052 = .002704
Z = (.65 - .52)/SQRT(.002704) = .13/.052 = 2.5
P(Z > 2.5) = .0062
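The same calculation can be sketched in Python with the standard library's `NormalDist`. It assumes, as the worked answer does, that the 52 card thicknesses are independent, so that means and variances add.

```python
from math import sqrt
from statistics import NormalDist

n_cards = 52
card_mean = 0.01        # inches
card_var = 0.000052     # square inches

# Sum of independent normals: means add and variances add.
deck = NormalDist(mu=n_cards * card_mean, sigma=sqrt(n_cards * card_var))

# Probability that the deck is thicker than 0.65 inches.
p_thick = 1 - deck.cdf(0.65)
print(round(p_thick, 4))
```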
5.
A particular type of bolt is produced having diameters with mean 0.500 inches and standard deviation 0.005 inches. Nuts are also produced having inside diameters with mean 0.505 inches and standard deviation 0.005 inches. If a nut and a bolt are chosen at random, what is the probability that the bolt will fit inside the nut?
Answer: Let D = (inside diameter of nut) - (diameter of bolt). Mean for the distribution of differences = .505 - .500 = .005 Standard deviation = SQRT((.005)**2 + (.005)**2) = .007071
Z = (value of interest - mean of distribution of differences) / (standard deviation of the distribution of differences)
Z = (0 - .005)/.007071 = -.71
The bolt fits when D > 0, so we want all the area to the right of Z = -.71:
.5 + .2611 = .7611, or about 76%.
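A short Python sketch of the difference-of-two-diameters calculation follows. It assumes the two diameters are independent and normally distributed, which the question implies but does not state outright; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

bolt_mean, bolt_sd = 0.500, 0.005
nut_mean, nut_sd = 0.505, 0.005

# D = nut diameter - bolt diameter; variances of independent variables add.
clearance = NormalDist(mu=nut_mean - bolt_mean,
                       sigma=sqrt(nut_sd**2 + bolt_sd**2))

# The bolt fits when the clearance is positive.
p_fit = 1 - clearance.cdf(0.0)
print(round(p_fit, 4))
```

The computed value is about .760; the table-based answer differs in the third decimal only because Z is rounded to two places before the table lookup.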
6.
It is known that the lengths of a particular manufactured item are normally distributed with a mean of 6 and a standard deviation of 3. If one item is selected at random, what is the probability that it will fall between 5.7 and 7.5?
Answer: P(5.7 < Y < 7.5) = P((5.7 - 6)/3 < Z < (7.5 - 6)/3) = P(-.1 < Z < .5) = .0398 + .1915 = .2313
7.
Suppose the length of life of certain kinds of batteries is normally distributed with MU = 36 months, SIGMA = 4 months. The company guarantees the battery to last 30 months. What proportion of the batteries will they have to make an adjustment on?
Answer: P(X < 30) = P(Z < (30 - 36)/4) = P(Z < -1.5) = .0668 or 6.68%
8.
It is found that the gyre swivit, manufactured on any given day at Gozornenplatz, Inc.'s Swat City plant, has the following characteristics with respect to length:
Normally distributed with MU = 3.5" and SIGMA = .2".
Draw a picture for each of the following, and show your work.
a. What percent of a day's output lies within two standard deviations of the mean?
b. Of a day's output of 2500 gyre swivits, how many will measure less than 3.966"?
c. 95% of a day's output, centered around the average, will measure between _______ and _______.
d. What percent of a day's output lies between 1 and 2.58 standard deviations above the mean?
Answer: If available, consult file of graphs and diagrams that could not be computerized.
a. 95.4%
P(3.1 <= X <= 3.9) = P(-2 <= Z <= 2) = 2(P(0 <= Z <= 2)) = 2 * (.4772) = .9544
b. 2475
Z = (X - MU)/SIGMA = (3.966 - 3.5)/.2 = (.466/.2) = 2.33
P(Z < 2.33) = .99 99% of 2500 = (.99)(2500) = 2475
c. 3.108, 3.892
P(0 <= Z <= ?) = .475 Therefore ? = 1.96
Lower Boundary = MU - (Z)(SIGMA) = 3.5 - (1.96)(.2) = 3.108
Higher Boundary = MU + (Z)(SIGMA) = 3.5 + (1.96)(.2) = 3.892
d. 15.4%
P(1 <= Z <= 2.58) = P(0 <= Z <= 2.58) - P(0 <= Z <= 1) = .4951 - .3413 = .1538 = 15.4%
9.
The U.S. Department of Commerce has just completed a sample survey of weekly food expenditures. A simple random sample of 100 families was taken. The average weekly food expenditure was $70.00 per week, with a standard deviation of $8.00. You may assume expenditures in the population to be normally distributed.
a. What proportion of the families spent $85.00 or more per week on food? Be sure to diagram your problem solution!
b. Using the information above, find the expenditure value above which 80% of the families lie.
Answer: If available, consult file of graphs and diagrams that could not be computerized.
a) Z = (X - MU)/SIGMA = (85 - 70)/8 = 1.875
Area beyond this Z value is .0301, so 3.01% of the families spent 85 dollars or more per week.
b) A cumulative Z value such that 80% lies above it or 20% lies below it is -.84.
Z = (X - MU)/SIGMA -.84 = (X - 70)/8 X = 63.28
Therefore, 80% of the families lie above the expenditure value of $63.28/week.
10.
Suppose a floor manager of a large department store is studying buying habits of their customers.
a) If he is willing to assume that monthly income of these customers is distributed normally, what proportion of the income should he expect to fall in the interval determined by MU +/- 1.2(SIGMA)?
b) What proportion of the income should he expect to be greater than MU + SIGMA?
c) Still assuming normality, what is the probability that a customer selected at random will have an income exceeding the population mean by 3*SIGMA?
Answer: a) P(MU <= X <= MU + 1.2SIGMA) = .3849 P(X is in interval MU +/- 1.2SIGMA) = 2(.3849) = .7698
b) P(X > (MU + SIGMA)) = (.5 - .3413) = .1587
c) P(X > MU + (3*SIGMA)) = (.5 - .4987) = .0013
11.
Suppose a floor manager of a large department store is studying the buying habits of the store's customers.
a) If he is willing to assume that monthly income of these customers is distributed normally and SIGMA = $500, find the proportion of customers exceeding the population mean by $375.
b) Find the proportion of customers within $125 of the population mean.
Answer: a) Z = 375/500 = .75 P(Z > .75) = (0.5 - .2734) = .2266
b) Z = 125/500 = .25 P(-.25 <= Z <= .25) = 2(.0987) = .1974
12.
A floor manager of a large department store is studying the buying habits of the store's customers. Suppose the manager has someone tell him that monthly income of these customers is distributed normally with a population mean of $600 and standard deviation of $500.
a) What proportion of the customers should he expect to have incomes less than $600?
b) What proportion should he expect to have incomes less than $725?
Answer: a) .5
b) Z = (725 - 600)/500 = .25 P(Z < .25) = .5 + .0987 = .5987
13.
A company manufactures rope. From a large number of tests over a long period of time, they have found a mean breaking strength of 300 lbs. and a standard deviation of 24 lbs. Assume that these values are MU and SIGMA.
It is believed that by a newly developed process, the mean breaking strength can be increased.
(a) Design a decision rule for rejecting the old process with an ALPHA error of 0.01 if it is agreed to test 64 ropes.
(b) Under the decision rule adopted in (a), what is the probability of accepting the old process when in fact the new process has increased the mean breaking strength to 310 lbs.? Assume SIGMA is still 24 lbs. Use a diagram to illustrate what you have done, i.e., draw the reference distributions.
Answer: a. One tail test at ALPHA = .01, therefore Z = 2.33.
Z = (YBAR - MU)/(SIGMA/SQRT(n)) 2.33 = (YBAR - 300)/(24/SQRT(64)) YBAR = 307
Decision Rule: If the mean strength of 64 ropes tested is 307 lbs. or more, we reject the hypothesis of no improvement, i.e., we conclude that the new process is better.
b. If available, consult file of graphs and diagrams that could not be computerized for reference distributions.
Z = (307 - 310)/(24/SQRT(64)) = -1.00 Area = 0.1587 or 15.87%
P(type II error) = 0.1587
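The decision rule and the Type II error probability can be sketched in Python as follows. The sketch keeps full precision rather than rounding the cutoff to 307 lbs., so its numbers differ slightly from the table-based answer; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_old, mu_new = 300.0, 310.0   # lbs.
sigma, n, alpha = 24.0, 64, 0.01

standard_error = sigma / sqrt(n)   # 24 / 8 = 3

# One-tailed rule: reject "no improvement" when the sample mean exceeds cutoff.
cutoff = NormalDist(mu_old, standard_error).inv_cdf(1 - alpha)

# Type II error: the sample mean falls below the cutoff although the true
# mean is 310 lbs.
beta = NormalDist(mu_new, standard_error).cdf(cutoff)

print(round(cutoff, 2), round(beta, 4))
```

The cutoff comes out just under 307 lbs. and the Type II error probability close to .157, in line with the rounded figures above.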
14.
Suppose X is the price that a certain stock will be exactly 6 months from today. Assume that X is normally distributed with a mean of $30 and a standard deviation of $5.
a. Find the probability that X will be at least $30. b. Find the probability that X will be greater than $40. c. Find the probability that X will be between $20 and $35. d. How many standard deviations is $38 from the mean? e. If you paid $29 for the stock today, what is the probability that you will make a profit if you sell the stock exactly 6 months from today?
Answer: a. P(X >= 30) = P(Z >= 0) = 1/2, where Z = (30 - 30)/5
b. Z = (40 - 30)/5 = 2; P(X > 40) = .5 - .4772 = .0228
c. Z(1) = (20 - 30)/5 = -2 Z(2) = (35 - 30)/5 = 1
Prob(20 < X < 35) = .4772 + .3413 = .8185
d. 8/5 = 1.6 SD's
e. Prob(X > 29) = .5 + .0793 = .5793, where Z = (29 - 30)/5 = -.2
15.
Distribution of the I.Q.'s of 4,500 employees of a company is roughly normal with mean 104 and standard deviation 15. Find the number of employees whose I.Q. is:
a. greater than or equal to 110 b. between 95 and 110.
Answer: a. Z = (110 - 104)/(15) = .4 NO. = (4500)(.5 - .1554) = 1550
b. Z = (95 - 104)/(15) = -.6 NO. = (4500)(.1554) + (4500)(.2258) = 1715
16.
A certain kind of automobile battery is known to have a length of life which is normally distributed with a mean of 1200 days and standard deviation 100 days. How long should the guarantee be if the manufacturer wants to replace only 10% of the batteries which are sold?
Answer: Z = -1.28 for 10 percent failure
-1.28 = (X - 1200)/100
X = 1072 days for guarantee
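Finding the guarantee length is an inverse-normal lookup: the value below which 10% of battery lifetimes fall. A minimal Python sketch, with illustrative variable names:

```python
from statistics import NormalDist

lifetime = NormalDist(mu=1200, sigma=100)   # days

# Guarantee length such that only 10% of batteries fail before it.
guarantee_days = lifetime.inv_cdf(0.10)
print(round(guarantee_days))
```

This agrees with the table-based figure of 1072 days after rounding.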
17.
A floor manager of a large department store is studying the buying habits of the store's customers. Suppose he assumes that the monthly income of these customers is normally distributed with a standard deviation of 500. If he were to draw a random sample of N = 100 customers and determine their income:
a) What is the probability that the sample mean of incomes will differ from the population mean by more than $25?
b) What is the probability that the sample mean is larger than the population mean?
c) Could you provide a reasonable answer to (a) and (b) if the population of incomes were not normal? Explain.
Answer: a) SIGMA(XBAR) = SIGMA/SQRT(n) = 500/SQRT(100) = 50
Z = (XBAR - MU)/SIGMA(XBAR) Z = 25/50 = .5 P(Z < -.5 or Z > +.5) = 2(.5 - .1915) = .6170
b) .5
c) Yes, the central limit theorem assures us that the distribution of means for n = 100 is symmetrical and approximately normal.
18.
Suppose that you work for a brewery as a clerk to receive barley shipments. As part of your job you are to decide whether to keep or return new shipments of barley. The criterion used for making your decision is an estimate of the moisture content of the shipment. If the moisture level is too high (above 17.5%) the shipment has a good possibility of rotting before use and, therefore, of causing a loss of money to the company. You know from past experience that the variance for all barley shipments is 36 and that your staff can process at the most one sample of 9 moisture readings per shipment.
a. Propose a rule for accepting and rejecting grain shipments on the basis of sample means where the null claim is a shipment has a mean moisture content of 17.5% or less (H(0): MU <= 17.5%). Let the probability of Type I error be .10.
b. When will you make incorrect decisions about a grain shipment having MU = 17.4? What will be the probability of such an error?
c. When will you make incorrect decisions about a grain shipment having MU = 19? What will be the probability of such errors?
d. When will you make incorrect decisions about a grain shipment having MU = 21? What will be the probability of such errors?
Answer: SIGMA**2 = 36 Take a sample, n = 9 SIGMA(XBAR) = SIGMA/SQRT(n) = 6/3 = 2
a. H(0): MU <= 17.5 H(1): MU > 17.5
ALPHA = .10 implies Z = 1.28 Z = (XBAR - MU)/SIGMA(XBAR)
1.28 = (XBAR - 17.5)/2 2.56 = XBAR - 17.5
XBAR = 20.06
Reject H(0) when XBAR > 20.06.
b. I am rejecting H(0) when XBAR > 20.06, so when MU is REALLY 17.4, I make incorrect decisions whenever XBAR > 20.06.
Z = (20.06 - 17.4)/2 Z = 1.33 Area beyond Z = 1.33 is .0918.
The probability of an incorrect decision is .0918.
c. I am rejecting H(0) when XBAR > 20.06, so when MU is REALLY 19, I make incorrect decisions whenever XBAR <= 20.06.
Z = (20.06 - 19)/2 Z = .53 Area between mean and Z = .2019.
The probability of an incorrect decision is .5 + .2019 = .7019.
d. I am rejecting H(0) when XBAR > 20.06, so when MU is REALLY 21, I make incorrect decisions whenever XBAR <= 20.06.
Z = (20.06 - 21)/2 Z = -.47 Area beyond Z = -.47 is .3192.
The probability of an incorrect decision is .3192.
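The rejection rule and the three error probabilities can be sketched in Python. The helper `p_reject` and the dictionary `error_probability` are illustrative names; the sketch keeps full precision, so results differ from the table-based answers in the third decimal.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, alpha, mu_null = 6.0, 9, 0.10, 17.5
standard_error = sigma / sqrt(n)   # 6 / 3 = 2

# Reject H(0) when the sample mean exceeds this cutoff.
cutoff = NormalDist(mu_null, standard_error).inv_cdf(1 - alpha)


def p_reject(true_mean):
    """Probability that the sample mean exceeds the cutoff."""
    return 1 - NormalDist(true_mean, standard_error).cdf(cutoff)


error_probability = {
    17.4: p_reject(17.4),       # H(0) true, so rejecting is the error
    19.0: 1 - p_reject(19.0),   # H(0) false, so accepting is the error
    21.0: 1 - p_reject(21.0),   # H(0) false, so accepting is the error
}
for mean, prob in error_probability.items():
    print(mean, round(prob, 4))
```

With only nine readings per shipment, a shipment whose true mean is 19% is still accepted about 70% of the time.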
19.
If the number of complaints which a laundry receives per day is a random variable having a Poisson distribution with LAMBDA = 4, find the probabilities that on a given day the laundry will receive:
a. no complaints, b. exactly 2 complaints.
Answer: a. P(x=0) = (4**0)(e**(-4))/0! = .018 b. P(x=2) = (4**2)(e**(-4))/2! = .147
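The two Poisson probabilities can be checked with a few lines of Python. The helper name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Return P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


p_no_complaints = poisson_pmf(0, 4)
p_two_complaints = poisson_pmf(2, 4)
print(round(p_no_complaints, 3), round(p_two_complaints, 3))
```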
20.
As a quality control inspector you have observed that wooden wheels which are bored off-center occur about three percent of the time. If six of these wheels are to be used on each toy truck produced by Acme Toy Company, the probability that a given truck has no wheels off-center would be obtained by using which distribution?
(a) Normal (b) Poisson (c) Hypergeometric (d) Binomial
21.
Suppose that 2% of the fuses produced by a machine are defective. If we take a sample of 6 from the machine's output, the probability that the first four fuses are good and the last two defective is:
a. (6C4)((.98)**4)((.02)**2) b. ((.02)**4)((.98)**2) c. (6C4)((.02)**4)((.98)**2) d. ((.98)**4)((.02)**2) e. none of these
Answer: d. ((.98)**4)((.02)**2)
The combination (6C4) is not necessary because the order GGGGDD is distinct.
22.
An accounting firm processed 1000 balance sheets for its client last year. If 20% of these are known to contain errors, what is the probability of finding at least one error in a sample of 4 balance sheets chosen at random with replacement?
a. .4906 b. .0016 c. .5904 d. .9984 e. none of these
Answer: c. Let x = # balance sheets containing errors P(x>=1) = 1 - P(x=0); P(x=0) = b(0;4,.20) = (4C0)(.20**0)(.80**4) = .4096 P(x>=1) = 1 - .4096 = .5904
23.
We have a manufacturing process which produces good items with probability .9. We select a sample of 15 items. Assume a binomial experiment.
What is the probability that there is at least one good item in the sample?
a) 15C1(.9)**1 b) 15C0(.9)**0(.1)**15 c) 1 - (15C0(.9)**0(.1)**15) d) 1 - (15C1(.9)**1(.1)**14) e) none of the above
Answer: c) 1 - (15C0(.9)**0(.1)**15)
24.
The following triangle test is sometimes used to identify taste experts. In the case of wine tasting, a test subject is presented with three glasses of wine, two of one kind and a third glass of another wine. The test subject is asked to identify the single glass of wine. A test subject who merely guesses has a 1 chance in 3 of identifying the single glass correctly. An expert wine taster should be able to do much better. Let K stand for the number of correct identifications made by a test subject in 10 independent triangle tests.
A test subject makes at least 5 correct identifications (k >= 5). The descriptive level associated with this result is:
a. .076 b. .137 c. .213 d. .057
Answer: c. .213
Descriptive level = P(5 or more correct identifications) = P(5) + P(6) + P(7) + P(8) + P(9) + P(10) = .1366+.0569+.0162+.0030+.0003+.0000 = .213
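The descriptive level is an upper-tail binomial sum, which can be sketched in Python. The helper name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Return P(K = k) for K ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_tests, p_guess = 10, 1 / 3

# P(K >= 5) for a subject who is merely guessing.
descriptive_level = sum(binomial_pmf(k, n_tests, p_guess)
                        for k in range(5, n_tests + 1))
print(round(descriptive_level, 3))
```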
25.
In a dispute over the proportion of defects in a large shipment a buyer claims there are 20% defective while the seller claims only 10%. To settle the dispute it is decided to take a sample of size 100 from the shipment and if there are less than 15 defectives found to rule in favor of the seller. (Note: the shipment is so large that sampling can be considered to be with replacement.)
a. What is the probability of ruling in favor of the seller if he is correct? b. What is the probability of ruling in favor of the seller if the buyer is correct?
Answer: a. If the seller is correct:
Using normal approximation: p = .1, np = 10, SIGMA = SQRT(npq) = 3 Prob(Z < (14.5 - 10)/3) = Prob(Z < 1.5) = .9332
Using binomial with n = 100 and p = .1: P(X < 15) = .9274
b. If the buyer is correct:
Using normal approximation: p = .2, np = 20, SIGMA = 4 Prob(Z < (14.5 - 20)/4) = Prob(Z < -1.375) = .0845
Using binomial with n = 100 and p = .2: P(X < 15) = .0804
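The exact binomial probabilities and their normal approximations (with the .5 continuity correction used above) can be compared side by side. This is a sketch with illustrative helper names `binomial_cdf` and `normal_approx_cdf`.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k) with continuity correction."""
    return NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(k + 0.5)


n = 100
results = {}
for p in (0.1, 0.2):   # seller's claim, buyer's claim
    # "Fewer than 15 defectives" means X <= 14.
    results[p] = (binomial_cdf(14, n, p), normal_approx_cdf(14, n, p))
    print(p, round(results[p][0], 4), round(results[p][1], 4))
```

In both cases the approximation is within about half a percentage point of the exact value.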
26.
Suppose that it is known that one out of ten undergraduate college textbooks is an outstanding financial success. A publisher has selected four new text books for publication. What is the probability that:
a. Exactly one will be an outstanding financial success?
b. at least one?
c. at least two?
Answer: a. P = (4C1)(.1**1)(.9**3) = .2916
b. P = 1 - (4C0)(.9**4) = .3439
c. P = 1 - .6561 - .2916 = .0523
27.
The Liddalol Airline Company runs an airline from New York to Boston. Its planes carry a maximum of 90 passengers. Knowing that not all persons who reserve seats will actually use them, they accept 100 reservations for each flight. The company has determined that 80% of the persons who make reservations actually use them. Assuming that 100 reservations are made for a particular flight, find the probability that some passengers will not get seats. What two assumptions do you have to make?
Answer: p = .8 1  p = .2 n = 100
Using normal approximation to binomial:
XBAR = np = 100*.8 = 80 S**2 = npq = 100*.8*.2 = 16 S = 4
P(X > 90) = P(Z > (90  80)/4) = P(Z > 2.5) = .0062
The two assumptions are:
1. That the decisions for individuals to use their reservations are independent
2. That the probability that a person who makes a reservation actually uses it remains constant from person to person.
28.
The Noglow automatic cigarette lighter is claimed to light 80% of the time when the button is pushed. If this is true, and if the lighter is tried 25 times:
a. What is the probability of getting exactly 20 lights?
b. What is the probability of getting fewer than 17 lights?
c. What is the probability of getting no lights on the first 4 trials?
Answer: a. P(X = 20) = (25C20)(.8**20)(.2**5) = .196
b. P(X < 17) = .046 (use table)
c. (.2**4) = .0016
29.
A floor manager of a large department store is studying habits of their customers. One aspect of this research pertains to residence location of customers.
a) If 1/2 of the customers live outside the city, what is the probability that 4 customers selected at random will all live inside the city?
b) Continuing to suppose that 1/2 live outside, what is the probability that 3 or fewer in a random sample of size n = 10 will live outside?
c) If 1/2 live outside, the probability is 0.10 that the random sample of size n = 100 will contain ____ or fewer persons living outside.
Answer: a) (1/2)**4 = 1/16
b) Let X = number chosen who live outside P(X<=3) = b(0; 10, .5) + b(1; 10, .5) + b(2; 10, .5) + b(3; 10, .5) = .001 + .0098 + .0439 + .1173 = 0.172
c) P(X<=?) = .10 = b(?; 100, 1/2) Using the fact that for large n and p and q not too close to zero, the binomial distribution can be closely approximated by a normal distribution where Z = (X  np)/SQRT(npq)
Therefore X = (Z*SQRT(npq)) + np = (-1.28*SQRT(100*.5*.5)) + (100*.5) = (-1.28 * 5) + 50 = -6.4 + 50 = 43.6 X == 44
30.
A salesman has found that, on the average, the probability of a sale on a single contact is .3. If the salesman contacts 50 customers, what is the probability that at least 10 will buy? Write an exact expression for the probability and then obtain an approximate numerical value using the normal approximation.
Answer: Exact Expression:
p = .3, q = 1  p = .7, n = 50
p(X >= 10) = SUM(X = 10, 50)((50CX)(.3**X)(.7**[50 - X])) = .9598
Using normal approximation:
XBAR = n*p = 50*.3 = 15 SIGMA = SQRT(npq) = SQRT(50*.3*.7) = 3.24
Z = (10 - 15)/3.24 = -1.54 P(Z >= -1.54) = .4382 + .5000 = .9382
If you use the correction factor, Z = (9.5 - 15)/3.24 = -1.70 and P(Z >= -1.70) = .9554.
31.
A machine produces bolts in a length (in inches) found to obey a normal probability law with mean MU = 5 and standard deviation SIGMA = 0.1. The specifications for a bolt call for items with a length (in inches) equal to 5 +/- 0.15. A bolt not meeting these specifications is called defective.
a. What is the probability that a bolt produced by this machine will be defective?
b. If a sample of 10 bolts is chosen at random what is the probability that there will be at least two defective bolts?
Answer: X = length of bolt
a. P(defective) = 1 - P(4.85 < X < 5.15) = 1 - P((4.85 - 5)/.1 < (X - 5)/.1 < (5.15 - 5)/.1) = 1 - P(-1.5 < Z < 1.5) = 1 - .8664 = .1336
b. Y = number of defectives in a sample of size 10
Then Y has a binomial distribution with parameters n = 10, P = .1336.
Hence, P(at least 2 are defective) = P(Y >= 2) = 1 - P(Y = 0) - P(Y = 1) = 1 - (10C0)*(P**0)*((1 - P)**10) - (10C1)*P*((1 - P)**9) = 1 - (.8664**10) - 10*(.1336)*(.8664**9) = .394
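This two-stage calculation (a normal probability feeding a binomial one) can be sketched in Python. The variable names are illustrative.

```python
from statistics import NormalDist

length = NormalDist(mu=5.0, sigma=0.1)   # inches

# Stage 1: a bolt is defective when its length falls outside 5 +/- 0.15.
p_defective = 1 - (length.cdf(5.15) - length.cdf(4.85))

# Stage 2: number of defectives in a sample of 10 is Binomial(10, p_defective).
n = 10
p_none = (1 - p_defective) ** n
p_one = n * p_defective * (1 - p_defective) ** (n - 1)
p_at_least_two = 1 - p_none - p_one

print(round(p_defective, 4), round(p_at_least_two, 3))
```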
32.
A factory finds that, on the average, 20% of the bolts produced by a given machine are defective. If 10 bolts are selected at random from the day's production, find the probability that:
a) exactly 2 will be defective. b) 2 or more will be defective.
Answer: a) P(X = 2) = (nCX)(p**X)(q**(n - X)) = (10C2)(.2**2)(.8**(10 - 2)) = .3020
b) P(X >= 2) = 1 - P(X <= 1) = 1 - (P(1) + P(0)) = 1 - (((10C1)(.2**1)(.8**(10 - 1))) + ((10C0)(.2**0)(.8**(10 - 0)))) = 1 - (.2684 + .1074) = .6242
33.
Seventy five percent of the Ford autos made in 1976 are falling apart. Determine the probability distribution of the number of Fords in a sample of 4 that are falling apart. Draw a histogram of the distribution. What is the mean and variance of the distribution?
Answer: Let X = the number of Fords falling apart in a sample of four.
probability distribution: (binomial distribution with n=4 and p=.75)
X | p(X)
--+---------------------------------
0 | 0.0039 = (4C0)(.75**0)(.25**4)
1 | 0.0469 = (4C1)(.75**1)(.25**3)
2 | 0.2109 = (4C2)(.75**2)(.25**2)
3 | 0.4219 = (4C3)(.75**3)(.25**1)
4 | 0.3164 = (4C4)(.75**4)(.25**0)
[Histogram: P(X) on the vertical axis (0 to 0.6) against X = 0, 1, 2, 3, 4 on the horizontal axis; bar heights are the p(X) values in the table.]
mean = np = 4*.75 = 3 variance = npq = 4*.75*.25 = .75
34.
The probability that a particular kind of machine, used in production, breaks down during a one week period is 0.2. If a company has 10 of these machines, what is the probability of having:
a. at least two breakdowns during a given week?
b. 3, 4, or 5 breakdowns during a given week?
c. How many breakdowns should the company expect to have over a one month (4 weeks) time period?
d. Now suppose that during a particular week six machines break down. Do you have reason to believe that the breakdown rate may have increased above the 0.20 rate? State reasoning.
e. What assumptions are necessary in making the above calculations of probability?
Answer: x = # of breakdowns during a given week; x has a binomial distribution with n=10, p=.2
a. P(x>=2) = 1 - [b(0;10,.2) + b(1;10,.2)] = 1 - [.376] = .624
b. P(3 <= x <= 5) = b(3;10,.2) + b(4;10,.2) + b(5;10,.2) = .2013 + .0881 + .0264 = .3158
c. E[x] = np = 10(.2) = 2 for 1 week; for 4 weeks = 4(2) = 8
d. The probability of 6 or more breakdowns given p = .2, n = 10 is .0064. Since the probability that this event occurred by chance is so small, there appears to be some indication that the breakdown rate has increased above .2.
e. 1) The breakdown of one machine is not related to the condition of any of the other machines; in other words, machine breakdowns are independent. 2) The probability of a breakdown is the same for each of the machines.
35.
A large TV retailer in San Francisco claims that 80 percent of all service calls on color television sets are concerned with the small receiving tube. Test this claim against the alternative PI =/= 0.80 at ALPHA = 0.05 if a random sample of 222 calls on color television sets included 167 which were concerned with the small receiving tube.
Answer: Continue the null hypothesis that the proportion is .80.
Two methods of solution: 1. Using the binomial distribution:
H(0): PI = .80 H(A): PI =/= .80
Z = ((167/222) - .80)/SQRT(PI*Q/n) = (.752 - .80)/SQRT(.00072) = -1.779
2. Using the normal approximation to the binomial:
Mean = n*PI = 222 * .8 = 177.6 Variance = n*PI*Q = 222*.8*.2 = 35.52
H(0): MU = 177.6 H(A): MU =/= 177.6
Z = (167 - 177.6)/SQRT(35.52) = -10.6/5.9599 = -1.779
The critical Z value for both of these cases is: Z(ALPHA/2 = .025) = +/- 1.96
Since both calculated Z values are smaller in absolute value than the critical Z value (1.779 < 1.96), we continue the null hypothesis that the proportion is equal to .80.
36.
You, as a manufacturer, can use a particular part only if its diameter is between .14 and .20 inches. Two companies, A and B, can supply you with these parts at comparable costs. Supplier A produces parts whose mean is .17 inches and whose standard deviation is .015 inches. However, supplier B produces parts whose mean is .16 inches and whose standard deviation is .012 inches. The diameters of the parts from each company are normally distributed. Which company should you buy from and why?
Answer: For Supplier A:
Z = (X - MU)/SIGMA = (.14 - .17)/.015 = -2
and Z = (.20 - .17)/.015 = 2
Area between Z = -2 and Z = 2 under the normal curve is .9544. Therefore, 95.44% of the parts would be within .14 in. and .20 in.
For Supplier B:
Z = (.14 - .16)/.012 = -1.67
and Z = (.20 - .16)/.012 = 3.33
Area between Z = -1.67 and Z = 3.33 under the normal curve is .9520. Therefore, 95.20% of the parts would be within .14 in. and .20 in.
Conclusion: I would choose Supplier A by a hair.
37.
A lightbulb is selected randomly from a factory's monthly production. The bulb's lifetime (total hours of illumination) is a random variable with exponential density function
f(x) = (1/MU)*(e**[-x/MU]) if x >= 0
     = 0 if x < 0,
where the fixed parameter MU is the mean of this distribution (MU > 0).
a) Derive the cumulative distribution function F(x). Show that a random lifetime X exceeds x hours (x > 0) with probability P(X > x) = e**(-x/MU).
b) Let M denote the smallest value in a random sample of n bulb lifetimes X(1), X(2), ..., X(n). Show that P(M > x) = P(X(1) > nx). HINT: M > x if and only if X(1) > x and X(2) > x and ... and X(n) > x.
c) Assume the mean lifetime MU = 700 hours. Use a) and a table of the exponential function to evaluate numerically i) the median lifetime x(.50), ii) P(X <= 70), iii) P(70 < X <= 700).
Answer: a) F(x) = INT(0 to x)((1/MU)*(e**[-t/MU])dt) = [-e**(-t/MU)] evaluated from t = 0 to t = x = 1.0 - [e**(-x/MU)]
F(x) = 0 if x < 0
     = 1.0 - [e**(-x/MU)] if x >= 0
Prob(X > x) = 1.0 - F(x) = 1.0 - [1.0 - [e**(-x/MU)]] = e**(-x/MU)
b) Prob(M > x) = [Prob(X(1)>x)]*[Prob(X(2)>x)]*...*[Prob(X(n)>x)] = [e**(-x/MU)]**n = e**(-xn/MU) = Prob(X(1) > xn)
c) i) 0.50 = Prob(X <= Median) = F(x) = 1.0 - [e**(-x/700)], so 0.50 = e**(-x/700); using a table of the exponential function, x/700 == .693 and x == 485.1 hours
ii) Prob(X <= 70) = F(70) = 1.0 - [e**(-70/700)] = 1.0 - 0.90484 = 0.09516
iii) Prob(70 < X <= 700) = F(700) - F(70) = [1.0 - [e**(-700/700)]] - [1.0 - [e**(-70/700)]] = [1.0 - .36788] - [0.09516] = 0.53696
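The exponential-lifetime results can be checked numerically. The sketch below uses only the closed forms for the exponential distribution, namely P(X > x) = e**(-x/MU) and F(x) = 1 - e**(-x/MU); the function names `survival` and `cdf` are illustrative.

```python
from math import exp, log

MU = 700.0   # mean lifetime in hours


def survival(x, mean=MU):
    """P(X > x) for an exponential lifetime with the given mean."""
    return exp(-x / mean) if x >= 0 else 1.0


def cdf(x, mean=MU):
    """F(x) = P(X <= x)."""
    return 1.0 - survival(x, mean)


median = MU * log(2)                 # solves F(x) = 0.50
p_within_70 = cdf(70)                # P(X <= 70)
p_70_to_700 = cdf(700) - cdf(70)     # P(70 < X <= 700)

# Part b): the minimum of n lifetimes exceeds x exactly when all n do.
n, x = 5, 100.0
p_min_exceeds = survival(x) ** n

print(round(median, 1), round(p_within_70, 5), round(p_70_to_700, 5))
```

The median from the closed form is 700*ln(2), about 485.2 hours; the table-based answer of 485.1 reflects rounding ln(2) to .693.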
38.
A lightbulb is selected randomly from a factory's monthly production. The bulb's lifetime (total hours of illumination) is a random variable with exponential density function
f(x) = (1/MU)*(e**[-x/MU]) if x >= 0
     = 0 if x < 0,
where the fixed parameter MU is the mean of this distribution (MU > 0).
a) For an exponential distribution the standard deviation SIGMA = MU. Let XBAR = (1/n)(X(1)+X(2)+...+X(n)) denote the average value in a random sample of n bulb lifetimes. Express E[XBAR] and VAR[XBAR] in terms of MU. If the mean MU = 700 hours and sample size n = 100, then the statistic Z = (XBAR - 700)/70 has approximately a normal distribution with what mean and variance?
b) Describe a test of the null hypothesis H(0): MU <= 700 against the alternative hypothesis H(1): MU > 700, using only the sample mean XBAR. If the desired significance level is ALPHA = .05 and sample size n = 100, then indicate which numerical values of XBAR correspond to this test rejecting H(0). (Use the table of the standard normal distribution.)
c) If mean MU = 700 hours, then P(X > 2100) = .04979. If instead MU > 700, is P(X > 2100) larger or smaller than .04979?
Answer: a) E[XBAR] = E[(1/n)*(X(1)+X(2)+...+X(n))] = (1/n)*[E[X(1)]+E[X(2)]+...+E[X(n)]] = (1/n)*[n*E[X]] = E[X]
E[X] = INT(0 to INFNTY)(x*(1/MU)*(e**[-x/MU])dx)
Integrating by parts, with u = x, dv = (1/MU)(e**[-x/MU])dx, du = dx, v = -e**[-x/MU]:
E[X] = [-x*(e**[-x/MU])] evaluated from 0 to INFNTY + INT(0 to INFNTY)(e**[-x/MU]dx)
= 0 + [-MU*(e**[-x/MU])] evaluated from 0 to INFNTY = MU
E[X**2] = INT(0 to INFNTY)((x**2)*(1/MU)*(e**[-x/MU])dx)
By parts, with u = x**2, dv = (1/MU)(e**[-x/MU])dx, du = 2x dx, v = -e**[-x/MU]:
E[X**2] = [-(x**2)*(e**[-x/MU])] evaluated from 0 to INFNTY + INT(0 to INFNTY)((2x)*(e**[-x/MU])dx) = 2*INT(0 to INFNTY)(x*(e**[-x/MU])dx)
By parts again, with u = x, dv = (e**[-x/MU])dx, du = dx, v = -MU*(e**[-x/MU]):
E[X**2] = 2*[[-x*MU*(e**[-x/MU])] evaluated from 0 to INFNTY + INT(0 to INFNTY)(MU*(e**[-x/MU])dx)]
= 2*[-(MU**2)*(e**[-x/MU])] evaluated from 0 to INFNTY = 2(MU**2)
VAR[XBAR] = VAR[(1/n)*(X(1)+X(2)+...+X(n))] = [(1/n)**2]*[VAR[X(1)]+VAR[X(2)]+...+VAR[X(n)]] = [(1/n)**2]*[n*VAR[X]] = (1/n)*(VAR[X]) = (1/n)*[E[X**2] - (E[X]**2)] = (1/n)*[2(MU**2) - (MU**2)] = (MU**2)/n
Z = (XBAR - 700)/70 E[Z] = (E[XBAR] - 700)/70 = (MU - 700)/70 = (700 - 700)/70 = 0/70 = 0
VAR[Z] = VAR[(XBAR - 700)/70] = [(1/70)**2] * VAR[XBAR] = [(1/70)**2] * [(MU**2)/n] = [1/4900] * [(700**2)/100] = 1
b) test statistic: Z = [XBAR - 700]/[700/SQRT(n)] critical region: Any value of Z(calc) that lies beyond the Z(crit) which is found in the standard normal table with ALPHA per cent of the distribution beyond it.
with n = 100 and ALPHA = .05, Z(crit) = 1.645
Thus in order to reject H(0), [XBAR - 700]/[700/SQRT(100)] >= 1.645 XBAR >= (1.645*70) + 700 XBAR >= 815.15
c) It can be shown that a random lifetime X exceeds x hours (x > 0) with probability P(X > x) = e**(-x/MU). Therefore, P(X > 2100) = e**(-2100/700) = e**(-3) = .04979. Now if MU > 700, the exponent -2100/MU becomes less negative (smaller in magnitude), and looking at a table of the exponential function it is evident that the probability becomes larger than .04979.
39.
In a given business venture a man can make a profit of $1000 or suffer a loss of $500. The probability of a profit is 0.6. What is the expected profit (or loss) in that venture?
Answer: p = .6 Expected profit = (.6*1000) - (.4*500) = 600 - 200 = 400
40.
The following probability distribution applies to the value of a stock during the coming year:
VALUE   P(VALUE)
100     .46
150     .04
200     .20
250     .20
300     .10
Compute the expected value of the stock. What interpretation would you give to this value?
Answer: E(V) = (100 * .46) + (150 * .04) + (200 * .20) + (250 * .20) + (300 * .10) = 46 + 6 + 40 + 50 + 30 = 172
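The expected value is a probability-weighted sum, which a short Python sketch makes explicit. The list names are illustrative; the sanity check that the probabilities sum to one is an addition and not part of the original answer.

```python
values = [100, 150, 200, 250, 300]
probabilities = [0.46, 0.04, 0.20, 0.20, 0.10]

# A valid probability distribution must sum to one.
total_probability = sum(probabilities)

# E(V) = sum of value * probability over all outcomes.
expected_value = sum(v * p for v, p in zip(values, probabilities))
print(round(expected_value, 2))
```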
The $172 is the average, or mean, of the distribution of stock values.
41.
Suppose that the probability that a salesman makes a sale to any customer is .4. If each sale is worth $100 in commissions and the events of making a sale to two different customers are independent, what is his expected commission if he sees two customers on a particular day?
Answer: If we let X = commissions that day, the probability distribution for X is:
X    | p(X)
-----+----------------------------
0    | .6 * .6 = .36
$100 | (.6 * .4) + (.4 * .6) = .48
$200 | (.4 * .4) = .16
E(X) = (0 * .36) + (100 * .48) + (200 * .16) = 48 + 32 = $80
42.
An investor wishes to buy a stock and sell it three months later. After much investigation he has narrowed his possible choices to five different stocks and decides to pick one of these at random. The first stock has one chance in four of losing value. The second stock has one chance in three of losing value. The third stock has two chances out of nine of losing value. The fourth and the fifth stocks both have three chances out of ten of losing value. What is the probability that the investor loses money on his investment?
Answer: P(picking a particular stock) = .2
Given: First Stock: P(losing) = .25 Second Stock: P(losing) = .33 Third Stock: P(losing) = .22 Fourth Stock: P(losing) = .3 Fifth Stock: P(losing) = .3
Probability that the investor loses money on his investment = E[P(losing)]
E[P(losing)] = (.25*.2) + (.33*.2) + (.22*.2) + (.3*.2) + (.3*.2) = .28
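The calculation above is an application of the law of total probability, and it can be reproduced without rounding the individual chances. This sketch uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Chance of losing value for each of the five stocks, as stated in the question.
loss_probs = [Fraction(1, 4), Fraction(1, 3), Fraction(2, 9),
              Fraction(3, 10), Fraction(3, 10)]
pick_prob = Fraction(1, len(loss_probs))  # each stock is equally likely to be picked

# P(loss) = sum over stocks of P(pick stock) * P(loss | stock)
p_loss = sum(pick_prob * p for p in loss_probs)
print(p_loss, round(float(p_loss), 2))  # 253/900 0.28
```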
43.
Suppose an investor buys a stock for $100 per share with intentions of selling it three months later. At the end of three months he has one chance in four of selling for $80 per share, one chance in four of selling for $100 per share and one chance in two of selling for $140 per share. How much per share can the investor expect to make on this stock when he sells it?
Answer: Let X = amount made by investor when he sells stock.
x    | p(x)
-----+-----
-20  | .25
0    | .25
40   | .5
E(X) = (-20*.25) + (0*.25) + (40*.5) = -5 + 0 + 20 = 15
The investor can expect to make $15 per share on this stock when he sells it.
44.
Joe Pennyworth has a very rare 1919SVDB penny. He is considering accepting a firm offer of $8 for the penny or putting it up for auction at the local numismatic club. His possible actions are:
a(1): put the penny up for auction a(2): accept $8 for the penny
An estimate of the probability distribution for the sales price at the auction is given to Joe as:
Sales Price   Probability
   $ 6           .10
     7           .20
     8           .30
     9           .30
    10           .10
Joe has determined his utility function as:
Dollar Value   Utility
   $ 6           1.0
     7           2.5
     8           4.0
     9           4.5
    10           5.0
a. If Joe evaluates the problem by considering expected monetary values what is his decision?
b. If Joe evaluates the problem by considering expected utilities, what is his decision?
c. Plot the utility function.
d. Is Joe a risk preferrer?
Answer: a. Expected profit if the penny is put up for auction:
E[a(1)] = [6*.10] + [7*.20] + [8*.30] + [9*.30] + [10*.10] = .60 + 1.40 + 2.40 + 2.70 + 1.00 = $8.10
Expected profit if he accepts the offer:
E[a(2)] = [8.00*1.00] = $8.00
Since E[a(1)] > E[a(2)], he should put the penny up for auction.
b. Expected utility of putting the penny up for auction:
Exp. Ut. [a(1)] = [1*.10] + [2.5*.20] + [4*.30] + [4.5*.30] + [5*.10] = 3.65
Expected utility of accepting offer:
Exp. Ut. [a(2)] = [4*1.00] = 4
Since Exp. Ut.[a(2)] > Exp. Ut.[a(1)], he should accept the offer of $8 for the penny.
c. [Plot: Utility (vertical axis, 1 to 5) against Dollar Value (horizontal axis, 1 to 10), with points at (6, 1.0), (7, 2.5), (8, 4.0), (9, 4.5) and (10, 5.0). NOTE: Connect the points with a smooth curve.]
d. No, a risk avoider.
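Parts a and b differ only in what is averaged over the auction prices: dollars in one case, utilities in the other. The sketch below recomputes both using the tables given in the question; the variable names are illustrative.

```python
prices = [6, 7, 8, 9, 10]
probs = [0.10, 0.20, 0.30, 0.30, 0.10]
utility = {6: 1.0, 7: 2.5, 8: 4.0, 9: 4.5, 10: 5.0}

# a(1): auction, averaged over the price distribution.
emv_auction = sum(x * p for x, p in zip(prices, probs))
eu_auction = sum(utility[x] * p for x, p in zip(prices, probs))

# a(2): the firm $8 offer is certain.
emv_offer, eu_offer = 8.0, utility[8]

choice_by_money = "auction" if emv_auction > emv_offer else "offer"
choice_by_utility = "auction" if eu_auction > eu_offer else "offer"
print(choice_by_money, choice_by_utility)  # auction offer
```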
45.
A lot containing 12 parts among which 3 are defective is put on sale "as is" at $10.00 per part with no inspection possible. If a defective part represents a complete loss of the $10.00 to the buyer and the good parts can be resold at $14.50 each, is it worthwhile to buy one of these parts and select it at random?
Answer: Expected return value of part = .75*(14.50) + .25(0) = 10.875
Since each part costs $10.00, the expected gain is 10.875 - 10.00 = .875, or approximately $.87 on each part you buy, and it is worthwhile to buy one selected at random.
46.
The Connecticut Daily Numbers game uses a selection procedure very similar to the one described in the following paragraph for the selection of random numbers.
There are 10 identical ping-pong balls on which the digits 0, 1, ..., 9 have been written. After mixing the balls thoroughly in a box, one is selected without looking. The digit written on the ball is recorded, and then the ball is put back in the box. This whole process of mixing, selecting, writing down a digit, and returning the ball to the box is repeated again and again.
The first 100 digits selected by the Connecticut Lottery showed the following distribution:
digit: 0 1 2 3 4 5 6 7 8 9
number of occurrences: 23 7 5 8 12 11 8 9 7 10
Test the hypothesis that all 10 digits are equally likely. The descriptive level, DELTA, for the test satisfies:
a. DELTA < .01 b. .01 < DELTA < .025 c. .025 < DELTA < .10 d. .10 < DELTA
Answer: a. DELTA < .01
H(O): All digits are equally likely to occur. (This defines a Goodness of Fit Test for a uniform distribution.)
Expected frequency for each cell is 10.
CHISQ(calc) = [(23 - 10)**2 + (7 - 10)**2 + (5 - 10)**2 + (8 - 10)**2 + (12 - 10)**2 + (11 - 10)**2 + (8 - 10)**2 + (9 - 10)**2 + (7 - 10)**2 + (10 - 10)**2]/10 = 22.6
The probability of obtaining a CHISQ value at least as large as the calculated 22.6 is less than .01. (This DELTA value was obtained from a table of the CHISQUARE distribution with 9 df.)
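The goodness-of-fit statistic is simple to recompute. This is a minimal sketch using only the observed digit counts from the question; the helper name `chi_square_gof` is illustrative, and the comparison against the tabled value is still done by hand.

```python
def chi_square_gof(observed, expected):
    """Sum of (O - E)**2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

digit_counts = [23, 7, 5, 8, 12, 11, 8, 9, 7, 10]
n = sum(digit_counts)
expected = [n / len(digit_counts)] * len(digit_counts)  # uniform: 10 per digit
gof_stat = chi_square_gof(digit_counts, expected)
gof_df = len(digit_counts) - 1
print(round(gof_stat, 1), gof_df)  # 22.6 9
```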
47.
Among twenty-five articles, nine are defective, six having only minor defects and three having major defects. Determine the probability that an article selected at random has major defects given that it has defects.
a. 1/3 b. .25 c. .24 d. .08
Answer: a. 1/3
P[MD|D] = (3/25)/(9/25) = 3/9 = 1/3
48.
Among twenty-five articles eight are defective, six having only minor defects and two having major defects. Determine the probability that an article selected at random has major defects given that it has defects.
(a) .08 (b) .25 (c) 1/3 (d) .24
Answer: (b) .25
P(MD|D) = P(MD and D)/P(D) = (2/25)/(8/25) = .25
49.
The following table shows the composition of employees at Dwinal's Inn.
                 | Full-time (FT) | Part-time (PT) | TOTAL
Waiters (W)      |      20        |                |  30
Bartenders (B)   |                |                |
Cooks (C)        |      10        |                |  15
TOTAL            |                |      15        |  50
a) Complete the above table.
b) From this table structure (form) another table showing all the marginal and joint probabilities.
c) Find the following conditional probabilities:
i) P(PT|W) = ? ii) P(B|FT) = ? iii) P(B|C) = ? iv) P(C|PT) = ?
Answer: a)
                 | Full-time (FT) | Part-time (PT) | TOTAL
Waiters (W)      |      20        |      10        |  30
Bartenders (B)   |       5        |       0        |   5
Cooks (C)        |      10        |       5        |  15
TOTAL            |      35        |      15        |  50
b)
                 |      Joint (FT, PT)      | Marginal
Waiters (W)      |    .4     |    .2        |   .6
Bartenders (B)   |    .1     |   0.00       |   .1
Cooks (C)        |    .2     |    .1        |   .3
Marginal         |    .7     |    .3        |
c) i) P(PT|W) = P(PT INTRSCT W)/P(W) = .2/.6 = .33
ii) P(B|FT) = P(B INTRSCT FT)/P(FT) = .1/.7 = .143
iii) P(B|C) = P(B INTRSCT C)/P(C) = 0.0/.3 = 0
iv) P(C|PT) = P(C INTRSCT PT)/P(PT) = .1/.3 = .33
50.
A certain kind of job opening can be filled by hiring only either a high school graduate or a college graduate. In the past all hirings for the job have resulted in 80% of the hirings being successful according to the company's evaluation. It is also known that among all failures at the job, 40% have been high school graduates, while among all successes only 30% have been high school graduates. Using the percentages as probabilities, find the probability that if the job is given to a high school graduate it will be successful.
Answer: Using Bayes' Law:
P(success) = .8 P(HS|F) = .4 P(HS|S) = .3
P(S|HS) = [P(S)*P(HS|S)]/[P(S)*P(HS|S) + P(F)*P(HS|F)] = ((.8)(.3))/((.8)(.3) + (.2)(.4)) = .75
or the population breaks down as follows:
                       | Successful | Failure |
High school graduate   |    24%     |    8%   | 32%
College graduate       |    56%     |   12%   | 68%
                            80%         20%
Therefore, Prob(successful | high school graduate) = 24/32 = .75
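Both routes to the answer are the same Bayes' Law computation. The sketch below spells it out; the function and parameter names are illustrative, not taken from the source.

```python
def posterior_success(p_success, p_hs_given_success, p_hs_given_failure):
    """P(success | high school graduate) by Bayes' Law."""
    p_failure = 1 - p_success
    joint_success = p_success * p_hs_given_success   # P(S and HS)
    joint_failure = p_failure * p_hs_given_failure   # P(F and HS)
    return joint_success / (joint_success + joint_failure)

p_s_given_hs = posterior_success(0.8, 0.3, 0.4)
print(round(p_s_given_hs, 2))  # 0.75
```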
51.
Suppose in a 500 mile race there are 12 entries. 4 cars are to be placed in each of three rows to start the race. How many ways can the first row of cars in the race be formed?
Answer: 4! * 12C4 = 11880
or
12P4 = 11880
52.
An ice cream store has three sizes of ice cream cones, small, medium, and large. If four cones are randomly selected, one at a time, what is the probability that a small cone will be selected before three large cones are selected?
a. 1/3 b. 2/3 c. 64/81 d. 9/16 e. 61/64
Answer: c. 64/81
1 - P(no S selected) - P(LLLS) = 1 - ((2/3)**4) - ((1/3)**4) = 1 - (16/81) - (1/81) = 64/81
OR:
Sxxx   1/3                       27/81
-Sxx   2/3*1/3                   18/81
--Sx   2/3*2/3*1/3               12/81
MMMS   ((1/3)**4)                 1/81
MMLS   3*((1/3)**2)((1/3)**2)     3/81
MLLS   3*((1/3)**2)((1/3)**2)     3/81
                         Total:  64/81
53.
A certain assembly consists of two sections, A and B, which are bolted together. In a bin of 100 assemblies, 12 have only section A defective, 10 have only section B defective, and 2 have both section A and section B defective. What is the probability of choosing, without replacement, 2 assemblies from the bin which have neither section A nor section B defective?
a. (76)**2/(100)**2 b. (98)**2/(100)**2 c. 98(97)/[100(99)] d. 76(75)/[100(99)] e. none of these
Answer: d. 76(75)/[100(99)]
# of assemblies without defects = 100 - (12 + 10 + 2) = 100 - 24 = 76
P(of no defectives) = (76/100)*(75/99)
54.
Given that each of three identical devices operating independently has probability 3/4 of operating successfully, determine the probability that exactly two of the three fail.
(a) 3/64 (b) 9/64 (c) 4/64 (d) 27/64
Answer: (b) 9/64
P(2 failures and 1 success) = 3*(1/4)*(1/4)*(3/4) = 3*(3/64) = 9/64
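This is a binomial probability with n = 3 trials and failure probability 1/4. A minimal sketch with exact fractions follows; `binomial_pmf` is an illustrative helper name.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k occurrences in n independent trials, each with probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_fail = Fraction(1, 4)  # each device fails with probability 1 - 3/4
p_two_fail = binomial_pmf(2, 3, p_fail)
print(p_two_fail)  # 9/64
```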
55.
The life in months of service before failure of the color television picture tube in 8 television sets manufactured by Firm A and 8 sets manufactured by Firm B are as follows (arranged according to size):
Firm A: 25, 29, 31, 32, 35, 37, 39, 40 Firm B: 34, 36, 41, 43, 44, 45, 47, 48
Let ETA(A) and ETA(B) denote the median service life of picture tubes produced by the 2 firms. A confidence interval for ETA(B) - ETA(A) is bounded by the d-th smallest and the d-th largest of all differences of B- and A-observations. For confidence coefficient .99, we take d equal to:
(a) 9 (b) 14 (c) 15 (d) 17
56.
Prices of shares on the stock market are recorded to 1/8th of a dollar. We might then expect to find stocks selling at prices ending in:
0 1/8 1/4 3/8 1/2 5/8 3/4 7/8
with about equal frequency. On a certain day, 120 stocks showed the following frequencies:
26 8 15 9 22 12 19 9.
H(O): P(0) = P(1/8) = P(1/4) = P(3/8) = P(1/2) = P(5/8) = P(3/4) = P(7/8), or the distribution of final eighths is uniform. H(A): the distribution of final eighths is not uniform.
At significance level ALPHA = .10, the hypothesis being tested is rejected provided the test statistic is:
a) greater than 1.28. b) greater than 14.7. c) smaller than 4.17. d) greater than 12.0. e) smaller than 2.83.
Answer: d) greater than 12.0.
CHISQUARE(critical, df=7, ALPHA=.10) = 12.0
57.
Prices of shares on the stock market are recorded to 1/8th of a dollar. We might then expect to find stocks selling at prices ending in:
0 1/8 1/4 3/8 1/2 5/8 3/4 7/8
with about equal frequency. On a certain day, 120 stocks showed the following frequencies:
26 8 15 9 22 12 19 9.
H(O): P(0) = P(1/8) = P(1/4) = P(3/8) = P(1/2) = P(5/8) = P(3/4) = P(7/8), or the distribution of final eighths is uniform. H(A): the distribution of final eighths is not uniform.
In testing the above hypothesis, all expected frequencies equal:
a) 12 b) 60 c) 125 d) 500 e) none of these
Answer: e) none of these
e = 120 * (1/8) = 15
58.
A manufacturer of floor polish conducted a consumer-preference experiment to determine which of the five different floor polishes was superior. A sample of 100 housewives viewed five patches of flooring which received the five polishes. Each housewife indicated the patch that she considered superior in appearance. The lighting, background, etc., were approximately the same for all five patches. The result of the survey was as follows:
Polish A B C D E TOTAL Frequency 27 17 15 22 19 100
a. State the hypothesis of "no preference" in statistical terminology. b. State the test statistic used. c. Test the hypothesis at ALPHA = .10 and draw the conclusion.
Answer: a.) H(O): P(A) = P(B) = P(C) = P(D) = P(E) = 1/5 H(A): at least one of P(A), P(B), P(C), P(D), P(E) differs from 1/5
b.) Use CHISQUARE with 4 df
c.) CHISQUARE (calculated) = Sum [((O - E)**2)/E]
                  A      B      C      D      E      TOTAL
O                 27     17     15     22     19     100
E = n*p(i)        20     20     20     20     20     100
O - E              7     -3     -5      2     -1       0
(O - E)**2        49      9     25      4      1
((O - E)**2)/E    2.45   .45    1.25   .20    .05    4.40
CHI SQUARE (calculated) = 4.40 CHI SQUARE (critical, df = 4, ALPHA = .10) = 7.78
Therefore the data supports the null hypothesis at the .10 level and a conclusion that no significant consumer preference for floor polish has been found.
59.
A market research firm was hired to test consumer preference for different packages for some soap. Two hundred randomly selected housewives were given a package of soap wrapped in each of the following colors: red, white, blue, green. After a month in which they could use the soap, they were given a free case of the color package of their choice. There were no markings to differentiate the packages - just color - and the soap itself was the same. Is there a significant difference in the colors they selected?
Color Package   No. Housewives Choosing
red             50
white           75
blue            30
green           45
Answer: Contingency table:
O   50   75   30   45
E   50   50   50   50
CHISQ = ((50 - 50)**2)/50 + ((75 - 50)**2)/50 + ((30 - 50)**2)/50 + ((45 - 50)**2)/50 = 0 + 12.5 + 8 + .5 = 21
df = (K - 1) = 3
P(CHISQ(3) >= 21) < .001
Reject H(O) at ALPHA = .10, .05, or .01. Conclude that there is a significant difference in the colors chosen.
60.
Horseracing fans often insist that in a race around a circular track the horses in certain post positions have significant advantages. Post position 1 is nearest to the inside rail and post position 8 is farthest to the outside. Suppose we observed the results of races for one month of racing at a track. (Horses were randomly assigned to post positions.) Results were as follows:
post position   1    2    3    4    5    6    7    8
no. of wins     29   15   18   25   17   10   15   11
Use CHISQUARE to test whether there is any difference in number of wins. State the null hypothesis you are testing.
Answer: H(O): No difference in number of wins per post position (uniform distribution).
Total number of wins = 140. Expected number of wins/position = 140/8 = 17.5
CHISQUARE(calculated) = ((29 - 17.5)**2)/17.5 + ((15 - 17.5)**2)/17.5 + ((18 - 17.5)**2)/17.5 + ((25 - 17.5)**2)/17.5 + ((17 - 17.5)**2)/17.5 + ((10 - 17.5)**2)/17.5 + ((15 - 17.5)**2)/17.5 + ((11 - 17.5)**2)/17.5 = 17.143
ALPHA = .05 (or .10 or .01) usually; df = 8 - 1 = 7
.01 <= P(CHISQ(7) >= 17.143) <= .05
Therefore, the conclusion is to reject H(O) at ALPHA = .05 or .10, and continue H(O) at ALPHA = .01.
61.
New York State Thruway Commission is examining lane usage on the bridge leading to the Big Apple (Tappan Zee Bridge). It is hypothesized that, during rush hours, traffic in vehicles/hour in the rightmost four lanes is in the ratio:
        Lane    1     2     3     4
LEFT         |  11 |  12 |  10 |   7 |   RIGHT
In essence this means that of the total traffic for the four lanes:
11/40 took lane 1 12/40 took lane 2 10/40 took lane 3 7/40 took lane 4
A sampling of lane traffic for one day is as follows:
| 8200 | 9000 | 7350 | 5200 |
Can we conclude at ALPHA = .05 that traffic lane usage occurred in the hypothesized ratios?
(a) Pick the most appropriate hypothesis test. What is it? (b) State the null and alternative hypotheses. (c) Compute a test statistic. (d) Indicate the critical value or values. (e) Do you continue or reject H(O)? What is your conclusion relative to the question posed above?
Answer: (a) CHISQUARE goodness of fit
(b) H(O): The lane usage occurs in the ratio of 11:12:10:7. H(A): The lane usage is other than 11:12:10:7.
(c)
observed | 8200    | 9000 | 7350   | 5200    | 29750
expected | 8181.25 | 8925 | 7437.5 | 5206.25 |
CHISQUARE(calc) = .042 + .630 + 1.029 + .008 = 1.709
(d) CHISQUARE(.05,3) = 7.815
(e) Do not reject H(O) since CHISQUARE(calc) < CHISQUARE(crit). Conclude that lane usage occurs in hypothesized ratios.
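Steps (c) through (e) can be reproduced as follows. This is a sketch only: the critical value 7.815 is copied from part (d) rather than computed, and the names are illustrative.

```python
def expected_from_ratio(ratio, total):
    """Split a total count across cells in proportion to the hypothesized ratio."""
    weight = sum(ratio)
    return [total * r / weight for r in ratio]

lane_observed = [8200, 9000, 7350, 5200]
lane_expected = expected_from_ratio([11, 12, 10, 7], sum(lane_observed))
lane_stat = sum((o - e) ** 2 / e for o, e in zip(lane_observed, lane_expected))

chi_crit = 7.815  # tabled value for ALPHA = .05, df = 3, as given in part (d)
reject_h0 = lane_stat > chi_crit
print(lane_expected)            # [8181.25, 8925.0, 7437.5, 5206.25]
print(round(lane_stat, 2), reject_h0)
```

Carrying full precision gives a statistic of about 1.71; the 1.709 in part (c) comes from summing the four rounded terms.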
62.
The following is the number of cars produced in an auto plant.
MON   TUE   WED   THU   FRI
20    25    25    20    10
Test the null hypothesis at ALPHA = .01 that production does not depend on the day of the week.
Answer:
n(i)      20   25   25   20   10
E[n(i)]   20   20   20   20   20
CHISQUARE(calculated) = 7.5 CHISQUARE(critical, df=4, ALPHA=.01) = 13.3
Since CHISQUARE(calculated) < CHISQUARE(critical), we cannot reject the null hypothesis at ALPHA = .01.
63.
It is desired to see whether there is a relationship in tastes for an expensive car and owning a trimaran. A survey of 200 upperclass potential purchasers of cars and trimarans gave these responses:
                      | Want Expensive Car | Do Not Want Expensive Car | Totals
Want Trimaran         |        100         |            40             |  140
Don't Want Trimaran   |         20         |            40             |   60
Totals                |        120         |            80             |  200
Specifically, it is desired to test H(O): the desire for a trimaran is independent of a desire for an expensive car.
If H(O) were true, the estimate of the expected number of those who do not want either a trimaran or an expensive car would be:
a) (140*120)/200 b) (140*60)/200 c) (60*80)/200 d) (140*80)/200 e) (120*60)/200
Answer: c) (60*80)/200
Expected Value = (80/200)(60/200)(200) = (80*60)/200
64.
It is desired to see if there is a relationship in tastes for an expensive car and owning a trimaran. A survey of 200 upper-upper-class potential purchasers of cars and trimarans gave these responses:
                      | want expensive car | don't want expensive car | totals
want trimaran         |        100         |            40            |  140
don't want trimaran   |         20         |            40            |   60
totals                |        120         |            80            |  200
Specifically, it is desired to test H(O): the desire for a trimaran is independent of the desire for an expensive car.
The contribution to the Chi-square statistic of the term: desire trimaran and desire expensive car is:
a) ((100 - 120)**2)/120 b) ((100 - 140)**2)/140 c) ((100 - 84)**2)/84 d) ((100 - 78)**2)/78 e) ((100 - 80)**2)/80
Answer: c) ((100 - 84)**2)/84
Expected Value = (120/200)(140/200)(200) = 84
Contribution = ((100 - 84)**2)/84
65.
It is desired to see if there is a relationship in tastes for an expensive car and owning a trimaran. A survey of 200 upper-upper-class potential buyers of cars and trimarans gave these results:
                      | want expensive car | don't want expensive car | total
want trimaran         |        100         |            40            |  140
don't want trimaran   |         20         |            40            |   60
totals                |        120         |            80            |  200
Specifically, it is desired to test H(O): the desire for a trimaran is independent of the desire for an expensive car vs. H(1): there is a relationship at level ALPHA = .20.
Given that the appropriate normalized statistic is greater than 23 and less than 30, one should ______ H(O): independence since the value ______ is ______ than the correct cutoff point.
a) reject, 30, bigger b) reject, 23, bigger c) continue, 30, bigger d) continue, 23, bigger e) continue, 23, smaller
Answer: b) reject, 23, bigger
CHISQUARE(critical, df = 1, ALPHA = .20) = 1.64, and if CHISQUARE(calculated) is in the interval (23,30), then CHISQUARE(calculated) > CHISQUARE(critical), which implies that H(O) should be rejected.
66.
The life in months of service before failure of the color television picture tube in 8 television sets manufactured by Firm B are as follows (arranged according to size):
Firm B: 34, 36, 41, 43, 44, 45, 47, 48
Let ETA(B) denote the median service life of picture tubes produced by the firm. To test the hypothesis ETA(B) = 38.5 against the alternative ETA(B) =/= 38.5, the value of CHISQ(calculated) for the median test equals:
(a) 8 (b) 6 (c) 4 (d) 2
Answer: (d) 2
           | Above 38.5 | Below 38.5
observed   |     6      |     2
expected   |     4      |     4
CHISQ = [(6 - 4)**2 + (2 - 4)**2]/4 = 2
67.
A car rental agency is in the process of deciding the brand of tire to purchase as standard equipment for their fleet. As part of the decision process, they are interested in studying the treadlife of five competing brands. Based on testing, the research department determined that each of 10 tires of each brand will last the following number of miles (in 1000's to the nearest 1000). Compute a CHISQ median test. Test the null hypothesis H(O): no difference among tires, with ALPHA = .05.
Tire Brands
A    B    C    D    E
40   45   30   35   28
42   40   32   40   32
45   40   31   42   34
38   44   35   36   28
40   42   28   38   32
41   44   29   34   26
43   41   31   41   29
43   41   30   41   31
37   43   34   35   25
40   41   27   37   31
Answer: MD(overall) = 37
Observed:
            A     B     C     D     E
above MD    9     10    0     5     0
below MD    0     0     10    4     10
Expected:
            A     B     C     D     E
above MD    4.5   5     5     4.5   5
below MD    4.5   5     5     4.5   5
CHISQ(calculated) = 39.11 CHISQ(ALPHA=.05, df=4) = 9.488
CHISQ(calculated) > CHISQ(critical), therefore reject H(O) and conclude the samples are from populations with different medians.
68.
Test that there is no relationship between performance in a company's training program and ultimate success in the job. Use ALPHA = 0.01. The following data is obtained from 400 samples of a company.
PERFORMANCE IN TRAINING PROGRAM
                |   A   |   B   |   C
SUCCESS    A    |  63   |  49   |   9
IN JOB     B    |  60   |  79   |  28
           C    |  29   |  60   |  23
Answer: H(O): There is no relationship between performance in the training program and success in the job. H(A): There is a relationship between performance in the training program and success in the job.
        |     A       |     B       |     C       | Total
A       | 63 (45.98)  | 49 (56.87)  |  9 (18.15)  |  121
B       | 60 (63.46)  | 79 (78.49)  | 28 (25.05)  |  167
C       | 29 (42.56)  | 60 (52.64)  | 23 (16.8)   |  112
Total   |    152      |    188      |     60      |  400
CHISQUARE = ((63 - 45.98)**2)/45.98 + ... + ((23 - 16.8)**2)/16.8 = 20.18
Critical value = 13.3 df = (3 - 1)(3 - 1) = 4
Since CHISQUARE(critical) < CHISQUARE(calculated), reject the null hypothesis and conclude that there is a relationship.
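The expected counts and the statistic can be generated directly from the observed table. This is a minimal sketch of the usual test of independence (expected = row total * column total / n); the function name is illustrative, and the critical value is still read from a table.

```python
def chi_square_independence(table):
    """Return (statistic, df) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count under H(O)
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: success in job A, B, C; columns: training performance A, B, C.
training = [[63, 49, 9], [60, 79, 28], [29, 60, 23]]
ind_stat, ind_df = chi_square_independence(training)
print(round(ind_stat, 2), ind_df)  # 20.18 4
```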
69.
In order to find out how viewing preferences of TV viewers change over the years, networks conduct viewer surveys. In such a survey, viewers of sports events were asked to name their favorite sport. The following table gives responses for the years 1960 and 1970.
             1960   1970
football     150    250
baseball     250    150
basketball   100    100
The null hypothesis tested by an appropriate CHISQUARE test is:
a) 1970 viewers prefer football to baseball. b) 1960 viewers prefer baseball to football. c) There have been no changes in viewing preferences between 1960 and 1970. d) Viewing habits of TV watchers have changed between 1960 and 1970. e) The number of sports viewers has remained the same over the years.
Answer: c) There have been no changes in viewing preferences between 1960 and 1970.
70.
In order to find out how viewing preferences of TV viewers change over the years, networks conduct viewer surveys. In such a survey, viewers of sports events were asked their favorite sport. The following table gives responses for the years 1960 and 1970.
             1960   1970
football     150    250
baseball     250    150
basketball   100    100
Using the null hypothesis that there have been no changes in viewing preferences between 1960 and 1970, the value of CHISQUARE for the given table is:
a) less than 2. b) between 2 and 10. c) between 10 and 25. d) between 25 and 45. e) greater than 45.
Answer: e) greater than 45.
Expected values:
             1960   1970
football     200    200
baseball     200    200
basketball   100    100
CHISQUARE(calc) = SUM([(O - E)**2]/E) = ([(150 - 200)**2]/200) + ([(250 - 200)**2]/200) + ([(100 - 100)**2]/100) + ([(250 - 200)**2]/200) + ([(150 - 200)**2]/200) + ([(100 - 100)**2]/100) = (2500/200) + (2500/200) + 0 + (2500/200) + (2500/200) + 0 = 10000/200 = 50
71.
In order to find out how viewing preferences of TV viewers change over the years, networks conduct viewer surveys. In such a survey, viewers of sports events were asked their favorite sport. The following table gives responses for the years 1960 and 1970.
             1960   1970
football     150    250
baseball     250    150
basketball   100    100
The network was interested in testing the null hypothesis that there have been no changes in viewing preferences between 1960 and 1970. If the correct value of CHISQUARE is sufficiently high to reject the hypothesis being tested, then we can conclude that:
a) viewing habits have not changed over the 10-year span. b) basketball is more popular in 1970 than in 1960. c) both football and baseball have become more popular in 1970. d) the appeal of football has increased and that of baseball has decreased between 1960 and 1970. e) the appeal of baseball has increased and that of football has decreased between 1960 and 1970.
Answer: d) the appeal of football has increased and that of baseball has decreased between 1960 and 1970.
Since we reject H(O): that there has been no change in viewing habits, and from the table, we can see that more people preferred football in 1970 than in 1960, that fewer people preferred baseball in 1970 than in 1960, and that there was no change in preference with regard to basketball, the above conclusion is appropriate.
72.
Given the following data matrix:
                                                AUTOMOBILES
OWNER'S AGE                   | CHEV. CORVETTE | MUSTANG II | VW RABBIT | MONTE CARLO | TOTAL
less than 40                  |       21       |    143     |    36     |     28      |  228
greater than or equal to 40   |       26       |     61     |    35     |     41      |  163
TOTAL                         |       47       |    204     |    71     |     69      |  391
Test at ALPHA = .05 if the populations of cars have different distributions of ages of persons owning them. If the null hypothesis is rejected, construct confidence intervals about the proportion differences as a posthoc test procedure.
Answer: a. Expected Frequencies:
27.41   118.96   41.40   40.24
19.59    85.04   29.60   28.76
CHISQ(calculated) = 21.88 CHISQ(ALPHA=.05, df=3) = 7.815
CHISQ(calculated) > CHISQ(critical), therefore reject H(O) and conclude that the distributions are different.
b.
P(1)    P(2)    P(3)    P(4)
.4468   .7010   .5070   .4058
P(i) - P(j) +/- SQRT(CHISQ(critical) * (p(i)q(i)/n(i) + p(j)q(j)/n(j)))
Pairs    C.I.
1 - 2    -.2542 +/- .2216 *
1 - 3    -.0602 +/- .2619
1 - 4     .0410 +/- .2615
2 - 3     .1940 +/- .1885 *
2 - 4     .2952 +/- .1879 *
3 - 4     .1012 +/- .1751
Conclude: P(1) - P(2), P(2) - P(3), and P(2) - P(4) caused rejection of H(O).
73.
Frequency of repairs is being examined for two populations of cars, foreign and domestic. Given the sample data below, can we conclude that the population distributions are the same at ALPHA = .10?
                 Frequency of Repairs/Year
                 0       | 1 - 2   | 3 - 5   | More than 5
Foreign Autos    6       | 11      | 11      | 7
Domestic Autos   100     | 50      | 22      | 17
Answer: Expected frequencies:
                 0       | 1 - 2   | 3 - 5   | More than 5
Foreign Autos    16.56   | 9.53    | 5.16    | 3.75
Domestic Autos   89.44   | 51.47   | 27.84   | 20.25
CHISQUARE(calculated) = 19.439 CHISQUARE(critical, ALPHA=.10, df=3) = 6.251
Since CHISQUARE(calculated) > CHISQUARE(critical), reject H(0) and conclude that the distributions are not the same.
74.
Below are the results of an insurance survey to relate amount of insurance to income.
Family   Amount of Insurance (in Thousand $)   Income (in Thousand $)
A         9                                    10
B        20                                    14
C        22                                    15
D        15                                    14
E        17                                    14
F        30                                    25
G        18                                    12
H        25                                    16
I        10                                    12
J        20                                    15
Find RHO and TAU and test each for significance. (Note: data has ties.)
Answer: a.
R(X)   R(Y)   (R(X) - R(Y))**2
1      1      0
2      2.5    .25
3      5      4
4      5      1
5      2.5    6.25
6.5    5      2.25
6.5    7.5    1
8      7.5    .25
9      9      0
10     10     0
              ----
              15
H(O): X and Y are independent H(A): X and Y are correlated
RHO(crit) = .6364 RHO(calc) = 1 - ((6*15)/(10*99)) = .9091
Since RHO(calc) > RHO(crit), reject H(O) and conclude that X and Y are correlated.
b.
N(C)   N(D)   #Neither
9      0      0
7      0      1
4      1      2
4      1      1
5      0      0
3      0      1
2      0      1
2      0      0
1      0      0
0      0      0
----   ----   ----
37     2      6
U(X) = (1/2) * (2*1) = 1 U(Y) = (1/2) * (2*1 + 3*2 + 2*1) = 5
TAU = (37 - 2)/(SQRT(44)*SQRT(40)) = 35/41.952 = .8343
H(O): X and Y are independent H(A): X and Y are correlated
T(calc) = 37 - 2 = 35 T(crit) = 21
Since T(calc) > T(crit), reject H(O) and conclude that X and Y are correlated.
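The rank correlation in part a can be recomputed from the raw data. The sketch below assigns midranks to ties and then applies the same 1 - 6*SUM(d**2)/(n(n**2 - 1)) formula used in the answer; note that with ties this formula is only an approximation to the exact rank correlation. The helper names are illustrative.

```python
def midranks(values):
    """Rank the values from 1 to n, giving tied values the average of their positions."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1           # 1-based position of the first copy
        count = ordered.count(v)               # number of tied copies
        rank_of[v] = first + (count - 1) / 2   # average of the tied positions
    return [rank_of[v] for v in values]

def spearman_rho(x, y):
    """Return (rho, sum of squared rank differences)."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(midranks(x), midranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1)), d2

insurance = [9, 20, 22, 15, 17, 30, 18, 25, 10, 20]   # families A through J
income = [10, 14, 15, 14, 14, 25, 12, 16, 12, 15]
rho, d2 = spearman_rho(insurance, income)
print(d2, round(rho, 4))  # 15.0 0.9091
```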
75.
The observed life, in months of service, before failure for the color television picture tube in 8 television sets manufactured by Firm B are as follows (arranged according to size):
Firm B: 34 36 41 43 44 45 47 48
Let ETA(B) denote the median service life of picture tubes produced by the firm.
The point estimate of ETA(B) equals:
a. 35 b. 43.5 c. 44 d. 33.5
Answer: b. 43.5
n = 8 Therefore, the median equals the average of the two middle values. Median = (43 + 44)/2 = 43.5 or any number between 43 and 44.
76.
The life in months of service before failure of the color television picture tubes in 8 television sets manufactured by Firm A and 8 sets manufactured by Firm B are as follows (arranged according to size):
Firm A: 25 29 31 32 35 37 39 40 Firm B: 34 36 41 43 44 45 47 48
Let ETA(A) and ETA(B) denote the median service life of picture tubes produced by the two firms.
The S-interval with confidence coefficient .71 for ETA(A) is bounded by:
a. 29 and 39 b. 36 and 47 c. 31 and 37 d. 41 and 45
Answer: c. 31 and 37
GAMMA = .71 n = 8
From the Table of d-factors for Sign Test and Confidence Intervals for the median, d = 3. The confidence interval is bounded by the d-th smallest and d-th largest sample observations. Thus, the S-interval about the median is bounded by the third smallest and third largest sample observations, or 31 and 37.
77.
The life in months of service before failure of the color television picture tube in 8 television sets manufactured by Firm A and 8 sets manufactured by Firm B are as follows (arranged according to size):
Firm A: 25 29 31 32 35 37 39 40 Firm B: 34 36 41 43 44 45 47 48
Let ETA(A) and ETA(B) denote the median service life of picture tubes produced by the two firms.
The W-interval with confidence coefficient .98 for ETA(A) is bounded by:
a. 29 and 39 b. 36 and 47 c. 35 and 47.5 d. 27 and 39.5
Answer: d. 27 and 39.5
n = 8 Using a table of critical values for the W-interval with ALPHA=.02, d=2, the table of averages (only the smallest and largest entries are filled in):
     |  25    29    31    32    35    37    39    40
-----+------------------------------------------------
25   |  25   [27]   28
29   |        29    30
31   |              31
32   |
35   |
37   |                                37    38    38.5
39   |                                      39   [39.5]
40   |                                            40
W-interval is 27 and 39.5.
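The bounds can also be found by listing every pairwise average (each observation paired with itself and with every other observation) and picking the d-th smallest and d-th largest. This is a sketch; d = 2 is taken from the table lookup above rather than computed, and the function name is illustrative.

```python
def w_interval(sample, d):
    """Return the d-th smallest and d-th largest of all pairwise averages."""
    xs = sorted(sample)
    averages = sorted((xs[i] + xs[j]) / 2
                      for i in range(len(xs))
                      for j in range(i, len(xs)))  # j >= i includes each value with itself
    return averages[d - 1], averages[-d]

firm_a = [25, 29, 31, 32, 35, 37, 39, 40]
w_bounds = w_interval(firm_a, d=2)
print(w_bounds)  # (27.0, 39.5)
```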
78.
Explain which measure of central tendency is most useful when reporting an average income for persons employed by Beech Aircraft.
Answer: The median would be the most useful measure of central tendency when reporting an average income. The distribution of income is positively skewed since there are relatively few people who earn a substantially high income. These extreme values would affect the mean by inflating it. The median, which simply indicates the point where half the observations are above and half are below, would not be affected by such extreme values and in this sense would more accurately convey the "average" income.
79.
A college athlete, equally talented in baseball and football, compares the income potential in the two sports before choosing to specialize in one of them. The data available for annual income from all sources is below:
                    Mean      Median    90th percentile
Football players:   $25,000   $20,000   $70,000
Baseball players:   $23,000   $28,000   $50,000
a. Give a one-sentence interpretation of the mean which indicates how it can be used to help him to decide between the two sports. b. Do the same for the median and the same for the 90th percentile. c. Based on the data above, which sport would you suggest he choose? Indicate why.
Answer: a. The mean is the average dollar income (or expected income) and indicates football is the better choice, though the difference between the two means is not great.
b. The median is the midpoint in terms of rankings and is substantially higher for baseball. The 90th percentile is the dollar value that exceeds 90 percent of the salaries and indicates football is the best choice.
c. The distribution of salaries in football is skewed right, indicating higher potential salary extremes, a riskier but higher-payoff choice. A risk-averse athlete might choose baseball, whose left-skewed salaries suggest higher "typical" salaries.
80.
The life in months of service before failure of the color television picture tube in 8 television sets manufactured by Firm A and 8 sets manufactured by Firm B are as follows (arranged according to size):
Firm A: 25 29 31 32 35 37 39 40 Firm B: 34 36 41 43 44 45 47 48
Let ETA(A) and ETA(B) denote the median service life of picture tubes produced by the two firms.
You want to test the hypothesis ETA(A) = 38 against the alternative ETA(A) < 38. The correct sign test statistic and its value is:
a. S(+) = 2 b. S(-) = 2 c. S(+) = 3 d. S(-) = 3
Answer: a. S(+) = 2
Since we have H(A): ETA(A) < 38, we expect fewer observations to be larger than the median, and the correct test statistic is S(+). Its value is:
S(+) = # observations > 38 = 2.
81.
A student organization surveyed food prices at 4 local food stores:
                                                      Stores
Item                          Weight/volume           A      B      C      D
Apples                        per lb                  .30    .30    .33    .45
Lettuce                       one head                .39    .25    .25    .39
Milk, homogenized             1/2 gal container       .84    .76    .81    .76
Eggs: fresh, grade A, large   1 doz                   .89    .83    .69    .93
Hamburger                     per lb                  1.29   .99    .99    1.09
Frying chicken                cut up, per lb          .65    .46    .59    .69
Chicken noodle soup           10 3/4 oz can           .22    .19    .22    .19
White bread                   1 lb loaf               .48    .59    .48    .33
Raviolios with meat sauce     15 oz                   .45    .41    .43    .35
Soda                          qt bottle               .38    .40    .37    .39
Coffee                        4 oz                    1.39   1.31   1.29   1.23
Peanut butter                 28 oz jar               1.19   1.16   1.17   1.09
Laundry soap                  3 lb 1 oz               .89    .85    .81    .80
You may want to compare prices at stores C and D. An appropriate two-sample test can be based on either:
a. the sign test or the median test. b. the Wilcoxon one- or two-sample test. c. the sign test or Wilcoxon one-sample test. d. the median test or Wilcoxon two-sample test.
Answer: c. the sign test or Wilcoxon one-sample test.
The prices are paired by item, so the differences (C - D) are analyzed with a one-sample procedure.
82.
The observed life, in months of service, before failure for the color television picture tube in 8 television sets manufactured by Firm B are as follows (arranged according to size):
Firm B: 34, 36, 41, 43, 44, 45, 47, 48
Let ETA(B) denote the median service life of picture tubes produced by the firm and assume the lifetimes have symmetric distributions. You want to test the hypothesis ETA(B) = 38.5 against the alternative ETA(B) =/= 38.5 using the Wilcoxon signed rank test. From the following list, select the most reasonable test statistic:
(a) W(+) = 2 (b) W(+) = 5 (c) W(-) = 5 (d) W(-) = 2
Answer: (c) W(-) = 5
X(i)    D(i)    |D(i)|    Rank
 34     -4.5      4.5      3.5
 36     -2.5      2.5      1.5
 41      2.5      2.5      1.5
 43      4.5      4.5      3.5
 44      5.5      5.5      5
 45      6.5      6.5      6
 47      8.5      8.5      7
 48      9.5      9.5      8
W(-) = SUM(R(-)) = 3.5 + 1.5 = 5
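As a check on the rank table, the following sketch recomputes the signed ranks with tied absolute differences given their average rank. The helper `midranks` is a hypothetical name introduced here for illustration only.

```python
def midranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

firm_b = [34, 36, 41, 43, 44, 45, 47, 48]
eta0 = 38.5  # hypothesized median

diffs = [x - eta0 for x in firm_b]
ranks = midranks([abs(d) for d in diffs])
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
print(w_minus, w_plus)
```

The two sums add to 8*9/2 = 36, which is a quick consistency check on any signed-rank table with no zero differences.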
83.
Ten randomly selected cars of a specific year, make, and model and with similar equipment, are subjected to an EPA gasoline mileage test. The resulting miles/gallon are:
24.6, 30.0, 28.2, 27.4, 26.8, 23.9, 22.2, 26.4, 32.6, 28.8
Using the Wilcoxon Median Test, test the hypothesis that the population median is 30 miles/gallon at the ALPHA = .10 level. Construct a 90% confidence interval for the median.
Answer:
Measurement    D(i)    |D(i)|    Rank
   24.6        -5.4      5.4      7
   30.0         0        0        --
   28.2        -1.8      1.8      2
   27.4        -2.6      2.6      3.5
   26.8        -3.2      3.2      5
   23.9        -6.1      6.1      8
   22.2        -7.8      7.8      9
   26.4        -3.6      3.6      6
   32.6         2.6      2.6      3.5
   28.8        -1.2      1.2      1
R(+) = 3.5 --> T = 3.5     R(-) = 41.5
Lower w = 9     Upper w = (9*10)/2 - 9 = 36
Since (T=3.5) < 9, we reject H(0): median = 30.
For the confidence interval, we need the 11th largest and smallest values, to be obtained from the following table:
Table of averages of pairs of observations (entries not needed for the count are left blank; bracketed entries are the interval endpoints):
      |    32.6    30.0    28.8    28.2    27.4    26.8    26.4    24.6    23.9    22.2
 32.6 |    32.6    31.3    30.7    30.4    30.0    29.7    29.5    28.6   28.25    27.4
 30.0 |            30.0    29.4    29.1    28.7    28.4
 28.8 |                  [28.8]    28.5    28.1    27.8
 28.2 |                                                                   26.05  [25.2]
 27.4 |                                                            26.0   25.65    24.8
 26.8 |                                                            25.7   25.35    24.5
 26.4 |                                                    26.4    25.5   25.15    24.3
 24.6 |                                                            24.6   24.25    23.4
 23.9 |                                                                    23.9   23.05
 22.2 |                                                                            22.2
Therefore, 90% C.I.: from 25.2 to 28.8.
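The interval endpoints can be reproduced by listing every pairwise average (each observation is also paired with itself) and reading off the 11th smallest and 11th largest. This is a sketch only: the position k = 11 is taken from the worked answer and is not derived here.

```python
from itertools import combinations_with_replacement

mpg = [24.6, 30.0, 28.2, 27.4, 26.8, 23.9, 22.2, 26.4, 32.6, 28.8]

# All averages (x_i + x_j)/2 with i <= j, sorted ascending.
averages = sorted((a + b) / 2 for a, b in combinations_with_replacement(mpg, 2))

k = 11  # position used in the worked answer (assumed, not derived)
lower, upper = averages[k - 1], averages[-k]
print(len(averages), round(lower, 2), round(upper, 2))
```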
84.
The life in months of service before failure of the color television picture tube in 8 television sets manufactured by Firm A and 8 sets manufactured by Firm B are as follows (arranged according to size):
Firm A: 25, 29, 31, 32, 35, 37, 39, 40 Firm B: 34, 36, 41, 43, 44, 45, 47, 48
Against the two-sided alternative, the Wilcoxon (Mann-Whitney) two-sample test has descriptive level:
(a) .050 (b) .010 (c) .007 (d) .004
Answer: (c) .007
U(A) = 0 + 0 + 0 + 0 + 1 + 2 + 2 + 2 = 7
U(B) = 64 - 7 = 57
Two-sided descriptive level = 2*P(U(A) <= 7) = 2(.0035) = .007
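The descriptive level can be verified by brute force. With no ties, every choice of 8 of the 16 combined ranks for Firm A is equally likely under H(0), so the sketch below enumerates all of them. It is an illustrative check, not the table lookup the original answer relies on.

```python
from itertools import combinations
from math import comb

firm_a = [25, 29, 31, 32, 35, 37, 39, 40]
firm_b = [34, 36, 41, 43, 44, 45, 47, 48]
n, m = len(firm_a), len(firm_b)

# U(A): for each A observation, count the B observations below it.
u_a = sum(b < a for a in firm_a for b in firm_b)
u_b = n * m - u_a

# Exact null distribution: U = (rank sum of the A's) - n(n + 1)/2.
offset = n * (n + 1) // 2
count = sum(1 for ranks in combinations(range(1, n + m + 1), n)
            if sum(ranks) - offset <= u_a)
one_sided = count / comb(n + m, n)
two_sided = 2 * one_sided
print(u_a, u_b, round(one_sided, 4), round(two_sided, 3))
```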
85.
The life in months of service before failure of the color television picture tube in 8 television sets manufactured by Firm A and 8 sets manufactured by Firm B are as follows (arranged according to size):
Firm A: 25, 29, 31, 32, 35, 37, 39, 40 Firm B: 34, 36, 41, 43, 44, 45, 47, 48
Suppose the data are ranked as one combined set. The sum of the ranks R(B) for the B-observations equals:
(a) 36 (b) 43 (c) 57 (d) 93
Answer: (d) 93
Table of Ranks:
Firm A: 1 2 3 4 6 8 9 10
Firm B: 5 7 11 12 13 14 15 16
SUM(R(B)) = 5 + 7 + 11 + 12 + 13 + 14 + 15 + 16 = 93
86.
An expert gave the following subjective ratings of the driving abilities of two groups of subjects. Test the hypothesis that, according to the expert's ratings, women are better drivers than men. (Use a nonparametric test with ALPHA = .05.) (NOTE: higher scores indicate better drivers.)
Expert's Ratings
Male:   7, 4, 2, 3, 12, 1, 14, 10, 10
Female: 6, 13, 12, 10, 14, 7, 3, 11
Answer: H(O): Female drivers are worse than or as good as male drivers
H(A): Female drivers are better than male drivers
Using the Mann-Whitney-Wilcoxon test:
U(M) = 2.5 + 1 + 0 + .5 + 5.5 + 0 + 7.5 + 3.5 + 3.5 = 24
U(critical, one-tail, ALPHA=.05, 9, 8) = 19
Since U(M) > U(critical), continue H(O). Therefore the sample evidence was not strong enough to indicate that females are better drivers than males.
Using Wilcoxon version of test:
Ranks:
Rating  |  1    2    3      4    6    7      10       11    12     13    14
M or F  |  M    M    M,F    M    F    M,F    M,M,F    F     M,F    F     M,F
Rank    |  1    2    3.5    5    6    7.5    10       12    13.5   15    16.5
Sum of ranks for females = 3.5 + 6 + 7.5 + 10 + 12 + 13.5 + 15 + 16.5 = 84 = T(F)
Sum of ranks for males = [(17)(18)/2] - 84 = 69 = T(M)
Converting to U statistic:
U(M) = T(M) - [.5 * n(M) * (n(M) + 1)] = 69 - [.5 * 9 * 10] = 24
Conclusion reached is the same as above.
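The agreement between the two versions of the test can be checked numerically. The sketch below counts pairs directly (a tie counts one half) and converts each U to a rank sum; the function name `u_stat` is hypothetical.

```python
male = [7, 4, 2, 3, 12, 1, 14, 10, 10]
female = [6, 13, 12, 10, 14, 7, 3, 11]

def u_stat(xs, ys):
    """For each x, count the y's below it; a tie counts one half."""
    return sum(1.0 if y < x else 0.5 if y == x else 0.0
               for x in xs for y in ys)

u_m = u_stat(male, female)
u_f = u_stat(female, male)

# Rank sum = U + n(n + 1)/2 for the group of size n.
t_m = u_m + len(male) * (len(male) + 1) / 2
t_f = u_f + len(female) * (len(female) + 1) / 2
print(u_m, u_f, t_m, t_f)
```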
87.
A large consulting firm hires a west coast university to provide an MBA program for its employees. The basic statistics course is taught at two locations of the firm. After completion of the course, standardized tests are given to the participating employees at each location. Assume the distributions of test scores are symmetric for both groups. The results are shown below.
Observation   Location A   Location B
     1            65           60
     2            74           72
     3            77           66
     4            82           75
     5            70           78
     6            78           65
     7            84
Test the hypothesis at ALPHA = .10 that the two samples came from the same population, or equivalently that the populations have the same median scores.
Answer: H(O): The two samples came from the same population. (median(1) = median(2))
H(A): The two samples came from different populations. (median(1) =/= median(2))
Using the Wilcoxon-Mann-Whitney test statistic we find:
   A    Rank        B    Rank
  84    13         78    10.5
  82    12         75     8
  78    10.5       72     6
  77     9         66     4
  74     7         65     2.5
  70     5         60     1
  65     2.5
Rank Sums: 59.0          32.0
Smaller Rank Sum = 32 (Note that some books give critical values for this sum.)
T = 32 - (6 * 7)/2 = 11 (Transforming to the Mann-Whitney U Statistic.)
Critical Values: lower U = 9     upper U = (6 * 7) - 9 = 33
Since 9 <= (T=11) < 33, do not reject H(O). We do not have sufficient evidence to claim a difference between the two populations.
88.
New employees of the ABC corporation are given a training program to acquaint them with business procedures and principles. Two groups of ten each are selected randomly from a large set of new employees. The first group is trained using Method A, and the second group is trained using Method B. At the end of the training period, each group is given the same test to determine how much information has been assimilated. The data are:
Method A   Method B
   55         50
   70         91
   70         90
   65         62
   62         75
   81         88
   72         84
   58         78
   67         82
   50         80
Use ALPHA = .05 to test that the two training methods result in the same amount of assimilated information.
Answer:
X(i)    Rank
 50      1.5
 50*     1.5*
 55*     3*
 58*     4*
 62*     5.5*
 62      5.5
 65*     7*
 67*     8*
 70*     9.5*
 70*     9.5*
 72*    11*
 75     12
 78     13
 80     14
 81*    15*
 82     16
 84     17
 88     18
 90     19
 91     20
(* indicates Method A)
S = SUM(R(X(i))) = 1.5 + 3 + 4 + 5.5 + 7 + 8 + 9.5 + 9.5 + 11 + 15 = 74
T = S - n*(n + 1)/2 = 74 - 10*11/2 = 19
lower w = 24     upper w = 10*10 - 24 = 76
T < 24, reject H(0); conclude methods result in different scores.
89.
Eight names are selected at random from the subscriber list of Magazine A, and eight additional names from the list of Magazine B. The ages of the subscribers are determined and listed below (fictitious data):
A: 18, 24, 35, 19, 20, 20, 40, 17 B: 20, 30, 45, 38, 42, 34, 50, 22
a. Using appropriate 5-year age groups, prepare a stem & leaf plot OR a histogram for each group.
b. Is there evidence (at the 5% level of significance) that the two magazines appeal to different age groups?
c. Give two possible reasons for choosing the test you chose for part b instead of some other test.
Answer: a. STEM & LEAF PLOT:
Age group         Magazines
 (stem)      |    A    |    B
    1        |   897   |
    2        |   400   |   02
    2        |         |
    3        |         |   04
    3        |    5    |   8
    4        |    0    |   2
    4        |         |   5
    5        |         |   0
HISTOGRAMS:
[Histogram, Magazine A: frequency (vertical axis, 1 to 5) against age (horizontal axis, 5 to 55 in steps of 5)]
[Histogram, Magazine B: frequency (vertical axis, 1 to 5) against age (horizontal axis, 5 to 55 in steps of 5)]
b. Using the Mann-Whitney U Test:
U(A) = 0 + 2 + 4 + 0 + .5 + .5 + 5 + 0 = 12
Using a table of critical values for this test at 5% level of significance, U(critical) = 14
Since U(observed) < U(critical), there is evidence that the two magazines appeal to different age groups.
c. 1) Normality cannot be assumed. 2) The sample sizes are not large enough to avoid necessity for normality.
90.
Suppose you run a warehouse that stocks replacement parts for appliances. You randomly sample orders for parts for replacement burner units for two brands (A and B) of electric stoves. Over a period of 50 weeks you observe the following:
Weekly demand       Number of weeks
 for burners      Brand A     Brand B
      0              28          22
      1              15          21
      2               6           7
      3               1           0
   over 3             0           0
Use the Kolmogorov-Smirnov test to determine if these two distributions are the same.
Answer:
F(A)     F(B)      D
 .56      .44     .12
 .86      .86      0
 .98     1.00     .02
1.00     1.00      0
1.00     1.00      0
P(D(50) >= .12) > .2
Do not reject H(O) at ALPHA = .10, .05 or .01.
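The cumulative proportions and the maximum difference D can be recomputed from the weekly counts as follows. This is a minimal sketch; the helper name `cumulative` is illustrative, and the p-value still comes from a Kolmogorov-Smirnov table.

```python
from itertools import accumulate

# Weeks observed at demand 0, 1, 2, 3, and over 3 burners.
weeks_a = [28, 15, 6, 1, 0]
weeks_b = [22, 21, 7, 0, 0]

def cumulative(counts):
    """Cumulative relative frequencies of a list of counts."""
    total = sum(counts)
    return [c / total for c in accumulate(counts)]

f_a = cumulative(weeks_a)
f_b = cumulative(weeks_b)
d = max(abs(a - b) for a, b in zip(f_a, f_b))
print([round(x, 2) for x in f_a], [round(x, 2) for x in f_b], round(d, 2))
```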
91.
Suppose we examine a random sample of subjects to see whether there is a preference for certain colors. The colors selected are green, blue, brown, yellow, and black. Theory suggests that people may prefer those colors that are most commonly found in nature, so the colors have been ranked from least commonly found in nature (black) to most commonly found (green). We have recorded the number of people (n = 50) who rated each color as their favorite. Our null hypothesis is one of no special preference. Perform a Kolmogorov-Smirnov test to see whether the observed preferences fit our expected uniform distribution.
color     expected   observed
black        10          0
yellow       10          5
brown        10          0
blue         10         25
green        10         20
Answer:
S(n)(X)   F(0)(X)    D
   0        .2       .2
  .1        .4       .3
  .1        .6       .5
  .6        .8       .2
 1.0       1.0        0
Prob.(D(n=50) >= .5) < .01
Reject H(O), that all colors are uniformly preferred, at ALPHA = .10, .05 and .01.
92.
You wish to compare four methods of displaying apples sold in supermarkets. The question to be answered is:
Does one of these methods (A,B,C, or D) provide greater daily apple sales than another?
In order to evaluate methods, a single display is used in a store for a full day. Displays cannot be changed during the working day but can be changed between days. A store owner agrees to let you set up and measure sales on Monday, Tuesday, Wednesday, and Thursday for each of 4 consecutive weeks during October and November (i.e. a total of 16 selling days). Prior to this test period you obtain the following information on apple sales where the same display has been used each day:
Units of Apples Sold
               Monday   Tuesday   Wednesday   Thursday
First Week       100      105         70         100
Second Week      125      120         85         120
Design an experiment to compare A,B,C, and D. Explain your choice of design.
Answer: The information available on apple sales using the same display (uniform treatment) indicates one should not assume uniform results, since both 1. day of week (Wednesday is noticeably worse) and 2. week (the second week had more sales) appear to affect the response. Therefore, a Latin Square Design is appropriate to restrict randomization so that each display is tested an equal number of times on each day of the week and an equal number of times during each week. In this way, each display is equally exposed to effects of week and day.
The following assignment of treatments (from the program LSQPLN***) is one possible assignment since each display occurs on each day of the week as well as once during each week.
            Day of Week
Week     1    2    3    4
  1      C    D    B    A
  2      D    C    A    B
  3      A    B    D    C
  4      B    A    C    D
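The defining property of the design (each display once per week and once per day of the week) can be checked mechanically. The sketch below verifies the assignment shown above; it is not a reconstruction of the LSQPLN program, and the function name `is_latin_square` is hypothetical.

```python
# Rows are weeks 1-4; columns are days 1-4 (Monday through Thursday).
square = ["CDBA",
          "DCAB",
          "ABDC",
          "BACD"]

def is_latin_square(rows):
    """True if every symbol occurs exactly once in each row and column."""
    symbols = set(rows[0])
    size = len(rows)
    if len(symbols) != size:
        return False
    rows_ok = all(len(r) == size and set(r) == symbols for r in rows)
    cols_ok = all(set(col) == symbols for col in zip(*rows))
    return rows_ok and cols_ok

print(is_latin_square(square))
```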
93.
Suppose that you have been instructed to test 2 chemicals that are said to be mosquito repellants. You are to compare these 4 treatment combinations:
          Amount of Chemical A     Amount of Chemical B
T(11)              0                        0
T(12)              0                        +
T(21)              +                        0
T(22)              +                        +
+ indicates the recommended rate of application for each chemical. An experimental unit consists of an arm of a subject. Each subject provides 2 arms that are considered comparable, but subjects may differ in mosquito appeal (i.e., experimental units are homogeneous within incomplete blocks of size 2).
[Diagram: Person 1 and Person 2, each contributing Arm 1 and Arm 2 as experimental units]
A) How would you group treatments so that main effect differences between rates for chemical A would be measured precisely, while main effect differences between rates for B are measured less precisely? (Indicate clearly which treatments must be applied to the same person).
B) How would you group treatments so that A and B main effects are measured most precisely while AB interaction is estimated less precisely?
Answer: A) To obtain high precision for A at the expense of low precision for B, let B be the basis for grouping (B is main plot factor) ==> Group 1: T(11), T(21); Group 2: T(12), T(22). Both group members are assigned to the same person.
B) Let AB be the basis for grouping, i.e., group members will consist of treatments receiving (+) signs for AB interaction, or those receiving (-) signs.
        T(11)   T(12)   T(21)   T(22)
A         +       +       -       -
B         +       -       +       -
AB        +       -       -       +
==> Group 1: T(11) and T(22)
    Group 2: T(12) and T(21)
94.
An investigator wished to study the effect of an operator on the performance of a machine. He could arrange to have each of four operators run the machine five times. A response measurement could be recorded each time the machine was used. How many experimental units will he have if
a. He randomly selects an operator, has him run the machine five times, then selects another operator, etc.?
b. He identifies 20 turns for running the machine and randomly assigns operators to turns subject to the requirement that each operator perform five times?
c. He forms five groups of four turns and randomly and independently assigns operators within each group of four?
Answer: a. 4: an experimental unit is a set of 5 turns or times of running the machine.
b. 20: an experimental unit is a turn.
c. 20: an experimental unit is a turn, even though turns have been arranged in groups.
95.
An investigator suspected that the time required to pour a mold in a foundry was longer after lunch than before lunch. He proposed comparing these two conditions or treatments by measuring times needed to pour a mold.
Which of the following schemes meets the requirement of independence of response among experimental units? Why or why not?
a. The foundry was visited one day and four times were recorded before lunch and four times after lunch.
b. The foundry was visited on four days. On each day one time was recorded before lunch and one after lunch.
Answer: Scheme b comes close to meeting the requirement of independence of response among experimental units. With this scheme other factors that may affect time to pour a mold, either before or after lunch such as: the weather conditions or the handing out of paychecks during lunch etc. would be balanced, and, therefore, the responses would not all be tied together with that common influencing factor.
96.
A company is interested in adopting a new type of machine. Since it is an expensive model they are not willing to adopt it unless they are fairly positive it will decrease the production time per unit. If MU(S) is the mean production time per unit under the standard machine and MU(n) is the mean production time per unit under the new machine, the appropriate pair of hypotheses to test is:
(a) H(O): MU(S) = MU(n) vs. H(A): MU(S) < MU(n)
(b) H(O): MU(S) >= MU(n) vs. H(A): MU(S) < MU(n)
(c) H(O): MU(S) = MU(n) vs. H(A): MU(S) =/= MU(n)
(d) H(O): MU(S) = MU(n) vs. H(A): MU(S) > MU(n)
Answer: (d) H(O): MU(S) = MU(n) vs. H(A): MU(S) > MU(n)
The hypothesis we do not wish to reject unduly is MU(S) = MU(n). This we call H(O). The alternative we wish to investigate and not accept unduly is MU(S) > MU(n).
97.
A home owner claims that the current market value of his house is at least $40,000. Sixty real estate agents were asked independently to estimate the house's value. The hypothesis test that followed ended with a decision of "reject H(O)". Which of the following statements accurately states the conclusion?
a) The home owner is right, the house is worth $40,000. b) The home owner is right, the house is worth less than $40,000. c) The home owner is wrong, the house is worth less than $40,000. d) The home owner is wrong, the house is worth more than $40,000. e) The home owner is wrong, he should not sell his home.
Answer: c) The home owner is wrong, the house is worth less than $40,000.
98.
In an experiment to determine the effect of a utilization review (UR) procedure on the length of hospitalization, patients were paired by age and sex and one member of each pair was randomly assigned to a ward that had UR, the other to a regular ward. The results of the 30 pairs are summarized below. The hospital wants to know at the 5% significance level if the length of hospitalization is different for those experiencing a utilization review.
                       Regular Ward   UR Ward   Paired Difference
Mean Length of Stay        5.64         4.29          1.35
S.D.                       2.75         2.41          3.41
a. Set up the appropriate confidence interval to evaluate this experiment.
b. How can the hospital use the confidence interval to make a decision about the effectiveness of the procedure?
Answer: a. C.I. = DBAR +/- [t*S(D)/SQRT(n)]
= 1.35 +/- [2.045*3.41/SQRT(30)]
= 1.35 +/- 1.27
.08 <= MU(D) <= 2.62
b. Because MU = 0 is not included in this interval, we know the null hypothesis would be rejected in a significance test at the 5% level. Therefore, we can conclude that the procedure is effective.
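The interval and the decision rule in part b can be sketched as below. The critical value 2.045 (t with 29 df, two-sided 5%) is copied from the answer rather than computed, since the Python standard library has no t distribution.

```python
from math import sqrt

d_bar, s_d, n = 1.35, 3.41, 30   # summary statistics from the problem
T_CRIT = 2.045                   # tabled t value, 29 df, taken from the answer

half_width = T_CRIT * s_d / sqrt(n)
lower, upper = d_bar - half_width, d_bar + half_width

# A two-sided test at the 5% level rejects MU(D) = 0 exactly when
# 0 lies outside the 95% confidence interval.
reject_h0 = not (lower <= 0 <= upper)
print(round(lower, 2), round(upper, 2), reject_h0)
```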
99.
A shirt manufacturer is considering the purchase of new sewing machines. If MU(1) is the average number of shirts made per hour by his old machines and MU(2) is the corresponding average number of shirts per hour for the new machine, he wants to test the null hypothesis MU(1) = MU(2) against a suitable alternative.
a. What alternative hypothesis should he use if he does not want to buy the new machine unless it is proven superior?
b. What alternative hypothesis should the manufacturer use if he wants to buy the new machine (which has nice features) unless the old machines are actually superior?
Answer: a. MU(2) > MU(1)
b. MU(1) > MU(2)
100.
A manufacturer who produces auto tires wished to compare the wearing qualities of two types of tires, A and B. To make the comparison, a tire of Type A and one of Type B were randomly assigned and mounted on the rear wheels of each of 5 automobiles. The automobiles were then operated for a specific number of miles and the amount of wear was then recorded for each tire.
Auto      A       B
  1     10.6    10.2
  2      9.8     9.4
  3     12.3    11.8
  4      9.7     9.1
  5      8.8     8.3
Test H(0): MU(A) = MU(B) against H(A): MU(A) =/= MU(B) with ALPHA = .05:
a) using two-sample test. b) using paired-sample test.
What are your conclusions? Explain.
Answer: a) XBARA = 51.2/5 = 10.24
XBARB = 48.8/5 = 9.76
S(A)**2 = (SUM(X(A)**2) - ((SUM(X(A)))**2)/n)/(n - 1) = (531.22 - (51.2**2)/5)/4 = 1.73
Similarly: S(B)**2 = (483.34 - (48.8**2)/5)/4 = 1.76
(NOTE: The following is the same as pooling since the sample sizes are equal. However, the proper df = 5 + 5 - 2 = 8.)
S(XBARA - XBARB) = SQRT((S(A)**2)/n(A) + (S(B)**2)/n(B)) = SQRT((1.73/5) + (1.76/5)) = .836
Critical values of: (XBARA - XBARB) = (MU(A) - MU(B)) +/- t(crit)*S(XBARA - XBARB) = 0 +/- (2.31*.836) = +/- 1.931
DM(CALC) = XBARA - XBARB = 10.24 - 9.76 = .48
Since .48 is neither less than -1.931 nor more than 1.931, we cannot reject H(O).
b)
   A       B       D     D - DBAR = d     d**2
 10.6    10.2     .4        -.08         .0064
  9.8     9.4     .4        -.08         .0064
 12.3    11.8     .5         .02         .0004
  9.7     9.1     .6         .12         .0144
  8.8     8.3     .5         .02         .0004
                 ----                    -----
                 2.4                     .0280
DBAR = 2.4/5 = .48
S(D) = SQRT(SUM(d**2)/(n - 1)) = SQRT(.0280/4) = .0837
S(DBAR) = S(D)/SQRT(n) = .0837/SQRT(5) = .0374
t(calc) = DBAR/S(DBAR) = .48/.0374 = 12.83
t(crit) = 2.776 for n - 1 = 4 df
Since 12.83 > 2.776, we reject H(O) and conclude that the means are different.
H(O) was rejected in (b) but not in (a) since the test for related samples is stronger than that for independent samples.
101.
We are interested in the wearing capabilities of tires. We obtain Goodday and Goodpoor Tires and 9 racing cars (and also the track used for the Indianapolis 500 Race). We put Goodday on the left-hand side of the car (front and rear) and Goodpoor on the right-hand side of the car (front and rear). We then allow the cars to complete the 500 miles at a (relatively) safe speed and then measure the wear (in millimeters) per tire.
Car No.   Goodday   Goodpoor
  77         17        16
  82         18        19
  92         17        12
  41         16        13
  17         15        14
  22         14        12
  18         10        10
  23         18        15
  43         17        13
a. All the advertising literature claims equality between Goodday and Goodpoor. Can you present evidence to disprove this claim? Use a significance level of 5%.
b. Comment on the validity of this experimental setup.
Answer: a. Let d be the difference in wear between tires on the left-hand side compared to tires on the right-hand side. We are interested in testing the hypothesis that the mean (dBAR) of such difference scores is zero.
H(0): MU(dBAR) = 0
H(1): MU(dBAR) =/= 0
The problem is a paired experimental setup, and therefore we perform a t-test on the differences.
Car No.    GD    GP    d(i)    d(i)**2
  77       17    16      1        1
  82       18    19     -1        1
  92       17    12      5       25
  41       16    13      3        9
  17       15    14      1        1
  22       14    12      2        4
  18       10    10      0        0
  23       18    15      3        9
  43       17    13      4       16
                       ----     ----
           SUM          18       66
dBAR = [SUM(d(i))]/[9] = 18/9 = 2
S(d)**2 = [SUM([d(i) - dBAR]**2)]/[n - 1] = [SUM(d(i)**2) - n*(dBAR**2)]/[n - 1] = [66 - 9*4]/[8] = 3.75
t(calc.) = [dBAR - 0]/[SQRT([S(d)**2]/n)] = [2 - 0]/[SQRT(3.75/9)] = 3.098
t(critical, .05, two-tailed, 8 df) = 2.306
Since t(calculated) > t(critical), reject H(0). Therefore we can claim on the basis of this test that the tires are not equal.
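The paired t computation can be reproduced from the raw wear figures. In this sketch the critical value 2.306 is copied from the answer (8 df, two-tailed 5%) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

goodday = [17, 18, 17, 16, 15, 14, 10, 18, 17]
goodpoor = [16, 19, 12, 13, 14, 12, 10, 15, 13]

d = [a - b for a, b in zip(goodday, goodpoor)]
d_bar = mean(d)
s_d = stdev(d)                      # sample standard deviation (n - 1 divisor)
t_calc = d_bar / (s_d / sqrt(len(d)))

T_CRIT = 2.306                      # tabled value taken from the answer
reject_h0 = abs(t_calc) > T_CRIT
print(sum(d), round(s_d ** 2, 2), round(t_calc, 3), reject_h0)
```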
b. The Indianapolis race track has an oval shape with highly banked curves. Since the cars travel in only one direction, only the inner tires would wear appreciably. There are many other drawbacks to the design, but this one is catastrophic.
102.
A random sample of 625 boxes taken from the output of a box making machine was inspected for flaws. It was found that 500 of the boxes were free from flaws. To three decimals, what is the upper limit of the 0.99 confidence interval estimate of the proportion of acceptable boxes being produced?
a. .8 + 1.96*SQRT(.16/625) b. .8 + 2.576*(.16/625) c. .8 + 1.96*(.16/625) d. .8 + 2.576*SQRT(.16/625)
Answer: d. .8 + 2.576*SQRT(.16/625)
C.I. = p +/- Z(ALPHA/2)*SQRT(pq/n)
p = 500/625 = .8, q = 125/625 = .2
C.I. = .8 +/- Z(.005)*SQRT(.8*.2/625)
Upper limit = .8 + 2.576*SQRT(.16/625)
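To carry the upper limit through to three decimals, a short sketch using the standard library's normal distribution follows. `NormalDist.inv_cdf` supplies the z value that the answer takes from a table.

```python
from math import sqrt
from statistics import NormalDist

n, good = 625, 500
p = good / n                                 # sample proportion of acceptable boxes
z = NormalDist().inv_cdf(1 - 0.01 / 2)       # two-sided 99% critical value

upper = p + z * sqrt(p * (1 - p) / n)
print(round(z, 3), round(upper, 3))
```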
103.
Suppose that I'm interested in a herd of Unicorns where mean weight is unknown, but the population variance is known to be 100. The Director of Unicorns has given me $1200 for a year and directed me to submit monthly reports on the average weight of this herd. It costs $5 per unicorn to weigh a beast and the Director will let me have whatever money is left over after I pay expenses. He won't tolerate a confidence interval greater than +/- 5 and becomes very upset with an interval that doesn't contain the true mean. (He has a special informant who reports intervals that don't contain the true population mean.) What shall I do? Why? Will I make any money? Will I survive the year?
Answer: The steps taken would probably depend upon several factors, including:
(1) What action the Director would take if you failed to produce a confidence interval that contains the mean,
(2) What chance of failure you are willing to risk, and
(3) How you expect the mean weight of the herd to change throughout the year.
If you feel it is necessary to estimate the mean every month with as much confidence as possible and forego making any money, you would use a sample size = 20 and find a confidence interval with a 97.5% confidence level (see calculations below).
n = (1200/12)/5 = 20
Z = 5/(10/SQRT(20)) = 2.24
P(-2.24 < Z < 2.24) = .975
In order to make money, you could either decrease your confidence level, not make a new estimate every month, or ask the director for a raise.
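The trade-off between budget and confidence can be sketched as follows, assuming the whole monthly allowance is spent on weighing. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

budget, months, cost_per_unicorn = 1200, 12, 5
sigma = sqrt(100)        # known population standard deviation
half_width = 5           # widest interval the Director tolerates

n = (budget // months) // cost_per_unicorn   # unicorns weighed per month
z = half_width / (sigma / sqrt(n))
confidence = 2 * NormalDist().cdf(z) - 1     # P(-z < Z < z)
print(n, round(z, 2), round(confidence, 3))
```

Lowering n frees up money but reduces z, and with it the confidence level attached to a +/- 5 interval.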
104.
A random sample of 500 accounts receivable is selected from the 4,032 accounts that a firm has, and the sample mean is found to be $242.30. The sample standard deviation is computed to be $3.20. Set up a .99 confidence interval estimate of the population mean. How do you interpret the meaning of this interval?
Answer: Using t with ALPHA = .01 and df = 499,
C.I. = XBAR +/- (t)(S/SQRT(n)) = 242.30 +/- (2.576)(3.20/SQRT(500)) = 241.93 to 242.67.
99% of the time that this procedure is used to calculate an interval, the resulting interval will contain MU. This interval may or may not include MU.
105.
A survey on consumer finances reports that 33 per cent of a sample of 2,600 spending units expected good times during the next 12 months. Assume that a simple random sample was used in the study. Set up a .95 confidence interval estimate of the population proportion of spending units expecting good times.
Answer: p = .33 n = 2600 Z(ALPHA=.025) = 1.96
Stand. error of proportion = SQRT(pq/n) = SQRT((.33*.67)/2600) = .009
C.I. = .33 +/- (1.96 * .009) = from .312 to .348
106.
In a random sample of 200 television viewers in a certain area, 95 had seen a certain controversial program. Construct a 0.99 confidence interval for the actual percentage of television viewers in that area who saw the program.
Answer: .475 +/- 2.58*SQRT((.475*.525)/200) = .475 +/- .091
107.
Taking a random sample from its very extensive files, a water company finds that the amount owed in 16 delinquent accounts have a mean of $16.35 and a standard deviation of $4.56.
a. Use these values to construct a .98 confidence interval for the average amount owed on all delinquent accounts.
b. If Mr. Blackwater, the company president, claims the delinquent accounts have a population mean of $19.01, how could you quickly respond to him based on part a above (also after explaining that you were using a 2% ALPHA level)?
Answer: a. t(ALPHA=.01, one-tail, df=15) = 2.602
C.I. = XBAR +/- [t*S(XBAR)]
= 16.35 +/- [2.602*(4.56/SQRT(16))]
= from 13.384 to 19.316
b. According to the confidence interval found in part a, Mr. Blackwater's estimate of $19.01 is a possible estimate for the population mean. It should be pointed out that we are 98% confident that such a confidence interval would contain the population mean.
108.
A floor manager of a large department store is studying the buying habits of the store's customers. Suppose he assumes that monthly income of these customers is normally distributed with a standard deviation of 500. If he draws a random sample of size N = 100 and obtains a sample mean of YBAR = 800,
a) Find a .95 confidence interval for the true population mean. b) Do you think that it would be quite unreasonable for the true population mean to be $600? Explain.
Answer: a) C.I. = YBAR +/- Z*SIGMA(YBAR) = 800 +/- 1.96*(500/SQRT(100)) = 800 +/- (1.96*50) = 800 +/- 98
Therefore, 702 < MU < 898
b) Yes. Since 600 lies outside the above confidence interval, we would reject the hypothesis that MU = 600 (at ALPHA = .05).
109.
A cigarette manufacturer tests tobaccos of two different brands of cigarettes for nicotine content and obtains the following results:
Brand A: 4 6 5 2 3 Brand B: 7 8 5 9 6
a. Using ALPHA = .01, would you say that there is a difference in the averages?
b. Set up 99% confidence limits on the difference. Does this answer agree with your answer in part a? Why or why not?
Answer: a. For Brand A:
XBARA = 20/5 = 4
S(A)**2 = (SUM(X**2) - ((SUM(X))**2)/n)/(n - 1) = (90 - (400/5))/4 = 2.5
For Brand B:
XBARB = 35/5 = 7
S(B)**2 = (255 - (1225/5))/4 = 2.5
S(XBARA - XBARB) = SQRT((S(A)**2/n(A)) + (S(B)**2/n(B))) = SQRT((2.5/5) + (2.5/5)) = 1.00
(XBARA - XBARB)(crit) = MU +/- t(crit)*S(XBARA - XBARB) = 0 +/- (3.36)*1.00 = +/- 3.36
(XBARA - XBARB)(calc) = 4 - 7 = -3
Since -3 is neither less than -3.36 nor more than 3.36, we cannot reject H(0).
b. (XBARA - XBARB) - t(crit)*S(XBARA - XBARB) < MU(A) - MU(B) < (XBARA - XBARB) + t(crit)*S(XBARA - XBARB)
-3 - 3.36 < MU(A) - MU(B) < -3 + 3.36
-6.36 < MU(A) - MU(B) < .36
Note that H(O): MU(A) - MU(B) = 0 is included in this interval, and thus we are led to the same conclusion (i.e., continuation of H(O)) as in part (a).
110.
A brewery producing beer has a number of specifications for quality. Among these standards is the requirement that the degree of hop-like flavor should be a value of 8.0.
The production of the brewery consists of a large number of batches. It's possible for differences to arise between batches, so we will regard each batch as a different population. We will consider the hoppiness of each batch as a normally distributed variable with mean and variance unknown.
From each batch you can remove 6 samples for hoppiness. For each batch you are to:
a. set confidence limits for the batch (population) mean, MU;
b. determine if these limits are consistent with the requirement that hoppiness is a value of 8.0.
1. Outline the procedure to be followed in setting confidence limits where the probability of the interval calculated including MU is:
a. 90% b. 99%
2. Apply the procedure outlined to this set of sample values: 13, 11, 9, 14, 8, 11. Is this sample data consistent with the specification of hoppiness = 8.0 when the probability level used is:
a. 90% b. 99%
3. Do these results suggest any weakness in the procedure used? If so, what?
Answer: 1. a. To set 90% confidence limits including MU, we need XBAR, the sample standard deviation, s, the sample size, n, and a t value for ALPHA = .1.
Use the formula: XBAR +/- (t, ALPHA/2)(s/SQRT(n)) with t(ALPHA/2) = 2.015.
b. Use the same procedure as in 1a, but use t(ALPHA/2) = 4.032.
2. XBAR = 11 Standard deviation = 2.28 n = 6
a. 11 +/- (2.015)(2.28/SQRT(6)) = 11 +/- (2.015)(.93) = 11 +/- 1.876 = 9.124 to 12.876
This is inconsistent with the specification of hoppiness = 8.0.
b. 11 +/- (4.032)(2.28/SQRT(6)) = 11 +/- (4.032)(.93) = 11 +/- 3.753 = 7.247 to 14.753
This is consistent with the specification of hoppiness = 8.0.
3. The basic weakness seems to be in using a procedure that produces a confidence interval consistent with a wide range of values, which makes it difficult to detect departures from MU = 8. This situation is exaggerated when the 99% level is used. In addition, the sample size used seems small relative to the variability measured.
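The procedure outlined in parts 1 and 2 can be sketched for both confidence levels at once. The t values are the tabled ones quoted in the answer (5 df); the dictionary `T_TABLE` and the other names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

hops = [13, 11, 9, 14, 8, 11]
SPEC = 8.0
T_TABLE = {0.90: 2.015, 0.99: 4.032}   # tabled t values, 5 df, from the answer

x_bar = mean(hops)
se = stdev(hops) / sqrt(len(hops))     # estimated standard error of the mean

intervals = {level: (x_bar - t * se, x_bar + t * se)
             for level, t in T_TABLE.items()}
consistent = {level: lo <= SPEC <= hi for level, (lo, hi) in intervals.items()}

for level, (lo, hi) in intervals.items():
    print(level, round(lo, 2), round(hi, 2), consistent[level])
```

The same sample is judged inconsistent with the specification at 90% and consistent at 99%, which is the weakness part 3 points to.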
111.
Willy the Waiter claims that the amount in tips that he receives per customer on any given day is normally distributed, but that the average and variability change from day to day (in response to changes in Willy, the weather, the menu, etc.). So far today, Willy has received the following amounts in tips (in dollars):
1, 2, .5, 1, 1.5, 0.
a. Write a model for Willy's tips for today. Define all terms. b. Set 90% confidence limits for the next tip that he will receive.
Answer: a. Y(J) = MU + EPSILON(J) where:
Y(J): tips received from customer J
MU: mean value for tips for the day
EPSILON(J): deviation of customer J's tips from the mean value, assumed to be a value of a normally distributed random variable with a mean of zero and a variance of SIGMA**2.
b. Since the mean and variance of the population have been estimated the variance of the predicted, or future, value involves both the variance of the individuals (S**2) and the variance involved in estimating the mean ((S**2)/n).
S(Predicted value)**2 = S**2 + (S**2)/n = (S**2) * (1 + 1/n)
Finding XBAR = 1 and S = .707, we get:
S(Pred. Value) = SQRT(.5(1 + 1/6)) = SQRT(.583) = .764
Using ALPHA = .10 and df = 5, t = 2.015
C.I. = 1 +/- 2.015(.764) = -.54 to 2.54
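The prediction limits can be recomputed from the six tips. This sketch uses the multiplier 2.015 that appears in the interval calculation (the tabled two-sided 90% t value for 5 df); it is copied from a table, not computed.

```python
from math import sqrt
from statistics import mean, stdev

tips = [1, 2, 0.5, 1, 1.5, 0]
T_CRIT = 2.015                      # tabled t value, 5 df, two-sided 90%

n = len(tips)
x_bar = mean(tips)
s = stdev(tips)

# Variance of a future tip = variance of an individual + variance of the mean.
s_pred = s * sqrt(1 + 1 / n)
lower, upper = x_bar - T_CRIT * s_pred, x_bar + T_CRIT * s_pred
print(round(s, 3), round(s_pred, 3), round(lower, 2), round(upper, 2))
```

The lower limit is negative, so in practice the interval for a tip would be read as running from 0.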
112.
The following triangle test is sometimes used to identify taste experts. In the case of wine tasting, a test subject is presented with three glasses of wine, two of one kind and a third glass of another wine. The test subject is asked to identify the single glass of wine. A test subject who merely guesses has a 1 chance in 3 of identifying the single glass correctly. An expert wine taster should be able to do much better. Let K stand for the number of correct identifications made by a test subject in 10 independent triangle tests.
Assume that a test subject is accorded the title "expert wine taster" if the number K of correct identifications is sufficiently high to reject the hypothesis P = 1/3 at significance level ALPHA = .02
i) A Type II error has the consequence that:
a. an experienced wine taster is accorded the title. b. a person who is guessing is not accorded the title. c. an experienced wine taster is not accorded the title. d. a person who is guessing is accorded the title.
ii) The power of the test when P = .8 equals:
a. .38 b. .61 c. .88 d. .97
Answer: i) c. an experienced wine taster is not accorded the title.
ii) c. .88
Under H(O): P = 1/3
P(X >= 7) = P(7) + P(8) + P(9) + P(10) = .016 + .003 + 0 + 0 == .02
Therefore, X = 7 is the critical value.
Under H(A): P = .8
Power = 1 - BETA, where BETA = P(X < 7), so Power = P(X >= 7) = .201 + .302 + .268 + .107 = .878
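The critical value and the power can also be obtained by summing binomial probabilities directly. The following Python sketch is one way to do that; the function names are illustrative and not from the original answer.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(K = k) for a binomial(n, p) count."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(k, n, p):
    """P(K >= k) for a binomial(n, p) count."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))

n, alpha = 10, 0.02
# Smallest cutoff whose upper-tail probability under guessing (P = 1/3) is at most alpha
critical = min(k for k in range(n + 1) if upper_tail(k, n, 1 / 3) <= alpha)
size = upper_tail(critical, n, 1 / 3)   # actual Type I error rate at that cutoff
power = upper_tail(critical, n, 0.8)    # chance a taster with P = .8 earns the title
print(critical, round(size, 4), round(power, 3))
```

Working with unrounded probabilities gives a power of about .879, which agrees with the .878 obtained above from rounded table values.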
113.
A toaster manufacturer produces two models, A and B. Experience indicates that 3% of the customers buying model A will make a claim on their warranty. In a sample of 400 owners of model B (whose warranties have expired), 16 made a claim on their warranty. The manufacturer wishes to determine if the models differ in the number of claims.
(a) Determine the value of the test statistic. What are your conclusions?
(b) Let PI(B) be the probability that a buyer of model B will make a claim on the warranty. For what values of the test statistic would you reject H(0) when testing H(0): PI(B) = .03 against H(A): PI(B) =/= .03? (Let ALPHA = .10).
Answer: (a) For A: PI(A) = .03
For B: n = 400, p(B) = 16/400 = .04
H(0): PI(A) = PI(B) = .03, or PI(B) - PI(A) = 0
H(A): PI(B) =/= .03
Z(calculated) = (.04 - .03)/SQRT(.03*.97/400) = .01/.00853 = 1.1724
Z(critical, two-tail, ALPHA = .05) = +/- 1.96
Therefore continue H(0), and conclude at the 5% significance level that there is no evidence that the models differ in the number of claims.
(b) Using ALPHA = .10: Z(critical, two-tail) = +/- 1.645, so reject H(0) if Z(calculated) < -1.645 or Z(calculated) > +1.645.
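A minimal Python sketch of the same test follows, assuming (as the answer does) that the 3% rate for model A is treated as a known value. The variable names are illustrative; the critical value is computed from the standard normal distribution instead of being read from a table.

```python
import math
from statistics import NormalDist

pi_0 = 0.03                  # claim rate for model A, used as the hypothesized value
n, claims = 400, 16
p_b = claims / n             # observed claim rate for model B
se = math.sqrt(pi_0 * (1 - pi_0) / n)            # standard error under H(0)
z = (p_b - pi_0) / se
z_crit_10 = NormalDist().inv_cdf(1 - 0.10 / 2)   # two-tailed cutoff at ALPHA = .10
reject = abs(z) > z_crit_10
print(round(z, 4), round(z_crit_10, 3), reject)
```

With these data the statistic falls inside the acceptance region at ALPHA = .10 as well as at ALPHA = .05.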
114.
Past experience shows that, if a certain machine is adjusted properly, 5 percent of the items turned out by the machine are defective. Each day the first 25 items produced by the machine are inspected for defects. If three or fewer defects are found, production is continued without interruption. If four or more items are found to be defective, production is interrupted and an engineer is asked to adjust the machine. After adjustments have been made, production is resumed. This procedure can be viewed as a test of the hypothesis p = .05 against the alternative p > .05, p being the probability that the machine turns out a defective item. In test terminology, the engineer is asked to make adjustments only when the hypothesis is rejected.
Interpret the quality control procedure described above as a test of the indicated hypothesis. A Type I error results in:
a. a justified production stoppage to carry out machine adjustments. b. an unnecessary interruption of production. c. the continued production of an excess of defective items. d. the continued production, without interruption, of items that satisfy the accepted standard.
Answer: b. an unnecessary interruption of production.
115.
The workers in a large plant have complained through their union negotiators that they are being underpaid. Both sides (labor-management) agree that the mean wage for plant workers in this industry is about $3.75 per hour with a standard deviation of $.84 per hour.
(i) Does the fact that a random sample of 49 workers from this plant gave mean wage $3.54 provide sufficient evidence to indicate the plant is paying an inferior wage? Use ALPHA = .05.
(ii) State what a Type I and a Type II error would be for this problem.
Answer: (i) H(O): MU >= 3.75
H(A): MU < 3.75
n = 49 XBAR = 3.54 MU = 3.75 SIGMA = .84
SIGMA(XBAR) = [.84]/[SQRT(49)] = .12
Z(calc) = [3.54 - 3.75]/[.12] = -1.75
Z(crit, ALPHA=.05, one-tailed) = -1.645
Since Z(calc) < Z(crit), reject H(O). Therefore, sample evidence is strong enough to suggest that workers are being underpaid.
(ii) Type I Error: A type I error will occur when the null hypothesis is rejected on the basis of the sample information and in reality the null hypothesis is true. In this case, the conclusion based on the random sample would be that the workers are being underpaid when actually they are not. So, the workers' complaint would be erroneously supported.
Type II Error: A type II error will occur when the null hypothesis is not rejected on the basis of the sample information and in reality the null hypothesis is false. In this case, the conclusion based on the random sample would be that the workers are not being underpaid when actually they are being underpaid. So the employers' position of just wages would be erroneously supported.
116.
The daily yield of a chemical manufactured in a chemical plant, recorded for n = 49 days, produced a mean and standard deviation equal to XBAR = 870 tons and s = 21 tons, respectively.
Test H(0): MU = 880 against H(A): MU < 880, using ALPHA = .05. Calculate BETA for H(A): MU = 870.
Answer: S(M) = S/SQRT(n) = 21/7 = 3
XBAR(crit) = MU(M) + Z(crit)S(M) = 880 + ((-1.65)*3) = 875.05
Since 870 < 875.05, we reject H(0) and conclude that MU < 880.
BETA is the probability of committing a type II error. Using the above decision rule and given H(A), it is the probability that XBAR is greater than XBAR(crit) = 875.05 when MU = 870.
BETA(H(A): MU = 870) = P(XBAR > 875.05)
Z = (875.05 - 870)/3 = 1.683
BETA = P(Z > 1.683) = .046
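The decision rule and BETA can be reproduced with Python's standard library. This is a sketch under the same assumptions as the answer (normal sampling distribution, table value Z = -1.65); the variable names are illustrative.

```python
from statistics import NormalDist

mu_0, mu_alt = 880, 870
s, n = 21, 49
se = s / n ** 0.5                  # standard error of the mean
z_crit = -1.65                     # assumed table value used in the answer
xbar_crit = mu_0 + z_crit * se     # reject H(0) when XBAR falls below this cutoff
# BETA = P(XBAR > cutoff) when the true mean is 870
beta = 1 - NormalDist(mu_alt, se).cdf(xbar_crit)
print(round(xbar_crit, 2), round(beta, 3))
```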
117.
We are interested in finding the linear relation between the number of widgets purchased at one time and the cost per widget. The following data has been obtained:
X: Number of widgets purchased      1   3   6  10  15
Y: Cost per widget (in dollars)    55  52  46  32  25
Suppose the regression line is YHAT = -2.5X + 60. We compute the average price per widget if 30 are purchased and observe:
a. YHAT = -15 dollars; obviously, we are mistaken; the prediction YHAT is actually +15 dollars. b. YHAT = -15 dollars, which seems reasonable judging by the data. c. YHAT = -15 dollars, which is obvious nonsense. The regression line must be incorrect. d. YHAT = -15 dollars, which is obvious nonsense. This reminds us that predicting Y outside the range of X values in our data is a very poor practice.
Answer: d. YHAT = -15 dollars, which is obvious nonsense. This reminds us that predicting Y outside the range of X values in our data is a very poor practice.
118.
A management analyst is studying production in an electronic component assembly factory. Workers individually assemble components into final products. Each worker is given 100 sets of components to assemble each day. Employees clock out at the time they finish assembling the 100 sets into final products. The analyst has average hourly production rates for each individual worker. Which mean should be used to calculate the overall average production per labor hour?
a. arithmetic mean b. geometric mean c. harmonic mean
Answer: c. harmonic mean
The harmonic mean is properly used since the numerator in each worker's average production is 100 units and the denominator, hours worked, varies.
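A small numerical illustration may help. The hours below are hypothetical (they are not given in the problem); the point is only that, when every worker assembles the same 100 sets, the harmonic mean of the individual rates equals total units divided by total hours, while the arithmetic mean overstates it.

```python
from statistics import harmonic_mean, mean

units_per_worker = 100
hours = [4.0, 5.0, 10.0]                               # hypothetical hours to finish 100 sets
rates = [units_per_worker / h for h in hours]          # 25, 20 and 10 units per hour
overall = units_per_worker * len(hours) / sum(hours)   # total units / total labor hours
print(round(harmonic_mean(rates), 3), round(overall, 3), round(mean(rates), 3))
```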
119.
A management analyst is studying production in an electronic component assembly factory. Workers individually assemble components into final products. Workers assemble as many units as they can in an eight hour day. The analyst has average hourly production rates for each individual worker. To calculate the factory's overall average hourly production per worker, which mean should be used?
a. arithmetic mean b. geometric mean c. harmonic mean
Answer: a. arithmetic mean
The arithmetic mean of individual average hourly production rates is the same as total production divided by total hours worked, since individual rates are daily production divided by eight for every employee.
120.
Nelly finds that 30 out of 100 randomly selected persons walking in downtown Cincinnati believe that the government should spend more money on health care; a similar survey in the suburbs shows that 20 out of 100 persons believe in more government spending. At the ALPHA = 0.05 level, are these data evidence that the people in downtown Cincinnati believe to a different extent than people in the suburbs that the government should spend more money on health care?
Answer: Z = (P(1) - P(2) - 0)/S, where S is the standard error for the difference of proportions.
S = SQRT((.3(.7)/100) + (.2(.8)/100)) = .06
Z(calc.) = ((.3 - .2) - 0)/.06 = 1.67
Z(crit.) = +/- 1.96 for ALPHA = .05
Since 1.67 < 1.96, continue H(O).
121.
You are to conduct an opinion poll to determine the opinions of residents of a given community about a projected industrial development program. How large a sample should you select to estimate the proportion of adult residents favoring the projected development? Make all assumptions necessary to determine the sample size, and justify these assumptions.
Answer: Necessary assumptions are:
Level of significance = .05, so Z = 1.96
Tolerable error, e = .05
Assume X has a binomial distribution with a population of size N and PI = .50, where X is the number of adult residents favoring the projected development.
The first two assumptions are arbitrary values that will depend upon the preference of the researcher. The choice of the proportion has been set at maximum variability since no other information on the proportion was available.
n = (Z**2)(PI)(1  PI)/(e**2) = (1.96**2)(.5)(.5)/(.05**2) = 384.16 == 385
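The sample-size formula is easy to evaluate for other choices of the assumptions. A minimal Python sketch, using the values assumed in the answer (the variable names are illustrative):

```python
import math

z = 1.96      # two-sided normal value for a .05 level of significance
e = 0.05      # tolerable error
pi = 0.5      # most conservative choice: maximizes PI * (1 - PI)
n_exact = z**2 * pi * (1 - pi) / e**2
n = math.ceil(n_exact)    # round up so the error bound is still met
print(round(n_exact, 2), n)
```

Rounding up rather than to the nearest integer is why 384.16 becomes 385.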
122.
a. For each of the samples listed below obtain:
1. a mean
2. a variance, and
3. a standard deviation
Each sample was randomly obtained from the production of the hot dog manufacturer listed.
Company   Dog Length (inches)
A         5,5,5,5,5
B         6,5,5,5,4
C         9,9,5,1,1
D         9,5,5,5,1
E         9,5,5,5,5,5,5,5,5,1
F         9,9,9,4,4,3,3,3,3,3
b. Given that the price per hot dog is the same for all manufacturers, whose hot dogs would you buy? Why?
Answer: a.
Company   Mean                  Variance                             St. Dev.
          XBAR=(SUM X(i))/n     S**2=(SUM(X - XBAR)**2)/(n - 1)      S=SQRT(VAR.)
A         25/5=5                0                                    0
B         25/5=5                2/4=.5                               SQRT(.5)=.707
C         25/5=5                64/4=16                              SQRT(16)=4
D         25/5=5                32/4=8                               SQRT(8)=2.83
E         50/10=5               32/9=3.56                            SQRT(3.56)=1.89
F         50/10=5               70/9=7.78                            SQRT(7.78)=2.79
b. This question may have a variety of answers. The decision would depend on the purpose. If it was important to have as little variability as possible when selling 5 inch hot dogs, company A would be best since it has the least variability. However, if you could profit from selling hot dogs in a variety of lengths, company F might prove best since it shows a lot of variability and produces hot dogs ranging from 9 to 3 inches in length.
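The summary statistics in part (a) can be reproduced with Python's `statistics` module, whose `variance` and `stdev` functions use the n - 1 divisor, as the answer does. The dictionary layout is just one convenient way to hold the samples.

```python
import statistics

samples = {
    "A": [5, 5, 5, 5, 5],
    "B": [6, 5, 5, 5, 4],
    "C": [9, 9, 5, 1, 1],
    "D": [9, 5, 5, 5, 1],
    "E": [9, 5, 5, 5, 5, 5, 5, 5, 5, 1],
    "F": [9, 9, 9, 4, 4, 3, 3, 3, 3, 3],
}

# (mean, sample variance, sample standard deviation) for each company
summary = {
    company: (statistics.mean(x), statistics.variance(x), statistics.stdev(x))
    for company, x in samples.items()
}
for company, (m, v, sd) in summary.items():
    print(company, m, round(v, 2), round(sd, 2))
```

All six companies share a mean of 5 inches, so only the spread distinguishes them.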
123.
Two workers on the same job show the following results over a long period of time.
                                              Worker A   Worker B
Mean time of completing the job (minutes)        30         25
Standard deviation (minutes)                      6          4
a. Which worker appears to be more consistent in the time he requires to complete the job? Explain.
b. Which worker appears to be faster in completing the job? Explain.
Answer: a. Worker B appears to be more consistent in the time he requires to complete the job, since he has a smaller variance.
b. Worker B appears to be faster in completing the job, since he has a smaller mean. (You could actually test this.)
124.
Suppose the manager of a plant is concerned with the total number of man-hours lost due to accidents for the past 12 months. The company statistician has reported the mean number of man-hours lost per month but did not keep a record of the total sum. Should the manager order the study repeated to obtain the desired information? Explain your answer clearly.
Answer: No. The estimate that he would get using the mean number per month would most likely be accurate enough, without having to go to the extra expense of another study. Presumably the mean number of hours lost per month is equal to the total number of hours lost divided by 12, so it's not difficult to calculate the total.
125.
A large health screening program that will have 36 clinics needs to purchase scales for the clinics. A manufacturing firm has available 36 scales on which the same 180 pound man was weighed. The variance in his weight on the 36 scales was .07 (lb**2). The screening program will buy the scales if the variance is not significantly greater than .05 at the 1% significance level.
a. What test statistic would you use to test the null hypothesis that the true variance in weights on the new scale is .05? Set up the computations.
b. What are the null hypothesis, alternative hypothesis, and critical region of such a test?
Answer: a. CHI-SQUARE = (n - 1)*(S**2)/(SIGMA**2) = (35)(.07)/(.05) = 49
b. H(O): SIGMA**2 = .05
H(A): SIGMA**2 > .05
Critical Region: CHI-SQUARE > CHI-SQUARE(df=35, ALPHA=.01, one-tail) = 57.34
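As a sketch of how the computation in part (a) would be carried out, the test statistic can be evaluated and compared with the critical value in a few lines of Python. The critical value 57.34 is the table value quoted in the answer, not something computed here, and the variable names are illustrative.

```python
n = 36
s2 = 0.07            # sample variance of the 36 readings (lb**2)
sigma2_0 = 0.05      # variance under H(O)
chi_square = (n - 1) * s2 / sigma2_0
critical = 57.34     # table value quoted in the answer: df = 35, ALPHA = .01
reject = chi_square > critical
print(round(chi_square, 2), reject)
```

The statistic falls short of the critical value, so on these numbers the variance is not significantly greater than .05 at the 1% level.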
126.
Suppose that the variable measured using a random sample is annual income. (Suppose that it and all other items were measured accurately.) Explain what it is that these two models have to say about income.
1. Y(I) = MU + EPSILON(I)
2. Y(I,J) = MU(J) + EPSILON(I,J)
Where J = 1 indicates a Democrat
      J = 2 indicates a Republican
Answer: Model 1 states that annual income can be described by a single population having a single mean and standard deviation. Individual incomes consist of a common mean plus random variation.
Model 2 states that description of annual income may require 2 populations, one for Democrats and one for Republicans. It provides for the possibility of different population means for income and may also provide for different standard deviations.
NOTE: This is a case where it would probably not be advisable to assume that EPSILONS are normally distributed.
127.
A manufacturing company operates 12 plants that are regarded as about the same in all important respects. This company decides to try a new safety program. The new program is randomly assigned to 6 plants while the old program is continued at the other 6. Number of man-hours lost per plant per month were measured in each plant following completion of the safety programs. Results were:
New Program: 46, 41, 16, 11, 58, 61
Old Program: 92, 65, 10, 24, 46, 51
a. Write a model for number of man-hours lost.
b. Fill out an ANOVA table corresponding to this model.
c. Was there a change in accident rate that was detectable at the 10% level?
Answer: a. Y(I,J) = MU(I) + EPSILON(I,J) or MU + TAU(I) + EPSILON(I,J)
Where: Y(I,J) is man-hours lost in plant J under program I
MU(I) is population mean for time lost under program I
or MU is population mean and TAU(I) is used to indicate effect of program I defined as a deviation from MU
and EPSILON(I,J) indicates a random element associated with the Jth plant using program I. These random elements are normally distributed with mean = 0 and variance = SIGMA**2.
b.
Source             df      or      Source             df
Total              12              Total              12
Mean, Program 1     1              General mean        1
Mean, Program 2     1              Corrected total    11
Error              10              Programs            1
                                   Error              10
c. Old Program: XBAR(1) = 48, S(1) = 29.18, n = 6
New Program: XBAR(2) = 38.83, S(2) = 21.03, n = 6
Testing: H(0): MU(1) = MU(2)
H(A): MU(1) =/= MU(2)
We first must test for homogeneity of variance:
H(0): SIGMA(1) = SIGMA(2)
H(A): SIGMA(1) =/= SIGMA(2)
F(calculated) = [S(1)**2]/[S(2)**2] = 851.6/442.17 = 1.926 with 5 and 5 df
F(critical) = 5.05 with ALPHA = .05, df = 5 and 5
Since F(calculated) is less than F(critical), we continue the null hypothesis of homogeneity of variance at the 5% level. The variances should now be pooled:
S(p)**2 = ((5)(851.6) + (5)(442.17))/10 = 646.89
Finally, find the standard error of the difference between means:
S(XBAR(1) - XBAR(2)) = SQRT((S(p)**2)(1/n(1) + 1/n(2))) = 14.68
Now, using the two-tailed t test with ALPHA = .10 and df = 10, we test the null hypothesis about the means:
t(calculated) = ((48 - 38.83) - 0)/14.68 = .6245
t(criticals) = -1.812 and 1.812
Since t(calculated) lies between the two critical values, continue the null hypothesis that there was no change in the accident rate at the 10% level.
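The numbers in part (c) can be reproduced from the raw data with a short Python sketch. It follows the same steps as the answer (variance ratio, pooled variance, two-sample t); the critical t value 1.812 is taken from a table, and the variable names are illustrative.

```python
import math
import statistics

old = [92, 65, 10, 24, 46, 51]
new = [46, 41, 16, 11, 58, 61]
n1, n2 = len(old), len(new)
v1, v2 = statistics.variance(old), statistics.variance(new)   # n - 1 divisors
f_calc = max(v1, v2) / min(v1, v2)                             # larger over smaller
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)          # pooled variance
se_diff = math.sqrt(sp2 * (1 / n1 + 1 / n2))                   # std. error of the difference
t_calc = (statistics.mean(old) - statistics.mean(new)) / se_diff
t_crit = 1.812                                                 # assumed table value: df = 10, 5% per tail
print(round(f_calc, 3), round(sp2, 2), round(se_diff, 2), round(t_calc, 3))
```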
128.
In the attached Table 1, results for the routine measurement of nickel in a steel standard are reported. This determination was made daily over a long period of time to establish a quality control program.
In Table 2, the data have been plotted as a tally sheet of individual values. Clearly, a grouped tally sheet would be more effective in revealing the pattern of variation in these data.
Perform the following:
(a) Set up a grouped tally sheet and histogram. A cell interval of 0.05% is recommended. List the frequency, cumulative frequency and relative cumulative frequency for each cell.
(b) Calculate the mean and standard deviation (use coding) by both the ungrouped and the grouped procedures. Compare results.
(c) What is the mode? Comment: is it meaningful?
(d) What is the median?
(e) Calculate the standard deviation of the mean.
(f) Plot an ogive. Plot the data on normal probability paper. Is it reasonable to assume a normal distribution? If so, estimate the standard deviation and mean and compare with the calculated values. Estimate the percentage of values outside of the limits 4.88 to 5.21 and compare with the actual percentage.
Table 1. Results of Daily Determination of Nickel in a Nickel Steel Standard
Date       % Ni      Date       % Ni      Date       % Ni
Mar.  6    4.95      Apr. 17    4.96      May  29    5.03
      7    5.02           18    4.79           30    5.08
      8    5.17           19    5.06           31    5.20
      9    5.08           20    5.03      June  1    5.11
     10    4.92           21    4.95            2    4.95
     11    4.94           22    5.10            3    4.95
     13    5.22           24    5.05            5    5.00
     14    4.96           25    5.30            6    4.92
     15    5.05           26    5.24            7    5.16
     16    5.02           27    5.00            8    5.14
     17    5.14           28    5.08            9    5.02
     18    5.00           29    5.04           10    5.14
     20    5.07      May   1    4.97           12    5.02
     21    4.83            2    4.86           13    4.97
     22    5.11            3    5.07           14    4.96
     23    4.99            4    4.90           15    5.26
     24    4.98            5    5.22           16    5.11
     25    5.26            6    5.07           17    5.15
     27    4.88            8    5.31           19    4.98
     28    5.01            9    5.05           20    5.15
     29    4.98           10    5.16           21    5.00
     30    5.21           11    5.02           22    5.14
     31    5.15           12    5.18           23    4.98
Apr.  1    5.00           13    4.90           24    5.03
      3    5.00           15    5.20           26    5.01
      4    5.10           16    5.08           27    4.97
      5    5.03           17    5.19           28    5.12
      6    4.97           18    5.16           29    4.98
      7    4.89           19    4.88
      8    5.12           20    4.99
     10    5.27           22    4.92
     11    5.09           23    5.17
     12    5.13           24    5.01
     13    4.93           25    5.02
     14    4.93           26    5.06
     15    5.04           27    5.03
Table 2. Frequency Table and Tally Sheet for the Data in Table 1
Ni Conc.,   Tally     Frequency      Ni Conc.,   Tally     Frequency
% (y)       Marks     (f)            % (y)       Marks     (f)
4.79        X         1              5.05        XXX       3
4.80                                 5.06        XX        2
4.81                                 5.07        XXX       3
4.82                                 5.08        XXXX      4
4.83        X         1              5.09        X         1
4.84                                 5.10        XX        2
4.85                                 5.11        XXX       3
4.86        X         1              5.12        XX        2
4.87                                 5.13        X         1
4.88        XX        2              5.14        XXXX      4
4.89        X         1              5.15        XXX       3
4.90        XX        2              5.16        XXX       3
4.91                                 5.17        XX        2
4.92        XXX       3              5.18        X         1
4.93        XX        2              5.19        X         1
4.94        X         1              5.20        XX        2
4.95        XXXX      4              5.21        X         1
4.96        XXX       3              5.22        XX        2
4.97        XXXX      4              5.23
4.98        XXXXX     5              5.24        X         1
4.99        XX        2              5.25
5.00        XXXXXX    6              5.26        XX        2
5.01        XXX       3              5.27        X         1
5.02        XXXXXX    6              5.28
5.03        XXXXX     5              5.29
5.04        XX        2              5.30        X         1
                                     5.31        X         1
Answer: a) (If available, consult file of graphs and charts that could not be computerized.)
Cell Boundaries    Cell Midpoint    f      Cum f    Rel Cum f
4.775 - 4.825      4.80             1      1        0.01
4.825 - 4.875      4.85             2      3        0.03
4.875 - 4.925      4.90             8      11       0.11
4.925 - 4.975      4.95             14     25       0.25
4.975 - 5.025      5.00             22     47       0.47
5.025 - 5.075      5.05             15     62       0.62
5.075 - 5.125      5.10             12     74       0.74
5.125 - 5.175      5.15             13     87       0.87
5.175 - 5.225      5.20             7      94       0.94
5.225 - 5.275      5.25             4      98       0.98
5.275 - 5.325      5.30             2      100      1.00
                                    ___
Total                               100
b) ungrouped YBAR = 504.99/100 = 5.0499 == 5.05
ungrouped S(Y) = SQRT[(2551.3039 - 2550.1490)/99] = SQRT(0.01166) = 0.108 == 0.11
Grouped and coded by: Y = 0.05d + 5.05
Cell Midpoint    d     f     f*d    f(d**2)
4.80             -5    1     -5     25
4.85             -4    2     -8     32
4.90             -3    8     -24    72
4.95             -2    14    -28    56
5.00             -1    22    -22    22
5.05              0    15     0      0
5.10             +1    12    +12    12
5.15             +2    13    +26    52
5.20             +3    7     +21    63
5.25             +4    4     +16    64
5.30             +5    2     +10    50
                             ___    ___
sum(fd) = -2     sum(f*d**2) = 448
dBAR = (sum(fd))/n = -2/100 = -.02
YBAR = (0.05)(-.02) + 5.05 = 5.049 == 5.05
S(d) = SQRT[(448 - ((-2)**2)/100) / 99] = SQRT(4.525) = 2.127
S(Y) = (2.127)(0.05) = 0.106 == 0.11
c) 5.00 or 5.02; not meaningful because no single value occurs with sufficient frequency.
d) Median is the average of the 50th and 51st observations: (5.03 + 5.03)/2 = 5.03
e) S(YBAR) = S(Y)/SQRT(n) = 0.108/SQRT(100) = 0.0108 == 0.011
f) Estimates graphically should compare closely.
(If available, consult file of graphs and charts that could not be computerized.)
Actual percentage outside = 11%. Graphical estimate should be within about 2% of this.
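The grouped, coded calculation in part (b) can be sketched in Python as follows. The cell frequencies are those of the grouped table in part (a); the coding origin 5.05 and cell width 0.05 are as in the answer, and the variable names are illustrative.

```python
import math

origin, width = 5.05, 0.05                    # coding: Y = width * d + origin
d = list(range(-5, 6))                        # coded cell midpoints, -5 .. +5
freqs = [1, 2, 8, 14, 22, 15, 12, 13, 7, 4, 2]
n = sum(freqs)
sum_fd = sum(f * di for f, di in zip(freqs, d))
sum_fd2 = sum(f * di**2 for f, di in zip(freqs, d))
d_bar = sum_fd / n
s_d = math.sqrt((sum_fd2 - sum_fd**2 / n) / (n - 1))   # std. deviation in coded units
y_bar = origin + width * d_bar                         # decode the mean
s_y = width * s_d                                      # decode the std. deviation
print(sum_fd, sum_fd2, round(y_bar, 3), round(s_y, 3))
```

Coding shifts and rescales the data, so the mean is decoded with both the origin and the width, while the standard deviation needs only the width.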
129.
A coffee dispensing machine provides servings that have a population mean of 6 ounces and a population standard deviation of .3 ounces. If the difference is measured between randomly chosen cups (e.g. the 7th minus the 15th, the 22nd minus the 29th, etc.), the distribution of differences will have a mean of ______ and a standard deviation of ______.
Answer: a. MU = 0 b. SIGMA = SQRT(.09/1 + .09/1) = .424
130.
The closing prices of two common stocks were recorded for a period of 15 days. The means and variances were:
Y(1)BAR = 40.33, Y(2)BAR = 42.54, S(1)**2 = 1.54, S(2)**2 = 2.96
Do these data present sufficient evidence to indicate a difference in variability of the two stocks for the populations associated with the two samples? [Assume stock 1 is normally distributed with mean = MU(1) and variance = SIGMA(1)**2 and stock 2 is normally distributed with mean = MU(2) and variance = SIGMA(2)**2; ALPHA = 5% and S(i)**2 = SUM(j=1,n(i))([(Y(ij) - Y(i)BAR)**2]/[n(i) - 1]), i=1,2.]
Answer: H(0): SIGMA(1)**2 = SIGMA(2)**2
H(A): SIGMA(1)**2 =/= SIGMA(2)**2
F(calc) = [larger variance] / [smaller variance] = [2.96] / [1.54] = 1.922
F(crit., df=14,14, ALPHA= .05, one-tail) = 2.48
Since our calculated F value is less than our tabled F value, we do not reject (continue) the null hypothesis that the variances for the two populations are the same.
131.
Once upon a time there was a king who proclaimed that a proper kingdom should not have great differences in wealth. One day he instructed his wizard to randomly sample his kingdom so that he could assess the distribution of wealth.
So the wizard did this:
1. He randomly selected 100 people, found their income, and wrote down the mean for the group of 100;
2. He randomly and independently repeated this process over and over again;
3. He truthfully reported to the king, "I have repeatedly taken average wealth in the kingdom of 100 subjects and find that the average wealth is 10 units and variance of those averages is 1 unit. Further, those means are normally distributed."
a. What is mean wealth of individuals in the kingdom?
b. What is the variance for individual wealth in the kingdom?
c. Why did the wizard report on means based on samples of 100?
Answer: a. 10 units
b. SIGMA**2 = (n)(SIGMA(XBAR)**2) = 100 * 1 = 100
c. To conceal the variability that would be obvious if he reported on individuals.
132.
HEADLINE: MPG for Gas Guzzler skyrockets over MPG for Econ Scooter
Data: in miles per gallon
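         Gas Guzzler   Econ Scooter
1964          4             25
1968          5             30
1972          8             35
1976         16             40
[Graph: vertical axis "Percent Increase Over Prior Time Period," marked at 25, 50, 75, and 100; horizontal axis marked 1968, 1972, 1976. Three points labeled G rise from 25 in 1968, to between 50 and 75 in 1972, to 100 in 1976. Three points labeled E all lie below 25.]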
(To complete the graph connect the three G points with straight lines to relate the performance of Gas Guzzler. Similarly, connect the three E points to show the trend for Econ Scooter.)
Even though the above graph is correct, explain how it has led to the misleading headline.
Answer: The headline is misleading in the sense that it implies that mpg is being compared for the two vehicles. Only upon inspection of the data can one see that Econ Scooters have a substantially higher mpg, while their rate of increasing mpg has not been as great. The graph accurately indicates the rate of increase in mpg, but the headline is comparing actual mpg, which is quite different.
133.
Suppose that a report contains this graph:
[Graph: vertical axis "Annual Income (thousands of $ per year)," marked at 25 and 50; horizontal axis "Years of Experience in Trade," marked at 10, 20, and 30. Four points (*) are plotted, rising from below 25 to about 50.]
(Note: to complete graph, connect the *'s with a smooth curve.)
a. What does the graph indicate as annual income for someone with no experience in the trade?
b. Describe the relation between income and experience over the interval from 0 to 20.
c. Describe the relation between income and experience over the interval 20 to 30.
d. Describe the overall graph.
Answer: a. Around 12,500 dollars per year.
b. There appears to be approximately a straight line relation in which income increases with experience over the interval from 0 to 20. (There seems to be some curvature or flattening for experience near 20.) The change in income in this range is from around 12.5 to around 48, so the rate of increased income is roughly $35,500/20 = $1775 per year.
c. The relation between experience and income for experience between 20 and 30 years also appears to be roughly a straight line, but a flat straight line, indicating that income stays roughly constant at a little less than $50,000 per year.
d. The overall graph indicates income initially around $12,500 (no experience), increasing income in the range from 0 to 20 years experience, approaching a limit that seems to be a little below $50,000. That limit seems to be reached sometime between 10 and 25 years. (Income seems to remain about constant afterward.)
134.
If a random sample of 18 homes south of Center Street in Provo showed the average selling price to be $15,000 with s**2 = $2400 and a random sample of 18 homes north of Center Street revealed an average selling price of $16000 with s**2 = $4800, can you conclude that there is a statistically significant difference (ALPHA = .05) between the selling price of homes in these areas of Provo?
Answer: H(0): MU(north) - MU(south) = 0
H(A): MU(north) - MU(south) =/= 0
s**2 = ((18 - 1)(2400) + (18 - 1)(4800)) / (18 + 18 - 2) = 3600
s = 60
Before pooling the sample variances, we will test to see if the population variances are equal:
H(0): SIGMA(north)**2 = SIGMA(south)**2
H(A): SIGMA(north)**2 =/= SIGMA(south)**2
F(calc.) = 4800/2400 = 2
F(crit, df=17,17, ALPHA=.05) = 2.29
So do not reject (continue) H(0), and pool the s**2's. The following is equivalent to pooling when the sample sizes are equal:
t(calculated) = (15000 - 16000)/((60)*SQRT(1/18 + 1/18)) = -50
t(crit., ALPHA=.05, df=34, two-tailed) = +/- 2.03
conclusion: there is a significant difference
135.
A machine is supposed to produce Zorkel fingers having a thickness of .050 inches. To test if the machine is working properly, a random sample of 16 Zorkel fingers is selected from the day's output. The mean thickness of the sample is .053 inches with S = .003. We wish to determine if the machine is in proper working order with ALPHA = .01. Use a two-tailed test.
Answer: Hypothetical Population: Set of all Zorkel fingers.
Sample: The 16 randomly selected fingers.
H(O): MU = .05. The mean Zorkel finger thickness is .05.
H(A): MU =/= .05. The mean Zorkel finger thickness is other than .05.
MU(M) = .05 by H(O)
S(M) = S/SQRT(n) = (.003)/4 = .00075
M(crit) = MU(M) +/- t(crit)*S(M) = .05 +/- (2.95)*(.00075) = .048 to .052
Since M = .053 is greater than .052, we reject H(O) and conclude that Zorkel finger thickness is other than .05.
OR, using a t-test:
t(calc) = [XBAR - MU] / [S(M)] = [.053 - .050] / [.00075] = 4.00
t(crit, df=15, ALPHA=.01, two-tailed) = 2.947
Since t(calc) > t(crit), we reach the same conclusion as above.
136.
An investigator was interested in studying relations between a number of factors and salary in a university. One of the factors of interest was tenure status. After much agonizing, the investigator decided to use the following variable for persons having faculty appointments.
Variable called T coded as: 1 for non-tenure-track people such as administrators with faculty appointments
2 for faculty with less than one year in service in tenure track positions
3 for faculty with one to three years on tenure track
.
.
7 for faculty having tenure and more than 20 years service
The investigator then carried out a multiple regression analysis in which one of the variables fitted was T.
If you accept his seven tenure classes as a reasonable grouping scheme, would you use this approach? Why or why not?
Answer: If I regarded the seven classes specified as a good way to form groups based on tenure, I would want to see what would happen if I used six independent variables instead of just one to represent tenure effects. Using T alone might work if the relation between salary and the code values for T were linear. But, the code values for T don't appear to be either well ordered (the first class included administrators who are apt to have higher salaries than the following classes which do seem to provide sort of an increasing order) or equally spaced. (Is the amount of "tenure" the same between classes 1 and 2, and 2 and 3?).
I would not use this approach.
137.
A report on the effect of sex on faculty salaries at a Western University states that all ranks and departments have been surveyed. It states that:
A simple regression with salary as dependent variable and sex as independent variable had a regression coefficient for sex equal $5000.
A multiple regression with the same dependent variable but additional variables for rank (full professor, associate, assistant, instructor), department, tenure status, length of employment, etc. had a regression coefficient for sex equal $100.
(In both cases, the independent variable used for sex was coded so that the regression coefficient for sex represented the salary advantage of males over females.)
Which of these values would you use to represent the effect of sex on salary? Explain your answer.
Answer: I would expect that value $100 would be more informative or trustworthy. When all important factors affecting response have been held constant except for a single independent variable and that variable is related to response by a straight line relation, a simple regression coefficient can provide a good measure of how that variable affects response. But, if response is affected by many variables and they are not constant in the data set being examined, a simple regression coefficient can be very misleading. In this case, the fact that the regression coefficient for sex changed from 5000 to 100 when other variables were included in the fitted regression indicates that much of the apparent influence of sex on salary really was the result of treating variables like rank, department, etc. as constant or unimportant when in fact some of them were important and not constant.
138.
Attached is a table relating current food prices and prices from 3 months ago for a certain supermarket in the area. Perform the following:
a. Plot current price vs. price 3 months ago. b. Propose a model relating current price and price 3 months ago. Define all terms and estimate all parameters.
Produce                      Price 3 months ago    Current
Milk (1 gallon)                    1.39              1.39
Cheese (sliced 12 oz.)              .89               .93
Eggs (1 doz. large)                 .89               .81
Bologna (12 oz.)                    .85               .89
White Tuna Fish (7 oz.)             .75               .79
Soup (chicken noodle)               .22               .24
Green Beans (1 lb. can)             .33               .39
Ground Beef (1 lb.)                 .95               .98
Corn Flakes (12 oz.)                .49               .49
Spaghetti (2 lbs.)                  .89               .95
Sauce (w/o meat, 16 oz.)            .59               .59
Coffee (6 oz.)                     1.57              1.57
Bread (1 oz.)                       .45               .52
Lettuce (1 head)                    .49               .33
Potatoes (10 lbs.)                  .49               .69
Fruit Cocktail (18 oz.)             .49               .49
Peanut Butter (18 oz.)              .87               .99
Yogurt (8 oz.)                      .37               .37
Rice (2 lbs.)                       .71              1.09
Cottage Cheese (1 lb.)              .65               .67
Total 14.33 15.17
Answer: a. If available, consult file of graphs and diagrams that could not be computerized for graph.
b. The model I propose is: Y = B(1)*X + EPSILON
where: Y is the response, current price;
B(1) is the estimated effect of X on Y;
X is the independent variable, price 3 months ago;
EPSILON is a random error term.
The fitted equation is: YHAT = 1.046 * X
I forced this regression through the origin because, with a regression not through the origin (the intercept equalled .052), the intercept was not significant at the 5% level. Also, the usual method of describing inflation would not include adding a constant to some computed number.
The t test for the regression coefficient and F test for the regression mean square are significant.
ANOVA
Source               df      SS         M.Sq.
Uncorrected total    20      13.8443
Regression            1      13.6213    13.6213
Pooled Error         19       0.2230      .01174
R**2, adjusted for matched X error = .9894
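The slope of a regression forced through the origin is SUM(XY)/SUM(X**2), and the ANOVA sums of squares follow from the same totals. The Python sketch below recomputes them from the price table; it is an illustration with made-up variable names, not the software the original answer used, and small differences in the last decimal place from the printed table are to be expected.

```python
# Prices 3 months ago (x) and current prices (y), in the order of the table
x = [1.39, 0.89, 0.89, 0.85, 0.75, 0.22, 0.33, 0.95, 0.49, 0.89,
     0.59, 1.57, 0.45, 0.49, 0.49, 0.49, 0.87, 0.37, 0.71, 0.65]
y = [1.39, 0.93, 0.81, 0.89, 0.79, 0.24, 0.39, 0.98, 0.49, 0.95,
     0.59, 1.57, 0.52, 0.33, 0.69, 0.49, 0.99, 0.37, 1.09, 0.67]

sxx = sum(a * a for a in x)               # uncorrected SUM(X**2)
sxy = sum(a * b for a, b in zip(x, y))    # uncorrected SUM(XY)
syy = sum(b * b for b in y)               # uncorrected total sum of squares
b1 = sxy / sxx                            # least-squares slope through the origin
ss_reg = b1 * sxy                         # regression sum of squares, 1 df
ss_err = syy - ss_reg                     # error sum of squares, n - 1 df
ms_err = ss_err / (len(x) - 1)
print(round(b1, 3), round(ss_reg, 4), round(ss_err, 4), round(ms_err, 5))
```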
139.
A small mail-order house uses the weight of incoming mail to determine how many of their employees are to be assigned to filling orders on a given day. Assume a linear regression model, given X = weight (lbs) of mail on hand at 7:00 a.m., and Y = no. of 8-hour shifts required to fill the orders of that day. The calculated results from some data are given below:
n = 8          SUM(X**2) = 524
SUM(X) = 56    SUM(XY) = 364
SUM(Y) = 40    SUM(Y**2) = 256
YHAT = .52 + .84X
(a) Test the hypothesis that the slope of the regression line is zero at ALPHA = .05.
(b) Find a 90% confidence interval for the number of eight-hour shifts required if there are 10 lbs. of mail on hand at 7 a.m. on a particular day.
(c) In analyzing the fitted regression model, explain what the values for b(0) and b(1) mean. Is there anything inconsistent about your fitted values from a practical standpoint?
Answer: (a) H(O): BETA = 0 H(A): BETA =/= 0
SSE = [256 - (40**2)/8] - [(364 - (56*40)/8)**2 / (524 - (56**2)/8)] = 56 - (84**2)/132 = 2.5454
MSE = 2.5454/6 = 0.4242
S(b)**2 = 0.4242/132 = 0.0032
S(b) = 0.0567
t(calc) = (0.84 - 0)/.0567 = 14.817
t(crit, ALPHA=.05, two-tailed, df=6) = +/- 2.447
Since t(calc) > +t(crit), reject H(O). Therefore, based on this sample evidence, conclude that the regression coefficient is different from zero.
(b) XBAR = 56/8 = 7, so
S(YHAT) = SQRT(0.4242 * [(1/8) + ((10 - 7)**2)/132]) = 0.286
C.I. = YHAT +/- [t * S(YHAT)] = [.52 + (.84*10)] +/- [1.943 * 0.286] = 8.92 +/- 0.556 = from 8.36 to 9.48
(c) b(0) is the estimated value for BETA(0), which is the intercept value on the Y-axis for the regression line.
b(1) is the estimated value for BETA(1), which is the slope of the regression line. This indicates the ratio of the change in the Y variable with respect to the change in the X variable for the particular line.
The fitted value that might cause some concern from a practical standpoint is b(0), which implies that approximately a four-hour shift is needed even when no mail is on hand.
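The arithmetic in part (a) can be reproduced from the summary sums alone. The sketch below is only an illustration: the function name is hypothetical, and the slope .84 is passed in exactly as printed in the problem statement rather than recomputed from the sums.

```python
import math


def slope_t_test(n, sum_x, sum_y, sum_xx, sum_xy, sum_yy, slope):
    """Return (SSE, MSE, S(b), t) for testing H(O): BETA = 0 from summary sums."""
    sxx = sum_xx - sum_x ** 2 / n       # corrected sum of squares for X
    sxy = sum_xy - sum_x * sum_y / n    # corrected sum of cross products
    syy = sum_yy - sum_y ** 2 / n       # corrected sum of squares for Y
    sse = syy - sxy ** 2 / sxx          # error sum of squares
    mse = sse / (n - 2)                 # error mean square on n - 2 df
    se_slope = math.sqrt(mse / sxx)     # standard error of the slope
    return sse, mse, se_slope, slope / se_slope


# Assumption: slope = .84 is taken from the problem statement as given.
sse, mse, se_slope, t_calc = slope_t_test(8, 56, 40, 524, 364, 256, slope=0.84)
print(round(sse, 4), round(mse, 4), round(se_slope, 4), round(t_calc, 2))
```

Comparing t_calc with the tabled value 2.447 (df = 6, two-tailed, ALPHA = .05) gives the same decision as the hand calculation.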
140.
A report for an organization states that a simple regression was used to relate salary to sex. The independent variable for sex was coded so that the regression coefficient for sex represented the salary advantage of males over females. The result of fitting over 700 pairs of values was a regression coefficient of 5000 (advantage for males of $5000). "A test of the regression coefficient at the 1% level was significant. The correlation coefficient r was .24."
What action would you take on the basis of this report? Explain.
Answer: Send the report writers back to reexamine their data.
1) The only time that a simple regression would be a good way to estimate the effect of sex on salary would be when all other important factors affecting response have been held constant or nearly constant. That seems unlikely in most organizations. (There should also be a straight line relation between salary and sex. That should be a good bet unless there are more than two sexes in the organization.)
2) For this data set, sex has only accounted for around 6% (.24**2) of the variation in salary. It would take someone bolder than I am to take action on a description that leaves 94% of the variation in salary unexplained. (The test of significance says that the evidence at hand is consistent with the claim that the regression coefficient is not zero. It offers guarantees neither that the model fitted is reasonable nor that a worthwhile amount of variation in response has been accounted for.)
141.
The following data illustrates the relationship between income and education for a sample of nine U.S. workers.
Education (X(j) in years)    Income (Y(j) in thousands of $)
           0                              5
           6                              6
           8                              7
          10                              9
          12                              8
          12                             10
          12                             12
          14                             11
          16                             12
a. Obtain a scattergram for the data.
b. Perform a regression analysis using the model: Y(j) = a + b*X(j) + e(j)
c. Draw the regression line on the scattergram.
d. What income, in thousands of dollars, would you predict for a single U.S. worker with 10 years of education?
e. Find the correlation coefficient for the data.
f. What proportion of the variance in income is "explained" by the regression equation?
g. Based on this sample, how much extra income would an additional year of education be worth to a person with less than 16 years of education?
Answer: a. [Scattergram of income Y (axis marked 5, 10, 15) against education X (axis marked 5, 10, 15) showing the nine data points and two marked points A and B. Connect points A & B to form the graph of the regression line.]
b. YHAT = 4.1063 + .47826(X)
c. Refer to scatter diagram above.
d. Prediction for a worker with 10 years of education: YHAT = 4.1063 + .47826(10) YHAT = 8.889
Therefore, I predict an income of $8,889 when education = 10 years.
e. r = .892
f. r**2 = .796, so 79.6% of the variation in income has been explained by the model.
g. An additional year of education is worth $478, since b(1), the regression coefficient, indicates the change in Y for a unit change in X.
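Parts b, d, e and f can be verified with a few lines of code. This is a minimal sketch using the standard least-squares formulas; the function name simple_regression is illustrative only.

```python
import math

education = [0, 6, 8, 10, 12, 12, 12, 14, 16]   # X(j), years
income = [5, 6, 7, 9, 8, 10, 12, 11, 12]        # Y(j), thousands of $


def simple_regression(x, y):
    """Return (intercept a, slope b, correlation r) for Y = a + b*X + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r


a, b, r = simple_regression(education, income)
prediction_at_10 = a + b * 10        # part d
print(round(a, 4), round(b, 5), round(prediction_at_10, 3))
print(round(r, 3), round(r ** 2, 3)) # parts e and f
```

Because 10 years happens to be the mean of X, the prediction in part d equals the mean income of the sample.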
142.
An experiment was conducted in a supermarket to observe the relation between the amount of display space allocated to Petrushka brand coffee and its weekly sales. The data for the five time periods are below.
Space Allocated (sq. yds.)  X:   1   2   3   4   5
Weekly Sales (cases)        Y:   2   4   5   6   8
After gathering the data, Clark Kent, the SUPERmarket MANager, discovered that the slope of the least squares line is 1.40 and the intercept is +0.80.
a) Plot the least squares line and the data points. Comment on the fit.
b) What would you predict the weekly sales would be if the manager allocated 4.5 sq. yd.?
c) How much of an increase in sales can he expect for every extra sq. yd. of display space?
d) What does LEAST SQUARES refer to?
e) Why are there two equations for the confidence intervals for a future Y value, and when would you use each one?
Answer: a) [Plot of weekly sales in cases (Y, axis marked 1 to 9) against space allocated in sq. yds. (X, axis marked 1 to 5) showing the five data points and two marked points A and B. NOTE: To plot an approximation to the regression line, connect points A & B. Also note that point B is a data point.]
The least squares line appears to fit the data very well.
b) Sales = (1.4*4.5) + .8 = 7.1
c) 1.4 cases for each additional sq. yd. (the slope of the least squares line).
d) Least squares refers to choosing the line that minimizes the sum of squared vertical distances between the data points and the regression line.
e) One equation is used to arrive at confidence intervals for predicting the mean response at a given X, while the other is used for predicting a particular (individual) response; the latter interval is wider.
143.
A study was conducted in which typing speed (number of words per minute) was measured each day after the beginning of a period of practice typing. Part of the results of fitting a series of polynomial models appear below.
             df for Error    S**2     R**2
Linear            9          28.40    .85
Quadratic         8           4.51    .98
Cubic             7            .91    .99
On the basis of this information, which model would you choose? Why?
Answer: SSE(Quad.) = 4.51*8 = 36.08
SSE(Cubic) = 0.91*7 = 6.37
Diff. SS. = 29.71 with 1 df
F(CALC) = Mean Sq. Diff./Error Mean Sq. (Containing Model) = 29.71/.91 = 32.65 with 1 and 7 df.
F(CRITICAL, ALPHA = .05, df = 1,7) = 5.59
Since F(CALC) is greater than F(CRITICAL), we reject the null hypothesis that the coefficient of the cubic term equals zero. I would, therefore, choose the cubic model.
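The comparison of nested polynomial models can be written as a small helper. The sketch below assumes only the error df and error mean squares from the table; the function name and the 5.59 critical value (quoted from the answer above) are passed in by hand.

```python
def extra_ss_f(df_reduced, ms_reduced, df_full, ms_full):
    """F statistic for the terms added when going from the reduced to the full model."""
    sse_reduced = df_reduced * ms_reduced   # error SS of the smaller model
    sse_full = df_full * ms_full            # error SS of the larger (containing) model
    df_diff = df_reduced - df_full          # number of added terms
    f_calc = ((sse_reduced - sse_full) / df_diff) / ms_full
    return f_calc, df_diff, df_full


# Quadratic (8 df, S**2 = 4.51) against cubic (7 df, S**2 = .91).
f_cubic, df_num, df_den = extra_ss_f(8, 4.51, 7, 0.91)
F_CRITICAL = 5.59   # ALPHA = .05, df = 1,7, from the F table
print(round(f_cubic, 2), df_num, df_den, f_cubic > F_CRITICAL)
```

The same function can be called with the linear and quadratic rows to test the quadratic term, using the critical value for 1 and 8 df from an F table.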
144.
What would you guess the value of the correlation coefficient to be for the pair of variables: "number of man-hours worked" and "number of units of work completed"?
a) Approximately 0.9 b) Approximately 0.4 c) Approximately 0.0 d) Approximately -0.4 e) Approximately -0.9
Answer: a) Approximately 0.9
145.
The results of an imaginary investigation of the effect on sales of different methods of displaying peaches included:

ANOVA
Source of Variation    df     SS     M.S.
Total                  25
Mean                    1
Corrected Total        24
Day of the week         4    400     100
Fruit Market            4   1000     250
Display                 4    200      50
Error                   9    225      25
Using the information contained in this table perform appropriate tests to decide if background variation accounts for the effect on sales of: a. Display b. Day of Week c. Fruit Market
Answer: a. F(calculated) = 50/25 = 2 F(critical, df = 4, 9, ALPHA = .05) = 3.63 Therefore, retain the null hypothesis that the effect of display equals zero.
b. F(calculated) = 100/25 = 4 F(critical) = 3.63 Therefore, reject the null hypothesis that the effect of the day of the week equals zero.
c. F(calculated) = 250/25 = 10 F(critical) = 3.63 Therefore, reject the null hypothesis that the effect of fruit market equals zero. From these F tests we can see that background variation accounts for the effect on sales of display only.
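Each of the three tests divides a mean square by the error mean square and compares the ratio with the same tabled F. A minimal sketch, with the table values typed in by hand and illustrative names:

```python
ERROR_MS = 25.0
F_CRITICAL = 3.63   # df = 4, 9 at ALPHA = .05, as quoted in the answer
mean_squares = {"Display": 50.0, "Day of the week": 100.0, "Fruit Market": 250.0}

# F ratio for each source of variation against the error mean square.
f_ratios = {source: ms / ERROR_MS for source, ms in mean_squares.items()}

for source, f_calc in f_ratios.items():
    decision = "reject H(O)" if f_calc > F_CRITICAL else "retain H(O)"
    print(f"{source}: F = {f_calc:g}, {decision}")
```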
146.
To test the hypothesis that shelf placement influences sales, a marketing researcher has collected data on sales in a random sample of 15 comparable supermarkets with 3 different shelving policies for an identical brand of soup. The data is weekly sales figures (in tens of cans). Perform the appropriate test at the 5% level. If you reject, which shelving policies are different? (Note: 1/SQRT(.4) = 1.6.)
                          bottom shelf   middle shelf   top shelf
                             sales          sales         sales      Total
                               10             25            10
                                5             20            10
                               10             25            20
                               10             30            20
                               15             50            40
Sums                           50            150           100
Sums of squared scores        550           5050          2600       8200
Answer:
OVERALL MEAN(SALES)         20
STANDARD DEVIATION          10
COEFFICIENT OF VARIATION    50

ANALYSIS OF VARIANCE:
SOURCE OF VARIATION     DF       SS         MEAN SQUARE    F(CALC.)
UNCORRECTED TOTAL       15    8200.0000
CORRECT'N FOR MEAN       1    6000.0000
CORRECTED TOTAL         14    2200.0000
SHELF                    2    1000.0000     500.00000       5.00
EXPERIMENTAL ERROR      12    1200.0000     100.00000

MEANS FOR SHELF
TREATMENT    MEAN(SALES)
MIDDLE           30
TOP              20
BOTTOM           10

PROBABILITY LEVEL FOR COMPARING MEANS = .05
VALUE FOR STUDENT'S t (DF=12, ALPHA=.05, TWO-TAILED) = 2.179
LSD FOR ABOVE MEANS IS 13.7812 at PROB. LEVEL .05 (Note: LSD means Least Significant Difference.)
F(critical, df=2,12, ALPHA=.05) = 3.88
Therefore, reject the null hypothesis that shelving policy does not influence sales.
Based on the LSD given above, the only significant difference is between the middle shelf and the bottom shelf (30 - 10 = 20, which exceeds 13.78). The middle-top and top-bottom differences (10 each) are smaller than the LSD.
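The ANOVA table and the LSD comparison can be reproduced directly from the raw sales figures. This is a sketch under the assumptions of the answer (equal group sizes of 5, tabled t = 2.179); names such as one_way_anova are illustrative.

```python
import math
from itertools import combinations

sales = {
    "bottom": [10, 5, 10, 10, 15],
    "middle": [25, 20, 25, 30, 50],
    "top": [10, 10, 20, 20, 40],
}


def one_way_anova(groups):
    """Return (F, error mean square, error df) for a completely random design."""
    values = [v for g in groups.values() for v in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_within = (ss_total - ss_between) / df_within
    return (ss_between / df_between) / ms_within, ms_within, df_within


f_calc, mse, df_error = one_way_anova(sales)
T_CRIT = 2.179                            # Student's t, df = 12, two-tailed .05 (from table)
lsd = T_CRIT * math.sqrt(2 * mse / 5)     # 5 stores per shelf position
means = {name: sum(v) / len(v) for name, v in sales.items()}
significant = [(a, b) for a, b in combinations(means, 2)
               if abs(means[a] - means[b]) > lsd]
print(round(f_calc, 2), round(mse, 1), round(lsd, 4), significant)
```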
147.
Suppose that you wish to test 4 brands of tires for length of usefulness and that you have available 4 car-driver combinations. Thus you have available 16 experimental units if you consider each tire position on a car as an experimental unit: i.e.

         Front-right   Front-left   Rear-right   Rear-left
         ___________   __________   __________   _________
Car 1      Unit 1        Unit 2       Unit 3       Unit 4
Car 2      Unit 5        Unit 6       Unit 7       Unit 8
Car 3      Unit 9        Unit 10      Unit 11      Unit 12
Car 4      Unit 13       Unit 14      Unit 15      Unit 16
A. Assign Brands (A,B,C,D) randomly to experimental units (i.e., Use a procedure appropriate to a completely random, CR, design). Show how you used the random numbers table. Do you see any dangers in using a CR design for this kind of experiment?
B. Assign brands to experimental units subject to the restriction that each brand must be tested once in each tire position (i.e., Use a procedure appropriate to a randomized complete block, RCB, design where tire position is used in forming blocks). Show how you used the random numbers table. Do you see any dangers in using an RCB design for this kind of experiment?
C. Suggest a way of testing tires in this situation that might overcome the dangers of using either a CR or RCB design.
Answer: A. Generate a random ordering of the 16 unit numbers, using a computer or a random numbers table, and assign the brands to the units in rotation so that each brand appears 4 times in the experiment, i.e.
2, 4, 8, 14, 13, 12, 7, 1, 16, 6, 15, 3, 9, 5, 10, 11
e.g.
Brand A to unit 2
Brand B to unit 4
Brand C to unit 8
Brand D to unit 14
Brand A to unit 13
Brand B to unit 12
Brand C to unit 7
Brand D to unit 1
etc.
Dangers of the CR design:
(1) There is a possibility that one brand might appear in only one tire position (e.g., A on units 1, 5, 9, and 13).
(2) Or one brand might appear on only one car (e.g., B on units 1, 2, 3, and 4).
B. Problem (1) will be solved, since in RCB designs each brand will be applied once to each tire position. E.g., in Front-right (say, block I), randomly assign brands to units 1, 5, 9 and 13 (e.g., A to 5, B to 13, C to 1 and D to 9). This scheme will eliminate the position-to-position variation. But one would still expect car-to-car (row-wise) variation with this scheme.
C. A Latin square design will resolve (2) as opposed to RCB and resolve (1) and (2) as opposed to CR. In this scheme each brand will appear once and only once in each position and each car. One way is:
_________________
| A | B | C | D |
|___|___|___|___|
| B | C | D | A |
|___|___|___|___|
| C | D | A | B |
|___|___|___|___|
| D | A | B | C |
|___|___|___|___|
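When a computer is used in place of the random numbers table, the CR and Latin square assignments might be generated as follows. This is one possible sketch: the function names, the seed, and the choice to randomize a cyclic square by shuffling its rows and columns are assumptions made for illustration.

```python
import random
from collections import Counter

BRANDS = ["A", "B", "C", "D"]


def completely_random_design(rng):
    """Shuffle the 16 unit numbers, then deal brands A, B, C, D in rotation."""
    units = list(range(1, 17))
    rng.shuffle(units)
    return {unit: BRANDS[i % 4] for i, unit in enumerate(units)}


def latin_square_design(rng):
    """Start from a cyclic 4 x 4 square and shuffle its rows (cars) and columns (positions)."""
    n = len(BRANDS)
    square = [[BRANDS[(row + col) % n] for col in range(n)] for row in range(n)]
    rng.shuffle(square)                 # permute the rows
    columns = list(range(n))
    rng.shuffle(columns)                # permute the columns
    return [[row[c] for c in columns] for row in square]


rng = random.Random(147)                # fixed seed so the plan can be reproduced
cr_plan = completely_random_design(rng)
ls_plan = latin_square_design(rng)
for car, row in enumerate(ls_plan, start=1):
    print("Car", car, row)
```

Shuffling rows and columns keeps every brand exactly once in each car and each position, which is the property the Latin square relies on.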
148.
The manager of a department store wished to compare the influence of background music on the volume of sales in the shoe department. He wished to test:
T1. Waltzes, T2. Marches, T3. Acid Rock, T4. Polkas
He decided to use the same treatment for a sales period where each week provided four periods.
P1. Friday, 10 a.m. to 3 p.m.
P2. Friday, 3 p.m. to 8 p.m.
P3. Saturday, 10 a.m. to 3 p.m.
P4. Saturday, 3 p.m. to 8 p.m.
He also decided to use one month (4 weeks) for testing. When asked what, if any, differences would he find if the same background music were used during all test periods he answered:
a. Sales would be greatest during P2 and P4. P3 would be better than P1.
b. Sales would be best during the first week of the month, next best during the 3rd week, and about equally poor during the 2nd and 4th week.
A. Define and illustrate experimental unit in terms of this problem.
B. Would you elect to conduct this inquiry as indicated?
C. Suppose that you have no alternative but to conduct an experiment under the conditions decreed by the manager. Which of the common designs discussed in the course would you use?
D. Write the model for the design you have chosen. Define all terms carefully. (Be sure that your definitions of terms are relevant to this particular problem.)
Answer: A. An experimental unit is one of the four time periods during a certain week, such as: Saturday from 10 a.m. to 3 p.m. during the first week.
B. Problems:
1. The treatment set doesn't include a treatment with no music. Yet that would seem to be a reasonable standard for comparison. This treatment set only allows comparisons among conditions involving background music.
2. If a randomized block or Latin square design is to be used, it requires the assumption that there is no interaction between treatments and the blocking factor. It seems questionable that the difference in sales between, say, acid rock and waltzes would be the same for 10 a.m. to 3 p.m. on Friday and 3 p.m. to 8 p.m. on Saturday.
C. I would use the Latin Square design with time of day and week of month as my blocking factors.
D. Y(I,J,K) = MU + TAU(I) + RHO(J) + KAPPA(K) + EPSILON(I,J,K)
with I = 1, 2, 3, 4; J = 1, 2, 3, 4; K = 1, 2, 3, 4
where
Y(I,J,K) is the response
MU is the overall mean
TAU(I) are the treatment (type of music) effects
RHO(J) are the effects of the time period
KAPPA(K) are the effects of the weeks of the month
EPSILON is the random error
149.
A test was conducted to compare the relative effectiveness of three waterproofing compounds (A, B, C). A strip of cloth was subdivided into nine pieces:

       Left                 Center                 Right
_____  _____  _____    _____  _____  _____    _____  _____  _____
Each piece was considered to be an experimental unit, but it was suspected that the pieces differed systematically from left to right in capacity to become waterproofed. Accordingly, the random assignment of compounds to experimental units was restricted so that:
I. Each compound was tested once in each set of three pieces (sets are left, center, and right); and
II. Each compound was tested once in each of the positions within a set of three (once furthest left in a section, once in the center of a section, and once on the right of a section).
a. Write a model appropriate to such a trial. b. Analyze and interpret the following results for such a randomization scheme:
       Left                 Center                 Right
B, 12  A, 15  C, 16    A, 11  C, 17  B, 10    C, 10  B, 12  A, 14
(consider higher numbers as better)
Answer: a. This is an LSQ design where the model is:
Y(I,J,K) = MU + TAU(I) + RHO(J) + KAPPA(K) + EPSILON(I,J,K)
where
Y is the response, degree of waterproofing
MU is an overall mean for waterproofing
TAU(I) are the treatment effects
RHO(J) are the column effects, or piece position on cloth
KAPPA(K) are the row effects, or the position within the piece
EPSILON is the random error, assumed to be normally distributed with mean = 0 and variance = SIGMA**2
Estimates of parameters
SIGMA**2 = 5.333
MU       13
TAU(1)     .333     RHO(1)  -2         KAPPA(1)   1.333
TAU(2)   -1.667     RHO(2)   1.667     KAPPA(2)   -.333
TAU(3)    1.333     RHO(3)    .333     KAPPA(3)  -1
Treatment means were: C = 14.333 A = 13.333 B = 11.333
b. None of the differences among treatment means appears to be significant; they are all less than the LSD of 18.7148 (ALPHA = .01).
The F test for treatments (alternative test with higher Type II error rate):
H(0): TAU(1) = TAU(2) = TAU(3) = 0 F(calculated) = 1.3125 F(table, ALPHA = .01, df = 2,2) = 99,
also does not allow one to reject H(0). In conclusion, it appears that none of the compounds are significantly different from any other at ALPHA = .01.
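The error variance, the F ratio and the LSD for this 3 x 3 Latin square can be recomputed from the nine scores. The sketch below is illustrative: the tuple layout and helper name are my own, and the t value 9.925 (df = 2, two-tailed ALPHA = .01) is an assumption taken from a standard t table, since the answer does not quote it.

```python
import math

# (set on the strip, position within the set, compound, score)
observations = [
    ("Left", 1, "B", 12), ("Left", 2, "A", 15), ("Left", 3, "C", 16),
    ("Center", 1, "A", 11), ("Center", 2, "C", 17), ("Center", 3, "B", 10),
    ("Right", 1, "C", 10), ("Right", 2, "B", 12), ("Right", 3, "A", 14),
]
grand_mean = sum(obs[3] for obs in observations) / len(observations)


def effect_ss(data, key_index):
    """Sum of squares for one classification (0 = set, 1 = position, 2 = compound)."""
    groups = {}
    for obs in data:
        groups.setdefault(obs[key_index], []).append(obs[3])
    return sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
               for v in groups.values())


ss_total = sum((obs[3] - grand_mean) ** 2 for obs in observations)
ss_compounds = effect_ss(observations, 2)
ss_error = ss_total - effect_ss(observations, 0) - effect_ss(observations, 1) - ss_compounds
df_error = (3 - 1) * (3 - 2)              # 2 df for a 3 x 3 Latin square
mse = ss_error / df_error
f_calc = (ss_compounds / 2) / mse
T_CRIT = 9.925                            # assumed table value, df = 2, two-tailed .01
lsd = T_CRIT * math.sqrt(2 * mse / 3)     # each compound mean is based on 3 pieces
print(round(mse, 3), round(f_calc, 4), round(lsd, 4))
```

With only 2 error df the tabled t and F values are very large, which is why no difference among the compounds can be declared significant.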
150.
A test has been conducted in which four tire brands have been tested using 12 experimental units where an experimental unit consisted of one tire position on one car. The random assignment of brands to experimental units was restricted so that each brand was tested once on each car. Results (in amount of wear) were:
         Front Right   Front Left   Rear Right   Rear Left
Car 1      D, 7.17       A, 7.62      B, 8.14     C, 7.76
Car 2      B, 8.15       A, 8.00      D, 7.57     C, 7.73
Car 3      C, 7.74       B, 7.87      A, 7.93     D, 7.80
a. Write a model appropriate to this trial and estimate all parameters. b. Do any of the assumptions for this design make you uneasy? Explain. c. Analyze and interpret these results.
Answer: a. The model is Y(I,J) = MU + TAU(I) + RHO(J) + EPSILON(I,J)
where
Y is the response, tread wear
MU is the overall mean
TAU(I) are the treatment effects, effects of tire brand
RHO(J) are the block effects, effects of car
EPSILON is the random error term with mean = 0 and variance = SIGMA**2
Estimates of parameters:
MU(HAT) = 7.79
TAU(A,HAT) = .0599 = .06
TAU(B,HAT) = .2633
TAU(C,HAT) = -.04667 = -.047
TAU(D,HAT) = -.27667 = -.277
RHO(1,HAT) = -.1175
RHO(2,HAT) = .0725
RHO(3,HAT) = .045
SIGMA**2 = .0419 with 6 df.
b. Using a randomized block (RCB) design makes me uneasy since I would expect wheel position on car to also affect tread wear. Therefore, I would also block on wheel position as well as car and use a Latin Square design.
c. Treatment means are: B = 8.053, A = 7.85, C = 7.743, D = 7.513
Only one difference is significant at the .05 level. Tires B and D are different since their difference is greater than the LSD.
(B - D) +/- LSD = .54 +/- .409
Interval is from .131 to .949
Since the interval does not include zero, we reject the null hypothesis that the true difference is zero.
The F test for treatments fails. This is the case where the LSD indicates a significant difference while the F test of treatments doesn't. These procedures usually are different and usually have different properties regarding Type I and Type II error rates. Here, the LSD is more exposed to Type I errors and the F test is more exposed to Type II errors.
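The estimates, the error variance, the LSD and the F ratio for this randomized block layout can be checked with a short sketch. The dictionary layout and variable names are illustrative, and the t value 2.447 (df = 6, two-tailed ALPHA = .05) is entered from a t table.

```python
import math

wear = {   # car -> {brand: amount of wear}
    1: {"D": 7.17, "A": 7.62, "B": 8.14, "C": 7.76},
    2: {"B": 8.15, "A": 8.00, "D": 7.57, "C": 7.73},
    3: {"C": 7.74, "B": 7.87, "A": 7.93, "D": 7.80},
}
brands = sorted(wear[1])
values = [w for car in wear.values() for w in car.values()]
grand_mean = sum(values) / len(values)

brand_means = {b: sum(wear[c][b] for c in wear) / len(wear) for b in brands}
car_means = {c: sum(wear[c].values()) / len(brands) for c in wear}

ss_total = sum((v - grand_mean) ** 2 for v in values)
ss_brands = len(wear) * sum((m - grand_mean) ** 2 for m in brand_means.values())
ss_cars = len(brands) * sum((m - grand_mean) ** 2 for m in car_means.values())
ss_error = ss_total - ss_brands - ss_cars
df_error = (len(brands) - 1) * (len(wear) - 1)     # 3 * 2 = 6
mse = ss_error / df_error
f_calc = (ss_brands / (len(brands) - 1)) / mse

T_CRIT = 2.447                                     # Student's t, df = 6, two-tailed .05
lsd = T_CRIT * math.sqrt(2 * mse / len(wear))      # each brand mean is based on 3 cars
print(round(grand_mean, 2), round(mse, 4), round(lsd, 3), round(f_calc, 2))
print({b: round(m - grand_mean, 4) for b, m in brand_means.items()})
```

The F ratio can then be compared with the tabled F for 3 and 6 df, and each pairwise difference of brand means with the LSD.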
151.
Write out the sources of variation and the degrees of freedom for the following industrial experiment. Mention also the name of the design.
Three machines were used to produce parts made from four kinds of metal. Each machine made one part from each type of metal. The order with which the metals were assigned to the machines was established through a randomization procedure.
Answer:
Source of Variation     df
Total                   12
Mean                     1
Metals                   3
Machines                 2
Residual                 6
(Metal x Machine)
This is a randomized block experiment with metals playing the role of blocks.
152.
The Crapi Cable Company #35 cable has a mean breaking strength of 1800 pounds with a standard deviation of 100 pounds. A new material is used which, it is claimed, increases the breaking strength. To test this claim a random sample of 50 cables, manufactured with the new material, is tested. It is found that the sample has a mean breaking strength of 1850 pounds. Test this claim using ALPHA = .01.
Answer: Hypothetical population: All Crapi #35 cables made with the new material. Sample: The 50 cables randomly selected.
H(O): MU = 1800. The mean breaking strength of the new cable is 1800 lb. H(A): MU > 1800. The mean breaking strength of the new cable is more than 1800 lb.
MU(XBAR) = 1800 by H(O)
SIGMA(XBAR) = SIGMA/SQRT(n) = 100/SQRT(50) = 14.142
XBAR(crit) = MU(XBAR) + Z(crit)*SIGMA(XBAR) = 1800 + (2.33)*(14.142) = 1832.951
Since the sample mean breaking strength is 1850, which is greater than 1832.951, we must reject H(O) and conclude that the mean breaking strength of the new cable is significantly more than 1800 lb.
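The critical sample mean can also be computed with the normal quantile function in Python's standard library. This sketch uses the exact quantile (about 2.326) rather than the rounded table value 2.33, so its cutoff differs slightly from the hand calculation; the function name is illustrative.

```python
import math
from statistics import NormalDist


def upper_tail_cutoff(mu0, sigma, n, alpha):
    """Smallest sample mean that leads to rejecting H(O): MU = mu0 in favor of MU > mu0."""
    z_crit = NormalDist().inv_cdf(1 - alpha)    # upper-tail standard normal quantile
    return mu0 + z_crit * sigma / math.sqrt(n)


cutoff = upper_tail_cutoff(mu0=1800, sigma=100, n=50, alpha=0.01)
reject = 1850 > cutoff                          # observed sample mean is 1850
print(round(cutoff, 2), reject)
```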
153.
In the past a chemical fertilizer plant has produced an average of 1100 pounds of fertilizer per day. The record for the past year based on 256 operating days shows the following:
XBAR = 1060 lbs/day S = 320 lbs/day
where XBAR and S have the usual meaning. It is desired to test whether or not the average daily production has dropped significantly over the past year. Suppose that in this kind of operation, the traditionally acceptable level of significance has been .05. But the plant manager, in his report to his bosses, uses level of significance .01. Analyze the data at both levels after setting up appropriate hypotheses, and comment.
Answer: H(O): MU = 1100 H(A): MU < 1100
Since n = 256, use Z to approximate t.
S(XBAR) = 320/SQRT(256) = 320/16 = 20
Z(calculated) = (1060 - 1100)/20 = -40/20 = -2
Z(critical, ALPHA=.05, one-tailed) = -1.645
Z(critical, ALPHA=.01, one-tailed) = -2.33
Therefore, H(O) is rejected at ALPHA=.05 (since -2 < -1.645) but retained at ALPHA=.01 (since -2 > -2.33). It appears that the manager is trying to pull a fast one on his bosses by using ALPHA=.01 and saying production has not dropped. However, if the traditional level of significance is used, ALPHA=.05, there is evidence that indicates a drop in production.
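Another way to see why the two significance levels disagree is to compute the one-tailed P value, which falls between .01 and .05. A minimal sketch with an illustrative function name, using the normal approximation as in the answer:

```python
import math
from statistics import NormalDist


def lower_tail_z_test(xbar, mu0, s, n):
    """Return (Z, one-tailed P value) for H(A): MU < mu0, using Z to approximate t."""
    z = (xbar - mu0) / (s / math.sqrt(n))
    return z, NormalDist().cdf(z)       # area below Z under the standard normal curve


z_calc, p_value = lower_tail_z_test(xbar=1060, mu0=1100, s=320, n=256)
for alpha in (0.05, 0.01):
    decision = "reject H(O)" if p_value < alpha else "retain H(O)"
    print(f"ALPHA = {alpha}: Z = {z_calc:g}, P = {p_value:.4f}, {decision}")
```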
154.
The Pfft Light Bulb Company claims that the mean life of its 2 watt bulbs is 1300 hours. Suspecting that the claim is too high, Nalph Rader gathered a random sample of 64 bulbs and tested each. He found the average life to be 1295 hours with s = 20 hours. Test the company's claim using ALPHA = .01.
Answer: Hypothetical population: All Pfft 2 watt bulbs. Sample: The 64 randomly selected bulbs.
H(O): MU = 1300. The mean life of 2 watt bulbs is 1300 hours. H(A): MU < 1300. The mean life of 2 watt bulbs is less than 1300 hours.
MU(XBAR) = 1300 by H(O)
S(XBAR) = S/SQRT(n) = 20/8 = 2.5
XBAR(crit) = MU(XBAR) + Z(crit)*S(XBAR) = 1300 - 2.33*2.5 = 1294.18
Since 1295 is not less than 1294.18, we cannot reject H(O). There is not enough evidence to conclude that the mean life of the 2 watt bulbs is significantly less than 1300 hours.
155.
The Rickety Railroad Company claims that .5 of the trains on its Foggy Bottom branch run on time. An Interstate Commerce Commission investigator doubts the claim but is uncertain about whether the true fraction is less than or greater than the claim. A random sample of 64 trains was checked and he found that 21 were on time. Test the company's claim using ALPHA = .05.
Answer: Hypothetical population: All trains on the Foggy Bottom branch. Sample: The 64 randomly selected trains.
H(O): PI = .5 50% of the Foggy Bottom branch trains run on time. H(A): PI =/= .5 Other than 50% of the Foggy Bottom branch trains run on time.
We can use the normal approximation since:
n*PI = 64 * .5 = 32 > 5; and n*(1  PI) = 64 * (1  .5) = 32 > 5.
MU(p) = .5 by H(O)
SIGMA(p) = SQRT(PI*(1  PI)/n) = SQRT(.5*(1  .5)/64) = .0625
p(crit) = MU(p) +/- Z(crit)*SIGMA(p) = .5 +/- 1.96*.0625 = .6225, .3775
In this case, p = 21/64 = .328, which is less than .3775, so we reject H(O). Other than 50% of the Foggy Bottom branch trains run on time.
OR:
Z(calc) = (p - PI)/SIGMA(p) = ((21/64) - .5)/(.0625) = -2.75
Since -2.75 < -1.96 (Z crit), we reject H(O) and reach the same conclusion as above.
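The two-tailed test of the proportion can be sketched in code as well. The function name is illustrative, and the two-tailed P value is an addition that the original answer does not compute.

```python
import math
from statistics import NormalDist


def two_sided_proportion_test(successes, n, pi0):
    """Normal-approximation test of H(O): PI = pi0 against H(A): PI =/= pi0."""
    p = successes / n
    sigma_p = math.sqrt(pi0 * (1 - pi0) / n)      # standard error of p under H(O)
    z = (p - pi0) / sigma_p
    p_value = 2 * NormalDist().cdf(-abs(z))       # area in both tails
    return p, sigma_p, z, p_value


p, sigma_p, z_calc, p_value = two_sided_proportion_test(21, 64, 0.5)
print(round(p, 3), sigma_p, round(z_calc, 2), round(p_value, 4), p_value < 0.05)
```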
156.
Define the term "stratified sample" and explain why it would be useful in the following situation. A company is composed of many small plants located throughout the United States. A Vice President of the company wants to determine the opinions of the employees on the vacation policy.
Answer: A stratified sample is one which has been obtained by a procedure in which the frame is divided into nonoverlapping categories (strata). Sampling units are then selected at random from each stratum, thus assuring that all strata are represented in the sample. For the given problem, I would suggest a type of stratified sampling procedure. Specifically, I would recommend that each plant be considered a stratum and a random sample obtained within each stratum to ensure that all plants are represented in the sample. I would further suggest that the sample from each stratum represent the proportional size of that stratum. For example, if plant A employs 25% of the total company's employees, then the sample from plant A should represent 25% of the total sample obtained.
157.
A company wants to estimate with a degree of confidence of 0.95 and with an absolute error not greater than $4.00 the true mean dollar size of orders for a particular item. How large a sample should the company take from its very extensive records to meet this requirement, if SIGMA is assumed to equal $20.00?
Answer: (2 * SIGMA)/SQRT(n) = d (2 * 20)/SQRT(n) = 4
n = 100
Note: 1.96 is a more accurate estimate of the critical value, but complicates the computation.
158.
A manufacturer wishes to determine the average weight of a certain type of product in order to design the proper package. What size sample is required so that the risk of exceeding an error of .20 pounds is .010? (Note: Errors can be positive or negative.) Assume SIGMA is 1.10 pounds.
Answer: Using Z = (XBAR - MU)/(SIGMA/SQRT(n))
we get n = ((Z)(SIGMA)/(XBAR - MU))**2
n = ((2.576)(1.1)/.2)**2
n = 200.73
Therefore, a sample size of 201 is required.
159.
An aircraft parts manufacturer wishes to determine the average shearing strength of a certain type of weld in order to submit a bid for a contract to produce these parts. What size sample is required so that the risk of exceeding an error of 20 pounds or more is .005? Assume that SIGMA is 100 pounds.
Answer: Using a significance level = .005, Z = 2.576
n = (Z**2)(SIGMA**2)/(e**2) = (2.576**2)(100**2)/(20**2) = 165.9 == 166
160.
In a random sample of flashlight batteries, the average useful life was 22 hours and the sample standard deviation was 5 hours. How large should the sample size be if you want the mean of your sample to be within 1 hour of MU 99 times out of 100 in repeated sampling?
Answer: If significance level = .01, then Z = 2.576 SIGMA(HAT) = 5 hours Tolerable error = 1 hour
n = (Z**2)(SIGMA**2)/(error**2) = (2.576**2)(5**2)/(1**2) = 165.89 = 166
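The sample size formula n = (Z**2)(SIGMA**2)/(error**2) lends itself to a small helper. The sketch below is an illustration with a hypothetical function name; it takes the two-sided confidence level, looks up Z from the standard normal distribution, and always rounds up to the next whole unit.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, error, confidence):
    """Smallest n for which the sample mean is within `error` of MU with the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    return math.ceil((z * sigma / error) ** 2)


n_batteries = sample_size_for_mean(sigma=5, error=1, confidence=0.99)
n_packages = sample_size_for_mean(sigma=1.1, error=0.2, confidence=0.99)
print(n_batteries, n_packages)
```

Rounding up rather than to the nearest integer guarantees that the stated error bound is met.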
161.
A floor manager of a large department store is studying the buying habits of his customers. Suppose he has good reason to believe that an estimate of $600 for the population mean for the amount spent in his store each year is wrong. He makes preparation to draw a sample but lacks the funds to draw N=100 as he had planned. How large a sample need he draw in order to estimate the population mean within $100 of the true value with probability of 0.95? (Assume SIGMA = $500.)
Answer: n = [Z(ALPHA/2) * SIGMA/e]**2, where e = tolerable error
n = [(1.96) * 500/100]**2 = 96
162.
The variance of average family income in New York State is known to be about the same as it is in Delaware. The mean family income is to be estimated by a sample survey in each state. It is desired to have the sampling error equal in both states. If the recommended sample size for Delaware is 2500 families:
a) What size sample would you take in New York State? b) What statistical formula supports your answer to part (a)?
Answer: a) 2500
b) Variance of mean = SIGMA**2/n
Quantity estimated in each state is a mean. SIGMA is the same for each state, so equal sampling error can be achieved by using equal sample size.
163.
Should a sample survey be considered before a complete count (census)? If so, why? Be brief.
Answer: Yes, most of the time a sample survey should be considered before a census because it has the following advantages:
1) Greater accuracy advantage - it is a curious fact that the results from a carefully planned and well executed sample survey are expected to be more accurate than those from a complete census.
2) Cost advantage - if data are secured from a small fraction of the population, expenditures are smaller than if a complete count is attempted. Low cost permits expansion of the statistical program and expanded usefulness.
3) Time advantage - in many research problems, time is a critical factor. A sample of the data can be collected, coded, tabulated and analyzed more quickly than a complete count.
4) Destructive nature of the test - in order to make observations in some problems, particularly those dealing with manufactured products, the elementary units being observed must be destroyed or weakened. To test all units in the population would result in damage to all units.
164.
What are the major sources of uncertainty (error) in sample survey data? Describe them and give an example of each.
Answer: All data, whether obtained by a census or sample, are subject to various types of uncertainties. There are three types of uncertainties:
1. Structural Limitations - defects that are built into the survey procedures. The following are some examples:
a. Failure to obtain observations which would be useful;
b. Unclear or biased wording of the questionnaire;
c. Poor selection of the measuring or testing instrument;
d. Too large a gap between the frame and population;
e. Poor choice of survey data;
f. Incorrect usage of statistical formulas for calculating estimates.
The best control and method for avoiding structural limitation is achieved by detailed planning and design, through testing, review of the literature, and prior studies.
2. Operational Blemishes and Blunders - errors that originate in the execution of the work. The following are some examples:
a. Failure to ask some of the questions;
b. Asking questions not on the questionnaire;
c. Mistakes in reading the measuring instrument;
d. Nonresponse or refusal;
e. Keypunch errors.
The detection, control and measurement of operational blemishes and blunders can be achieved through a repetition or audit of the sample, thereby enabling evaluation of their impact on the estimates.
3. Random Variation - this is measured by the standard error of estimate. The first source of random variation is simply the variability (spread, dispersion) among the sampling units in the frame. The second source of random variation is from inherent, uncorrelated or nonpersistent, accidental variations of the cancelling nature that arise from inherent variability, perhaps on an hourly basis, of the investigators, supervisors, editors, coders, keypunchers, and other workers.
