1.

`Explain briefly how you would decide which of the following two events
is the more unusual:
a.  A 90 degree day in Vermont.
b.  A 100 degree day in Florida. `
`Answer: One would examine previous weather records and note the relative
frequency of 90+ degree days in Vermont and of 100+ degree days in
Florida.  Relative frequencies are one method of estimating the
probability of each event.  (Note: The event with the smaller relative
frequency is the more unusual.) `

2.

`The Getrich Tire Company is having a sale on tires salvaged from
a train wreck.  Of the 15 tires offered in the sale, five have
suffered internal damage and the remaining ten are damage free.  You
are planning on purchasing two of these tires.  In finding the
probability that the two tires selected at random from the 15 will be
damage free, the probability distribution to use is:
(a)  Normal                   (c)  Hypergeometric
(b)  Poisson                  (d)  Binomial `
`Answer: (c)  Hypergeometric
     Since two tires are selected without replacement from a small,
     finite lot, it is Hypergeometric. `
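The question asks only for the distribution, but the probability itself is easy to check. The following is a minimal sketch, assuming Python 3.8+ for `math.comb`; the helper name `hypergeom_pmf` is illustrative and not part of the original answer.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes in `draws` draws made without replacement)."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

# 15 tires, 10 of them damage free, 2 drawn at random.
p_both_good = hypergeom_pmf(2, population=15, successes=10, draws=2)
print(round(p_both_good, 4))  # 0.4286, i.e. 45/105 = 3/7
```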

3.

`As a seamstress you have observed that flaws in a certain type of
material occur on the average of 0.2 per yard.  The distribution
to find the probability of no more than one flaw occurring in a
dress requiring four yards of this material would be:
(a)  Normal                   (c)  Hypergeometric
(b)  Poisson                  (d)  Binomial `
`Answer: (b)  Poisson;  with LAMBDA = .2 per yard, so LAMBDA = 4(.2) = .8
     for the four yards in the dress.
     Any random phenomenon for which a count of some sort is of interest
     is a candidate for modeling by assuming a Poisson distribution. `
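As a rough numerical check, the probability of no more than one flaw can be computed directly. This sketch assumes the rate scales with length, so four yards at 0.2 flaws per yard gives a Poisson mean of 0.8; the function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events when the mean count is lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 0.2 * 4  # flaws per yard times yards in the dress
p_at_most_one = poisson_pmf(0, lam) + poisson_pmf(1, lam)
print(round(p_at_most_one, 4))  # 0.8088
```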

4.

`Suppose that a neutron passing through plutonium is equally likely to
release 1, 2, or 3 other neutrons, and suppose that these second
generation neutrons are in turn each equally likely to release 1, 2, or 3
third generation neutrons.  What is the probability distribution of
the number of third generation neutrons?  What is the mean of this
distribution? `
`Answer: To find the probability distribution use a tree diagram and count.
     N=n    |  1     2      3      4      5      6      7     8     9
     -------+-----------------------------------------------------------
     P(N=n) | 1/9   4/27  16/81  12/81  12/81  10/81  2/27  1/27  1/81
Mean = E(N) = (1)(1/9) + (2)(4/27) + ... + (9)(1/81) = 4 `
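The tree-diagram count can be reproduced by brute-force enumeration. The sketch below assumes each neutron releases 1, 2, or 3 neutrons with probability 1/3, independently, and uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

third = Fraction(1, 3)
dist = {}
for k in (1, 2, 3):                                # number of second-generation neutrons
    for offspring in product((1, 2, 3), repeat=k):  # releases of each of those k neutrons
        n = sum(offspring)                          # third-generation total on this branch
        # Branch probability: 1/3 for k, times (1/3)**k for the offspring pattern.
        dist[n] = dist.get(n, 0) + third * third**k

mean = sum(n * p for n, p in dist.items())
for n in sorted(dist):
    print(n, dist[n])   # fractions print in lowest terms, e.g. 12/81 shows as 4/27
print("mean =", mean)
```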

5.

`The lengths of Frank's 24 inch franks are normally distributed with
mean of 2 feet and variance of 0.03 square feet.  If you purchase 3 of
Frank's franks for your family, what is the probability that you will
have a total length of hot dogs in excess of 6.60 feet?
a.  .008    b.  .014    c.  .019    d.  .023    e.  .029 `
`Answer: d.  .023
Mean of the total = 3*(2) = 6
Variance of the total = 3*(.03) = .09
Z = (6.60 - 6)/SQRT(.09) = .60/.3 = 2
Area beyond Z = .0228 == .023 `

6.

`If the life of wild pheasants follows a normal distribution with a
mean of 9 months and a variance of 9, what percent of the population
will be less than 11 months of age?
    (Note that MU = 9 and SIGMA(X)**2 = 9.)
(a)  34.13                 (c)  74.86
(b)  84.13                 (d)  62.93 `
`Answer: (c)  74.86
     Z = (11 - 9)/3 = .67
     P(Z < .67) = (.2486) + (.5000)
                = .7486
                = 74.86% `

7.

`The distribution of lifetimes for a certain type of light
bulb is normally distributed with a mean of 1000 hours and a
standard deviation of 100 hours.  Find the 33rd percentile of
the distribution of lifetimes.
a.  560
b.  330
c.  1044
d.  1440
e.  none of these `
`Answer: e.  none of these
     P(Z < z) = .33
            z = -.44
         -.44 = (x - 1000)/(100)
            x = -44 + 1000
              = 956 `
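A table lookup like this one can be checked with the standard library's `statistics.NormalDist` (Python 3.8+). This is only a sketch of the same calculation; the variable names are illustrative.

```python
from statistics import NormalDist

lifetimes = NormalDist(mu=1000, sigma=100)

# The 33rd percentile is the lifetime x with P(X < x) = 0.33.
p33 = lifetimes.inv_cdf(0.33)
z = (p33 - 1000) / 100

print(round(z, 2), round(p33))  # -0.44 956
```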

8.

`In testing a new rifle, the new rifle and a standard rifle are fired a
large and equal number of times under similar conditions.  The new rifle
scored 53 hits while the old rifle scored 47 hits.  For a comparison
of the two rifles, the total number of hits by the two rifles may be
regarded as equivalent to 100 flips of a coin and a hit by the new rifle
as a head.  Consider the null hypothesis that the two rifles are equally
good (prob. of head = 1/2) against the alternative that the new rifle
is better.  Answer the following three questions without using the
continuity correction.
A.  The Z-value is
    1)  .5
    2)  .6
    3)  -.6
    4)  4.3
    5)  none of the above
B.  The significance level is
    1)  0.011
    2)  0.274
    3)  0.726
    4)  less than 0.001
    5)  none of the above
C.  The null hypothesis should
    1)  be rejected at 5% but not at 1% level
    2)  be rejected at 1% but not at 5% level
    3)  be rejected at either 5% or 1% level
    4)  not be rejected at either 1% or 5% level
    5)  not be accepted or rejected without further information `
`Answer: A.  (2)  .6
    MU = np = 100*.5 = 50
    SIGMA = SQRT(npq) = SQRT(100 * .5 * .5)
          = 5
    Z = (53 - 50)/5 = .6
B.  (2)  0.274    (one-tailed area beyond Z = .6)
C.  (4)  not be rejected at either 1% or 5% level. `
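The three parts can be reproduced in a few lines. This sketch uses the normal approximation to the binomial without a continuity correction, as the question directs; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p0, hits_new = 100, 0.5, 53        # 100 total hits treated as coin flips

mu = n * p0                           # expected hits for the new rifle under H(0)
sigma = sqrt(n * p0 * (1 - p0))       # standard deviation under H(0)
z = (hits_new - mu) / sigma           # no continuity correction

p_value = 1 - NormalDist().cdf(z)     # one-sided: the new rifle is better
print(round(z, 2), round(p_value, 3))   # 0.6 0.274
print(p_value < 0.05, p_value < 0.01)   # False False
```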

9.

`Molybdenum rods produced on a production line are supposed to average
2.2 inches in length.  It is desired to check whether the process is in
control.  Let X = length of such a rod.  Assume X is approximately
normally distributed with mean = MU and variance = SIGMA**2, where the
mean and the variance are unknown.
Suppose a sample of n = 400 rods is taken and yields a sample average
length of XBAR = 2 inches, and SUM((X - XBAR)**2) = 399.
To test H(0):  MU = 2.2 vs. H(1):  MU =/= 2.2 at level ALPHA = 8%, one
would use a _____ confidence interval for MU and hence a table value
of _____.
a)  92%, 1.67
b)  92%, 1.41
c)  92%, 1.75
d)  96%, 2.06
e)  96%, 1.75 `
`Answer: c)  92%, 1.75
    A two-sided test at ALPHA = .08 corresponds to a 1 - .08 = 92%
    confidence interval with .04 in each tail.  Since n = 400 is large,
    the Z table is used, and the Z value with area .04 beyond it is 1.75. `

10.

`A rod from a production line has length X where X is normally
distributed with mean = 2 and variance = 1/2.
Draw two rods X(1) and X(2) and place them end to end.  The sum
of their lengths is X(1) + X(2).
P(X(1) + X(2) < 3.6) = P(XBAR < 1.8) since (X(1) + X(2))/2 =
XBAR, for sample size n = 2.  Hence P(X(1) + X(2) < 3.6) is
expressible in Z terms as
a)  P(Z < -SQRT(2)/5)
b)  P(Z < -(2/5))
c)  P(Z < -(4/5))
d)  P(Z < -(1/5))
e)  P(Z < -(2*SQRT(2)/5)) `
`Answer: b)  P(Z < -(2/5))
    SIGMA(XBAR) = SQRT((SIGMA**2)/n)
                = SQRT((1/2)/2)
                = 1/2
    Z = (XBAR - MU)/(SIGMA(XBAR))
      = (1.8 - 2)/(1/2)
      = -(1/5)/(1/2)
      = -(2/5) `

11.

`Rods produced by G&R Company are normally distributed with a mean of 66
cm. and a standard deviation of 2 cm.  Rods are too long to be usable
if they are longer than 68.5 cm.  What proportion of these rods are too
long?
a)  0.1056      b)  0.1151      c)  0.3849      d)  0.3944
e)  None of the above are correct. `
`Answer: a)  0.1056
    Z = [X - MU]/[SIGMA]
      = [68.5 - 66]/[2]
      = 1.25
    Prob.(Z > 1.25) = .1056, i.e., 10.56% of the rods. `

12.

`A particular type of bolt is produced having diameters with mean 0.500
inches and standard deviation 0.005 inches.  Nuts are also produced
having inside diameters with mean 0.505 inches and standard deviation
0.005 inches.  If a nut and a bolt are chosen at random, what is the
probability that the bolt will fit inside the nut? `
`Answer: Mean for the distribution of differences (nut - bolt) = .505 - .500 = .005
Standard deviation = SQRT((.005)**2 + (.005)**2) = .007071
Z = (value of interest - mean of the distribution of differences) /
    (standard deviation of the distribution of differences)
Z = (0 - .005)/.007071 = -.71
We want all the area to the right of -.71, which is .7611 or 76%.
(This assumes the two diameters are independent and normally distributed.) `

13.

`It is known that the lengths of a particular manufactured item are
normally distributed with a mean of 6 and a standard deviation of
3.  If one item is selected at random, what is the probability that
it will fall between 5.7 and 7.5? `
`Answer: P(5.7 < Y < 7.5) = P((5.7-6)/3 < Z < (7.5-6)/3)
                 = P(-.1 < Z < .5)
                 = .0398 + .1915
                 = .2313 `

14.

`A company manufactures cylinders that have a mean diameter of 2
inches.  The standard deviation of the diameters of the cylinders
is .10 inches.  The diameters of a sample of 4 cylinders are
measured every hour.  The sample mean is used to decide
whether or not the manufacturing process is operating satisfactorily.
The following decision rule is applied:  If the mean diameter
for the sample of 4 cylinders is equal to 2.15 inches or more,
or equal to 1.85 inches or less, stop the process.  If the
mean diameter is more than 1.85 inches and less than 2.15
inches, leave the process alone.
a.  What is the probability of stopping the process if the
    process average, MU, remains at 2.00 inches?
b.  What is the probability of stopping the process if the
    process mean were to shift to MU = 2.10 inches?
c.  What is the probability of leaving the process alone if
    the process mean were to shift to MU = 2.15 inches?
    To MU = 2.30 inches? `
`Answer: a.  Z = (1.85 - 2.00)/(.10/SQRT(4))    or    (2.15 - 2.00)/(.10/SQRT(4))
      = -.15/.05                          +.15/.05
      = -3                                +3
P(Z < -3 or Z > +3) = .0013 + .0013
                    = .0026
b.  Z = (1.85 - 2.10)/.05    or    (2.15 - 2.10)/.05
      = -.25/.05                   .05/.05
      = -5                         1
P(Z < -5 or Z > 1) = .00000 + .1587
                   = .1587
c.  Using MU = 2.15:
Z = (1.85 - 2.15)/.05    or    (2.15 - 2.15)/.05
  = -.30/.05                   0/.05
  = -6                         0
P(-6 < Z < 0) = .5000
Using MU = 2.30:
Z = (1.85 - 2.30)/.05    or    (2.15 - 2.30)/.05
  = -.45/.05                   -.15/.05
  = -9                         -3
P(-9 < Z < -3) = .0013 `
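The same decision rule can be evaluated for any process mean with one small function. This is a sketch under the problem's assumptions (sample means normally distributed with standard error .10/SQRT(4) = .05); the name `p_stop` is illustrative. Unrounded tail areas give .0027 for part a, where the table values above give .0026.

```python
from math import sqrt
from statistics import NormalDist

def p_stop(mu, sigma=0.10, n=4, lower=1.85, upper=2.15):
    """Probability that the mean of n diameters falls at or outside the limits."""
    xbar = NormalDist(mu, sigma / sqrt(n))      # sampling distribution of the mean
    return xbar.cdf(lower) + (1 - xbar.cdf(upper))

print(round(p_stop(2.00), 4))        # part a: stop when the process is on target
print(round(p_stop(2.10), 4))        # part b: stop after a shift to 2.10
print(round(1 - p_stop(2.15), 4))    # part c: leave alone at MU = 2.15
print(round(1 - p_stop(2.30), 4))    # part c: leave alone at MU = 2.30
```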

15.

`A company manufactures rope.  From a large number of tests over a long
period of time, they have found a mean breaking strength of 300 lbs.
and a standard deviation of 24 lbs.  Assume that these values are
MU and SIGMA.
It is believed that by a newly developed process, the mean breaking
strength can be increased.
(a)  Design a decision rule for rejecting the old process with an
     ALPHA error of 0.01 if it is agreed to test 64 ropes.
(b)  Under the decision rule adopted in (a), what is the probability
     of accepting the old process when in fact the new process has
     increased the mean breaking strength to 310 lbs.?  Assume SIGMA
     is still 24 lbs.  Use a diagram to illustrate what you have done,
     i.e., draw the reference distributions. `
`Answer: a.  One tail test at ALPHA = .01, therefore Z = 2.33.
    Z = (YBAR-MU)/(SIGMA/SQRT(n))
    2.33 = (YBAR-300)/(24/SQRT(64))
    YBAR = 307
    Decision Rule:  If the mean strength of 64 ropes tested is 307
                    lbs. or more, we reject the hypothesis of no
                    improvement, i.e., we conclude that the new process
                    is better.
b.  If available, consult file of graphs and diagrams that could not
    be computerized for reference distributions.
    Z = (307-310)/(24/SQRT(64)) = -1.00
    Area below Z = -1.00 is 0.1587 or 15.87%
    P(type II error) = 0.1587 `
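The cutoff and the Type II error probability can be checked numerically. This is a sketch of the same one-tailed Z calculation; the variable names are illustrative. Using the unrounded critical value gives a slightly smaller Type II probability than the table-based .1587.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu1, sigma, n, alpha = 300, 310, 24, 64, 0.01
se = sigma / sqrt(n)                              # standard error of the mean = 3

# Part a: reject "no improvement" when the sample mean is at or above this cutoff.
cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
print(round(cutoff, 2))                           # 306.98, rounded to 307 in the answer

# Part b: P(sample mean falls below the cutoff | true mean is 310).
beta_exact = NormalDist(mu1, se).cdf(cutoff)
beta_rounded = NormalDist(mu1, se).cdf(307)       # with the rounded cutoff
print(round(beta_exact, 4), round(beta_rounded, 4))
```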

16.

`A certain kind of automobile battery is known to have a length of
life which is normally distributed with a mean of 1200 days and
standard deviation 100 days.  How long should the guarantee be if the
manufacturer wants to replace only 10% of the batteries which are
sold? `
`Answer: Z = -1.28 for 10 percent failure
-1.28 = (X - 1200)/100
X = 1072 days for guarantee `

17.

`It is known from past experience that when a certain type of farm
machine is used, the length of time it will run before needing an
overhaul is approximately normally distributed with MU=455 hours and
SIGMA=50 hours.  When running, the output is 100 bushels per hour.
a.  What is the probability that such a machine will process at least
    40,000 bushels before needing an overhaul?
b.  If a large number of such machines are put into service about 25%
    will be running after X hours.  Calculate X.
c.  If 25 machines are put into service, what is the probability that
    their AVERAGE life will be at least 445 hours? `
`Answer: a.  Z = (40,000-455*100)/(50*100) = -1.1
    prob. = .5 + .3643 = .8643
b.  prob. = .25      Z = .68
    X = 455 + 50*.68 = 489 hours
    (equivalently 455*100 + 50*100*.68 = 48,900 bushels)
c.  S = 50/SQRT(25) = 10
    Z = (445-455)/10 = -1
    prob. = .5 + .3413 = .8413 `

18.

`Suppose the hour life lengths, X(1) and X(2), of two brands of
electronic tubes, say T(1) and T(2), are:
          MU(1) = 100            MU(2) = 102
    SIGMA(1)**2 =  36      SIGMA(2)**2 =   9
a.  Find the value of K such that P(X(1) > K) = .93319.
b.  If a tube is needed for a 106 hour time period, which brand
    should be selected?  Why?
c.  If one tube is selected at random from brand T(1), find the
    probability that its life will exceed 100 hours.
d.  Find P(X(2)-X(1) > 0). `
`Answer: (The solution assumes X(1) and X(2) are independent and normally
distributed.)
a.  P(Z > z) = .9332 or P(Z < z) = .0668
    Therefore, z = -1.5
    Using the formula Z = (X - MU)/SIGMA:
        X = (Z*SIGMA) + MU
        K = (-1.5)(6) + 100
          = -9 + 100
          = 91
b.  Z = (106 - 100)/6        Z = (106 - 102)/3
      = 1                      = 1.33
    P(Z > 1) = .1587         P(Z > 1.33) = .0918
    T(1) should be selected because 15.87% of T(1) tubes last for
    106 hours or more, but only 9.18% of T(2) tubes last that long.
c.  Z = (100 - 100)/6
      = 0/6
      = 0
    P(Z > 0) = .5000
d.  Since the variances are known, the standard deviation of the
    difference between one tube of each brand equals:
    SIGMA(X(2) - X(1)) = SQRT[(SIGMA(2)**2) + (SIGMA(1)**2)]
                       = SQRT(9 + 36)
                       = 6.71
       MU(X(2) - X(1)) = 2
    Therefore, Z = (0 - 2)/6.71 = -.3
    P(Z > -.3) = .1179 + .5000 = .6179 or 61.79%. `

19.

`Suppose that you work for a brewery as a clerk to receive barley
shipments.  As part of your job you are to decide whether to keep
or return new shipments of barley.  The criterion used for making your
decision is an estimate of the moisture content of the shipment.
If the moisture level is too high (above 17.5%) the shipment has a
good possibility of rotting before use and, therefore, a loss of
money to the company.  You know from past experience that the variance
for all barley shipments is 36 and that your staff can process at the
most one sample of 9 moisture readings per shipment.
a.  Propose a rule for accepting and rejecting grain shipments on the
    basis of sample means where the null claim is a shipment has a
    mean moisture content of 17.5% or less (H(0):  MU <= 17.5%).
    Let the probability of Type I error be .10.
b.  When will you make incorrect decisions about a grain shipment
    having MU = 17.4?  What will be the probability of such an
    error?
c.  When will you make incorrect decisions about a grain shipment
    having MU = 19?  What will be the probability of such errors?
d.  When will you make incorrect decisions about a grain shipment
    having MU = 21?  What will be the probability of such errors? `
`Answer: SIGMA**2 = 36
Take a sample, n = 9
SIGMA(XBAR) = SIGMA/SQRT(n) = 6/3 = 2
a.  H(0):  MU <= 17.5
    H(1):  MU >  17.5
    ALPHA = .10 implies Z = 1.28
    Z = (XBAR - MU)/SIGMA(XBAR)
    1.28 = (XBAR - 17.5)/2
    2.56 = XBAR - 17.5
    XBAR = 20.06
    Reject H(0) when XBAR > 20.06.
b.  I am rejecting H(0) when XBAR > 20.06, so when MU is REALLY 17.4,
    I make incorrect decisions whenever XBAR > 20.06.
    Z = (20.06 - 17.4)/2
    Z = 1.33
    Area beyond Z = 1.33 is .0918.
    The probability of an incorrect decision is .0918.
c.  I am rejecting H(0) when XBAR > 20.06, so when MU is REALLY 19,
    I make incorrect decisions whenever XBAR <= 20.06.
    Z = (20.06 - 19)/2
    Z = .53
    Area between mean and Z = .2019.
    The probability of an incorrect decision is .5 + .2019 = .7019.
d.  I am rejecting H(0) when XBAR > 20.06, so when MU is REALLY 21,
    I make incorrect decisions whenever XBAR <= 20.06.
    Z = (20.06 - 21)/2
    Z = -.47
    Area beyond Z = -.47 is .3192.
    The probability of an incorrect decision is .3192. `

20.

`A man purchases 100 boxes of nails, each box containing 1000 nails.
If, on the average, one out of every 500 nails is rusty, how many of
the 100 boxes would you expect to contain less than 2 rusty nails?
a.  18      b.  27      c.  41      d.  49      e.  68 `
`Answer: c.  41
    f(X) = (LAMBDA**X)*(exp(-LAMBDA))/X!,  where LAMBDA = np = 1000*(1/500) = 2
    100*(f(0) + f(1)) = 100 * (exp(-2) + 2exp(-2))
                      = 13.5 + 27.1
                      = 40.6
                     == 41 `

21.

`Failures of electron tubes in airborne applications have been found to
follow closely the Poisson Distribution.  A receiver with sixteen tubes
suffers a tube-failure on the average of once every 50 hours of
operating time.  Find the probability of more than one failure on an
eight hour mission. `
`Answer: Using the Poisson distribution where:
    P(Y) = ((e**(-LAMBDA))*(LAMBDA**Y))/(Y!)
  LAMBDA = 8/50 = .16
P(Y > 1) = 1 - P(0) - P(1)
         = 1 - ((e**(-.16))*(.16**0))/0!
             - ((e**(-.16))*(.16**1))/1!
         = 1 - .8521 - .1363
         = .0115 `
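Because tail probabilities this small are easy to misplace by a decimal, a direct computation is a useful check. This sketch assumes a Poisson mean of 8/50 = .16 failures per mission; the variable names are illustrative.

```python
from math import exp

lam = 8 / 50                      # expected failures in an eight-hour mission

p0 = exp(-lam)                    # P(no failures)
p1 = lam * exp(-lam)              # P(exactly one failure)
p_more_than_one = 1 - p0 - p1

print(round(p0, 4), round(p1, 4), round(p_more_than_one, 4))  # 0.8521 0.1363 0.0115
```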

22.

`Suppose the weather forecaster is either right or wrong with his
daily forecast and that the probability he is wrong on any day is .4.
Assume his performance is to be evaluated on 18 randomly selected
days such that his performance is independent from day to day.  Let A
be the event that he is wrong on less than 5 of the days.
     a.  Find the exact value of P(A).
     b.  Find the approximate value of P(A), based on the Poisson
         approximation.
     c.  Is the approximation in (b.) valid?  Why or why not?
     d.  Find the approximate value of P(A), based on the Central Limit
         Theorem.  (Hint: SIGMA**2 = np(1 - p))
     e.  Is the approximation in (d.) valid?  Why or why not? `
`Answer: a.  P = P(4 wrong) + P(3 wrong) + P(2 wrong) + P(1 wrong) + P(0 wrong)
      = (18C4)(.4**4)(.6**14) + ... + (18C0)(.4**0)(.6**18)
      = .061 + .025 + .007 + .001 + .000
      = .094
b.  LAMBDA = np = (18)(.4) = 7.2
    P = (7.2**4)(e**-7.2)/4! + (7.2**3)(e**-7.2)/3! +
        (7.2**2)(e**-7.2)/2! + (7.2)(e**-7.2) + (e**-7.2)
      = .084 + .046 + .019 + .005 + .001
      = .155
c.  No, because n is too small and p is too large.
d.  p = .4
    SIGMA = SQRT(npq) = 2.08
    MU = np = 7.2
    4.5 in standard units is -1.30 = (4.5 - 7.2)/2.08.
    P(Z < -1.30) = .5 - .4032 = .0968
e.  Yes, because np and nq are greater than 5 (or p is not close to 0
    or 1 and n is at least moderate size). `
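The three values of P(A) can be compared side by side. This sketch computes the exact binomial sum, the Poisson approximation with LAMBDA = np, and the normal approximation with the continuity point 4.5 used in part d; the variable names are illustrative.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist

n, p = 18, 0.4   # days evaluated, probability of a wrong forecast

# a. Exact binomial probability of fewer than 5 wrong forecasts.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(5))

# b. Poisson approximation with the same mean.
lam = n * p
poisson = sum(lam**k * exp(-lam) / factorial(k) for k in range(5))

# d. Normal approximation, evaluated at 4.5.
normal = NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(4.5)

print(round(exact, 3), round(poisson, 3), round(normal, 3))  # 0.094 0.156 0.097
```

The Poisson figure is far from the exact one, which is consistent with part c; the normal figure is close, consistent with part e.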

23.

`The probability of a snow storm on any given day during January is equal
to P.
a)  What is the probability of at least one snow storm during January
    (the month has 31 days)?  Set this up in general since you have
    no values for P.
b)  If P = 1/10, what is the probability of exactly three storms during
    the period beginning with January 10 and ending with January 21?
    Set this up but do not evaluate.
c)  Use the normal approximation to evaluate the above probability in
    part b. `
`Answer: a)  P(at least 1 storm) = 1-P(no storm) = 1-Q**31
        where P = 1-Q
b)  The period covers n = 12 days.
    P(3) = (12C3)*(.1**3)*(.9**9) = .085
c)  p=.1    mean=np=1.2  SIGMA=SQRT(npq)=1.04
    P(3) == P(2.5 < X < 3.5)
    standard score = (2.5-1.2)/1.04 = 1.25
    standard score = (3.5-1.2)/1.04 = 2.21
    prob. = .4864 - .3944 = .092 `

24.

`Seventy five percent of the Ford autos made in 1976 are falling
apart.  Determine the probability distribution of the number of Fords
in a sample of 4 that are falling apart.  Draw a histogram of the
distribution.  What is the mean and variance of the distribution? `
`Answer: Let X = the number of Fords falling apart in a sample of four.
probability distribution: (binomial distribution with n=4 and p=.75)
        X   |   p(X)
     -------+----------
        0   |  0.0039     = (4C0)(.75**0)(.25**4)
        1   |  0.0469     = (4C1)(.75**1)(.25**3)
        2   |  0.2109     = (4C2)(.75**2)(.25**2)
        3   |  0.4219     = (4C3)(.75**3)(.25**1)
        4   |  0.3164     = (4C4)(.75**4)(.25**0)
(Histogram of P(X) against X = 0, 1, 2, 3, 4, with bar heights equal to
the probabilities tabulated above.)
mean = np = 4*.75 = 3
variance = npq = 4*.75*.25 = .75 `

25.

`The following results were obtained from life tests on miniature
bearings.  Each datum represents the hours to failure in a
particular turbine.

                          Time to Failure
             ----------------------------------------------------------
Turbine No.     1     2     3     4     5     6     7     8     9    10
-----------  ----  ----  ----  ----  ----  ----  ----  ----  ----  ----
Bearing       110   116   670   530   260   190   116   254   150    99
Runs          600  1130   525   242   336   414   300   213   769   140
              350    90         194   112    78   558               308
              280   123         108   330   930   320                41
               96                     690         925                92
              122                                                   260

a)  Plot a frequency histogram with cell interval of 50 hours.
    Does the distribution look normal?
b)  Plot a conventional % relative cumulative frequency curve on
    normal probability paper.  What do you conclude?
c)  Plot a similar curve on log-normal probability paper.  What
    do you conclude?
d)  Calculate the proper estimate of central tendency and dispersion. `
`Answer: LIFE TEST DATA
Time to Failure    Frequency    % Relative Cumulative Frequency
---------------    ---------    -------------------------------
       41              1                    2.5
       78              1                    5.0
       90              1                    7.5
       92              1                   10.0
       96              1                   12.5
       99              1                   15.0
      108              1                   17.5
      110              1                   20.0
      112              1                   22.5
      116              2                   27.5
      122              1                   30.0
      123              1                   32.5
      140              1                   35.0
      150              1                   37.5
      190              1                   40.0
      194              1                   42.5
      213              1                   45.0
      242              1                   47.5
      254              1                   50.0
      260              2                   55.0
      280              1                   57.5
      300              1                   60.0
      308              1                   62.5
      320              1                   65.0
      330              1                   67.5
      336              1                   70.0
      350              1                   72.5
      414              1                   75.0
      525              1                   77.5
      530              1                   80.0
      558              1                   82.5
      600              1                   85.0
      670              1                   87.5
      690              1                   90.0
      769              1                   92.5
      925              1                   95.0
      930              1                   97.5
     1130              1                  100.0
a)  No, the distribution does not look normal.
    (If available, consult file of graphs and diagrams that could not
    be computerized.)
b)  This plot does not form a straight line, and thus the data appear
    to be non-normal.  (If available, consult file of graphs and
    diagrams that could not be computerized.)
c)  This plot does form a straight line, and thus the log of the data
    appears to form a normal distribution.  (If available, consult file
    of graphs and diagrams that could not be computerized.)
d)  Arithmetic mean is 329.275 hours.
    Standard deviation is 269.96 hours.
    Arithmetic mean of logs is 2.382.  Its antilog, the geometric mean,
    is 241 hours.  As expected, the geometric mean is less than the
    arithmetic mean.  The standard deviation of the logs is 0.352; its
    antilog, 2.25, is the geometric standard deviation, a multiplicative
    factor rather than a number of hours. `
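The summary statistics in part d can be recomputed from the 40 failure times. This is a sketch using only the standard library; it assumes base-10 logarithms, which is consistent with a log mean of 2.382 corresponding to about 241 hours.

```python
from math import log10
from statistics import mean, stdev

hours = [110, 116, 670, 530, 260, 190, 116, 254, 150, 99,
         600, 1130, 525, 242, 336, 414, 300, 213, 769, 140,
         350, 90, 194, 112, 78, 558, 308,
         280, 123, 108, 330, 930, 320, 41,
         96, 690, 925, 92,
         122, 260]

arith_mean = mean(hours)
sample_sd = stdev(hours)                 # sample standard deviation (n - 1 divisor)

logs = [log10(h) for h in hours]
geo_mean = 10 ** mean(logs)              # antilog of the mean log
geo_sd_factor = 10 ** stdev(logs)        # antilog of the standard deviation of the logs

print(len(hours), arith_mean, round(sample_sd, 2))
print(round(geo_mean, 1), round(geo_sd_factor, 2))
```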

26.

`The strengths of elevator cables are to be measured.  Let X = strength
of a cable, and assume X is normal with mean MU and variance SIGMA**2,
both unknown.  A sample of 89 cables is taken, with results XBAR = 31
and S**2 = 89.
A 93% confidence interval for MU uses a table value closest to:
(a)  1.60   (b)  2.11   (c)  1.32   (d)  1.12   (e)  1.81 `
`Answer: (e)  1.81
     Use Z value because sample size is large, although t distribution
     would ordinarily be used when SIGMA**2 is unknown.  A 93% interval
     leaves .035 in each tail, and the Z value with area .035 beyond it
     is 1.81. `

27.

`Rods from a production line have a length X which is distributed
normally with a mean of 2 and a variance of 1/2.  Draw two rods
X(1), X(2) and place them end to end.  The sum of their lengths
is X(1) + X(2).
P[(X(1) + X(2)) < 3.6] = P(XBAR < 1.8) has a value closest to:
(a)  .1554            (d)  .3446
(b)  .2157            (e)  .7843
(c)  .2843 `
`Answer: (d)  .3446
     Z = (XBAR - MU)/SQRT(Variance/n)
     Z = (1.8 - 2)/SQRT(.5/2) = -.4
     Area beyond Z of -.4 = .3446
     Therefore, the probability that the sum of the two rods will
     be < 3.6 is .3446. `

28.

`You, as a manufacturer, can use a particular part only if its
diameter is between .14 and .20 inches.  Two companies, A and B, can
supply you with these parts at comparable costs.  Supplier A produces
parts whose mean is .17 inches and whose standard deviation is .015
inches.  However, supplier B produces parts whose mean is .16 inches and
whose standard deviation is .012 inches.  The diameters of the parts
from each company are normally distributed.  Which company should you
buy from and why? `
`Answer: For Supplier A:
     Z = (X - MU)/SIGMA
       = (.14 - .17)/.015
       = -2
and  Z = (.20 - .17)/.015
       = 2
Area between Z = 2 and Z = -2 under the normal curve is .9544.
Therefore, 95.44% of the parts would be within .14 in. and .20 in.
For Supplier B:
     Z = (.14 - .16)/.012
       = -1.67
and  Z = (.20 - .16)/.012
       = 3.33
Area between Z = 3.33 and Z = -1.67 under the normal curve is .9520.
Therefore, 95.20% of the parts would be within .14 in. and .20 in.
Conclusion:  I would choose Supplier A by a hair. `
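Since the two suppliers differ by only a fraction of a percent, it is worth confirming the comparison without table rounding. This sketch assumes normally distributed diameters as stated in the problem; the function name `usable_fraction` is illustrative.

```python
from statistics import NormalDist

def usable_fraction(mean, sd, low=0.14, high=0.20):
    """Fraction of parts whose diameter falls between the usable limits."""
    d = NormalDist(mean, sd)
    return d.cdf(high) - d.cdf(low)

frac_a = usable_fraction(0.17, 0.015)   # Supplier A
frac_b = usable_fraction(0.16, 0.012)   # Supplier B

print(round(frac_a, 4), round(frac_b, 4))
print("A" if frac_a > frac_b else "B")  # supplier with the larger usable fraction
```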

29.

`A lightbulb is selected randomly from a factory's monthly production.
The bulb's lifetime (total hours of illumination) is a random variable
with exponential density function
       f(x) = (1/MU)*(e**[-x/MU])   if x >= 0
            = 0                     if x < 0,
where the fixed parameter MU is the mean of this distribution (MU > 0).
a)  Derive the cumulative distribution function F(x).
    Show that a random lifetime X exceeds x hours (x > 0) with
    probability
                  P(X > x) = e**(-x/MU)
b)  Let M denote the smallest value in a random sample of n bulb
    lifetimes  X(1), X(2), ..., X(n).
    Show that P(M > x) = P(X(1) > nx).
    HINT:  M > x if and only if X(1) > x and X(2) > x and ...
           and X(n) > x.
c)  Assume the mean lifetime MU = 700 hours.
    Use a) and a table of the exponential function to evaluate
    numerically
    i)  the median lifetime x(.50),
    ii)  P(X <= 70),
    iii)  P(70 < X <= 700). `
`Answer: a)  For x >= 0,
    F(x) = INT(x/0)((1/MU)*(e**[-t/MU])dt)
         = -e**(-t/MU) evaluated from t = 0 to t = x
         = 1.0 - e**(-x/MU)
    F(x) = [ 0                     if x < 0
           [ 1.0 - e**(-x/MU)      if x >= 0
    Prob(X > x) = 1.0 - F(x)
                = 1.0 - [1.0 - e**(-x/MU)]
                = e**(-x/MU)
b)  By independence of the n lifetimes,
    Prob(M > x) = [Prob(X(1)>x)]*[Prob(X(2)>x)]*...*[Prob(X(n)>x)]
                = [e**(-x/MU)]**n
                = e**(-nx/MU)
                = Prob(X(1) > nx)
c)  i)  0.50 = Prob(X <= Median)
             = F(x)
             = 1.0 - e**(-x/700)
        0.50 = e**(-x/700)
        using a table of the exponential function
        x/700 == .693
        x == 485.1 hours
    ii)  Prob(X <= 70) = F(70)
                       = 1.0 - e**(-70/700)
                       = 1.0 - 0.90484
                       = 0.09516
    iii)  Prob(70 < X <= 700) = F(700) - F(70)
                              = [1.0 - e**(-700/700)] - [1.0 - e**(-70/700)]
                              = [1.0 - .36788] - [0.09516]
                              = 0.53696 `

30.

`Suppose that the duration of a storm on a tropical island is
exponentially distributed with mean value of THETA = 5 minutes.  What is
the probability that a storm on the island will last at least two
minutes more, given that it has already lasted for 5 minutes? `
`Answer: The distribution for the duration of a storm f(X) is:
f(X) = (1/5) * (e**(-X/5))     X > 0
     = 0                       elsewhere
P(storm will last at least 2 minutes more | it has lasted for 5 min.)
    = P(X >= 7 | X >= 5)
    = (INT(INFNTY/7)((1/5)(e**(-X/5))dX))/
      (INT(INFNTY/5)((1/5)(e**(-X/5))dX))
    = (e**(-7/5))/(e**(-1))
    = e**(-2/5)
    = .67032 `
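The result equals the unconditional probability that a storm lasts at least two minutes, which is the memoryless property of the exponential distribution. A short sketch of the check follows; the function name `survival` is illustrative.

```python
from math import exp

THETA = 5.0   # mean storm duration in minutes

def survival(t, theta=THETA):
    """P(X >= t) for an exponential duration with mean theta."""
    return exp(-t / theta)

p_conditional = survival(7) / survival(5)   # P(X >= 7 | X >= 5)
p_fresh = survival(2)                       # P(X >= 2) for a storm that has just begun

print(round(p_conditional, 5), round(p_fresh, 5))  # 0.67032 0.67032
```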

31.

`A lightbulb is selected randomly from a factory's monthly production.
The bulb's lifetime (total hours of illumination) is a random variable
with exponential density function
      f(x) = [ (1/MU)*(e**[-x/MU])  if x >= 0
             [ 0                    if x < 0,
where the fixed parameter MU is the mean of this distribution (MU>0).
a)  For an exponential distribution the standard deviation SIGMA = MU.
    Let XBAR = (1/n)(X(1)+X(2)+...+X(n)) denote the average value in
    a random sample of n bulb lifetimes.  Express E[XBAR] and VAR[XBAR]
    in terms of MU.  If the mean MU = 700 hours and sample size n = 100,
    then the statistic Z=(XBAR-700)/70 has approximately a normal
    distribution with what mean and variance?
b)  Describe a test of the null hypothesis H(0):  MU <= 700 against the
    alternative hypothesis H(1):  MU > 700, using only the sample mean
    XBAR.  If the desired significance level is ALPHA = .05 and sample
    size n = 100, then indicate which numerical values of XBAR
    correspond to this test rejecting H(0).
    (Use the table of the standard normal distribution.)
c)  If mean MU = 700 hours, then P(X > 2100) = .04979.  If instead
    MU > 700, is P(X > 2100) larger or smaller than .04979? `
`Answer: a)  E[XBAR] = E[(1/n)*(X(1)+X(2)+...+X(n))]
            = (1/n)*[E[X(1)]+E[X(2)]+...+E[X(n)]]
            = (1/n)*[n*E[X]]
            = E[X]
            = INT(INFNTY/0)(x*(1/MU)*(e**[-x/MU])dx)
              (Integrating by parts, with
                u = x       dv = (1/MU)(e**[-x/MU])dx
               du = dx       v = -e**[-x/MU])
            = -x*(e**[-x/MU]) evaluated from 0 to INFNTY
              - INT(INFNTY/0)(-e**[-x/MU]dx)
            = -MU*(e**[-x/MU]) evaluated from 0 to INFNTY
            = MU
    E[X**2] = INT(INFNTY/0)((x**2)*(1/MU)*(e**[-x/MU])dx)
              by parts with
                u = (x**2)   dv = (1/MU)(e**[-x/MU])dx
               du = 2x dx     v = -e**[-x/MU]
            = (x**2)*(-e**[-x/MU]) evaluated from 0 to INFNTY
              - INT(INFNTY/0)((2x)*(-e**[-x/MU])dx)
            = -2*INT(INFNTY/0)(x*(-e**[-x/MU])dx)
              by parts with
                u = x      dv = -e**[-x/MU]dx
               du = dx      v = MU*(e**[-x/MU])
            = -2*[x*MU*(e**[-x/MU]) evaluated from 0 to INFNTY
                  - INT(INFNTY/0)(MU*(e**[-x/MU])dx)]
            = -2(MU**2)*(e**[-x/MU]) evaluated from 0 to INFNTY
            = 2(MU**2)
    VAR[XBAR] = VAR[(1/n)*(X(1)+X(2)+...+X(n))]
              = [(1/n)**2]*[VAR[X(1)]+VAR[X(2)]+...+VAR[X(n)]]
              = [(1/n)**2]*[n*VAR[X]]
              = (1/n)*(VAR[X])
              = (1/n)*[E[X**2]-(E[X]**2)]
              = (1/n)*[2(MU**2)-(MU**2)]
              = (MU**2)/n
       Z = (XBAR-700)/70
    E[Z] = (E[XBAR]-700)/70
         = (MU-700)/70
         = (700-700)/70
         = 0/70
         = 0
    VAR[Z] = VAR[(XBAR-700)/70]
           = [(1/70)**2] * VAR[XBAR]
           = [(1/70)**2] * [(MU**2)/n]
           = [1/4900] * [(700**2)/100]
           = 1
b)  test statistic:  Z = [XBAR-700]/[700/SQRT(n)]
critical region:  Any value of Z(calc) that lies beyond the Z(crit)           which is found in the standard normal table with ALPHA per           cent of the distribution beyond it.    with n = 100 and ALPHA = .05, Z(crit) = 1.645    Thus in order to reject H(0),    [XBAR-700!/[700/SQRT(100)! >= 1.645    XBAR >= (1.645*70) + 700    XBAR >= 815.15c)  It can be shown that a random lifetime X exceeds x hours (X>0)    with probability    P(X > x) = e**(-x/MU)    Therefore,    P(X > 2100) = e**(-2100/700)                = e**(-3)    Now if MU > 700, the exponent of e becomes less and looking at a    table of the exponential function it is evident that the probability    becomes smaller. `
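
The arithmetic in parts b) and c) can be checked with a short Python sketch. The function names `xbar_critical` and `exp_tail` are illustrative only, and the alternative mean of 1000 hours is an arbitrary choice used to show the direction of the change.

```python
import math

def xbar_critical(mu0, n, z_crit):
    """Smallest sample mean that rejects H(0): MU <= mu0.

    Uses SIGMA = mu0, which holds for the exponential distribution.
    """
    return mu0 + z_crit * mu0 / math.sqrt(n)

def exp_tail(x, mu):
    """P(X > x) for an exponential lifetime with mean mu."""
    return math.exp(-x / mu)

print(xbar_critical(700, 100, 1.645))  # rejection threshold for XBAR
print(exp_tail(2100, 700))             # e**(-3)
print(exp_tail(2100, 1000))            # a larger mean gives a larger tail
```

Because `exp_tail` increases with `mu` for fixed `x`, any mean above 700 hours gives a tail probability above e**(-3).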

32.

`A lot containing 12 parts among which 3 are defective is put on sale
"as is" at \$10.00 per part with no inspection possible.  If a
defective part represents a complete loss of the \$10.00 to the buyer
and the good parts can be resold at \$14.50 each, is it worthwhile to
buy one of these parts and select it at random? `
`Answer: Expected return value of part = .75*(14.50) + .25*(0) = 10.875
Since this exceeds the \$10.00 price, you expect to gain approximately
\$.87 on each part you buy, and it is worthwhile to buy one selected
at random. `

33.

`Usually when we make use of a random numbers table we wish
to arrange things so that each event has an equal probability
of occurring.  If we were interested in locating 5 corn
trials in a region having 48 corn farms and we wanted each
farm to have an equal likelihood of being selected (in contrast
to the common practice of locating trials on the farms of the
growers most friendly to the local extension agent), describe
a method using the random numbers table that could be used
to make the selection.  Indicate the five farms selected
using your method. `
`Answer: To use a random numbers table one must do the following:
1.  Make up a rule for converting digits from the table into
sample identification numbers.  The rule used ordinarily should
make selections of each population item equally likely.  It
should also indicate if the same element can be counted more
than once.
2.  Find a starting point in the table in a manner that will
not always lead to the same starting point or a small set of
starting points.
3.  Translate the digits that follow the starting point into
sample identification numbers.
In this case we will use sampling without replacement, meaning
that a population element can appear only once in a sample.
It is also assumed that the I.D. numbers 1 to 48 have been
assigned to the farms.
a.  The rule for converting digits is:  beginning at the starting
point and going left to right, take a pair of digits and use
it if it is in the range 01 to 48; otherwise discard it (a repeat
of a farm already chosen is also discarded).
Continue this process until you get five farms.
b.  To arrive at the starting point, haphazardly put your
finger on a group of digits, use the first two digits (that fit
the table) to get a row number and the next two to get a column.
Using this process I get row 44 and column 04 as my starting
point.  Starting from there I get the following pairs:
76, 54, 91, 40, 69, 90, 67, 24, 56, 83, 50, 82, 94, 81, 13,
98, 42, 87, 88, 02
Therefore, the sample would contain the following farms:
40, 24, 13, 42, 2 `
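
The conversion rule in step a. can be sketched in Python as follows. The function name `select_ids` is hypothetical; the digit pairs are the ones read from the table in the answer, and repeats are skipped because sampling is without replacement.

```python
def select_ids(pairs, population_size, sample_size):
    """Keep digit pairs in 1..population_size, skipping repeats, until enough are found."""
    chosen = []
    for value in pairs:
        if 1 <= value <= population_size and value not in chosen:
            chosen.append(value)
        if len(chosen) == sample_size:
            break
    return chosen

# Digit pairs read from the table, starting at row 44, column 04.
pairs = [76, 54, 91, 40, 69, 90, 67, 24, 56, 83,
         50, 82, 94, 81, 13, 98, 42, 87, 88, 2]
print(select_ids(pairs, 48, 5))  # [40, 24, 13, 42, 2]
```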

34.

`Electron tubes made by two factories, A and B, are installed at random
in single tube units.  Thirty percent of the tubes are from factory B.
The probability that a factory B tube will fail in the first week of
operation is .1, and the probability that a factory A tube will fail is
.3.  If a particular unit fails in the first 100 hours of continuous
operation, what is the probability that it had a tube from factory A
installed?  From factory B? `
`Answer: a.  P(A|failure) = [P(A)*P(failure|A)]/
                   [P(A)*P(failure|A) + P(B)*P(failure|B)]
                 = (.7*.3)/[(.7*.3) + (.3*.1)]
                 = .21/.24
                 = .875 = 7/8
b.  P(B|failure) = 1 - P(A|failure)
                 = 1 - .875 = .125 = 1/8 `
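
As a check on the Bayes' rule calculation, here is a minimal Python sketch using exact fractions. The helper name `posterior` and the dictionary layout are illustrative assumptions, not part of the original answer.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Bayes' rule: P(source | failure) for each source."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(joint.values())  # P(failure)
    return {k: v / total for k, v in joint.items()}

priors = {"A": Fraction(7, 10), "B": Fraction(3, 10)}
fail = {"A": Fraction(3, 10), "B": Fraction(1, 10)}
post = posterior(priors, fail)
print(post["A"], post["B"])  # 7/8 1/8
```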

35.

`Suppose that two of the six spark plugs on a six-cylinder automobile
engine require replacement.  If the mechanic removes two plugs at
random, what is the probability that he will select the two defective
plugs?  At least one of the two defective plugs? `
`Answer: a.  prob = 1/(6C2) = 1/15
b.  prob = 1 - (4C2)/(6C2) = 1 - 6/15 = 9/15 = 3/5 `
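
Both counts can be verified with `math.comb`; this is only a sketch, and the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# a. both removed plugs are the two defective ones
both = Fraction(comb(2, 2) * comb(4, 0), comb(6, 2))

# b. complement of "both removed plugs are good"
at_least_one = 1 - Fraction(comb(4, 2), comb(6, 2))

print(both, at_least_one)  # 1/15 3/5
```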

36.

`A certain assembly consists of two sections, A and B, which are bolted
together.  In a bin of 100 assemblies, 12 have only section A defective,
10 have only section B defective, and 2 have both section A and section
B defective.  What is the probability of choosing, without replacement,
2 assemblies from the bin which have neither section A nor section B
defective?
a.  (76)**2/(100)**2
b.  (98)**2/(100)**2
c.  98(97)/[100(99)]
d.  76(75)/[100(99)]
e.  none of these `
`Answer: d.  76(75)/[100(99)]
    # of assemblies with no defective section = 100 - (12 + 10 + 2)
                                               = 100 - 24 = 76
    P(no defectives in 2 draws) = (76/100)*(75/99) `

37.

`Suppose that the probability is 0.1 that the weather (being either
sunshine or rain) does not change from one day to the next.  The sun is
shining today.  What is the probability that it will rain the day after
tomorrow? `
`Answer: S(today) - R(tomorrow) - R(day after)
P(SRR) = (.9)(.1)
       = .09
S(today) - S(tomorrow) - R(day after)
P(SSR) = (.1)(.9)
       = .09
P(rain day after tomorrow) = .09 + .09 = .18 `

38.

`     -------------     -----------------     ---------------------
      |   |   |   |     |   |   |   |   |     |   |   |   |   |   |
      -------------     -----------------     ---------------------
      |   |   |   |     |   |   |   |   |     |   |   |   |   |   |
      -------------     -----------------     ---------------------
      |   |   |   |     |   |   |   |   |     |   |   |   |   |   |
      -------------     -----------------     ---------------------
            I                   II                      III
Three fields have been divided into plots as in the above figure.
Define an edge plot as one that is on the outside of the field (this
includes corner plots).
A farmer selects at random one plot from each field, with no relation
between choices from different fields.  What is the probability he
ends up with an edge plot from I and II but not from III?  Give
answer as simplified fraction. `
`Answer: P(edge 1) = 8/9
P(edge 2) = 10/12 = 5/6
P(not edge 3) = 3/15 = 1/5
P(edge 1, edge 2, not edge 3) = (8/9)(5/6)(1/5) = 4/27 `

39.

`Suppose that the probability is 0.7 that the weather (sunshine or rain)
is different for any given day than it was on the preceding day.  If it
is a sunshine day today, what is the probability that it will be
raining the day after tomorrow? `
`Answer: PROB = P(sun, sun, rain) + P(sun, rain, rain)
     = (.3)(.7) + (.7)(.3)
     = .42 `
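
The two-path calculation works for any probability of change. A minimal Python sketch, assuming a two-state chain in which the chance of a change is the same on every day (the function name `rain_after_two_days` is illustrative):

```python
def rain_after_two_days(p_change):
    """P(rain two days from now | sun today) when the weather changes with probability p_change."""
    p_stay = 1.0 - p_change
    sun_sun_rain = p_stay * p_change   # stay sunny, then change
    sun_rain_rain = p_change * p_stay  # change to rain, then stay
    return sun_sun_rain + sun_rain_rain

print(round(rain_after_two_days(0.7), 2))  # 0.42
print(round(rain_after_two_days(0.9), 2))  # 0.18
```

The second call corresponds to a 0.1 probability of no change, as in the earlier weather item.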

40.

`The following table gives data which have been rounded from an
actual federal report on the subject.
  DISTRIBUTION OF ADMINISTRATORS FOR NURSING AND PERSONAL CARE HOMES,
      BY LENGTH OF TOTAL WORK EXPERIENCE AND SIZE OF THE HOME,
    UNITED STATES, EXCLUDING ALASKA AND HAWAII, JUNE-AUGUST 1969.
-----------------------------------------------------------------------
          Length of   ^ Under ^       ^       ^       ^       ^       ^
Size      total work  ^   1   ^  1-4  ^  5-9  ^ 10-19 ^  20+  ^ Total ^
of home   experience  ^ year  ^ years ^ years ^ years ^ years ^       ^
______________________^_______^_______^_______^_______^_______^_______^
Under 25 beds         ^  200  ^  600  ^  850  ^ 1100  ^  550  ^ 3300  ^
25-49 beds            ^  200  ^  750  ^  500  ^  600  ^  350  ^ 2400  ^
50-99 beds            ^  250  ^  700  ^  550  ^  450  ^  250  ^ 2200  ^
100-299 beds          ^  100  ^  300  ^  250  ^  200  ^  150  ^ 1000  ^
300 beds and over     ^    0  ^   20  ^   30  ^   30  ^   20  ^  100  ^
______________________^_______^_______^_______^_______^_______^_______^
Total                 ^  750  ^ 2370  ^ 2180  ^ 2380  ^ 1320  ^ 9000  ^
______________________^_______^_______^_______^_______^_______^_______^
If administrator's experience were independent of size of home, find:
A. the probability that an administrator chosen at random is
   administering a home with 25 beds or more, given that he/she has
   at least 10 years experience.
B. the probability that (for an administrator-home pair selected at
   random) the home will have 99 or fewer beds and the administrator
   will have a work experience of 1 to 9 years inclusive.
C. the probability of an administrator with experience of 1 to 9 years
   with a nursing home of 300 beds or over.
D. the number of administrators you would expect to have from 1 to 9
   years experience and work in a home with 99 or fewer beds.
E. the number of administrators with 1 to 9 years experience in a home
   of 300 or over beds. `
`Answer: A. (Note: If events A and B are independent, then
                 P(A|B) = P(A).)
  P(25 beds+ | 10 yrs.+) = P(25 beds+)
                         = (2400+2200+1000+100)/9000
                         = 57/90
                         = .633
B. P(99 or fewer beds INTERSECTION 1-9 yrs)
                         = [(2370 + 2180)/9000] * [(3300 +
                           2400 + 2200)/9000]
                         = [4550/9000] * [7900/9000]
                         = 0.4438
C. P(300+ beds INTERSECTION 1-9 yrs)
                         = [(2370 + 2180)/9000] * [100/9000]
                         = .0056
D.  Using the probability found in part B.
    Expected number = 9000*.4438 = 3994.2
E.  Using the probability found in part C.
    Expected number = 9000*.0056 = 50.4 `
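
Parts D and E are expected cell counts under independence, (row total)(column total)/(grand total). A short Python sketch, with illustrative variable names, computes them without intermediate rounding:

```python
def expected_count(row_total, col_total, grand_total):
    """Expected cell count when row and column classifications are independent."""
    return row_total * col_total / grand_total

grand_total = 9000
experience_1_to_9 = 2370 + 2180        # column totals for 1-4 and 5-9 years
beds_99_or_fewer = 3300 + 2400 + 2200  # row totals for the three smallest sizes
beds_300_plus = 100

part_d = expected_count(experience_1_to_9, beds_99_or_fewer, grand_total)
part_e = expected_count(experience_1_to_9, beds_300_plus, grand_total)
print(round(part_d, 1), round(part_e, 1))
```

The results differ slightly from 3994.2 and 50.4 because the answer rounds the probabilities to four decimal places before multiplying by 9000.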

41.

`A population of 160 communities is arranged according to death
rate and air pollution level as follows:
                   AIR POLLUTION LEVEL
                   Low   Medium   High
                   -------------------
              Low    2      6      8  ^  16
    DEATH          -------------------
           Medium   14     42     56  ^ 112
     RATE          -------------------
             High    4     12     16  ^  32
                   -------------------
                    20     60     80  ^ 160
How many communities would you expect to have a low death rate and a
high air pollution level if death rate and air pollution level are
independent (i.e. are not associated)?
a.  (16*80)/(160**2)    b.  (20*32)/(160**2)    c.  8/160
d.  4                   e.  none of these `
`Answer: e.  none of these
    P(high pollution) = 80/160 = 1/2  = .5
    P(low death rate) = 16/160 = 1/10 = .1
    P(both) = (.5)*(.1) = .05
    Number of communities with low death rate and high pollution =
    (.05)*(160) = 8
    (Choices a and c are probabilities, not counts.) `

42.

`A population of 160 communities is arranged according to death
rate and air pollution level as follows:
                   AIR POLLUTION LEVEL
                   Low   Medium   High
                   -------------------
              Low    2      6      8  ^  16
    DEATH          -------------------
           Medium   14     42     56  ^ 112
     RATE          -------------------
             High    4     12     16  ^  32
                   -------------------
                    20     60     80  ^ 160
This is the entire population not a sample.  In view of this --
Which of the following statements is correct about the two events
"low death rate" and "high air pollution level":
a.  they are independent
b.  they are mutually exclusive
c.  they are exhaustive
d.  they are opposite
e.  none of these `
`Answer: a.  they are independent    P(high pollution)*P(low death) = .5 * .1 = .05    8/160 = .05 `

43.

`A random sample of 160 communities is distributed according to death
rate and air pollution level as follows:
                   AIR POLLUTION LEVEL
                   Low   Medium   High
                   -------------------
              Low    2      6      8  ^  16
    DEATH          -------------------
           Medium   14     42     56  ^ 112
     RATE          -------------------
             High    4     12     16  ^  32
                   -------------------
                    20     60     80  ^ 160
Which of the following statements is correct?
a.  There is no evidence that air pollution level and death rate are
    related.
b.  Death rate and air pollution level are dependent variables.
c.  The CHISQUARE index for the table is very large.
d.  The probability that a randomly selected community will have a high
    death rate will vary as the air pollution level of the community
    varies.
e.  None of these. `
`Answer: a.  There is no evidence that air pollution level and death rate are
    related.
    Every observed count equals its expected count under independence
    (for example, 16*80/160 = 8), so CHISQUARE(calculated) = 0 and we
    continue (do not reject) H(O) of independence. `

44.

`A special steel alloy has an average tensile strength of 25,800 psi.
The numerical value of the variance is 1,500,000.  The units
associated with this variance would be:
(a)  (psi)**2                  (c)  SQRT(psi)
(b)  psi                       (d)  unknown `
`Answer: (a)  (psi)**2 `

45.

`The life in months of service before failure of the color television
picture tube in 8 television sets manufactured by Firm A and 8 sets
manufactured by Firm B are as follows (arranged according to size):
     Firm A:  25,  29,  31,  32,  35,  37,  39,  40
     Firm B:  34,  36,  41,  43,  44,  45,  47,  48
Let ETA(A) and ETA(B) denote the median service life of picture tubes
produced by the 2 firms.  A confidence interval for ETA(B) - ETA(A) is
bounded by the dth smallest and the dth largest of all differences of
B- and A-observations.  For confidence coefficient .99, we take d
equal to:
     (a)  9     (b)  14     (c)  15      (d)  17 `
`Answer: (a)  9 `

46.

`A Physicist comes to you (as Associate Professor of Statistics) for
your help.  He has a special electronic circuit consisting of Anodes
and Diodes linked together in a special way.  This is an experimental
piece of equipment and for the 4 components (i.e. Anodes and Diodes)
he knows the following probability distribution from previous
experiments.
  ____________________________________________________________
 ^  Anodes or Diodes failing per circuit ^ 0 ^ 1 ^ 2 ^ 3 ^  4 ^
 ^---------------------------------------^---^---^---^---^----^
 ^  Probability that many fail           ^0.1^0.2^0.3^0.3^ 0.1^
  ------------------------------------------------------------
He creates a special circuit consisting of the same Anodes and Diodes
and wonders whether the distribution has remained the same.  He uses
500 of the circuits and counts the number of Diodes and Anodes that
fail.  He finds the following:
        ___________________________________________________
       ^  Anodes or Diodes failing    ^ 0 ^ 1 ^ 2 ^ 3 ^ 4  ^
       ^------------------------------^---^---^---^---^----^
       ^  Frequency                   ^ 50^105^145^155^ 45 ^
        ---------------------------------------------------
Could the Physicist conclude the new circuit had the same
characteristics as the previous circuits?  Be careful to state the
Level of Significance used. `
`Answer: Here we need to see how well the data we have fits the theoretical
(past) distribution.  This is a Chi square goodness of fit problem.
    H(O):  The data fits the past distribution.
    H(A):  The data does not fit the past distribution.
CHISQ = SUM([(O-E)**2]/[E])
      where  O = observed value
             E = expected value = probability*total
             df= k-m-1,
             k = no. of categories,
             m = no. of estimated parameters
Table:
 No. failing ^ O ^ E ^ (O-E) ^ [(O-E)**2] ^ [(O-E)**2]/[E]
-------------^---^---^-------^------------^------------------
      0      ^ 50^ 50^   0   ^      0     ^       0.000
      1      ^105^100^   5   ^     25     ^       0.250
      2      ^145^150^  -5   ^     25     ^       0.167
      3      ^155^150^   5   ^     25     ^       0.167
      4      ^ 45^ 50^  -5   ^     25     ^       0.500
-------------^---^---^-------^------------^------------------
    TOTAL    ^500^500^   0   ^    100     ^       1.084
  CHISQ (calc.) = 1.084
  From tables:
     CHISQ(crit., df=4, ALPHA=.05) = 9.49
     CHISQ(crit., df=4, ALPHA=.10) = 7.78
  Since CHISQ(calc.) < CHISQ(crit.), we shall continue (not reject)
  H(O) with ALPHA = 10%.  There is no evidence that the characteristics
  of the new circuit differ from those of the previous circuits. `
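
The goodness-of-fit statistic above can be reproduced with a few lines of Python. This is a sketch: the function name `chi_square_gof` is illustrative, and it assumes no parameters are estimated from the data (m = 0), so df = k - 1.

```python
def chi_square_gof(observed, probabilities):
    """Chi-square goodness-of-fit statistic and degrees of freedom (m = 0)."""
    total = sum(observed)
    expected = [p * total for p in probabilities]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df

stat, df = chi_square_gof([50, 105, 145, 155, 45], [0.1, 0.2, 0.3, 0.3, 0.1])
print(round(stat, 3), df)
```

The unrounded statistic is about 1.083; the table total of 1.084 comes from summing terms already rounded to three decimals.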

47.

`The works known to be written by a famous author have been thoroughly
analyzed as to sentence length.  A newly found manuscript is claimed
to have been written by the same author.  The data below are taken
from a sample of 2000 sentences in this new manuscript.  Use
CHISQUARE to decide whether the new manuscript is by the same author.
                         proportion of sentences
                         _______________________
no. words in
sentence             known author       new manuscript
____________         _________________________________
  3 or less             .010                 .007
  4-5                   .030                 .024
  6-8                   .041                 .031
  9-12                  .102                 .034
 13-16                  .263                 .250
 17-20                  .279                 .203
 21-24                  .118                 .198
 25-27                  .105                 .156
 28-29                  .042                 .081
 30 or more             .010                 .016 `
`Answer: Number of sentences
     -------------------
O  14  48  62  68  500  406  396  312  162  32
E  20  60  82 204  526  558  236  210   84  20
(O = 2000 * new manuscript proportion, E = 2000 * known author proportion)
CHISQ = (14-20)**2/20 + (48-60)**2/60 + (62-82)**2/82 +
        (68-204)**2/204 + (500-526)**2/526 + (406-558)**2/558
        + (396-236)**2/236 + (312-210)**2/210 + (162-84)**2/84
        + (32-20)**2/20
      = 380.081
d.f. = (k-1) = 9
P(CHISQ(9) >= 380.081) < .001
Reject H(0) that the new manuscript is by the same author at ALPHA = .10,
.05 or .01. `

48.

`On the basis of the data presented below, do we have reason to believe
the geneticist who says offspring with characteristics A, B, C, and D
should occur with relative frequency 1:2:4:8 in the experiment?  Use
ALPHA = .005.
    Characteristic    A    B    C    D
    Number           28   60  208  304 `
`Answer: There are a total of 600 offspring.  If the geneticist is correct, then
(1/(1+2+4+8))(600) = (1/15)(600) = 40 offspring are expected to have
Characteristic A.  We can compute other expected values as follows.
Characteristic            A      B      C      D
Expected Number          40     80    160    320
Observed Number          28     60    208    304
Difference (O(i)-E(i))  -12    -20     48    -16
These differences look large, which suggests that the
geneticist may be wrong.  A statistical test can be made
by computing:
    W = SUM(i=1,4)(([O(i)-E(i)]**2)/E(i)),
which is the CHISQUARE test with 3 degrees of freedom.
    W = ((-12)**2)/40 + ((-20)**2)/80 + (48**2)/160 + ((-16)**2)/320
      = 3.6 + 5.0 + 14.4 + 0.8
      = 23.8
CHISQUARE(critical, ALPHA=.005, df=3) = 12.8381
Since 23.8 > 12.8381, reject the geneticist's claim. `

49.

`A random sample of 160 communities is distributed according to death
rate and air pollution level as follows:
                   AIR POLLUTION LEVEL
                   Low   Medium   High
                   -------------------
              Low    2      6      8  ^  16
    DEATH          -------------------
           Medium   14     42     56  ^ 112
     RATE          -------------------
             High    4     12     16  ^  32
                   -------------------
                    20     60     80  ^ 160
What would you estimate from the above table to be the probability that
a randomly sampled community will have a low death rate and a high air
pollution level?
a.  (16*80)/160      b.  (20*32)/(160**2)      c.  8/160
d.  4/160            e.  none of these `
`Answer: c.  8/160 `

50.

`The following table gives data which have been rounded from an
actual federal report on the subject.
  DISTRIBUTION OF ADMINISTRATORS FOR NURSING AND PERSONAL CARE HOMES,
      BY LENGTH OF TOTAL WORK EXPERIENCE AND SIZE OF THE HOME,
    UNITED STATES, EXCLUDING ALASKA AND HAWAII, JUNE-AUGUST 1969.
-----------------------------------------------------------------------
          Length of   ^ Under ^       ^       ^       ^       ^       ^
Size      total work  ^   1   ^  1-4  ^  5-9  ^ 10-19 ^  20+  ^ Total ^
of home   experience  ^ year  ^ years ^ years ^ years ^ years ^       ^
______________________^_______^_______^_______^_______^_______^_______^
Under 25 beds         ^  200  ^  600  ^  850  ^ 1100  ^  550  ^ 3300  ^
25-49 beds            ^  200  ^  750  ^  500  ^  600  ^  350  ^ 2400  ^
50-99 beds            ^  250  ^  700  ^  550  ^  450  ^  250  ^ 2200  ^
100-299 beds          ^  100  ^  300  ^  250  ^  200  ^  150  ^ 1000  ^
300 beds and over     ^    0  ^   20  ^   30  ^   30  ^   20  ^  100  ^
______________________^_______^_______^_______^_______^_______^_______^
Total                 ^  750  ^ 2370  ^ 2180  ^ 2380  ^ 1320  ^ 9000  ^
______________________^_______^_______^_______^_______^_______^_______^
Decide whether work experience is associated with size of home in which
the administrator works.  Explain your decision. `
`Answer: Using the CHI SQUARE statistic to test the hypotheses:
        H(0): Work experience and size of home are independent.
        H(A): Work experience and size of home are dependent.
CHI SQUARE = SUM[((O-E)**2)/E]
           = 344.73
   where each E = (row total)(column total)/9000
CHI SQUARE (d.f. = (5-1)(5-1) = 16, ALPHA = .05) = 26.30
Since 344.73 > 26.30, we have very strong evidence to reject the null
hypothesis at the .05 ALPHA level and conclude that work experience is
associated with size of home. `
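
The answer quotes the statistic without showing the 25 cell computations. A minimal Python sketch of the test of independence follows; the function name `chi_square_independence` is illustrative, and the counts are copied from the table in the question.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Rows: size of home; columns: under 1, 1-4, 5-9, 10-19, 20+ years.
table = [
    [200, 600, 850, 1100, 550],
    [200, 750, 500, 600, 350],
    [250, 700, 550, 450, 250],
    [100, 300, 250, 200, 150],
    [0, 20, 30, 30, 20],
]
chi2, df = chi_square_independence(table)
print(round(chi2, 2), df)
```

The computed value should agree with the quoted 344.73 up to rounding, and it is far above the critical value for 16 degrees of freedom.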

51.

`         Brand of Tire
     A     B     C     D     E
    ---   ---   ---   ---   ---
    151   157   135   147   146
    143   158   146   174   171
    159   150   142   179   167
    152   142   129   163   145
    156   140   139   148   147
                      165   166
The data in the above table give stopping distance for five brands of
tires.  You want to test the hypothesis that brands D and E do not
differ with respect to stopping ability.  This hypothesis can be tested
using
a.  either the sign test or the Wilcoxon signed rank test.
b.  the sign test, but not the Wilcoxon signed rank test.
c.  either the median test or the Wilcoxon two-sample test.
d.  the median test, but not the Wilcoxon two-sample test.
e.  the test of homogeneity. `
`Answer: c.  either the median test or the Wilcoxon two-sample test. `

52.

`The life in months of service before failure of the color television
picture tube in 8 television sets manufactured by Firm B are as follows
(arranged according to size):
     Firm B:  34,  36,  41,  43,  44,  45,  47,  48
Let ETA(B) denote the median service life of picture tubes produced by
the firm.  To test the hypothesis ETA(B) = 38.5 against the alternative
ETA(B) =/= 38.5, the value of CHISQ(calculated) for the median test
equals:
     (a)  8     (b)  6     (c)  4      (d)  2 `
`Answer: (d)  2
              ^ Above 38.5 ^ Below 38.5 ^
     ------------------------------------
     observed ^      6     ^     2      ^
     ------------------------------------
     expected ^      4     ^     4      ^
     ------------------------------------
CHISQ = [[(6 - 4)**2] + [(2 - 4)**2]]/4 = 2 `

53.

`Five cars are entered in a race:
     starting order:    1    2    3    4    5
    finishing order:    2    1    4    3    5
The Kendall rank correlation coefficient between starting order and
finishing order equals
a.  -.4     b.  -.2     c.  .6     d.  .2     e.  .4 `
`Answer: c.  .6
                       N(C)                    N(D)
    X    Y    (# concordant pairs)    (# discordant pairs)
    1    2              3                       1
    2    1              3                       0
    3    4              1                       1
    4    3              1                       0
    5    5              0                       0
                       --                      --
                        8                       2
    (Each row counts pairs formed with the rows below it.)
    number of observations = n = 5
    T = [N(C) - N(D)]/[n(n-1)/2]
      = [8 - 2] / [5(4)/2]
      = .6 `
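
The pair counting can be sketched in Python as follows. The function name `kendall_tau` is illustrative, and the sketch assumes there are no ties, as in this example.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall rank correlation for untied data: (N(C) - N(D)) / (n(n-1)/2)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

start = [1, 2, 3, 4, 5]
finish = [2, 1, 4, 3, 5]
print(kendall_tau(start, finish))  # 0.6
```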

54.

`The observed life, in months of service, before failure for the color
television picture tube in 8 television sets manufactured by Firm B are
as follows (arranged according to size):
Firm B:    34   36   41   43   44   45   47   48
Let ETA(B) denote the median service life of picture tubes produced by
the firm.
The point estimate of ETA(B) equals:
a.  35        b.  43.5        c.  44        d.  33.5 `
`Answer: b.  43.5
    n = 8 is even, so the median is taken as the average of the two
    middle values.
    Median = (43 + 44)/2 = 43.5
    (Any number between 43 and 44 also qualifies as a median.) `

55.

`The life in months of service before failure of the color television
picture tubes in 8 television sets manufactured by Firm A and 8 sets
manufactured by Firm B are as follows (arranged according to size):
Firm A:    25   29   31   32   35   37   39   40
Firm B:    34   36   41   43   44   45   47   48
Let ETA(A) and ETA(B) denote the median service life of picture tubes
produced by the two firms.
The S-interval with confidence coefficient .71 for ETA(A) is bounded
by:
a.  29 and 39      b.  36 and 47      c.  31 and 37      d.  41 and 45 `
`Answer: c.  31 and 37
    GAMMA = .71        n = 8
    From the Table of d-factors for Sign Test and Confidence Intervals
    for the median, d = 3.  The confidence interval is bounded by the
    d-smallest and d-largest sample observations.  Thus, the S-interval
    about the median is bounded by the third smallest and third
    largest sample observations, or 31 and 37. `
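
The tabled d-factor can be cross-checked from the binomial distribution: an interval bounded by the d-th smallest and d-th largest of n observations covers the median with probability 1 - 2*P(B <= d-1), where B is binomial with parameters n and 1/2. A minimal Python sketch (the function name `s_interval` is illustrative):

```python
from math import comb

def s_interval(sample, d):
    """Sign-test interval for the median and its exact confidence coefficient."""
    ordered = sorted(sample)
    n = len(ordered)
    lower, upper = ordered[d - 1], ordered[n - d]
    tail = sum(comb(n, k) for k in range(d)) / 2 ** n  # P(B <= d - 1)
    confidence = 1 - 2 * tail
    return lower, upper, confidence

firm_a = [25, 29, 31, 32, 35, 37, 39, 40]
lower, upper, confidence = s_interval(firm_a, 3)
print(lower, upper, round(confidence, 3))  # 31 37 0.711
```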

56.

`The life in months of service before failure of the color television
picture tube in 8 television sets manufactured by Firm A and 8 sets
manufactured by Firm B are as follows (arranged according to size):
Firm A:   25   29   31   32   35   37   39   40
Firm B:   34   36   41   43   44   45   47   48
Let ETA(A) and ETA(B) denote the median service life of picture tubes
produced by the two firms.
The W-interval with confidence coefficient .98 for ETA(A) is bounded
by:
a.  29 and 39    b.  36 and 47    c.  35 and 47.5    d.  27 and 39.5 `
`Answer: d.  27 and 39.5
    n = 8
    Using a table of critical values for the W-interval with ALPHA=.02,
    d=2.  The interval is bounded by the d-th smallest and d-th largest
    of the pairwise averages (X(i)+X(j))/2.  The table of averages
    (only the smallest and largest entries are filled in; the bounds
    are bracketed):
       ^  25   29   31   32   35   37   39   40
    --------------------------------------------
    25 ^  25  [27]  28
    29 ^       29   30
    31 ^            31
    32 ^
    35 ^
    37 ^                           37   38   38.5
    39 ^                                39  [39.5]
    40 ^                                     40
    W-interval is 27 and 39.5. `
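
The bounds can be reproduced by listing all pairwise averages (each observation paired with itself and with every other observation) and taking the d-th smallest and d-th largest. In this Python sketch the function name `w_interval` is illustrative and d = 2 is taken from the table cited in the answer, not computed.

```python
from itertools import combinations_with_replacement

def w_interval(sample, d):
    """Bounds given by the d-th smallest and d-th largest pairwise averages."""
    averages = sorted((a + b) / 2
                      for a, b in combinations_with_replacement(sample, 2))
    return averages[d - 1], averages[-d]

firm_a = [25, 29, 31, 32, 35, 37, 39, 40]
print(w_interval(firm_a, 2))  # (27.0, 39.5)
```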

57.

`The coded values for a measure of brightness in paper (light
reflectivity), prepared by two different processes, are as
follows for samples of size 9 drawn randomly from each of the
two processes:
    A      B
   ___    ___
   6.1    9.1
   9.2    8.2
   8.7    8.6
   8.9    6.9
   7.6    7.5
   7.1    7.9
   9.5    8.3
   8.3    7.8
   9.0    8.9
Do the data present sufficient evidence (ALPHA = .10) to
indicate a difference in the populations of brightness
measurements for the two processes?
a.  Use the sign test.
b.  Use the Mann-Whitney rank test. `
`Answer: H(O):  Brightness has the same distribution under both
       processes
H(A):  Brightness has different distributions under
       the two processes
a.  Sign test:
    The signs associated with the differences are:
    -,+,+,+,+,-,+,+,+.
    The smaller number of like signs is 2.  With 9 pairs, 1 or
    fewer signs are required for significance at the .10 level,
    therefore we continue the null hypothesis of no difference.
b.  Mann-Whitney rank test:
       A                B
    6.1   1          9.1  16
    9.2  17          8.2   9
    8.7  13          8.6  12
    8.9  14          6.9   2
    7.6   5          7.5   4
    7.1   3          7.9   7
    9.5  18          8.3  10.5
    8.3  10.5        7.8   6
    9.0  15          8.1   8
         ----             ----
         96.5             74.5
    With n(1) = 9 and n(2) = 9, a value of 63 or less for
    the smaller sum of ranks should lead to rejection at
    the .05 level.  Therefore we will continue H(O) at both
    the .05 and .10 levels. `

58.

`    Model           ^    G       F       C
     ---------------------------------------
     Compacts        ^  20.3    25.6    24.0
     Intermediate 6s ^  21.2    24.7    23.1
     Intermediate 8s ^  18.2    19.3    20.6
     Full-size 8s    ^  18.6    19.3    19.8
     Sport Cars      ^  18.5    20.7    21.4
The data in the above table give gasoline mileage for various types of
cars produced by three different manufacturers.  You want to compare
cars produced by manufacturers G and F.  The hypothesis that gasoline
mileage does not differ for the two manufacturers can be tested using
a.  either the sign test or the Wilcoxon signed rank test.
b.  the sign test, but not the Wilcoxon signed rank test.
c.  either the median test or the Wilcoxon two-sample test.
d.  the median test, but not the Wilcoxon two-sample test.
e.  the test of homogeneity. `
`Answer: a.  either the sign test or the Wilcoxon signed rank test. `

59.

`The life in months of service before failure of the color television
picture tube in 8 television sets manufactured by Firm A and 8 sets
manufactured by Firm B are as follows (arranged according to size):
Firm A:    25   29   31   32   35   37   39   40
Firm B:    34   36   41   43   44   45   47   48
Let ETA(A) and ETA(B) denote the median service life of picture tubes
produced by the two firms.
You want to test the hypothesis ETA(A) = 38 against the alternative
ETA(A) < 38.  The correct sign test statistic and its value is:
a.  S(+) = 2    b.  S(-) = 2    c.  S(+) = 3    d.  S(-) = 3 `
`Answer: a.  S(+) = 2
    Since we have H(A):  ETA(A) < 38, we expect fewer observations
    to be larger than the hypothesized median, and the correct test
    statistic is S(+).  Its value is:
        S(+) = # observations > 38 = 2. `

60.

`An experiment was designed to compare the durability of two highway
paints named type A and type B.  An "A" strip and a "B" strip were
painted across a highway at each of 30 locations.  At the end of the
test period the following results were observed:  at 6 locations type
A showed the least wear, at 15 locations type B showed the least
wear, and at 9 locations both had the same amount of wear.  Use the
sign test at the 5% level to test that both paints have equal
durability. `
`Answer: H(O):  P(A < B) = P(B < A) = .5
(NOTE:  All tied cases are dropped from the analysis for the sign
        test.)
n = 21
X = the number of the less frequent sign = 6
Using a binomial table with n = 21 and p = .5:  P(X <= 6) = .039,
so the two-sided P-value is about 2(.039) = .078.
Since .078 > .05, do not reject the null hypothesis of equal
durability at the 5% level.  (A one-sided test of whether B wears
less than A would reject, since .039 < .05.) `
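
The binomial tail probabilities used by the sign test can be computed exactly rather than read from a table. A minimal Python sketch, with ties already dropped so that n = 6 + 15 = 21 (the function name `binomial_lower_tail` is illustrative):

```python
from math import comb

def binomial_lower_tail(k, n):
    """P(X <= k) when X is binomial with n trials and success probability 1/2."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

n = 6 + 15                                # locations without ties
p_one_sided = binomial_lower_tail(6, n)   # observed count of the rarer sign is 6
p_two_sided = min(1.0, 2 * p_one_sided)
print(round(p_one_sided, 4), round(p_two_sided, 4))
print(round(binomial_lower_tail(3, n), 4))  # tail at 3, for comparison
```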

61.

`The observed life, in months of service, before failure for the color
television picture tube in 8 television sets manufactured by Firm B are
as follows (arranged according to size):
     Firm B:  34,  36,  41,  43,  44,  45,  47,  48
Let ETA(B) denote the median service life of picture tubes produced by
the firm and assume the lifetimes have symmetric distributions.  You
want to test the hypothesis ETA(B) = 38.5 against the alternative
ETA(B) =/= 38.5  using the Wilcoxon signed rank test.  From the
following list, select the most reasonable test statistic:
     (a)  W(+) = 2     (b)  W(+) = 5     (c)  W(-) = 5     (d)  W(-) = 2 `
`Answer: (c)  W(-) = 5
     With D(i) = X(i) - 38.5 and tied |D(i)| given the average rank:
      X(i)     D(i)     |D(i)|     Rank
     -----    -----     ------     ----
      34      -4.5       4.5       3.5
      36      -2.5       2.5       1.5
      41       2.5       2.5       1.5
      43       4.5       4.5       3.5
      44       5.5       5.5       5
      45       6.5       6.5       6
      47       8.5       8.5       7
      48       9.5       9.5       8
W(-) = SUM(R(-)) = 3.5 + 1.5 = 5 `
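
The ranking step, including average ranks for ties, can be sketched in Python. The function name `signed_rank_sums` is illustrative; observations equal to the hypothesized median are dropped, which is a common convention and an assumption here.

```python
def signed_rank_sums(sample, hypothesized_median):
    """Return (W+, W-): rank sums of positive and negative differences, with midranks."""
    diffs = [x - hypothesized_median for x in sample if x != hypothesized_median]
    abs_sorted = sorted(abs(d) for d in diffs)

    def midrank(value):
        first = abs_sorted.index(value) + 1  # rank of the first tied value
        count = abs_sorted.count(value)      # number of values tied with it
        return first + (count - 1) / 2

    w_plus = sum(midrank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(midrank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus

firm_b = [34, 36, 41, 43, 44, 45, 47, 48]
print(signed_rank_sums(firm_b, 38.5))  # (31.0, 5.0)
```

As a consistency check, W(+) + W(-) equals n(n+1)/2 = 36 for the 8 observations.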

62.

`The state highway department is collecting data to determine whether a highway's repair priorities should be raised, lowered, or should remain the same.  The decision will be made in the following manner.  If the population median of traffic flow = 100 cars per day, the priority will remain the same.  If the population median of traffic flow > 100 cars per day, raise the priority.  If the population median of traffic flow < 100 cars per day, decrease the priority.  Data for nine randomly selected days is as follows:
     Traffic Flow:  88, 91, 89, 101, 93, 86, 95, 98, 92
Can we conclude at ALPHA = .05 that the median number of cars per day is 100?
(a)  Pick the most appropriate nonparametric procedure.
(b)  State null and alternative hypotheses.
(c)  Compute a test statistic.
(d)  Indicate your critical values.
(e)  Do you or do you not reject H(0)?  What is your conclusion?  What happens to the road in question? `
`Answer: (a)  Use Wilcoxon test.
(b)  H(0): Md =   100
     H(1): Md =/= 100
(c)  X     D(i) = X - 100   ABS(D(i))    Rank
     -     --------------   ---------    ----
     86        -14              14        9
     88        -12              12        8
     89        -11              11        7
     91         -9               9        6
     92         -8               8        5
     93         -7               7        4
     95         -5               5        3
     98         -2               2        2
    101         +1               1        1
     T = 1      R+ = 1   (Sum of positive ranks)
                R- = 44  (Sum of negative ranks)
(d)  lower W = 6
     upper W = 9(10)/2 - 6 = 39
(e)  (T=1) < 6, therefore reject H(0).  Conclude that the median is less than 100 cars per day, and decrease the priority. `

63.

`Ten randomly selected cars of a specific year, make, and model and with similar equipment, are subjected to an EPA gasoline mileage test.  The resulting miles/gallon are:
    24.6, 30.0, 28.2, 27.4, 26.8,
    23.9, 22.2, 26.4, 32.6, 28.8
Using the Wilcoxon Median Test, test the hypothesis that the population median is 30 miles/gallon at the ALPHA = .10 level.  Construct a 90% confidence interval for the median. `
`Answer: Measurement     D(i)     |D(i)|     Rank
-----------     ----     ------     -----
   24.6         -5.4       5.4        7
   30.0          0         0          -
   28.2         -1.8       1.8        2
   27.4         -2.6       2.6        3.5
   26.8         -3.2       3.2        5
   23.9         -6.1       6.1        8
   22.2         -7.8       7.8        9
   26.4         -3.6       3.6        6
   32.6          2.6       2.6        3.5
   28.8         -1.2       1.2        1
R+ =  3.5          ---> T = 3.5
R- = 41.5
Lower w = 9
Upper w = (9*10)/2 - 9 = 36
Since (T=3.5) < 9, we reject H(0):  median = 30.
For the confidence interval, we need the 11th largest and 11th smallest of the pairwise averages, to be obtained from the following table (entries marked -- are not needed and were not computed):
      |   32.6   30.0   28.8   28.2   27.4   26.8   26.4   24.6   23.9   22.2
------+----------------------------------------------------------------------
 32.6 |   32.6   31.3   30.7   30.4   30.0   29.7   29.5   28.6  28.25   27.4
 30.0 |          30.0   29.4   29.1   28.7   28.4     --     --     --     --
 28.8 |               [28.8]   28.5   28.1   27.8     --     --     --     --
 28.2 |                          --     --     --     --     --  26.05 [25.2]
 27.4 |                                 --     --     --   26.0  25.65   24.8
 26.8 |                                        --     --   25.7  25.35   24.5
 26.4 |                                             26.4   25.5  25.15   24.3
 24.6 |                                                    24.6  24.25   23.4
 23.9 |                                                           23.9  23.05
 22.2 |                                                                  22.2
Therefore, 90% C.I.:  from 25.2 to 28.8. `
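
The table of pairwise averages (often called Walsh averages) can be generated and sorted mechanically. The sketch below assumes the position k = 11 quoted in the answer; it does not derive k from a Wilcoxon table, and the variable names are illustrative.

```python
from itertools import combinations_with_replacement

mpg = [24.6, 30.0, 28.2, 27.4, 26.8, 23.9, 22.2, 26.4, 32.6, 28.8]

# All averages (x_i + x_j) / 2 with i <= j: n(n+1)/2 = 55 values for n = 10.
walsh = sorted((a + b) / 2 for a, b in combinations_with_replacement(mpg, 2))

k = 11                           # position taken from the answer, not derived here
lower = round(walsh[k - 1], 2)   # k-th smallest average
upper = round(walsh[-k], 2)      # k-th largest average

print(len(walsh), lower, upper)
```

Only the extreme corners of the triangular table matter, which is why most interior entries can be left uncomputed when working by hand.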

64.

`On eight occasions of cloud seeding, the following amounts of rainfall were observed: .74, .54, 1.25, .27, .76, 1.01, .49, .70.  On six control occasions (when no cloud seeding took place), the following amounts of rainfall were measured: .25, .36, .42, .16, .59, .66.
We test (using the Wilcoxon - Mann - Whitney test) the hypothesis that cloud seeding does not increase amount of rainfall against the alternative that it does.  The descriptive level for the given data equals
a.  .00     b.  .01     c.  .02     d.  .04     e.  .05 `
`Answer: c.  .02
     U(S) = number of times seeded observations are larger than control observations
     U(S) = 6 + 4 + 6 + 2 + 6 + 6 + 4 + 6
          = 40
    U(NS) = (8*6) - 40
          = 48 - 40
          = 8
     from table:     P(U(NS) < 9) = .021 `
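
Both the pair counts and the tabled tail probability can be reproduced by brute force, since there are only C(14, 6) = 3003 ways to split 14 ranks between the two samples. This is a minimal sketch assuming no tied observations; `u_count` and `exact_lower_tail` are hypothetical helper names.

```python
from itertools import combinations
from math import comb

seeded = [0.74, 0.54, 1.25, 0.27, 0.76, 1.01, 0.49, 0.70]
control = [0.25, 0.36, 0.42, 0.16, 0.59, 0.66]

def u_count(xs, ys):
    """Number of (x, y) pairs with x > y; assumes no ties."""
    return sum(1 for x in xs for y in ys if x > y)

u_s = u_count(seeded, control)     # seeded observation larger than control
u_ns = u_count(control, seeded)    # control observation larger than seeded

def exact_lower_tail(u_obs, m, n):
    """P(U <= u_obs) under H0, enumerating which n of the m + n ranks
    belong to the second sample; U counts pairs the second sample wins."""
    hits = 0
    for ranks in combinations(range(m + n), n):
        # For the j-th smallest chosen rank r, r - j first-sample values lie below it.
        u = sum(r - j for j, r in enumerate(ranks))
        if u <= u_obs:
            hits += 1
    return hits / comb(m + n, n)

p_value = exact_lower_tail(u_ns, len(seeded), len(control))
print(u_s, u_ns, round(p_value, 3))
```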

65.

`The life in months of service before failure of the color television picture tube in 8 television sets manufactured by Firm A and 8 sets manufactured by Firm B are as follows (arranged according to size):
     Firm A:  25,  29,  31,  32,  35,  37,  39,  40
     Firm B:  34,  36,  41,  43,  44,  45,  47,  48
Against the two-sided alternative, the Wilcoxon (Mann Whitney) two-sample test has descriptive level:
     (a)  .050     (b)  .010     (c)  .007     (d)  .004 `
`Answer: (c)  .007
U(A) = 0 + 0 + 0 + 0 + 1 + 2 + 2 + 2
     = 7
U(B) = 64 - 7
     = 57
P(U(A) <= 7) = .0035 in one tail, so the two-sided descriptive level is 2(.0035) = .007 `

66.

`The life in months of service before failure of the color television picture tube in 8 television sets manufactured by Firm A and 8 sets manufactured by Firm B are as follows (arranged according to size):
     Firm A:  25,  29,  31,  32,  35,  37,  39,  40
     Firm B:  34,  36,  41,  43,  44,  45,  47,  48
Suppose the data is ranked as one combined set.  The sum of the ranks R(B) for the B-observations equals:
     (a)  36     (b)  43     (c)  57     (d)  93 `
`Answer: (d)  93
          Table of Ranks:
Firm A:   1     2      3     4     6     8     9     10
Firm B:   5     7     11    12    13    14    15     16
SUM(R(B)) = 5 + 7 + 11 + 12 + 13 + 14 + 15 + 16
          = 93 `
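
The rank sum and the pair-count form of the statistic are tied together by R(B) = U(B) + n(B)(n(B)+1)/2. The following sketch checks that identity on the picture-tube data; it assumes no ties, and the variable names are illustrative.

```python
firm_a = [25, 29, 31, 32, 35, 37, 39, 40]
firm_b = [34, 36, 41, 43, 44, 45, 47, 48]

# Rank the combined sample from smallest (rank 1) to largest (rank 16).
combined = sorted(firm_a + firm_b)
rank = {value: i + 1 for i, value in enumerate(combined)}

r_a = sum(rank[v] for v in firm_a)
r_b = sum(rank[v] for v in firm_b)

# U(B): number of (B, A) pairs in which the B tube outlasted the A tube.
u_b = sum(1 for b in firm_b for a in firm_a if b > a)
n_b = len(firm_b)

print(r_a, r_b, u_b, u_b + n_b * (n_b + 1) // 2)
```

Because of this identity, tables indexed by rank sums and tables indexed by U give the same descriptive level.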

67.

`An investigator is interested in teachers' use of various types of questions in teaching mathematics.  He identifies 4 types of questions which demand responses of different levels of complexity.  He records the number of questions of each type asked by each teacher in a random sample of 10 teachers.  The frequencies are reported for teacher and question type.
                Question type
Teacher      1    2    3    4
   A         9    1    9    2
   B         4    6    7    0
   C         8    2    5    1
   D         6    9    2    3
   E         7    5    6    2
   F         7    3    4    1
   G         8    5    2    5
   H         8    9    7    1
   I         6    5    8    4
   J         7    2    5    1
The most appropriate nonparametric test for these data would be:
A.  Mann-Whitney Test
B.  Friedman Test
C.  Wilcoxon Test
D.  CHI-SQUARE Test of Homogeneity `
`Answer: B.  Friedman Test    We are interested in comparing the average ranks for the four    question types.  Friedman will do this directly. `

68.

`Using the list of designs below, indicate which type of design is most descriptive of the following study:
    a.  one shot case study
    b.  factorial design
    c.  time series design
    d.  nonequivalent control group
    e.  co-relational study
    f.  one group pretest posttest
    g.  equivalent time series
    h.  patched up design
    i.  posttest only control group design
    j.  criterion group
    k.  pretest-posttest control
    l.  separate sample pretest-posttest
A company making electric drills has kept accurate monthly records of the number of faulty drills sold as indicated by the number of them that have been returned to the factory.  Because this number has been increasing, the company has instituted a training program to improve the skills of its inspectors in the hopes that fewer faulty drills will be distributed.  The company plans to assess the effects of this training by continuing to examine monthly records of how many drills are returned to the factory for repairs after the training program has been completed. `
`Answer: c.  time series    This study consists first of repeated observation, then conduct of    training program, and then continued observation. `

69.

`Regarding the testing of ammunition using a 16-inch gun subject to a linear trend for wear, W.J. Youden makes these comments on a testing sequence AAAA/BBBB/..../EEEE, where A, ..., E are brands of shells:
"Nothing good comes from this work.  The averages are worthless.  Each average depends on its position in the firing order.  The estimate of the experimental error based on repeat rounds fired in succession obviously has no applicability for judging differences between ammunitions not fired in immediate succession."
a.  Why are the averages worthless?
b.  What is wrong with the estimate of experimental error? `
`Answer: a.  The averages are worthless because differences between ammunition means are mixed up with differences in firing order where firing order is an important source of variation.  If we let, say, RHO(1) represent the effect of testing during the first four firings, RHO(2) represent the effect of testing during the second four firings, etc. and let TAU(A) represent the effect of Brand A ammunition, TAU(B) represent the effect of Brand B ammunition, etc., then the difference YBAR(A) - YBAR(B) = (TAU(A) + RHO(1)) - (TAU(B) + RHO(2)).  The proper difference is distorted by an amount (RHO(1) - RHO(2)).
b.  Repeat rounds fired in succession will tend to agree with each other much more than rounds scattered randomly through the firing sequence.  They will underestimate the variance that should be used in comparing brands of ammunition. `

70.

`An experiment is to be conducted in a greenhouse to compare three treatments.  The 9 experimental units to be used will be arranged as below:
                             Units
Greenhouse              X       X       X
Heat                    X       X       X
Source (Radiator)       X       X       X
Since the plants used in the experiment are sensitive to heat, it is expected that the closer an experimental unit is to the heat source, the poorer the response that will be obtained.  Otherwise the experimental units are considered to be about the same.
a.  How would you assign treatments to experimental units?  Explain.
b.  What, if any, experimental design is defined by this method of assignment? `
`Answer: a.  I would assign treatments such that each treatment would occur at each of the distances from the radiator because we suspect that the distance from the heat will affect the response.
b.  This is the randomized complete block design, with distance from the heat source used as the blocking factor. `

71.

`Suppose that you have been appointed energy czar of New Hampshire and have been instructed to provide guidance to consumers on the relation of speed to miles per gallon for various makes of cars.  Disregarding cost considerations, which of the following static test schemes would you prefer?  Why?
a.) 5 tests all at 25 mph
b.) 2 tests at 25 mph, 3 tests at 55 mph
c.) 1 test at 25, 1 at 35, 2 at 45, and 1 at 55 mph
d.) 1 test each at 25, 35, 45, 55, and 65 mph `
`Answer: d.) Because I would be able to say something about speeds of 25 and 65 without extrapolating, and I would be able to say more about the whole range because it is well covered using this scheme. `

72.

`Suppose that you are a member of a garden club that has 30 members.  The club has fallen into controversy as to whether or not planting garlic next to Baby's Breath is an effective way to reduce insect attacks on Baby's Breath.
One club member proposes that the best gardener in the club plant various parts of his garden either with Baby's Breath alone or with Baby's Breath next to garlic.  The proposal is that random selection be used so that half of 20 areas are planted one way and half are planted the other way.
Another club member disagrees.  He suggests that each club member plant 2 areas - one with Baby's Breath alone and one with Baby's Breath plus garlic (again randomly assigning treatment to area).  Which scheme do you favor?  Why? `
`Answer: I favor the second scheme because the scope of inference will be much broader.  After the study is finished, the results would apply to many types of soil conditions, environmental conditions and gardeners' abilities. `

73.

`Suppose that you are an entomologist and wish to test 4 compounds that are said to attract a certain insect.  You have 16 insect traps in a square (4X4) arrangement in a field that contains a growing crop.  You think, but are not sure, that either wind direction or distance from insect source may affect number trapped.  Your traps are like this:
Wind ------------->
X     X     X     X
X     X     X     X
X     X     X     X
X     X     X     X
Insect Source /]               ]
a.  How many df will be associated with experimental error if the design used for this situation is
     i) completely random?
    ii) randomized blocks (with one block consisting of the traps closest to the insect source, ..., another block consisting of the traps furthest from the source)?
   iii) latin square?
b.  What model terms have to be important if it's to be worthwhile to have used the latin square?
c.  Which model terms (or influences) have to be unimportant if a CR design is to be a good choice? `
`Answer: a.    i) t(r-1) = 4(4-1) = 4(3) = 12
     ii) (r-1)(t-1) = (4-1)(4-1) = (3)(3) = 9
    iii) (t-1)(t-2) = (4-1)(4-2) = (3)(2) = 6
    where t = number of treatments
          r = number of blocks or repetitions
b.  Both wind and distance from insect source have to be important in order for the latin square design to be worthwhile; therefore, both RHO(j) and KAPPA(k) for all j and k must be important.
c.  Both wind and distance from insect source, or RHO(j) and KAPPA(k) for all j and k, must be unimportant for the completely random design to be appropriate. `
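
The three error df values can also be reached by subtraction from the corrected total, which shows where each design spends its degrees of freedom. This is a sketch with a hypothetical helper `error_df`; it assumes one observation per trap and, for the Latin square, equal numbers of rows, columns, and treatments.

```python
def error_df(t, r):
    """Error df for t treatments and r replicates, by subtraction.

    The corrected total is t*r - 1.  Each design removes t - 1 df for
    treatments; blocking removes r - 1 more, and a Latin square (r == t)
    removes t - 1 for rows and t - 1 for columns.
    """
    corrected_total = t * r - 1
    treatments = t - 1
    result = {
        "completely random": corrected_total - treatments,
        "randomized blocks": corrected_total - treatments - (r - 1),
    }
    if r == t:
        result["latin square"] = corrected_total - treatments - 2 * (t - 1)
    return result

df = error_df(t=4, r=4)
print(df)
```

Each extra restriction on randomization costs error df, so it pays off only when the blocked factor really matters.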

74.

`A computer user wishes to compare two programs in terms of amount of computer time used.  Both programs perform the same analysis and use the same data set.  The computer user randomly selects 10 time periods during the time when the computer ordinarily is used.  For each of these periods, both programs are run.  Each time a random method is used to decide which program should be run first.  What experimental design has been used? `
`Answer: A randomized complete block (RCB) design has been used to compare the two programs.  The implied blocking factor is time period, since each program must run within each time period.  Each treatment (program) occurs randomly within each time period. `

75.

`A computer user is charged for the amount of computer time that he uses.  He has at his disposal two programs that perform the same analysis.  He sets up a standard data set and wishes to compare programs in terms of computer time used.  He randomly chooses 20 time periods from those times when he ordinarily would use the computer.  Programs are randomly assigned to these 20 periods subject only to the requirement that each program be run 10 times.  What experimental design has been used? `
`Answer: The completely random design has been used since the treatments (programs) have been randomly assigned to the experimental units without any restriction on randomization except that each treatment be assigned 10 times. `

76.

`An experiment has been conducted in which two computer programs were compared in terms of computer time needed to perform the same analysis of the same data.
Results obtained included:
                ANOVA
Source of Variation     df      SS      M.S.
Total                   20      -       -
Mean                     1      -       -
Corrected Total         19      -       -
Treatments               1      -       -
Error                   18      -       -
Means
Program 1       20
Program 2       15
LSD(at ALPHA .05) = t * S(dBAR) = 2
a.  Which computer program would you use in the future?  Why?
b.  What design is suggested by this report? `
`Answer: a.  I would use program 2 because the difference in mean times between the two programs is 5, which is larger than the LSD of 2.  This indicates that the observed difference was unlikely (at the .05 level) to have occurred by chance alone.  Thus, the sample data indicates a significant difference between the two programs.
b.  A completely random design is indicated by the ANOVA table because no degrees of freedom have been subtracted from the total for any blocking factors.  The total loss in degrees of freedom is attributable to the mean and one treatment factor. `

77.

`An imaginary experiment was conducted to compare length of life of batteries sold by 4 manufacturers.
Results obtained included:
                ANOVA
Source of Variation             df      SS      M.S.
Total                           16      -       -
Mean                             1      -       -
Corrected Total                 15      -       -
Type of flashlight               3      -       -
Month of testing                 3      -       -
Brands                           3      -       -
Error                            6      -       -
Means
Brand c         4.0
Brand b         3.9
Brand a         3.5
Brand d         3.3
LSD(.05) = .2
a.  What brand(s) would you suggest buying if all prices were the same?  Why?
b.  What design was used?
c.  What does the ANOVA table tell you about how the data was obtained? `
`Answer: a.  I would suggest buying brands c or b because although they are not significantly different from each other, both of them are significantly different from the other two brands, based on the LSD at a .05 significance level.
b.  A latin square design is implied by the ANOVA table.  Two blocking factors are indicated: type of flashlight and month of testing, each having four levels.
c.  The ANOVA table indicates an LSQ design with the two restrictions on randomization of treatments to experimental units.  Each brand of battery was required to occur once within each month and once with each type of flashlight. `

78.

`Suppose that you are in charge of product testing for a chemical company.  You are persuaded that your company has a new product that is a promising way of relieving insomnia.
a.  Suppose that the resources available only permit testing one other treatment in addition to the new compound.  What will that treatment be?  Why?
b.  Suppose that available resources permit testing two treatments besides the new product.  What will be your choice of treatments?  Why?
c.  What will be your choice if five additional treatments can be tested (in addition to the new product)?  Explain your choice. `
`Answer: a.  The other treatment would be a control so that there will be some real basis for comparison to see if the product does any good at all.
b.  Then I would choose a control and a product which is believed to be effective at relieving insomnia.
c.  If I could test 5 additional treatments, they would be a control and four treatments for insomnia.  If four were much more widely used or more interesting than others, they would be the four tested.  If all insomnia preparations were about equally interesting, then I would choose four randomly. `

79.

`Sominex commercials repeatedly present endorsements of the form:  I take Sominex and sleep something fierce.  Some skeptics would suggest that many, if not all, of the Sominex endorsers could take an inert pill and sleep something fierce.  Why would an inert pill be a better "control" for testing the effectiveness of a sleeping potion than no pill at all?  (e.g. test subjects might report not being sleepy, then they might randomly receive either Sominex or no pill at all.) `
`Answer: The problem associated with not taking any pill at all to compare with taking a Sominex pill is that such a design has not controlled for the possibility of an effect on response of taking "any" pill.  That is, some people may take a pill to sleep, and just the act of taking a pill, which they think works, will in fact have an effect on their ability to sleep.  By comparing Sominex to an inert pill, or control, one can control for this possible effect and look at just the effect of Sominex.  Since both groups are going through the process of "taking pills", this effect has been controlled when comparing the two, and the differences between the two groups will be due to the ingredients of the pill alone, rather than including effects of "taking a pill". `

80.

`Suppose that you have identified three pocket calculator models that you regard as suitable for your work and comparable in price.  Suppose that you will make your decision on which one to buy on the basis of the time that it takes for you to perform a particular set of calculations.  You feel that you are equally familiar with all 3 models, but suspect that if you repeat the same set of calculations over and over again, you will become slower and slower.  Suppose that it is reasonable to repeat the calculations six times on each machine.  Which design will you use?  Why?
  i)  Completely Random
 ii)  Two 3X3 Latin Squares
iii)  Randomized Block `
`Answer: Choose two 3X3 Latin Squares because the position in the testing process appears to have a significant effect on the response, and, therefore, this effect should be balanced out. `

81.

`The time required for a computer user to finish running a program to perform a particular analysis depends on many things such as:
Language used by the program,
Number of other jobs being processed on the system,
Amount of data being summarized, etc.
Suppose that 3 programs were available to perform the same analysis where the chief difference among programs was the language used for writing them.  Suppose that these languages were:
1.  FORTRAN
2.  BASIC
3.  Assembler
Suppose further that 3 terminals were available and all 3 versions could be run at the same time.  (A different person at each terminal).  Which of the following designs would you use?  Why?
  i) Randomized Block
 ii) Latin Square
iii) Completely Random `
`Answer: I would use a Randomized Block design so that each program would run at the same time.  This assumes that there will be no important effect of person or terminal on the amount of time needed to run a program. `

82.

`An investigator wished to study the effect of an operator on the performance of a machine.  He could arrange to have each of four operators run the machine five times.  A response measurement could be recorded each time the machine was used.  How many experimental units will he have if:
a.  He randomly selects an operator, has him run the machine five times, then selects another operator, etc.?
b.  He identifies 20 turns for running the machine and randomly assigns operators to turns subject to the requirement that each operator perform five times?
c.  He forms five groups of four turns and randomly and independently assigns operators within each group of four? `
`Answer: a.  4 : an experimental unit is a set of 5 turns or time of running the    machine.b.  20: an experimental unit is a turn.c.  20: an experimental unit is a turn even though turns have been    arranged in groups. `

83.

`An investigator plans to conduct an experiment to evaluate three different methods for measuring a chemical.  An experimental unit will involve the activities of a technician during 12 time periods over six days.  When asked what pattern of variation he would expect if the technician used the same method in each of those time periods, the investigator responded that all measurements would be about the same, that he expected no regular pattern of variation.
Which of the following experimental designs do you recommend?  Why?
  i)  Latin Square
 ii)  Randomized Block
iii)  Completely Random Design `
`Answer: I recommend iii, Completely Random Design, because there are no expected influencing factors for which we should balance the effects. `

84.

`Suppose that we wished to compare two drivers in terms of miles travelled per gallon of gas used.  Suppose that an experimental unit consists of independently driving over a single prescribed course involving about the same traffic and both city and open road driving within a prescribed time range.  In order to broaden the scope of this comparison, it is desired to use a:
      Cadillac
      Ford
      Volkswagen
      Datsun Z
      Jeep
      Mazda
a.  How would you set up this trial to balance the effects of kind of car on driver performance?  What design would you use?
b.  How would you balance both the effects of kind of car and day of travel? `
`Answer: a.  I would use a randomized block design and use kind of car as    a blocking factor, having each driver drive in each of    the cars.b.  If length of time required to complete a run were short    enough so that two comparable runs could be finished on the    same day, I would define a block as 2 runs on the same day    involving the same kind of car.  Drivers would be randomly    assigned to runs each day.  If it was not reasonable to    regard 2 runs on the same day as comparable, then a scheme    involving a 2 X 2 latin square for each brand of car    might be useable. `

85.

`You now have at your disposal one 16-inch gun.  You are to develop a testing scheme to test shell velocity for five brands of ammunition.  Because of the amount of explosive required for each round, each supplier will provide just four rounds.
While the gun to be used for testing has a new barrel, you know from previous experience that gun barrel wear is pronounced.  In fact, you are willing to assume that the only consequential uncontrolled environmental factor is gun wear and that order of firing has a linear effect on shell velocity.  (Velocity of shell declines from firing 1 to firing 20.)
a.  Define experimental unit.
b.  Propose a testing scheme and assign brands to experimental units.
c.  Will your testing scheme avoid distortions due to firing order if the influence of firing order is I = 50 - 2T (I:  influence, T:  time of firing, T = 1, 2, ..., 20)? `
`Answer: a.  An experimental unit is one firing of the gun, and all the preparation that goes with it.
b.  I propose that the randomized complete block design be used.  I would assign the treatments to the experimental units within four blocks of five sequential units.  The following is an example of an assignment of treatments to experimental units (entries are firing numbers).
    Block               1       2       3       4
    Treatment
        1               2       6      15      18
        2               5       9      11      19
        3               1      10      13      20
        4               3       7      14      17
        5               4       8      12      16
c.  This scheme will not entirely avoid distortions due to firing order.  The randomized complete block goes a long way in balancing these distortions.
    If you had five rounds of each brand of ammunition you could use a Latin Square design and eliminate the distortions altogether.  (It's also possible to use 4 columns of a 5 X 5 Latin Square, but it's not expected that this option would be proposed in an introductory class.)
    Influence values for the above randomization
    (e.g. for firing #1.  I = 50 - 2 = 48
              firing #2.  I = 50 - 4 = 46)
    Treatment                                Total Influence
        1          46    38    20    14          118
        2          40    32    28    12          112
        3          48    30    24    10          112
        4          44    36    22    16          118
        5          42    34    26    18          120
    Treatment 5 had the largest advantage due to firing order, but RCB randomization largely balanced firing order effects. `
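
The influence totals can be recomputed from the assignment table, and compared with what an unblocked sequence such as AAAA/BBBB/.../EEEE would give. This sketch uses the example assignment and I = 50 - 2T from the question; the dictionary layout and names are illustrative.

```python
def influence(t):
    """Firing-order influence I = 50 - 2T, as given in the question."""
    return 50 - 2 * t

# Firing numbers for each treatment under the example RCB assignment.
rcb = {
    1: [2, 6, 15, 18],
    2: [5, 9, 11, 19],
    3: [1, 10, 13, 20],
    4: [3, 7, 14, 17],
    5: [4, 8, 12, 16],
}
rcb_totals = {trt: sum(influence(t) for t in firings) for trt, firings in rcb.items()}

# Unblocked comparison: each brand fired four times in immediate succession.
sequential = {trt: list(range(4 * (trt - 1) + 1, 4 * trt + 1)) for trt in range(1, 6)}
seq_totals = {trt: sum(influence(t) for t in firings) for trt, firings in sequential.items()}

print("RCB spread:       ", max(rcb_totals.values()) - min(rcb_totals.values()))
print("Sequential spread:", max(seq_totals.values()) - min(seq_totals.values()))
```

The spread between the most and least favored brand is small under blocking and very large when brands are fired in succession.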

86.

`An investigator wishes to explore the effectiveness of four coagulants (named A, B, C, and D) in removing suspended material from water.  He proposes to try these coagulants on six different water samples where each sample is large enough to provide an aliquot for testing each coagulant.  The proposed procedure consists of
        * Preparing enough of each coagulant for Sample 1 where the order of preparing coagulants is random
        * Treating aliquots of Sample 1 in the same order and putting samples on the stirrer in that order
        * Repeating the same process for each of the other five water samples (using a different randomization for each sample).
a.  How would you conduct a uniformity trial for the above situation?
b.  What would it tell you? `
`Answer: a.  You would conduct a uniformity trial by running the whole experiment but using only one of the coagulants throughout.
b.  It would show you any patterns of variation in response among the experimental units independent of the treatments.  It would also give you an estimate of the background variation. `

87.

`The average number of hours an electric circuit lasts before failing is 100 hours.  An engineer claims that he can develop a circuit that increases the average life of the circuit.  It is desired to test H(0): MU = 100 against the appropriate alternative hypothesis.  The alternative hypothesis is best represented as:
(a)  H(A): MU =/= 100        (c)  H(A): MU < 100
(b)  H(A): MU <= 100         (d)  H(A): MU > 100 `
`Answer: (d)  H(A): MU > 100 `

88.

`We want to compare two machines for production line speed of beer bottle manufacturing.  At the end of n(1) = 9 days, the number of bottles produced by machine 1 yield XBAR(1) = 19, S(1)**2 = 4.  For machine 2, we have n(2) = 6 days, XBAR(2) = 17, and S(2)**2 = 9.  Assume independence of samples, normality, and equality of unknown variances.
In testing H(0):  MU(1) = MU(2) vs. H(1):  MU(1) =/= MU(2), the observed value of our statistic is:
(A) (2)/(SQRT(77/13))(SQRT((1/9) + (1/6)))
(B) (2)/SQRT((4/9) + (9/6))
(C) (2)/(SQRT(90/13))(SQRT((1/9) + (1/6)))
(D) (2)/SQRT((16/9) + (81/6))
(E) (2)/(SQRT(36/13))(SQRT((1/9) + (1/6))) `
`Answer: (A) (2)/(SQRT(77/13))(SQRT((1/9) + (1/6)))
First we must find the pooled variance:
   S(P)**2 = ((8)(4) + (5)(9))/(9 + 6 - 2)
           = 77/13
Now the standard error of the difference between means:
   S(XBAR(1) - XBAR(2)) = (SQRT(77/13))(SQRT((1/9) + (1/6)))
The observed t-value is:
   t(calculated) = ((19 - 17) - 0)/S(XBAR(1) - XBAR(2))
                 = 2/S(XBAR(1) - XBAR(2)) `
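
The pooled-variance calculation can be carried through to a number with a few lines of code. This is a minimal sketch using only the summary statistics given in the question; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the question.
n1, xbar1, var1 = 9, 19.0, 4.0
n2, xbar2, var2 = 6, 17.0, 9.0

df = n1 + n2 - 2
# Pooled variance: sample variances weighted by their degrees of freedom.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
# Standard error of the difference between the two sample means.
std_err = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
t_stat = (xbar1 - xbar2) / std_err

print(f"df = {df}, pooled variance = {pooled_var:.4f}, t = {t_stat:.3f}")
```

The same structure applies to any two-sample pooled t statistic; only the six summary numbers change.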

89.

`We want to compare two machines for production line speed of beer bottle manufacturing.  At the end of each of n(1) = 9 days, the number of bottles produced by machine 1 yield XBAR(1) = 19, S(1)**2 = 4.  For machine 2, we have n(2) = 6 days, XBAR(2) = 17, and S(2)**2 = 9.  Assume independence of samples, normality, and equality of unknown variances.
Suppose we test H(0):  MU(1) = MU(2)  vs.  H(1):  MU(1) =/= MU(2)  at ALPHA=.05, and the value of the test statistic is 2.20.  Then we should:
(A) Do not reject (continue) H(0)
(B) Reject H(0)
(C) Do not reject (continue) H(1)
(D) Both (B) and (C)
(E) Both (A) and (C) `
`Answer: (D) Both (B) and (C)     Given that:   t(calculated) = 2.20     and find  :   t(critical, ALPHA=.05, twotailed, df=13) = +/- 2.160.     Since t(calculated) falls in the area of rejection, we would reject     H(0) and would not reject (continue) H(1). `

90.

`Molybdenum rods are produced by a production line setup.  It is desirable to check whether the process is in control.  Let X = length of such a rod.  Assume X is approximately normally distributed with mean = MU and variance = SIGMA**2, where the mean and variance are unknown.  Take n = 400 sample rods, with sample average length XBAR = 2 inches, and SUM((X - XBAR)**2) = 399.
In testing H(0):  MU = 2.2 vs. H(1):  MU =/= 2.2 at level ALPHA = 8%, one should _____ the H(0) since the value _____ lies _____ the confidence interval.
a)  continue, 2.2, within
b)  reject, 2, outside of
c)  reject, 2.2, outside of
d)  continue, 2, within
e)  either b or c `
`Answer: c)  reject, 2.2, outside of    S**2 = 399/399 = 1    S(XBAR) = SQRT(S**2/n) = .05    If you center the confidence interval on the sample mean the    confidence interval = 2 +/- (1.75)(.05)                        = from 1.9125 to 2.0875    which does not contain the hypothesized value, 2.2. `

91.

`Molybdenum rods are produced by a production line setup. It is desired to check whether the process is in control. Let X = length of such a rod. Assume X is approximately normally distributed with mean = MU and variance = SIGMA**2, where the mean and variance are unknown.
Take n = 400 sample rods, with sample average length XBAR = 2 inches and SUM((X - XBAR)**2) = 399.
If one were testing H(0):  MU = 1 vs. H(1):  MU =/= 1 at level ALPHA = _____, one should _____ the H(0) since the value 1 lies _____ the confidence interval.
a)  16%, not reject (continue), within
b)   8%, not reject (continue), within
c)   4%, not reject (continue), within
d)   4%, not reject (continue), to the left of
e)   4%, reject, to the left of`
`Answer: e)  4%, reject, to the left of
    S**2 = 399/399 = 1;  S(XBAR) = SQRT(S**2/n) = .05;
    C.I. = 2 +/- Z(ALPHA/2) * .05;
    Z(16%/2) = 1.41, Z(8%/2) = 1.75, Z(4%/2) = 2.05
    C.I.(ALPHA=16%) = 2 +/- (1.41)(.05)
                    = from 1.93 to 2.07
    C.I.(ALPHA= 8%) = 2 +/- (1.75)(.05)
                    = from 1.91 to 2.09
    C.I.(ALPHA= 4%) = 2 +/- (2.05)(.05)
                    = from 1.90 to 2.10
    1 is not included in any of the confidence intervals, so H(0) should
    be rejected in all cases.`
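
The three intervals can be reproduced with a short Python sketch. It assumes the large-sample normal approximation used in the answer; the helper name `z_interval` is illustrative and does not come from the question.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, s2, n, alpha):
    """Two-sided (1 - alpha) interval for the mean, normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.75 when alpha = 0.08
    half_width = z * sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


n = 400
s2 = 399 / (n - 1)  # sample variance from SUM((X - XBAR)**2) = 399

for alpha in (0.16, 0.08, 0.04):
    low, high = z_interval(2.0, s2, n, alpha)
    # H(0): MU = 1 is continued only if 1 falls inside the interval.
    print(f"alpha={alpha:.2f}: {low:.4f} to {high:.4f}, contains 1: {low <= 1 <= high}")
```

Because the hypothesized value 1 sits far to the left of the sample mean relative to S(XBAR) = .05, none of the intervals reaches it.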

92.

`It is known that long, thin titanium rods lengthen with increasing temperature. A sample of n=20 identical titanium rods is selected. Each is subjected to a particular uniform temperature for a specified time. Let Y denote the change in length. The readings are (X(1),Y(1)),...,(X(20),Y(20)), with data XBAR=2 (in hundreds of degrees F), YBAR=3 (in milli-inches), SUM((X-XBAR)**2) = 10, SUM((Y-YBAR)**2) = 40, and SUM((X-XBAR)(Y-YBAR)) = 16.
In testing H(0):  RHO = 0 vs H(1):  RHO =/= 0 (RHO = population value for Pearson correlation coefficient) at level ALPHA = 5%, one should ____ H(0) since the statistic r/SQRT((1-r**2)/(n-2)) = ____ is ____ than the correct table value of ____.
(a)  reject, 5.7, greater, 2.086
(b)  reject, 5.7, greater, 1.734
(c)  reject, 5.7, greater, 2.093
(d)  reject, 5.7, greater, 2.101
(e)  continue, 1.6, less, 2.086`
`Answer: (d)  reject, 5.7, greater, 2.101 `
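
The answer gives no working, so here is a minimal Python sketch of the arithmetic behind choice (d). The critical value is copied from the answer (two-tailed, df = n - 2 = 18) rather than computed; the variable names are illustrative.

```python
from math import sqrt

n = 20
sxx, syy, sxy = 10.0, 40.0, 16.0  # sums of squares and cross-products from the question

r = sxy / sqrt(sxx * syy)                # Pearson r = 16/20 = 0.8
t_calc = r / sqrt((1 - r**2) / (n - 2))  # statistic named in the question, 18 df
t_crit = 2.101                           # table value quoted in answer (d)

reject = abs(t_calc) > t_crit
print(f"r = {r:.2f}, t = {t_calc:.2f}, reject H(0): {reject}")
```

The statistic comes out near 5.7, well beyond the table value, which is why every "reject" option lists 5.7 and only the degrees of freedom distinguish them.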

93.

`In order to compare two brands of tires, Nader's Raiders selected five tires of each brand, measuring the mileage for which each tire gave adequate service. The results of the test (expressed in thousands of miles) were:
     BRAND A            BRAND B
     -------            -------
      28.2                25.5
      24.9                25.4
      23.0                25.3
      21.8                25.0
      28.1                24.8
     -----               -----
     126.0               126.0
Assuming both brands sell for the same price, which brand of tire would you say is the better buy?  (Do not compute standard deviations.)`
`Answer: Under the no computation requirement, it appears Brand B is the better buy, since the mileage figures are more consistent and the rank ordering (highest to lowest) AABBBBABAA seems to favor B.`

94.

`Two types of paint are to be tested. Paint I is somewhat cheaper than paint II. The test consists of giving scores to the paints, after they have been exposed to certain weather conditions for a period of 6 months. Five samples of each type of paint are scored as follows:
Paint I    |  26  16  20  25  23
________________________________
Paint II   |  20  28  32  25  25
We should like to adopt paint I, the cheaper one, unless we have definite reason to believe that paint II is better. Test the hypothesis that MU(2) <= MU(1) at level of significance ALPHA = .10.
A.  State your test statistic and critical region.
B.  Perform your calculations.
C.  State your conclusions.`
`Answer: Assumptions:  (a) Both populations are normal and independent.
              (b) Populations have the common variance SIGMA**2.
A.  Test statistic:  t = [XBAR(1) - XBAR(2)]/[S(XBAR(1) - XBAR(2))]
                     t(critical, ALPHA=.1, df=8) = -1.397
B.  t = [XBAR(1) - XBAR(2)]/[S(XBAR(1) - XBAR(2))]
        where: n(1) = n(2) = 5
        S(XBAR(1) - XBAR(2)) = SQRT((S(1)**2/n(1)) + (S(2)**2/n(2)))
            XBAR(2) = 26              XBAR(1) = 22
            S(2)**2 = 19.5            S(1)**2 = 16.5
        If t(calc) <= t(crit), we reject H(0).
            t(crit) = -1.4
            t(calc) = (22-26)/2.68
                    = -1.491
C.  Since t(calculated) < t(critical), we reject H(0):  MU(2) <= MU(1)
    and adopt paint II.`
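
As a check on part B, the following Python sketch recomputes the statistic from the raw scores. It follows the standard-error formula written in the answer; with equal sample sizes that formula gives the same value as the pooled-variance form. The critical value is the table figure from the answer, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

paint_1 = [26, 16, 20, 25, 23]
paint_2 = [20, 28, 32, 25, 25]

# Standard error of XBAR(1) - XBAR(2), using sample variances (n - 1 divisor).
se = sqrt(variance(paint_1) / len(paint_1) + variance(paint_2) / len(paint_2))
t_calc = (mean(paint_1) - mean(paint_2)) / se

t_crit = -1.397  # lower-tail table value for ALPHA = .10, df = 8
reject = t_calc <= t_crit
print(f"SE = {se:.3f}, t = {t_calc:.3f}, reject H(0): {reject}")
```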

95.

`Five parallel determinations of zinc in an organic substance have been obtained. The results arranged in order are:  16.84%, 16.86%, 16.91%, 16.93%, and 17.08%. The initial reaction is a desire to discard the highest value, which seems to be an outlier.
a)  Briefly describe the considerations which should arise in the mind of the experimenter in deciding how to treat the data.
b)  Perform a statistical test to determine if the datum should be rejected.  What is the inherent weakness of such tests?`
`Answer: a)  1.  Are there any physical reasons on which to base a rejection?
        (e.g., dirty glassware, a spill)
    2.  Are the data normally distributed?
    3.  What are the requirements of the analyses?
b)  Using the first 4 measurements to calculate XBAR and S(X):
    XBAR = 16.885
    S(X) = 0.0420
    t(calc) = (17.08 - 16.885)/(.042*SQRT(1+(1/4)))
            = 4.153
    t(crit, ALPHA=.05, df=3, two-tailed) = 3.182
    Since t(calc) > t(crit), the datum does appear suspect and should
    be considered for deletion.
    NOTE:  In this case,
    VAR(X(5)-XBAR) = VAR(X(5) - [[X(1)+X(2)+X(3)+X(4)]/4])
                   = SIGMA**2 + [1/16][4*(SIGMA**2)] - 2COV(X(5),XBAR)
                   = (SIGMA**2)(1 + (1/4))
    (COV(X(5),XBAR) = 0, since X(5) and XBAR are independent.)`
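
The test in part b) can be sketched in Python as below. It assumes, as the answer does, that the suspect value is judged against the mean and standard deviation of the other four readings. With S(X) left unrounded the statistic is about 4.15 rather than 4.153; the conclusion is unchanged.

```python
from math import sqrt
from statistics import mean, stdev

values = [16.84, 16.86, 16.91, 16.93, 17.08]
reference, suspect = values[:-1], values[-1]  # largest value vs. the other four

xbar = mean(reference)
s = stdev(reference)
k = len(reference)

# Var(X(5) - XBAR) = SIGMA**2 * (1 + 1/k), since X(5) is independent of XBAR.
t_calc = (suspect - xbar) / (s * sqrt(1 + 1 / k))
t_crit = 3.182  # table value: ALPHA = .05, two-tailed, df = k - 1 = 3

print(f"XBAR = {xbar:.3f}, S = {s:.4f}, t = {t_calc:.2f}, suspect: {t_calc > t_crit}")
```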

96.

`A report on the effect of a nuclear power plant on number of fish per unit area in nearby waters states that the hypothesis under test was:  H(0):  MU(1) - MU(2) >= 20 where
MU(1) is mean number of fish during the year before the plant was constructed;  and
MU(2) is mean number of fish during the year after the plant began operation.
a.  Is this a one tailed or a two tailed test?
b.  Will a sample difference of 19 ever result in rejection of this claim?  (i.e., will a reduction of 19 ever lead to rejection of this claim?)`
`Answer: a.  one tailed test
b.  yes`

97.

`a.  The information sheet of an insecticide company carried this statement:
        "Differences in control between our material and the current
        standard material were not statistically significant at the
        99% confidence level."
    How do you interpret this statement?  What, if any, additional information would you like to have in order to make a choice of which material to use?
b.  Suppose that the statement read that differences were significant at the 99% confidence level.  How would you interpret this statement?  What, if any, additional information would you like to have in order to make a choice?
    (In answering this question, it may be helpful to picture an experiment as a way of sampling a population of differences in response to these 2 chemicals.  Commonly, it's hypothesized that such a population of differences has a mean of zero, i.e.,  H(0): MU = 0.  Such a hypothesis is readily tested by using sample differences to set confidence limits.)`
`Answer: a.  As the statement stands, it simply means that the experiment did not show the new material to be either better or worse at controlling insects than the current standard material.  However, no information is available on the respective means and the corresponding measures of variation.  What size sample was used to make the comparison?  Was a one or two tailed test made on the difference, or a confidence interval created around the observed difference?  What was the standard error of the difference?
b.  My first reaction to the statement of significant differences at the .99 level is to ask what the observed difference was, i.e., which product controls insects better?  However, the statement still lacks the information necessary to support it.  Once again, knowledge of the size of the observed difference, the corresponding t-value or confidence interval, and measures of variation are needed to judge the merit of the concluding statement.  Statistically significant differences can be observed simply due to large sample sizes, and differences that are just barely significant may indicate the need for further experimentation or replication.`

98.

`In a study of learning ability, six boys and six girls were chosen at random from a kindergarten class. Scores were obtained as measurements of their ability to learn nonsense syllables. Students were paired on the basis of IQ (that is, the boy with the lowest IQ was paired with the girl with the lowest IQ, etc.). The data are presented below.
                     SCORES
Pair Number      Boys      Girls
-----------      ----      -----
    1             10        13
    2             14         8
    3             16        10
    4             13        13
    5             13        12
    6             15        13
a.  Estimate the difference between the population mean scores for boys and that for girls.
b.  Find a 99% confidence interval for the difference in population means, doing your calculations very roughly, so as to find which of the following is closest to the answer.  Circle the corresponding number.
    1.  -12 to 16
    2.   -4 to  8
    3.   -1 to  5
    4.    0 to  4
    (If you don't like any of these, show your work for partial credit.)
c.  The primary purpose of the pairing was in hopes of reducing (circle one):
    1.  skewness in the data
    2.  the standard error of the difference in sample means
    3.  the degrees of freedom in the appropriate t-test
    4.  heterogeneity of the variances
    5.  correlation between boys and girls`
`Answer: a.  Estimated difference between the population mean scores for boys
    and girls = XBAR(boys) - XBAR(girls)
              = 13.5 - 11.5
              = 2
b.  2.  -4 to 8
    Since it is a paired test, we will compute S(DBAR) in order to
    compute the 99% confidence interval for the difference in
    population means.
    DBAR = XBAR(boys) - XBAR(girls) = 2
    d(i) = (D(i) - DBAR), where D(i) is the difference within the
           ith pair.
       S(D) = SQRT[SUM(d(i)**2)/(n-1)]
            = SQRT(62/5)
            = 3.52
    S(DBAR) = S(D)/SQRT(n)
            = 3.52/SQRT(6)
            = 1.44
    99% confidence interval:
            = DBAR +/- [t(ALPHA=.01, two-tailed, df=5)*S(DBAR)]
            = 2 +/- [(4.032)*(1.44)]
            = 2 +/- 5.8
            = from -3.8 to 7.8
c.  2.  the standard error of the difference in sample means`
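
The paired interval in part b can be reproduced with the sketch below, which works from the six within-pair differences. The t multiplier is the table value quoted in the answer; names such as `diffs` are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

boys = [10, 14, 16, 13, 13, 15]
girls = [13, 8, 10, 13, 12, 13]

# A paired design analyses the within-pair differences, not the two columns.
diffs = [b - g for b, g in zip(boys, girls)]
d_bar = mean(diffs)
se = stdev(diffs) / sqrt(len(diffs))  # S(DBAR) = S(D)/SQRT(n)

t_crit = 4.032  # table value: 99% two-sided, df = 5
low, high = d_bar - t_crit * se, d_bar + t_crit * se
print(f"DBAR = {d_bar}, S(DBAR) = {se:.2f}, 99% C.I. = {low:.1f} to {high:.1f}")
```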

99.

`A chemist studies two treatments applied to a chemical which he must prepare in small quantity because of cost and variability of its properties. The first time he runs the experiment he applies treatment A to half of the first batch and treatment B to the other half of the first batch. He experiments with six batches and on the basis of six pairs of observations he declares the difference between the means of the two treatment populations to be just barely significant. The next time, he runs a similar experiment involving the same chemical and two treatments, and he intends to run a two group (unpaired) experiment.
(1) State some advantages (things in favor of) a two group experiment for the chemist.
(2) State some reasons why the chemist might again prefer a paired design.`
`Answer: (1) A two group experiment allows for more degrees of freedom for making a test.  The group design has twice as many degrees of freedom and in addition it allows for the possibility of getting three or more experimental units out of one test chemical preparation batch.  With the two group experiment there is no restriction with regard to equal sample sizes.
(2) With a paired design, unwanted variability can be handled.  That is, the variance of differences SIGMA(D)**2 will in general be less than the variance of observations.  There is no need to assume normality of each population, and only the differences need to be assumed independent with homogeneous variance when a t-test is employed.`

100.

`We are interested in the wearing capabilities of tires. We obtain Good-day and Good-poor Tires and 9 racing cars (and also the track used for the Indianapolis 500 Race). We put Good-day on the left-hand side of the car (front and rear) and Good-poor on the right-hand side of the car (front and rear). We then allow the cars to complete the 500 miles at a (relatively) safe speed and then measure the wear (in millimeters) per tire.
      Car No.        Good-day        Good-poor
      -------        --------        ---------
        77              17              16
        82              18              19
        92              17              12
        41              16              13
        17              15              14
        22              14              12
        18              10              10
        23              18              15
        43              17              13
a.  All the advertising literature claims equality between Good-day and Good-poor.  Can you present evidence to disprove this claim?  Use a significance level of 5%.
b.  Comment on the validity of this experimental set-up.`
`Answer: a.  Let d be the difference in wear between tires on the left-hand side
    compared to tires on the right-hand side.  We are interested in
    testing the hypothesis that the mean (dBAR) of such difference scores
    is zero.
         H(0):  MU(dBAR)  =  0
         H(1):  MU(dBAR) =/= 0
    The problem is obviously a paired experiment set-up and therefore we
    perform a t-test on the differences.
        Car No.         GD         GP        d(i)        d(i)**2
        -------         --         --        ----        -------
          77            17         16          1            1
          82            18         19         -1            1
          92            17         12          5           25
          41            16         13          3            9
          17            15         14          1            1
          22            14         12          2            4
          18            10         10          0            0
          23            18         15          3            9
          43            17         13          4           16
                                             ----        -------
        SUM                                   18           66
           dBAR = [SUM(d(i))]/[9]
                = 18/9
                = 2
        S(d)**2 = [SUM([d(i)-dBAR]**2)]/[n-1]
                = [[SUM(d(i)**2)]-[n*(dBAR**2)]]/[n-1]
                = [[66]-[9*4]]/[8]
                = 3.75
       t(calc.) = [dBAR-0]/[SQRT([S(d)**2]/n)]
                = [2-0]/[SQRT([3.75]/9)]
                = 3.098
       t(critical, .05, two-tailed, 8 df) = 2.306
    Since t(calculated) > t(critical), reject H(0).  Therefore we can
    claim on the basis of this test that the tires are not equal.
b.  The Indianapolis race track has an oval shape with highly-banked
    curves.  Since the cars travel in only one direction, only the
    inner tires would wear appreciably.  There are many other drawbacks
    to the design, but this one is catastrophic.`

101.

`A computer programmer was concerned about the length of time that would be required to print exam questions and answers at a terminal (a printing device like a typewriter). The programmer knew:
     -that the terminal was capable of printing 30 characters per second;
     -the number of lines (maximum length: 80 characters) of text that made up each question and answer;
     -the identity of the questions to be printed at any particular time; and
     -that question selection for a single exam was a random process.
Accordingly, the programmer estimated the time required to print a set of questions and answers by multiplying the total number of lines to be printed by 80 and then dividing the result by 30 to arrive at an estimated print time in seconds. Later the programmer was informed that his estimates for print time were always too high. What do you recommend that the programmer do to improve his estimation scheme?`
`Answer: A variety of suggestions should be offered in response to this question. Some should focus on the distribution of characters per line and the use of something other than the maximum length as a basis for estimation. (This assumes that the printing device recognizes short lines and does not always attempt to print 80 characters.) My preference would be to form a series of exams covering the anticipated range of usage and using random selection as much as possible. I would print these exams on the terminal and hope that a regression involving:
     Y:  Print time; and
     X:  Total number of lines
would provide an adequate approximation for estimating print time. If not, I would then be inclined to examine use of such variables as question length, answer length, computer load, etc.`

102.

`A test was performed to determine intensity settings for a certain type of filter. In 20 separate runs of the test, the results for filter 1 are as follows:
        96, 83, 97, 93, 99, 95, 97, 91, 100, 92,
        88, 89, 85, 94, 90, 92, 91, 78,  77, 93.
Use this data to answer the following.  (You may assume a normal distribution.)
a.  Write a model to describe an individual measurement from the population in terms of an overall mean and a random element.  Define all terms completely.  (You may assume the intensity settings to be determined by a large number of factors that operate independently.  Assume that each factor makes a small contribution to intensity setting and that contributions are additive.)
b.  Obtain sample estimates for the parameters and set 90% confidence limits for MU.
c.  How would you modify your answer to part b if you knew that SIGMA**2 was 36?
d.  Verify that the sample variance, S**2, is equal to:
    S**2 = [SUM(i=1,n)(e(i)**2)]/(n - P)
    where e(i) is a deviation from the mean and P is the number of parameters estimated other than the population variance.`
`Answer: a.  Population form of model:  Y(i) = MU + EPSILON(i),  i = 1, 2, ...
    where       Y(i):  intensity setting at which operator i can first
                       detect an image using filter 1.
                  MU:  the population average intensity setting for an
                       indefinitely large population of operators.
          EPSILON(i):  a random element representing the difference in
                       intensity setting between that required by a
                       particular operator (Operator i) and the
                       population mean.  The usual assumption is that
                       the EPSILON(i)'s are normally and independently
                       distributed with a mean of 0 and a variance of
                       SIGMA**2.
    Sample form of model:  Y(i) = MU(HAT) + e(i),  i = 1, 2, ...
    where       Y(i):  same as above.
     MU(HAT) or YBAR:  an estimate of MU above.
                e(i):  = Y(i) - MU(HAT)
                       = Y(i) - Y(i)HAT
                       = Y(i) - YBAR
                       A deviation, the difference between an observed
                       intensity setting, Y(i), and what we would
                       estimate for the ith intensity setting, Y(i)HAT.
                       Here, Y(i)HAT = MU(HAT) = YBAR.
b.  PARAMETER    ESTIMATE
    ---------    --------
    MU           MU(HAT) = YBAR = (SUM(i=1,20)(Y(i)))/n
                         = 1820/20
                         = 91
    SIGMA**2     S**2 = (SUM(i=1,20)((Y(i) - YBAR)**2))/(n - 1)
                      = ((SUM(i=1,20)(Y(i)**2)) - ((SUM(Y(i)))**2)/20)/(n - 1)
                      = [(96**2 + ... + 93**2) - ((1820)**2/20)]/19
                      = 39.789 with 19 df
    CONFIDENCE LIMITS (Variance Estimated):
        General form for limits:  parameter estimate +/- (t)*(estimated
                                  standard error of parameter estimate)
        Here S(YBAR) = SQRT(S**2/n) = SQRT(39.789/20) = 1.410, and t is
        the table value with area .05 in each tail.
        In this case, limits for MU = MU(HAT) +/- t(ALPHA/2=.05, df=19) *
                                      (S(MU(HAT)))
                                    = YBAR +/- (1.729) * (S(YBAR))
                                    = 91 +/- (1.729) * (1.410)
                                    = from 88.56 to 93.44
        i.e., 90% of the time that we draw a random sample of 20
        operators and calculate an interval in this way, we will get an
        interval that contains the true mean, MU.  (This assumes that the
        model used is appropriate.)
c.  CONFIDENCE LIMITS (Variance Known):
        General form for limits:  parameter estimate +/- (Z)*(known
                                  standard error of parameter estimate)
        Here SIGMA(YBAR) = SQRT(SIGMA**2/n) = SQRT(36/20) = 1.34.
        In this case, limits for MU = MU(HAT) +/- (Z) * (SIGMA(MU(HAT)))
                                    = YBAR +/- (1.64) * (SIGMA(YBAR))
                                    = 91 +/- (1.64) * (1.34)
                                    = from 88.8 to 93.2
d.  i     Y(i)     Y(i)HAT=YBAR     e(i)
    -     ----     ------------     ----
    1      96          91            +5
    2      83          91            -8
    3      97          91            +6
    4      93          91            +2
    5      99          91            +8
    6      95          91            +4
    7      97          91            +6
    8      91          91             0
    9     100          91            +9
   10      92          91            +1
   11      88          91            -3
   12      89          91            -2
   13      85          91            -6
   14      94          91            +3
   15      90          91            -1
   16      92          91            +1
   17      91          91             0
   18      78          91           -13
   19      77          91           -14
   20      93          91            +2
             SUM(i=1,20)(e(i)**2) = (5)**2 + ... + (2)**2
                                  = 756
   [SUM(i=1,20)(e(i)**2)]/(n - P) = 756/19
                                  = 39.789 with 19 df
   since Y(i)HAT = YBAR.`
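
Parts b through d can be checked with the Python sketch below. It computes the residual-based variance with P = 1 estimated parameter (just MU), confirms that it matches the usual sample variance, and forms both sets of limits. The multipliers 1.729 and 1.64 are the table values used in the answer; the helper `limits` is an illustrative name.

```python
from math import sqrt
from statistics import mean, variance

y = [96, 83, 97, 93, 99, 95, 97, 91, 100, 92,
     88, 89, 85, 94, 90, 92, 91, 78, 77, 93]
n = len(y)
y_bar = mean(y)

# Part d: residuals about the fitted value Y(i)HAT = YBAR, with P = 1.
residuals = [yi - y_bar for yi in y]
p = 1
s2 = sum(e**2 for e in residuals) / (n - p)
same_as_sample_variance = abs(s2 - variance(y)) < 1e-9


def limits(center, std_error, multiplier):
    """Symmetric limits: center +/- multiplier * standard error."""
    return center - multiplier * std_error, center + multiplier * std_error


t_limits = limits(y_bar, sqrt(s2 / n), 1.729)  # part b: variance estimated, df = 19
z_limits = limits(y_bar, sqrt(36 / n), 1.64)   # part c: SIGMA**2 known to be 36

print(f"YBAR = {y_bar}, S**2 = {s2:.3f}, matches variance(): {same_as_sample_variance}")
print(f"t limits: {t_limits[0]:.2f} to {t_limits[1]:.2f}")
print(f"z limits: {z_limits[0]:.1f} to {z_limits[1]:.1f}")
```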

103.

`Molybdenum rods are produced by a production line setup. It is desired to check whether the process is in control. Let X = length of such a rod. Assume X is approximately normally distributed with mean = MU and variance = SIGMA**2, where MU and SIGMA**2 are unknown.
Take N = 400 sample rods, with sample average length XBAR = 2 inches, and SUM((X-XBAR)**2) = 399.
The correct confidence interval for MU at ALPHA = 8% is closest to:
a.  2.2 +/- (1.75/20)
b.  2 +/- (1.41)SQRT(399/400)
c.  2.2 +/- (2.06/20)
d.  2 +/- (1.75/20)
e.  2 +/- (1.67)SQRT(399/400)`
`Answer: d.  2 +/- (1.75/20)
    S**2 = 399/(400-1) = 1
    S(XBAR) = SQRT(S**2/n)
            = SQRT(1/400)
            = 1/20
    C.I. = XBAR +/- Z(ALPHA=.08/2) * S(XBAR)
         = 2 +/- (1.75) * (1/20)`

104.

`John has done an experiment on gallons of water per second that flow in a sewer main in the city. He makes 16 measurements of this flow and finds that their average is 100 and their variance is 9. Find a 98% confidence interval for the mean flow.`
`Answer: C.I. = XBAR +/- [t(df=15,ALPHA=.01)*S/SQRT(n)]
C.I. = 100 +/- [2.602*(3/SQRT(16))]
C.I. = 100 +/- 1.95
     = from 98.05 to 101.95`

105.

`The calculated nitrogen content of pure benzanilide is 7.10%. Five repeat analyses of "representative" samples yielded values of 7.11%, 7.08%, 7.06%, 7.06%, and 7.04%. Using an ALPHA level of size 5%, can we conclude that the experimental mean differs from the expected value? Assume that the measured values are approximately normally distributed.`
`Answer: H(0): MU =   7.10
H(A): MU =/= 7.10
      YBAR = 7.07
      S(Y) = 0.0265
         t = (YBAR - MU)/S(YBAR) = (7.07 - 7.10)/(0.0265/SQRT(5))
           = -2.53
         t(critical, ALPHA=.05, df=4) = +/- 2.776
Since the calculated value of t is not in the critical region, continue H(0) that the nitrogen content has a true value of 7.10%, i.e., the 0.03% difference is ascribable to random error.
or
YBAR +/- t*(S(Y)/SQRT(n))
7.07 +/- 2.776*(0.0265)/(SQRT(5))
P(7.037 <= MU <= 7.103) = 0.95
Continue H(0) that the nitrogen content has a true value of 7.10% at the 95% level, since 7.10 lies within the 95% confidence interval.`
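
Both routes to the conclusion (the t statistic and the confidence interval) can be sketched in Python as follows. The statistic is negative because YBAR lies below 7.10; with S(Y) left unrounded its magnitude is about 2.5. The critical value is the table figure from the answer, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

analyses = [7.11, 7.08, 7.06, 7.06, 7.04]
mu_0 = 7.10  # calculated nitrogen content of pure benzanilide

y_bar = mean(analyses)
se = stdev(analyses) / sqrt(len(analyses))  # S(YBAR)

t_calc = (y_bar - mu_0) / se
t_crit = 2.776  # table value: ALPHA = .05, two-tailed, df = 4

# Equivalent check: does the 95% interval around YBAR cover mu_0?
low, high = y_bar - t_crit * se, y_bar + t_crit * se
continue_h0 = abs(t_calc) < t_crit

print(f"t = {t_calc:.2f}, interval = {low:.3f} to {high:.3f}, continue H(0): {continue_h0}")
```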

106.

`Wire cable is being manufactured by two processes. We need to determine if the processes are having different effects on the mean breaking strength of the cable. Randomly selected samples from each process were submitted to the lab for testing as though they were regular samples. Coded values of the load required to break the cables (tension) are given below:
Process No. 1:  9, 4, 10, 7, 9, 10
Process No. 2:  14, 9, 13, 12, 13, 8, 10
Determine if there is any difference in the mean breaking strength for the two processes at the 95% probability level.`
`Answer: H(0): MU(1) - MU(2) = 0
   n(1) = 6                   n(2) = 7
XBAR(1) = 8.167            XBAR(2) = 11.286
S(1)**2 = 5.37             S(2)**2 = 5.24
F = 5.37/5.24 = 1.025
F(ALPHA = .05, df = 5,6) = 4.39
Since F(calculated) < F(critical), there is no evidence at the 95% probability level of a difference in the variances.  It is, therefore, acceptable to pool the variances.
S(P)**2 = [(n(1) - 1)(S(1)**2) + (n(2) - 1)(S(2)**2)]/(n(1) + n(2) - 2)
        = (5*5.37 + 6*5.24)/11
        = 5.30
S(XBAR(1) - XBAR(2)) = SQRT(5.30/6 + 5.30/7)
                     = 1.28
                   t = [(8.167 - 11.286) - 0]/1.28
                     = -2.44
       t(ALPHA = .05, two-tailed, df = 11) = +/- 2.201
Since |t(calculated)| > t(critical), reject H(0).  Therefore, the two processes differ in mean breaking strength.
Using a confidence interval:
XBAR(2) - XBAR(1) +/- 2.201 * 1.28
            3.119 +/- 2.817
C.I. = P(.30 <= MU(2) - MU(1) <= 5.94) = .95`
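
A sketch of the same pooled test in Python is given below. It assumes the sample variances are computed with the n - 1 divisor, which is what `statistics.variance` uses and what the pooled-variance formula expects; with that divisor the statistic comes out near -2.44. The critical value is the table figure quoted in the answer.

```python
from math import sqrt
from statistics import mean, variance

process_1 = [9, 4, 10, 7, 9, 10]
process_2 = [14, 9, 13, 12, 13, 8, 10]
n1, n2 = len(process_1), len(process_2)

v1, v2 = variance(process_1), variance(process_2)  # n - 1 divisor
f_ratio = max(v1, v2) / min(v1, v2)                # larger variance on top

# Pool the variances, weighting each by its degrees of freedom.
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
se = sqrt(pooled * (1 / n1 + 1 / n2))

t_calc = (mean(process_1) - mean(process_2)) / se
t_crit = 2.201  # table value: ALPHA = .05, two-tailed, df = 11

print(f"F = {f_ratio:.3f}, S(P)**2 = {pooled:.2f}, SE = {se:.3f}, t = {t_calc:.2f}")
print("reject H(0):", abs(t_calc) > t_crit)
```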

107.

`Past production units of a certain jet engine model showed the mean military thrust to be 7600 pounds. The first ten production units manufactured after a model change yielded military thrusts of 7620, 7680, 7570, 7700, 7650, 7720, 7600, 7540, 7670, and 7630. Is there sufficient evidence (use ALPHA = 0.05) that the model change resulted in a higher average military thrust?`
`Answer: Using ALPHA = .05 and a one-tailed t-test we test:
    H(0):  MU <= 7600
    H(A):  MU >  7600
Finding:
YBAR = 7638
S(Y) = SQRT((583,420,000 - ((76,380)**2/10))/9) = 57.3
t = (YBAR - MU)/(S(Y)/SQRT(N))
t = (7638 - 7600)/(57.3/SQRT(10))
t = 2.097
t(critical) = 1.83
Since t(calculated) is larger than t(critical) for a one-sided test at ALPHA = .05, reject the null hypothesis.  At the 95% confidence level, the sample evidence indicates a detectable increase.`

108.

`Past experience shows that, if a certain machine is adjusted properly, 5 percent of the items turned out by the machine are defective. Each day the first 25 items produced by the machine are inspected for defects. If three or fewer defects are found, production is continued without interruption. If four or more items are found to be defective, production is interrupted and an engineer is asked to adjust the machine. After adjustments have been made, production is resumed. This procedure can be viewed as a test of the hypothesis p = .05 against the alternative p > .05, p being the probability that the machine turns out a defective item. In test terminology, the engineer is asked to make adjustments only when the hypothesis is rejected.
Interpret the quality control procedure described above as a test of the indicated hypothesis.  A Type I error results in:
a.  a justified production stoppage to carry out machine adjustments.
b.  an unnecessary interruption of production.
c.  the continued production of an excess of defective items.
d.  the continued production, without interruption, of items that satisfy the accepted standard.`
`Answer: b.  an unnecessary interruption of production. `
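
To put a number on how often that unnecessary interruption would occur, the sketch below computes the Type I error rate of the rule "stop if four or more of 25 are defective" when p = .05. It assumes the 25 inspected items are independent, so the defect count is binomial; that assumption and the helper name `binomial_tail` are not part of the question.

```python
from math import comb


def binomial_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))


# Type I error: the machine is properly adjusted (p = .05) but 4 or more
# of the first 25 items are defective, so production is stopped anyway.
alpha = binomial_tail(25, 0.05, 4)
print(f"P(Type I error) = {alpha:.4f}")
```

Under these assumptions the rule stops a properly adjusted machine on roughly three days in a hundred.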

109.

`The daily yield of a chemical manufactured in a chemical plant, recorded for n = 49 days, produced a mean and standard deviation equal to XBAR = 870 tons and s = 21 tons, respectively.
Test H(0):  MU = 880 against H(A):  MU < 880, using ALPHA = .05.
Calculate BETA for H(A):  MU = 870.`
`Answer: S(M) = S/SQRT(n) = 21/7 = 3
XBAR(crit) = MU(M) + Z(crit)S(M)
           = 880 + ((-1.65)*3)
           = 875.05
Since 870 < 875.05, we reject H(0) and conclude that MU < 880.
BETA is the probability of committing a type II error.  Using the above decision rule and given H(A), it is the probability that XBAR is greater than XBAR(crit) = 875.05 when MU = 870.
                   Z = (875.05 - 870)/3
                     = 1.683
BETA(H(A): MU = 870) = P(XBAR > 875.05)
                     = P(Z > 1.683)
                     = .046`
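
The BETA calculation can be sketched in Python with the standard library's normal distribution. The cutoff uses Z = -1.65 as in the answer; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, mu_a = 880, 870  # hypothesized mean and the specific alternative
s, n = 21, 49

se = s / sqrt(n)                  # S(M) = 3
z_crit = -1.65                    # lower-tail table value for ALPHA = .05
xbar_crit = mu_0 + z_crit * se    # reject H(0) when XBAR falls below this

# Type II error: XBAR lands above the cutoff even though MU is really 870.
beta = 1 - NormalDist(mu=mu_a, sigma=se).cdf(xbar_crit)
print(f"XBAR(crit) = {xbar_crit:.2f}, BETA = {beta:.3f}")
```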

110.

`An economist is interested in the possible influence of "Miracle Wheat" on the average yield of wheat in a district. To do so he fits a linear regression of average yield per year against year after introduction of "Miracle Wheat" for a ten year period. The fitted trend line is
        YHAT(j) = 80 + 1.5*X(j)
                (Y(j): average yield in the jth year after introduction)
                (X(j): jth year after introduction).
a.  What is the estimated average yield for the fourth year after introduction?
b.  Do you want to use this trend line to estimate yield for, say, 20 years after introduction?  Why?  What would your estimate be?`
`Answer: a.  80 + 1.5*4 = 86
b.  No.  I would not want to extrapolate that far beyond the ten years of data.  If I did, my estimate would be 80 + 1.5*20 = 110, but some other factors probably come into play within 20 years.`

111.

`A management analyst is studying production in an electronic component assembly factory. Workers individually assemble components into final products. Each worker is given 100 sets of components to assemble each day. Employees clock out at the time they finish assembling the 100 sets into final products. The analyst has average hourly production rates for each individual worker. Which mean should be used to calculate the overall average production per labor hour?
a.  arithmetic mean
b.  geometric mean
c.  harmonic mean`
`Answer: c.  harmonic mean    The harmonic mean is properly used since the numerator in each    worker's average production is 100 units and the denominator,    hours worked, varies. `
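
A small numerical illustration of why the harmonic mean is the right choice here. The three finishing times are hypothetical, chosen only to make the arithmetic easy; they are not data from the question.

```python
from statistics import harmonic_mean, mean

sets_per_worker = 100
hours = [4, 5, 10]  # hypothetical hours each worker needed for the 100 sets

rates = [sets_per_worker / h for h in hours]  # sets per hour: 25, 20, 10

# Overall production per labor hour: total units over total hours.
overall = sets_per_worker * len(hours) / sum(hours)

print(f"overall rate    = {overall:.2f}")
print(f"harmonic mean   = {harmonic_mean(rates):.2f}")
print(f"arithmetic mean = {mean(rates):.2f}")
```

With the numerator fixed at 100 units per worker, the harmonic mean of the rates equals total units divided by total hours, while the arithmetic mean overstates it.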

112.

`A management analyst is studying production in an electronic component assembly factory. Workers individually assemble components into final products. Workers assemble as many units as they can in an eight hour day. The analyst has average hourly production rates for each individual worker. To calculate the factory's overall average hourly production per worker, which mean should be used?
a.  arithmetic mean
b.  geometric mean
c.  harmonic mean`
`Answer: a.  arithmetic mean    The arithmetic mean of individual average hourly production rates    is the same  as  total production divided by total hours worked,    since individual rates are daily production divided by eight for    every employee. `

113.

`A computer programmer reports that the average time required to run a particular program is 11.67 minutes, and that the variance is 10.27 with 5 df. In the Appendix of his report, he lists the following values for time to run the program:
                          12, 17, 9, 13, 11, 8.
a.  What model for time to run the program was implicit in what he reported?
b.  What does this report (or the model) say about factors that might affect run times?`
`Answer: a.  Y(I) = MU + EPSILON(I)        Where:        Y(I):           time to run job I        MU:             population mean run time        EPSILON(I):     random error term associated with                        job I, usually assumed to be normally                        distributed with a mean of zero and a                        variance of SIGMA**2.b.  The model that he uses in his report says that all important    factors that affect runtime have been held constant. `

114.

`Suppose that you have at your disposal the information below for each of 30 drivers.  Propose a model (including a very brief indication of symbols used to represent independent variables) to explain how miles per gallon vary from driver to driver on the basis of the factors measured.  Information:  1.  miles driven per day  2.  weight of car  3.  number of cylinders in car  4.  average speed  5.  miles per gallon  6.  number of passengers `
`Answer: Y(j) = b(0) + b(1)*X(1) + b(2)*X(2) + b(3)*X(3) + b(4)*X(4) + b(5)*X(6) + e(j), where the dependent variable is variable 5 (miles per gallon) and the independent variables are X(1) - miles driven per day, X(2) - weight of car, X(3) - number of cylinders in car, X(4) - average speed, and X(6) - number of passengers. `

115.

`A hospital is considering use of a new device for measuring patient temperatures.  For each of 50 patients it is proposed that there be one time when the patient's temperature is taken with both a standard thermometer and the new device.  The order of using devices is to be randomly determined for each patient.  The model proposed to describe the temperatures recorded is:  Y(i,j) = MU + TAU(i) + RHO(j) + EPSILON(i,j),  i = 1 or 2,  j = 1, 2, ..., 50.  a.  What is Y(i,j)?  b.  What is TAU(i)?  c.  What is RHO(j)?  d.  What design is proposed? `
`Answer: a.  Y(i,j) is the response measured for treatment i at level j of the blocking factor (Person j).  b.  TAU(i) is the effect of treatment i, treatment 1: standard thermometer, treatment 2: new thermometer.  c.  RHO(j) is the effect of block j (Person j).  d.  The Randomized Complete Block design has been proposed. `

116.

`In the attached Table 1, results for the routine measurement of nickel in a steel standard are reported.  This determination was made daily over a long period of time to establish a quality control program.  In Table 2, the data have been plotted as a tally sheet of individual values.  Clearly, a grouped tally sheet would be more effective in revealing the pattern of variation in these data.
Perform the following --
(a)  Set up a grouped tally sheet and histogram.  A cell interval of 0.05% is recommended.  List the frequency, cumulative frequency and relative cumulative frequency for each cell.
(b)  Calculate the mean and standard deviation (use coding) by both the ungrouped and the grouped procedures.  Compare results.
(c)  What is the mode -- comment -- is it meaningful?
(d)  What is the median?
(e)  Calculate the standard deviation of the mean.
(f)  Plot an ogive.  Plot the data on normal probability paper.  Is it reasonable to assume a normal distribution?  If so, estimate the standard deviation and mean and compare with the calculated values.  Estimate the percentage of values outside of the limits 4.88 to 5.21 and compare with the actual percentage.
Table 1.  Results of Daily Determination of Nickel in a Nickel Steel Standard
Date     % Ni        Date     % Ni        Date     % Ni
Mar. 6   4.95        Apr. 17  4.96        May  29  5.03
     7   5.02             18  4.79             30  5.08
     8   5.17             19  5.06             31  5.20
     9   5.08             20  5.03        June 1   5.11
     10  4.92             21  4.95             2   4.95
     11  4.94             22  5.10             3   4.95
     13  5.22             24  5.05             5   5.00
     14  4.96             25  5.30             6   4.92
     15  5.05             26  5.24             7   5.16
     16  5.02             27  5.00             8   5.14
     17  5.14             28  5.08             9   5.02
     18  5.00             29  5.04             10  5.14
     20  5.07        May  1   4.97             12  5.02
     21  4.83             2   4.86             13  4.97
     22  5.11             3   5.07             14  4.96
     23  4.99             4   4.90             15  5.26
     24  4.98             5   5.22             16  5.11
     25  5.26             6   5.07             17  5.15
     27  4.88             8   5.31             19  4.98
     28  5.01             9   5.05             20  5.15
     29  4.98             10  5.16             21  5.00
     30  5.21             11  5.02             22  5.14
     31  5.15             12  5.18             23  4.98
Apr. 1   5.00             13  4.90             24  5.03
     3   5.00             15  5.20             26  5.01
     4   5.10             16  5.08             27  4.97
     5   5.03             17  5.19             28  5.12
     6   4.97             18  5.16             29  4.98
     7   4.89             19  4.88
     8   5.12             20  4.99
     10  5.27             22  4.92
     11  5.09             23  5.17
     12  5.13             24  5.01
     13  4.93             25  5.02
     14  4.93             26  5.06
     15  5.04             27  5.03
Table 2.  Frequency Table and Tally Sheet for the Data in Table 1
Ni Conc.,    Tally    Frequency      Ni Conc.,    Tally    Frequency
  % (y)      Marks       (f)           % (y)      Marks       (f)
  4.79       X            1            5.05       XXX          3
  4.80                                 5.06       XX           2
  4.81                                 5.07       XXX          3
  4.82                                 5.08       XXXX         4
  4.83       X            1            5.09       X            1
  4.84                                 5.10       XX           2
  4.85                                 5.11       XXX          3
  4.86       X            1            5.12       XX           2
  4.87                                 5.13       X            1
  4.88       XX           2            5.14       XXXX         4
  4.89       X            1            5.15       XXX          3
  4.90       XX           2            5.16       XXX          3
  4.91                                 5.17       XX           2
  4.92       XXX          3            5.18       X            1
  4.93       XX           2            5.19       X            1
  4.94       X            1            5.20       XX           2
  4.95       XXXX         4            5.21       X            1
  4.96       XXX          3            5.22       XX           2
  4.97       XXXX         4            5.23
  4.98       XXXXX        5            5.24       X            1
  4.99       XX           2            5.25
  5.00       XXXXXX       6            5.26       XX           2
  5.01       XXX          3            5.27       X            1
  5.02       XXXXXX       6            5.28
  5.03       XXXXX        5            5.29
  5.04       XX           2            5.30       X            1
                                       5.31       X            1 `
`Answer: a)  (If available, consult file of graphs and charts that could not be computerized.)
    Cell Boundaries    Cell Midpoint     f    Cum f    Rel Cum f
    4.775 - 4.825          4.80          1       1       0.01
    4.825 - 4.875          4.85          2       3       0.03
    4.875 - 4.925          4.90          8      11       0.11
    4.925 - 4.975          4.95         14      25       0.25
    4.975 - 5.025          5.00         22      47       0.47
    5.025 - 5.075          5.05         15      62       0.62
    5.075 - 5.125          5.10         12      74       0.74
    5.125 - 5.175          5.15         13      87       0.87
    5.175 - 5.225          5.20          7      94       0.94
    5.225 - 5.275          5.25          4      98       0.98
    5.275 - 5.325          5.30          2     100       1.00
                                       ___
                                       100
b)  ungrouped YBAR = 504.99/100 = 5.0499 == 5.05
    ungrouped S(Y) = SQRT[(2551.3039 - 2550.1490)/99]
                   = SQRT(0.01166)
                   = 0.108 == 0.11
    Grouped and coded by:  Y = 0.05d + 5.05
      Cell Midpoint      d      f      f*d      f*(d**2)
          4.80          -5      1      -5         25
          4.85          -4      2      -8         32
          4.90          -3      8      -24        72
          4.95          -2     14      -28        56
          5.00          -1     22      -22        22
          5.05           0     15       0          0
          5.10          +1     12      +12        12
          5.15          +2     13      +26        52
          5.20          +3      7      +21        63
          5.25          +4      4      +16        64
          5.30          +5      2      +10        50
                                       ___       ___
                              sum(f*d) = -2      sum(f*d**2) = 448
    dBAR = sum(f*d)/n = -2/100 = -.02
    YBAR = (0.05)(-.02) + 5.05 = 5.049 == 5.05
    S(d) = SQRT[(448 - ((-2)**2)/100)/99] = SQRT(4.525) = 2.127
    S(Y) = (2.127)(0.05) = 0.106 == 0.11
c)  5.00 or 5.02 - not meaningful because no single value occurs with sufficient frequency.
d)  Median is average of 50th and 51st observations - (5.03 + 5.03)/2 = 5.03
e)  S(YBAR) = S(Y)/SQRT(n) = 0.108/SQRT(100) = 0.0108 == 0.011
f)  Estimates graphically should compare closely.  (If available, consult file of graphs and charts that could not be computerized.)  Actual percentage outside = 11%.  Graphical estimate should be within about 2% of this. `
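
The coded (grouped) calculation in part (b) can be reproduced with a short sketch. The midpoints and frequencies are the ones in the grouped tally of part (a); the variable names are illustrative only.

```python
from math import sqrt

# Cell midpoints and frequencies from the grouped tally in part (a).
midpoints = [4.80, 4.85, 4.90, 4.95, 5.00, 5.05, 5.10, 5.15, 5.20, 5.25, 5.30]
freqs = [1, 2, 8, 14, 22, 15, 12, 13, 7, 4, 2]

origin, width = 5.05, 0.05  # coding used in the answer: Y = width * d + origin
d = [round((m - origin) / width) for m in midpoints]  # coded values -5 .. +5

n = sum(freqs)
sum_fd = sum(f * di for f, di in zip(freqs, d))
sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))

# Mean and standard deviation in coded units.
d_bar = sum_fd / n
s_d = sqrt((sum_fd2 - sum_fd ** 2 / n) / (n - 1))

# Decode back to % Ni.
y_bar = width * d_bar + origin
s_y = width * s_d
```

Coding only shifts and rescales the data, so the mean decodes as `width * d_bar + origin` and the standard deviation as `width * s_d`.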

117.

`A coffee dispensing machine provides servings that have a population mean of 6 ounces and a population standard deviation of .3 ounces.  If the difference is measured between randomly chosen cups (e.g. the 7th minus the 15th, the 22nd minus the 29th, etc.), the distribution of differences will have a mean of ______ and a standard deviation of ______. `
`Answer: a.  MU = 0  b.  SIGMA = SQRT(.09/1 + .09/1) = .424 `
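
As a quick numerical check of the answer, assuming the two cups are independent draws from the same population, the variances add while the means cancel:

```python
from math import sqrt

mu, sigma = 6.0, 0.3  # population mean and SD of a single serving (ounces)

# For independent cups X1 and X2:
#   E[X1 - X2]   = mu - mu
#   Var[X1 - X2] = sigma**2 + sigma**2
mean_diff = mu - mu
sd_diff = sqrt(sigma ** 2 + sigma ** 2)
```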

118.

`The administrator of a loan program for small farmers (five foot and under) institutes a new objective scale by which his field investigators are asked to rate small farms on their profit potential.  He suspects that two of his investigators are applying the standard quite differently, which offends his sense of order.  To check on them, he asks both of them to rate 12 randomly chosen farms.  The results:
FARM #        1    2    3    4    5    6    7    8    9   10   11   12
A RATING     90   80   75   80   60   80   55   40   80   65   70   60
B RATING     65   50   50   65   40   50   55   45   55   55   45   45
a)  Use an appropriate statistical test to see whether this is strong enough evidence to reject the hypothesis that they rate in the same way.
b)  Make a scatter diagram of the same data.  Fit a straight line to the set of points by eye.  Estimate the equation of this line using your graph.
c)  How could the information in (a) and (b) TAKEN TOGETHER be useful to the administrator? `
`Answer: a)  The appropriate test in this case appears to be the paired (related samples) t-test, with Y = A rating and X = B rating.
    H(O):  MU(Y) - MU(X)  =  0
    H(A):  MU(Y) - MU(X) =/= 0
    Calculations:
      Y  |    X  |  D = Y - X  |  d = (D - DBAR)  |   d**2
    --------------------------------------------------------
     90  |   65  |      25     |       7.08       |    50.17
     80  |   50  |      30     |      12.08       |   146.01
     75  |   50  |      25     |       7.08       |    50.17
     80  |   65  |      15     |     - 2.92       |     8.51
     60  |   40  |      20     |       2.08       |     4.34
     80  |   50  |      30     |      12.08       |   146.01
     55  |   55  |       0     |     -17.92       |   321.01
     40  |   45  |     - 5     |     -22.92       |   525.17
     80  |   55  |      25     |       7.08       |    50.17
     65  |   55  |      10     |     - 7.92       |    62.67
     70  |   45  |      25     |       7.08       |    50.17
     60  |   45  |      15     |     - 2.92       |     8.51
    --------------------------------------------------------
    835  |  620  |     215     |       0.00       |  1422.91
    YBAR = 69.58    XBAR = 51.67    DBAR = 17.92
    S(D) = SQRT(1422.91/11) = 11.37
    S(DBAR) = S(D)/SQRT(12) = 3.28
    t(calc) = 17.92/3.28 = 5.457
    t(crit, ALPHA=.05, df=11, two-tailed) = +/- 2.201
    Since t(calc) > +t(crit), reject H(O).  Thus, the evidence is strong enough that the hypothesis that they rate in the same way can be rejected.
b)  [Scatter diagram: A-RATING (Y, vertical axis, 40 to 90) against B-RATING (X, horizontal axis, 40 to 80).  A line possibly fit by eye is drawn through two marked points A and B; a "2" on the diagram marks two data points at the same position.]
    The equation associated with the above line fitted by eye is:
        YHAT = 20 + (1.0*X)
    The estimated equation found by the least squares method is:
        YHAT = 14.69 + (1.063*X)
c)  The information in part (a) does imply that the investigators do apply the standards quite differently.  However, using the information in part (b), the administrator can estimate one of the ratings given the other. `
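
The paired t-test in part (a) can be reproduced with the standard library alone. This is a sketch of the same arithmetic, not a replacement for a statistics package; the critical value 2.201 is copied from the answer rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

a = [90, 80, 75, 80, 60, 80, 55, 40, 80, 65, 70, 60]  # investigator A
b = [65, 50, 50, 65, 40, 50, 55, 45, 55, 55, 45, 45]  # investigator B

# Work with the per-farm differences D = A - B.
diffs = [ai - bi for ai, bi in zip(a, b)]
n = len(diffs)

d_bar = mean(diffs)          # mean difference
s_d = stdev(diffs)           # sample SD of the differences (n - 1 divisor)
se = s_d / sqrt(n)           # standard error of the mean difference
t_calc = d_bar / se

t_crit = 2.201               # two-tailed 5% point, 11 df (from the answer)
reject = abs(t_calc) > t_crit
```

Pairing by farm removes farm-to-farm variation in profit potential, which is why the test is run on the differences rather than on the two columns separately.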

119.

`We want to compare two machines for production line speed of beer bottle manufacturing.  At the end of each of n(1) = 9 days, the number of bottles produced by machine(1) yield XBAR(1) = 19, S(1)**2 = 4.  For machine(2), we have n(2) = 6 days and XBAR(2) = 17, S(2)**2 = 9.  Assume independence of samples, normality, and equality of unknown variances.  To test H(0):  MU(1) = MU(2) vs. H(1):  MU(1) =/= MU(2) at ALPHA=.05, we need a table value equal to:  a.  1.771  b.  1.960  c.  2.160  d.  2.131  e.  2.145 `
`Answer: c.  2.160  t(df=n(1)+n(2)-2=13, ALPHA=.05, Two-tailed) = 2.160 `

120.

`Four replicate analyses of each of two ends of a special metal rod were made.  All eight analyses were made in random order.  Results for copper analyses on end A were:  4.02, 4.04, 4.08, and 4.05.  On end B, they were:  4.08, 4.06, 4.12, and 4.10.  At the 95% probability level, can we reject the hypothesis of no difference in copper content for the two ends?  At the 99% level? `
`Answer: Mean for end A = (4.02 + 4.04 + 4.08 + 4.05)/4 = 4.0475
Variance for end A = .000625
Mean for end B = (4.08 + 4.06 + 4.12 + 4.1)/4 = 4.09
Variance for end B = .000667
S(P)**2 = [(n(1) - 1)(S(1)**2) + (n(2) - 1)(S(2)**2)]/(n(1) + n(2) - 2)
        = (3*.000625 + 3*.000667)/6 = .000646
S(ABAR - BBAR) = SQRT(.000646/4 + .000646/4) = .0180
H(O):  MU(A) - MU(B) = 0
t = (-.0425 - 0)/.0180 = -2.36
t(ALPHA=.05, df=6) = 2.447      Do not reject the null hypothesis.
t(ALPHA=.01, df=6) = 3.707      Do not reject the null hypothesis. `
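
The pooled-variance calculation above can be written as a small helper. `pooled_t` is a hypothetical name introduced here for illustration; the critical values are still looked up in a t table, as in the answer.

```python
from math import sqrt
from statistics import mean, variance

end_a = [4.02, 4.04, 4.08, 4.05]
end_b = [4.08, 4.06, 4.12, 4.10]

def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances; returns (t, df)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    # Standard error of the difference between the two means.
    se = sqrt(sp2 * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / se, df

t_calc, df = pooled_t(end_a, end_b)
```

With 6 df the statistic falls just short of the two-tailed 5% critical value of 2.447, which matches the conclusion in the answer.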

121.

`A biologist is working with dangerous chemical residues found in wild skunks.  Of particular interest is the possible relationship between the % Mercury accumulation in the liver and the % Tellurium accumulation in the lungs.  He purchases seven "chemically clean" skunks, and subjects them to a diet containing Tellurium and Mercury.  The amounts absorbed by an animal will, of course, vary from animal to animal.  The results were:
      % Mercury (X)                 % Tellurium (Y)
      -------------                 ---------------
           3                               3
           5                               4
           2                               2
           4                               5
           6                               2
           1                               0
           7                               1
Possible useful summaries:
           SUM(X) =  28.00000                    SUM(Y) = 17.00000
        SUM(X**2) = 140.00000                 SUM(Y**2) = 59.00000
 SUM([X-XBAR]**2) =  28.00000          SUM([Y-YBAR]**2) = 17.71429
         SUM(X*Y) =  72.00000    SUM([X-XBAR]*[Y-YBAR]) =  4.00000
At the 5% level of significance, do you think there is a linear relationship between % Mercury accumulation and the % Tellurium accumulation? `
`Answer: Using Pearson's Product Moment correlation coefficient as an indicator of the strength of a linear relationship:
    r = SUM([X-XBAR]*[Y-YBAR]) / SQRT(SUM([Y-YBAR]**2) * SUM([X-XBAR]**2))
      = 4.000 / SQRT(17.71429 * 28.00000)
      = 0.1796
To see if this is suspiciously large we may refer to special tables or use the approximate t-test.
    H(O):  Correlation is zero.
    H(A):  Correlation is other than zero.
i.e.
    H(O):  RHO  =  0
    H(A):  RHO =/= 0
    t(calc.) = r / SQRT([1 - r**2]/[n - 2])
             = 0.1796 / SQRT([1 - 0.1796**2]/5)
             = 0.4082  with 5 df
    t(crit., df=5, ALPHA=.05, two-tailed) = +/- 2.571
Since -t(crit.) < t(calc.) < +t(crit.) continue (do not reject) H(O).  It seems there is no linear relationship between Mercury and Tellurium accumulations.
However, a plot of the data reveals:
[Scatter plot of % Tellurium (vertical axis, 0 to 5) against % Mercury (horizontal axis, 1 to 8): the points rise from (1,0) through (2,2) and (3,3) to a peak at (4,5), then fall through (5,4) and (6,2) to (7,1).]
The plot shows that Mercury and Tellurium are related by a curve having a maximum Y when X=4.  Why didn't the t-test reveal this?  It is because the test only looks at linear correlation.  These two variables are related, but not in a linear manner. `
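
The correlation and its approximate t statistic can be recomputed directly from the seven data pairs. The sketch below uses only the standard library and the same formulas as the answer; variable names are illustrative.

```python
from math import sqrt

x = [3, 5, 2, 4, 6, 1, 7]   # % Mercury
y = [3, 4, 2, 5, 2, 0, 1]   # % Tellurium
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n

# Corrected sums of squares and cross products.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Pearson correlation and the t statistic for H(O): RHO = 0.
r = sxy / sqrt(sxx * syy)
t_calc = r / sqrt((1 - r ** 2) / (n - 2))
```

A small `r` here says only that a straight line fits poorly; it does not rule out the curved pattern visible in the data.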

122.

`We want to know which of two types of filters should be used over an oscilloscope to help the operator pick out the image on the cathode ray tube.  A test was designed in which the strength of a signal could be varied from zero up to the point where the operator first detects the image.  At this point, the intensity setting was read.  The lower the reading when the image was first detected, the better the filter is.  Because people vary in their ability to detect the image, twenty operators were selected and each one made one reading for each filter.  From the results which are tabulated below, test the null hypothesis of no detectable difference in the filters.  If they do differ at some ALPHA level of less than .10, tell which is best.
Operator  F1  F2     Operator  F1  F2     Operator  F1  F2
   1      96  92        8      91  90        15     90  89
   2      83  84        9     100  93        16     92  90
   3      97  92       10      92  90        17     91  90
   4      93  90       11      88  88        18     78  80
   5      99  93       12      89  89        19     77  80
   6      95  91       13      85  86        20     93  90
   7      97  92       14      94  91 `
`Answer: DBAR = SUM(D)/N = 40/20 = 2
 F1        F2      D=F1-F2       D-DBAR       (D-DBAR)**2
 96        92         4             2                4
 83        84        -1            -3                9
 97        92         5             3                9
 93        90         3             1                1
 99        93         6             4               16
 95        91         4             2                4
 97        92         5             3                9
 91        90         1            -1                1
100        93         7             5               25
 92        90         2             0                0
 88        88         0            -2                4
 89        89         0            -2                4
 85        86        -1            -3                9
 94        91         3             1                1
 90        89         1            -1                1
 92        90         2             0                0
 91        90         1            -1                1
 78        80        -2            -4               16
 77        80        -3            -5               25
 93        90         3             1                1
                     --                            ---
                     40                            140
S(D) = SQRT[SUM((D-DBAR)**2)/(N-1)] = SQRT(140/19) = 2.7145
Using a paired t test:
        H(O):  MU(F1 - F2) = 0
        H(A):  MU(F1 - F2) =/= 0
        t(calculated) = (DBAR - MU(F1-F2))/(S(D)/SQRT(N))
                      = (2 - 0)/(2.7145/SQRT(20))
                      = 3.30
        t(ALPHA = .01, df=19) = 2.86    and    t(ALPHA = .001, df=19) = 3.88
At ALPHA = .01, t(calculated) > t(critical), so you would reject H(O).
At ALPHA = .001, t(calculated) < t(critical), so you would continue (not reject) H(O).
Since F1BAR = 91 and F2BAR = 89, and lower readings are better, F2 should be considered the best. `

123.

`Two methods were used in a study of the latent heat of fusion of ice.  Both method A (an electrical method) and method B (a method of mixtures) were conducted with the specimens cooled to -0.72 degrees C.  The data represent the change in total heat from -0.72 degrees C to water at 0 degrees C, in calories per gram of mass.
   METHOD A       METHOD B
      79.98          80.02
      80.04          79.94
      80.02          79.98
      80.04          79.97
      80.03          79.97
      80.03          80.03
      80.04          79.95
      79.97          79.97
      80.05
      80.03
      80.02
      80.00
      80.02
Is there any difference in the 2 methods at the 5% probability level? `
`Answer: H(0):  MU(A)  =  MU(B)
H(A):  MU(A) =/= MU(B)
YBAR(A) = 80.02            YBAR(B) = 79.98
S(A)**2 = 0.000574         S(B)**2 = 0.000984
      n = 13                     n = 8
F = 984/574 = 1.71         F(ALPHA=.05, df=7,12) = 2.92
Since F(calculated) < F(critical), we can assume at the .95 level that there is no difference in the standard deviations.  Therefore, it is acceptable to pool.
S(P) = SQRT[((12*.000574) + (7*.000984))/(12+7)] = 0.0269
t = ((80.02-79.98)-0)/(0.0269*SQRT(1/13+1/8)) = .04/(0.0269*.45) = 3.30
t(ALPHA=.05, df=19, two-tailed) = 2.093
Since t(calculated) exceeds t(critical), reject H(0), i.e. the 2 methods are different. `
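
The variance ratio and pooled t statistic can be recomputed from the raw readings. This sketch carries the means at full precision, so its t value comes out somewhat larger than the 3.30 above, which is based on means rounded to two decimals; the conclusion is the same either way. Critical values are not computed here and would still come from F and t tables.

```python
from math import sqrt
from statistics import mean, variance

method_a = [79.98, 80.04, 80.02, 80.04, 80.03, 80.03, 80.04,
            79.97, 80.05, 80.03, 80.02, 80.00, 80.02]
method_b = [80.02, 79.94, 79.98, 79.97, 79.97, 80.03, 79.95, 79.97]

n_a, n_b = len(method_a), len(method_b)
var_a, var_b = variance(method_a), variance(method_b)

# Variance ratio with the larger sample variance in the numerator.
f_ratio = max(var_a, var_b) / min(var_a, var_b)

# Pooled standard deviation and two-sample t statistic.
df = n_a + n_b - 2
s_p = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / df)
t_calc = (mean(method_a) - mean(method_b)) / (s_p * sqrt(1 / n_a + 1 / n_b))
```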

124.

`Propose and justify your proposal for a relation, if any, of the following variables on steam use.
      Steam Use  Production  Wind  Days Worked  Days Below 32 deg  Temperature
 1.     10.98       .61      7.4       20              22             35.3
 2.     11.13       .64      8.0       20              25             29.7
 3.     12.51       .78      7.4       23              17             30.8
 4.      8.40       .49      7.5       20              22             58.8
 5.      9.27       .84      5.5       21               0             61.4
 6.      8.73       .74      8.9       22               0             71.3
 7.      6.36       .42      4.1       11               0             74.4
 8.      8.50       .87      4.1       23               0             76.7
 9.      7.82       .75      4.1       21               0             70.7
10.      9.14       .76      4.5       20               0             57.5
Values are on a monthly basis for a manufacturing firm.  Wind and temperature entries are monthly means.  Days worked and Days Below 32 degrees are number of days in a month. `
`Answer: The model proposed here is:
Y(Steam use) = B(0) + B(1) * X2(prod) + B(2) * X6(temp) + E
This description based on temperature (X6) and production (X2) accounts for around 96% of the variation in steam use.  The fitted equation is:
YHAT = 11.31 + 4.56*X2 - .09*X6.
The tests of individual b values and F test of regression mean square are significant.  The residual patterns seem to be acceptable.  Since Days below 32 degrees and Days worked are highly correlated with Temperature and Production (respectively) we suspect that they will not add much to the regression.
Model with Temperature and Production:
                    df          SS          M.sq.
Regression           2         27.90        13.95
Error                7          1.21          .17
R**2 = .958
Model with all five variables:
                    df          SS          M.sq.
Regression           5         28.42         5.68
Error                4           .69          .17
R**2 = .976 `

125.

`                      Average June
August Yield       Minimum Temperature       June Rainfall
     Y                   X(1)                    X(2)
    13.1                 50.4                    3.1
    14.1                 51.0                    5.0
    15.7                 49.1                    6.7
    14.3                 51.2                    5.2
    15.2                 48.1                    6.9
    16.7                 48.0                    7.8
    13.8                 51.0                    5.6
    12.4                 49.6                    4.0
    11.5                 53.1                    3.7
    15.3                 48.2                    6.5
    14.4                 52.2                    4.8
    13.3                 50.5                    4.3
    12.5                 54.2                    1.9
    12.7                 50.1                    5.6
    16.5                 49.9                    6.8
The above data was studied with the aid of a computer.  It is data typical of actual corn yield information recorded in Oklahoma.  The correlations were as follows:
    yield vs. temperature   :  -.657
    yield vs. rainfall      :   .846
    temperature vs. rainfall:  -.796
Now, although the correlation between yield and temperature is strong and negative, the least squares equation given by the computer printout was:   Y = 7.79 + .0379X(1) + .847X(2).
Is the sign of the temperature variable X(1) consistent with the negative correlation coefficient?  Explain. `
`Answer: The printout is correct.  The correlation need not have the same sign as the coefficient of the variable in the least squares fit.  In the presence of X(2) the effect of X(1) need not be the same as the effect of X(1) alone.  This makes sense in this experiment.  Corn needs moisture, and rain is usually accompanied by cool weather, but the best combination for corn yield is warm and wet weather. `

126.

`Listed below is some fictitious data concerning
1.  Amount of oil needed to fill the tank for the heating plant of a building.
2.  Days elapsed since last oil delivery.
3.  Average outside temperature (Fahrenheit) during the time since last delivery.
Oil   Days since last delivery     Average Temperature (F)
 60             16                              10.1
 41             17                              22
 50             16                              15.4
 29             13                              18
 81             19                               7.3
 74             26                              28
 57             25                              34
a.  Write and fit 2 models
    1. One relating oil consumption to average temperature alone.
    2. One relating oil consumption to both temp. and days since last delivery.
b.  Compare the estimated regression coefficients for temperature from these two models.  Why do you think there is this difference (or lack of difference)?
c.  Compare the estimated intercepts for the two models.  Does either of these values seem reasonable?  Why? `
`Answer: a.  1.  Y = b(0) + b(1)*X(3) + e
        Y = 61.216 - .27084*X(3)
    2.  Y = b(0) + b(2)*X(2) + b(3)*X(3) + e
        Y = .42296 + 5.0193*X(2) - 2.0289*X(3)
b.  Regression coefficients for temp.
    Model 1: -.27084    Model 2: -2.0289
    The coefficients for average temperature are different because the variable "Days since last delivery" provides important information for explaining variation in oil use.  When average temperature is fitted without using information about time since last delivery, all times are treated as if the same and the coefficient for temperature is calculated as if times were the same.  When variation in time since last delivery is taken into account, the coefficient for temperature no longer is calculated as if there were no variation in time since last delivery.
c.  Intercepts     Model 1: 61.2      Model 2: .42296
    R-Square       Model 1:  .02      Model 2:  .99
    I would go with Model 2 because Model 1 does not account for much of the variation, and the intercept for Model 2 does make more sense.  If it has been zero days since the last delivery, they should not need much oil, even if it has been very cold (0 degrees) that day. `

127.

`It is known that long, thin titanium curtain rods lengthen with increasing temperature.  A sample of n = 20 identical titanium rods is selected.  Each is subjected to a particular uniform temperature X for a specified time.  Let Y denote the change in length.  The readings are (X(1), Y(1)), ..., (X(20), Y(20)), with data XBAR = 2 (in hundreds of degrees Fahrenheit), YBAR = 3 (in milli-inches), SUM((X - XBAR)**2) = 10, SUM((Y - YBAR)**2) = 40, and SUM[(X - XBAR)(Y - YBAR)] = 16.  The least squares regression line of the form Y = a + bX has values a = __________ and b = __________ respectively.  (a)  4/5, 7/5  (b)  3/4, 3/2  (c)  8/5, -1/5  (d)  7/4, -1/2  (e)  1/5, 13/5 `
`Answer: (c)  8/5, -1/5  (option (c) lists the slope first: b = 8/5 and a = -1/5).  b = SUM(i = 1, n)[(X(i) - XBAR)(Y(i) - YBAR)]/SUM(i = 1, n)[(X(i) - XBAR)**2] = 16/10 = 8/5;  a = YBAR - b(XBAR) = 3 - (8/5)*2 = -1/5 `

128.

`It is known that long, thin titanium curtain rods lengthen with increasing temperature.  A sample of n = 20 identical titanium rods is selected.  Each is subjected to a particular uniform temperature X for a specified time.  Let Y denote the change in length.  The readings are (X(1), Y(1)), ..., (X(20), Y(20)), with data XBAR = 2 (in hundreds of degrees F), YBAR = 3 (in milli-inches), SUM((X - XBAR)**2) = 10, SUM((Y - YBAR)**2) = 40, and SUM[(X - XBAR)(Y - YBAR)] = 16.  When the temperature is set at 400 degrees (i.e., X = 4), then the predicted value of the lengthening of the rod is closest to ______ milli-inches.  (a)  24/5  (b)  17/5  (c)  5  (d)  3/5  (e)  31/5 `
`Answer: (e)  31/5.  b = SUM(i = 1, n)[(X(i) - XBAR)(Y(i) - YBAR)]/SUM(i = 1, n)[(X(i) - XBAR)**2] = 16/10 = 8/5;  a = YBAR - b(XBAR) = 3 - (8/5)*2 = -1/5.  Hence, YHAT = a + bX = -1/5 + (8/5)*X.  Now X = 4, hence YHAT = -1/5 + (8/5)*4 = 31/5 `
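
Because the summary statistics are small integers, the slope, intercept, and prediction can be checked exactly with rational arithmetic. The sketch below uses Python's `fractions` module; `predict` is a hypothetical helper name.

```python
from fractions import Fraction

# Summary statistics given in the question.
x_bar, y_bar = Fraction(2), Fraction(3)
sxx = Fraction(10)   # SUM((X - XBAR)**2)
sxy = Fraction(16)   # SUM((X - XBAR)(Y - YBAR))

b = sxy / sxx             # slope
a = y_bar - b * x_bar     # intercept

def predict(x):
    """Predicted lengthening (milli-inches) at temperature x (hundreds of degrees)."""
    return a + b * x

y_hat_at_4 = predict(4)
```

Note that SUM((Y - YBAR)**2) = 40 is not needed for the fitted line or the prediction.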

129.

`The following data were obtained in a study of road width and the number of accidents occurring per hundred million vehicle miles.
    Width    Number of Accidents
     73             42
     50             83
     62             58
     30             93
     25             90
The Department of Transportation wishes to use width to predict number of accidents.  Determine an equation which will enable them to do this.  Can the department significantly improve its prediction of number of accidents by using the data on width, over what the prediction would be not using the width data?  (HINT:  do a hypothesis test.)  (Get the appropriate formulas set up, data inserted, then approximate.) `
`Answer: X = Width        Y = Number of Accidents
   SUM(X) =   240                   SUM(Y) =   366
SUM(X**2) = 13198                SUM(Y**2) = 28766
                 SUM(X*Y) = 15852
b = (15852 - (240*366)/5)/(13198 - (240**2)/5)
  = (-1716)/1678
  = -1.02
a = 366/5 - ((-1.02)*240/5)
  = 122.16
Regression equation:  YHAT = 122.16 - 1.02*X
Source        df      SS      MSQ      F
------------------------------------------
Regression     1   1754.86  1754.86  23.93
Deviations     3    219.94    73.31
Total          4   1974.8
Critical Region:  F > 10.1
Reject H(O) and conclude that width data can significantly improve the prediction of accidents. `
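
The fitted line and the analysis-of-variance table can be recomputed from the five data pairs. This is a sketch using the same sums as the answer; because it keeps the slope at full precision, the intercept comes out near 122.29 rather than the 122.16 obtained with the slope rounded to -1.02.

```python
x = [73, 50, 62, 30, 25]   # road width
y = [42, 83, 58, 93, 90]   # accidents per hundred million vehicle miles
n = len(x)

# Corrected sums of squares and cross products.
sxx = sum(xi * xi for xi in x) - sum(x) ** 2 / n
syy = sum(yi * yi for yi in y) - sum(y) ** 2 / n
sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

b = sxy / sxx                      # slope
a = sum(y) / n - b * sum(x) / n    # intercept

# ANOVA for the regression: 1 df for regression, n - 2 for deviations.
ss_reg = sxy ** 2 / sxx
ss_res = syy - ss_reg
f_calc = ss_reg / (ss_res / (n - 2))
```

The F statistic is compared with the tabled 5% point for (1, 3) df, as in the answer.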

130.

`Find the regression line of the stopping distance Y on the speed X of cars based on the following data:
    X = speed (mph)                20  30   40   50
    Y = stopping distance (ft)     50  90  150  210 `
`Answer: Y = -64 + (5.4*X) `

131.

`The following data are the result of a thermodynamic experiment:
     X surface area       -2    -1    3    5
     Y heat loss           0     6    9   10
a.  Find the least squares line to fit these data, and make a sketch of the points and the line.
b.  Estimate the loss of heat for a surface area of 4.0.
c.  Would you feel safe in using this line to estimate heat loss for a surface area of 10?  Explain. `
`Answer: a.  Computations:
       X   ^   Y   ^   X**2   ^   Y**2   ^   X*Y
    -------^-------^----------^----------^----------
      -2   ^   0   ^     4    ^      0   ^     0
      -1   ^   6   ^     1    ^     36   ^    -6
       3   ^   9   ^     9    ^     81   ^    27
       5   ^  10   ^    25    ^    100   ^    50
    -------^-------^----------^----------^----------
       5   ^  25   ^    39    ^    217   ^    71
    XBAR = 5/4  = 1.25
    YBAR = 25/4 = 6.25
    bHAT = [(4*71)-(5*25)]/[(4*39)-(5**2)]
         = 1.214
    aHAT = 6.25 - (1.214*1.25)
         = 4.733
    [Sketch: heat loss Y (vertical axis, 1 to 10) against surface area X (horizontal axis, -3 to 5), with the four data points plotted.  Note:  draw a line through the marked points A and B for an approximation to the least squares line.]
b.  YHAT = 4.733 + (X*1.214)
         = 4.733 + (4.0*1.214)
         = 9.589
c.  No.  Since the observed values for surface area range from -2 to 5, I would feel very apprehensive about extrapolating to a surface area of 10. `

132.

`An engineer is interested in the flow rate of a river (volume/min.) at a downstream location, D.  He has a poor set of records for this location, but an extensive set of records for an upstream location U.  He would like to find a way to estimate flow rates at D corresponding to various flow rates at U.
a)  If we use a regression model for a straight line, what are the usual symbols and assumptions that apply to measurements taken at U and D?
b)  If there were no streams of any consequence entering the river between U and D, how would our choice of model be influenced and what parameters are to be estimated for a straight line relation?
c)  If a number of major streams entered the river between U and D, how would our choice of model be influenced and what parameters are to be estimated for a straight line relation?
d)  Suppose that there are several streams entering the river between U and D.  Estimate the slope of a straight line linking these measurements.
          Flow Rate at U        Flow Rate at D
          --------------        --------------
                 1                     3
                 2                     7
                 3                     9
                 4                     9
                 5                    12 `
`Answer: a)  The measurements at U are values for the independent variable usually represented by X and are assumed to be measured with negligible error.  The measurements at D are values for the dependent variable usually represented by Y such that each Y(i) is assumed to be normally and independently distributed with mean = ALPHA + BETA*X(i) and variance = SIGMA**2.
b)  We might force the straight line through the origin (X = 0, Y = 0).  Parameters to be estimated would be BETA and SIGMA**2.
c)  We should then be reluctant to claim that there were any values of X where we did not have to estimate the corresponding value of Y.  We would use a model for a straight line through (XBAR, YBAR) and estimate ALPHA, BETA and SIGMA**2.
d)  For a straight line through (XBAR, YBAR), let x(j) = X(j) - XBAR and y(j) = Y(j) - YBAR denote deviations from the means (XBAR = 3, YBAR = 8).  Then
    b = [SUM(x(j)*y(j))]/[SUM(x(j)**2)]
    SUM(x(j)*y(j)) = 20
    SUM(x(j)**2)   = 10
    BETA(HAT)  = b = 2 `
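
The two candidate models in parts b) and c) give different slope estimates for the same data. The Python sketch below contrasts them; the function names are illustrative, and the through-origin slope is included only for comparison since the answer does not compute it.

```python
flow_u = [1, 2, 3, 4, 5]    # X: flow rate at U
flow_d = [3, 7, 9, 9, 12]   # Y: flow rate at D

def slope_through_means(x, y):
    """Slope of the least squares line through (XBAR, YBAR), from deviations."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def slope_through_origin(x, y):
    """Slope of the least squares line forced through (0, 0), from raw values."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

b_means = slope_through_means(flow_u, flow_d)     # model of parts c) and d)
b_origin = slope_through_origin(flow_u, flow_d)   # model of part b)
print(b_means, round(b_origin, 3))                # 2.0 2.545
```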

133.

`A few months ago, Road & Track magazine compared the performance of about 25 sports cars with respect to attainable top speed and fuel economy.  Regressions were run to investigate how both top speed and fuel economy were affected by the horsepower capability of the engine.  The findings are summarized below.
Where:  M(i) = miles per gallon of the ith car.
        S(i) = top speed in miles per hour of the ith car.
        H(i) = horsepower rating of the ith car, measured as the actual number of horses.  All cars tested had between 50 and 300 horsepower engines.
Equation (1):  MHAT(i) = 30 - .05*H(i)      r**2 = .55
Equation (2):  SHAT(i) = 60 + .20*H(i)      r**2 = .72
(1)  Interpret the regression coefficients (slope and intercept) of Equation (1) precisely.
(2)  What do the results suggest about the relative usefulness of horsepower rating in predicting fuel economy on one hand and top speed on the other hand?
(3)  Why would you hesitate to use the regression results for predicting the performance characteristics of cars which have less than a 50 horsepower engine?
(4)  Based on the results of equations (1) and (2) (as well as on intuition), one could say that there exists a "trade-off" between top speed and fuel economy;  i.e., in order to generate an improvement in one, you must sacrifice some of the other.  Compute the magnitude of the trade-off:  the number of miles per hour we would predict would be sacrificed for each additional mile per gallon of fuel economy. `
`Answer: (1)  Slope = -.05, indicating m.p.g. decreases by .05 for each additional horsepower.
     Intercept = 30, indicating all cars got less than 30 m.p.g. and, in fact, since minimum horsepower was 50, all cars got less than 27.5 m.p.g. (30 - .05*50).
(2)  Horsepower is more useful in predicting speed (i.e., r**2 is larger).
(3)  No such cars were in the sample, therefore one cannot safely extrapolate unless one has reason to believe (through other similar experiments) that the relationship follows a similar linear pattern below 50 h.p.
(4)  A change of 1 m.p.g. in economy corresponds to 20 horsepower (1/.05), and 20 horsepower would change speed by 4 m.p.h. (20*.20).  Therefore, an increase of 1 m.p.g. in fuel economy should be accompanied by a decrease of 4 m.p.h. in the speed of the car. `

134.

`Once upon a time an investigator was concerned about fuel consumption and speed of travel of automobiles.  He measured miles per gallon, mpg, for static tests at 25 miles per hour (mph) and 55 miles per hour for many brands of cars.  His report stated that a regression line had been fitted for each brand and that a straight line describes the relation between mpg and mph perfectly since r**2 = 100 for every brand.  Do you subscribe to the claim that a straight line perfectly describes the relation of mpg to speeds between 25 and 55 mph?  Why? `
`Answer: No, I don't, because he has fit regression lines with only two observations, so naturally his r**2 equals 100.  You need at least three observations to test for (or to observe) departures from a straight line.  Two points determine one and only one line. `

135.

`Sometimes a fitted regression equation will do a good job of explaining how response varies with the independent variables measured and fail miserably to agree with theory or previous observations.  For example, an equation relating yield for a crop to rate of Nitrogen (N) application might fit well and indicate that increased N reduces yield.  How can there be such an inconsistency? `
`Answer: A regression equation that fits observed responses well is concerned with summarizing the data set at hand.  If the message conveyed by that equation doesn't fit with theory or previous experience, it may well be that the data set at hand involves variable settings different from those envisioned by theory or encountered in previous experience.  Such conflicts should not be dismissed quickly.  They probably indicate that response is being observed under "new" conditions that warrant detailed comparison with those that have been considered previously.  (These "new" conditions may be due to variation in factors other than those ordinarily thought to be important.  E.g., if we obtained data relating the volume of an ideal gas to pressure, we would obtain consistent results as long as all other factors were kept constant.  But the results that applied under constant conditions would fail if we allowed, say, temperature to vary.  These results would not fit previous experiences or theory.  But, if temperature had been measured, these results could be reconciled with previous experience by further analysis.) `

136.

`It is known that long, thin titanium curtain rods lengthen with increasing temperature.  A sample of n = 20 identical titanium rods is selected.  Each is subjected to a particular uniform temperature X for a specified time.  Let Y denote the change in length corresponding to X.
The readings are (X(1), Y(1)), ..., (X(20), Y(20)), with data XBAR = 2 (in hundreds of degrees Fahrenheit), YBAR = 3 (in milli-inches), SUM((X - XBAR)**2) = 10, SUM((Y - YBAR)**2) = 40, SUM[(X - XBAR)(Y - YBAR)] = 16.
The sample correlation coefficient r =
(a)  4/5            (d)  1/4
(b)  3/4            (e)  1/25
(c)  1/5 `
`Answer: (a)  4/5
     r = SUM((X - XBAR)(Y - YBAR))/
         SQRT(SUM((X - XBAR)**2)*SUM((Y - YBAR)**2))
       = 16/SQRT(10*40) = 16/20 = 4/5 `

137.

`Briefly discuss your evaluation of the following statement.    "Since the linear correlation coefficient (r) between IQ and     earning potential is near 0 there is no relationship between     the two." `
`Answer: The statement should probably be rewritten as:
    "Since the linear correlation coefficient (r) between IQ and earning potential is near 0 there is no linear relationship between the two."
This would better emphasize the fact that no linear relationship may exist; however, there remains the possibility that higher order relationships may exist. `

138.

`Thirty patients in a leprosorium were randomly selected to be treated for several months with one of the following:
A - an antibiotic
B - a different antibiotic
C - an inert drug used as a control
At the end of the test period, laboratory tests were conducted to provide a measure of abundance of leprosy bacilli in each patient.  Scores obtained were:
        Patient  1    2    3    4    5    6    7    8    9    10
Drug A           6    0    2    8   11    4   13    1    8     0
Drug B           0    2    3    1   18    4   14    9    1     9
Drug C          13   10   18    5   23   12    5   16    1    20
a.  Analyze and interpret the results of this trial (use ALPHA = .05).
b.  Write a model appropriate to this trial.  Define all terms and estimate all parameters.
c.  To be consistent with the model you have specified, what order should have been followed in collecting samples from patients and in carrying out lab tests? `
`Answer: a.  Using the program CARROT*** you get the following results:
    Means for Antibiotics
    Treatment        Mean (leprosy bacilli)
    Placebo          12.3
    Drug A            5.3
    Drug B            6.1
    LSD (at ALPHA = .05) = 5.57113
    LSD tests on means:
        H(0):  XBAR(1) - XBAR(2) = 0
        H(A):  XBAR(1) - XBAR(2) =/= 0
    (XBAR(1) - XBAR(2)) +/- LSD
    Placebo and drug B    6.2 +/- 5.57    Interval is from +.63 to +11.77
    Placebo and drug A    7.0 +/- 5.57    Interval is from +1.43 to +12.57
    Neither interval contains zero, therefore, we reject H(0) in both cases.
    Since both antibiotic treatments are significantly different from the placebo, one would conclude that they have a significant effect on reducing leprosy bacilli.  However, antibiotics A and B are not significantly different from each other (their difference of 0.8 is less than the LSD).  The differences in their means could be attributed to chance variation alone.
b.  Model:  Y(I,J) = MU + TAU(I) + EPSILON(I,J)
    Response = overall mean + treatment effect + error (assumed to have a mean of zero and variance = SIGMA**2)
    Estimates:  MU                      7.9
                TAU(1) drug A          -2.6
                TAU(2) drug B          -1.8
                TAU(3) Placebo          4.4
                SIGMA**2               36.9
c.  Suppose that the 30 patients had been assigned identification numbers 1, 2, ..., 30.  Presumably, treatments were randomly assigned to patients on the basis of these numbers.  To be consistent with this set up, samples should have been collected and analyses run according to the identification numbers (i.e., sample collected first and analysis run first from patient 1 and so on).  This amounts to randomly assigning order of sample collection and order of lab analysis as well as patients to treatments.  (It's preferable to collecting samples and running analyses for all on Drug A, then all on Drug B, then all on placebo.) `
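
The program CARROT is not described in this document, so the Python sketch below only reproduces the quantities it reports: the treatment means, the error mean square, and the LSD. The constant `T_CRIT = 2.052` is an assumed table value (two-sided t, ALPHA = .05, 27 df), and the function name `one_way_summary` is illustrative.

```python
import math

scores = {
    "Drug A":  [6, 0, 2, 8, 11, 4, 13, 1, 8, 0],
    "Drug B":  [0, 2, 3, 1, 18, 4, 14, 9, 1, 9],
    "Placebo": [13, 10, 18, 5, 23, 12, 5, 16, 1, 20],
}
T_CRIT = 2.052  # assumed table value: two-sided t, ALPHA = .05, 27 df

def one_way_summary(groups):
    """Treatment means and error mean square for a completely randomized design."""
    means = {name: sum(v) / len(v) for name, v in groups.items()}
    n_total = sum(len(v) for v in groups.values())
    ss_error = sum((y - means[name]) ** 2
                   for name, v in groups.items() for y in v)
    mse = ss_error / (n_total - len(groups))
    return means, mse

means, mse = one_way_summary(scores)
r = 10                                   # patients per treatment
lsd = T_CRIT * math.sqrt(2 * mse / r)    # least significant difference

for first, second in [("Placebo", "Drug A"), ("Placebo", "Drug B"), ("Drug B", "Drug A")]:
    diff = means[first] - means[second]
    verdict = "significant" if abs(diff) > lsd else "not significant"
    print(f"{first} - {second}: {diff:.1f} +/- {lsd:.2f} ({verdict})")
```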

139.

`An imaginary investigation was conducted to compare three different locations at which body temperature can be obtained.  Over a two month period, 20 occasions were randomly selected from a large set of times and beds that could reasonably be used in a ward.  On those occasions, the temperature of the patient in the bed was taken at all three locations where the order of measurement was randomized independently with each patient.  (This imaginary experiment must have been conducted by extraordinarily persuasive people.)  Imaginary results were:
Location:          1            2            3
Patient   1        98.8         98.9         98.6
          2        98.2         98.2         98.0
          3        99.0         99.1         99.1
          4       100.6        100.6        100.5
          5        99.8         99.9         99.6
          6       101.3        101.5        101.2
          7       100.4        100.4        100.2
          8       101.6        101.7        101.6
          9        98.5         98.5         98.4
         10        98.1         98.2         97.9
         11        98.8         99.0         98.7
         12       100.9        100.9        100.7
         13        98.4         98.6         98.3
         14        99.0         99.1         98.8
         15       101.4        101.4        101.3
         16        99.8         99.9         99.6
         17        99.2         99.3         99.2
         18        98.7         98.9         98.6
         19        99.2         99.3         99.2
         20       101.5        101.6        101.5
a.  Summarize the results of this investigation (use ALPHA = .05).
b.  What was the estimated variance (S**2)?  What was the standard error of the difference between location means (S(dBAR))?
c.  Set 95% confidence limits for the temperature difference between location 1 and location 2. `
`Answer: a.  The analysis needed for this trial is for an RCB design with patients corresponding to blocks.  It is found that the three treatments (locations) all yield responses (temperatures) significantly different from one another.
    The results were:
    ANOVA
    Source                df            SS            M.Sq.
    Total                 60        595929.1280
    Mean                   1        595847.2000
    Corrected total       59            81.9219
    Location               2             0.4063        0.20313
    Person                19            81.3906        4.28372
    Experimental Error    38             0.1250        0.00329
    Treatment means
    Location 2            99.75
    Location 1            99.66
    Location 3            99.55
    LSD (t*S(dBAR)) for these means at the .05 significance level = .0366547
    Location 2 - Location 1 = .09
    Location 2 - Location 3 = .20
    Location 1 - Location 3 = .11
    All of the differences between treatment means are larger than the least significant difference, therefore, I conclude that all the treatments yield responses significantly different from each other.
    An F test of the blocking factor:
    F(calc) = 4.28372/0.00329 = 1302
    F(crit; df = 19, 38; ALPHA = .05) is approximately 1.85
    indicates that the blocking factor was worthwhile.
b.  The variance (S**2) is estimated by the mean square for experimental error, which is 0.00329.
    The standard error of a difference between two means (S(dBAR)) is computed using the formula:
    SQRT((2*S**2)/r)
    Therefore, S(dBAR) = SQRT((2*.00329)/20)
                       = .01813836
c.  Confidence limits for the difference between Location 2 and Location 1:
    (99.75 - 99.66) +/- t * S(dBAR)
    .09 +/- 2.021 * 0.018138
    95% confidence interval is from .053 to .127. `

140.

`Treatments A, B, C and D are to be applied to a field pictured below.  3 blocks are to be used.  The soil is known to improve as we go from east to west.
      --------------------------------------------------
      ^                                                ^
      ^                                                ^
East  ^                                                ^  West
      ^                                                ^
      ^                                                ^
      --------------------------------------------------
      -->Soil gets better as we move in this direction-->
1)  Show in the field layout a typical randomized block design.  (Indicate the 3 blocks.)
2)  Write out a fixed effects model appropriate for the data resulting from such a design and experiment.
3)  Write the sources of variation and the degrees of freedom for an ANOVA table associated with the experiment.
4)  Write the expected mean square for treatments in the ANOVA table and explain how this EMS can be used to justify ANOVA F tests for the null hypothesis H(0):  All treatment means are equal. `
`Answer: 1)
          Block 1    Block 2    Block 3
          ---------------------------------
          ^    D     ^    B     ^    A    ^
          ^    A     ^    C     ^    C    ^
    East  ^    B     ^    D     ^    B    ^  West
          ^    C     ^    A     ^    D    ^
          ---------------------------------
2)  Let Y(ij) be the observation for the jth treatment in block i.  Suppose Y(ij) = MU + BETA(i) + TAU(j) + EPSILON(ij), where MU, BETA(i), and TAU(j) are fixed, BETA(i) is the effect of the ith block, TAU(j) is the effect of the jth treatment, and EPSILON(ij) is the random error associated with observation Y(ij).  Suppose also that SUM(i=1,3)(BETA(i)) = 0, SUM(j=1,4)(TAU(j)) = 0, and EPSILON(ij) is distributed normally and independently with mean zero and homogeneous variance, SIGMA**2.
3)  ANOVA table
    Source of      df    SS    MS    EMS
    Variability
    Total          12
    Mean            1
    Blocks          2   SSB
    Treatments      3   SST   MST    SIGMA**2+(3/3)SUM(j=1,4)(TAU(j)**2)
    Exp. Error      6   SSE   MSE    SIGMA**2
4)  If H(0) is true then TAU(j) = 0 for all j and SUM(j=1,4)(TAU(j)**2) = 0.  This means that MST and MSE are both estimating SIGMA**2.  Hence an F ratio near 1 is compatible with H(0), and a large F ratio implies rejection of H(0). `

141.

`Suppose that you are to test the effects of these fertilizer rates on strength of cotton fiber.  Rates (Potassium source in pounds/Acre)
        R1.      36
        R2.      54
        R3.      72
        R4.     108
        R5.     144
Field plots to be used for this test are arranged as below:
----    ----    ----    ----    ----
  1       2       3       4       5
----    ----    ----    ----    ----
  6       7       8       9      10
----    ----    ----    ----    ----
 11      12      13      14      15
(The numbers identify experimental units)
Randomly assign treatments within groups of 5 plots so that a randomized complete block design will be appropriate. `
`Answer: The program RCBPLN*** can be used to generate a random assignment of treatments to experimental units so that each treatment occurs once within each block.  The following is an example:
            1       2       3       4       5
Block 1    R3      R5      R2      R4      R1
            6       7       8       9      10
      2    R1      R4      R5      R3      R2
           11      12      13      14      15
      3    R4      R2      R3      R5      R1
15 Experimental units numbered 1-15
Any procedure which produces a random assignment of treatments within each block is acceptable.  A random numbers table could also be used, drawing I.D. numbers corresponding to the 5 treatments separately for each of the 3 blocks. `
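
As the answer notes, any procedure that randomizes treatments independently within each block is acceptable. The Python sketch below is one such procedure and is not the RCBPLN program, whose workings are not described here; the function name `randomize_rcb` and the seed value are illustrative choices.

```python
import random

rates = ["R1", "R2", "R3", "R4", "R5"]
blocks = {
    1: [1, 2, 3, 4, 5],
    2: [6, 7, 8, 9, 10],
    3: [11, 12, 13, 14, 15],
}

def randomize_rcb(treatments, blocks, seed=None):
    """Shuffle the treatments independently within each block of plots."""
    rng = random.Random(seed)
    plan = {}
    for block, plots in blocks.items():
        order = treatments[:]        # copy so the input list is untouched
        rng.shuffle(order)           # a fresh random order for this block
        plan[block] = dict(zip(plots, order))
    return plan

plan = randomize_rcb(rates, blocks, seed=1)   # the seed only makes the plan repeatable
for block, assignment in plan.items():
    print(f"Block {block}: {assignment}")
```

Every rate appears exactly once in each block, which is the defining restriction of a randomized complete block design.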

142.

`As a researcher working for the milk industry you have been asked to test four feed rations (named A, B, C, and D) and their effect on milk yield for cows.  A total of 16 cows are available for testing purposes.  These cows are not all alike, but do form four similar (homogeneous) groupings of four cows each.
a.  Based on the information presented above, which of the more common experimental designs would you choose for this investigation?  Explain.
b.  Write a model appropriate to the chosen design and define all terms.
c.  What constitutes an experimental unit?  Explain.
d.  What are the treatments and how will you make comparisons among them?
e.  Produce a random assignment of treatments to experimental units for the specified design. `
`Answer: a.  I would use a randomized complete block design because the cows form four similar groups of four cows, and I think that responses may vary among these groups.
b.  Y(I,J) = MU + TAU(I) + RHO(J) + EPSILON(I,J)
    Where Y is the response (milk yield)
          MU is the overall mean
          TAU(I) are the treatment (ration) effects
          RHO(J) are the block (group of cows) effects
          EPSILON is the random element term with mean = 0 and variance = SIGMA**2
c.  An experimental unit involves a cow being fed a particular ration over the test period.
d.  The treatments are the four feed rations, A, B, C, and D.  I would feed these cows their assigned rations for the same length of time and record the milk yield of each cow at the end of the testing period.  I will compute means for the treatments and a measure of the variance, and will then compute an LSD with which I could compare the means.
e.  I have obtained below one possible assignment of treatments to experimental units (table entries are cow numbers).  Cows 1 through 4 are similar, 5 through 8 are similar, etc.
        Blocks:      1      2      3      4
    Treatments:
         1           2      6      9     13
         2           3      5     12     15
         3           4      7     11     16
         4           1      8     10     14 `

143.

`An experiment involving two calculating machines [the Curta (hand-operated) and the SR-51 (electronically-operated)], with the former designated as treatment C and the latter as treatment S, was conducted on 10 sets of 15 two-digit numbers.  The yield data are seconds required to square and sum the 15 numbers.  Since it was suspected that the time required to square and sum a set of numbers was shorter for the second operation on the same set of numbers than it was on the first, this source of variation was taken into account in designing the experiment.  Each of the two treatments appeared five times in the first order of performing the calculation and five times in the second order, and both treatments appeared on each set of numbers;  except for these restrictions the allocation of the treatments was random.  The randomized plan for executing the experiment and the data obtained are given below (treatments are C and S):
------------------------------------------------------------------------
^     ^             Set of numbers squared and summed            ^Order^
^     ^----------------------------------------------------------^     ^
^Order^  1 ^  2  ^  3  ^  4  ^  5  ^  6 ^  7  ^  8  ^  9  ^  10  ^Total^
------------------------------------------------------------------------
^     ^    ^     ^     ^   Time in seconds    ^     ^     ^      ^     ^
^  1  ^  C ^  S  ^  C  ^  S  ^  S  ^  C ^  C  ^  S  ^  C  ^   S  ^     ^
^     ^ 255^ 115 ^ 280 ^ 107 ^ 105 ^ 240^ 195 ^ 110 ^ 202 ^  85  ^ 1694^
^  2  ^  S ^  C  ^  S  ^  C  ^  C  ^  S ^  S  ^  C  ^  S  ^   C  ^     ^
^     ^ 113^ 200 ^ 117 ^ 238 ^ 210 ^ 104^  90 ^ 200 ^ 105 ^ 180  ^ 1557^
------------------------------------------------------------------------
^Total^ 368^ 315 ^ 397 ^ 345 ^ 315 ^ 344^ 285 ^ 310 ^ 307 ^ 265  ^ 3251^
^Mean ^ 184^157.5^198.5^172.5^157.5^ 172^142.5^ 155 ^153.5^132.5 ^  -- ^
------------------------------------------------------------------------
Overall mean:  YBAR(...) = 162.55
Order means:   YBAR(1..) = 169.4 and YBAR(2..) = 155.7
Treatment means:  YBAR(.C.) = 220.0 and YBAR(.S.) = 105.1
Sum of squares of estimated random errors = SUM(eHAT(hij)**2) = 2219.50
a.  The experimental unit is:
b.  The experimental design is:
c.  The number of degrees of freedom for error is:
d.  Show how to obtain eHAT(1C1) in terms of the numbers above.
e.  Show how to compute S(e)**2 = estimated variance of a single observation in terms of the numbers above.
f.  For the above data, the estimated variance of a treatment mean equals:
g.  For the above data, the estimated variance of a difference between two treatment means is:
h.  The 95% confidence interval or interval estimate for the difference between the two treatment means is from:
i.  How do the computations in the preceding statement change in computing the 80% confidence interval?
j.  The sources of variation in the above experiment are:
k.  What effects are orthogonal to each other in the above design?
l.  The coefficient of variation for a single observation is: `
`Answer: a.  The experimental unit is one set of 15 numbers.
b.  The experimental design is a simple change-over (cross-over).
c.  The number of degrees of freedom for error is (2-1)(10-2) = 8.
d.  eHAT(1C1) = 255 - 169.4 - 184.0 - 220.0 + 2(162.55)
              = 6.7
e.  S(e)**2 = 2219.50/8
            = 277.44
f.  The estimated variance of a treatment mean = 2219.50/(8*10)
                                               = 2219.50/80
                                               = 27.74
g.  The estimated variance of a difference between two treatment means
          = (2*2219.50)/(8*10)
          = 2219.50/40
          = 55.49
h.  The 95% confidence interval for the difference between the two treatment means is from:
        220.0 - 105.1 - [2.31*SQRT(2219.5/40)] to
        220.0 - 105.1 + [2.31*SQRT(2219.5/40)]; or
        from 97.69 to 132.11
i.  The computations in the preceding statement change in that 2.31 is changed to 1.40.
j.  The sources of variation are:  overall mean, bias, order effects, set of numbers effects, calculator effects, and residual (random error).
k.  Effects that are orthogonal to each other include:
        the mean and bias are not orthogonal to each other, but the combined mean-and-bias term, order, set, and calculator effects are orthogonal to each other.
l.  The coefficient of variation = [SQRT(2219.5/8)]/[162.55]
                                 = .1025 `
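
Parts d through l are all arithmetic on the summary numbers given in the question, so they can be verified with a brief Python sketch. The variable names are illustrative, and the t value 2.31 is taken from the answer rather than computed.

```python
import math

# Summary numbers quoted in the question.
grand_mean = 162.55
order_mean_1 = 169.4          # mean for first-order calculations
set_mean_1 = 184.0            # mean for number set 1
mean_c, mean_s = 220.0, 105.1
ss_error = 2219.50
df_error = 8
n_per_treatment = 10
T_95 = 2.31                   # t value used in the answer (8 df, 95%)

# d. residual for order 1, treatment C, set 1
residual_1c1 = 255 - order_mean_1 - set_mean_1 - mean_c + 2 * grand_mean

s2 = ss_error / df_error                 # e. variance of a single observation
var_mean = s2 / n_per_treatment          # f. variance of a treatment mean
var_diff = 2 * s2 / n_per_treatment      # g. variance of a difference of means

# h. 95% confidence interval for the C - S difference
diff = mean_c - mean_s
half_width = T_95 * math.sqrt(var_diff)
ci = (diff - half_width, diff + half_width)

cv = math.sqrt(s2) / grand_mean          # l. coefficient of variation

print(f"eHAT(1C1) = {residual_1c1:.1f}, S(e)**2 = {s2:.2f}")
print(f"var(mean) = {var_mean:.2f}, var(diff) = {var_diff:.2f}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}, CV = {cv:.4f}")
```

For the 80% interval of part i, only `T_95` would change (to the 1.40 quoted in the answer).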

144.

`A test was conducted to compare the relative effectiveness of three waterproofing compounds (A, B, C).  A strip of cloth was subdivided into nine pieces:
        Left                  Center                 Right
_____  _____  _____    _____  _____  _____    _____  _____  _____
_____  _____  _____    _____  _____  _____    _____  _____  _____
Each piece was considered to be an experimental unit, but it was suspected that the pieces differed systematically from left to right in capacity to become waterproofed.  Accordingly, the random assignment of compounds to experimental units was restricted so that:
 I.  Each compound was tested once in each set of three pieces (sets are left, center, and right);  and
II.  Each compound was tested once in each of the positions within a set of three (once furthest left in a section, once in the center of a section, and once on the right of a section).
a.  Write a model appropriate to such a trial.
b.  Analyze and interpret the following results for such a randomization scheme:
       Left                   Center                 Right
_____  _____  _____    _____  _____  _____    _____  _____  _____
B, 12  A, 15  C, 16    A, 11  C, 17  B, 10    C, 10  B, 12  A, 14
_____  _____  _____    _____  _____  _____    _____  _____  _____
(consider higher numbers as better) `
`Answer: a.  This is a Latin square (LSQ) design where the model is:
    Y(I,J,K) = MU + TAU(I) + RHO(J) + KAPPA(K) + EPSILON(I,J,K)
    Y is response, degree of waterproofing
    MU is an overall mean for waterproofing
    TAU(I) are the treatment effects
    RHO(J) are the column effects, or piece position on cloth
    KAPPA(K) are the row effects, or the position within the piece
    EPSILON is the random error, assumed to be normally distributed with mean = 0 and variance = SIGMA**2
    Estimates of parameters
        SIGMA**2 = 5.333
        MU      13              RHO(1)  -2              KAPPA(1)  1.333
        TAU(1)    .333          RHO(2)   1.667          KAPPA(2) - .333
        TAU(2) - 1.667          RHO(3)    .333          KAPPA(3) -1
        TAU(3)   1.333
    Treatment means were:
        C = 14.333
        A = 13.333
        B = 11.333
b.  None of the differences among treatment means appear to be significant;  they are all less than the LSD of 18.7148 (ALPHA = .01).  The F test for treatments (alternative test with higher Type II error rate):
        H(0):  TAU(1) = TAU(2) = TAU(3) = 0
        F(calculated) = 1.3125
        F(table, ALPHA = .01, df = 2,2) = 99,
    also does not allow one to reject H(0).  In conclusion, it appears that none of the compounds are significantly different from any other at ALPHA = .01. `
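
The error mean square and F ratio quoted above follow from partitioning the total sum of squares among section, position within section, and compound. A minimal Python sketch of that partition is given below; the tuple layout and the helper name `factor_ss` are illustrative assumptions, not part of the original analysis.

```python
# (section, position within section, compound, score)
observations = [
    ("Left", 1, "B", 12), ("Left", 2, "A", 15), ("Left", 3, "C", 16),
    ("Center", 1, "A", 11), ("Center", 2, "C", 17), ("Center", 3, "B", 10),
    ("Right", 1, "C", 10), ("Right", 2, "B", 12), ("Right", 3, "A", 14),
]

def factor_ss(obs, index, grand_mean):
    """Sum of squares for the factor stored at tuple position `index`."""
    levels = {}
    for row in obs:
        levels.setdefault(row[index], []).append(row[3])
    return sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
               for v in levels.values())

scores = [row[3] for row in observations]
grand_mean = sum(scores) / len(scores)
ss_total = sum((y - grand_mean) ** 2 for y in scores)
ss_section = factor_ss(observations, 0, grand_mean)
ss_position = factor_ss(observations, 1, grand_mean)
ss_compound = factor_ss(observations, 2, grand_mean)
ss_error = ss_total - ss_section - ss_position - ss_compound

t = 3                                   # number of compounds
df_error = (t - 1) * (t - 2)            # 2 df for a 3 x 3 Latin square
mse = ss_error / df_error
f_compound = (ss_compound / (t - 1)) / mse
print(f"MSE = {mse:.3f}, F for compounds = {f_compound:.4f}")
```

With only 2 error degrees of freedom the critical values are very large, which is why neither the LSD nor the F test detects a difference here.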

145.

`A test has been conducted in which four tire brands have been tested using 12 experimental units where an experimental unit consisted of one tire position on one car.  The random assignment of brands to experimental units was restricted so that each brand was tested once on each car.  Results (in amount of wear) were:
         Front Right      Front Left       Rear Right       Rear Left
Car 1    D, 7.17          A, 7.62          B, 8.14          C, 7.76
Car 2    B, 8.15          A, 8.00          D, 7.57          C, 7.73
Car 3    C, 7.74          B, 7.87          A, 7.93          D, 7.80
a.  Write a model appropriate to this trial and estimate all parameters.
b.  Do any of the assumptions for this design make you uneasy?  Explain.
c.  Analyze and interpret these results. `
`Answer: a.  The model is Y(I,J) = MU + TAU(I) + RHO(J) + EPSILON(I,J)
    where Y is the response, tread wear
          TAU(I) are the treatment effects, effects of tire brand
          RHO(J) are the block effects, effects of car
          EPSILON is the random error term with mean = 0 and
                  variance = SIGMA**2
          MU is the overall mean
    Estimates of parameters:
    MU(HAT)    = 7.79
    TAU(A,HAT) = .0599 = .06
    TAU(B,HAT) = .2633
    TAU(C,HAT) = -.04667 = -.047
    TAU(D,HAT) = -.27667 = -.277
    RHO(1,HAT) = -.1175
    RHO(2,HAT) = .0725
    RHO(3,HAT) = .045
    SIGMA**2 = .0419 with 6 df.
b.  Using a randomized block (RCB) design makes me uneasy since I would
    expect wheel position on car to also affect tread wear.  Therefore,
    I would block on wheel position as well as car and use a Latin
    Square design.
c.  Treatment means are:  B = 8.053,  A = 7.85,  C = 7.743,  D = 7.513
    Only one difference is significant at the .05 level.  Tires B and
    D are different since their difference is greater than the LSD.
    (B - D) +/- LSD
    .54 +/- .409
    Interval is from .131 to .949
    Since the interval does not include zero, we reject the null
    hypothesis that the true difference is zero.
    The F test for treatments fails to reject H(0):
    F(calculated) = 3.62 is less than F(table, ALPHA = .05, df = 3,6) = 4.76.
    This is the case where the LSD indicates a significant difference
    while the F test of treatments doesn't.  The two procedures are
    different tests and have different properties regarding Type I and
    Type II error rates.  Here, the LSD is more exposed to Type I errors
    and the F test is more exposed to Type II errors.`
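
The parameter estimates in part a. can be checked numerically. The following is a minimal sketch in Python, not part of the original answer; the names `WEAR` and `rcb_estimates` are made up for illustration. It computes the overall mean, the brand and car effects as level means minus the overall mean, and the residual mean square on (brands - 1)(cars - 1) degrees of freedom.

```python
# Tire wear data copied from the question: (car, brand, wear).
WEAR = [
    (1, "D", 7.17), (1, "A", 7.62), (1, "B", 8.14), (1, "C", 7.76),
    (2, "B", 8.15), (2, "A", 8.00), (2, "D", 7.57), (2, "C", 7.73),
    (3, "C", 7.74), (3, "B", 7.87), (3, "A", 7.93), (3, "D", 7.80),
]


def rcb_estimates(rows):
    """Return (mu, tau, rho, mse, df) for a randomized block layout."""
    mu = sum(y for _, _, y in rows) / len(rows)

    def effects(index):
        # Level mean minus overall mean; index 1 = brand, index 0 = car.
        out = {}
        for level in sorted({row[index] for row in rows}):
            ys = [row[2] for row in rows if row[index] == level]
            out[level] = sum(ys) / len(ys) - mu
        return out

    tau = effects(1)  # treatment (brand) effects
    rho = effects(0)  # block (car) effects
    # Residual sum of squares from the additive model MU + TAU + RHO.
    sse = sum((y - mu - tau[b] - rho[c]) ** 2 for c, b, y in rows)
    df = (len(tau) - 1) * (len(rho) - 1)
    return mu, tau, rho, sse / df, df


mu, tau, rho, mse, df = rcb_estimates(WEAR)
print(round(mu, 2), {k: round(v, 4) for k, v in tau.items()})
print({k: round(v, 4) for k, v in rho.items()}, round(mse, 4), df)
```

Because every brand appears exactly once on every car, the simple "level mean minus overall mean" estimates coincide with the least squares estimates; an unbalanced layout would need a proper model fit instead.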

146.

`Write out the sources of variation and the degrees of freedom for the
following industrial experiment.  Mention also the name of the design.
Three machines were used to produce parts made from four kinds of
metal.  Each machine made one part from each type of metal.  The order
in which the metals were assigned to the machines was established
through a randomization procedure.`
`Answer:
Source of Variation         df
-------------------         --
Total                       12
Mean                         1
Metals                       3
Machines                     2
Residual                     6 (Metal x Machine)

This is a randomized block experiment with metals playing the role of
blocks.`

147.

`An investigator has at his disposal a garden in which there are 16
spaces for planting marigolds.  The investigator is persuaded that a
plant will respond equally well (produce the same number of flowers) in
any one of these spaces.  He wishes to compare 4 new marigold varieties.
The design that matches his notion of the experimental material is:
a.  Latin Square
b.  Randomized Block
c.  Completely Random`
`Answer: c.  Completely Random, since he expects an equal response from all
experimental units, so there is nothing to block on.`

148.

`The model proposed to describe the responses measured in an experiment
is:
Y(i,j) = MU + TAU(i) + EPSILON(i,j)  i=1, 2, 3, or 4
where Y(i,j) is the number of flowers produced by marigold plant j
belonging to variety i.
a.  What is TAU(i)?
b.  What design corresponds to the model?`
`Answer: a.  The treatment effect of variety i, which in this case represents the
    number of blossoms more or less than the overall mean produced by
    a plant belonging to variety i.
b.  Completely Random Design, since the model doesn't include any terms
    for blocking factors.`

149.

`Five laboratories were invited to participate in an experiment to
test the chemical content of four materials known to vary over the
range of interest.  Each laboratory was given two samples of each
material to analyze.  The results were:

                                Laboratories
           -------------------------------------------------------
Material
Specimens       I          II        III         IV         V
           -------------------------------------------------------
    1          8,11       10, 8      7,10        9,12     10,13
    2         14,19       11,15     13,11       10,13     17,19
    3         20,16       21,18     21,20       22,25     24,22
    4         19,13       11,12     17,15       19,17      9,11

Perform the appropriate calculations to determine if there is any
systematic difference between laboratories.`
`Answer: ANOVA:
Source of Variation        df        SS        MS        F
-------------------        --        --        --        -
Total                      40     9696.00
Correction for mean         1     8761.60
Laboratories                4       36.65     9.16     1.97
Materials                   3      628.20
Error                      20       93.00     4.65
Interaction                12      176.55    14.71     3.16 *

F(critical, ALPHA=.05, df=12,20) = 2.28

In this case, the interaction is significant.  This probably masks
the systematic variation in laboratories, which turns out not to
be significant.
To get a clearer picture, separate means were calculated for each
material and laboratory combination.

ML (Material * Laboratory)        Mean Response
--------------------------        -------------
          11                           9.5
          12                           9.0
          13                           8.5
          14                          10.5
          15                          11.5
          21                          16.5
          22                          13.0
          23                          12.0
          24                          11.5
          25                          18.0
          31                          18.0
          32                          19.5
          33                          20.5
          34                          23.5
          35                          23.0
          41                          16.0
          42                          11.5
          43                          16.0
          44                          18.0
          45                          10.0

Mean responses were plotted against materials for each laboratory.
The graph indicated that the lab effects depend on the specific
material being considered.  No lab is consistently different from
other labs for all materials.
For example, Lab 5 gives the highest mean response for materials one
and two, but the lowest for material four.  Similarly, Lab 4 gives the
highest mean response for materials three and four, but the lowest for
material two.
If available, consult file of graphs and diagrams that could not be
computerized for appropriate graph.`
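
The sums of squares in the ANOVA table can be recomputed directly from the data. The sketch below is an illustration in Python, not part of the original answer; `RESULTS` and `two_way_anova` are hypothetical names. It uses the usual formulas for a two-factor layout with equal replication: each sum of squares is a sum of squared totals divided by the number of observations per total, minus the correction for the mean.

```python
# RESULTS[material][lab] holds the two specimen readings from the question.
RESULTS = [
    [(8, 11), (10, 8), (7, 10), (9, 12), (10, 13)],
    [(14, 19), (11, 15), (13, 11), (10, 13), (17, 19)],
    [(20, 16), (21, 18), (21, 20), (22, 25), (24, 22)],
    [(19, 13), (11, 12), (17, 15), (19, 17), (9, 11)],
]


def two_way_anova(cells):
    """Sums of squares and F ratios for a balanced two-factor layout."""
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    n = a * b * r
    grand = sum(sum(cell) for row in cells for cell in row)
    cf = grand ** 2 / n  # correction for the mean

    raw = sum(y * y for row in cells for cell in row for y in cell)
    row_totals = [sum(sum(cell) for cell in row) for row in cells]
    col_totals = [sum(sum(cells[i][j]) for i in range(a)) for j in range(b)]

    ss_rows = sum(t * t for t in row_totals) / (b * r) - cf  # materials
    ss_cols = sum(t * t for t in col_totals) / (a * r) - cf  # laboratories
    ss_cells = sum(sum(c) ** 2 for row in cells for c in row) / r - cf
    ss_inter = ss_cells - ss_rows - ss_cols
    ss_error = (raw - cf) - ss_cells

    ms_error = ss_error / (a * b * (r - 1))
    return {
        "cf": cf,
        "ss_materials": ss_rows,
        "ss_labs": ss_cols,
        "ss_interaction": ss_inter,
        "ss_error": ss_error,
        "f_labs": (ss_cols / (b - 1)) / ms_error,
        "f_interaction": (ss_inter / ((a - 1) * (b - 1))) / ms_error,
    }


table = two_way_anova(RESULTS)
for name, value in table.items():
    print(f"{name:15s} {value:10.2f}")
```

The F ratios here use the within-cell error mean square as the denominator, which matches the table above; treating laboratories as a random factor would call for a different denominator.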

150.

`A completely randomized design was used for an experiment on light
intensity in foot candle power units for three types of lights (M =
mercury vapor, L = low pressure sodium vapor, and H = high pressure
sodium vapor), in one large parking lot.  Suppose the results
obtained were:

        ^       Treatment and Responses (Y(ij))      ^
        ^                                            ^
        ^       M              L             H       ^
        ^--------------^--------------^--------------^
        ^  Y(M1) = 12  ^  Y(L1) = 15  ^  Y(H1) = 20  ^
        ^  Y(M2) = 10  ^  Y(L2) = 14  ^  Y(H2) = 12  ^
        ^  Y(M3) = 11  ^  Y(L3) = 13  ^  Y(H3) =  8  ^
        ^  Y(M4) =  9  ^  Y(L4) = 11  ^  Y(H4) =  7  ^
        ^  Y(M5) =  8  ^              ^  Y(H5) = 23  ^
--------^--------------^--------------^--------------^--------------
        ^              ^              ^              ^  Overall mean
Totals  ^  Y(M.) = 50  ^  Y(L.) = 53  ^  Y(H.) = 70  ^  = 173/14
        ^              ^              ^              ^
Means   ^YBAR(M.)=10.0 ^YBAR(L.)=53/4 ^YBAR(H.)=14.0 ^  = YBAR(..)
--------------------------------------------------------------------

The estimated treatment effects in terms of the above data are
computed as:
a.  tHAT(M) = _______________.
b.  tHAT(L) = _______________.
c.  tHAT(H) = _______________.
The estimated random error effects in terms of the above data may
be computed as:
d.  eHAT(M2) = _______________.
e.  eHAT(L3) = _______________.`
`Answer: a.  10 - 173/14
b.  53/4 - 173/14
c.  14 - 173/14
d.  10 - 10
e.  13 - 53/4`
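
These effects are simple differences of means, so they can be reproduced with exact fractions. The sketch below is in Python and is not part of the original answer; `LIGHT` and `crd_effects` are illustrative names. A treatment effect is the treatment mean minus the overall mean, and a residual is an observation minus its own treatment mean.

```python
from fractions import Fraction

# Responses from the question, keyed by light type.
LIGHT = {
    "M": [12, 10, 11, 9, 8],
    "L": [15, 14, 13, 11],
    "H": [20, 12, 8, 7, 23],
}


def crd_effects(groups):
    """Return (overall mean, treatment effects, residuals) as exact fractions."""
    n = sum(len(ys) for ys in groups.values())
    overall = Fraction(sum(sum(ys) for ys in groups.values()), n)
    means = {k: Fraction(sum(ys), len(ys)) for k, ys in groups.items()}
    t_hat = {k: means[k] - overall for k in groups}
    e_hat = {k: [y - means[k] for y in ys] for k, ys in groups.items()}
    return overall, t_hat, e_hat


overall, t_hat, e_hat = crd_effects(LIGHT)
print("overall mean:", overall)
for k in LIGHT:
    print(k, "effect:", t_hat[k])
print("eHAT(M2):", e_hat["M"][1], " eHAT(L3):", e_hat["L"][2])
```

Using `Fraction` keeps the results in the same form as the answer key (for example 53/4 - 173/14) instead of rounded decimals.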

151.

`In the past a chemical fertilizer plant has produced an average of
1100 pounds of fertilizer per day.  The record for the past year based
on 256 operating days shows the following:
      XBAR = 1060 lbs/day
         S =  320 lbs/day
where XBAR and S have the usual meaning.  It is desired to test
whether or not the average daily production has dropped significantly
over the past year.  Suppose that in this kind of operation, the
traditionally acceptable level of significance has been .05.  But the
plant manager, in his report to his bosses, uses level of significance
.01.  Analyze the data at both levels after setting up appropriate
hypotheses, and comment.`
`Answer: H(0):  MU = 1100
H(A):  MU < 1100
Since n = 256, use Z to approximate t.
S(XBAR) = 320/SQRT(256)
        = 320/16
        = 20
Z(calculated) = (1060 - 1100)/20
              = -40/20
              = -2
Z(critical, ALPHA=.05, one-tailed) = -1.645
Z(critical, ALPHA=.01, one-tailed) = -2.33
Since -2 < -1.645 but -2 > -2.33, H(0) is rejected at ALPHA=.05 but
not rejected at ALPHA=.01.
It appears that the manager is trying to pull a fast one on his
bosses by using ALPHA=.01 and saying production has not dropped.
However, if the traditional level of significance is used, ALPHA=.05,
there is evidence that indicates a drop in production.`
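
The same comparison can be run at both significance levels in a few lines. This is a minimal sketch in Python, not part of the original answer; the function name `lower_tail_z_test` is made up, and the critical values come from the standard library's `statistics.NormalDist` rather than a printed table.

```python
import math
from statistics import NormalDist


def lower_tail_z_test(xbar, mu0, s, n, alpha):
    """One-sided test of H(0): MU = mu0 against H(A): MU < mu0."""
    se = s / math.sqrt(n)                     # standard error of the mean
    z = (xbar - mu0) / se
    z_crit = NormalDist().inv_cdf(alpha)      # negative for a lower-tail test
    p_value = NormalDist().cdf(z)
    return z, z_crit, p_value, z < z_crit


for alpha in (0.05, 0.01):
    z, z_crit, p, reject = lower_tail_z_test(1060, 1100, 320, 256, alpha)
    print(f"alpha={alpha}: z={z:.2f}, critical={z_crit:.3f}, "
          f"p={p:.4f}, reject H(0): {reject}")
```

The p-value lies between .01 and .05, which is why the conclusion flips depending on which level the manager reports.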

152.

`A test of the breaking strengths of two different types of cables was
conducted using samples of n(1) = n(2) = 100 pieces of each type of
cable.
          CABLE I          CABLE II
     -------------------------------------
     YBAR(1) = 1925      YBAR(2) = 1905
        S(1) = 40           S(2) = 50
Do the data provide sufficient evidence to indicate a difference
between the mean breaking strengths of the two cables?  Use ALPHA =
.10.  Assume SIGMA(1)**2 = SIGMA(2)**2.  The tabular value is 1.65.`
`Answer: Z = (1925 - 1905)/[SQRT(1600/100 + 2500/100)] = 3.1
Since 3.1 > 1.65, the data indicate a difference between the mean
breaking strengths of the two cables.`

153.

`A standard method for determining the amount of active ingredient in
propellants is known to have a standard deviation SIGMA = .8.  Two
new propellants, assumed to be homogeneous, were tested five times.
The results of these tests are:
X(1):  63.2, 63.6, 62.7, 64.4, 63.1
X(2):  62.2, 64.8, 62.2, 60.2, 61.1
Test at the .05 level of significance whether there is a difference
in the amount of active ingredient in the two propellants.`
`Answer: H(0):  MU(1) - MU(2) = 0
H(1):  MU(1) - MU(2) =/= 0
XBAR(1) = 63.4
XBAR(2) = 62.1
SIGMA(XBAR(1)-XBAR(2)) = SQRT([SIGMA(1)**2]/[n(1)] + [SIGMA(2)**2]/[n(2)])
                       = SQRT(.64/5 + .64/5)
                       = .506
Using a Z-test:
Z = ((XBAR(1) - XBAR(2)) - 0)/SIGMA(XBAR(1) - XBAR(2))
  = 1.3/.506
  = 2.57
The critical two-tailed value for Z with ALPHA = .05 is 1.96.
Therefore reject H(0) at the 5% significance level since 2.57 > 1.96.`
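
Because SIGMA is taken as known, the test statistic depends only on the two sample means. The sketch below is in Python and is not part of the original answer; `two_sample_z` is an illustrative name. It repeats the calculation from the raw readings.

```python
import math
from statistics import NormalDist, fmean

X1 = [63.2, 63.6, 62.7, 64.4, 63.1]
X2 = [62.2, 64.8, 62.2, 60.2, 61.1]


def two_sample_z(a, b, sigma):
    """Z statistic for H(0): MU(1) - MU(2) = 0 with a common known sigma."""
    diff = fmean(a) - fmean(b)
    se = sigma * math.sqrt(1 / len(a) + 1 / len(b))
    return diff / se


z = two_sample_z(X1, X2, sigma=0.8)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)   # two-tailed, ALPHA = .05
print(f"z = {z:.2f}, critical = {z_crit:.2f}, reject H(0): {abs(z) > z_crit}")
```

If SIGMA were not known, the sample standard deviations would have to be used instead, and with only five readings per propellant a t-test would be the appropriate replacement.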

154.

`Suppose that you have been assigned to estimate the height of a
group of corn plants arranged in 4 rows with 50 plants in each row.
You may take measurements of 10 plants.
a.  Outline a method for obtaining a random sample in such a situation.
b.  What advantages or disadvantages are in such a procedure?`
`Answer: a.  Assign numbers to plants (1 - 200).  Draw a random sample of size 10
    using a random numbers table.  Simplest procedure is to use
    sampling with replacement.
b.  Advantage is that common formulas for mean and variance apply,
    but it's a nuisance to have to number plants and use random
    selection.`

155.

`In a random sample of flashlight batteries, the average useful life
was 22 hours and the sample standard deviation was 2 hours.  How large
should the sample size be if you want the mean of your sample to be
within 1 hour of MU 90 times out of 100 in repeated sampling?
     a.  25
     b.  11
     c.  90
     d.  35
     e.  both c & d.  Since the calculated n is too small for the
         central limit theorem to apply, choose n >= 30.`
`Answer: b.  11
    n = [(2**2)(1.645**2)]/(1**2)
      = 10.8241
    which rounds up to n = 11`
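
The sample size formula above generalizes to any margin and confidence level. This is a minimal sketch in Python, not part of the original answer; `sample_size_for_mean` is a hypothetical helper, and it assumes the normal approximation with the sample standard deviation standing in for SIGMA.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(s, margin, confidence):
    """Smallest n with z * s / sqrt(n) <= margin at the given confidence."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided z value
    return math.ceil((z * s / margin) ** 2)


n = sample_size_for_mean(s=2, margin=1, confidence=0.90)
print("required sample size:", n)
```

Rounding up with `math.ceil` matters here: rounding 10.8 down to 10 would leave the margin slightly wider than 1 hour.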