PROTEUS

#### BIOSTATISTICS: PRACTICE PROBLEM SOLUTIONS

These are the solutions to the set of questions [1-56] for ENH440, as well as to the additional questions 57-81, which appear on THIS page below the answers. (Just keep scrolling.) For most questions only the simple numeric answers have been given; you will need to supply the interpretation for each situation. Some of the more complex questions also link to a more detailed solution.

EXACT 'P' RESULTS: In some cases I have given the P value you should have extracted using the tables. In other cases an "exact" P value from a computerized calculation has been given to illustrate the interpretation of such a value. For example, if P is shown as = 0.0347, this still agrees with P < 0.05 but not with P < 0.01. Likewise, P = 0.0017 is clearly < 0.005 but not < 0.001. We WILL be calculating ONE example of an 'exact P' as part of the "Fisher's Exact Test".
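The comparison between an exact P value and the table cut-offs can be sketched in a few lines of Python. This is only an illustration: the function name `table_statement` and the list of cut-offs are assumptions, so substitute the levels printed in your own tables.

```python
# Conventional table cut-offs (an assumed set; use the ones in your own tables).
TABLE_LEVELS = [0.001, 0.005, 0.01, 0.02, 0.05]


def table_statement(p_exact, levels=TABLE_LEVELS):
    """Return the strongest 'P < level' statement that an exact P value supports."""
    for level in sorted(levels):
        if p_exact < level:
            return f"P < {level}"
    # The exact P is not below even the largest cut-off.
    return f"P > {max(levels)}"


print(table_statement(0.0347))  # P < 0.05
print(table_statement(0.0017))  # P < 0.005
```

The two printed lines reproduce the two examples given in the paragraph above.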

1   OR: 1.94, 1.39, 1.19, 3.26 (CHI-SQ=4.06, P=0.04), 0.63   detailed solution
2   t = 1.2905, 248 df, P > 0.05
3   only #4
4/5   F(treat) = 2.385,   F(age) = 5.474,   F(Tr x Age) = 0.987
6   OR: 5.00, P=0.215 (FET)
7   y' = 150.252 - 1.423(x),   t = 11.917
8   CL: +0.2514, -0.2194   t=0.145,   328.8
9   River rafting ... (F.Exact.T)   P = 0.0035   detailed solution
10   y' = 0.72 + 0.27(x),   y = 2.88 cm,   t = 20.0
11   t=2.429, P<0.05,   76, 127, 178 μgPb/L
12
13   46 df, P < 0.05
14   7.512 P<0.01,   4.814 P<0.05,   1.951 P>0.05
15   t = 3.015, 8 df, P < 0.02   detailed solution
16   t = 3.953, 38 df, P < 0.001
17   t = 1.272, 21 df
18   t = 2.494, 18 df
19   t = 3.993, 5 df
20   chi-sq = 22.1
21   t (age) = 1.62, 58 df, P>0.05;   t (weight) = 0.74, 58 df, P>0.05;   t (bp) = 6.04, 58 df, P<0.001
22   t = 0.5606, 11 df, P = 0.59
23   t = 0.7845, 4 df, P = 0.527 (this is paired)
24   t = 0.6978, 11 df, P > 0.05
25   t = 1.736, 16 df, P > 0.05
26  Ho: "No difference between Nur and Eng students in terms of mean GPA"   P < 0.05
27   P > 0.05
28   chi-sq    3.525    P = 0.06    OR = 2.85
29   chi-sq    6.764    1 df,   P < 0.01   RR = 3.3 (protective)
30    t = 4.202, 18 df,  P < 0.001
31       t = 2.583,   22 df,   P < 0.02
32   (individual calculation)
33   MS(species) = 73.44, F = 4.54, P > 0.05;   MS(chem) = 13.083, F = 0.808, P > 0.05
34
35  chi-sq = 7.18, 2df, P<0.05    (Actually P=0.0276)
36  F = 0.693,  5.403,   6.500
37  This is a paired t-test but requires a log transform because the data are exponential.   t = 1.109, P = 0.3306
38   MPchi-sq:0.93, P:0.335;   HPchi-sq:7.10, P:0.0077(prot);    ESFET: P:0.278
39 (1) RH: r=0.75, y=0.654+0.0892(x), t=2.971, P<0.05;   (2) VEL: r=0.74, y=-0.510+0.0884(x), t=2.879, P<0.05;   (3) CORR betw RH & Vel: r=0.22
40  [please change 6.0 to 60, and 7.0 to 70]   Now this is correct:   Pb = 233.4 - 31.73(pH)
41  chi-sq: 4.43,   P=0.035,  OR 3.18,  CL: 0.94 to 11.15
42    t = 3.315,  7 df,   P = 0.0105
43  F=10.939,  2,12 df, P=0.002301     detailed solution
44  (chi-sq goodness of fit)   (You don't need this for the exam)
45  F(freq):8.44, P<0.01  F(size): 11.25, P<0.01   F(Freq x Size): 4.85, P<0.05  see detailed solution here
46  b= 0.1429,    CL: 0.1074 to 0.1783,   t = 34.65,  4df,    P <0.001
47  d= 2.688,  t= 9.622,   7df,  P <0.001

48   (a) 0.13, (b) 0.81, (c) 0.61, (d) 0.18, (e) >.05, <.002, >.05, >.05  (f) best return in 2 yrs   detailed solution

49   inverse,   r=0.58,  10 df,  P<0.05
50 Fisher's Exact Test      P = 0.0204
51   đ = 7.4,   t = 2.608,   P = 0.025
52   F = 6.17;  P<0.01
53   Completed in class.   t(unpaired) = 1.077   t(paired) = 8.624  (The correct method is paired)   detailed solution
54   (a) X² 0.47, P=0.49;    (b) X² 17.30, (prot) P=0.000032;    (c) X² 9.49, P=0.00207
55   (a) X² 0.08, P=0.78;    (b) X² 0.82, P=0.365;    (c) (error: please omit)
56   X² 6.94,  P=0.0084 (or "P<0.01");    OR: 2.37

### [B]  F/ratios   =  15.79,   5.85,   63.16     detailed solution

ANSWERS TO EXTRA QUESTION SET #57- 81  (Scroll down for these questions)

57   OR: 1.0, 1.17, 0.63, 2.00, 5.44   X² 11.1, P=0.0009   (Scroll down for question)

58   (c) at 250M, Y = 480 ppm; at 500M, Y = 182 ppm;  (d) t=9.88   (Scroll down for question)  please note the changes here

59   (a) >0.05    (b) no   (d) 26   (Scroll down for question)

60   t = 9.7,  98 df,  ss  P<0.001   (Scroll down for question)

61   RR=4.02   ChiSq: 18.9,  1 df   P=0.000014  (P<0.001)   Exposed personnel >4x as likely to develop lung ca.  Reject Ho.   (Scroll down for question)

62   Only expected values are important here.  2 of 9 is 22%, so >20% of cells have E values of 5 or less.  ChiSq not valid.

63   lead(ppm) = 724 - 0.899 M    6 df,   t = 9.90,   P <0.001   (inverse rel., stat sig.)   (Scroll down for question)

64   removed. (essentially the same as #61)

65   HC% = 47.19 - 0.3695 (ppm LEAD).  t = 13.5,   11 df,    P <.001,   r = -0.97,  r2 = 0.94    (Scroll down for question)

70A SOLUTION: We have here TWO variables. The independent variable is the 'treatment' and is a categorical variable with two levels (a hand barrier cream, either antiseptic or not).  The dependent variable is presence or absence of E. coli, clearly a categorical variable with two levels.  So the data will be displayed as a 2x2 table.  Analysis would be by ChiSquare. (Of course the odds ratio would be useful for explaining the strength and direction of the association.)

70B SOLUTION: The independent variable is the same but the dependent variable is now continuous.  This is a candidate for either 1-way ANOVA or unpaired t-test, either of which would be applicable.

70C SOLUTION: This introduces a second independent variable, also categorical, with two levels.  If the dependent variable remains as in 70B (continuous), then this is a 2-way ANOVA, and with 240 workers, there will obviously be a number in each group, allowing the "factorial design with interaction".

70D SOLUTION: Now this takes on a complicated arrangement.  If the arrangement in 70B is used, we have a pre-post test (paired t-test) with all 240 people tested before and after using the antiseptic hand cream, thus 240 pairs of data (239 df).  But if 70D is used in this way we have TWO independent variables, and the solution is beyond the scope of this course, but might include Raw foods pre-post and Cooked foods pre-post.  A t-test in either case would be used.

71 SOLUTION:  The data would be appropriately analysed by means of the paired t-test.  Note that the 'pairing' is taking place on the water samples, each of which is being assessed using BOTH tests.  Thus the 15 water samples with known lead content produce 30 separate results, or 15 pairs.   (14 df)

72 SOLUTION: t-test for unpaired data.  (18 df)

73 SOLUTION: Chi-square analysis in 2x2 contingency table.  True relative risk is appropriate here because you DO have the true incidence data. The two exposure groups were all healthy at the start of the study 30 years ago, and have been followed.  Therefore you have the incidence data.  RR = Ie/Io, or the incidence rate for the exposed group over the incidence rate for the non-exposed group.

74 SOLUTION: Two variables: Ind.var is categorical with 3 levels.  Dep.var is continuous.   1-way ANOVA.

75 SOLUTION: First Ind.var is categorical (3 levels), second Ind.var is also categorical with three levels, Dep.var is continuous.  Two-way ANOVA.  Block design if only one obs per group, or Factorial if >1 obs per group.

76 SOLUTION: Chi-square analysis in 3x3 contingency table.  But things can get complex.  If we do this and just have the total count in each cell, we get a test of the relationship between haz types and training groups.  (No pass/fail)  So we may stratify - and that is beyond the scope of this course.

77 SOLUTION: Chi-square analysis in 2x3 contingency table.

78   Click here for detailed solution

79   F = 23.53,   2, 33 df,    P<0.01   detailed solution

80   F = 3.190,   2, 12 df,    P>0.05   detailed solution

81   detailed solution

70.  What method would you use for these? (70 a-b-c-d-)

70A... A study of the effectiveness of a new skin antiseptic barrier cream to be used in food manufacturing.  Two hundred & forty workers are recruited and randomized into two groups - one group to receive the antiseptic cream, and the other group to receive a similar-looking product without antiseptic action.  Each day a hand swab test is taken from the workers and analyzed for E. coli, which are reported as "present" or "absent".

70B.... The above study but the outcome (E.coli) is required as a count.

70C...Someone has suggested that those people working with raw foods would be exposed to a much greater bacterial load, so a further division is made between "raw" and "cooked" foods.  How would this change the analysis?

70D....As an alternative to the basic study in (70C) above, the investigators contemplate taking all 240 workers and letting them work for a month (with hand swabs every day) before they are asked to use the antiseptic hand cream daily.  They are then swabbed daily for another month.  What is the statistical analysis you would recommend here?

71... A filter system for removing lead from the water supply is being compared with a conventional filter system.  Fifteen water samples with known (but different) concentrations of lead are passed through each of the filters, and the resulting filtrate is examined for lead content.  There are 30 samples tested in this way.  What method of analysis would you use here?

72....The number of new cases of diarrhoea in infants in a six month period is the outcome measure used in a study of the effectiveness (if any) of a strict double-handwash procedure after changing diapers.  Ten centres are selected and invited to participate in the trial, along with another 10 which will undertake normal handwashing practices.

73...  In 1975, during the replacement of fuel rods at a nuclear plant, 32 workers were accidentally exposed to radiation for about 4 hours.  They have been followed and their health outcomes compared to a group of 48 welders and pipe fitters in a conventional power station.  After thirty years, 8 of the nuclear plant workers and 9 of the conventional plant workers have been diagnosed with some form of cancer.  What is the appropriate risk measurement?  What is the method of analysis to be used?
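The answer to 73 above defines RR = Ie/Io. A minimal Python sketch of that arithmetic follows, using the counts quoted in this question (8 of 32 exposed, 9 of 48 non-exposed). The function names are illustrative only. The chi-square shown is the uncorrected 2x2 shortcut formula, which is an assumption about the variant used in the course.

```python
def relative_risk(cases_exp, n_exp, cases_unexp, n_unexp):
    """RR = Ie / Io: incidence in the exposed over incidence in the non-exposed."""
    incidence_exposed = cases_exp / n_exp
    incidence_unexposed = cases_unexp / n_unexp
    return incidence_exposed / incidence_unexposed


def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Question 73: 8 of 32 nuclear plant workers, 9 of 48 conventional plant workers.
rr = relative_risk(8, 32, 9, 48)
chi_sq = chi_square_2x2(8, 32 - 8, 9, 48 - 9)
print(f"RR = {rr:.2f}, chi-square = {chi_sq:.2f} (1 df)")
```

The interpretation (direction of the association, and whether the chi-square exceeds the table value for 1 df) is still yours to supply.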

74    Three methods of training ESL workers  on WHMIS procedures are being compared: lecture, power-point, and animated cartoon.  The outcome is a knowledge test score out of 40.   What method of analysis is to be used here?


75    Same three methods of WHMIS training but this time someone suggests the type of work may play an important part.  So the workers are divided into low hazard, medium hazard, and high hazard jobs. What method of analysis is to be used here?

76    Same project comparison between three WHMIS training groups x three types of hazard with which the people are working, but this time the dependent var is just 'pass/fail' on the standard test. What method of analysis is to be used here?


77      A herbal treatment for contact dermatitis is being evaluated against the conventional treatment of corticosteroids and emollient creams. The outcome after 7 days is recorded as improved? unchanged? or worse? What method of analysis is to be used here?

78        You are investigating the extent to which INDOOR air quality (IAQ) in office buildings without openable windows compares to air quality in similar buildings equipped with windows that open for 5 hours/day. Measurements are taken in three types of EXTERNAL air quality (EAQ). The dependent variable is the count of suspended particles (10 microns or less) per 10 cc. Complete the ANOVA table and summarize.

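| EAQ | GOOD | GOOD | MODERATE | MODERATE | POOR | POOR |
|---|---|---|---|---|---|---|
| Windows | CLO | OPN | CLO | OPN | CLO | OPN |
| N | 10 | 10 | 10 | 10 | 10 | 10 |
| MEAN | 49.6 | 36.3 | 49.4 | 45.6 | 49.7 | 54.5 |

ANOVA TABLE

| source | SS | df | MS | F | P |
|---|---|---|---|---|---|
| all groups | 1842.53 | | | | |
| EAQ | 810.13 | | | | |
| wind O/C | 240.00 | | | | |
| interaction | | | | | |
| residual | | | | | |
| total | 2570.73 | | | | |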
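The interaction and residual rows of the ANOVA table in question 78 can be filled in by subtraction from the sums of squares that are given. The Python sketch below shows that arithmetic. It assumes a balanced 3 x 2 factorial design with 10 observations per cell, as the N=10 entries indicate. The variable names are illustrative, and the P values would still be read from the F tables.

```python
# Sums of squares given in the question.
ss_groups, ss_eaq, ss_window, ss_total = 1842.53, 810.13, 240.00, 2570.73

# Design: 3 EAQ levels x 2 window states, 10 observations per cell.
levels_eaq, levels_window, n_per_cell = 3, 2, 10
n_total = levels_eaq * levels_window * n_per_cell

# The missing sums of squares follow by subtraction.
ss_interaction = ss_groups - ss_eaq - ss_window
ss_residual = ss_total - ss_groups

ss = {"EAQ": ss_eaq, "wind O/C": ss_window,
      "interaction": ss_interaction, "residual": ss_residual}
df = {"EAQ": levels_eaq - 1,
      "wind O/C": levels_window - 1,
      "interaction": (levels_eaq - 1) * (levels_window - 1),
      "residual": n_total - levels_eaq * levels_window}

# Mean squares, then each F ratio against the residual mean square.
ms = {source: ss[source] / df[source] for source in ss}
f_ratio = {source: ms[source] / ms["residual"]
           for source in ss if source != "residual"}

for source in ss:
    f_text = f"{f_ratio[source]:.2f}" if source in f_ratio else ""
    print(f"{source:12s} SS={ss[source]:8.2f} df={df[source]:2d} "
          f"MS={ms[source]:8.3f} F={f_text}")
```

Each F ratio is then compared with the table value for its own degrees of freedom and the residual degrees of freedom.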

79    Three methods of teaching infection control are being compared using three groups of students selected at random.  Their scores out of 20 are shown.  Is there a significant difference between the means of each group?   Explain clearly.

Method A: 5.0, 9.0, 5.0, 8.0, 5.0, 7.0, 6.0, 7.0, 6.0, 6.0, 6.0, 7.0

Method B: 12.0, 6.0, 11.0, 7.0, 10.0, 7.0, 10.0, 8.0, 10.0, 9.0, 9.0, 9.0

Method C: 3.0, 3.0, 8.0, 7.0, 4.0, 4.0, 6.0, 4.0, 5.0, 5.0, 4.0, 4.0

80    The pH of three types of canned tomatoes is being compared; the pH readings are shown.  Is there a significant difference between the means of each group?   Explain clearly.

Method 1: 1.9, 2.4, 2.5, 2.6, 3.0

Method 2: 2.0, 2.1, 3.0, 3.1, 3.3

Method 3: 2.8, 2.9, 3.3, 3.3, 3.8
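As a check on the F ratios listed in the answers for questions 79 and 80, here is a minimal one-way ANOVA sketch in plain Python. It assumes the data are grouped in the order printed: the first twelve scores belong to Method A, the next twelve to B and the last twelve to C, with five readings per method in question 80. The function name `one_way_anova` is illustrative.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]

    # Between-groups and within-groups sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


q79 = [
    [5, 9, 5, 8, 5, 7, 6, 7, 6, 6, 6, 7],        # Method A
    [12, 6, 11, 7, 10, 7, 10, 8, 10, 9, 9, 9],   # Method B
    [3, 3, 8, 7, 4, 4, 6, 4, 5, 5, 4, 4],        # Method C
]
q80 = [
    [1.9, 2.4, 2.5, 2.6, 3.0],  # Method 1
    [2.0, 2.1, 3.0, 3.1, 3.3],  # Method 2
    [2.8, 2.9, 3.3, 3.3, 3.8],  # Method 3
]

for label, data in (("Q79", q79), ("Q80", q80)):
    f_ratio, df_b, df_w = one_way_anova(data)
    print(f"{label}: F = {f_ratio:.3f} with {df_b}, {df_w} df")
```

With this grouping the sketch reproduces F = 23.53 (2, 33 df) for question 79 and F = 3.190 (2, 12 df) for question 80, matching the answers listed above.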

NEW

81.  In the following ANOVA analysis, two systems of purifying water ('old' and 'new') are being compared, with the pH of the water also being studied. The dependent variable is the ppm of the contaminant (least is better). Twenty-eight separate measurements have been made. Note that you do not have the original data, just the means, total, and N for each group.  The ANOVA table has been completed and is shown below.

| | NEW EQUIPM, Low pH | NEW EQUIPM, High pH | OLD EQUIPM, Low pH | OLD EQUIPM, High pH |
|---|---|---|---|---|
| TOTAL | 174.00 | 172.00 | 303.00 | 238.00 |
| MEAN | 24.86 | 24.57 | 43.29 | 34.00 |
| N | 7.00 | 7.00 | 7.00 | 7.00 |

| source | SS | DF | MS | F | P |
|---|---|---|---|---|---|
| ALL GRPS | | | | | |
| EQUIP | | | | | |
| pH | | | | | |
| EQ X pH | | | | | |
| RESIDUAL | | | | | |
| TOTAL | 1850.11 | | | | |

_________________________________________________________________

[a]   Enter the missing values in the ANOVA table above

[b]   Now select the single correct statement that summarizes the analysis

A. Old equipm and new equipm perform equally in low pH situations

B. New equipm is not affected by pH, but performs consistently better than the old

C. In high pH older equipm performs better, but in low pH new equipm is better

D. Old equipm and new equipm perform equally in high pH situations

E.  The performance of the old equipment is not affected by pH
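For part [a], the sums of squares can be recovered from the cell totals alone. The Python sketch below uses the usual shortcut for a balanced design (sum of squared totals over n, minus the correction factor) and takes the residual as the given total SS minus the SS for all groups. The variable and function names are illustrative, and the P values would still come from the F tables.

```python
# Cell totals from the question; each cell holds 7 measurements.
totals = {("new", "low"): 174.0, ("new", "high"): 172.0,
          ("old", "low"): 303.0, ("old", "high"): 238.0}
n_per_cell = 7
ss_total = 1850.11  # the only SS printed in the table

n_total = n_per_cell * len(totals)
correction = sum(totals.values()) ** 2 / n_total  # (grand total)^2 / N


def ss_from_totals(group_totals, n_each):
    """Sum of squared group totals over n, minus the correction factor."""
    return sum(t ** 2 for t in group_totals) / n_each - correction


# Marginal totals for each factor (two cells, so 14 measurements, per level).
equip_totals = [sum(t for (e, _), t in totals.items() if e == level)
                for level in ("new", "old")]
ph_totals = [sum(t for (_, p), t in totals.items() if p == level)
             for level in ("low", "high")]

ss_groups = ss_from_totals(totals.values(), n_per_cell)
ss_equip = ss_from_totals(equip_totals, 2 * n_per_cell)
ss_ph = ss_from_totals(ph_totals, 2 * n_per_cell)
ss_inter = ss_groups - ss_equip - ss_ph
ss_resid = ss_total - ss_groups

df_resid = n_total - len(totals)
ms_resid = ss_resid / df_resid

# EQUIP, pH and EQ X pH each have 1 df, so MS equals SS for those rows.
for name, ss_value in (("EQUIP", ss_equip), ("pH", ss_ph), ("EQ X pH", ss_inter)):
    print(f"{name:8s} SS={ss_value:9.3f} df=1 F={ss_value / ms_resid:7.2f}")
print(f"RESIDUAL SS={ss_resid:9.3f} df={df_resid} MS={ms_resid:.3f}")
```

Part [b] then follows from which F ratios are significant, read together with the four cell means.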
