
ReMA | Quantitative Foundations | Biostatistic

Homework 10
POSTED: 12/11/2015
DUE DATE: 12/18/2015 (at 4:45pm placed in a box at the front desk on ARB floor 7)

Please note that you will NOT get your graded homework before the final. If you would like to compare your homework to the solutions (posted 12/18/2015 at 5pm) please make a copy to keep. You will be able to collect your graded homework assignment in January.

- Please write your name on each page and staple (no paper clips) together the multiple pages of your assignment (you can’t use paper clips because they fall off too easily).
- Please SHOW ALL YOUR WORK for problems requiring hand calculations. You will receive partial credit for showing the steps along the way. A final answer with no work shown is not enough for full credit.
- Some hints on making the most of homework as a learning opportunity:
  o You can work in groups or discuss the problems with your classmates, but only in a spirit of learning. Do not simply “cut and paste” from others’ work. Your final submission must be strictly your own, though informed by collaborative group work.
  o If you do join a group to work on homework assignments, be sure to try all the homework problems on your own first, before meeting with your group. This way, you will have the opportunity to try to devise solutions on your own, without input from others. Then, when you get together, you can compare approaches.

Problem 1

Researchers followed 481 subjects in a study of heart disease. They computed overall survival (defined as time from diagnosis of heart disease until death from any cause) for subjects over a period of 16 years. Of the 481 subjects, 249 die, and 232 are censored. These data have been read into STATA and analyzed to assess the relationship between biological sex and survival. In these analyses, follow-up time (lenfolyr) is recorded in years (ranging from 0.003 to 15.99), survival status (fstat) is coded as 1 for dead and 0 for alive, and biological sex (sex) is coded as 1 for women and 0 for men.
Some of the output is provided below. Use the output and your knowledge of survival methods to answer the questions below.

    . tab fstat sex, chi

     Status as |
       of Last |          Sex
     Follow-up |      Male     Female |     Total
    -----------+----------------------+----------
         Alive |       154         78 |       232
          Dead |       133        116 |       249
    -----------+----------------------+----------
         Total |       287        194 |       481

              Pearson chi2(1) =   8.3895   Pr = 0.004

a) Define “censoring” in survival studies. What is meant by the term “non-informative censoring” and why is it important in survival analysis?

b) What proportion of men die during the study follow-up period? What proportion of women? Estimate the relative risk of death for women versus men from these data, and interpret. Write your answers using probability notation.

c) Is the 2x2 table chi-squared statistic the best approach for evaluating the difference described in part (b)? Why or why not?

These data were analyzed using survival methods as well. Part of the output appears below.

    . stcox sex

    Cox regression -- Breslow method for ties

    No. of subjects =          481             Number of obs    =         481
    No. of failures =          249
    Time at risk    =  2284.887068
                                               LR chi2(X)       =        X.XX
    Log likelihood  =   -1416.0443             Prob > chi2      =      0.0025

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             sex |   1.472518   .1872124    X.XX   X.XXX     1.147733     1.88921
    ------------------------------------------------------------------------------

d) What is the estimated median survival in men? In women? Which group seems to do better, based only on this comparison?

e) What is the estimated proportion surviving to 5 years in men? In women?

[Figure: Kaplan-Meier survival estimates, with separate curves for sex = Male and sex = Female; x-axis: analysis time (0 to 15); y-axis: 0.00 to 1.00]

f) Suppose a colleague computes the logrank statistic for these data, and finds the value to be equal to 9.43.
Help your colleague formally test whether the survival curves for men and women are significantly different. Remember to state the null and alternative hypotheses, note the value of the test statistic and p-value, state your decision, and interpret your decision using the wording of the problem. Use an alpha of 0.05.

g) Provide an estimate of the incidence rate ratio of death for women versus men; interpret. Provide a 95% confidence interval for this value; interpret. How does the IRR compare to the RR you computed in (b)?

Problem 2

Residents of three villages, each with their own water supply, were asked to participate in a survey to identify cholera carriers. Virtually all residents in the villages present during the study period underwent examination. The proportion of residents who were carriers was computed for each village and compared across villages.

a) What type of study did the researchers conduct? (Choose the correct answer.)
   i. Case-control study
   ii. Cross sectional study
   iii. Cohort study
   iv. Randomized Controlled Trial
   v. Ecologic Study

b) The researchers want to test the null hypothesis that prevalence of cholera does not differ by village. What kind of test statistic can they use? Explain your thinking.

c) Below are the data. Formally test the hypothesis that village is associated with cholera colonization. Remember to state the null and alternative hypotheses, compute the value of the test statistic, df, and critical value. State your decision and interpret your decision using the wording of the problem. Use an alpha of 0.05.

                 Cholera Carriers   Non-Carriers   Total
    Village 1                  47            109     156
    Village 2                 136            721     857
    Village 3                 108            305     413
    Total                     291           1135    1426

Problem 3

The Public Health Service studied the relationship between smoking and health, in a large sample of representative households.
For men and for women in each age group, those who had never smoked were on average somewhat healthier than the current smokers, but the current smokers were on average much healthier than those who had recently stopped smoking.

a) What type of observational study design is this likely to be?

b) Why did they study men and women and the different age groups separately? Be brief in your response.

c) The lesson seems to be that you shouldn’t start smoking, but once you’ve started, don’t stop. Please comment on this conclusion. Do you endorse it? Why or why not? Provide a concise response in 3-5 sentences.

Problem 4

Please fill in the blanks below. Note that some blanks may require more than one-word responses.

a) The odds ratio provides a good estimate of the relative risk when the disease/outcome in question is _____________.

b) A random sample from a given population represents one in which all members of that population have _____________ chance of being chosen.

c) When testing for association between a binary exposure and a binary outcome in a 2x2 table, it is probably best to use the chi-squared test only when all of the _____________ cell counts are greater than or equal to 5.

d) A case-control study is planned to evaluate a protective exposure (one that decreases the risk of disease). The study has been designed to assure at least 80% power for detecting an odds ratio of 0.5 for a specified probability of exposure among controls. If the true odds ratio is 0.6, power will be _____________ 80%, assuming all else remains the same (e.g., sample size, exposure probability among controls, etc.).

Problem 5

a) Define the ecological fallacy and provide an example, in 3-5 sentences.

b) Define the atomistic fallacy and provide an example, in 3-5 sentences.

Problem 6

If you conduct a study, check and control for confounding, use the proper statistical tests, and interpret your p-values correctly, are you able to say your study does not suffer from bias?
Briefly explain your thinking.

Problem 7

Researchers are interested in studying the relationship between self-characterization as a night owl (stays up late) versus an “early bird” (goes to bed early) and IQ among adults aged 18-50. Previous research suggests that the mean IQ is 100 with a standard deviation of 15 among early birds (you may assume the same SD among night owls). Pilot data suggest that the true mean difference in IQ between night owls and early birds is 2.5 IQ points.

a) Assuming an alpha level of 0.05, if the researchers would like to assure a level of power no lower than 80%, what is the minimum sample size required per group? Use the Stata output below to answer your question.

    sampsi 17.5 15, sd(15) power(.8)

    Estimated sample size for two-sample comparison of means

    Test Ho: m1 = m2, where m1 is the mean in population 1
                        and m2 is the mean in population 2
    Assumptions:

             alpha =   0.0500  (two-sided)
             power =   0.8000
                m1 =     17.5
                m2 =       15
               sd1 =       15
               sd2 =       15
             n2/n1 =     1.00

    Estimated required sample sizes:

                n1 =      566
                n2 =      566

b) How would we expect the required sample size to change from part a) if a new pilot study suggests that the mean difference in IQ between night owls and early birds is 5 IQ points, all else remaining constant? You need not do any calculations to answer this question.

c) How would we expect the required sample size to change from part a) if the researchers decide they would like to have a minimum power of 90%, all else remaining constant? You need not do any calculations to answer this question.

d) How would we expect the required sample size to change from part a) if the researchers decide to decrease the alpha level to 0.01, all else remaining constant? You need not do any calculations to answer this question.
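
For readers who want to see where the sample size in the Problem 7 Stata output comes from, the following is a minimal Python sketch of the standard normal-approximation formula for comparing two means with equal SDs and equal group sizes, n per group = 2 * ((z_(1-alpha/2) + z_(power)) * sd / delta)^2, rounded up. The function name n_per_group and its parameter names are illustrative choices, not part of the assignment or of Stata, and the sketch assumes this approximation matches what Stata computed here; it does reproduce the reported 566 per group.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Assumes equal SDs and equal group sizes, and uses the normal
    approximation (hypothetical helper, not a Stata function).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = NormalDist().inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Same inputs as the Stata call: means 17.5 and 15, SD 15, power 0.8.
n = n_per_group(delta=17.5 - 15, sd=15)
print(n)
```

Only the difference between the two means enters the formula, which is why the command can use 17.5 and 15 even though the problem describes IQ means near 100. Changing delta, power, or alpha in the call is one way to check your qualitative reasoning for parts b) through d) after you have answered them.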
