academic.oup.com
Open in
urlscan Pro
52.224.90.245
Public Scan
Submitted URL: https://academic.oup.com/aje/advance-article-abstract/doi/10.1093/aje/kwaa136/5869593
Effective URL: https://academic.oup.com/aje/article/190/2/191/5869593
Submission: On July 26 via manual from US — Scanned from DE
Effective URL: https://academic.oup.com/aje/article/190/2/191/5869593
Submission: On July 26 via manual from US — Scanned from DE
Form analysis
1 forms found in the DOMGET /Citation/Download
<form action="/Citation/Download" method="get" id="citationModal">
<input type="hidden" name="resourceId" value="5869593">
<input type="hidden" name="resourceType" value="3">
<label for="selectFormat" class="hide js-citation-format-label">Select Format</label>
<select required="" name="citationFormat" class="citation-download-format js-citation-format" id="selectFormat">
<option selected="" disabled="">Select format</option>
<option value="0">.ris (Mendeley, Papers, Zotero)</option>
<option value="1">.enw (EndNote)</option>
<option value="2">.bibtex (BibTex)</option>
<option value="3">.txt (Medlars, RefWorks)</option>
</select>
<button class="btn citation-download-link disabled" type="submit">Download citation</button>
</form>
Text Content
Skip to Main Content Advertisement Journals Books * Search Menu * * * Menu * * Navbar Search Filter American Journal of EpidemiologyThis issue Public Health and EpidemiologyBooksJournalsOxford Academic Mobile Enter search term Search * Issues * More Content * Advance articles * Editor's Choice * 100 years of the AJE * Collections * Submit * Author Guidelines * Submission Site * Open Access Options * Purchase * Alerts * About * About American Journal of Epidemiology * About the Johns Hopkins Bloomberg School of Public Health * Journals Career Network * Editorial Board * Advertising and Corporate Services * Self-Archiving Policy * Dispatch Dates * Journals on Oxford Academic * Books on Oxford Academic * Issues * More Content * Advance articles * Editor's Choice * 100 years of the AJE * Collections * Submit * Author Guidelines * Submission Site * Open Access Options * Purchase * Alerts * About * About American Journal of Epidemiology * About the Johns Hopkins Bloomberg School of Public Health * Journals Career Network * Editorial Board * Advertising and Corporate Services * Self-Archiving Policy * Dispatch Dates Close Navbar Search Filter American Journal of EpidemiologyThis issue Public Health and EpidemiologyBooksJournalsOxford Academic Enter search term Search Advanced Search Search Menu Article Navigation Close mobile search navigation Article Navigation Volume 190 Issue 2 February 2021 ARTICLE CONTENTS * Abstract * ACKNOWLEDGMENTS * REFERENCES * * Next > Article Navigation Article Navigation Journal Article Editor's Choice SURPRISE! Stephen R Cole, Stephen R Cole Correspondence to Dr. Stephen R. Cole, Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Campus Box 7435, Chapel Hill, NC 27599-7435 (e-mail: cole@unc.edu). Search for other works by this author on: Oxford Academic PubMed Google Scholar Jessie K Edwards, Jessie K Edwards Search for other works by this author on: Oxford Academic PubMed Google Scholar Sander Greenland Sander Greenland Search for other works by this author on: Oxford Academic PubMed Google Scholar American Journal of Epidemiology, Volume 190, Issue 2, February 2021, Pages 191–193, https://doi.org/10.1093/aje/kwaa136 Published: 10 July 2020 Article history Received: 03 April 2020 Revision received: 15 June 2020 Accepted: 07 July 2020 Published: 10 July 2020 * PDF * Split View * Views * Article contents * Figures & tables * Cite CITE Stephen R Cole and others, Surprise!, American Journal of Epidemiology, Volume 190, Issue 2, February 2021, Pages 191–193, https://doi.org/10.1093/aje/kwaa136 Select Format Select format .ris (Mendeley, Papers, Zotero) .enw (EndNote) .bibtex (BibTex) .txt (Medlars, RefWorks) Download citation Close * Permissions Icon Permissions * Share Icon Share * Facebook * Twitter * LinkedIn * Email Navbar Search Filter American Journal of EpidemiologyThis issue Public Health and EpidemiologyBooksJournalsOxford Academic Mobile Enter search term Search Close Navbar Search Filter American Journal of EpidemiologyThis issue Public Health and EpidemiologyBooksJournalsOxford Academic Enter search term Search Advanced Search Search Menu ABSTRACT Measures of information and surprise, such as the Shannon information value (S value), quantify the signal present in a stream of noisy data. We illustrate the use of such information measures in the context of interpreting P values as compatibility indices. S values help communicate the limited information supplied by conventional statistics and cast a critical light on cutoffs used to judge and construct those statistics. Misinterpretations of statistics may be reduced by interpreting P values and interval estimates using compatibility concepts and S values instead of “significance” and “confidence.” compatibility, confidence intervals, information, P value, random error, S value, significance tests, statistical inference Topic: * casts, surgical * inference Issue Section: Commentary Editor's note: The opinions expressed in this article are those of the authors and do not necessarily reflect the views of the American Journal of Epidemiology. A response to this commentary appears on page 194. Measures of information and surprise have a long history (see Good (1), chapter 16) but have seen little use outside the fields of engineering and mathematical statistics. Such measures of information and surprise attempt to quantify the signal present in a stream of noisy data. One such measure is the Shannon information, which, when seeing an event of probability p, is defined as s=log2(1/p)=−log2(p)s=log2(1/p)=−log2(p) and is also known as the binary surprise index or surprisal (2). It has been argued that this measure could aid interpretation of P values and interval estimates, especially when the latter are viewed as showing compatibility of data with hypotheses, rather than stronger notions of significance or confidence (3–5). Here we briefly illustrate these ideas in the context of interpreting P values as compatibility indices. THE PP VALUE AS A COMPATIBILITY INDEX A P value represents the chance of observing a data summary (test statistic) as extreme as or more extreme than what was seen, under a test hypothesis and auxiliary (background) assumptions. Perhaps the most common auxiliary assumptions are that the observed data are randomly sampled or treatment is randomly assigned within observed covariate levels, and that measurement error is negligible (6); regression models add further assumptions. Typically the test hypothesis is that a parameter is 0 (2-sided null) or is no greater than 0 (1-sided null), but other values can and should be tested besides 0 (3–5, 7; also see Rothman et al. (8), chapter 10). A P value is valid if it would have a uniform distribution when sampling data under the tested hypothesis given the auxiliary assumptions used to compute it. Such a P value can be interpreted as giving the percentile at which the observed data fell in this distribution. The P value can thus be taken as an index of compatibility between the data and the parameter values specified by the tested hypothesis given the auxiliary assumptions, ranging from p = 0 (data flatly contradict the hypothesis) to p = 1 (data are exactly as expected under the hypothesis) (3–5). A valid 95% confidence interval can be constructed as the set of all parameter values with p > 0.05 (see Rothman et al. (8), chapter 10). Therefore, the values of a 95% confidence interval have a compatibility index of 0.05 and above, and they comprise a 5%-or-more compatibility interval (3–5), which can also, like the confidence interval, be abbreviated using “CI.” (Some authors define a P value as a random variable P that is uniform under the test hypothesis and auxiliary assumptions, with p being the value of P in the observed data). Consider a recent randomized trial of lopinavir and ritonavir, versus standard care, in the treatment of severe coronavirus disease 2019 (9) which reported 19 and 25 deaths among 99 and 100 patients, respectively. The authors stated that “no benefit was observed with lopinavir-ritonavir treatment beyond standard care” (9, p. 1787), despite observing a 28-day mortality risk difference of −5.8% (i.e., 19.2% − 25.0%), with a 95% compatibility (“confidence”) interval ranging from −17.3% to 5.7%. This interval includes risk differences ranging from −17.3%, which represents a tremendous mortality benefit, to 5.7%, which represents a nontrivial increase in mortality. The statistics leave the hypothesis of no benefit (i.e., a causal risk difference ≥0≥0) as reasonably compatible with the data, with 1-sided p = 0.16 (from a z score of 0.9780 (−0.0581/0.0587), where 0.0587 (0.0587 = [0.057 − (−0.173)]/3.92) is an approximate standard error). But benefits up to a risk difference of −11.6% are even more compatible with the data, in that they have even higher P values than does no benefit. THE SS VALUE The S value provides a reinterpretation of the P value using a familiar mechanical framework for calibrating intuitions, one that is simpler and less abstract than effect estimation from statistical models. Envision a coin-tossing setup that we want to check for bias toward heads (as we might be advised to do if we were going to wager on tails from this setup). We check by tossing the coin s times. If we observe heads on every toss, the exact P value for the hypothesis of no bias toward heads is 0.5s0.5s, a special case of the fact that, for m heads in n tosses, the exact P value for the 1-sided hypothesis that “the probability of heads is no greater than μ” is ∑k=mn(nk)μk(1−μ)(n−k)∑k=mn(nk)μk(1−μ)(n−k) . The Shannon measure of the information against this hypothesis is then the binary surprisal −log2(0.5s)=s−log2(0.5s)=s, the number of heads in a row observed. Because s is computed using base-2 logs, its units are said to be bits (binary digits) of information (2, p. 32); other base units are possible (3). A key benefit of the S value is that it provides a simple coin-tossing framework for interpretation of P values and confidence intervals. Returning to the coronavirus example, the 1-sided P value of 0.16 for the no-benefit hypothesis yields an S value of 2.6 (−log2(0.16)=2.6−log2(0.16)=2.6). To place this result into our coin-tossing framework, a result of all heads in 3 fair tosses has a 0.125 (1 in 8) chance of occurring and thus does not seem terribly surprising (albeit it is more surprising than 2 heads in a row, where p=0.25p=0.25, and less surprising than 4 heads in a row, where p=0.0625p=0.0625). Therefore we say that, if there is no treatment benefit, the observed p=0.16p=0.16 is less surprising than seeing 3 heads in a row in 3 fair tosses (because 2.6 < 3). The P value for a benefit of 11.6% is equal to the P value for the no-benefit hypothesis, meaning that the data are equally compatible with (and would be equally surprising under) both hypotheses. These data would be even less surprising under risk differences between 0 and 11.6%. Viewing the compatibility interval of –17.3% to 5.7%, the data supply at most 4.3 bits of information (−log2(0.05)=4.3−log2(0.05)=4.3) against treatment effects ranging from a 17.3% reduction to a 5.7% increase in mortality, and all risk differences in this interval make the data about the same as or less surprising than seeing 4 heads (−log2(0.05)≈4−log2(0.05)≈4) in 4 fair tosses. Now consider P values of 0.10, 0.05, 0.01, and 0.005. The corresponding S values are 3.3, 4.3, 6.6, and 7.6, so with rough rounding, these P values should seem about as surprising as seeing 3, 4, 7, or 8 heads in a row from fair coin-tossing. Figure 1 provides the mapping from P values to S values. One may feel that p > 0.05 is unsurprising if the test hypothesis is correct, given s<4.3s<4.3. That judgment is fine; nonetheless, effect sizes with higher P values than the test hypothesis exhibit more compatibility with the data and have less information against them than does the test hypothesis. Thus, p > 0.05 is not a sufficient basis for claiming or acting as if the results support the test hypothesis or do not support alternatives, since such dichotomizations mask important distinctions. Figure 1 Open in new tabDownload slide The S value as a function of the P value. The S value is based on the same assumptions as those used to compute its source P value, and thus introduces no new technical or validity issues. While the computations are objectively determined by data and assumptions, their interpretations are subject to the limitations of human cognition. One should expect an event with chance 1 in 10 to happen in one-tenth of our observations, on average. If one hypothesizes that the event is as likely as not (i.e., chance 1 in 2), then one ought to feel no surprise if one sees 1 event in 2 tries (2-sided p=1,s=0p=1,s=0). The extent of our surprise ought to grow, as does the S value, as the data diverge from the hypothesis. Specifically, the S value grows by a unit for every halving of the PP value. Being a continuum, there is no particular S value cutpoint above which one ought to be “surprised.” Use of the P value or S value as a continuum is not as arbitrary as making a dichotomous comparison, say P < 0.05. A key point here is that the S value maps directly onto a standard game of coin-tossing, providing the highly heterogeneous set of human observers with an easily taught reference system, to help gauge the information content of studies. In conclusion, we advise that misinterpretations which remain standard in the medical literature can be reduced by reinterpreting P values and confidence intervals as indicators of compatibility with data (rather than as indicating significance, confidence, or support). In the above example (9), the authors used confidence intervals as significance tests, concluding that “no benefit was observed” because the 95% confidence interval contained the null value (equivalent to a null p > 0.05). But interpreting the P values and confidence intervals as compatibility values and intervals instead of significance tests shows that the results are 1) most compatible with a modest benefit and 2) imprecise and therefore highly compatible with a wide range of effects. We thus conclude that compatibility interpretations and S values can help communicate the limited information supplied by conventional statistics and can cast a critical light on the cutoffs used to judge and construct those statistics. ACKNOWLEDGMENTS Author affiliations: Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina (Stephen R. Cole, Jessie K. Edwards); Department of Epidemiology, Fielding School of Public Health, University of California, Los Angeles, Los Angeles, California (Sander Greenland); and Department of Statistics, College of Physical Sciences, University of California, Los Angeles, Los Angeles, California (Sander Greenland). This work was supported in part by National Institutes of Health grants K01 AI125087 and R01 AI157758. Conflict of interest: none declared. REFERENCES 1. Good IJ . Good Thinking . Minneapolis, MN : University of Minneapolis Press ; 1983 . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC 2. Shannon CE . A mathematical theory of communication . Bell Syst Tech J . 1948 ; 27 : 379 – 424 , 623–656 . Google Scholar Crossref Search ADS WorldCat 3. Greenland S . Valid P-values behave exactly as they should: some misleading criticisms of P-values and their resolution with S-values . Am Stat . 2019 ; 73 ( suppl 1 ): 106 – 114 . Google Scholar Crossref Search ADS WorldCat 4. Amrhein V , Trafimow D , Greenland S . Inferential statistics as descriptive statistics: there is no replication crisis if we don’t expect replication . Am Stat . 2019 ; 73 ( suppl 1 ): 262 – 270 . Google Scholar Crossref Search ADS WorldCat 5. Rafi Z , Greenland S . Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise. BMC Med Res Methodol. 2020 ; 20 :Article 244. 6. Greenland S . Randomization, statistics, and causal inference . Epidemiology . 1990 ; 1 ( 6 ): 421 – 429 . Google Scholar Crossref Search ADS PubMed WorldCat 7. Poole C . Beyond the confidence interval . Am J Public Health . 1987 ; 77 ( 2 ): 195 – 199 . Google Scholar Crossref Search ADS PubMed WorldCat 8. Rothman KJ , Greenland S , Lash T . Modern Epidemiology . 3rd ed. New York, NY : Lippincott-Raven ; 2008 . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC 9. Cao B , Wang Y , Wen D , et al. A trial of lopinavir-ritonavir in adults hospitalized with severe Covid-19 . N Engl J Med . 2020 ; 382 ( 19 ): 1787 – 1799 . Google Scholar Crossref Search ADS PubMed WorldCat © The Author(s) 2020. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) Download all slides Advertisement CITATIONS 22 CITATIONS VIEWS 6,699 ALTMETRIC More metrics information × EMAIL ALERTS Article activity alert Advance article alerts New issue alert Receive exclusive offers and updates from Oxford Academic RECOMMENDED 1. Rothman Responds to “Surprise!” Kenneth J Rothman, American Journal of Epidemiology, 2020 2. Statistical inference through estimation: recommendations from the International Society of Physiotherapy Journal Editors Mark R Elkins et al., Physical Therapy, 2022 3. Statistics III: Probability and statistical tests Anthony McCluskey et al., BJA Education, 2007 1. Empirical Likelihood Based Diagnostics for Heteroscedasticity in Semiparametric Varying-Coefficient Partially Linear Models with Missing Responses Feng Liu et al., Journal of Systems Science and Complexity, 2021 2. A novel somatosensory spatial navigation system outside the hippocampal formation Xiaoyang Long et al., Cell Research, 2021 3. Computational Tools in Weighted Persistent Homology Shiquan Ren et al., Chinese Annals of Mathematics, Series B, 2021 Powered by * Privacy policy * Do not sell my personal information * Google Analytics settings I consent to the use of Google Analytics and related cookies across the TrendMD network (widget, website, blog). Learn more Yes No CITING ARTICLES VIA Web of Science (20) Google Scholar * LATEST * MOST READ * MOST CITED Sustained hypothetical interventions on midlife alcohol consumption in relation to all-cause and cancer mortality: The Australian Longitudinal Study on Women’s Health Known unknowns – steps towards the true annual risk of infection of M. tuberculosis Correcting the Narrative toward More Diverse & Inclusive Institutions Perspective on ‘Harm’ in Personalized Medicine Examining the Relationship Between Multilevel Resilience Resources and Cardiovascular Disease Incidence Overall and by Psychosocial Risks Among Participants in JHS, MESA, and MASALA More from Oxford Academic Medicine and Health Public Health and Epidemiology Books Journals LOOKING FOR YOUR NEXT OPPORTUNITY? ACADEMIC SURGICAL PATHOLOGIST , Vermont MEDICAL MICROBIOLOGY AND CLINICAL LABORATORY MEDICINE PHYSICIAN , Vermont CLINICAL CHEMISTRY LABORATORY MEDICINE PHYSICIAN , Vermont Scientific Director Bethesda, Maryland View all jobs Advertisement Advertisement Advertisement close advertisement * About American Journal of Epidemiology * Editorial Board * Author Guidelines * Facebook * Twitter * LinkedIn * Purchase * Recommend to your Library * Advertising and Corporate Services * Journals Career Network * Online ISSN 1476-6256 * Print ISSN 0002-9262 * Copyright © 2023 Johns Hopkins Bloomberg School of Public Health * About Oxford Academic * Publish journals with us * University press partners * What we publish * New features * Authoring * Open access * Purchasing * Institutional account management * Rights and permissions * Get help with access * Accessibility * Contact us * Advertising * Media enquiries * Oxford University Press * News * Oxford Languages * University of Oxford Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide * Copyright © 2023 Oxford University Press * Cookie settings * Cookie policy * Privacy policy * Legal notice Close Close THIS FEATURE IS AVAILABLE TO SUBSCRIBERS ONLY Sign In or Create an Account Close This PDF is available to Subscribers Only View Article Abstract & Purchase Options For full access to this pdf, sign in to an existing account, or purchase an annual subscription. Close Manage Cookies When you visit web sites, they may store or retrieve data in your web browser. This storage is often necessary for basic functionality of the web site or the storage may be used for the purposes of marketing, analytics, and personalization of the web site such as storing your preferences. Powered by Privado Save Oxford University Press uses cookies to enhance your experience on our website. By selecting ‘accept all’ you are agreeing to our use of cookies. You can change your cookie settings at any time. More information can be found in our Cookie Policy. Cookie settings Deny all Accept all