On the topic
Early this year, the world faced the onset and spread of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The resulting pandemic is still ongoing and had led to more than one and a half million deaths by 11 December 2020, according to the World Health Organization. Since then, the research community has united in the pursuit of scientific knowledge on COVID-19, both to fully understand its pathological mechanisms and long-term consequences in survivors and to develop new methods that could help manage and reduce the associated health burden.
This study, called COVID-19-immunophenotyping (COVID-IP), claims to have identified a core peripheral blood immune signature that distinguishes COVID-19 patients from other cohorts of patients suffering from different pathologies. The same signature could also stratify COVID-19 patients by severity, which could help to identify the most alarming cases. Finally, the study proposes alternative immunological parameters to assess patients' inflammatory responses to this specific viral infection. These findings could help to expedite the medical screening process and focus hospital resources on higher-risk cases.
This paper generated a lively discussion in our group. Its sequential analysis, which gradually increases in complexity by building on previous results, made for a good read. Ultimately, however, the main topic under discussion was whether or not to adjust for multiple testing.
A large number of statistical tests were performed in the study; the authors themselves give a total of 315 tests as a conservative estimate. They also mention the high interdependency of the biological samples and the sequential nature in which the different statistical tests were performed. However, the authors decided not to perform any multiple testing correction, as explicitly stated in the majority of the figure legends. We agree with this sensible decision, but such agreement requires a shift in the way we currently think about statistical testing and its reporting.
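To give a sense of scale for 315 tests, the following is a minimal sketch of the usual back-of-the-envelope arithmetic. It is ours, not the authors': the per-test level of 0.05 is an assumption, and the family-wise calculation assumes independent tests, which the authors' own remarks on interdependency suggest does not hold here. The function names are purely illustrative.

```python
# Illustrative arithmetic only. ALPHA is an assumed per-test level (not taken
# from the paper), and the family-wise formula assumes independent tests,
# whereas the authors describe their samples and tests as interdependent.
N_TESTS = 315  # conservative number of tests reported by the authors
ALPHA = 0.05   # assumed per-test significance level


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of p-values below alpha if every null hypothesis is true."""
    return n_tests * alpha


def familywise_error_rate(n_tests: int, alpha: float) -> float:
    """Chance of at least one false positive among n independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    print(f"expected false positives: {expected_false_positives(N_TESTS, ALPHA):.2f}")
    print(f"family-wise error rate:   {familywise_error_rate(N_TESTS, ALPHA):.6f}")
    print(f"Bonferroni threshold:     {bonferroni_threshold(N_TESTS, ALPHA):.2e}")
```

Under these assumptions, a Bonferroni-style correction would require each p-value to fall below roughly 0.00016, which illustrates how strongly a formal correction would change the reading of an exploratory analysis of this size.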
Given the exploratory nature of this study, the authors should have been crystal clear that they performed significance tests in the sense of Fisher, where only the null hypothesis is specified and one only wants to know the plausibility of the data under this hypothesis; no decision to accept or reject the hypothesis is taken. These tests are in contrast with hypothesis testing in the sense of Neyman-Pearson, where there are both null and alternative hypotheses to compare and a decision has to be made about the validity of the null hypothesis. Multiple testing correction arises in the latter setting, because one adjusts p-values from different tests in order to make a decision about the null hypothesis in each individual test.
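The contrast between the two settings can be sketched in a few lines. The example below is our own illustration, not the authors' analysis: the p-values are hypothetical, the 0.05 decision level is an assumption, and Bonferroni and Benjamini-Hochberg are used only as two familiar adjustment procedures (the second controls the false discovery rate rather than the family-wise error rate).

```python
# Hypothetical p-values for illustration; none of these come from the study.
from typing import List


def bonferroni_adjust(pvalues: List[float]) -> List[float]:
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (step-up procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.001, 0.01, 0.03, 0.04, 0.20]  # hypothetical raw p-values
    alpha = 0.05                           # assumed decision level

    # Fisher-style reporting: the raw p-values are the result; no decision is taken.
    print("raw p-values:", raw)

    # Neyman-Pearson-style decisions: adjust first, then reject or retain each null.
    for name, adjust in [("Bonferroni", bonferroni_adjust),
                         ("Benjamini-Hochberg", benjamini_hochberg_adjust)]:
        adj = adjust(raw)
        decisions = ["reject" if p <= alpha else "retain" for p in adj]
        print(name, [round(p, 4) for p in adj], decisions)
```

In the Fisher reading only the first printed line would be reported, as a graded measure of evidence; the adjusted values matter only once each test has to end in a decision.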
In summary, this study is a very comprehensive analysis of immunological data on COVID-19, but it remains unclear how the findings can be replicated and how they can reach the clinic.