In March of 2020, thousands of scientists from around the world, including researchers from the Colorado Center for Personalized Medicine, united to answer a pressing and complex question: What genetic factors influence why some COVID-19 patients develop severe, life-threatening disease requiring hospitalization, while others escape with mild symptoms or none at all?
A comprehensive summary of their findings to date, published in Nature, reveals 13 loci, or locations in the human genome, that are strongly associated with infection or severe COVID-19. The researchers also identified causal factors such as smoking and high body mass index. These results come from one of the largest genome-wide association studies ever performed, which includes nearly 50,000 COVID-19 patients and 2 million uninfected controls.
The findings could help provide targets for future therapies and illustrate the power of genetic studies in learning more about infectious disease.
This global effort, called the COVID-19 Host Genomics Initiative, was founded in March 2020 by Andrea Ganna, group leader at the Institute for Molecular Medicine Finland (FIMM), University of Helsinki, and Mark Daly, director of FIMM and institute member at the Broad Institute of MIT and Harvard. The initiative has grown to be one of the most extensive collaborations in human genetics and currently includes more than 3,300 authors and 61 studies from 25 countries.
Collaborative endeavor
For their analysis, the consortium pooled clinical and genetic data from the nearly 50,000 patients in the study who tested positive for the virus and from 2 million controls, drawing on numerous biobanks, clinical studies and direct-to-consumer genetic companies such as 23andMe.
The biobank housed in the Colorado Center for Personalized Medicine (CCPM), a partnership between non-profit health system UCHealth and the University of Colorado Anschutz Medical Campus, was one of four biobanks in the United States that contributed to these studies. The CCPM biobank actively generates genomic data and combines the resulting insights with the electronic health record, making it one of the largest institutional biobanks in the United States. CCPM currently has more than 180,000 consented research participants and is actively recruiting, with the goal of developing a comprehensive clinical and research resource for hundreds of thousands of UCHealth patients.
“We were honored to take part in a study allowing us to understand genetic processes relevant to the COVID-19 pandemic, and to leverage the great resource that is the CCPM in such a collaborative endeavor,” said Chris Gignoux, PhD, director of research at CCPM and associate professor in the division of Biomedical Informatics and Personalized Medicine at the University of Colorado School of Medicine.
“Globe-spanning efforts like this one highlight the strengths and necessities of large-scale biobanks as open resources for the world,” Gignoux said. “We are encouraged that other studies can easily leverage what was learned here, something we are certainly interested in continuing to pursue. We can also use this information to identify biological pathways relevant to disease, and hopefully identify novel candidates for future therapies.”
Because of the large amount of data pouring in from around the world, the scientists were able to produce statistically robust analyses far more quickly, and from a greater diversity of populations, than any one group could have on its own.
Harnessing diversity
Of the 13 loci identified so far by the team, two had higher frequencies among patients of East Asian or South Asian ancestry than in those of European ancestry, underscoring the importance of diversity in genetic datasets. “We’ve been much more successful than past efforts in sampling genetic diversity because we’ve made a concerted effort to reach out to populations around the world,” said Daly. “I think we still have a long way to go, but we’re making very good progress.”
The team highlighted one of these two loci in particular, near the FOXP4 gene, which is linked to lung cancer. The FOXP4 variant associated with severe COVID-19 increases the gene’s expression, suggesting that inhibiting the gene could be a potential therapeutic strategy. Other loci associated with severe COVID-19 included DPP9, a gene also involved in lung cancer and pulmonary fibrosis, and TYK2, which is implicated in some autoimmune diseases.
“We’d like to aim to get a good handful of very concrete therapeutic hypotheses in the next year,” Daly said. “Realistically, we will most likely be addressing COVID-19 as a serious health concern for a long time. Any therapeutic that emerges this year, for example from repurposing an existing drug based on clear genetic insights, would have a great impact.”