This article has been translated into العربية, বাংলা, Deutsch, Español, Suomi, Français, हिन्दी, Magyar, Italiano, Bahasa Melayu and Polski.
April 20, 2021
Written by: James Priest, Brooke Wolford, Nirmal Vadgama, Sophie Limou, M Medina Gomez, Atanu Kumar Dutta, Claudia Schurmann, Fauzan Ahmad, Jamal Nasir, Kumar Veerapen
Note: The COVID-19 Host Genetics Initiative (HGI) represents a consortium of over 2000 scientists from over 54 countries working collaboratively to share data and ideas, recruit patients, and disseminate our findings. For a primer on our study design, please read our inaugural blog post. Our research is iterative, and we summarize our new results via blog posts and on the results section of our website. Finally, if any vocabulary here is unfamiliar, please send us an email at hgi-faq@icda.bio; we’d be happy to update the information here to provide more clarity.
Infection with SARS-CoV-2 in humans produces a wide range of outcomes, from no symptoms or a mild flu-like illness to severe disease that can cause death. Older people and people affected by other health problems are at the greatest risk of death from SARS-CoV-2 infection, but even younger people can develop severe disease or die. The COVID-19 Host Genetics Initiative (HGI) was formed to try to understand how human genetic variation influences which people might be protected from or susceptible to severe disease from SARS-CoV-2 infection.
Our study focuses on understanding the relationship between genetic variation and disease severity and susceptibility in humans who are infected with SARS-CoV-2 (host genetics). Our study is not looking at the genetic code of the virus itself (viral genetics) which is being studied by other groups of fantastic scientists around the globe. Both efforts are important for informing the development of vaccines, treatments, and tests to detect SARS-CoV-2 infection.
The COVID-19 Host Genetics Initiative (HGI) was developed in March 2020, at the height of the COVID-19 global pandemic, under the leadership of Drs. Andrea Ganna and Mark Daly from the Institute of Molecular Medicine in Finland (FIMM) and the Broad Institute of MIT and Harvard.
Disclaimer: The findings in this study have yet to yield high confidence in predicting causation for severity and susceptibility of COVID-19 infection based on the genetic information of a person. Therefore, results released from the HGI are not for use to diagnose COVID-19 patients by their genotype.
The human genetic code is made up of 3 billion chemical letters (abbreviated A, T, G, and C), which encode everything from our eye color to our blood type. The genetic code of any two people is 99.9% identical, and the 0.1% that differs is called genetic variation. Almost all genetic variation is inherited from your parents. Most of it is shared within your own family and with your ancestors, and a bit of it is shared with people from around the world.
Two common ways of “reading” the human genome are DNA sequencing and genotyping. In both methods, the genetic code from a person (in the form of the chemical DNA) is extracted from a sample, like blood or saliva, and we use chemical reactions to read the order of the letters (A, T, G, or C) like a book. Thanks to decades of genetic research, we can easily identify when a section of genetic material contains differences, and whether those differences have been seen before in other genetic studies or large groups of people.
The ability to identify genetic variation allows us to study whether the regions of the genome containing those variants are associated with a disease. A simple and straightforward method that we use is called a genome-wide association study (GWAS). Check out this video or infographic for an illustrated explanation of GWAS.
Using GWAS, we can test if genetic variation is associated with a disease. To answer this question, GWAS requires a simple comparison of the amount of genetic variation between one group of people with a disease and another group without a disease:
Does a group of people with severe disease have a different amount of genetic variation than a group of people without severe disease?
We can study COVID-19 with GWAS. For example, we can test if genetic variation across the genome makes a person more likely to need respiratory support in the hospital (one indicator of disease and symptom severity). At each location in the genetic code that may be different, we compare the counts of a genetic variant in cases (e.g., positive COVID-19 test and hospitalized with respiratory support) with the counts in controls (e.g., positive COVID-19 test and not hospitalized) (Figure 1).
Figure 1: Interpreting risk based on genotype observation (credit: Sophie Limou)
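To make the case-versus-control comparison concrete, here is a minimal sketch in Python of the test at a single location in the genome. The counts are made up for illustration and are not HGI data, and the function names are hypothetical. A real GWAS typically uses regression models that also adjust for factors such as age, sex, and ancestry, so treat this as a simplified picture rather than our actual pipeline.

```python
import math

def allele_odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds of carrying the variant letter in cases divided by the odds in controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts of the variant ("alt") and the usual ("ref") letter.
case_alt, case_ref = 300, 700        # people hospitalized with respiratory support
control_alt, control_ref = 200, 800  # people who tested positive but were not hospitalized

odds_ratio = allele_odds_ratio(case_alt, case_ref, control_alt, control_ref)
chi2 = chi_square_2x2(case_alt, case_ref, control_alt, control_ref)
p_value = p_value_1df(chi2)

print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi2:.2f}, p-value = {p_value:.2e}")
```

An odds ratio above 1 means the variant is seen more often in cases than in controls, and an odds ratio below 1 means it is seen less often. A GWAS repeats this kind of comparison at millions of locations, which is why very small p-values are required before a result is considered convincing.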
We feel more confident that a genetic variation is truly associated with a disease when the same pattern is observed in groups of people in the UK, Spain, the US, and Finland, for example. For the COVID-19 HGI, the results from individual GWAS studies are compared and combined in a method called meta-analysis.
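As a rough illustration of how results from separate studies can be combined, the sketch below uses inverse-variance weighting, one common meta-analysis approach in which more precise studies count for more. This post does not specify the exact method, so the choice of technique here is an assumption, and the study names and numbers are invented for the example.

```python
import math

def fixed_effect_meta(betas, standard_errors):
    """Combine per-study effect estimates with inverse-variance weights.

    Returns the combined effect and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    combined_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    combined_se = math.sqrt(1.0 / total_weight)
    return combined_beta, combined_se

# Hypothetical effect estimates (log odds ratios) for one variant in three studies.
studies = {
    "study_A": (0.25, 0.10),  # (effect estimate, standard error)
    "study_B": (0.18, 0.08),
    "study_C": (0.30, 0.15),
}

betas = [beta for beta, _ in studies.values()]
ses = [se for _, se in studies.values()]
beta, se = fixed_effect_meta(betas, ses)

print(f"combined effect = {beta:.3f}, standard error = {se:.3f}, z = {beta / se:.2f}")
```

The combined standard error is smaller than that of any single study, which reflects the gain in confidence when the same pattern shows up in several independent groups of people.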
Not necessarily. Results from our GWAS only tell us that we can observe this pattern of genetic variation correlated with COVID-19 susceptibility or severity in a large group of people. In addition, the genetic variation identified by our study may be related to either a higher or a lower risk of COVID-19 susceptibility or severity.
Most of the genetic variation identified in our study is related only to a very small increase or decrease in risk. Therefore, it is not yet possible to predict which individuals may have a more severe or less severe outcome if infected with COVID-19. Finally, we would not recommend using your direct-to-consumer genotypes (for example, 23andMe or Ancestry.com) and the COVID-19 HGI findings to interpret your risk of COVID-19. The same public safety measures suggested by public health officials apply regardless of whether you carry identified “risk” variants or not. Always talk to a medical professional to guide your medical choices!
In short, no! The findings from our study don’t completely explain the variability in who gets more severe or less severe disease from COVID-19 infection. But the more data, the better, and the COVID-19 HGI is planning to repeat our meta-analysis on a regular basis with more studies contributing GWAS results involving more people. While larger studies regrettably mean that more people have become infected with COVID-19, they also improve our ability to find patterns between host genetics and disease outcomes. Additionally, there are several ongoing projects that are supported by the COVID-19 HGI but that require specialized approaches or are relevant to specific groups of people.
We are optimistic that our findings will identify more regions of the human genome associated with COVID-19 susceptibility and severity. As part of our contribution to the scientific community and to the public, the COVID-19 HGI will release the meta-analysis results on our website. This will allow other researchers to then perform experiments designed to better understand the biology behind the genetic associations.
The COVID-19 HGI contributors are already performing computational experiments to enable deeper understanding of the current findings. Other researchers perform experiments in human cells and in animals to better understand our current findings. Together, the goal is that this information may help us understand which medications may help prevent or treat disease, identify groups of people at high risk for disease, and otherwise improve the ability of the global community to cope with the COVID-19 pandemic.
The COVID-19 HGI is a collaboration of researchers from across the world who are each conducting independent genetic studies and contributing the results to our meta-analysis. We currently stand at approximately 3,033 researchers organized by the International Common Disease Alliance, representing an unprecedented mobilization of geneticists and making us one of the largest global efforts for understanding the genetics of a particular disease. To date, we have global contributors from 47 groups (Figure 2).
Figure 2: List of COVID-19 HGI contributors for data freeze release 5. Of the 47 contributing studies, 19 included non-European populations. Adapted from Andrea Ganna’s presentation on January 25, 2021. You can view all registered studies here and the acknowledgements for specific researchers from contributing studies here.
The meta-analysis is performed at the Institute of Molecular Medicine in Finland (FIMM). Data freezes will include new data from all registered studies that have actively contributed data in that cycle. We currently have genetic data from 19 countries with approximately 2 million individuals from different ancestral backgrounds (Figure 3). You can view all registered studies here. You can view the acknowledgements for specific researchers from contributing studies here. Although some of the individual contributing studies are funded by private companies, the results are independently obtained.
Figure 3: Overview of the studies contributing to the COVID-19 Host Genetics Initiative and composition by major ancestry groups in meta-analyses. In data freeze 5, 19 studies contributed data from non-European populations: 7 African American, 5 Admixed American, 4 East Asian, 2 South Asian, and 1 Arab. Diamonds show the effective sample size (a measure of how much statistical power a study has to detect an effect, given its numbers of cases and controls) received from different geographical locations.
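For readers curious about how an effective sample size can be calculated, the sketch below uses one common definition for studies that compare cases with controls: 4 × cases × controls / (cases + controls). This post does not state the formula, so its use here is an assumption, and the numbers are invented for illustration.

```python
def effective_sample_size(n_cases, n_controls):
    """Effective sample size of a case-control study: 4 * cases * controls / (cases + controls)."""
    return 4 * n_cases * n_controls / (n_cases + n_controls)

# Two hypothetical studies with the same total of 10,000 participants.
balanced = effective_sample_size(5000, 5000)    # equal numbers of cases and controls
unbalanced = effective_sample_size(1000, 9000)  # few cases, many controls

print(f"balanced study: {balanced:.0f}, unbalanced study: {unbalanced:.0f}")
```

Under this definition, a study with equal numbers of cases and controls has an effective sample size equal to its total size, while a study with few cases contributes less than its total number of participants would suggest.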
The short answer is that, at the time of this post, our pre-print is in the process of peer review. But you may be wondering: what is peer review?
Scientists often communicate their findings in a scientific manuscript and ask for feedback from a scientific journal. These journals call on other scientific experts in the field (scientific peers) to give their opinions on the manuscript and sometimes suggest changes; this is called peer review. The process of peer review doesn’t mean that everything in the manuscript is completely correct, because ideas are rethought as new information emerges, but peer review is an important part of science that helps research be the best it can be. Sometimes, the writing and peer review of a scientific manuscript can take years, and there may be a delay before it is published. Therefore, it is important to view a study in the context of what is already known and accepted by other scientists when referring to a single peer-reviewed paper. Finally, many of these peer-reviewed manuscripts cost money to read and would not be accessible to scientists, students, or members of the public who cannot afford to pay for a journal subscription.
While our team is made of experts in the field, and we make every effort to produce rigorous science, this work has not yet been peer reviewed. We are currently focused on making results readily available to the scientific community on the website on an ongoing basis. A guiding principle of our work is in the spirit of providing broad access to the emerging knowledge surrounding COVID-19. An article describing the COVID-19 HGI’s approach, but not results, was peer-reviewed and published in May. A pre-print version of our manuscript has been deposited here: this work presents the results from our genetic analyses but has not been peer-reviewed yet.
Thank you to Caitlin Cooney, CGC, Karen Zusi, Andrea Ganna, and Alina Chan for thoughtful feedback.