TY - JOUR
T1 - Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity
AU - WES/WGS Working Group Within the HGI GenOMICC Consortium GEN-COVID Multicenter Study
AU - Fallerini, Chiara
AU - Picchiotti, Nicola
AU - Baldassarri, Margherita
AU - Zguro, Kristina
AU - Daga, Sergio
AU - Fava, Francesca
AU - Benetti, Elisa
AU - Amitrano, Sara
AU - Bruttini, Mirella
AU - Palmieri, Maria
AU - Croci, Susanna
AU - Lista, Mirjam
AU - Beligni, Giada
AU - Valentino, Floriana
AU - Meloni, Ilaria
AU - Tanfoni, Marco
AU - Minnai, Francesca
AU - Colombo, Francesca
AU - Cabri, Enrico
AU - Fratelli, Maddalena
AU - Gabbi, Chiara
AU - Mantovani, Stefania
AU - Frullanti, Elisa
AU - Gori, Marco
AU - Crawley, Francis P.
AU - Butler-Laporte, Guillaume
AU - Richards, Brent
AU - Zeberg, Hugo
AU - Lipcsey, Miklós
AU - Hultström, Michael
AU - Ludwig, Kerstin U.
AU - Schulte, Eva C.
AU - Pairo-Castineira, Erola
AU - Baillie, John Kenneth
AU - Schmidt, Axel
AU - Frithiof, Robert
AU - Mari, Francesca
AU - Renieri, Alessandra
AU - Furini, Simone
AU - Montagnani, Francesca
AU - Tumbarello, Mario
AU - Rancan, Ilaria
AU - Fabbiani, Massimiliano
AU - Rossetti, Barbara
AU - Bergantini, Laura
AU - D’Alessandro, Miriana
AU - Cameli, Paolo
AU - Bennett, David
AU - Anedda, Federico
AU - Faulkner, M.
N1 - Publisher Copyright: © 2021, The Author(s).
PY - 2022/1/1
Y1 - 2022/1/1
N2 - The combined impact of common and rare exonic variants in COVID-19 host genetics is currently insufficiently understood. Here, common and rare variants from whole-exome sequencing data of about 4000 SARS-CoV-2-positive individuals were used to define an interpretable machine-learning model for predicting COVID-19 severity. First, variants were converted into separate sets of Boolean features, depending on the absence or the presence of variants in each gene. An ensemble of LASSO logistic regression models was used to identify the most informative Boolean features with respect to the genetic bases of severity. The Boolean features selected by these logistic models were combined into an Integrated PolyGenic Score that offers a synthetic and interpretable index for describing the contribution of host genetics in COVID-19 severity, as demonstrated through testing in several independent cohorts. Selected features belong to ultra-rare, rare, low-frequency, and common variants, including those in linkage disequilibrium with known GWAS loci. Noteworthily, around one quarter of the selected genes are sex-specific. Pathway analysis of the selected genes associated with COVID-19 severity reflected the multi-organ nature of the disease. The proposed model might provide useful information for developing diagnostics and therapeutics, while also being able to guide bedside disease management.
UR - http://www.scopus.com/inward/record.url?scp=85123651761&partnerID=8YFLogxK
U2 - 10.1007/s00439-021-02397-7
DO - 10.1007/s00439-021-02397-7
M3 - Article
C2 - 34889978
AN - SCOPUS:85123651761
SN - 0340-6717
VL - 141
SP - 147
EP - 173
JO - Human Genetics
JF - Human Genetics
IS - 1
ER -