TY - JOUR
T1 - Using an Ensemble to Identify and Classify Macroalgae Antimicrobial Peptides
AU - Caprani, Michela Chiara
AU - Healy, John
AU - Slattery, Orla
AU - O’Keeffe, Joan
N1 - Publisher Copyright: © 2021, International Association of Scientists in the Interdisciplinary Areas.
PY - 2021/6
Y1 - 2021/6
N2 - The rapid spread of multi-drug resistant microbes has led researchers to discover natural alternative remedies such as antimicrobial peptides (AMPs). In the first line of defense, AMPs display a broad spectrum of potent activity against multi-resistant pathogenic bacteria, viruses, fungi, and even cancer. AMPs can be further characterised into families according to amino acid composition, secondary structure, and function. However, despite recent advancements in rapid computational methods for AMP prediction from various mammalian, aquatic, and terrestrial species, there is limited information regarding their presence, functional roles, and family type in marine macroalgae. In this paper, we present a promising two-tier ensemble of heterogeneous machine learning models that integrates seven well-known machine learning classifiers to predict AMPs from macroalgae. The first tier of the ensemble consists of a suite of binary classifiers that identify AMPs from protein sequence data, which are then forwarded to a second-tier multi-class ensemble to characterise their functional family type. The two-tier ensemble was successfully used to identify 39 putative AMP sequences in 12 macroalgae species from three different phyla. The approach we describe is not limited to AMPs and can also be applied to search sequence data for other types of proteins.
AB - The rapid spread of multi-drug resistant microbes has led researchers to discover natural alternative remedies such as antimicrobial peptides (AMPs). In the first line of defense, AMPs display a broad spectrum of potent activity against multi-resistant pathogenic bacteria, viruses, fungi, and even cancer. AMPs can be further characterised into families according to amino acid composition, secondary structure, and function. However, despite recent advancements in rapid computational methods for AMP prediction from various mammalian, aquatic, and terrestrial species, there is limited information regarding their presence, functional roles, and family type in marine macroalgae. In this paper, we present a promising two-tier ensemble of heterogeneous machine learning models that integrates seven well-known machine learning classifiers to predict AMPs from macroalgae. The first tier of the ensemble consists of a suite of binary classifiers that identify AMPs from protein sequence data, which are then forwarded to a second-tier multi-class ensemble to characterise their functional family type. The two-tier ensemble was successfully used to identify 39 putative AMP sequences in 12 macroalgae species from three different phyla. The approach we describe is not limited to AMPs and can also be applied to search sequence data for other types of proteins.
KW - Antimicrobial peptides
KW - Ensemble classification
KW - Machine learning
KW - Macroalgae
UR - http://www.scopus.com/inward/record.url?scp=85105865248&partnerID=8YFLogxK
U2 - 10.1007/s12539-021-00435-6
DO - 10.1007/s12539-021-00435-6
M3 - Article
C2 - 33978916
AN - SCOPUS:85105865248
SN - 1913-2751
VL - 13
SP - 321
EP - 333
JO - Interdisciplinary Sciences – Computational Life Sciences
JF - Interdisciplinary Sciences – Computational Life Sciences
IS - 2
ER -