Abstract

Protein thermostability is one of the most important features of bio-engineered proteins, with significant scientific and industrial applications. Unfortunately, obtaining thermostable proteins is both expensive and complex. Recent advances in Protein Language Models (pLMs) offer a promising framework for sequence-to-sequence problems, especially in the realm of protein thermostability prediction. In this work, we present EsmTemp, a transfer learning model based on the ESM-2 pLM architecture. EsmTemp is trained on a meticulously curated dataset comprising 24,000 protein sequences with known melting temperatures. A rigorous evaluation, conducted through 10-fold cross-validation, yields a coefficient of determination ($R^2$) of 0.70 and a mean absolute error of 4.3 °C. These outcomes highlight the potential of pLMs to advance our understanding of protein thermostability and to facilitate the rational design of enzymes for various applications.
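
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a coefficient of determination and a mean absolute error can be computed on held-out folds. It is not the authors' evaluation code: the function names, the round-robin fold assignment, and the example temperatures are all illustrative assumptions, and the sketch uses only the Python standard library.

```python
from typing import List, Sequence, Tuple


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k (train, test) pairs.

    Assumption: folds are assigned round-robin by index; a real pipeline
    would typically shuffle or cluster sequences before splitting.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [j for j in range(n_samples) if j not in held_out]
        splits.append((train_idx, test_idx))
    return splits


# Hypothetical melting temperatures in degrees Celsius, for illustration only.
measured = [50.0, 60.0, 70.0]
predicted = [52.0, 57.0, 70.0]
mae = mean_absolute_error(measured, predicted)
r2 = r2_score(measured, predicted)
```

In a 10-fold setting, a model would be fitted on each `train_idx` and scored on the matching `test_idx`, so that every sequence is used for testing exactly once.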
