Research Article
Forecasting Host Cells for Recombinant Protein Expression
Hung Van Le*
Issue: Volume 11, Issue 1, March 2026
Pages: 1-13
Received: 15 January 2026
Accepted: 26 January 2026
Published: 6 February 2026
DOI: 10.11648/j.bmb.20261101.11
Abstract: Selection of an appropriate host cell is a critical determinant of success in recombinant protein expression. In practice, host choice is still largely guided by individual experience, ad hoc consultation of the literature, and intuitive decision-making, often resulting in suboptimal expression outcomes and costly cycles of experimental trial and error. Despite several decades of accumulated empirical knowledge in the field, there is currently no systematic, evidence-based framework for forecasting host cell suitability from protein sequence and structural characteristics. The purpose of this study was to develop predictive models that enable rational selection of host cells for recombinant protein expression based on intrinsic protein features. To achieve this, we leveraged collective experimental experience embedded in publicly available structural data. Protein entries from the Protein Data Bank were curated and analyzed, and logistic regression approaches were applied to relate expression outcomes to a range of protein attributes, including structural parameters, stability indices, predicted subcellular localization, and post-translational modification requirements. Using these variables, we constructed and validated statistical models capable of forecasting expression preferences across four commonly used host systems: Escherichia coli, insect cells, mammalian cells, and yeast. Model performance was assessed using internal validation procedures, demonstrating that distinct combinations of protein features are associated with differential expression success among host types. In conclusion, this work provides an evidence-based and quantitative framework for predicting suitable host cells for recombinant protein expression. By translating accumulated empirical knowledge into practical predictive tools, the proposed models reduce reliance on subjective judgment and trial-and-error experimentation. To facilitate broad adoption, the models, together with user guidance, have been implemented in a publicly accessible web server, offering a practical resource to improve experimental efficiency and success rates in protein expression studies.
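To illustrate how logistic regression can turn protein features into a host forecast, the following is a minimal sketch, not the models reported in this article. It assumes one binary logistic model per host (a one-vs-rest arrangement, which the abstract does not specify), and the feature names (n_disulfide, needs_glycosylation, is_secreted) and all coefficient values are hypothetical placeholders chosen only for demonstration.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT the
# fitted values from the study. One binary logistic model per host.
COEFFS = {
    "E. coli":   {"intercept": 1.0,  "n_disulfide": -0.8, "needs_glycosylation": -2.0, "is_secreted": -1.0},
    "insect":    {"intercept": 0.0,  "n_disulfide": 0.2,  "needs_glycosylation": 0.5,  "is_secreted": 0.3},
    "mammalian": {"intercept": -0.5, "n_disulfide": 0.4,  "needs_glycosylation": 2.0,  "is_secreted": 1.0},
    "yeast":     {"intercept": -0.5, "n_disulfide": 0.1,  "needs_glycosylation": 0.3,  "is_secreted": 0.8},
}


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def host_probabilities(features):
    """Return, for each host, the modeled probability of that host.

    `features` maps feature names to numeric values; features that are
    missing from the input are treated as 0.
    """
    probs = {}
    for host, coef in COEFFS.items():
        # Linear predictor: intercept plus weighted sum of features.
        z = coef["intercept"]
        for name, weight in coef.items():
            if name != "intercept":
                z += weight * features.get(name, 0.0)
        probs[host] = sigmoid(z)
    return probs


def rank_hosts(features):
    """Hosts sorted from highest to lowest modeled probability."""
    probs = host_probabilities(features)
    return sorted(probs, key=probs.get, reverse=True)


# Example: a secreted glycoprotein with three disulfide bonds.
glycoprotein = {"n_disulfide": 3, "needs_glycosylation": 1, "is_secreted": 1}
for host in rank_hosts(glycoprotein):
    print(f"{host}: {host_probabilities(glycoprotein)[host]:.2f}")
```

With these placeholder coefficients, a protein that needs glycosylation and secretion ranks the mammalian host first, while a protein with all features at zero ranks E. coli first. Because each host has its own binary model, the four probabilities are scored independently and need not sum to one.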