Recombinant Protein Expression

What is recombinant protein expression?

The expression of recombinant proteins refers to producing a specific protein in a laboratory setting using genetic engineering techniques. Scientists begin by isolating the gene that encodes the protein of interest. This gene is then inserted into a host organism, such as bacteria, yeast, or mammalian cells, using a circular piece of DNA called a plasmid. The host organism reads the inserted gene and uses cellular machinery to produce the desired protein.

This technique is widely used in research, medicine, and industry. For example, recombinant proteins are used to create life-saving drugs like insulin and vaccines, study protein function, and develop enzymes for industrial processes. The method allows for large-scale, efficient production of proteins that might be difficult or expensive to isolate from natural sources, making it an essential tool in biotechnology and modern science.

How is protein expression controlled?

Protein expression can be controlled at multiple levels, including transcription, translation, and plasmid copy number.

·      Transcriptional control involves the use of promoters, which determine the strength and timing of mRNA production. Strong promoters yield high mRNA levels, while inducible promoters enable precise regulation in response to environmental or chemical signals. Transcriptional terminators also play a role by ensuring efficient mRNA synthesis and preventing read-through transcription.

·      Translational control depends on factors like ribosome binding site (RBS) strength, which affects the efficiency of ribosome attachment, and codon optimization, which ensures compatibility with the host’s tRNA pool for accurate and rapid translation. mRNA stability also influences protein yield, because a transcript that persists longer in the cell is translated more times.

·      Gene copy number sets how many templates are available for transcription, so adding copies of the gene can increase expression of a recombinant protein. In bacteria, the gene copy number is typically controlled by the plasmid's replication origin, which determines the plasmid copy number per cell. In eukaryotic hosts, the number of gene copies inserted into the genome can similarly influence recombinant protein production, as higher gene copy numbers often correlate with increased protein yield.

Ultimately, the expression of a recombinant protein is determined by the interplay of transcription, translation, and gene copy number. These factors collectively influence the amount of mRNA produced, its efficiency in being translated into protein, and the availability of gene templates within the host cell.
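The interplay of these three levels can be sketched with a simple two-stage model in which gene copies produce mRNA and mRNA produces protein. This is a minimal illustration, not a validated model: the function name, the rate constants, and the assumption of first-order degradation at steady state are all hypothetical choices made for the example.

```python
def steady_state_protein(copy_number, k_tx, k_tl, deg_mrna, deg_protein):
    """Steady-state mRNA and protein per cell for a two-stage expression model.

    Assumed (illustrative) kinetics:
        d(mRNA)/dt    = copy_number * k_tx - deg_mrna * mRNA
        d(protein)/dt = k_tl * mRNA        - deg_protein * protein
    """
    if deg_mrna <= 0 or deg_protein <= 0:
        raise ValueError("degradation rates must be positive")
    mrna = copy_number * k_tx / deg_mrna      # templates x transcription / mRNA decay
    protein = k_tl * mrna / deg_protein       # translation / protein decay or dilution
    return mrna, protein


# Hypothetical parameter values, chosen only to compare two designs.
low_copy = steady_state_protein(copy_number=5, k_tx=1.0, k_tl=10.0,
                                deg_mrna=0.2, deg_protein=0.01)
high_copy = steady_state_protein(copy_number=50, k_tx=1.0, k_tl=10.0,
                                 deg_mrna=0.2, deg_protein=0.01)
print("low-copy  (mRNA, protein):", low_copy)
print("high-copy (mRNA, protein):", high_copy)
```

In this sketch the three levers multiply: doubling copy number, transcription rate, or translation rate each doubles the predicted protein level. Real cells deviate from this once expression starts to burden the host.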

Controlling transcription

Controlling transcription in a plasmid is primarily achieved through the promoter, which governs both the initiation and rate of mRNA synthesis. Promoter strength plays a critical role, with stronger promoters driving higher transcription levels and, subsequently, greater protein production. Additionally, transcription timing can be precisely regulated using promoters that respond to environmental cues. For example, inducible promoters are activated in the presence of specific molecules such as IPTG or arabinose, while repressible promoters are silenced by adding a chemical signal. In contrast, constitutive promoters continuously drive transcription and are not influenced by environmental conditions, providing consistent gene expression.
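The response of an inducible promoter to its inducer is often approximated with a Hill function. The sketch below shows that shape under stated assumptions: the function name and every parameter value (basal level, maximal level, half-maximal inducer concentration, Hill coefficient) are hypothetical and would have to be measured for a real promoter.

```python
def induced_expression(inducer, basal, maximal, k_half, hill_n):
    """Promoter output as a Hill function of inducer concentration.

    basal   : output with no inducer (leakiness)
    maximal : output at saturating inducer
    k_half  : inducer concentration giving a half-maximal response
    hill_n  : Hill coefficient (steepness of the response)
    """
    if inducer < 0 or k_half <= 0:
        raise ValueError("inducer must be >= 0 and k_half must be > 0")
    activation = inducer ** hill_n / (k_half ** hill_n + inducer ** hill_n)
    return basal + (maximal - basal) * activation


# Hypothetical dose-response curve (concentrations in arbitrary units).
for dose in (0.0, 0.01, 0.1, 1.0):
    level = induced_expression(dose, basal=1.0, maximal=100.0,
                               k_half=0.1, hill_n=2.0)
    print(f"inducer={dose:<5} expression={level:.1f}")
```

A constitutive promoter corresponds to a flat line in this picture, while a leaky inducible promoter has a high basal value relative to its maximum.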

In eukaryotic systems, additional elements can be incorporated to control transcription more effectively. Enhancers are DNA sequences that amplify transcription levels by recruiting transcription factors to the promoter, increasing mRNA synthesis. Further, including an intron in the expression cassette often raises expression, because splicing promotes mRNA export and translation, and a polyadenylation signal ensures that the transcript receives a poly-A tail, which stabilizes the mRNA, prolonging its lifespan in the cell and increasing the total protein yield.

Other regulatory elements such as transcription terminators and insulators ensure precise transcription. Terminators prevent read-through transcription into adjacent genes, while insulators block interference from neighboring regulatory sequences, ensuring that only the gene of interest is expressed as intended.

By combining promoter design, transcriptional regulatory elements, and post-transcriptional stabilizing features, researchers can precisely control transcription, optimizing recombinant protein production for experimental, industrial, or therapeutic purposes. These strategies allow fine-tuning of both the quantity and timing of protein expression.

Controlling translation

Controlling translation is critical for efficient protein production, and several strategies can optimize this process. Codon optimization is a primary method for improving translation efficiency. Different organisms have distinct preferences for codons (genetic triplets that specify amino acids) based on the abundance of corresponding tRNAs. By redesigning the gene to use codons favored by the host organism, translation can proceed more quickly and accurately. This is especially important when expressing a gene in a heterologous host, such as producing human proteins in bacteria or yeast.
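The idea behind codon optimization can be illustrated with a small recoding function. This is a simplified sketch: the table below is a hand-picked, illustrative subset of codons commonly described as rare in E. coli, each mapped to a more frequently used synonymous codon. A real workflow would use a complete codon usage table for the chosen host and would also weigh factors such as mRNA structure.

```python
# Illustrative subset only (assumption): rare codon -> preferred synonym in E. coli.
PREFERRED_SYNONYM = {
    "AGA": "CGT",  # Arg
    "AGG": "CGT",  # Arg
    "CTA": "CTG",  # Leu
    "ATA": "ATT",  # Ile
    "CCC": "CCG",  # Pro
    "GGA": "GGC",  # Gly
}


def split_codons(cds):
    """Split a coding sequence into codons, checking that it is in frame."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_codon_fraction(cds):
    """Fraction of codons that appear in the illustrative rare-codon table."""
    codons = split_codons(cds)
    return sum(c in PREFERRED_SYNONYM for c in codons) / len(codons)


def recode(cds):
    """Replace each listed rare codon with its synonym; the protein is unchanged."""
    return "".join(PREFERRED_SYNONYM.get(c, c) for c in split_codons(cds))


gene = "ATGAGACTATAA"  # hypothetical toy sequence: Met-Arg-Leu-stop
print(rare_codon_fraction(gene), recode(gene), rare_codon_fraction(recode(gene)))
```

Because only synonymous codons are swapped, the encoded amino acid sequence stays the same while the fraction of codons flagged as rare drops.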

The translation initiation site also plays a pivotal role in regulating translation. In prokaryotes, the Shine-Dalgarno sequence is a ribosome-binding site located upstream of the start codon. Its strength and complementarity to the ribosomal RNA determine how effectively ribosomes are recruited to initiate translation. Similarly, in eukaryotic systems, the Kozak sequence flanks the start codon and influences ribosome recognition and binding. A strong Kozak sequence enhances initiation efficiency, leading to higher protein yields.
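For eukaryotic start sites, a commonly used heuristic rates the Kozak context by two positions: a purine (A or G) three bases upstream of the ATG and a G immediately after it. The sketch below applies that heuristic; the function name and the three-level "strong / adequate / weak" labels are conventions assumed for this example, not a quantitative predictor of initiation efficiency.

```python
def kozak_strength(seq, atg_index):
    """Classify the Kozak context around the start codon at atg_index.

    Numbering follows the usual convention: the A of ATG is +1, so the
    -3 position is seq[atg_index - 3] and +4 is seq[atg_index + 3].
    """
    seq = seq.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("need at least 3 bases upstream and 1 downstream")

    purine_at_minus3 = seq[atg_index - 3] in "AG"
    g_at_plus4 = seq[atg_index + 3] == "G"

    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


print(kozak_strength("GCCACCATGG", 6))  # consensus-like context
print(kozak_strength("GCCTCCATGT", 6))  # neither key position matches
```

A similar check for prokaryotes would compare the region upstream of the start codon against the Shine-Dalgarno sequence and its spacing, which this sketch does not attempt.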

Additional strategies include minimizing mRNA secondary structure, particularly near the ribosome-binding site or start codon, to ensure ribosome access. mRNA stability can be improved by adding stabilizing elements, such as 5' caps in eukaryotes or specific untranslated region (UTR) sequences, to extend transcript longevity. Internal ribosome entry sites (IRES) enable translation initiation independently of the 5' cap, so that two or more proteins can be translated from a single mRNA. Regulatory RNAs or RNA-binding proteins can enhance or repress translation in response to environmental or cellular conditions. Finally, host strain engineering, such as overexpressing tRNAs for rare codons or optimizing ribosome function, can significantly enhance translational efficiency.

By combining these strategies, researchers can precisely regulate translation, improving protein yield and ensuring the production of functional recombinant proteins tailored to specific applications.

Controlling plasmid copy number

Controlling gene copy number is a fundamental strategy for regulating protein expression, as the number of gene copies directly influences the availability of templates for transcription. In prokaryotes, the plasmid copy number is determined by the plasmid’s origin of replication. High-copy plasmids, such as those derived from pUC, can reach hundreds of copies per cell, leading to robust protein expression. However, they may impose a metabolic burden on the host. Low-copy plasmids, like pSC101 derivatives, are better suited for applications requiring tighter control or reduced cellular stress. Modifying the replication origin or introducing copy number control systems allows fine-tuning of plasmid copy numbers based on the experimental needs.

In mammalian systems, episomal vectors provide a way to maintain multiple gene copies without integration into the genome. These vectors replicate independently of the chromosomes when the required replication elements are present, allowing gene expression without altering the host genome. Episomal vectors, such as those derived from Epstein-Barr virus (EBV), are typically used for transient expression; because they are replicated and retained in dividing cells, expression persists longer than with non-replicating plasmids.

For long-term expression in mammalian systems, stable transfections involve integrating the gene into the host genome. Following transfection, cells are screened for clones with optimal gene copy numbers to ensure consistent protein production. This process often includes antibiotic selection to isolate stably transfected cells and quantitative screening (e.g., qPCR or fluorescence analysis) to identify clones with desirable expression levels. Over time, amplification systems, such as DHFR/methotrexate selection, can increase gene copy number at the insertion site.
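Screening clones by qPCR usually comes down to comparing the threshold cycle (Ct) of the transgene with that of a reference gene of known copy number. The sketch below shows that calculation under simplifying assumptions: both assays amplify with the same efficiency (2.0 means perfect doubling per cycle), and the function names, clone names, and Ct values are hypothetical.

```python
def relative_copy_number(ct_target, ct_reference, efficiency=2.0, reference_copies=1):
    """Estimate transgene copies from Ct values relative to a reference gene.

    Assumes equal amplification efficiency for both assays:
        copies = reference_copies * efficiency ** (ct_reference - ct_target)
    """
    if not 1.0 < efficiency <= 2.0:
        raise ValueError("efficiency must be in the range (1, 2]")
    return reference_copies * efficiency ** (ct_reference - ct_target)


def rank_clones(ct_values, **kwargs):
    """Return (clone, estimated copies) pairs, highest copy number first."""
    estimates = {
        clone: relative_copy_number(ct_target, ct_reference, **kwargs)
        for clone, (ct_target, ct_reference) in ct_values.items()
    }
    return sorted(estimates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical Ct values: (transgene Ct, reference gene Ct) per clone.
clones = {"clone_A": (24.0, 24.0), "clone_B": (22.0, 24.0), "clone_C": (23.0, 24.0)}
for name, copies in rank_clones(clones):
    print(name, copies)
```

Copy number is only a proxy for productivity, so a ranking like this would normally be combined with a direct measurement of protein expression.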

By controlling plasmid copy number in prokaryotes, leveraging episomal vectors in mammalian cells, or selecting stable cell lines with the desired gene copy number, researchers can optimize protein expression for specific applications, balancing yield, stability, and host cell viability. These strategies allow precise tuning of gene expression to meet research or industrial production needs.

Expression hosts

The choice of expression host significantly impacts protein expression, influencing yield, folding, functionality, and the ability to perform post-translational modifications (PTMs). Bacteria, such as Escherichia coli, are widely used due to their rapid growth, simplicity, and cost-effectiveness. However, bacterial systems lack the machinery for most eukaryotic PTMs, such as complex glycosylation, which can limit the functionality of eukaryotic or therapeutic proteins. Proteins expressed in bacteria may also misfold or form insoluble aggregates, requiring refolding steps.

Yeasts, such as Saccharomyces cerevisiae and Pichia pastoris, offer a compromise between bacterial and mammalian systems. They can perform some PTMs, such as glycosylation, although the patterns often differ from those in higher eukaryotes. Yeasts can also produce high yields and secrete proteins into the culture medium, simplifying downstream processing.

Mammalian cell lines, such as CHO, Vero, or HEK293 cells, are the gold standard for expressing complex proteins requiring authentic PTMs. These systems can accurately perform glycosylation, phosphorylation, and other modifications critical for the biological activity and stability of therapeutic proteins. However, they are more expensive and slower growing than bacterial or yeast systems.

The choice of host depends on the protein’s complexity and intended application. While bacteria are ideal for simple proteins, eukaryotic hosts are usually necessary for proteins whose function depends on complex folding or PTMs.

Growth conditions

Growth conditions significantly influence protein expression by affecting the host organism's metabolism, gene transcription, and protein synthesis. Temperature is one of the most critical factors. Lowering the growth temperature can improve protein solubility and reduce the formation of inclusion bodies in prokaryotic systems like E. coli. Cooler temperatures slow down protein production, allowing proper folding and minimizing aggregation.

Nutrient availability is another key variable. Rich media supply essential amino acids, vitamins, and carbon sources, boosting growth rates and protein production. Conversely, nutrient-limited conditions may stress the cells, leading to reduced yields or unwanted stress responses.

Induction timing is particularly important when using inducible promoters. Adding inducers like IPTG too early or at high concentrations can overburden the host’s cellular machinery, leading to metabolic stress and lower overall yields. Optimizing the time and concentration of inducer addition ensures balanced protein production.

The pH of the culture also impacts protein expression, as extreme pH levels can denature the protein or affect the stability of the host cells. Maintaining optimal pH supports enzyme activity and cellular health.

Finally, dissolved oxygen levels are critical for aerobic organisms. Insufficient oxygen can limit energy production, reducing protein synthesis. Adjusting growth conditions like aeration, agitation, or feeding strategies optimizes protein yield, quality, and functionality.

Optimizing protein expression

Optimizing protein expression is a delicate process requiring the careful balance of multiple parameters, including transcription, translation, gene copy number, and growth conditions. While it may seem intuitive to maximize each factor, doing so often imposes an excessive burden on the host cell’s metabolism. This metabolic overload can lead to stress responses, reduced cell viability, and the accumulation of misfolded proteins, ultimately compromising the desired outcome. Furthermore, many recombinant proteins are toxic to the host, necessitating a determination of the maximum expression level the host can tolerate without detrimental effects.
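A toy calculation shows why pushing expression to the maximum can lower the overall yield. The model below is purely illustrative and is an assumption of this example: it supposes that the biomass a culture reaches falls linearly as expression per cell approaches the level the host can tolerate, so that total yield is the product of the two.

```python
def culture_yield(expression_per_cell, tolerated_max):
    """Total yield = expression per cell x relative biomass (toy model).

    Relative biomass is assumed to drop linearly from 1 (no expression)
    to 0 (expression at the maximum the host tolerates).
    """
    relative_biomass = max(0.0, 1.0 - expression_per_cell / tolerated_max)
    return expression_per_cell * relative_biomass


def best_expression_level(levels, tolerated_max):
    """Pick the tested expression level that gives the highest total yield."""
    return max(levels, key=lambda level: culture_yield(level, tolerated_max))


levels = range(0, 11)  # hypothetical expression levels, arbitrary units
for level in levels:
    print(level, round(culture_yield(level, tolerated_max=10), 2))
print("best level:", best_expression_level(levels, tolerated_max=10))
```

With this linear penalty the optimum sits at half of the tolerated maximum. The real relationship has to be measured, but the qualitative point holds: the best expression level per cell is usually below the highest level that can be engineered.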

Transcription is a key control point, where the strength and timing of the promoter regulate mRNA production. Overly strong transcription can saturate the cellular machinery, leading to mRNA degradation or inefficient translation. Similarly, translation efficiency, determined by codon usage, ribosome-binding site strength, and mRNA stability, must be optimized to ensure smooth protein synthesis without overwhelming the ribosomes or triggering stress responses. Gene copy number, influenced by the plasmid replication origin or genomic integration, adds another layer of complexity. High-copy plasmids produce more templates for transcription but can deplete cellular resources, reducing overall expression efficiency. Lastly, growth conditions, including temperature, nutrient availability, pH, and aeration, directly affect cellular health and the folding of expressed proteins. Suboptimal conditions can exacerbate protein misfolding and aggregation, further reducing yields.

The interaction between these parameters complicates optimization, as altering one often affects the others. For instance, increasing gene copy numbers may necessitate adjustments in growth conditions to mitigate metabolic stress. Similarly, optimizing transcription levels might require fine-tuning of translation rates to prevent bottlenecks in protein synthesis. For proteins that are toxic to the host, these trade-offs are tighter still, because any change that raises expression also moves the culture closer to host cell death.

To address these challenges, researchers must employ experimental designs that systematically explore the interplay between these variables. A common approach is to construct libraries of plasmids, each designed with variations in promoter strength, ribosome-binding sites, codon usage, and plasmid replication origins. These libraries enable the testing of numerous combinations to identify the optimal parameters for a specific protein.
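A combinatorial plasmid library of this kind can be enumerated programmatically before any DNA is built. The sketch below lists every combination of parts; the part categories follow the text, while the function name and the part names ("weak", "strong", and so on) are placeholders assumed for the example.

```python
from itertools import product


def design_library(parts):
    """Return one design (a dict of part choices) per combination of parts.

    parts maps a category name to the list of variants available for it.
    """
    categories = list(parts)
    return [
        dict(zip(categories, combination))
        for combination in product(*(parts[category] for category in categories))
    ]


# Placeholder variants for each design parameter.
parts = {
    "promoter": ["weak", "medium", "strong"],
    "rbs": ["weak", "strong"],
    "codon_usage": ["native", "optimized"],
    "origin": ["low_copy", "high_copy"],
}

library = design_library(parts)
print(len(library))  # 3 * 2 * 2 * 2 = 24 constructs
print(library[0])
```

The library size is the product of the number of variants per category, which is why full factorial designs grow quickly and are often replaced by a sampled subset.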

The data generated from such experiments can be used to build quantitative models that describe how the various factors influence protein expression. These models provide critical insights into the underlying biology and allow researchers to predict the effects of genetic modifications. Using these predictive tools, genetic designers can identify solutions that maximize protein yield while minimizing metabolic burden and cellular stress.

Ultimately, optimization of protein expression requires a combination of empirical experimentation and computational modeling. By balancing transcription, translation, gene copy number, and growth conditions while considering the specific characteristics of the recombinant protein, researchers can achieve efficient and sustainable protein production. This integrated approach not only enhances yields but also ensures the production of correctly folded and functional proteins suitable for research, industrial, or therapeutic applications.

Coexpression

The coexpression of two recombinant proteins presents a unique and complex challenge in expression optimization. This scenario often arises when producing proteins composed of multiple polypeptides, such as antibodies, which consist of heavy and light chains. For these proteins to assemble correctly and achieve full functionality, the expression of each gene must be finely balanced to maintain an optimal ratio of the individual components.

Unequal expression levels can lead to misfolded or incomplete protein assemblies, reducing yields of the functional product and wasting cellular resources. For example, an excess of one polypeptide may aggregate or be degraded, while insufficient expression of the complementary chain may limit the production of the final protein. This imbalance can also impose metabolic stress on the host cell, further reducing overall productivity.
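The cost of an imbalance can be made concrete with a stoichiometry calculation: the number of complete assemblies is set by the limiting chain, and whatever is left of the other chain is excess. The sketch below assumes ideal assembly with no losses; the function name and the chain amounts are hypothetical, and the 2:2 stoichiometry corresponds to an antibody with two heavy and two light chains.

```python
def assembled_complexes(chain_amounts, stoichiometry):
    """Return (complete complexes, unassembled excess per chain).

    chain_amounts : amount of each chain produced (arbitrary units)
    stoichiometry : copies of each chain needed per complete complex
    Assumes every chain that can be paired is assembled (ideal case).
    """
    complexes = min(
        chain_amounts[chain] / stoichiometry[chain] for chain in stoichiometry
    )
    excess = {
        chain: chain_amounts[chain] - complexes * stoichiometry[chain]
        for chain in stoichiometry
    }
    return complexes, excess


antibody = {"heavy": 2, "light": 2}
complexes, excess = assembled_complexes({"heavy": 100, "light": 60}, antibody)
print(complexes, excess)
```

In this example the light chain limits assembly, so 40 units of heavy chain are produced without contributing to the final product.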

Optimizing the expression ratio requires adjusting multiple factors, such as promoter strength, ribosome-binding site efficiency, codon usage, and gene copy number for each gene. Because each gene brings its own set of variants, the number of possible combinations grows multiplicatively, so larger plasmid libraries with varying expression levels for each gene are often necessary to explore these parameters systematically.

Achieving the correct balance is essential for maximizing the yield and functionality of complex recombinant proteins like antibodies, making coexpression a highly intricate optimization problem.

Conclusion

Schedule a call to discuss your protein expression project

GenoFAB supports protein expression projects by providing expertise in the design and assembly of plasmid libraries tailored to optimize expression. These libraries enable systematic exploration of key parameters such as promoter strength, ribosome-binding site efficiency, codon usage, and gene copy number. GenoFAB ensures that plasmids are precisely designed to meet specific project needs by leveraging advanced computational tools and experimental workflows. This approach accelerates the identification of optimal expression conditions, minimizing trial-and-error experimentation. Whether for single proteins or complex coexpression systems, GenoFAB empowers researchers to achieve efficient, scalable, and reproducible protein production, driving success in research and industrial applications.

 

 
