
Cloning a coding gene into a non-expression vector


Does it make any sense to clone a CODING gene into a NON-expression vector? Doing this will only give us multiple copies of the gene, which we could also get by running PCR instead (let's say we know the gene sequence).


In principle, you can amplify any gene with PCR and then clone it into a vector of choice. However, it is often not as easy. Most things can be done with expression vectors, but sometimes it is easier to clone a sequence first and then subclone it.

Some sequences are hard to amplify, especially when they get long or very long. Amplifying 2.5 kb can be tricky; digesting and subcloning such a piece might be easier. Additionally, PCR yields can be low, while you get a lot of DNA from a miniprep.

You might want to introduce additional restriction enzyme recognition sites (one can easily be added via PCR, but more is again tricky), linkers, or additional tags that are not present in all expression vectors. Then it is easier to simply do a first cloning step into the non-expression vector and then move the complete insert into the expression vector.


Process of Gene Cloning

Gene cloning is the process of making multiple, identical copies of a particular piece of DNA. Using this technique, we can isolate a single copy of a gene or DNA segment and clone it into an indefinite number of identical copies.

Gene cloning or molecular cloning is an umbrella term that encompasses a number of experimental protocols leading to the transfer of genetic information from one organism to another.

Steps of gene cloning:

1. Isolation of genetic material (DNA)

2. Cutting of DNA at specific locations

3. Amplification of gene of interest

4. Insertion of recombinant DNA into host cell

5. Selection of recombinants

Fig: Steps of Gene Cloning

Isolation of Genetic Material (DNA)

For molecular cloning, both the source DNA that contains the target sequence and the cloning vector must be consistently cut into discrete and reproducible fragments. Several methods, such as mechanical shearing, restriction digestion, cDNA synthesis, and chemical synthesis, are helpful in the isolation of the DNA fragments.

Cutting of DNA at Specific Locations

The vector and the target DNA fragment can be separately digested with the same restriction enzyme. The digested vector and the target DNA fragment are then incubated together in the presence of the enzyme DNA ligase. During incubation, the ligase joins the two types of DNA by forming phosphodiester bonds between them. Thus, the deoxyribose-phosphate backbones of the vector molecule and the target DNA fragment are linked covalently, forming a recombinant DNA molecule.

Another possible outcome is that the sticky ends of the vector rejoin with each other, re-forming a circular vector molecule without any foreign DNA. Treating the digested vector with alkaline phosphatase, or cutting with two different restriction enzymes, reduces this self-ligation.

Amplification of Gene of Interest Using PCR

This is the process of selectively multiplying a specific region of a DNA molecule. Amplification is achieved by a method termed the polymerase chain reaction (PCR). The principle underlying the technique is to heat the double-stranded DNA molecule to a high temperature so that the two DNA strands separate into single-stranded molecules. Short oligonucleotide primers (about 10–18 nucleotides long) then hybridize to each strand of the separated double helix. The reaction also requires a thermostable DNA polymerase, such as Taq polymerase. Each PCR amplification cycle involves three major steps: denaturation, annealing, and extension. In the ideal case, n completed cycles produce 2^n DNA molecules from each starting template.
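The doubling arithmetic above can be sketched in a few lines. This is an idealised illustration only: the function name and the `efficiency` parameter are assumptions added here (real reactions rarely double perfectly and eventually plateau).

```python
def pcr_copies(initial_templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Theoretical number of double-stranded copies after `cycles` PCR cycles.

    efficiency = 1.0 is perfect doubling per cycle (the 2^n case);
    lower values model incomplete amplification (an assumption, not from the text).
    """
    if initial_templates < 0 or cycles < 0:
        raise ValueError("initial_templates and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_templates * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (1, 10, 25, 30):
        print(f"{n:2d} cycles -> {pcr_copies(1, n):.0f} copies per template")
```

With perfect doubling, a single template gives 1024 copies after 10 cycles, which is why 25 to 30 cycles are usually enough to obtain a visible band.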

Insertion of recombinant DNA into host cell:

The next step in a recombinant DNA experiment is the uptake of the cloned DNA by E. coli. Once the recombinant DNA (r-DNA) has been prepared, it is introduced into a suitable host for propagation or expression of the foreign DNA. The process of introducing purified DNA into a bacterial cell is termed transformation.

For E. coli, one common method of transformation treats exponentially growing cells with calcium chloride at low temperature to make them competent, followed by a brief heat shock during which the vector DNA enters the bacterial cells. Because many E. coli strains possess restriction enzymes that degrade foreign DNA, restriction-deficient laboratory strains are normally used as hosts.

Selection of Recombinants

After the insertion of recombinant DNA into the host cell, the next step is the selection or identification of the recombinants. The methods used rely on the expression or non-expression of certain traits, especially an antibiotic resistance gene (e.g., the ampicillin resistance gene) on the plasmid vector. A selectable marker usually provides resistance to a substance that, when added to the culture medium, inhibits the growth of normal cells or tissues; consequently, only transformed cells will grow.

The simplest method of identification is to grow transformed host cells (carrying the ampicillin resistance gene) on medium containing ampicillin. Only cells containing the plasmid can grow and form colonies. Other methods for detecting recombinants are based on the fact that the cloned DNA fragment disrupts the coding sequence of a gene. This is termed insertional inactivation.

Insertional inactivation of marker gene

In this method, the inactivation of a marker gene of the vector by insertion of the foreign DNA helps to distinguish the recombinants from the non-recombinants.

Let us consider a plasmid containing genes conferring resistance to two different antibiotics, i.e., ampicillin and tetracycline. The plain pBR322 vector has two such antibiotic resistance marker genes (ampicillin and tetracycline).

Insertion of the target DNA fragment into the ampicillin resistance gene results in the inactivation of this gene. Thus, host cells with such a recombinant plasmid will be sensitive to ampicillin but resistant to tetracycline. These host cells will die when grown on ampicillin-containing medium but will grow on medium containing tetracycline. Cells carrying self-ligated (i.e., non-recombinant) vectors will grow on medium containing both ampicillin and tetracycline, being resistant to both.
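The replica-plating logic described above can be written as a small decision table. This is a sketch under the assumption stated in the text (the insert goes into the ampicillin resistance gene); the function name and the wording of the returned labels are illustrative.

```python
def classify_colony(grows_on_amp: bool, grows_on_tet: bool) -> str:
    """Interpret growth on ampicillin and tetracycline plates for a
    pBR322-type vector with the insert in the ampicillin resistance gene."""
    if grows_on_amp and grows_on_tet:
        # Both markers intact: vector closed without an insert.
        return "non-recombinant (self-ligated vector)"
    if grows_on_tet and not grows_on_amp:
        # Ampicillin marker disrupted by the insert, tetracycline marker intact.
        return "recombinant (insert in ampicillin resistance gene)"
    if not grows_on_amp and not grows_on_tet:
        # No resistance at all: the cell took up no plasmid.
        return "untransformed (no plasmid)"
    # Ampicillin-resistant but tetracycline-sensitive is not expected here.
    return "unexpected phenotype"


if __name__ == "__main__":
    for amp, tet in [(True, True), (False, True), (False, False), (True, False)]:
        print(f"amp={amp!s:5} tet={tet!s:5} -> {classify_colony(amp, tet)}")
```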

Visual screening method:

Another, similar example of insertional inactivation involves the lacZ gene of E. coli. It is also termed blue-white selection. This is a rapid screening method that relies on the action of a gene product (i.e., an enzyme) on a chromogenic substrate (which produces a blue product on hydrolysis) to distinguish between the recombinants and the non-recombinants.

This method can be explained with the help of a pUC vector. This vector contains a marker gene, lacZ, which encodes the enzyme beta-galactosidase. The enzyme acts on the colourless substrate X-gal and hydrolyzes it into a blue product. Insertion of foreign DNA into the lacZ gene inactivates it, producing a recombinant vector that no longer makes functional enzyme.

The non-recombinants carrying plain vector form blue colonies on the nutrient medium because of the activity of the functional lacZ gene, whereas the recombinants form white colonies because lacZ activity is lost in the recombinant vector.

Fig: Process of Gene Cloning (Selection/Screening of Recombinants)

Plaque morphology

It is an important method of screening the recombinant phages. Bacteriophage lambda contains a CI gene which encodes a CI protein (a repressor protein). This CI repressor protein is responsible for establishing lysogenic life cycle in the host E. coli cells.


EXPERIMENTAL PROCEDURES

General Cloning Strategy—

A widely used method to generate recombinant genes involves amplifying the gene of interest by PCR and then ligating the PCR product into the desired plasmid vector. In order to facilitate this process, PCR primers can be designed to incorporate appropriate restriction sites at the ends of the target DNA for subsequent digestion and ligation into the vector of choice. The addition of restriction sites by PCR during amplification of a genetic sequence allows for the insertion of virtually any target gene into any vector [ 12 ].

This cloning strategy was used to create a recombinant M. sexta GFP-His6-serpin cDNA construct inserted into the expression vector pGFPuv (Clontech, Palo Alto, CA) (Fig. 1A). Specifically, a M. sexta-mutagenized cDNA clone of functional serpin1B(A343K) previously engineered with a His6 tag at the N terminus of the protein was used as a template in PCR synthesis (Fig. 1B) [ 9 ]. The PCR primers were designed to add a 3′ (coding strand) EcoRI site and a 5′ (coding strand) SacI site to the His6-serpin sequence (Fig. 2A). Both the His6-serpin PCR product and the pGFPuv expression vector were digested by SacI and EcoRI. The His6-serpin PCR product then was inserted at the 3′ end of the GFP coding sequence using the 5′ multi-cloning site of the vector (Fig. 1C). The final chimeric protein carries GFP at the N-terminal end, a His6 affinity tag flanked by flexible glycine residues in the middle of the construct, and the serpin at the C-terminal end (Fig. 1D).

Sequence Management—

A computer program was used throughout the course to document and manipulate DNA sequences, including generating restriction maps and translating DNA sequences into protein sequences. While many sequence management programs are available, the shareware program DNA Strider 1.1 (Macintosh) was utilized [ 13 ]. In many cases, complete plasmid sequences can be downloaded from commercial websites and manipulated by these programs. The use of such a DNA sequence management program, while not required, does familiarize students with DNA manipulation and database management. Sequence management programs are extremely helpful in developing initial cloning strategies.

Organization of the Laboratory Series—

Our students worked individually, with each one performing their own set of reactions throughout the semester. They did however engage in a great deal of collaboration and support, sharing materials, discussing experimental options, coordinating agarose gel use, etc. A faculty member was present throughout the laboratory period, providing guidance and appropriate intervention. The laboratory period was a 3-h block one afternoon a week. On several occasions, students were asked to perform brief preparations (e.g. inoculate cultures) on the afternoon before our laboratory session. A proposed timeline of laboratory exercises over the course of the semester is shown in Table I.

Week One

Transformation—

An M. sexta serpin cDNA serpin1B(A343K) was previously cloned into the expression plasmid pQME-60, which added DNA encoding a His6 epitope tag to the N-terminal end of the protein to create the plasmid serpin1B(A343K) [ 9 ]. In principle, any plasmid, genomic DNA isolation, or cDNA library could serve as the template for PCR at the outset of this experiment. Our students performed separate transformations of the plasmid serpin1B(A343K) and pGFPuv into competent Escherichia coli cells. During all genetic manipulations, a stable E. coli cloning strain lacking the recA recombination gene, such as XL-1 Blue (Stratagene, La Jolla, CA) or DH5α (Invitrogen, San Diego, CA), was used. Subcloning-grade competent cells yielding ∼1 × 10^6 transformants per microgram of DNA were obtained from commercial sources. The transformation protocol recommended by the competent cell manufacturer took under 2 h, and the resulting cells were plated on ampicillin-containing plates. These plates were kept at 37 °C overnight (12–16 h) to allow sufficient bacterial colony growth. On the day following the transformation, students returned to the laboratory, removed their plates from the incubator, noted their results, and stored their plates at 4 °C for the intervening period. The students were instructed to calculate the number of colonies per microgram of DNA for their reactions.
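The colonies-per-microgram calculation the students were asked to perform might look like the sketch below. The function name, the `fraction_plated` parameter, and the example numbers are illustrative assumptions, not values from the course.

```python
def transformation_efficiency(colonies: int, dna_ng: float,
                              fraction_plated: float = 1.0) -> float:
    """Transformants per microgram of DNA.

    colonies        -- colonies counted on the plate
    dna_ng          -- nanograms of plasmid added to the competent cells
    fraction_plated -- fraction of the transformation mix spread on the plate
    """
    if colonies < 0:
        raise ValueError("colonies must be non-negative")
    if dna_ng <= 0:
        raise ValueError("dna_ng must be positive")
    if not 0.0 < fraction_plated <= 1.0:
        raise ValueError("fraction_plated must be in (0, 1]")
    dna_ug_plated = (dna_ng / 1000.0) * fraction_plated
    return colonies / dna_ug_plated


if __name__ == "__main__":
    # Hypothetical plate: 150 colonies from 1 ng of plasmid, one tenth plated.
    print(f"{transformation_efficiency(150, 1.0, 0.1):.2e} transformants per microgram")
```

A result near 10^6 would match the subcloning-grade efficiency quoted by the supplier; a much lower number points to a problem with the cells or the protocol.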

A negative control (no plasmid DNA) and a positive control (plasmid DNA from a commercial source, e.g. pUC18) were included for the transformation. Our students used pre-prepared warm autoclaved Luria-Bertani (LB)/agar solution and poured plates at the beginning of the laboratory period. A suitable amount of 1000 × ampicillin stock solution (0.1 g ampicillin/ml of 50% ethanol/water solution, stored at −20 °C) was added to the liquid agar prior to pouring.
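The "suitable amount" of a 1000 × stock follows from simple dilution arithmetic. The sketch below assumes the stock concentration given above (0.1 g/ml, i.e. 100 mg/ml); the function names and the 500 ml batch size are illustrative.

```python
def stock_volume_ul(medium_ml: float, fold: float = 1000.0) -> float:
    """Microlitres of an N-fold stock needed for `medium_ml` of medium."""
    if medium_ml <= 0 or fold <= 0:
        raise ValueError("medium_ml and fold must be positive")
    return medium_ml * 1000.0 / fold


def final_concentration_ug_per_ml(stock_mg_per_ml: float, fold: float = 1000.0) -> float:
    """Working concentration (micrograms/ml) after diluting the stock `fold` times."""
    if stock_mg_per_ml <= 0 or fold <= 0:
        raise ValueError("stock_mg_per_ml and fold must be positive")
    return stock_mg_per_ml * 1000.0 / fold


if __name__ == "__main__":
    # Hypothetical batch: 500 ml of LB agar.
    print(stock_volume_ul(500.0), "microlitres of stock")
    print(final_concentration_ug_per_ml(100.0), "micrograms ampicillin per ml")
```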

Week Two

Plasmid Preparation—

On the afternoon prior to the second laboratory period, students started small (2 ml) cultures of LB ampicillin broth from a single colony on their transformation plates. The culture was incubated at 37 °C with shaking (225 rpm) overnight. Students performed this inoculation in ∼15 min; alternatively, the faculty or teaching assistant can begin these cultures. Incubation of the cultures should not exceed 24 h to ensure the plasmids are produced in healthy cells.

During the laboratory session, plasmid DNA suitable for PCR was generated from this culture using a miniprep DNA isolation kit (available from Qiagen at a cost of about $1 per plasmid preparation). The miniprep procedure should take the average student ∼1 h and yield 50 μl of purified plasmid at a concentration of ∼200 ng/μl.

Restriction Digests—

In order to confirm that the plasmid preparation was successful, a diagnostic restriction digest was performed. Students were given a plasmid map and asked to design a restriction digestion that gave a fragment pattern characteristic of the correct plasmid. For example, digesting the pGFPuv vector with the enzyme SacI or EcoRI should yield a single linearized DNA fragment of 3.3 kb as visualized on an agarose gel. These enzymes and digestion protocols were obtained from commercial sources (e.g. New England Biolabs, Promega). Generally, 200 ng of purified plasmid DNA were digested in a total volume of 10 μl. The digestion can readily be scaled up to digest larger amounts of DNA. The volume of enzyme needed was calculated using the activity of each enzyme preparation as indicated by the manufacturer. An incubation of 30 min is usually sufficient for an appropriate amount of enzyme to specifically cut the majority of the DNA. If desired, more complex digests can be performed utilizing more than one enzyme simultaneously. If this option is pursued, students must select a digestion buffer and reaction conditions that are compatible with all of the enzymes. The digested DNA fragments were stored at −20 °C until the following week.
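When designing a diagnostic digest, it helps to predict the fragment pattern before running the gel. The sketch below computes fragment lengths for a circular plasmid from a list of cut positions. The 3300 bp length follows the 3.3 kb pGFPuv size quoted above, but the cut coordinates are hypothetical placeholders, not the real map positions.

```python
from typing import List


def circular_digest(plasmid_length: int, cut_positions: List[int]) -> List[int]:
    """Fragment lengths (largest first) after cutting a circular plasmid.

    An empty cut list returns [] because the plasmid stays circular.
    """
    if plasmid_length <= 0:
        raise ValueError("plasmid_length must be positive")
    cuts = sorted(set(p % plasmid_length for p in cut_positions))
    if not cuts:
        return []
    # Distances between neighbouring cut sites...
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    # ...plus the fragment that wraps around the origin of the map.
    fragments.append(plasmid_length - cuts[-1] + cuts[0])
    return sorted(fragments, reverse=True)


if __name__ == "__main__":
    print(circular_digest(3300, [600]))        # single cut: one linear fragment
    print(circular_digest(3300, [600, 680]))   # two nearby cuts: large and small piece
```

A single cutter always yields one full-length linear band, so any extra band in such a lane suggests the wrong plasmid or an unexpected second site.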

Week Three

DNA Fragment Visualization by Electrophoresis through an Agarose Gel—

Agarose E-gels (Invitrogen) provide a safe and rapid method to visualize the DNA fragments. The gels do not require electrophoresis buffer and run in less than 45 min. The agarose gel containing the fluorophore ethidium bromide is sealed between two plastic plates, thus minimizing student exposure to this carcinogen. In addition, no DNA loading buffer is required, streamlining the loading process. Alternatively, a standard agarose gel that is poured by the students and appropriately stained to visualize the DNA can be used. Following electrophoresis, the DNA fragments were visualized using an ultraviolet light source. An example of a 1.2% E-gel on a transilluminator is shown in Fig. 3. Approximately 200 ng of DNA were loaded in each lane (equivalent to about 1 μl of DNA preparation). This amount of DNA gave a readily apparent band. Larger amounts of DNA may be needed to see multiple bands or smaller DNA fragments (less than 1.0 kb in length). The linearized DNA was compared with a DNA ladder (Sigma or New England Biolabs) to ascertain that a fragment of the correct length was obtained. Uncut plasmid DNA yields a pattern of multiple DNA bands that do not directly correspond to the true length of the plasmid due to differential DNA supercoiling of the circular plasmid; this was a useful control for determining if the experimental DNAs were fully cut.

Weeks Three and Four

Primer Design—

In order to amplify the His6-serpin cDNA sequence from the plasmid serpin1B(A343K), customized PCR primers were designed. With the help of the instructors, students designed and ordered PCR primers that introduced restriction sites to the ends of the cDNA. The primers consisted of two distinct regions: the 3′ ends of the primers are complementary to the cDNA and served as the template for polymerase extension. The 5′ ends of the primers are noncomplementary to the serpin cDNA and allowed introduction of specific DNA sequences to the end of the PCR product (see Fig. 2A).

The “front” primer was designed to introduce a new SacI site in the PCR serpin cDNA amplification product, which allows for ligation to the SacI site in the pGFPuv vector (see Fig. 2B). The 3′ end of the primer consisted of a region complementary to the 5′ end of the coding strand of the His6-serpin cDNA. The Tm of this interaction of the primer with the DNA template should be ∼60 °C or greater. A general rule of thumb is that a complementary G-C pair contributes to the Tm by about 4 °C, while an A-T pair contributes to the Tm by about 2 °C. The 5′ noncomplementary region of the primer included DNA coding for a glycine residue upstream of the His6 tag to allow for the GFP domain to orient itself independently of the His6 tag and the serpin protein. The addition of the glycine also eliminated the GFP's endogenous stop codon, allowing translation of the full-length GFP-serpin sequence. Finally, the 5′ end of the primer was designed to reconstruct the last few codons of the GFP gene, add a SacI site, and a DNA overhang to allow cleavage of the SacI site in the His6-serpin PCR product. Restriction enzymes often require a DNA overhang to efficiently cleave the restriction site [ 14 ]. More information on appropriate DNA overhangs for individual restriction enzymes is available from New England Biolabs (Beverly, MA; www.neb.com).
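The 4 °C / 2 °C rule of thumb is easy to apply by hand, and equally easy to script. The sketch below applies it to the template-complementary part of a primer only; the function name and the example sequence are hypothetical, and the rule is a rough estimate best suited to short oligonucleotides.

```python
def rule_of_thumb_tm(annealing_region: str) -> int:
    """Estimate Tm in degrees C as 4 per G or C plus 2 per A or T."""
    seq = annealing_region.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


if __name__ == "__main__":
    # Hypothetical 20-base annealing region with 10 G/C and 10 A/T bases.
    print(rule_of_thumb_tm("ATGCATGCATGCATGCATGC"))
```

Only the 3′ complementary region should be passed in, because the 5′ tail carrying the restriction site does not pair with the template in the first cycles.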

The “back” PCR primer, designed to bind to the 5′ end of the serpin cDNA noncoding strand sequence, introduces an EcoRI site to the serpin cDNA (see Fig. 2C). The 3′ end of this primer consisted of a sequence complementary to the serpin cDNA. This complementary section ended at the stop codon of the serpin cDNA. The noncomplementary portion of this primer (5′ end) added an additional stop codon to ensure termination of translation and contained the introduced EcoRI restriction site. Finally, the 5′ end of the “back” primer contained a multibase DNA overhang beyond the EcoRI restriction site.
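The overall layout of the two tailed primers (5′ overhang, restriction site, then the template-matching 3′ region) can be sketched as follows. This is a simplified illustration: the template, the `GCGC` overhang, and the function names are hypothetical, and the extra glycine and stop codons described above are deliberately left out. The SacI (GAGCTC) and EcoRI (GAATTC) recognition sequences are the standard ones.

```python
SACI_SITE = "GAGCTC"
ECORI_SITE = "GAATTC"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primers(coding_strand: str, anneal_len: int = 20, overhang: str = "GCGC"):
    """Return (front, back) primers, each written 5' to 3'.

    front: overhang + SacI site + first bases of the coding strand
    back:  overhang + EcoRI site + reverse complement of the last bases
    """
    coding = coding_strand.upper()
    if anneal_len <= 0 or anneal_len > len(coding):
        raise ValueError("anneal_len must be between 1 and the template length")
    front = overhang + SACI_SITE + coding[:anneal_len]
    back = overhang + ECORI_SITE + reverse_complement(coding[-anneal_len:])
    return front, back


if __name__ == "__main__":
    template = "ATGAAACCCGGGAGCTAA"  # hypothetical 18-base coding strand
    front, back = tailed_primers(template, anneal_len=6)
    print("front:", front)
    print("back: ", back)
```

The back primer is built from the reverse complement because it must pair with the coding strand and be extended toward the start of the gene.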

Over the past decade, the cost of DNA oligonucleotide primers has dramatically decreased and primers can now readily be ordered online from companies such as Integrated DNA Technologies or Life Technologies and arrive in less than 3 days at a cost of approximately .50/DNA base for a 100-nmol preparation. Therefore, primers were designed and ordered online in one laboratory period. They arrived prior to the next weekly meeting. The lyophilized primers were resuspended by vortexing to a concentration of 100 μM in molecular biology-grade water and subsequently aliquoted and stored at −20 °C.

Acknowledgements

We thank Ishan Dillon, Lacey Verkamp, Monica Silver, Kate Ware, Abigail Gay, Kjersti Knox, Lauren Phillips, and Zachary Seymour for their enthusiasm, diligence, and perseverance during the course of the semester. The generous donation of the His6-serpin cDNA by Mike Kanost and Haobo Jiang is greatly appreciated. In addition, thanks go to Melissa Foster, William Harvey, the Howard Hughes Medical Institute (HHMI), and the Earlham College Biology and Chemistry Departments, for provisions of materials and equipment used in the course of the laboratory sequence. J. A. B. would like to thank HHMI and the members of Earlham College for making an enjoyable postdoctoral teaching experience possible.

Week 5

PCR Amplification of the Target Sequence—

Amplification of the target sequence by PCR was performed utilizing the thermophilic enzyme Pfu Turbo (Stratagene), which has a higher fidelity than many other traditional PCR enzymes (e.g. Taq polymerase), reducing the frequency of errors in the amplification product. A detailed protocol for PCR utilizing Pfu Turbo can be obtained from Stratagene. To run the reaction, 100 ng of plasmid DNA, 100 ng of each primer, 5 μl of 10 × Pfu buffer mix, dNTPs to a concentration of 500 nM each, deionized water to 50 μl, and 1 μl of enzyme (2.5 U) were mixed in a thin-walled PCR tube. A single cycle consisted of a denaturing phase at 95 °C for 2 min, an annealing phase at 67 °C for 30 s, and finally an extension phase at 72 °C for 3 min. This sequence was repeated for 25 cycles. The following considerations should be taken into account when designing a PCR cycle. The polymerase extension time should be at least 2 min/kb of PCR product. The annealing temperature should be 5 °C lower than the lowest primer Tm. If a thermocycler is used without a heated lid, 30 μl of mineral oil should be laid on top of each reaction to prevent evaporation of the reaction mixture. As the PCR run time can take 3–4 h, the reactions can be started during the laboratory period and the thermocycler can be programmed to hold the reactions at 4 °C or 10 °C overnight. PCR products can then be stored for extended periods at −20 °C.
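The two design rules above (at least 2 min of extension per kb, annealing 5 °C below the lowest primer Tm) can be turned into a small helper. This is a sketch: the function name, the dictionary keys, and the example Tm of 72 °C are assumptions, and the time estimate counts only the programmed holds, not the ramping between temperatures.

```python
def pcr_program(product_bp: int, lowest_primer_tm_c: float, cycles: int = 25,
                denature_s: int = 120, anneal_s: int = 30,
                extension_s_per_kb: int = 120) -> dict:
    """Suggest cycling parameters from product length and primer Tm."""
    if product_bp <= 0 or cycles <= 0:
        raise ValueError("product_bp and cycles must be positive")
    # Ceiling division keeps the extension at or above the 2 min/kb minimum.
    extension_s = -(-product_bp * extension_s_per_kb // 1000)
    return {
        "anneal_temp_c": lowest_primer_tm_c - 5,
        "extension_s": extension_s,
        "hold_time_s": cycles * (denature_s + anneal_s + extension_s),
    }


if __name__ == "__main__":
    # 1.2 kb product as in the text; the primer Tm of 72 C is hypothetical.
    program = pcr_program(1200, 72)
    print(program)
    print(f"programmed holds: {program['hold_time_s'] / 60:.1f} min")
```

For a 1.2 kb product the minimum extension is 144 s, so the 3 min extension used in the course leaves a comfortable margin.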

Week 6

Restriction Digestion of the PCR Product and Gene Vector—

The presence of a PCR product of the appropriate size (1.2 kb) was verified by E-gel (see Fig. 3). The concentration of PCR product may vary greatly so both a relatively large (5 μl) and small amount (1 μl) of the PCR should be loaded into the gel. The desired PCR insert should be the major product visible on the gel.

In preparation for directional insertion of the PCR fragment into the pGFPuv vector, both the PCR product and vector were cut with SacI and EcoRI. This digestion yields complementary “sticky” single-stranded ends that facilitate ligation of the insert into the linearized vector. Restriction digestions using two different enzymes can be performed simultaneously if both enzymes are active in the same buffer. The results of the restriction digestions of the vector and the PCR product were analyzed on an E-gel to ensure that they gave DNA fragments of the expected lengths. Because the restriction sites are near the end of the amplified insert, the mobility of this PCR product does not change appreciably upon digestion. Similarly, the digested vector closely resembled the singly cut vector as only a short DNA piece (less than 100 bases) was excised from the vector. This step was used to ensure that no uncut vector remained and that there were no unintended digestions in the vector and PCR product. Following the digestion, the DNA was stored at −20 °C. The single-stranded DNA “sticky” ends are especially sensitive to degradation; thus, nuclease-free reagents were used during all manipulations and care was taken to avoid contamination of the reactions. Finally, a preparation of vector digested with only EcoRI was also produced in this manner for use as a positive control in the subsequent ligation reaction.

Week 7

Spin Column Purification of Large DNA Fragments—

Large DNA fragments produced by restriction digests of the PCR product in the previous week were purified by a spin column protocol to remove contaminants. Large DNA fragments were rapidly (in less than 30 min) recovered in high purity using Qiaquick PCR spin purification columns according to the vendor's protocol (Qiagen, Valencia, CA). This step was used to remove enzymes, small DNA fragments (less than 100 bases), and undesirable buffer components. In many cloning schemes, the use of a DNA purification column may provide a rapid alternative to gel-purification techniques.

Treatment of the Vector with Calf Intestinal Alkaline Phosphatase—

After restriction digestion, the vector was treated with calf intestinal alkaline phosphatase (CIAP; Promega, Madison, WI) to prevent self-ligation in the subsequent step. CIAP removes 5′ phosphates from DNA. Circular plasmids transform and replicate much more efficiently than linearized DNA. While the digestion of the vector with two different enzymes should prevent ligation of the vector alone, the use of CIAP prevents ligation of any singly cut vector, reducing the number of false positives in the subsequent ligation reactions. An excess of CIAP was used in a 30-min reaction to treat sufficient linearized vector. The vector was then immediately purified using a PCR spin column (Qiagen) to remove the CIAP enzyme. Purified vector was stored at −20 °C until the following week. An aliquot of digested vector not treated with CIAP was reserved for use as a control reaction in the ligation step. EcoRI singly cut vector was also treated with CIAP as a control.

Week 8

Ligation Reaction and Transformation—

Purified digested PCR insert and vector (treated with CIAP) were ligated to form the new recombinant plasmid pMAJILK containing the GFPuv- His6-serpin sequence (see Fig. 1). Utilizing a recently introduced quick ligation kit (New England Biolabs), the ligation incubation time was 5 min at room temperature. This allowed the ligation reaction and transformation to be completed in a single laboratory period. A molar ratio of ∼2:1 insert-to-vector was used in the reaction. At least 200 ng of linearized vector should be utilized. Initially, only half of the ligation product was used in the transformation in case the transformation was unsuccessful. Unused insert and vector were stored at −80 °C for future ligation attempts.
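Because the ratio is molar, the mass of insert depends on the relative lengths of insert and vector. The sketch below uses the sizes quoted in this procedure (1.2 kb insert, 3.3 kb vector, 200 ng vector, 2:1 ratio); the function name is illustrative, and the small fragment excised from the vector is ignored.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   insert_to_vector_ratio: float = 2.0) -> float:
    """Nanograms of insert giving the requested insert:vector molar ratio.

    Moles of double-stranded DNA are proportional to mass / length, so
    insert_ng = vector_ng * (insert_bp / vector_bp) * ratio.
    """
    if min(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * insert_bp / vector_bp * insert_to_vector_ratio


if __name__ == "__main__":
    needed = insert_mass_ng(vector_ng=200, vector_bp=3300, insert_bp=1200)
    print(f"{needed:.0f} ng of insert for 200 ng of vector at 2:1")
```

The insert is about a third the length of the vector, so roughly 145 ng of insert already provides twice as many molecules as 200 ng of vector.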

Transformations of the ligation products were done using high-efficiency competent cell preparations with transformation efficiencies of at least 10^6 transformants per microgram of viable plasmid DNA. The transformation reactions were plated on LB agar plates containing the appropriate antibiotic. Multiple agar plates should be used in order to plate the entire ligation reaction volume. To check the effectiveness of ligation, a sample of expression vector cut only with EcoRI (no CIAP) was treated with ligase and compared with an untreated sample. To check the CIAP reaction, a sample of the EcoRI cut vector treated with CIAP also was included. As a negative control, a sample containing the CIAP-treated vector and insert but not exposed to ligase was also transformed. Finally, positive and negative control transformation reactions as described previously were employed to verify the effectiveness of the transformation procedure.

Week 9

Restriction Analysis of the Gene Products—

Following plating of the transformation reaction, a small number of colonies were typically observed (less than 50). A limited number of colonies (∼5) were grown in separate 2-ml cultures containing antibiotic, and a plasmid DNA isolation of each culture was performed. As described previously, these cultures were inoculated the day prior to the laboratory meeting. A diagnostic restriction digestion using EcoRI and SacI was performed to assay whether the isolated plasmids contained the recombinant insert. The digestions were stored at −20 °C until the following week.

Week 10

Agarose Gel of the Diagnostic Restriction Digest—

The diagnostic restriction digestions were visualized on an E-gel (see Fig. 3). Digestion of plasmids without the recombinant insert yielded only linearized vector. Digestion of the desired recombinant plasmids from recombinant colonies resulted in the observation of a liberated insert (1.2 kb) containing the His6-serpin gene as well as the linearized vector. Utilizing this procedure, our students consistently observed insert in ∼50% of the plasmid preparations. While the observation of an insert in these diagnostic restriction digests strongly indicates successful completion of the subcloning process, plasmids with inserts should be investigated further by sequencing.

Design and Ordering of Sequencing Primers—

Primers for sequencing were 22–25 bases in length and were completely complementary to the gene of interest. Selection of the site on the gene for the design of primers was based on several factors. First, the spacing of primers was approximately every 400 bases to enable complete and accurate sequencing of the entire sequence of interest. Second, the melting temperature was 55–60 °C, with a GC content of 45–60% in accordance with the requirements of the sequencing facility. Third, primers were checked for self-priming and loops using software from Integrated DNA Technologies (www.IDTDNA.com). Finally, the primers were located ∼50 bases upstream (5′) of the beginning of the sequence to be read. For some commonly used plasmids, commercially prepared sequencing primers are available.
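Two of the criteria above, length and GC content, are simple enough to check automatically before ordering. The sketch below covers only those two; the function names and the example primer are hypothetical, and melting temperature and self-priming are left to dedicated tools because their results depend on the calculation method used.

```python
from typing import List


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    upper = seq.upper()
    return 100.0 * (upper.count("G") + upper.count("C")) / len(upper)


def check_sequencing_primer(seq: str) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if not 22 <= len(seq) <= 25:
        problems.append(f"length {len(seq)} is outside 22-25 bases")
    gc = gc_percent(seq)
    if not 45.0 <= gc <= 60.0:
        problems.append(f"GC content {gc:.1f}% is outside 45-60%")
    return problems


if __name__ == "__main__":
    candidate = "ATGCATGCATGCATGCATGCATG"  # hypothetical 23-base primer
    print(check_sequencing_primer(candidate) or "passes length and GC checks")
```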

Week 11

Sequencing of the Plasmids Containing the DNA Insert—

A crucial test of a successful cloning experiment is the sequencing of the gene insert along with flanking sections of the vector. Although automated sequencers are generally too expensive for small, liberal arts colleges, DNA sequences can be easily and inexpensively obtained through outside vendors. These vendors typically require that they be sent purified plasmid and sequencing primer. This process typically requires less than 10–12 μl of spin-column-purified plasmid sample with a concentration of 150–200 ng/μl. A typical sequencing reaction yields about 700 bases of readable sequences and costs $9–13 per reaction, depending on the company. We used Genegateway (www.genegateway.com), which cost $9 per reaction and had a very short turn-around time (less than 3 days).

Week 12

Interpretation of the Plasmid Sequencing Results—

The sequence was transmitted from Genegateway by E-mail in a compressed file which, when expanded, gave both the original sequence chromatogram and a sequence file. An example chromatogram is shown in Fig. 4. In this method of automated sequencing, each terminal base is labeled with a different fluorescent dye [ 15 ]. As the different DNA fragments migrate through a matrix and pass a detector, the observed relative fluorescent intensities are used to determine the identity of the terminal DNA base. The automated sequence read can be aligned with the expected vector/gene sequence using a program such as ClustalW. There are many ClustalW sites on the web, such as pir.georgetown.edu/pirwww/search/multaln.html. Discrepancies between the expected and obtained DNA sequences must be investigated further. Because there can be errors in the automated reading of bases, it is highly recommended that students edit the chromatogram where discrepancies occur to ensure that the sequence read correctly corresponds with the chromatogram. Special attention should be paid to the regions of the primers because error in primer synthesis can result in unexpected mutations, and did so in one of our selected clones.
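Once the read has been aligned to the expected sequence, listing the discrepancies tells students which chromatogram positions to inspect. The sketch below is not an alignment program and is no substitute for ClustalW: it assumes the two sequences are already aligned and of equal length, and the function name and example sequences are hypothetical.

```python
from typing import List, Tuple


def list_discrepancies(expected: str, observed: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, expected base, observed base) for each mismatch.

    Both sequences must already be aligned and of equal length.
    """
    if len(expected) != len(observed):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(expected.upper(), observed.upper())
    return [(i, e, o) for i, (e, o) in enumerate(pairs, start=1) if e != o]


if __name__ == "__main__":
    expected = "ATGCGTACCGGA"   # hypothetical expected sequence
    observed = "ATGAGTACCGNA"   # hypothetical read; N marks an uncalled base
    for pos, exp_base, obs_base in list_discrepancies(expected, observed):
        print(f"position {pos}: expected {exp_base}, read {obs_base}")
```

Positions reported with an `N` are usually base-calling ambiguities rather than mutations, which is why checking the chromatogram at each reported position matters.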


Key Terms

  • polymerase chain reaction: A technique in molecular biology for creating multiple copies of DNA from a sample used in genetic fingerprinting etc.
  • molecular cloning: a set of experimental methods in molecular biology that are used to assemble recombinant DNA molecules and to direct their replication within host organisms.
  • restriction enzyme: An endonuclease that catalyzes double-strand cleavage of DNA containing a specific sequence.

Recombinant DNA technology, also referred to as molecular cloning, is similar to polymerase chain reaction (PCR) in that it permits the replication of a specific DNA sequence. The fundamental difference between the two methods is that molecular cloning involves replication of the DNA in a living microorganism, while PCR replicates DNA in an in vitro solution, free of living cells.

In standard molecular cloning experiments, the cloning of any DNA fragment essentially involves seven steps:

  1. Choice of host organism and cloning vector
  2. Preparation of vector DNA
  3. Preparation of DNA to be cloned
  4. Creation of recombinant DNA
  5. Introduction of recombinant DNA into host organism
  6. Selection of organisms containing recombinant DNA
  7. Screening for clones with desired DNA inserts and biological properties

Although a very large number of host organisms and molecular cloning vectors are in use, the great majority of molecular cloning experiments begin with a laboratory strain of the bacterium E. coli (Escherichia coli) and a plasmid cloning vector. E. coli and plasmid vectors are in common use because they are technically sophisticated, versatile, widely available, and offer rapid growth of recombinant organisms with minimal equipment. The cloning vector is treated with a restriction endonuclease to cleave the DNA at the site where foreign DNA will be inserted. The restriction enzyme is chosen to generate a configuration at the cleavage site that is compatible with that at the ends of the foreign DNA.

Typically, this is done by cleaving the vector DNA and foreign DNA with the same restriction enzyme, for example EcoRI. Most modern vectors contain a variety of convenient cleavage sites that are unique within the vector molecule (so that the vector can only be cleaved at a single site) and are located within a gene (frequently beta-galactosidase) whose inactivation can be used to distinguish recombinant from non-recombinant organisms at a later step in the process. To improve the ratio of recombinant to non-recombinant organisms, the cleaved vector may be treated with an enzyme (alkaline phosphatase) that dephosphorylates the vector ends. Vector molecules with dephosphorylated ends cannot be religated to themselves by DNA ligase, so the vector can recircularize, and therefore replicate in the host, only if foreign DNA (which retains its 5′ phosphates) is joined into the cleavage site.
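The idea of a unique cleavage site can be checked on a sequence in a few lines. The sketch below finds EcoRI sites (recognition sequence GAATTC, cut between G and AATTC) and cuts a linear sequence at each of them. It models only the top strand, so overhangs are not represented, and the helper names and test sequence are hypothetical.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts after the G: G^AATTC


def find_sites(seq, site=ECORI_SITE):
    """Return the 0-based start position of every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest_linear(seq, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Cut a linear top-strand sequence at each site and return the fragments."""
    seq = seq.upper()
    fragments = []
    previous = 0
    for pos in find_sites(seq, site):
        cut = pos + cut_offset
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


if __name__ == "__main__":
    insert_source = "TTGAATTCAAAGGGGAATTCCC"   # hypothetical sequence
    print(find_sites(insert_source))
    print(digest_linear(insert_source))
```

A vector suitable for this kind of cloning would give exactly one position from `find_sites`, so that digestion linearizes it without removing any of its sequence.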

For cloning of genomic DNA, the DNA to be cloned is extracted from the organism of interest. Polymerase chain reaction (PCR) methods are often used for amplification of specific DNA or RNA (RT-PCR) sequences prior to molecular cloning. The purified DNA is then treated with a restriction enzyme to generate fragments with ends capable of being linked to those of the vector. If necessary, short double-stranded segments of DNA (linkers) containing desired restriction sites may be added to create end structures that are compatible with the vector. The creation of recombinant DNA is in many ways the simplest step of the molecular cloning process. DNA prepared from the vector and foreign source are simply mixed together at appropriate concentrations and exposed to an enzyme (DNA ligase) that covalently links the ends together. This joining reaction is often termed ligation. The resulting DNA mixture containing randomly joined ends is then ready for introduction into the host organism. The DNA mixture, previously manipulated in vitro, is moved back into a living cell, referred to as the host organism. The methods used to get DNA into cells are varied, and the name applied to this step in the molecular cloning process will often depend upon the experimental method that is chosen (e.g. transformation, transduction, transfection, electroporation).

When microorganisms are able to take up and replicate DNA from their local environment, the process is termed transformation, and cells that are in a physiological state such that they can take up DNA are said to be competent. When bacterial cells are used as host organisms, the selectable marker is usually a gene that confers resistance to an antibiotic that would otherwise kill the cells, typically ampicillin. Cells harboring the vector will survive when exposed to the antibiotic, while those that have failed to take up vector sequences will die. Modern bacterial cloning vectors (e.g. pUC19) use the blue-white screening system to distinguish colonies (clones) of transgenic cells from those that contain the parental vector.

In these vectors, foreign DNA is inserted into a sequence that encodes an essential part of beta-galactosidase, an enzyme whose activity results in formation of a blue-colored colony on the culture medium that is used for this work. Insertion of the foreign DNA into the beta-galactosidase coding sequence disables the function of the enzyme, so that colonies containing recombinant plasmids remain colorless (white). Therefore, recombinant clones are easily identified.

Figure: Blue-white screen. The blue-white screen is a screening technique that allows for the detection of successful ligations in vector-based gene cloning. DNA of interest is ligated into a vector. The vector is then transformed into competent cells (bacteria). The competent cells are grown in the presence of X-gal. If the ligation was successful, the bacterial colony will be white; if not, the colony will be blue. This technique allows for the quick and easy detection of successful ligation.


A cloning vector is a small DNA molecule that carries a foreign DNA fragment into a host cell, whereas an expression vector is a type of vector that, in addition to carrying the gene into the host, allows it to be expressed and the corresponding protein to be produced. This is the key difference between the two: a cloning vector only introduces and replicates a foreign DNA fragment in the host, while an expression vector also expresses the introduced gene as protein.

A cloning vector consists of an origin of replication, restriction sites, and a selectable marker. An expression vector contains these same elements plus the sequences needed for expression: enhancers, a promoter region, a transcription initiation sequence, and a transcription termination sequence. Plasmids, bacteriophages, bacterial artificial chromosomes, cosmids, mammalian artificial chromosomes, and yeast artificial chromosomes are examples of cloning vectors, whereas expression vectors are mostly plasmids.



BioBrick parts are DNA sequences that follow a specific restriction-enzyme assembly standard. There are specific standards for compiling and maintaining a registry of these BioBrick parts, as well as assembly methods used to construct those parts into devices and systems that produce a desired protein in model microorganisms. BioBricks can be thought of as building blocks, like Lego bricks, that can be put together in different ways to assemble larger synthetic biological circuits that together have a unique function. These biological circuits can then be incorporated into living cells, such as E. coli, to construct new biological systems and to carry out defined functions. Examples of BioBrick parts include coding sequences, promoters, ribosomal binding sites, and terminators.
The BioBricks standard is an empirical, universal standard for defining the sequences of nucleic acids of the parts and describes how they can be combined with other parts' sequences within cloning vectors. This allows for standardization of the BioBrick parts. The different standards used to define BioBrick parts are described in the "Assembly Standards for BioBricks" page. BioBrick assembly pertains to the methods and procedures used to compile the parts' sequences, which is described in more depth in the "Assembly techniques for BioBricks" page.
BioBricks are useful in many industries that rely on cloning strategies to make products such as therapeutic drugs or enzymes used in foods. These industries include pharmaceuticals, food processing, biofuels, and many others. Many of them are taking advantage of synthetic biology tools, like BioBricks, to engineer organisms to produce their products. Synthetic biology can be defined as A) the assembly of new biological parts, devices, and systems or B) the reconfiguration of existing, natural biological systems for useful purposes. BioBricks are making it easier to design and assemble the biological pathways needed for engineered organisms to perform new tasks.


A database of BioBrick parts can be found in the Registry of Standard Biological Parts, which makes it easier for researchers to share parts and collaborate. A biological part is defined as a specific sequence of nucleic acids that encodes a definable biological feature. The registry contains over 20,000 BioBrick parts that have been cataloged and categorized based on their biological function. Each catalog entry describes the function, performance, and design of the part. Each BioBrick part also comes with a unique identification code. The registry was created by researchers at MIT, Harvard, and UCSF. Each year participants in the iGEM competition and academic laboratories contribute to the registry, so the registry is growing considerably over time.


A key innovation of the BioBrick assembly standard is that an engineer can join any two BioBrick parts and the resulting combined form is also a BioBrick part that can be joined with any other BioBrick part. The standardization of BioBricks allows for the distributed production of a biological parts collection. Engineers in different parts of the world can design parts that conform to the assembly standard, and any two such parts can be joined because they follow the same standard. Moreover, engineers carry out exactly the same process whenever they combine two BioBrick parts, so the assembly process is open to optimization and automation, in contrast to more traditional molecular cloning approaches for combining genetic sequences. Not only can the registry be used by any researcher or engineer to identify which biological parts would best suit a given manufacturing process, but these scientists may also contribute their own parts to the registry following validation by those who maintain the database. This makes BioBricks a versatile and continually growing resource for synthetic biologists worldwide.
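The composability property described above (joining two parts gives something that is itself a part) can be sketched as a small data model. This is only an illustration: the `Part` class, the `join` function, and the example part sequences are hypothetical, and the 8-bp scar `TACTAGAG` is an assumption based on the BioBrick RFC 10 standard rather than anything stated in the text.

```python
from dataclasses import dataclass

# Assumption: scar left between two parts by RFC 10 style assembly.
# It is not given in the text; treat it as an illustrative default.
SCAR = "TACTAGAG"


@dataclass(frozen=True)
class Part:
    name: str
    sequence: str


def join(upstream, downstream, scar=SCAR):
    """Join two parts. The result is again a Part, so it can be joined further."""
    return Part(
        name=f"{upstream.name}+{downstream.name}",
        sequence=upstream.sequence + scar + downstream.sequence,
    )


if __name__ == "__main__":
    # Placeholder sequences, not real registry entries.
    promoter = Part("promoter", "TTGACAGCTAGC")
    rbs = Part("rbs", "AGGAGG")
    cds = Part("cds", "ATGAAATAA")
    device = join(join(promoter, rbs), cds)
    print(device.name, device.sequence)
```

Because `join` takes parts and returns a part, the same operation can be repeated for every assembly step, which is what makes the process easy to automate.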




7 Main Steps Involved in Gene Cloning

The following points highlight the seven main steps involved in gene cloning. The first three are: 1. Isolation of the DNA (gene of interest) fragments to be cloned. 2. Insertion of the isolated DNA into a suitable vector to form the recombinant DNA. 3. Introduction of the recombinant DNA into a suitable organism, known as the host. The remaining four steps are selection, multiplication or expression, isolation, and purification.

Gene Cloning Step # 1.

Isolation of DNA (Gene of Interest) Fragments to be Cloned:

Before we carry out gene cloning we need two basic things in purified form: the gene of interest (GI) and the vector. A GI is a DNA fragment whose product (a protein, such as an enzyme or a hormone) interests us, for example the gene encoding the hormone insulin.

Similarly, the vector is a carrier molecule which can carry our GI into a host, replicate there along with the GI making its multiple copies. In this state the GI can also be expressed in the host cell producing the product of the gene which is needed by us.

Gene Cloning Step # 2.

Insertion of Isolated DNA into a Suitable Vector to Form the Recombinant DNA:

Once the ingredients are ready we can start the operation. Our next step is to cut both the vector and the GI using a special type of enzyme called a restriction endonuclease. A restriction endonuclease is an enzyme that cuts double-stranded or single-stranded DNA at specific recognition nucleotide sequences, known as restriction sites, within the DNA molecule rather than at its ends (hence endonuclease).

Restriction endonucleases are also regarded as molecular scissors, as they cut open the DNA strands. After this cutting step we move to pasting. Here the GI is taken and pasted into the cut vector. This procedure also needs an enzyme, called DNA ligase, which is regarded as molecular glue.

The resulting DNA molecule is a hybrid of two DNA molecules: our GI and the vector. In the terminology of genetics this intermixing of different DNA strands is called recombination (which naturally takes place in prophase I of meiosis I). Hence, this new hybrid DNA molecule is also called a recombinant DNA molecule, and the technology is called recombinant DNA technology (RDT).

Gene Cloning Step # 3.

Introduction of the Recombinant DNA into a Suitable Organism known as Host:

When our recombinant DNA molecule is ready we need to introduce it into a living system known as host.

This is done either for one or both of the following reasons:

(a) To replicate the recombinant DNA molecule in order to get multiple copies of our GI.

(b) To let our GI be expressed and produce the protein that we need.

The recombinant DNA can be introduced into the host cell in various ways, and the choice depends on the size of the DNA molecule and the nature of the GI. Methods used for this step include electroporation, micro-injection, lipofection, etc.

When we carry out this process some of the host cells will take up the recombinant DNA and some will not. The host cells which have taken up the recombinant DNA are called transformed cells and the process is called transformation.

Gene Cloning Step # 4.

Selection of the Transformed Host Cells and Identification of the Clone Containing the Gene of Interest:

The transformation process generates a mixed population of transformed and non-transformed host cells. As we are interested only in transformed host cells it becomes necessary to filter them out. This is exactly what is done in the selection process. There are many existing selection strategies, some of which include taking the help of reporter genes, the colony hybridization technique, etc.

Gene Cloning Step # 5.

Multiplication/Expression of the Introduced Gene in the Host:

Once we have purified our transformed host cells by the screening process it is now our job to provide them with optimum conditions to grow and multiply. In this step the transformed host cells are introduced into fresh culture media, which provide them with rich nourishment, followed by incubation in an incubator at the right temperature.

At this stage the host cells divide and re-divide, and the recombinant DNA they carry replicates along with them. At this point we have two choices.

When the aim of the cloning process is to generate a gene library, our target is to obtain numerous copies of the GI. With this plan in mind we simply go for the replication of the recombinant DNA and not beyond that.

If the aim of the cloning experiment is to obtain the product of the GI, then we go one step further and provide favourable conditions to the host cells, in which the GI sitting in the vector can express our product of interest (PI).

Gene Cloning Step # 6.

Isolation of the Multiplied Gene Copies/Protein Expressed by the Introduced Gene:

In this step we isolate either our multiplied GI, which is still attached to the vector, or the protein encoded by it. This can be compared with harvesting, where we collect the crop from the field. There are many processes of isolation, and the choice among them varies from case to case.

Gene Cloning Step # 7.

Purification of the Isolated Gene Copy/Protein:

After the harvesting of the isolated gene copy or the protein it is now our job to purify them.


Critical Factors Affecting the Success of Cloning, Expression, and Mass Production of Enzymes by Recombinant E. coli

E. coli is the most frequently used host for production of enzymes and other proteins by recombinant DNA technology. E. coli is preferable for its relative simplicity, inexpensive and fast high-density cultivation, well-known genetics, and the large number of compatible molecular tools available. Despite all these advantages, expression and production of recombinant enzymes are not always successful and often result in insoluble and nonfunctional proteins. There are many factors that affect the success of cloning, expression, and mass production of enzymes by recombinant E. coli. In this paper, these critical factors and approaches to overcome these obstacles are summarized, focusing on controlled expression of the target protein/enzyme in an unmodified form at industrial level.

1. Introduction

In the past few years recombinant DNA technology has enabled scientists to produce a large number of diverse proteins, in microorganisms, that were previously unavailable, relatively expensive, or difficult to obtain in quantity [1]. While the expression of foreign genes has been reported in a variety of microorganisms and cell lines, most of this work utilizes E. coli for the cloning and expression of foreign genes [2]. Production of enzymes involves cloning of the appropriate gene into an expression vector under the control of an inducible promoter [3].

2. Enzyme Production in E. coli

The expression of recombinant proteins in cells in which they do not naturally occur is termed heterologous protein production. Bacterial expression systems are commonly used for production of heterologous gene products of both eukaryotic and prokaryotic origin [4]. Of the bacterial systems, E. coli is the most widely and routinely used. A number of therapeutically important proteins are now produced as heterologous proteins in E. coli. The first heterologous protein to be employed clinically was human insulin produced in E. coli, first approved in 1982 in the UK, West Germany, the Netherlands, and the USA [5] (Table 1).

3. General Considerations of Selecting E. coli as Heterologous Protein Expression Host

E. coli is widely used as the host for heterologous protein expression because of the following advantages: (1) ease of growth and manipulation using simple laboratory equipment; (2) availability of dozens of vectors and host strains that have been developed for maximizing expression; (3) a wealth of knowledge about the genetics and physiology of E. coli; (4) expression can often be achieved quite rapidly: beginning with a eukaryotic cDNA clone, it is possible to express the protein in E. coli and purify it in milligram quantities in less than 2 weeks; (5) suitable fermentation technology is well established; (6) potentially unlimited supplies of recombinant protein can be generated; (7) it is economically attractive [6].

4. Limitations Using E. coli as Heterologous Protein Expression Host

The limitations are: (1) the inability of E. coli, as a prokaryote, to carry out the posttranslational modifications typical of eukaryotes; (2) a limited ability to carry out extensive disulfide bond formation; (3) some proteins are made in insoluble form, a consequence of protein misfolding, aggregation, and intracellular accumulation as inclusion bodies; (4) sufficient expression is sometimes not observed, owing to protein degradation or insufficient translation (the mRNA may form secondary structure that hampers translation); (5) the codons preferred for a given amino acid in eukaryotes differ from those preferred in prokaryotes such as E. coli. This phenomenon, known as “codon bias”, can greatly hamper protein synthesis and gene expression in E. coli [6].
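To make the codon-bias point concrete, the following sketch scans a coding sequence for codons that are rarely used in E. coli. The set of rare codons is an assumption for illustration (codons often cited as rare in E. coli), not a list taken from this paper, and the function name and example sequence are hypothetical.

```python
# Assumption: codons often cited as rare in E. coli. Illustrative only;
# real analyses use codon usage tables for the specific host strain.
RARE_CODONS = frozenset({"AGA", "AGG", "ATA", "CTA", "CCC", "CGA"})


def rare_codon_report(cds, rare=RARE_CODONS):
    """Return (1-based codon positions that are rare, fraction of rare codons).

    `cds` must be an in-frame DNA coding sequence whose length is a
    non-zero multiple of three.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    positions = [i + 1 for i, codon in enumerate(codons) if codon in rare]
    return positions, len(positions) / len(codons)


if __name__ == "__main__":
    positions, fraction = rare_codon_report("ATGAGAAGGCTGTAA")  # hypothetical CDS
    print(positions, fraction)
```

A high fraction, or a cluster of rare codons close together, would be a hint that the gene may translate poorly in E. coli without codon optimization or a host strain that supplies the matching tRNAs.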

5. Factors Affecting Expression of Enzymes in E. coli

The expression of genes of enzymes in E. coli is influenced by a range of factors. These are discussed below.

5.1. Unique and Subtle Structural Features of the Gene Sequence

Unique DNA sequences are involved in different stages of expression of recombinant enzymes such as transcription and translation.

(a) DNA Sequences Involved in Transcription. Three different DNA sequences and one multicomponent protein are involved in transcription of genes. (1) The promoter: promoters normally consist of three regions, the −35 box, the −10 box, and the spacer region separating the two boxes. Alignment of many promoters allows the deduction of a so-called consensus sequence. This sequence represents the optimal promoter sequence, with a spacer region of 17 nucleotides. It should be mentioned that no promoter on the E. coli chromosome is identical to the consensus sequence; in most cases, there are one or two deviations in both the −35 and the −10 box [4]. (2) The transcriptional terminator: a transcriptional terminator is required to allow termination of transcription. Two classes of terminators have been described, factor-independent and factor-dependent terminators [7]. (3) The regulatory sequence: genes are either expressed constitutively or regulated. Two different classes of regulators have been described, transcriptional repressors and transcriptional activators. Repressors bind to operators located either within the promoter region or immediately downstream from it and, in most cases, prevent RNA polymerase from binding the promoter or act as a roadblock. To relieve repression, the repressor has to dissociate from its operator. In some cases, an inducer that is either synthesized by the cell or taken up from the environment binds to the repressor, causing it to dissociate from its operator [3]. (4) The RNA polymerase: the RNA polymerase consists of five different components termed α, β, β′, ω, and σ. While α2ββ′ω constitutes the core enzyme, addition of σ, which confers promoter specificity, makes up the holoenzyme. The σ factor is responsible for the recognition of the promoter, and it follows that each σ factor recognizes a different promoter [8]. E. coli codes for six alternative σ factors; for example, σ32 is needed after a sudden temperature upshift and σS replaces the housekeeping σ factor σ70 during the stationary phase. So far, only σ70 is used in the production of recombinant proteins such as enzymes [3].
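The promoter layout described in (1), a −35 box and a −10 box separated by a 17-nucleotide spacer, can be turned into a simple scoring sketch. The consensus hexamers used here (TTGACA and TATAAT) are the commonly cited σ70 consensus sequences; they are not given in the text, so treat them, along with the function names, as assumptions of this example.

```python
# Assumption: commonly cited sigma-70 consensus hexamers (not stated in the text).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"
SPACER = 17  # optimal spacer length given in the text


def count_matches(a, b):
    """Number of positions at which two equal-length strings agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


def best_promoter_window(seq, spacer=SPACER):
    """Return (score, start) of the best-scoring promoter-like window.

    The score counts matches to both consensus boxes (maximum 12).
    Returns None if the sequence is shorter than one window.
    """
    seq = seq.upper()
    window = len(MINUS_35) + spacer + len(MINUS_10)
    best = None
    for start in range(len(seq) - window + 1):
        box35 = seq[start:start + len(MINUS_35)]
        box10 = seq[start + len(MINUS_35) + spacer:start + window]
        score = count_matches(box35, MINUS_35) + count_matches(box10, MINUS_10)
        if best is None or score > best[0]:
            best = (score, start)
    return best


if __name__ == "__main__":
    candidate = "GG" + "TTGACT" + "A" * 17 + "TATAAT" + "GG"  # hypothetical
    print(best_promoter_window(candidate))
```

A score of 12 would mean a perfect match to the consensus, which, as noted above, no natural E. coli promoter achieves; natural promoters typically score a few points lower.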

(b) DNA Sequences Involved in Translation. It became clear that the wide range of efficiencies in translation of different mRNAs is predominantly due to the structure at the 5′ end of each mRNA species. The translation initiation region comprises four different sequences: (1) the Shine-Dalgarno sequence, (2) the start codon, (3) the spacer region between the Shine-Dalgarno sequence and the start codon, for which the optimal length has been determined to be 4 to 8 nucleotides, and (4) translational enhancers [3].
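The spacing rule for the translation initiation region can be checked with a short scan. The sketch below looks for a Shine-Dalgarno core followed by a start codon 4 to 8 nucleotides downstream. The core motif AGGAGG and the restriction to ATG as the start codon are simplifying assumptions of this example, and the function name and test sequence are hypothetical.

```python
SD_CORE = "AGGAGG"    # assumption: commonly cited Shine-Dalgarno core motif
START_CODON = "ATG"   # assumption: only the most common start codon is considered


def find_initiation_regions(seq, sd=SD_CORE, min_spacer=4, max_spacer=8):
    """Return (sd_start, start_codon_pos, spacer_len) for each SD motif that is
    followed by a start codon within the allowed spacer range (0-based positions)."""
    seq = seq.upper()
    hits = []
    pos = seq.find(sd)
    while pos != -1:
        sd_end = pos + len(sd)
        for spacer in range(min_spacer, max_spacer + 1):
            codon_start = sd_end + spacer
            if seq[codon_start:codon_start + 3] == START_CODON:
                hits.append((pos, codon_start, spacer))
        pos = seq.find(sd, pos + 1)
    return hits


if __name__ == "__main__":
    leader = "CC" + "AGGAGG" + "ACAGCT" + "ATGAAA"  # hypothetical 5' region
    print(find_initiation_regions(leader))
```

This checks sequence spacing only; it says nothing about mRNA secondary structure, which can also block access to the Shine-Dalgarno sequence or start codon.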

The secondary structure at the translation initiation region of the mRNA plays an important role in the efficiency of gene expression. It has been shown that occlusion of the Shine-Dalgarno sequence and/or the start codon by a stem-loop structure prevents accessibility to the 30S ribosomal subunit and inhibits translation [9]. The mutation of specific nucleotides up- or downstream from the Shine-Dalgarno sequence suppressed the formation of mRNA secondary structures and enhanced the translation efficiency [10, 11].

5.2. The “Strength” of the Transcriptional Promoter

For higher expression, the gene encoding the enzyme should be placed under the control of a strong promoter. Many plasmid and bacteriophage vectors have been developed in which the cloned gene is situated immediately downstream from a strong transcriptional promoter [2]. Use of these vectors requires that the promoter should not be constitutive (i.e., always turned on) but, rather, be turned on at a specific stage in the growth of the transformed E. coli cells. This is often accomplished by the addition of a specific metabolite or by a shift in the temperature of the growth medium [12]. Regulation of promoter activity ensures that the expression of a foreign gene does not interfere with normal cellular gene functions and is not deleterious to the cell. Failure to regulate a strong promoter often results either in loss of the plasmid carrying it or in constitutive expression, which may be lethal to the cell [13].

The most widely used strong promoters are from the E. coli trp and lac operons, the tac promoter (an in vitro construct including elements from both the trp and lac promoters), and the leftward, or pL, promoter of bacteriophage lambda [4].

5.3. The Stability of the Vector in E. coli Cells

After a foreign gene has been cloned into an expression vector, the vector is introduced into competent E. coli cells that become a source of the foreign protein. However, plasmids are not always stable, especially in cells grown for many generations in large-scale cultures [14], so when a process is scaled up it is important that vector stability be addressed. Since a plasmid-free strain has a faster specific growth rate than a plasmid-containing strain, as a result of the metabolic energy which is expended for plasmid maintenance, the plasmid-free strain will eventually outcompete the plasmid-containing strain [15].
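The takeover by plasmid-free cells can be illustrated with a simple generation-by-generation model. This is a sketch under stated assumptions, not a fitted model: each division of a plasmid-bearing cell yields a plasmid-free daughter with probability `loss_prob`, and plasmid-free cells grow `growth_ratio` times faster. Both parameter values and the function name are hypothetical.

```python
def plasmid_bearing_fraction(generations, loss_prob=0.01,
                             growth_ratio=1.2, start_fraction=1.0):
    """Return the fraction of plasmid-bearing cells after each generation.

    One "generation" is one doubling of the plasmid-bearing population.
    loss_prob    -- assumed chance that a division yields a plasmid-free cell
    growth_ratio -- assumed ratio of specific growth rates (free / bearing)
    """
    plus = start_fraction          # plasmid-bearing share of the population
    minus = 1.0 - start_fraction   # plasmid-free share
    fractions = [plus]
    for _ in range(generations):
        new_plus = plus * (2.0 - loss_prob)
        # Plasmid-free cells arise by segregation and also grow faster.
        new_minus = plus * loss_prob + minus * 2.0 ** growth_ratio
        total = new_plus + new_minus
        plus, minus = new_plus / total, new_minus / total
        fractions.append(plus)
    return fractions


if __name__ == "__main__":
    for gen, frac in enumerate(plasmid_bearing_fraction(60)):
        if gen % 10 == 0:
            print(f"generation {gen:3d}: {frac:.3f} plasmid-bearing")
```

Even a small loss probability matters once the plasmid-free cells have a growth advantage, which is why stability becomes critical over the many generations of a scaled-up culture.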

5.3.1. Reasons of Instability

(1) Plasmid stability is influenced by the vector and host genotypes: the same plasmid in different hosts exhibits different degrees of stability, and vice versa [16]. (2) The origin and size of foreign DNA have been observed to affect plasmid stability [16]. (3) Plasmid loss first occurs at the level of the individual cell as a result of defective segregation at cell division, and then at the population level [15]. (4) Instability is due to the increase in metabolic energy required for plasmid maintenance and function [17]. (5) Plasmid stability is also a function of physiological parameters that affect the growth rate of the host cell, which include pH, temperature, aeration rate, medium components, and heterologous protein accumulation [16].

5.3.2. Solutions to the Problem of Instability

(1) The most common method of ensuring that a recombinant plasmid is not lost during the growth of the microorganism is the inclusion of antibiotics, which select for the presence of plasmids carrying the appropriate antibiotic resistance genes. However, scale-up of this approach may not be economically feasible due to the cost of the added antibiotics [14]. (2) An analogous strategy involves the use of runaway-replication plasmid vectors, in which plasmid copy number is relatively low at lower temperatures and is increased when the temperature is raised. The lower plasmid copy number during much of the cell growth cycle reduces the metabolic load on the cell and ensures plasmid stability. At the same time, the higher plasmid copy number for a portion of the growth cycle results in high levels of expression of the cloned foreign gene [18].

5.4. The Number of Copies of the Gene

Since the target gene is often incorporated into a plasmid vector system, gene dosage is dependent on plasmid copy number. As can be expected, an increase in copy number results in concomitantly higher recombinant protein productivity, but not indefinitely. Plasmid copy number is affected by plasmid and host genetics and also by cultivation conditions such as growth rates, media, and temperature [19].

5.5. Codons Utilized in Foreign Gene Compared to the Normal Pattern of Codon Usage in E. coli

Since the 20 amino acids are encoded by 61 different trinucleotide codons, several trinucleotide codons can encode the information for the insertion of the same amino acid into protein. Organisms show marked differences in codon preference. In fact, it appears that the frequency of codon usage in an organism is a direct reflection of the pool of cognate tRNAs [20]. Highly expressed genes use codons for which there is a large pool of cognate tRNAs while regulatory genes often use codons for which there is only a very small pool of cognate tRNAs. Accordingly, expression of a foreign gene may be limited by the availability of a particular aminoacyl tRNA [21].

Codon usage can differ considerably between species. As an example, codon usage for arginine in four different species is presented in Table 2.

Overexpression of genes with high contents of rare codons may result in defective synthesis of the corresponding enzyme. Besides the amount, the location of rare codons within the coding region can significantly influence the translation level. Rare codons close to the initiator may stall the ribosome and prevent the entry of new incoming ribosomes [22].
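Both the number and the position of rare codons can be screened directly from a coding sequence. The sketch below assumes the rare set is AGA and AGG, the arginine codons named as rarest in E. coli in the title of reference [22]; the 10-codon window used for "close to the initiator" and the function name are arbitrary assumptions for illustration.

```python
# Assumption: AGA and AGG are treated as the rare codons of interest.
RARE_ARG_CODONS = frozenset({"AGA", "AGG"})


def rare_codon_report(cds, rare=RARE_ARG_CODONS, n_terminal_window=10):
    """Summarise rare codons in an in-frame coding sequence."""
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # 1-based codon positions, so position 1 is the start codon.
    positions = [i + 1 for i, codon in enumerate(codons) if codon in rare]
    return {
        "total_codons": len(codons),
        "rare_positions": positions,
        "rare_fraction": len(positions) / len(codons) if codons else 0.0,
        "rare_near_start": [p for p in positions if p <= n_terminal_window],
    }


if __name__ == "__main__":
    example = "ATGAGAAGGCGTAAA" + "GCT" * 10 + "AGA" + "TAA"
    print(rare_codon_report(example))
```

A non-empty `rare_near_start` list marks the situation described above, where a stalled ribosome near the initiator may block entry of new ribosomes.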

5.5.1. Solutions to the Problem of Codon Usage

There are two experimental solutions to this problem: (1) increase in the amount of the appropriate cognate tRNA, (2) alteration of these codons to frequently used ones by sequence-specific mutagenesis [22].
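The second solution, replacing rare codons with frequently used synonymous ones, can be planned in silico before any mutagenesis. This is a minimal sketch: it recodes only the arginine codons AGA and AGG, and the choice of CGT as the replacement is an assumption that should be checked against a codon usage table for the actual host strain.

```python
ARG_CODONS = frozenset({"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"})
RARE_ARG = frozenset({"AGA", "AGG"})


def replace_rare_arginine(cds, preferred="CGT"):
    """Return the coding sequence with AGA/AGG swapped for `preferred`.

    The swap is synonymous (arginine to arginine), so the encoded
    protein is unchanged.
    """
    if preferred not in ARG_CODONS or preferred in RARE_ARG:
        raise ValueError("preferred must be a non-rare arginine codon")
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return "".join(preferred if codon in RARE_ARG else codon
                   for codon in codons)


if __name__ == "__main__":
    print(replace_rare_arginine("ATGAGAAAAAGGTAA"))
```

The output lists the sequence to aim for; the individual base changes would then be introduced by sequence-specific mutagenesis or by gene synthesis.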

5.6. The Stability and Efficiency of mRNA

mRNA of recombinant genes tends to accumulate in the cell; however, E. coli mRNAs are rather unstable. Some features of mRNA affect its stability. These include (1) the Shine-Dalgarno (S-D) sequence at the 5′ end of the mRNA that is thought to help position the mRNA on the ribosome, (2) the distance between the S-D sequence and the initiation codon, and (3) the secondary and tertiary structure of the mRNA [7].

5.6.1. Solutions

(1) It has been reported that the addition of a short, specific DNA sequence (approximately 89 base pairs) to the distal end of cloned genes may stabilize the mRNA transcribed from that gene, thereby increasing gene expression. This “retroregulator” sequence probably becomes incorporated at the 3′ end of the mRNA, protecting it from exonuclease digestion [23]. (2) It has been shown that stable secondary structures engineered into the 5′ untranslated region and the 3′ rho-independent terminator of the mRNA can aid in mRNA stability and prevent degradation by exonucleases. In particular, a hairpin at the 5′ end without any 5′ single-stranded nucleotide overhangs has conferred on mRNAs considerable resistance to exonuclease activity in the cytoplasm [24].

5.7. The Location of the Cloned Protein within the E. coli Cell

While E. coli proteins are synthesized in the cytoplasm, it is possible to direct a cloned gene product to the cytoplasm, the inner or outer membrane, or the periplasmic space [25]. Secretion of a cloned gene product to the periplasmic space often allows for higher levels of expression of the foreign protein that might be degraded by proteases in the cytoplasm [26]. E. coli is capable of recognizing and correctly processing signal sequences so that secretion of enzymes into the E. coli periplasmic space is possible [27].

5.7.1. There are Four Reasons to Translocate Recombinant Proteins into the Periplasm

(1) The oxidizing environment facilitates the formation of disulfide bonds, (2) it contains only 4% of the total cell protein (approximately 100 different proteins), (3) there is less protein degradation, and (4) purification by osmotic shock is easy [3].

5.7.2. Disadvantage of Periplasmic Expression

While it is technically feasible to direct the protein products of foreign genes to the inner or outer membrane, high levels of a foreign protein in the membrane may interfere with normal cellular functions and be lethal to the cell [28].

5.7.3. Solution

Expression vectors have recently been constructed which place the genes for foreign proteins, not normally secreted, behind a DNA fragment encoding a signal sequence. This results in the foreign protein being efficiently secreted (in large amounts) to the periplasmic space with no evidence for accumulation of the unprocessed form in the cytoplasm [29].

5.8. The Stability of the Cloned Enzyme in E. coli

Secretion of a cloned gene product to the periplasmic space often allows for higher levels of expression of the foreign protein that might be degraded by proteases in the cytoplasm [26]. The large-scale production of eukaryotic proteins in E. coli is often limited by the instability of these polypeptides within the bacterial host [30].

Protease susceptibility can be affected by the N- and C-terminal sequences of the recombinant protein. The presence of Arg, Leu, Lys, Phe, Trp, or Tyr at the N-terminus targets proteins for more rapid degradation (N-end rule). Nonpolar amino acids at the C-terminus can lead to rapid degradation; however, proteins whose last five amino acids are polar or charged fail to be degraded [31].
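These two terminal rules are easy to apply as a first-pass screen of a protein sequence. The sketch below uses one-letter amino acid codes. The destabilizing N-terminal residues are those listed above, while the split of residues into nonpolar and polar-or-charged sets is an assumption, since classifications differ between textbooks.

```python
# Residues named above as destabilizing at the N-terminus (N-end rule).
DESTABILIZING_N_TERMINAL = frozenset("RLKFWY")
# Assumption: one common classification of side chains.
NONPOLAR = frozenset("AVLIMFWPG")
POLAR_OR_CHARGED = frozenset("DEKRHNQSTYC")


def protease_risk_flags(protein):
    """Flag terminal features associated with rapid degradation."""
    seq = protein.upper()
    if len(seq) < 5:
        raise ValueError("need at least five residues")
    tail = seq[-5:]
    return {
        "n_end_rule_destabilizing": seq[0] in DESTABILIZING_N_TERMINAL,
        "c_tail_all_polar_or_charged": all(a in POLAR_OR_CHARGED
                                           for a in tail),
        "c_tail_nonpolar_count": sum(a in NONPOLAR for a in tail),
    }


if __name__ == "__main__":
    print(protease_risk_flags("RAGTLVAAA"))
    print(protease_risk_flags("MSTGKDEKRS"))
```

The flags are only indicators. The sequence passed in should be the N-terminus of the protein as it exists in the cell, for example after any processing of the initiator methionine.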

Other factors in protease susceptibility include (1) the presence of damaged or excess protein products caused by formation of incomplete polypeptides, (2) excessive synthesis of subunits from multimeric complexes, (3) post-translational damage, or genetic engineering of the target protein, and (4) culture growth parameters such as nutrient composition of media, growth temperature, and pH [32].

5.8.1. Solving the Problem

(1) A common strategy which has been used to overcome this problem is to fuse the gene for the eukaryotic protein to a portion of a bacterial gene [33]. (2) An alternate approach to stabilizing a cloned protein is to clone multiple copies of the gene in tandem onto the same plasmid [34].

5.9. Inclusion Bodies and How to Prevent Their Formation

Rapid production of recombinant proteins can lead to the formation of insoluble aggregates designated as inclusion bodies [35]. These are large, spherical particles which are clearly separated from the cytoplasm and result from the failure of the quality control system to repair or remove misfolded or unfolded protein [36]. In this instance it may be advantageous to clone the gene into a secretion vector so that the cloned protein does not accumulate in the cytoplasm [37].

5.9.1. Solutions

Strategies to prevent the formation of inclusion bodies aim to slow down the production of recombinant proteins and include (1) low-copy number vectors, (2) weak promoters, (3) low temperature, (4) coexpression of molecular chaperones, (5) use of a solubilizing partner, and (6) fermentation at extreme pH values [3]. (7) Another common strategy is to fuse the gene for the eukaryotic protein to a portion of a bacterial gene [33].

Advantages of Expression of Heterologous Proteins as Fusion Proteins or with Protein Tags. Many vectors are available which allow expression of heterologous proteins fused at their N- or C-terminus to a partner protein or peptide; such fusion partners are often termed protein tags [38]. For example, the histidine (His) tag is a commonly used fusion partner. Such fusion partners offer several potential advantages. Improved expression: fusion of the N-terminus of a heterologous protein to the C-terminus of a highly expressed fusion partner often allows high-level expression of the fusion protein [39]. Improved solubility: fusion of the N-terminus of a heterologous protein to the C-terminus of a soluble fusion partner often improves solubility of the protein [40]. Improved detection: fusion of a protein at either terminus to a short peptide or a polypeptide which is recognized by an antibody or binding protein allows western blot analysis of the protein during expression and purification [41]. Improved purification: this is a widely used approach. Simple purification schemes have been described for proteins fused at either end to tags which bind affinity resins. Available tags include His6 (six tandem histidine residues), which binds to Ni-NTA (nitrilotriacetate chelated with Ni2+ ions), and GST (glutathione-S-transferase), which binds to glutathione-sepharose. These tags bind to their specific resins, so the fusion protein can be separated easily. The tags generally have little effect on the protein and can be excised easily [42].

5.10. Correct and Efficient Protein Folding

During or following translation, the polypeptide must fold so as to adopt its functionally active conformation [43]. Since many denatured proteins can be refolded in vitro, it appears that the information for correct folding is contained in the primary polypeptide structure [44]. However, folding comprises rate-limiting steps during which some molecules may aggregate, particularly at high rates of synthesis and at higher temperatures. In contrast to intracellular proteins, naturally secreted proteins encounter an abnormal environment in the cytoplasm: disulphide bond formation is not favoured and glycosylation cannot occur [45].

5.10.1. Solutions

(1) Coexpress additional chaperones to aid in protein folding. This can cause a reduction in the expression of the enzyme, but it promotes solubility. There is evidence that certain heat shock proteins act as molecular chaperones in preventing the formation and accumulation of unfolded aggregates, while accelerating the folding reactions. (2) For disulfide bond formation, coexpress thioredoxin (or use as a fusion partner) or use strains deficient in thioredoxin reductase. An alternative to consider is targeting the protein to the periplasm where disulfide-bond formation can occur (most E. coli proteins having disulfide bonds are located in the periplasm) [46].

5.11. Cell Growth Characteristics

Cell growth characteristics have a marked influence on the expression of recombinant enzymes. Some of the manipulations of culture conditions are as follows. (a) Decrease culture growth temperature: advantages of decreased growth temperature are the following. (1) Growth at 37°C can promote inclusion body formation for some proteins while growth at lower temperatures (e.g., 30°C, 25°C, 15°C) may not. (2) The lower temperature also decreases protease activity. Disadvantages are the following. (1) Growing the culture at a lower temperature will significantly slow the growth of E. coli, and so a longer induction period (e.g., overnight) may be necessary to obtain a sufficient amount of recombinant protein. (2) Growing the culture at a lower temperature will slow the rate of protein synthesis, although this may also keep recombinant proteins from saturating the cellular folding machinery and aggregating [47]. (b) Addition of cofactors: potential cofactors should be added to the growth medium. Some proteins cannot properly fold without their cofactor and therefore can form inclusion bodies. (c) pH alteration: alteration of the pH of the growth medium can improve expression. pH is one culture variable that can affect proteolytic activity, secretion, and protein production levels [48].

5.12. Metabolic Load on the Organism

Regardless of the nature of the foreign gene or the design of the fermenter, the introduction of an exogenous plasmid into an E. coli cell is bound to impose some metabolic load [49].

5.12.1. Solution

This may be avoided (1) by integrating the foreign gene into the E. coli chromosome through the use of a defective bacteriophage lambda lysogen carrying the foreign gene [50], or (2) by the direct insertion of a foreign gene into a specific site on the host chromosome [51].

6. Conclusion

While the efficient expression of foreign genes in E. coli is dependent on a number of factors, it is nevertheless reasonable to expect that most foreign genes may be expressed at high levels in E. coli and that this expression will be amenable to scale-up. Although the strategy of gene expression and scale-up is likely to vary, there are more similarities than differences from one gene to the next, resulting in the development of a “systems” approach to the cloning, expression, and scale-up of enzyme genes in E. coli. The eventual objective of producing a desired protein in an economical heterologous host is influenced by a variety of factors. However, maximizing production of heterologous proteins for commercial application is still an art. We have begun to understand the factors influencing the eventual production. These factors, described in detail in this paper, are varied and at times poorly understood. The approach remains largely empirical. However, our collective experience will permit us to rationalize our approach in designing heterologous production of commercially important enzymes in a variety of expression systems. Subsequent to production, the stabilization and formulation of proteins will pose significant hurdles in utilizing the natural biological catalysts and other proteins for therapeutic and industrial purposes.

References

  1. M. Devasahayam, “Factors affecting the expression of recombinant glycoproteins,” Indian Journal of Medical Research, vol. 126, no. 1, pp. 22–27, 2007.
  2. M. J. Carrier, M. E. Nugent, W. C. A. Tacon, and S. B. Primrose, “High expression of cloned genes in E. coli and its consequences,” Trends in Biotechnology, vol. 1, no. 3–5, pp. 109–113, 1983.
  3. W. Schumann and L. C. S. Ferreira, “Production of recombinant proteins in Escherichia coli,” Genetics and Molecular Biology, vol. 27, no. 3, pp. 442–453, 2004.
  4. B. R. Glick and G. K. Whitney, “Factors affecting the expression of foreign proteins in Escherichia coli,” Journal of Industrial Microbiology, vol. 1, no. 5, pp. 277–282, 1987.
  5. R. Crowl, C. Seamans, P. Lomedico, and S. McAndrew, “Versatile expression vectors for high-level synthesis of cloned gene products in Escherichia coli,” Gene, vol. 38, no. 1–3, pp. 31–38, 1985.
  6. T. A. Brown, Gene Cloning- An Introduction, Wiley-Blackwell, 4th edition, 1995.
  7. G. D. Stormo, T. D. Schneider, and L. M. Gold, “Characterization of translational initiation sites in E. coli,” Nucleic Acids Research, vol. 10, no. 9, pp. 2971–2996, 1982.
  8. T. M. Gruber and C. A. Gross, “Multiple sigma subunits and the partitioning of bacterial transcription space,” Annual Review of Microbiology, vol. 57, pp. 441–466, 2003.
  9. V. Ramesh, A. De, and V. Nagaraja, “Engineering hyperexpression of bacteriophage Mu C protein by removal of secondary structure at the translation initiation region,” Protein Engineering, vol. 7, no. 8, pp. 1053–1057, 1994.
  10. J. Coleman, M. Inouye, and K. Nakamura, “Mutations upstream of the ribosome-binding site affect translational efficiency,” Journal of Molecular Biology, vol. 181, no. 1, pp. 139–143, 1985.
  11. G. Gross, C. Mielke, I. Hollatz, H. Blockers, and R. Frank, “RNA primary sequence or secondary structure in the translational initiation region controls expression of two variant interferon-β genes in Escherichia coli,” The Journal of Biological Chemistry, vol. 265, no. 29, pp. 17627–17636, 1990.
  12. J. R. Swartz, “Advances in Escherichia coli production of therapeutic proteins,” Current Opinion in Biotechnology, vol. 12, no. 2, pp. 195–201, 2001.
  13. S. Ringquist, S. Shinedling, D. Barrick et al., “Translation initiation in Escherichia coli: sequences within the ribosome-binding site,” Molecular Microbiology, vol. 6, no. 9, pp. 1219–1229, 1992.
  14. J. Pierce and S. Gutteridge, “Large-scale preparation of ribulosebisphosphate carboxylase from a recombinant system in Escherichia coli characterized by extreme plasmid instability,” Applied and Environmental Microbiology, vol. 49, no. 5, pp. 1094–1100, 1985.
  15. R. E. Ashby and K. A. Stacey, “Stability of a plasmid F trim in populations of a recombination-deficient strain of Escherichia coli in continuous culture,” Antonie van Leeuwenhoek, vol. 50, no. 2, pp. 125–134, 1984.
  16. R. Meena and P. Harish, “Expression systems for production of heterologous proteins,” Current Science, vol. 80, no. 9, pp. 1121–1128, 2001.
  17. S. Aiba, H. Tsunekawa, and T. Imanaka, “New approach to tryptophan production by Escherichia coli: genetic manipulation of composite plasmids in vitro,” Applied and Environmental Microbiology, vol. 43, no. 2, pp. 289–297, 1982.
  18. M. Bittner and D. Vapnek, “Versatile cloning vectors derived from the runaway-replication plasmid pKN402,” Gene, vol. 15, no. 4, pp. 319–329, 1981.
  19. J. E. Hughes and D. L. Welker, “Copy number control and compatibility of nuclear plasmids in Dictyostelium discoideum,” Plasmid, vol. 22, no. 3, pp. 215–223, 1989.
  20. O. G. Berg and C. G. Kurland, “Growth rate-optimised tRNA abundance and codon usage,” Journal of Molecular Biology, vol. 270, no. 4, pp. 544–550, 1997.
  21. N. Stoletzki and A. Eyre-Walker, “Synonymous codon usage in Escherichia coli: selection for translational accuracy,” Molecular Biology and Evolution, vol. 24, no. 2, pp. 374–381, 2007.
  22. G. F. T. Chen and M. Inouye, “Role of the AGA/AGG codons, the rarest codons in global gene expression in Escherichia coli,” Genes & Development, vol. 8, no. 21, pp. 2641–2652, 1994.
  23. H. C. Wong and S. Chang, “Identification of a positive retroregulator that stabilizes mRNAs in bacteria,” Proceedings of the National Academy of Sciences of the United States of America, vol. 83, no. 10, pp. 3233–3237, 1986.
  24. S. A. Emory, P. Bouvet, and J. G. Belasco, “A 5′-terminal stem-loop structure can stabilize mRNA in Escherichia coli,” Genes & Development, vol. 6, no. 1, pp. 135–148, 1992.
  25. F. Baneyx, “Recombinant protein expression in Escherichia coli,” Current Opinion in Biotechnology, vol. 10, no. 5, pp. 411–421, 1999.
  26. C. S. Hoffman and A. Wright, “Fusions of secreted proteins to alkaline phosphatase: an approach for studying protein secretion,” Proceedings of the National Academy of Sciences of the United States of America, vol. 82, no. 15, pp. 5107–5111, 1985.
  27. G. L. Gray, J. S. Baldridge, K. S. McKeown, H. L. Heynecker, and C. N. Chang, “Periplasmic production of correctly processed human growth hormone in Escherichia coli: natural and bacterial signal sequences are interchangeable,” Gene, vol. 39, no. 2-3, pp. 247–254, 1985.
  28. F. J. Mergulhão and G. A. Monteiro, “Analysis of factors affecting the periplasmic production of recombinant proteins in Escherichia coli,” Journal of Microbiology and Biotechnology, vol. 17, no. 8, pp. 1236–1241, 2007.
  29. J. Ghrayeb, H. Kimura, M. Takahara, H. Hsiung, Y. Masui, and M. Inouye, “Secretion cloning vectors in Escherichia coli,” EMBO Journal, vol. 3, no. 10, pp. 2437–2442, 1984.
  30. L. C. Simmons and D. G. Yansura, “Translational level is a critical factor for the secretion of heterologous proteins in Escherichia coli,” Nature Biotechnology, vol. 14, no. 5, pp. 629–634, 1996.
  31. S. Gottesman, S. Wickner, and M. R. Maurizi, “Protein quality control: triage by chaperones and proteases,” Genes & Development, vol. 11, no. 7, pp. 815–823, 1997.
  32. N. Dedhia, R. Richins, A. Mesina, and W. Chen, “Improvement in recombinant protein production in ppGpp-deficient Escherichia coli,” Biotechnology and Bioengineering, vol. 53, pp. 379–386, 1997.
  33. K. Itakura, T. Hirose, R. Crea, and A. D. Riggs, “Expression in Escherichia coli of a chemically synthesized gene for the hormone somatostatin,” Science, vol. 198, no. 4321, pp. 1056–1063, 1977.
  34. J. L. Hartley and T. J. Gregori, “Cloning multiple copies of a DNA segment,” Gene, vol. 13, no. 4, pp. 347–353, 1981.
  35. S. Betts and J. King, “There's a right way and a wrong way: in vivo and in vitro folding, misfolding and subunit assembly of the P22 tailspike,” Structure, vol. 7, no. 6, pp. R131–R139, 1999.
  36. J. R. Blackwell and R. Horgan, “A novel strategy for production of a highly expressed recombinant protein in an active form,” FEBS Letters, vol. 295, no. 1–3, pp. 10–12, 1991.
  37. B. Fahnert, H. Lilie, and P. Neubauer, “Inclusion bodies: formation and utilisation,” Advances in Biochemical Engineering/Biotechnology, vol. 89, pp. 93–142, 2004.
  38. D. Esposito and D. K. Chatterjee, “Enhancement of soluble protein expression through the use of fusion tags,” Current Opinion in Biotechnology, vol. 17, no. 4, pp. 353–358, 2006.
  39. H. P. Sørensen and K. K. Mortensen, “Soluble expression of recombinant proteins in the cytoplasm of Escherichia coli,” Microbial Cell Factories, vol. 4, article 1, 2005.
  40. S. Stahl and P. A. Nygren, “The use of gene fusions to protein A and protein G in immunology and biotechnology,” Pathologie Biologie, vol. 45, no. 1, pp. 66–76, 1997.
  41. S. C. Makrides, “Strategies for achieving high-level expression of genes in Escherichia coli,” Microbiological Reviews, vol. 60, no. 3, pp. 512–538, 1996.
  42. T. Moks, L. Abrahmsén, E. Holmgren et al., “Expression of human insulin-like growth factor I in bacteria: use of optimized gene fusion vectors to facilitate protein purification,” Biochemistry, vol. 26, no. 17, pp. 5239–5244, 1987.
  43. J. G. Thomas and F. Baneyx, “Protein misfolding and inclusion body formation in recombinant Escherichia coli cells overexpressing heat-shock proteins,” The Journal of Biological Chemistry, vol. 271, no. 19, pp. 11141–11147, 1996.
  44. M. J. Gething and J. Sambrook, “Protein folding in the cell,” Nature, vol. 355, no. 6355, pp. 33–45, 1992.
  45. C. H. Schein and M. H. M. Noteborn, “Formation of soluble recombinant proteins in Escherichia coli is favored by lower growth temperature,” Bio/Technology, vol. 6, no. 3, pp. 291–294, 1988.
  46. D. C. Andersen and L. Krummen, “Recombinant protein expression for therapeutic applications,” Current Opinion in Biotechnology, vol. 13, no. 2, pp. 117–123, 2002.
  47. S. Hunke and J. M. Betton, “Temperature effect on inclusion body formation and stress response in the periplasm of Escherichia coli,” Molecular Microbiology, vol. 50, no. 5, pp. 1579–1589, 2003.
  48. J. H. Choi and S. Y. Lee, “Secretory and extracellular production of recombinant proteins using Escherichia coli,” Applied Microbiology and Biotechnology, vol. 64, no. 5, pp. 625–635, 2004.
  49. P. Neubauer, H. Y. Lin, and B. Mathiszik, “Metabolic load of recombinant protein production: inhibition of cellular capacities for glucose uptake and respiration after induction of a heterologous gene in Escherichia coli,” Biotechnology and Bioengineering, vol. 83, no. 1, pp. 53–64, 2003.
  50. N. E. Murray, S. A. Bruce, and K. Murray, “Molecular cloning of the DNA ligase gene from bacteriophage T4. II. Amplification and preparation of the gene product,” Journal of Molecular Biology, vol. 132, no. 3, pp. 493–505, 1979.
  51. O. Raibaud, M. Mock, and M. Schwartz, “A technique for integrating any DNA fragment into the chromosome of Escherichia coli,” Gene, vol. 29, no. 1-2, pp. 231–241, 1984.

Copyright

Copyright © 2013 Md. Fakruddin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Creating the clone

The steps in cloning are as follows. DNA is extracted from the organism under study and is cut into small fragments of a size suitable for cloning. Most often this is achieved by cleaving the DNA with a restriction enzyme. Restriction enzymes are extracted from several different species and strains of bacteria, in which they act as defense mechanisms against viruses. They can be thought of as “molecular scissors,” cutting the DNA at specific target sequences. The most useful restriction enzymes make staggered cuts; that is, they leave a single-stranded overhang at the site of cleavage. These overhangs are very useful in cloning because the unpaired nucleotides will pair with other overhangs made using the same restriction enzyme. So, if the donor DNA and the vector DNA are both cut with the same enzyme, there is a strong possibility that the donor fragments and the cut vector will splice together because of the complementary overhangs. The resulting molecule is called recombinant DNA. It is recombinant in the sense that it is composed of DNA from two different sources. Thus, it is a type of DNA that would be impossible naturally and is an artifact created by DNA technology.

The next step in the cloning process is to cut the vector with the same restriction enzyme used to cut the donor DNA. Vectors have target sites for many different restriction enzymes, but the most convenient ones are those that occur only once in the vector molecule. This is because the restriction enzyme then merely opens up the vector ring, creating a space for the insertion of the donor DNA segment. Cut vector DNA and donor DNA are mixed in a test tube, and the complementary ends of both types of DNA unite randomly. Of course, several types of unions are possible: donor fragment to donor fragment, vector fragment to vector fragment, and, most important, vector fragment to donor fragment, which can be selected for. Recombinant DNA associations form spontaneously in the above manner, but these associations are not stable because, although the ends are paired, the sugar-phosphate backbone of the DNA has not been sealed. This is accomplished by the application of an enzyme called DNA ligase, which seals the two segments, forming a continuous and stable double helix.
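Finding the enzymes that cut a circular vector exactly once is a simple string search. The sketch below is illustrative only: it covers three well-known enzymes with palindromic recognition sites (so scanning one strand is sufficient), and the function names and the toy vector sequence are hypothetical.

```python
# Recognition sequences of three common enzymes (all palindromic).
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def site_positions(circular_seq, site):
    """Return 0-based start positions of `site` in a circular sequence."""
    seq = circular_seq.upper()
    # Append the first bases so sites spanning the origin are found too.
    extended = seq + seq[:len(site) - 1]
    positions = []
    start = extended.find(site)
    while start != -1:
        positions.append(start)
        start = extended.find(site, start + 1)
    return positions


def single_cutters(circular_seq, enzymes=ENZYMES):
    """Return names of enzymes whose site occurs exactly once."""
    return sorted(name for name, site in enzymes.items()
                  if len(site_positions(circular_seq, site)) == 1)


if __name__ == "__main__":
    toy_vector = "AAAGAATTCAAAGGATCCAAAGGATCCAAA"  # made-up sequence
    print(single_cutters(toy_vector))
```

An enzyme that appears in this list opens the ring at one place only, which is the convenient case described above. An enzyme with two or more sites would cut the vector into separate pieces.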

The mixture should now contain a population of vectors each containing a different donor insert. This solution is mixed with live bacterial cells that have been specially treated to make them more permeable to DNA. Recombinant molecules enter living cells in a process called transformation. Usually, only a single recombinant molecule will enter any individual bacterial cell. Once inside, the recombinant DNA molecule replicates like any other plasmid DNA molecule, and many copies are subsequently produced. Furthermore, when the bacterial cell divides, all of the daughter cells receive the recombinant plasmid, which again replicates in each daughter cell.

The original mixture of transformed bacterial cells is spread out on the surface of a growth medium in a flat dish (Petri dish) so that the cells are separated from one another. These individual cells are invisible to the naked eye, but as each cell undergoes successive rounds of cell division, visible colonies form. Each colony is a cell clone, but it is also a DNA clone because the recombinant vector has now been amplified by replication during every round of cell division. Thus, the Petri dish, which may contain many hundreds of distinct colonies, represents a large number of clones of different DNA fragments. This collection of clones is called a DNA library. By considering the size of the donor genome and the average size of the inserts in the recombinant DNA molecule, a researcher can calculate the number of clones needed to encompass the entire donor genome, or, in other words, the number of clones needed to constitute a genomic library.
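One standard way to make this calculation is the formula N = ln(1 − P) / ln(1 − f), often attributed to Clarke and Carbon, where f is the insert size divided by the genome size and P is the desired probability that any given sequence is present in the library. The sketch below assumes fragments are generated at random and cloned with equal efficiency; the genome and insert sizes in the example are illustrative values, not data from the text.

```python
import math


def clones_needed(genome_size_bp, insert_size_bp, probability=0.99):
    """Number of independent clones for a genomic library.

    Uses N = ln(1 - P) / ln(1 - f) with f = insert size / genome size,
    assuming random, equally represented fragments.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be between 0 and 1")
    if not 0 < insert_size_bp < genome_size_bp:
        raise ValueError("insert must be smaller than the genome")
    f = insert_size_bp / genome_size_bp
    # log1p(-x) computes ln(1 - x) accurately when x is small.
    return math.ceil(math.log1p(-probability) / math.log1p(-f))


if __name__ == "__main__":
    # Illustrative values: a 4.6 Mb genome and 20 kb inserts.
    print(clones_needed(4_600_000, 20_000, probability=0.99))
```

The result is several times larger than the simple ratio of genome size to insert size, because random sampling covers some regions repeatedly before it covers others at all.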

Another type of library is a cDNA library. Creation of a cDNA library begins with messenger ribonucleic acid (mRNA) instead of DNA. Messenger RNA carries encoded information from DNA to ribosomes for translation into protein. To create a cDNA library, these mRNA molecules are treated with the enzyme reverse transcriptase, which is used to make a DNA copy of an mRNA. The resulting DNA molecules are called complementary DNA (cDNA). A cDNA library represents a sampling of the transcribed genes, whereas a genomic library includes untranscribed regions.
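At the sequence level, the DNA copy made by reverse transcriptase is the reverse complement of the mRNA, written with T in place of U. A minimal sketch of that relationship follows; the function names are hypothetical, and the enzymology (priming, second-strand synthesis) is not modelled.

```python
# Base pairing between an RNA template and the DNA strand copied from it.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the first-strand cDNA (5'->3') for an mRNA given 5'->3'."""
    seq = mrna.upper()
    try:
        # The new strand is antiparallel, so read the template backwards.
        return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(seq))
    except KeyError as err:
        raise ValueError(f"unexpected RNA base: {err.args[0]}") from None


def second_strand_cdna(mrna):
    """Return the second strand, which matches the mRNA with T for U."""
    return mrna.upper().replace("U", "T")


if __name__ == "__main__":
    print(first_strand_cdna("AUGGCC"), second_strand_cdna("AUGGCC"))
```

Because the cDNA is copied from processed mRNA, its second strand reads as the coding sequence without introns, which is what makes cDNA inserts suitable for expression in bacteria.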

Both genomic and cDNA libraries are made without regard to obtaining functional cloned donor fragments. Genomic clones do not necessarily contain full-length copies of genes. Furthermore, genomic DNA from eukaryotes (cells or organisms that have a nucleus) contains introns, which are regions of DNA that are not translated into protein and cannot be processed by bacterial cells. This means that even full-sized genes are not translated in their entirety. In addition, eukaryotic regulatory signals are different from those used by prokaryotes (cells or organisms lacking internal membranes, i.e., bacteria). However, it is possible to produce expression libraries by splicing cDNA inserts immediately adjacent to a bacterial promoter region on the vector. In these expression libraries, eukaryotic proteins are made in bacterial cells, which allows several important technological applications that are discussed below in DNA sequencing.

Several bacterial viruses have also been used as vectors. The most commonly used is the lambda phage. The central part of the lambda genome is not essential for the virus to replicate in Escherichia coli, so this can be excised using an appropriate restriction enzyme, and inserts from donor DNA can be spliced into the gap. In fact, when the phage repackages DNA into its protein capsule, it includes only DNA fragments of approximately the same length as the normal phage genome.

Vectors are chosen depending on the total amount of DNA that must be included in a library. Cosmids are engineered vectors that are hybrids of plasmid and phage lambda; however, they can carry larger inserts than either pUC plasmids (plasmids engineered to produce a very high number of DNA copies but that can accommodate only small inserts) or lambda phage alone. Bacterial artificial chromosomes (BACs) are vectors based on F-factor (fertility factor) plasmids of E. coli and can carry much larger amounts of DNA. Yeast artificial chromosomes (YACs) are vectors based on autonomously replicating plasmids of Saccharomyces cerevisiae (baker’s yeast). In yeast (a eukaryotic organism) a YAC behaves like a yeast chromosome and segregates properly into daughter cells. These vectors can carry the largest inserts of all and are used extensively in cloning large genomes such as the human genome.