Spinning Junk Into Gold

Science magazine, 13 June 2003

Genomes are typically littered with “junk”: stretches of DNA with no obvious function that are scattered among genes. But scientists are now finding that some of this junk, at least that from lower organisms, can be astoundingly useful to people, if apparently not to the organisms that carry it.

This surprisingly handy DNA is located within genes and not in the no man’s land between one gene and the next. It comes in two types. So-called introns are clipped out of a gene’s RNA before a protein is made. By contrast, the less well known inteins are translated into protein but then immediately removed.

In simple creatures such as yeast, algae, and bacteria, some introns aren’t just tossed out like so much cellular garbage. Instead, the introns serve as templates for making other proteins. Among these are spectacular enzymes that inject new stretches of DNA into precisely defined spots in a genome.

Scientists are tailoring these enzymes to shoot genes into new locations. The technique may yield edible vaccines, hardier cheese cultures, and better-controlled gene therapy for diseases. “I see enormous possibilities for the use of introns and their enzymes in biotechnology and medicine,” says Marlene Belfort, a geneticist at the Wadsworth Center of the New York Department of Health in Albany.

Scientists are mining inteins, which occur primarily in yeast, algae, viruses, bacteria, and archaea, for a different skill: the ability to seamlessly extract themselves from a protein and tie the loose ends back together. In the past few years, researchers have parlayed this talent into methods for purifying proteins that otherwise can’t be made in bacteria. Inteins have also endowed crops with new genetic traits that are unlikely to cross over to nearby plants. “Not only have inteins fundamentally changed how we view gene expression, they’ve also been harnessed as workhorses that have revolutionized protein chemistry,” says molecular biologist Francine Perler of New England Biolabs in Beverly, Massachusetts, who discovered much of the basic biology of inteins. “For something most people have never heard of,” adds chemical engineer David Wood of Princeton University, “this is a hot area.”

Cell biologists identified introns in 1977, and in the mid-1980s a team discovered a yeast intron with an odd power. It codes for an enzyme later dubbed a homing endonuclease. The intron and its enzyme from one yeast cell are transferred during mating into the second cell. There the enzyme makes a single clip in the recipient cell’s DNA at the exact same location the intron occupied in the first yeast cell. The cell then patches the break by pairing up its chromosome with the homologous chromosome from the first yeast cell, thereby inserting the intron sequence into its own DNA. Thus, the intron invades the new genome like a harmless parasite.

Hundreds of similar enzymes spun by these so-called group I introns have since been found in yeast, algae, viruses, and the mitochondrial and chloroplast genomes of higher plants. All of them are very precise cutting tools. Restriction enzymes commonly used in the lab to cut DNA home in on strings of just six base pairs, sequences expected to occur about once every 4000 base pairs. But the intron-encoded enzymes target sequences from 15 to 40 base pairs long. Researchers estimate that most of them clip spots that would occur only once in a billion base pairs.

Such an enzyme would thus cut very few sites—perhaps just one—in the human genome. Consequently, in the past few years, scientists have been trying to engineer homing endonucleases for made-to-order sequences so that they might, for example, insert therapeutic genes into a chosen location. This could provide better control of the inserted gene’s expression and perhaps circumvent some hazards of today’s random insertion technology, which is thought to have caused leukemia-like disease in some patients when the therapeutic gene apparently activated a gene associated with leukemia (Science, 17 January, p. 320).
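The arithmetic behind these specificity figures can be checked with a short sketch. It assumes the simplest possible model, which is not stated in the article: the four bases are equally likely and independent, so a specific site of length n turns up by chance about once every 4^n base pairs. The function names and the round figure of 3 billion base pairs for the human genome are illustrative assumptions.

```python
# Toy model (assumption): the four bases are equally likely and independent.
# It ignores real base composition and any mismatches an enzyme tolerates.

HUMAN_GENOME_BP = 3_000_000_000  # assumption: roughly 3 billion base pairs


def expected_spacing(site_length: int) -> int:
    """Average number of base pairs between chance occurrences of one site."""
    return 4 ** site_length


def expected_sites(site_length: int, genome_size: int = HUMAN_GENOME_BP) -> float:
    """Expected number of chance occurrences of one site in a genome."""
    return genome_size / expected_spacing(site_length)


if __name__ == "__main__":
    # 6 bp: a typical restriction enzyme; 15 and 18 bp: homing-endonuclease range
    for n in (6, 15, 18):
        print(f"{n:>2} bp site: once per {expected_spacing(n):,} bp, "
              f"about {expected_sites(n):.2g} chance sites in the human genome")
```

Under this model a six-base-pair site recurs every 4,096 base pairs, matching the "about once every 4000" figure, and a 15-base-pair site recurs about once per 1.07 billion base pairs, which gives fewer than three expected chance sites in a genome of 3 billion base pairs.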

Two research teams have now taken the first step toward creating custom cellular scissors. Barry Stoddard and his team at the Fred Hutchinson Cancer Research Center in Seattle reported in the October 2002 issue of Molecular Cell that they could craft a new homing endonuclease by combining halves of two natural homing endonucleases. The researchers used a computer model to determine how to reconfigure key amino acids in half of the endonuclease I-DmoI, hailing from a heat-loving single-celled archaeon, and half of a similar enzyme called I-CreI from an alga so that they fit together like pieces of a jigsaw puzzle.

In test tube experiments, the hybrid enzyme cleaved DNA at the predicted spot in a 22-base-pair sequence that consisted of half of the target sequence for I-DmoI and half of the target for I-CreI. “We show that it is possible to treat the individual enzyme subunits as modular, like pieces of Lego,” Stoddard says. A group at the Paris-based biotech firm Cellectis has also shown, in work to appear in Nucleic Acids Research, that a similar artificial enzyme works not only in solution but also in yeast and mammalian cells.

Now both the Seattle and Paris groups are trying to tailor homing endonucleases to cleave DNA targets of their choosing. Both are using a combination of computer-aided protein design and mutagenesis to generate millions of variants of these enzymes that they can screen for the ability to cleave a given DNA target. “Everybody’s waiting to see if it’s possible to do this,” Stoddard says.

Meanwhile, other researchers are focusing on so-called group II introns, which may be easier to direct to new DNA targets. These introns are also extremely precise, recognizing sequences of 30 to 35 base pairs—in this case, using both the intron-encoded protein that cuts the DNA, among other duties, and the intron’s RNA. The RNA and protein work together to insert the intron at a site in the genome, but the RNA has the primary role in specifying that site—and RNA is much easier to modify than a protein.

Indeed, Alan Lambowitz of the University of Texas, Austin, and his team have shown that they can retarget an intron by altering its RNA without tinkering with the protein. They randomly mutated RNA for an intron from Lactococcus lactis, a bacterium that produces lactic acid and is used in making cheese. The researchers selected those mutants that integrated into two genes involved in HIV infection that had been inserted into Escherichia coli bacteria (Science, 21 July 2000, p. 374).

Since then, the Lambowitz team has created hundreds of introns that insert into different sites, enabling the researchers to deduce more refined rules about how best to alter the intron’s RNA sequence to retarget it to a particular gene. Lambowitz and his team have recently codified their rules into a computer program that tailors the L. lactis intron to any gene of interest, selecting the best flanking and insertion sites and specifying an RNA sequence that binds optimally to the target.

In unpublished work, the Lambowitz team used its computer program to figure out how to knock out 23 of 24 so-called DEAD-box protein genes in E. coli by inserting the intron into the gene sequences. From 1% to 80% of the bacterial colonies exposed to the introns sported the desired knockouts—a big improvement over the 0.1% insertion frequencies the team achieved 2 years ago using older techniques.

A St. Louis-based firm, InGex, recently began selling Lambowitz’s gene-insertion technology, dubbed the Targetron, which provides a plasmid toting the intron as well as Web access to the intron-designing computer program. This system can be used to disrupt bacterial genes with the intron. Alternatively, genes can be added to a bacterium by attaching them to the intron. Lambowitz is now trying to make his technology work in higher organisms. The trick, he says, is to get the intron’s protein-RNA complexes into cell nuclei in sufficient concentrations for insertion to occur. “If we could do what we do in bacteria in animal cells, it would be extremely powerful,” Lambowitz says.

The bacterial work is already moving toward commercial application. David Mills of the University of California (UC), Davis, and his colleagues used an intron technology based on Lambowitz’s to safely add viral resistance to cheese-making lactic acid bacteria. Traditionally, genetic engineers insert an antibiotic-resistance gene as a marker along with the desired gene and then expose the bacteria to an antibiotic. By process of elimination, this identifies the bacteria that successfully received the new genes. But because the L. lactis intron invades genomes at high frequency, Mills found an engineered colony by random genetic screening. “We don’t want to put antibiotic-resistance genes in anything going into food,” Mills says.

Using this technique, the researchers inserted a gene for resistance to a bacterial virus inside the L. lactis intron and inserted it into a laboratory strain of L. lactis. The intron and gene integrated into the organism’s DNA, making the strain more resistant to the virus, they reported in Applied and Environmental Microbiology in February. Mills’s team didn’t make cheese, but the technology could be used to provide viral resistance to any of hundreds of cheesemaking strains that are currently vulnerable to the virus.

Mills and other UC Davis scientists are now working with the California Dairy Research Foundation and RZ Syntopical, a biotechnology firm in Sacramento, California, to apply the same technology to make an edible vaccine for respiratory syncytial virus (RSV). No vaccine currently exists for this virus, a serious childhood respiratory pathogen that can also be deadly to adults with impaired immune systems.

The plan is to use the L. lactis intron to insert an RSV antigen into a strain of the edible lactic acid bacterium, which could then be put into a milk-based formula. In the gut, these bacteria would then produce the antigen and stimulate the immune system. Mills says the intron vaccine technology—still in its earliest stages—will more easily pass Food and Drug Administration scrutiny than other technologies do because no part of the delivery vector is foreign to the food itself. “It’s a very clean system,” Mills says.

Sticking together, splitting up

Introns are not the only genetic junk that has been put to extraordinary uses. Inteins, pieces of peptide removed from proteins, have become an increasingly versatile tool for molecular biologists, chemists, and drug developers. The first intein was discovered in 1990 by three independent labs. All identified a gene in yeast cells that was much larger than the protein it encoded, but the additional DNA did not appear to encode an intron.

The researchers thought that the additional material must be spliced out at the protein level. But because they could not find the larger precursor protein, hardly anyone believed them. Finally Perler and her colleagues at New England Biolabs reported in Cell in 1993 that they could insert an intein between two other proteins, purify the precursor at low temperature, and cause it to splice at a higher temperature. This nailed the case. Researchers went on to find inteins in more than 50 creatures, and some 400 papers about inteins now appear in a database called InBase (www.neb.com/neb/inteins.html).

Inteins break free when the section of protein on the carboxyl end of the intein—the so-called C-extein—grabs onto its N-extein counterpart and tears it off the intervening intein, attaching it to itself. The intein then cuts itself loose from the newly connected exteins. The reaction is spontaneous, occurring as soon as a protein folds—timing that explains the earlier difficulty in finding the natural protein precursors.
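The net result of that splicing reaction can be pictured with a toy string model. This is only a bookkeeping sketch, not a simulation of the chemistry: the precursor is treated as a string of one-letter amino acid codes, and the function name and the sequences used are made-up placeholders rather than real proteins.

```python
from typing import Tuple


def splice(precursor: str, intein: str) -> Tuple[str, str]:
    """Return (mature protein, excised intein) for a toy precursor.

    The precursor is modeled as N-extein + intein + C-extein. Splicing
    joins the two exteins and releases the intein as a separate piece.
    """
    start = precursor.find(intein)
    if start == -1:
        raise ValueError("intein sequence not found in precursor")
    n_extein = precursor[:start]
    c_extein = precursor[start + len(intein):]
    return n_extein + c_extein, intein


if __name__ == "__main__":
    # Hypothetical placeholder sequences, chosen only for illustration
    mature, excised = splice("MKTAYCLSGDHNSWQR", "CLSGDHN")
    print(mature)   # the two exteins joined together
    print(excised)  # the intein, released as a separate peptide
```

The model captures only the outcome the paragraph describes: the flanking exteins end up joined and the intein leaves as a separate peptide.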

Perler, her former postdoc Ming Xu, and their colleagues quickly spun this mechanism into a cheaper way of purifying proteins. Ordinarily, a protein is purified by genetically fusing it to a tag that will selectively stick the protein to a coating on a purification column. But freeing the protein from its tag then requires an expensive protease enzyme. To get around this, the New England Biolabs team inserted an intein between the protein and its tag. The researchers then used a chemical trick to coax the intein to cut itself away from the protein at just one end, releasing the pure protein.

A team led by Wadsworth’s Belfort, including her husband Georges Belfort of Rensselaer Polytechnic Institute in Troy, New York, and Princeton’s Wood, has since developed a single-step method for mass-producing purified proteins using inteins. And on the other side of the size scale, the Belfort group has developed an unpublished miniaturized version of its system that isolates small amounts of protein. The technique should speed up proteomics studies, such as determining the functions of genes or identifying new protein targets for medications.

Inteins can get around other laboratory obstacles as well. The Belfort team, including Wadsworth’s Victoria Derbyshire and Wei Wu, has developed a technique for producing hard-to-make proteins such as blood-clotting factors or malarial proteins needed for vaccine development. Many labs use bacteria to mass-produce proteins, but these particular proteins often kill the bacteria that express them.

For 14 years, the Wadsworth team had been unable to produce the endonuclease I-TevI in bacteria because it is toxic to the organisms. But it got around this problem last year by inserting an intein gene into the I-TevI gene, disabling the resulting protein so that E. coli could produce it. Once the hybrid protein was purified, a drop in pH caused the intein to splice out, restoring I-TevI to its native form.

Inteins can split up genes for a different application: making safer transgenic plants. Because the chloroplasts of most crops are almost always maternally transmitted, says New England Biolabs’ Sriharsa Pradhan, their DNA isn’t carried by pollen, which can spread nucleus-based genes far and wide. Thus, putting one part of a new gene into a plant’s chloroplast protects against transfer of the full gene to other plants.

Pradhan’s team has now used this tactic to confer herbicide resistance on tobacco plants. The researchers attached one of two parts of an algal intein to each half of a herbicide-resistant gene, they report in the 15 April Proceedings of the National Academy of Sciences. One fragment went into the chloroplast genome, the other into the nucleus. The intein-containing fragments reassembled in the chloroplast and spliced to yield an intact protein that protected the plant.

Inteins can also link proteins to each other or to small molecules by attaching a kind of molecular Velcro to the protein when it is purified. Starting in 1998, chemist Tom Muir of Rockefeller University in New York City showed that purifying a protein using a variant of a New England Biolabs intein-based system leaves the protein with a sticky end composed of a thiol ester. This enables the protein to be easily joined to other molecules such as a sugar, a lipid, or another protein.

Biologists have used this approach to attach phosphate groups to certain sites on proteins. Phosphorylated proteins are often the functional versions of the proteins in cells, but they can be extremely difficult to make in the lab in pure form, even though they make up an estimated one-third of natural human proteins. “The beauty of this chemistry is that it’s so simple,” says Muir. “Anybody can do this.”

Now Muir and postdoctoral fellow Henning Mootz have developed a novel way of using inteins to rapidly switch proteins on or off inside a cell and thereby gain clues to their functions. The technique works within minutes, providing much finer temporal control than is possible by controlling gene expression, which can take hours to produce an effect. In this system, an intein splices out of a protein—and thereby activates or inactivates it—in response to the addition of the immunosuppressant rapamycin.

The original concept was to rapidly reconstitute two parts of a single protein, but the first test of the idea, reported last year, showed the linkage of two unrelated proteins. Muir and Mootz attached the gene for each of the test proteins to half of a yeast intein gene plus a gene for either of the human cellular signaling proteins FKBP or FRB, both of which are affected by rapamycin. The researchers mixed the tripartite proteins together in a test tube and added rapamycin, which binds simultaneously to FKBP and FRB. This brought the two protein constructs together and reconstituted the intein, which then spontaneously spliced, removing itself and the FKBP and FRB hooks and linking the two test proteins.

This scheme could be used to quickly activate a single protein within a cell by bringing two halves of it together with rapamycin, Muir says. Alternatively, one might inactivate a protein by attaching it to an inhibitory peptide. In unpublished work, Muir, Mootz, and their colleagues have shown that their splicing reaction works in cultured mammalian cells.

The applications of this technique are only beginning to be explored. “The idea of bringing two polypeptides together … is a basic tool that can have a number of different applications,” Muir says. “We haven’t thought of them all yet.” Indeed, researchers are just beginning to uncover all the treasures buried in inteins and their intron cousins—which are turning out to be anything but junk.