DNA Microsatellites: Agents of Evolution?

Repetitive DNA sequences play a surprising role in how bacteria and perhaps higher organisms adapt to their environments. On the downside, they have also been linked to human disease.

By E. Richard Moxon and Christopher Wills

A human's genetic code consists of roughly three billion bases of DNA, the familiar "letters" of the DNA alphabet. But a mere 10 to 15 percent of those bases make up genes, the blueprints cells use to build proteins. Some of the remaining base sequences in humans-and in many other organisms-perform crucial functions, such as helping to turn genes "on" and "off" and holding chromosomes together. Much of the DNA, however, seems to have no obvious purpose at all, leading some to refer to it as "junk".

Part of this "junk DNA" includes strange regions known as DNA satellites. These are repetitive sequences made up of various combinations of the four DNA bases-adenine (A), cytosine (C), guanine (G) and thymine (T)-repeated over and over, like a genetic stutter. In the past several years, researchers have begun to find that so-called microsatellites, those containing the shortest repeats, have a significance disproportionately great for their size and perform a variety of remarkable functions. Indeed, scientists are discovering that the repetitive nature of microsatellites makes them particularly prone to grow or shrink in length and that these changes can have both good and bad consequences for the organisms that possess them. In certain disease-causing bacteria, for example, the repeat sequences promote the emergence of new properties that can enable the microbes to survive potentially lethal changes in the environment. Some microsatellites are also likely to have substantial effects in humans, because at least 100,000 occur in the human genome, the complete complement of DNA in a human cell. Although the only function assigned so far to human microsatellites is negative-causing a variety of neurological diseases-microsatellites may be surviving relics of evolutionary processes that helped to shape modern humans.

While some investigators search for the reasons humans carry so much repetitive DNA, many are now learning to exploit microsatellites to diagnose neurological conditions and to identify people at risk for those disorders. They are also finding that microsatellites change in length early in the development of some cancers, making them useful markers for early cancer detection. And because the lengths of microsatellites may vary from one person to the next, scientists have even begun to use them to identify criminals and to determine paternity-a procedure known as DNA profiling or "fingerprinting".

Satellite DNA was first identified in the 1960s. Researchers discovered that when they centrifuged DNA under certain conditions, it settled into two or more layers: a main band that contained genes and secondary bands that came to be known as satellite bands. The satellite bands turned out to be made of very long, repetitive DNA sequences. In 1985 Alec J. Jeffreys of the University of Leicester found other, shorter repetitive regions of DNA, which he dubbed minisatellites, that turned out to consist of repeats of 15 or more bases. (Jeffreys and his colleagues also determined that the number of repeats in a given minisatellite differs between individuals, a finding that allowed them to invent the DNA-fingerprinting technique). In the late 1980s James L. Weber and Paula L. May of the Marshfield Medical Research Foundation in Marshfield, Wis., and Michael Litt and Jeffrey A. Luty of the Oregon Health Sciences University isolated satellites made up of still shorter DNA repeats and named them microsatellites; these, too, would prove useful for DNA fingerprinting.

What makes microsatellite DNA so important for evolution is its extremely high mutation rate: it is 10,000 times more likely to gain or lose a repeat from one generation to the next than a gene such as the one responsible for sickle cell anemia is to undergo the single-base mutation leading to that disease. And although it is quite rare for the single-base mutation that underlies sickle cell anemia to mutate back again to its benign state, microsatellites can readily return to their former lengths, often within a few generations. Today scientists generally consider microsatellite DNA to consist of sequences of up to six bases repeated over and over, end to end, like a train made up of the same type of boxcar.
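The working definition above (a unit of up to six bases repeated end to end) can be turned into a simple sequence scan. What follows is a minimal sketch, not a published tool: the function name, the example sequence and the cutoff of at least three consecutive copies are all assumptions made for illustration.

```python
import re

def find_microsatellites(seq, max_unit=6, min_repeats=3):
    """Return (start, unit, repeat_count) for each tandem repeat in seq.

    max_unit follows the "up to six bases" definition in the text;
    min_repeats=3 is an arbitrary illustrative cutoff, not a standard.
    """
    # The lazy group captures the shortest repeating unit; \1{n,} then
    # requires at least (min_repeats - 1) further copies directly after it.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Hypothetical sequence: four copies of CTCTT flanked by unrelated bases.
example = "GGA" + "CTCTT" * 4 + "AGC"
print(find_microsatellites(example))
```

On the hypothetical sequence, the scan reports a single tract of four CTCTT units starting at position 3.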

"Smart" Microbes
The role of microsatellites in the diversity of pathogenic bacteria was uncovered in 1986 in the laboratory of Thomas F. Meyer of the Max Planck Institute for Biology in Tübingen. Meyer and his colleagues were studying Neisseria gonorrhoeae, the bacterium that causes the sexually transmitted disease gonorrhea. N. gonorrhoeae, a single-celled organism, possesses a family of up to 12 outer-membrane proteins that are encoded by genes called Opas. (The name of the genes is derived from the opaque appearance of bacterial colonies that make Opa proteins.) The proteins produced by the Opas are important because they allow the bacterium to adhere to and to invade epithelial cells, such as those that line the respiratory tract, as well as cells of the immune system called phagocytes. Each of the Opa genes contains a microsatellite composed of multiple copies of the five-base motif CTCTT.

The enormous variation conveyed by microsatellite repeats results from the fact that the repeats are especially prone to DNA-replication errors, often through what is called slipped-strand mispairing. Before a cell-bacterial or otherwise-can replicate, it must make a duplicate set of its DNA. This is a complicated process because each DNA molecule is a double helix resembling a twisted ladder, where the rungs of the ladder are base pairs. The genetic code is spelled out by the bases on one side of the ladder; the bases along the other side are complementary (A always pairs with T, and C with G).
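The pairing rule (A with T, C with G) is simple enough to write down directly. The sketch below is illustrative only; the function names are invented here. It also reflects one fact the paragraph does not spell out: the two strands of the helix run in opposite directions, so the partner strand is conventionally written reversed.

```python
# Pairing rule from the text: A pairs with T, and C pairs with G.
PAIRING = str.maketrans("ACGT", "TGCA")

def complement(strand):
    """Base-by-base partner of a strand, read in the same direction."""
    return strand.translate(PAIRING)

def reverse_complement(strand):
    """Partner strand written in its own conventional reading direction."""
    return complement(strand)[::-1]

print(complement("CTCTT"))          # GAGAA
print(reverse_complement("CTCTT"))  # AAGAG
```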

During DNA replication, the ladder splits down the middle, separating the base pairs, as enzymes called DNA polymerases copy each strand. As the new strand is made, it pairs with its template. Slipped-strand mispairing can occur when either the old, template strand or the newly forming, complementary strand slips and pairs with the wrong repeat on the other strand. This slippage causes the DNA polymerase to add or delete one or more copies of the repeat in the new strand of DNA.

The frequency of such slippage mechanisms is very high in N. gonorrhoeae: each time the bacteria divide, approximately one out of every 100 to 1,000 daughter cells will carry a mutation that changes the number of CTCTT repeats. This change can have a dramatic effect on the Opa genes, because genetic information is read in "words" of three bases, called codons. Proteins are strings of amino acids, and each codon specifies a particular amino acid in the protein chain. Because the repeat is not three bases long, an increase or decrease in the number of repeats shifts the meaning of all the subsequent codons.
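The arithmetic behind this frameshift can be checked in a few lines. This is a minimal sketch that assumes, purely for illustration, that the repeat tract begins on a codon boundary; the function name is invented, and the text does not say which repeat counts correspond to a working Opa gene.

```python
def frame_after_tract(unit, n_repeats):
    """Reading-frame offset (0, 1 or 2) of the first base after the tract.

    Assumes the tract starts on a codon boundary. An offset of 0 means the
    downstream codons are read as before; 1 or 2 means they are shifted.
    """
    return (len(unit) * n_repeats) % 3

for n in range(7, 12):
    print(n, "CTCTT repeats -> offset", frame_after_tract("CTCTT", n))
```

Under that assumption, a five-base unit leaves the downstream codons in frame only when the repeat count is a multiple of three, so gaining or losing a single repeat always shifts the frame.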

In the case of the Opa genes, deleting a CTCTT repeat leads to the production of a protein that is shortened and cannot adhere to host cells; in consequence, the bacterium bearing the shortened protein becomes unable to enter those cells. But subsequent slippage has a good chance of adding the repeat back, thereby allowing the Opa gene to produce a functional protein once again.

This reversible switching, called phase variation, has been found in many disease-causing bacteria. By switching its various Opa genes on and off from one generation to the next, N. gonorrhoeae can increase its chances for survival. There are times, for instance, when it is useful for the microbe to stick to and enter host cells, such as when that bacterium is spreading to a new host. At other times, it is strategically more advantageous for the bacterium not to interact with host cells-particularly phagocytic cells, which engulf and destroy bacteria.
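Phase variation in a growing population can be sketched as a toy simulation. Everything here is an assumption made for illustration: a slippage probability of 1 in 100 per daughter cell (the upper end of the range quoted for N. gonorrhoeae), a gain or loss of exactly one repeat per slip, a founder with nine repeats, and a gene that counts as "on" only when the five-base tract keeps the reading frame intact.

```python
import random

def is_on(n_repeats, unit_len=5):
    """Assumed rule: the gene is 'on' when the tract leaves the frame intact."""
    return (n_repeats * unit_len) % 3 == 0

def divide(repeat_counts, slip_prob=0.01, rng=random):
    """One round of division: every cell yields two daughters, and each
    daughter gains or loses one repeat with probability slip_prob."""
    daughters = []
    for n in repeat_counts:
        for _ in range(2):
            if rng.random() < slip_prob:
                daughters.append(max(1, n + rng.choice((-1, 1))))
            else:
                daughters.append(n)
    return daughters

rng = random.Random(0)       # fixed seed so the run is repeatable
cells = [9]                  # hypothetical founder, "on" under the rule above
for _ in range(12):          # 12 rounds of division -> 4,096 cells
    cells = divide(cells, rng=rng)

switched_off = sum(not is_on(n) for n in cells)
print(len(cells), "cells,", switched_off, "with the gene switched off")
```

Even in this crude model a minority of "off" variants appears within a dozen generations, and further slippage can turn some of their descendants back "on".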

The implications of slipped-strand mispairing for the ability of a bacterium to vary its surface molecules have also been studied extensively in Hemophilus influenzae. Type b strains of this bacterium are a primary cause of the life-threatening brain infection bacterial meningitis. Until the advent of a vaccine in the late 1980s, roughly one in every 750 children younger than five years of age contracted H. influenzae meningitis.

The outer membrane of H. influenzae is studded with molecules of fats and sugars joined together to make a molecule called lipopolysaccharide (LPS). One part of LPS, called choline phosphate, helps H. influenzae stick to cells in the human nose and throat, where the bacterium normally lives without eliciting symptoms. At least three of the genes required for making LPS contain microsatellites built from the four-base sequence CAAT. As is true of the microsatellites of the Opa genes of N. gonorrhoeae, changes in the number of CAAT repeats in these genes can cause H. influenzae to make LPS that either has or lacks choline phosphate.

Jeffrey N. Weiser of the University of Pennsylvania has shown that strains of H. influenzae that have choline phosphates on their LPS molecules-so-called ChoP+ strains-colonize the human nose and throat more efficiently than strains without them, which are referred to as ChoP- strains. Without ChoP, however, the bacterium is more resistant to being killed by various factors present in the host's blood and in other tissue fluids. The bacterial cells can switch between the two states, depending on whether they are being left undisturbed to grow in the respiratory tract or are spreading through the blood to other sites, where they are likely to be attacked by components of the immune system.

Most H. influenzae bacteria isolated from humans are ChoP+ variants, which are susceptible to the immune attack. ChoP- variants inevitably arise through slipped-strand mispairing, but they usually do not persist in the respiratory tract, because they adhere less efficiently to host cells than ChoP+ strains. But if the host contracts a viral infection that inflames the nasal tissues, the inflammation can increase the exposure of the bacteria to defense proteins of the host's immune system. In that case, ChoP- variants would have an advantage because they can fend off such an attack. Once the viral infection subsides, ChoP+ mutants generated by further slipped-strand mispairing of microsatellite DNA will once again predominate.

Genes such as these that can switch on or off readily have been named contingency genes for their ability to enable at least a few bacteria in a given population to adapt to new environmental contingencies. The variety of traits encoded by contingency genes includes those governing recognition by the immune system, general motility, movement toward chemical cues (chemotaxis), attachment to and invasion of host cells, acquisition of nutrients and sensitivity to antibiotics. Contingency genes make up a very small fraction of a bacterium's DNA, but they can provide a vast amount of flexibility in functioning. If only 10 of the 2,000 genes in a typical bacterium were contingency genes, for instance, the bacterium would be able to display 2^10 (that is, 1,024) different combinations of "on" and "off" genes. Such diversity ensures that at least one bacterium in a population can survive its host's immune or other defenses and then can replicate to produce a new, thriving colony.
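The count of combinations follows from each contingency gene having two independent states. As a quick check, the sketch below enumerates every on/off pattern for ten genes; the variable names are illustrative only.

```python
from itertools import product

N_CONTINGENCY_GENES = 10  # the figure used in the text's example

# Every way of assigning "on" or "off" to each of the ten genes.
patterns = list(product(("on", "off"), repeat=N_CONTINGENCY_GENES))

print(len(patterns))   # 1024, that is 2 ** 10
print(patterns[0])     # the pattern with all ten genes switched on
```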

Causing disease-which can backfire by killing the life-giving host-may be one of the prices that bacteria pay for their ability to produce so many variants. The occasional variant may stray beyond its usual ecological niche in the host. It may penetrate the cells lining the respiratory or intestinal tracts, for example, to yield a potentially fatal infection elsewhere in the body. Provided that such events occur rarely, however, the benefits of contingency genes for the survival of a bacterial species outweigh the disadvantages of killing some hosts.

The microsatellites of these bacteria are true evolutionary adaptations. It is implausible that such unusual repeats could have arisen by chance; they must have evolved and been retained because they enable bacterial populations to adapt rapidly to environmental changes.

Slipped-Strand Mispairing
In this process, the number of microsatellite repeats increases or shrinks when a cell copies its DNA before dividing. During DNA replication (a), enzymes called the DNA polymerase complex unzip the parental DNA helix and copy both strands. One of the copies is made piecemeal: the polymerase complex synthesizes a short fragment (1) beginning with an RNA primer, then skips ahead to generate a second short fragment (2). When the polymerase finishes the second fragment, the RNA primer is removed, and the two fragments are connected with DNA. Increases in the number of microsatellite repeats (b) occur when the new strand slips down one repeat in its binding to the old, template strand, causing the polymerase to add an extra repeat in the new strand to fill the gap. Decreases (c) happen when the old strand slips, which results in repair enzymes deleting a repeat.
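The two outcomes described above, (b) and (c), can be captured in a deliberately simplified model that tracks only the number of repeats and ignores the underlying enzymology. The function and parameter names are invented for this sketch, and a single slip is assumed to change the tract by exactly one repeat.

```python
def replicate_tract(unit, template_repeats, slipped_strand=None):
    """Return the repeat tract carried by the newly copied strand.

    slipped_strand may be None (faithful copy), "new" (the new strand
    slips, so one extra repeat is added) or "template" (the old strand
    slips, so one repeat is lost).
    """
    if slipped_strand is None:
        copied = template_repeats
    elif slipped_strand == "new":
        copied = template_repeats + 1
    elif slipped_strand == "template":
        copied = max(template_repeats - 1, 0)
    else:
        raise ValueError("slipped_strand must be None, 'new' or 'template'")
    return unit * copied

print(replicate_tract("CAAT", 4))               # faithful copy: 4 repeats
print(replicate_tract("CAAT", 4, "new"))        # expansion: 5 repeats
print(replicate_tract("CAAT", 4, "template"))   # contraction: 3 repeats
```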


Searching for Papa Chimp
Besides the well-publicized use of microsatellite DNA to nab criminals through DNA fingerprinting, microsatellites are also being used to aid conservation efforts through study of the sex lives of endangered animals.

DNA fingerprinting, which distinguishes people by differences in selected regions of their DNA, is possible because the lengths of microsatellite DNA sequences differ between individuals. Scientists create DNA fingerprints by using special enzymes to make millions of exact copies of various microsatellites from each subject and then separating the copies by size on a gel. The result is a pattern of bands that looks much like a bar code-and that is almost as unique to each individual as a fingerprint is.

Pascal Gagneux and David S. Woodruff of the University of California at San Diego-together with Christophe Boesch of the Zoological Institute of the University of Basel-have used DNA microsatellites as tracers to probe the mating habits of a group of wild chimpanzees in the Tai Forest of Ivory Coast. They collected hairs from the temporary treetop nests each animal built to sleep in and extracted DNA from cells clinging to the roots of the hairs. By comparing the microsatellite DNA fingerprints of the adult males and females with those of 13 offspring, Gagneux, Woodruff and Boesch found that seven of the babies could not have been fathered by males in the group. Although the researchers had never seen them doing it, at least some of the female chimpanzees must have sneaked into the surrounding forest during the night for trysts with males in other groups nearby.
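The logic of excluding a male as a father can be sketched with microsatellite lengths alone. This is a minimal illustration, not the researchers' actual analysis: the locus names and allele lengths are hypothetical, each animal is assumed to carry two alleles per locus, and mutation and genotyping error are ignored.

```python
def could_be_father(child, mother, male):
    """Each argument maps a locus name to a pair of allele lengths
    (repeat counts). Returns False if the male is excluded at any locus."""
    for locus, (c1, c2) in child.items():
        from_mother = set(mother[locus])
        from_male = set(male[locus])
        # One of the child's alleles must come from the mother and the
        # other from the candidate father.
        if not ((c1 in from_mother and c2 in from_male) or
                (c2 in from_mother and c1 in from_male)):
            return False
    return True

# Hypothetical genotypes at two made-up loci.
child  = {"locus_A": (12, 15), "locus_B": (8, 9)}
mother = {"locus_A": (12, 13), "locus_B": (9, 11)}
male_1 = {"locus_A": (15, 17), "locus_B": (8, 8)}
male_2 = {"locus_A": (14, 16), "locus_B": (8, 10)}

print(could_be_father(child, mother, male_1))  # True: compatible at both loci
print(could_be_father(child, mother, male_2))  # False: cannot supply allele 15
```

A male who passes this test is only compatible with paternity, not proven to be the father; a male who fails at any locus is excluded, which is the kind of result reported for seven of the 13 offspring.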

Such nocturnal adventures might explain how even small groups of chimpanzees maintain a great deal of genetic diversity. Diversity is valuable for providing resistance to disease and is strongly suspected to aid survival in many other ways.

Preserving such variety is likely to be essential to the survival of wild chimpanzee populations. Unfortunately, as these populations become more and more fragmented and separated by longer distances, the ability of females to find males in other groups and to bring new genes into their group will be curtailed drastically. -E.R.M. and C.W.



Detecting Cancer
Microsatellite DNA may soon be affecting our lives in an important way: it may improve the early detection of cancer. Tests for mutations in genes that in their altered forms predispose to cancer, such as p53 and ras, can now be used to detect as few as one cancer cell out of 10,000 normal ones. But mutations in these genes do not occur in all cancers or even in all cancers of a given type.

Microsatellites provide another method for early cancer detection because the overall rate of microsatellite expansion or contraction in cells turns out to be markedly increased in some types of cancers. Such bursts of change, often involving many different microsatellites, can be detected fairly easily. The approach can currently detect one cancerous cell out of about 500 normal ones.

Microsatellite changes in cancer cells were first found in 1993 by Manuel Perucho of the California Institute of Biological Research in La Jolla, Calif., who was studying hereditary nonpolyposis colon cancer. Perucho noted that many microsatellites from cancer cells were either longer or shorter than those in normal cells from the same patient. It was soon shown that one of the defects causing these alterations was in a gene encoding an enzyme responsible for correcting the length of microsatellites that grew or shrank during DNA replication; loss of the functional gene would presumably increase the likelihood that the errors would go uncorrected.

The circle of proof was closed when C. Richard Boland of the University of California at San Diego and others inserted a human chromosome carrying a normal DNA-repair gene into colon cancer cells grown in the laboratory. They observed that the inserted gene corrected the tendency for microsatellites in the cancer cells to mutate.

Striking as these findings are, however, microsatellite instability may be more a symptom than a cause of cancer. Although so-called knockout mice that lack the gene encoding one of the major mismatch-repair proteins live for only a short time and acquire many types of cancer, none of the cancer cells show increased levels of microsatellite mutations. It appears that such alterations form only part of the great variety of genetic changes that can cascade through the genome of a cell once the process of carcinogenesis has been set in motion-meaning they might be by-products of the carcinogenic process rather than contributors to it.

Nevertheless, the associations occur often enough that microsatellite instability provides clinicians with a new and powerful tool. Successful clinical trials of early detection systems employing microsatellites have been carried out for colorectal and bladder cancers and are now being extended to many other types of cancer, although none of the tests are now available outside research settings. As clinicians gain experience with these various patterns, not only will cancers be detected sooner than ever before, but the pattern of microsatellite variation will provide strong indications of the type of cancer involved. -E.R.M. and C.W.


Microsatellites in People
Useful as they are, contingency genes are apparently confined to bacteria. The role of microsatellites seems to be very different in eukaryotic organisms like ourselves, whose cells contain a nucleus. None of the eukaryotic microsatellites identified to date appear to scramble the way DNA is read and to yield non-functional proteins. Most lie outside genes, but roughly 10 percent actually fall within them. Of this 10 percent, almost all are so-called triplet repeats, which tend to expand or contract in units of three bases. Just as adding or deleting an "and" or a "the" in a sentence rarely obscures its meaning, triplet repeats can expand or contract without disturbing a gene's message. Having the same length as a codon, they may simply lead to insertion or removal of a few repetitive amino acids without changing the sequence of all the others down the line.
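The point about triplet repeats can be made concrete by translating a toy gene before and after one CAG unit is added. The gene below is entirely hypothetical; the codon table is the standard genetic code, written in the usual compact TCAG ordering, and CAG is one of the codons for glutamine (Q).

```python
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code: codon -> one-letter amino acid ("*" marks a stop).
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(dna):
    """Read dna three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def toy_gene(n_cag):
    """Hypothetical gene: start codon, n CAG repeats, then a short tail."""
    return "ATG" + "CAG" * n_cag + "TCCGAATAA"

print(translate(toy_gene(3)))  # MQQQSE
print(translate(toy_gene(4)))  # MQQQQSE
```

Adding one repeat inserts a single extra glutamine, and the amino acids after the tract (S and E in this toy example) are unchanged.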

So, what are the functions of microsatellites in higher organisms? Scientists suspect that at least some of them must have uses, because eukaryotes have more microsatellites than bacteria and many of them happen to be in or near genes involved in pathways regulating fundamental cellular processes. Only a few hints have yet emerged, however, about what these purposes might be.

The few effects that have now been traced to eukaryotic microsatellites have generally been harmful. For example, the grim neurodegenerative disorder Huntington's disease-characterized by late-onset dementia and gradual loss of motor control-is triggered by a flawed version of a gene that codes for a large protein, huntingtin, of unknown function. The normal gene contains a long, triplet-repeat microsatellite that adds a string of amino acids called glutamines near the start of the protein.

The number of glutamines at the beginning of the huntingtin protein usually ranges from 10 to 30. But people who have-or who are destined to develop-Huntington's disease carry a microsatellite coding for an unusually long run of 36 or more glutamines. Inheritance of just one copy of the expanded gene, from either the mother or the father, is enough to ensure eventual illness. It is not yet clear how the long stretches of glutamines contribute to Huntington's.
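The two ranges quoted in this paragraph can be written as a small lookup, purely to make the thresholds explicit. This is an illustration of the article's figures, not a clinical rule; the function name and labels are invented, and counts the article does not mention are deliberately left unclassified.

```python
def huntingtin_category(n_glutamines):
    """Classify a glutamine run using only the figures given in the text."""
    if n_glutamines >= 36:
        return "disease range"                # 36 or more glutamines
    if 10 <= n_glutamines <= 30:
        return "usual range"                  # 10 to 30 glutamines
    return "not classified in the text"       # no figure given in the text

for n in (20, 33, 40):
    print(n, huntingtin_category(n))
```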

More than a dozen such triplet-repeat diseases are now known; most are rare neurological diseases. About half the disease-causing microsatellite repeats are inside a gene, and most encode glutamines. The rest are sufficiently close to nearby genes that they can affect their function.

One of these rare neurological diseases-spinal bulbar muscular atrophy-results from expansion of a microsatellite inside a gene on the X chromosome; the gene codes for a receptor for the male hormone androgen. People with 40 or more triplet repeats in part of one of their androgen receptor genes develop the disease. But a group led by E. L. Yong of National University Hospital in Singapore has demonstrated that repeats that are even slightly longer than normal can also have medical effects. They reported in 1997 that men with between 28 and 40 repeats in the part of the gene that encodes glutamines were likely to be infertile.

Too few triplet repeats in the androgen receptor can also have untoward consequences. Several other research groups have shown that men with 23 or fewer repeats have an increased risk of prostate cancer. Such cases are unusual, however.

Evolving Evolvability
Why do we have all these genetic time bombs ticking inside our genomes? It is striking that so many of our triplet-repeat diseases involve neurological function and that none of those linked to triplet repeats in humans have yet been reported in other primates, such as chimpanzees. If such diseases turn out to be unique to humankind, they might represent a genetic cost we have incurred because of the rapid evolution of our brains. It is possible that long microsatellites at or near certain genes might contribute to brain function and might therefore have persisted throughout evolutionary time even though they occasionally expand too much and cause disease.

In 1989 one of us (Wills) postulated on theoretical grounds that some genes have evolved the ability to evolve. According to the hypothesis, in an environment that fluctuates in some predictable way-such as growing warmer or cooler-possessing the genetic apparatus to evolve quickly would have advantages. The contingency genes of bacteria have turned out to be excellent examples of evolvability genes: their high rates of forward and backward mutation allow bacteria to adapt rapidly to predictable environmental changes and then to revert back again when the earlier conditions reappear.

Perhaps eukaryotic microsatellites exert a more subtle form of regulation than that provided by bacterial contingency genes. In humans, microsatellites within genes have been found that influence the production rate of a number of proteins, ranging from the bile pigment bilirubin to neurotransmitters, the chemicals that carry messages between nerve cells. David G. King of Southern Illinois University has suggested that such microsatellites may be "tuning knobs" that evolved to act as rheostats for gene function, turning up the amount of protein produced by a gene in some instances and decreasing it in others.

Indeed, Walter Schaffner and his colleagues at the University of Zurich have shown that adding microsatellites that encode runs of glutamines or prolines (another amino acid) at the start of a known gene can increase its ability to yield protein. Perhaps, because it is so much less disruptive than contingency gene switching, this form of gene regulation emerged during the evolution of complex, multicellular organisms.

Scientists have only begun to probe the role of microsatellites in our own species. It may be that the repeats, with their ability to switch rapidly among a limited number of states, will provide insights into our own capacity to adapt to environmental change, just as contingency genes have done in bacteria.

The Authors
E. RICHARD MOXON and CHRISTOPHER WILLS approach the study of DNA microsatellites from two different vantage points. Moxon is a pediatrician and an infectious disease specialist; Wills is an evolutionary biologist. Moxon currently heads both the department of pediatrics at the University of Oxford and the Molecular Infectious Diseases Group of the Institute of Molecular Medicine at John Radcliffe Hospital in Oxford, England. Wills is a professor of biology at the University of California, San Diego. He is the author of several popular books on evolutionary biology, including Yellow Fever, Black Goddess (Addison-Wesley, 1996) and Children of Prometheus: The Accelerating Pace of Human Evolution (Perseus Books, 1998).

Further Reading
ADAPTIVE EVOLUTION OF HIGHLY MUTABLE LOCI IN PATHOGENIC BACTERIA. E. R. Moxon et al. in Current Biology, Vol. 4, No. 1, pages 24-33; January 1, 1994.
EVOLUTIONARY TUNING KNOBS. D. G. King, M. Soller and Y. Kashi in Endeavour, Vol. 21, No. 1, pages 36-40; 1997.
FURTIVE MATING IN FEMALE CHIMPANZEES. P. Gagneux, D. S. Woodruff and C. Boesch in Nature, Vol. 387, pages 358-359; May 22, 1997.