Human genetic code is nearly cracked
Masterpiece of biological research opens medical vistas, raises new fears
By RICK WEISS
Washington Post
Monday, May 29, 2000
It's been prized as the Holy Grail of biology, the Book of Life, the instruction manual for making a person.
For a decade the double-stranded thread of genetic code known as the human genome has been the object of the biggest and boldest biological enterprise ever launched, costing about $2 billion and requiring millions of times the computing power used to land a man on the moon.
The goal has been to trace every letter of the genome's biochemical code, decipher its enigmatic molecular message and gain medical mastery at last over the world's most elegantly encrypted creation, the human being.
Now, in the next two weeks or so, a Rockville, Md., company and a team of publicly funded scientists are expected to make separate and competing announcements that each has largely completed the task.
The researchers, some affiliated with Celera Genomics Corp. and the others with the international Human Genome Project, will tell the world they have identified and placed in order almost all of the approximately 3 billion bits of genetic code that tell a human body how to live and when to die.
"For a long time it's been, `Who knows if we can really do this?' But it feels finite now," said Robert H. Waterston, director of the Washington University Genome Sequencing Center in St. Louis, which has produced much of the data for the project during the last several years. "Now basically we've got it in hand, and that feels really good."
The twin announcements will mark the crossing of an admittedly fuzzy finish line, an arbitrary definition of near-completion created largely to please impatient investors and congressional funders. The job of reading out the entire human genome is actually just over 90% done, and many of the ostensibly finished stretches of DNA remain riddled with errors because they've been "spell-checked" only four or five times -- about half the number needed to weed out mistakes.
It could be another two years, scientists say, before every letter of the human genetic code is put exactly into place, with an accuracy of 99.99%.
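As a rough illustration of why repeated "spell-checking" weeds out mistakes, the sketch below takes several independent reads of the same short stretch and keeps the most common letter at each position. It is a toy model under stated assumptions: the reads are already lined up and equal in length, the sequences are invented, and the function name `consensus` is hypothetical. Real sequencing pipelines are considerably more involved.

```python
from collections import Counter


def consensus(reads):
    """Return the majority letter at each position across aligned reads.

    Assumes every read covers the same stretch and has the same length
    (a simplification made for this illustration).
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must all have the same length")
    # zip(*reads) yields one column of letters per position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


# Five invented reads of one stretch; two contain a single misread letter.
reads = ["GATTACA", "GATTACA", "GATCACA", "GATTACA", "GATTAGA"]
print(consensus(reads))  # prints GATTACA
```

With only one or two reads, a single misread letter cannot be outvoted, which is the intuition behind reading each stretch many times.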
Nonetheless, the first tracing of even this "rough draft" of the human genome onto paper (actually onto computer disk; a printed manuscript would require a stack of paper higher than the Washington Monument) marks a historic milestone in human self-knowledge. And already, the emerging recognition among scientists and the general public that the job is about done is generating intense crosscurrents of excitement and apprehension:
-- Excitement because a full explication of the human genetic code promises a wealth of practical benefits, such as gene-based tests that can tell people years in advance what diseases they ought to watch out for, and new medicines that use genetic information to treat diseases without side effects or even prevent them from ever arising.
-- Apprehension because the same information could usher in a dark age of high-tech eugenics, in which people are discriminated against for harboring imperfect genes and in which the marketing of expensive genetic enhancements deepens the divide between the world's haves and have-nots.
At the same time, the achievement is bringing into focus a less well-advertised truth about the sequencing of the human genome -- namely that the sequence tells scientists relatively little by itself.
Many Hurdles Ahead
Indeed, as the sequencing job moves into its mop-up phase of filling in gaps and double-checking initial results, a new sense of sobriety has settled over researchers around the world. Increasingly it has become clear that getting the human genetic sequence in hand is but the first of many hurdles to be cleared before scientists will be able to interpret its meaning and manipulate its message.
On the positive side, experts said, that means there is still time for Congress to resuscitate stalled legislation that would protect people from discrimination on the basis of their genetic inheritances.
"Now is the time to . . . make it safe for people to learn this information about themselves," said Francis S. Collins, chief of the National Human Genome Research Institute, which, with the Department of Energy, has overseen the U.S. contribution to the international genome-sequencing effort.
On the downside, it means that, despite all the breathless anticipation of the genome project's completion, it will probably be another decade or two before the golden era of genetic medicine reaches hospital rooms, doctors' offices and people's medicine cabinets.
The human genome can be thought of as a huge encyclopedia that is written as a single enormously long sentence of 3.1 billion letters, with virtually no punctuation along the way. In a remarkable feat of packaging, a copy of this six-foot-long rambling molecular sentence is folded inside almost every one of the body's 100 trillion cells.
Genes are individual portions of that run-on text, ranging in size from about 1,000 to 100,000 letters. Each "letter" is actually one of four kinds of chemicals, called "bases," which hang like charms along the length of a DNA chain. The pattern of those bases amounts to a coded set of instructions that tells a cell how to make a key product, such as a hormone, or how to complete some other essential task, such as reproducing itself.
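The "letters" described above can be modeled on a computer as an ordinary string. The sketch below assumes the conventional single-letter abbreviations for the four bases (A, C, G and T) and the standard pairing between the two strands (A with T, C with G). The example sequence and the function names are invented for illustration.

```python
# Standard base pairing between the two strands of DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def base_counts(sequence):
    """Count how often each of the four bases appears in a sequence."""
    return {base: sequence.count(base) for base in "ACGT"}


def reverse_complement(sequence):
    """Return the opposite strand, read in its own direction."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


fragment = "ATGCCGTA"  # invented eight-letter example
print(base_counts(fragment))
print(reverse_complement(fragment))
```

At this scale the genome is simply text, which is why so much of the remaining work described here is a computing problem.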
The achievement that genome researchers are poised to announce is that virtually all of the letters in this encyclopedia have been identified and placed in order. But that doesn't mean anyone can read the book yet. It is, after all, written in code.
Next, scientists must tease out the genome's tens of thousands of genes -- the periodic riffs of meaningful text -- from the long stretches of gibberish, or so-called junk DNA, that make up the vast majority of the genome's 3 billion letters.
That is not an easy task, and it is made even harder because each gene is itself broken up into many pieces, with each piece separated by lengths of apparently meaningless code. Computer programs designed to recognize gene sequences in this sea of apparent nonsense are still far from perfect, and miss about half the sequences they are looking for.
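To make the "broken up into many pieces" idea concrete, the sketch below stitches a gene's meaningful pieces back together once their positions are known. The sequence, the coordinates and the function name `splice` are all hypothetical. The hard part the article describes, finding those positions in the first place, is not attempted here.

```python
def splice(sequence, pieces):
    """Join the meaningful pieces of a gene, skipping the stretches between.

    `pieces` is a list of (start, end) positions, end-exclusive, given in
    order along the sequence. The positions are assumed to be known.
    """
    previous_end = 0
    parts = []
    for start, end in pieces:
        if not (previous_end <= start < end <= len(sequence)):
            raise ValueError("pieces must be ordered and non-overlapping")
        parts.append(sequence[start:end])
        previous_end = end
    return "".join(parts)


# Invented example: uppercase marks meaningful text, lowercase the filler.
dna = "ATGGCCgtaagtttcagGATTACAgtgagcccagTAA"
pieces = [(0, 6), (17, 24), (34, 37)]
print(splice(dna, pieces))  # prints ATGGCCGATTACATAA
```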
"It took 3 1/2 billion years of evolution to create this text," said Eric Lander of the Whitehead Institute for Biomedical Research in Cambridge, Mass. "The notion that we're going to have the last word on the interpretation of this text in a mere century seems to me an act of hubris," he said, referring to the birth of modern genetics in 1900.
Much of the work in coming years will be aimed at finding genes that contribute to human diseases. At least 4,000 of the tens of thousands of genes on the human genome are believed to be directly involved in the onset of diseases, and countless others contribute to ailments in more subtle ways.
Even without the help of a nearly complete human sequence, scientists in the past decade have managed to identify several dozen of these genes using old-fashioned techniques. But as the full-length human genome has gradually come into focus, researchers have been able to find culprit genes far more easily.
For example, it took about 100 scientists 10 years to discover the gene that causes the inherited lung disease cystic fibrosis in 1990, before the first crude maps of the genome had been drawn. By contrast, it took a single postdoctoral scientist and a few part-time assistants just a year to discover in 1997 the gene that causes Pendred syndrome, an inherited cause of deafness, and to develop a test that identifies the causative mutations in that gene.
Moreover, thanks to the genome project's collection of data for that portion of the genome and the availability of comparative data from other organisms, the team quickly figured out what the gene does in the body; deduced a plausible explanation for how the gene, when faulty, causes deafness; and began conceiving of possible approaches to preventing or correcting the faulty gene's damage to the inner ear.
Genes of Disease
The genome institute's Collins predicts that within the next three to five years researchers will find most of the genes that individually cause diseases. Within the next five to seven years, he predicts, the project will go further and uncover most of the genes that play major roles in heart disease, diabetes, asthma, manic depression and the many other diseases that are caused not by single genes but by multiple genes along with environmental factors.
Fast on the heels of those discoveries will come genetic tests, which in many cases will do more than merely confirm a diagnosis. For diseases such as cancer, for example, which can take years to blossom, a positive test can warn people that they are at increased risk of trouble. That can give a person the lifesaving opportunity to take preventive medications, initiate changes in diet or lifestyle, or start a program of regular checkups to watch for the first signs of disease, when it is most curable.
Genetic tests also could end up telling people far more than they want to know about themselves. Many people with a family history of an inherited ailment already face wrenching decisions about whether to learn if they are destined to get the disease. For some diseases, parents must decide whether and when to tell their children about the youngsters' risk, or to offer them a chance to get tested. Should a 10-year-old girl have to worry about breast cancer?
Genetic information also might end up telling insurers and employers more than people want them to know.
Copyright 2000
Provided by ProQuest Information and Learning Company. All rights reserved.