Packed inside every cell in your body is a set of genetic instructions,
3.2 billion base pairs long.
Deciphering these directions would be a monumental task
but could offer unprecedented insight into the human body.
In 1990, a consortium of 20 international research centers
embarked on the world’s largest biological collaboration
to accomplish this mission.
The Human Genome Project proposed to sequence the entire human genome
over 15 years with $3 billion of public funds.
Then, seven years before its scheduled completion,
a private company called Celera announced that they could accomplish the same goal
in just three years and at a fraction of the cost.
The two camps discussed a joint venture, but talks quickly fell apart
as disagreements arose over legal and ethical issues of genetic property.
And so the race began.
Though both teams used the same technology to sequence the entire human genome,
it was their strategies that made all the difference.
Their paths diverged in the most critical of steps:
the first one.
In the Human Genome Project’s approach,
the genome was first divided into smaller, more manageable chunks
about 150,000 base pairs long
that overlapped each other a little bit on both ends.
Each of these fragments of DNA
was inserted inside a bacterial artificial chromosome
where it was cloned and fingerprinted.
The fingerprints showed scientists where the fragments overlapped
even though the actual sequences were still unknown.
Using the overlapping bits as a guide,
the researchers marked each fragment’s place in the genome
to create a contiguous map,
a process that took about six years.
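
The mapping step can be pictured with a toy sketch. It assumes each clone's fingerprint has been reduced to a set of band sizes, and that two clones overlap when they share enough bands. The clone names, band values, threshold, and the function order_clones are all made up for illustration; this is not the consortium's actual software.

```python
from itertools import combinations

def order_clones(fingerprints, min_shared=2):
    """Order clones along a line using shared fingerprint bands.

    fingerprints: dict mapping a clone name to a set of band sizes
    (hypothetical toy data). Assumes the clones form one simple chain.
    """
    # Two clones are treated as overlapping if they share enough bands.
    neighbors = {name: set() for name in fingerprints}
    for a, b in combinations(fingerprints, 2):
        if len(fingerprints[a] & fingerprints[b]) >= min_shared:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # In a simple chain, the two end clones have exactly one neighbor.
    ends = sorted(n for n, nb in neighbors.items() if len(nb) == 1)
    if not ends:
        raise ValueError("clones do not form a simple chain")

    # Walk from one end, always stepping to the unvisited neighbor.
    order = [ends[0]]
    seen = {ends[0]}
    while True:
        unvisited = sorted(n for n in neighbors[order[-1]] if n not in seen)
        if not unvisited:
            break
        order.append(unvisited[0])
        seen.add(unvisited[0])
    return order

# Toy fingerprints, given in scrambled order.
toy_fingerprints = {
    "C": {50, 60, 70, 80},
    "A": {10, 20, 30, 40},
    "D": {70, 80, 90, 100},
    "B": {30, 40, 50, 60},
}
print(order_clones(toy_fingerprints))  # ['A', 'B', 'C', 'D']
```

Notice that the order falls out of the shared bands alone; no clone has been sequenced yet.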
The cloned fragments were sequenced in labs around the world
following one of the project’s two major principles:
that collaboration on our shared heritage was open to all nations.
In each case, the fragments were randomly broken up
into small, overlapping pieces about 1,000 base pairs long.
Then, using a technology called the Sanger method,
each piece was sequenced letter by letter.
This rigorous map-based approach called hierarchical shotgun sequencing
minimized the risk of misassembly,
a huge hazard of sequencing genomes with many repetitive portions,
like the human genome.
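
To see why repeats are such a hazard, here is a small sketch that counts where a short read could be placed in a made-up sequence. The sequence, the reads, and the function placements are invented for illustration only.

```python
def placements(genome, read):
    """Return every position where the read matches the genome exactly."""
    return [i for i in range(len(genome) - len(read) + 1)
            if genome.startswith(read, i)]

# A made-up sequence containing the same repetitive block twice.
repeat_block = "ACGACGACG"
toy_genome = "GGTCA" + repeat_block + "TTAGC" + repeat_block + "CCATG"

# A read from unique sequence has exactly one possible home.
print(placements(toy_genome, "TTAGC"))   # [14]

# A read from inside the repeat fits in four places, so overlaps
# alone cannot say which copy it came from.
print(placements(toy_genome, "ACGACG"))  # [5, 8, 19, 22]
```

With a map, each read is already known to come from one 150,000-base-pair chunk, which shrinks the number of places it could possibly belong.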
The consortium’s “better safe than sorry” approach
contrasted starkly with Celera’s strategy called whole genome shotgun sequencing.
It hinged on skipping the mapping phase entirely,
a faster, though foolhardy, approach according to some.
The entire genome was directly chopped up
into a giant heap of small, overlapping bits.
Once these bits were sequenced via the Sanger method,
Celera would take the formidable risk of reconstructing the genome
using just the overlaps.
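
Rebuilding a sequence from overlaps alone can be sketched with a simple greedy merge: repeatedly join the two reads that overlap the most. This is a minimal illustration on a made-up 16-letter sequence, not Celera's actual assembler; the function names and the min_len threshold are assumptions.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join; the pieces stay separate
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads

# Three overlapping reads from a made-up sequence, in scrambled order.
toy_reads = ["CTGAAGTC", "ATGCGTAC", "GTACCTGA"]
print(greedy_assemble(toy_reads))  # ['ATGCGTACCTGAAGTC']
```

On this toy input the overlaps are unambiguous. A greedy merge like this one can join the wrong pieces when the same stretch appears more than once, which is the risk the mapping phase was meant to reduce.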
But perhaps their decision wasn’t such a gamble
because guess whose freshly completed map was available online for free?
The Human Genome Consortium’s,
in accordance with the project’s second major principle,
which held that all of the project’s data
would be shared publicly within 24 hours of collection.
So in 1998, scientists around the world
were furiously sequencing lines of genetic code
using the tried and true, yet laborious, Sanger method.
Finally, after three exhausting years of continuous sequencing and assembling,
the verdict was in.
In February 2001, both groups simultaneously published
working drafts of more than 90% of the human genome,
several years ahead of the consortium’s schedule.
The race ended in a tie.
The Human Genome Project’s practice of immediately sharing its data
was an unusual one.
It is more typical for scientists to closely guard their data
until they are able to analyze it and publish their conclusions.
Instead, the Human Genome Project accelerated the pace of research
and created an international collaboration on an unprecedented scale.
Since then, robust investment in both the public and private sector
has led to the identification of many disease-related genes
and remarkable advances in sequencing technology.
Today, a person’s genome can be sequenced in just a few days.
However, reading the genome is only the first step.
We’re a long way from understanding what most of our genes do
and how they are controlled.
Those are some of the challenges
for the next generation of ambitious research initiatives.