If genes could talk

More than a million years ago, snow and ice buried a mammoth’s decaying body in Siberia. Over time, most of the “Krestovka mammoth” vanished; first its soft tissues and later even its bones became food for bacteria and bugs. Some teeth remained, however, and a Russian paleontologist unearthed a few in 1970. For decades, they sat in museum collections as curiosities, too old to carbon date. Scientists presumed any lingering organic material long gone.

However, in research published in February 2021 in the scientific journal Nature, an international team of scientists including Beth Shapiro, professor of ecology and evolutionary biology, turned to modern genomics to reveal more about this ancient mammoth and its relatives. Drilling a miniscule amount of material—about a pinch of salt’s worth—out of one molar, the team was able to isolate tattered bits of DNA from the sample. Their sequencing of this and other mammoth DNA showed that the Krestovka specimen represented a previously unknown genetic lineage of mammoths, upending previous hypotheses about mammoth evolution. Based on its sequenced genes and how much they varied from those of more modern mammoths, the team estimated that the animal lived about 1.2 million years ago. Before this effort, no one had ever sequenced genes more than a million years old.

Until recently, such a genomic dive into the deep past was considered impossible. But gene sequencing continues to get better, faster, and cheaper, allowing scientists, like Shapiro and others at UCSC, to develop new ways to piece together and analyze genes from even the most fragmented DNA specimens. The advances are letting researchers solve long-standing mysteries about evolution, migration, and diversity—in both animals and humans. They’re also a boon for understanding and diagnosing genetic diseases and helping criminal investigators solve cold cases. “Even a single genome can tell us all kinds of things about the broader population it belongs to,” said Shapiro. “It really can paint a picture of the past.”

Dog-chewed

Researchers in the UCSC Paleogenomics Lab analyzed DNA from 38 bison hide moccasins, like this one, worn more than 800 years ago by cave dwellers in what is now Utah. They discovered that nearly all of the moccasins, which were well preserved in dry caves for centuries, came from the hides of female bison. The researchers hypothesized that the Indigenous people of the American plains may have purposefully targeted female bison for moccasin manufacture. Credit: Daderot (public domain).

When people swab the inside of their cheeks with test kits like those from 23andMe and ancestry.com, they collect—among other things—strands of genetic material, each containing hundreds of millions of relatively intact nucleotide base pairs, the building blocks of DNA. Sequencing such DNA samples is straightforward. But immediately after an organism dies, the natural world around them starts breaking down their components, including their DNA. Further muddling the genomic picture, microbes and other decaying plants and animals seep into the heap, adding their DNA remnants.

If sequencing DNA from a living person is like putting together a thousand-piece puzzle per the picture on the box cover, then sequencing ancient DNA is like solving such a puzzle after your dog chews up each puzzle piece, someone mixes those pieces with dog-chewed pieces from several other puzzles, and you lose the box. “To sequence ancient genomes, we need to take dirty, crumbled up DNA and transform it into molecules that can be sequenced,” said Shapiro.

Shapiro and her husband Richard Edward Green, associate professor of biomedical engineering, now routinely perform this difficult task in the UCSC Paleogenomics Laboratory they run together. The work involves tedious washing, isolating, and handling to get rid of contaminants, gene amplification to make millions of copies of tiny bits of DNA, and powerful computer programs that assemble the short sequences into more complete gene readouts. “We need specialized facilities where we don’t touch our samples,” said Shapiro. “We don’t even breathe on them—if our DNA gets into this ancient DNA confetti, it makes our experiments much harder.”

Assessing admixing

DNA from mammoth teeth, like the one shown here in the Palatinate Museum of Natural History in Germany, let scientists at the UCSC Paleogenomics Lab and their collaborators discover how two different lineages of mammoths interbred to create an entirely new species of mammoth more than 420,000 years ago. Credit: BlueBreezeWiki (CC BY-SA 3.0).

Work at the Paleogenomics Laboratory in the last few years has sequenced not only the genomes of mammoths but also those of ancient bears, dogs, bison, horses, and Neanderthals. A key question tying many of the projects together is what genetically defines a species. “As humans, we have this proclivity to put things into boxes—a polar bear is a polar bear, and a brown bear is a brown bear,” said Shapiro. But this view is too simple. “Speciation typically doesn’t happen overnight.”

In the genomes of nearly all the ancient animals Shapiro and others have studied, the results indicate that those from different lineages often interbred even after beginning to separate into distinct species, a phenomenon called admixing. The result? Brown bears have snippets of polar bear DNA, bison have cattle in their genomes, and modern humans share a surprisingly large amount of DNA with Neanderthals.

By comparing the genomes of ten pumas from sites across North and South America, Beth Shapiro, Ed Green, and their colleagues at the UCSC Paleogenomics Laboratory assembled a puma family tree. North American pumas, they calculated, diverged from their South American relatives more than 100 thousand years ago. The results also showed that Florida panthers, like the one seen here, remain inbred even after cross-breeding efforts in the 1990s. Credit: U.S. Army Corp of Engineers (CC BY 2.0).

Cataloguing these genetic admixtures can help scientists understand how ancient populations arose and declined. But it also can prove valuable in conservation efforts to help save struggling species from extinction. In some cases, preventing admixing might be critical to ensuring the survival of a rare species—if two species interbreed too frequently, the unique traits of one may completely fade away. More often, though, endangered populations suffer from interbreeding between close relatives, which can harm a population by increasing the prevalence—and impact—of detrimental genes.

To make at-risk species healthier and more likely to survive long term, conservationists sometimes encourage the breeding of isolated populations with closely related animals from the outside. Scientists successfully boosted Florida’s dwindling panther population in the 1990s, for instance, by cross-breeding the Florida animals with Texas cougars. Having genetic information on each species helps conservationists choose which populations—or even individual animals—to breed with each other, which can be critical to the success of such efforts. “The past is a completed evolutionary experiment,” said Shapiro. “But when we try to devise strategies to protect species, we're making a lot of guesses. By looking into the past, we can better understand what has already worked.”

Seeing differences

Coming to terms

In the past several decades, gene-editing technologies recently and notably the 2020 Nobel Prize–winning CRISPR (shorthand for Clustered Regularly Interspaced Short Palindromic Repeats)—have become mainstream laboratory tools, used by scientists to re-engineer the genomes of animals, plants, and even human patients. For some people, the idea of purposefully altering an organism’s genome sounds like spooky science fiction or a recipe for disaster. For Professor Beth Shapiro, however, it sounds like what humans have already done for 50,000 years. “Humans have been changing other species as long as we’ve existed,” she said, “in all sorts of both accidental and very purposeful ways.”

This history, Shapiro said, is typically not appreciated by people wary of new genetic technologies. In her recently published Life as We Made It (Basic Books, 2021), she details the long history of how humans have shaped nature and continue to do so. She also seeks to counteract misinformation about the capabilities of genomic engineering. “If we want a future that is both biodiverse and filled with people,” she said, “we must come to terms with these biotechnologies.”

Published in October 2021, Shapiro’s Life as We Made It (Basic Books) has been well-received: in March 2022, it was announced as a nonfiction finalist for the California Book Award. Writing the book, Shapiro said, was a welcome distraction from the pandemic. Credit: Basic Books (public domain).

The genomic science advances enabling scientists to study ancient DNA and help endangered animals also have vast promise to improve human health, said Benedict Paten, associate professor of biomolecular engineering and associate director of the UCSC Genomics Institute. Some of that promise is clearcut. For example, as reported in February 2022 in the New England Journal of Medicine, Paten and collaborators showed that rapid sequencing of the entire genomes of critically ill patients uncovered genetic variations that impacted the treatment of five of the dozen subjects enrolled in the study. But other implications are less straightforward. Understanding human genetic evolution, Paten said, can shed light on how today’s diversity arose and how it impacts human health, which could help address the increasingly acknowledged—yet quite elusive—problem of systemic racial bias in medicine.

Paten chairs the Genome Alignment and Annotation Committee of the Vertebrate Genomes Project, a group of more than 150 scientists, each with their own funding, aiming to collaboratively sequence at least one genome from 66,000 vertebrate species, including mammals, birds, reptiles, amphibians, and fish. Like Shapiro and Green, Paten is not only interested in the stand-alone genomes of single species, but how these genomes compare and have evolved and diverged over time. “On their own, genomes can be incredibly interesting but they’re also snapshots in time in a long evolutionary history,” Paten said. Comparing the genomes of different species can reveal new genes and proteins that could lead to the discovery and development of a broad range of innovative items, from medicines to adhesives or even electronics, said Paten. “Every species contains unique genes that can be powerful in terms of understanding biology and giving us new tools.”

In addition, understanding how and why one gene has evolved to vary between species while another has remained stubbornly stagnant can tell Paten and other researchers much about the importance—and flexibility—of that gene. Paten gauges this flexibility by comparing variation between modern species, while Shapiro and researchers in the Paleogenomics Lab often try to assess the same thing by analyzing the genomes of ancient species. In July 2021, for instance, the journal Science Advances published Shapiro and Green’s research comparing modern human and Neanderthal genomes to see which parts of the modern human genome were never—in any individuals—derived from Neanderthal ancestry. Less than 7% of the modern human genome, they discovered, is unique and never of Neanderthal origin. “Some traits move between species and become advantageous, but others don’t, because they can’t. They’re not compatible,” said Shapiro, adding that knowledge of a gene’s natural variation could eventually prove helpful to bioengineers attempting to repair or improve its function.

By sequencing ancient DNA from these bones of horses that lived in North America during the last Ice Age, scientists at the UCSC Paleogenomics Lab shed new light on how different horse populations diverged—and then later crossed paths again. Eurasian and North American horses, they discovered, crossed the Bering Land Bridge repeatedly over the course of hundreds of thousands of years. Understanding how, when, and where these populations evolved has implications for managing today’s wild horse populations. Credit: Alisa Vershinina/Paleogenomics Lab, with permission.

Pushing envelopes

CSI: Santa Cruz

In 1980, an off-duty police officer and his brother discovered the body of a murder victim on the side of the road in Henderson, Nevada. She became known as “Arroyo Grande Jane Doe,” and for decades investigators had no leads on who she was or how she was killed. Then, in 2019, a clip of her hair made its way to Astrea Forensics, a Santa Cruz–based startup headed by chief executive officer (CEO) Kelly Harkins Kincaid, a former postdoc of Ed Green’s in the Paleogenomics Laboratory.

In May 2017, after he first applied his expertise in assembling DNA to determine the identity of an unknown body, Green started receiving dozens of phone calls from law enforcement agencies requesting help in analyzing crime scene DNA. It turns out the same techniques that can piece together degraded DNA from ancient bones also make powerful tools for reassembling the minuscule amounts of DNA present in hair, skin, or semen samples from years-old crime scenes. Prompted by this interest, Green and Harkins Kincaid founded Astrea, with Arroyo Grande Jane Doe’s murder the first crime addressed by the company. In late 2021, the Henderson Police Department announced that, thanks in part to the Astrea sequencing, they’d finally put a name to the victim: Tammy Corrine Terrell.

“I realized that I really liked thinking about what these technologies could be applied to other than ancient DNA,” said Harkins Kincaid, who also wears the founding CEO hat of Claret Bioscience, which employs some of the same Paleogenomics Lab techniques to help process degraded DNA in blood sampled for medical testing.

Two brothers discovered the body of 17-year-old Tammy Corrine Terrell in Henderson, Nevada in 1980, but investigators couldn’t identify her until December 2021, after Santa Cruz–based startup Astrea Forensics sequenced DNA from a strand of her hair. Despite the identification, her murder remains unsolved. Credit: Henderson Police Department (CC BY-SA 4.0).

If genomes could talk, each human’s would sound different. Diversity in ancestry, race, disease risk, metabolism, and a plethora of other features, from personality to appearance, are all captured in our DNA. But for the past nearly 20 years since the milestone completion of the Human Genome Project in 2003, researchers have used a single reference genome—based almost entirely on the genome of a single individual from Buffalo, New York—as the template for all other human genomes. “That’s a problem,” said Karen Miga, Paten’s colleague, and an assistant professor of biomedical engineering. “The reference genome gives us only one snapshot of a human genome when we all know that our genomes differ, sometimes in very large ways.”

When researchers analyze a person’s genome, they use that reference genome as a point of comparison. Places where an individual’s genes vary greatly from the reference may indicate causes of disease, for instance. But for some populations, the current reference genome may not be a fair place to start—it may not capture the norm at all.

Like Shapiro, Green, and Paten, Miga is pushing the envelope on how to obtain the most detailed information from genomes and capture the differences between them. Miga co-chaired the Telomere-to-Telomere Consortium, an effort—just recently completed and reported in six papers published in the March 31, 2022 issue of Science—to fill in the gaps that remained in the original reference genome. She also co-chairs working groups within the broader Human Pangenome Project, which aims to create a more global reference genome that better represents human genomic diversity. In all this work, Miga uses multiple new technologies, including nanopore sequencing, that let researchers piece together much larger chunks of the genome at a time—akin to completing a puzzle with a hundred pieces instead of one with a thousand pieces. This makes it easier to figure out the many repetitive stretches of DNA within people’s genomes—a challenge that’s a bit like assembling a large blue sky of a puzzle.

The rapidly improving technology, Miga said, doesn’t just benefit the Pangenome Project, but gene sequencing for all other purposes. This broad applicability rings true for all the researchers. The same breakthroughs that allow them to determine how horses or wolves evolved can also help them better diagnose patients with rare diseases, identify criminals, and engineer improved food crops. “A lot of basic research has a very long tail,” said Paten. “Genomics often asks some fairly technical questions about how we can assemble the whole genome and all its variations, but the progress we make on those questions serves us well in many, many other areas.”

Dog-chewed

Assessing admixing

Seeing differences

Pushing envelopes

Further Inquiry

In the News