Nesting dolls and Narnia

In genomics, what looks like the end is often just the beginning.

If movies were true to real life, matryoshka dolls would provide a good analogy for genomics research. Open one doll and a smaller one nestles within, like answers gained in a line of study. Finally, the last, smallest doll emerges, representing the final piece of knowledge that ties everything together. Eureka! cries the scientist, and medical care surges forward, lives are saved and everyone lives happily ever after. Sadly, it’s almost never such a straightforward situation.

The reality is more akin to Narnian lore, in which entire universes are revealed within what looks like an ordinary wardrobe or shed. Open the door expecting to see a confined space, and you might instead find yourself exploring an alternate world. Technological advances have allowed us to probe ever deeper into our own biological complexities, but expected ends often turn into beginnings. Instead of tidy answers we’re finding more questions. Sometimes an entire field of study emerges.

Genomics provides many examples. For instance, consider the completion of the Human Genome Project in 2003. Before it was published, the first full human genome sequence was thought to be a sort of road map for human health, carrying sufficient insight in its linear string of letters to answer many long-held questions about human biology. It was a huge achievement without doubt, but in most ways it was not an ending but rather the start of something far larger. Our genomes, it turns out, are worlds unto themselves, and around every corner there’s something new and bigger than expected to explore. We can learn far more, far more quickly, than we could only a few years ago, and yet it still feels like we’re only scratching the surface of what there is to know.

Such thoughts have been running through my head all year and came to the fore at the recent American Society of Human Genetics meeting. The sessions were awash in genomic data, with more samples and better standardization than ever before. There’s a new scale in modern research, and many talks presented insights gained from large population data sets. Yet it’s not enough. It’s clear that one genome or ten or even a thousand are nowhere near sufficient to assess variant impact and disease susceptibilities, so now there are collections of 141,000+ genomes and exomes (gnomAD), the 100,000 Genomes Project, and many other repositories with tens of thousands of genomes, exomes and more. (The 100,000 Genomes Project, by the way, has just sequenced its 100,000th genome.) But the collections, and the analyses based on them, will need to continue to grow over time. Why? Because our genomes are so wildly different from each other, even between “healthy” people, that finding the signal in the noise is extremely hard to do. And those differences go far beyond sequence variation.

Take structural variation, the focus of a growing amount of research. Structural variants involve DNA segments that are duplicated, deleted, inverted or inserted in the genome. Because they rearrange whole segments rather than change individual letters, structural variants were largely invisible to standard sequencing methods until after the Human Genome Project concluded. Efficient, reliable methods for finding them have only been developed recently. Genome sequences are long, no doubt, but when represented as a linear string of letters (ACGTGATTACA) they also appear pretty simple. They are anything but. Structural variants multiply, flip, and sometimes scramble up those nice, neat lines of letters, and they are emerging as a significant source of genomic variation. Indeed, they contribute more variation and are perhaps more important in health and disease than the more easily detected and more widely recognized single nucleotide variants. But research into them is still in its early days.

Another interesting trend that is perhaps related to both expanded capability and unexpected complexity is a return to genome-wide association studies, or GWAS. While early GWAS work succeeded in turning up large numbers of genomic variants associated with human disorders or diseases, it largely failed to translate them into actionable medical insight. One might say that it provided many beginnings but few ends, at least in a clinical context. Now, with the benefit of hindsight from the earlier projects, not to mention larger data collections and more advanced analysis tools and methods, researchers are returning to GWAS for a new look at genes or genetic loci associated with disease, especially complex diseases such as type 2 diabetes. Although many researchers remain skeptical, there is far more statistical power and more ability to follow up findings (using CRISPR to recreate variants in model organism research, for example), and using GWAS to find target genes for therapeutics has re-emerged as a research strategy.

The work is important, because in recent years the gulf between genetics and genomics research and medicine has narrowed substantially. Genomic medicine is no longer limited to a few rare disease cases and science fiction; it’s becoming more mainstream every day. But it’s fundamentally different from one-size-fits-all, population average-based medicine, and much more difficult to implement. In research, questions begetting questions and new beginnings are exciting. In medicine, they’re dangerous. And costly. The more boundaries we can draw around the complexities, the better off patients will be. It’s a tantalizing prospect that has drawn a lot of interest and a lot of investment. Yet even tech giants such as Google and Apple, with their brilliant people and formidable resources, may be underestimating the task at hand. Their initiatives are exciting, but there have been signs that they too expected matryoshka dolls and found Narnia instead.

The words of Sean Parker, a tech executive who made billions in Silicon Valley and has himself invested a great deal in cancer research, resonate as we push forward into the frontiers of genomics and human biology. “… tech people,” he noted, “… so dramatically underestimate the complexity of the human body. It’s not designed by us. It doesn’t work in ways that make sense.” Kind of like a wardrobe that opens up into a snowy forest.


Mark Wanner followed graduate work in microbiology with more than 25 years of experience in book publishing and scientific writing. His work at The Jackson Laboratory focuses on making complex genetic, genomic and technical information accessible to a variety of audiences. Follow Mark on Twitter at @markgenome.