Biomedical science is capitalizing on rapid advances in tools to explore and manipulate genomes
These days, it doesn’t take a rocket — or genome — scientist to recognize that our capacity for exploring genomes is at an all-time high. In laboratories here at The Jackson Laboratory (JAX) and across the world, increasingly efficient machines churn out endless strings of genetic information; precise tools home in on specific spots in the genetic code and can even make corrections; and robust analytical methods help make sense of a morass of data. Propelled by these capabilities, scientists are uncovering clues within our DNA that shed light on the mysteries of biology and help unravel the complexities of disease. Studies in this field have already had a huge impact on the world, from CRISPR research winning the 2020 Nobel Prize in chemistry to genetic research leading the charge against the COVID-19 pandemic.
Technologies that fuel this mind-bending pace of discovery are still buzzing and humming away. Three methods in particular are worthy of notice: high-throughput genome sequencing, CRISPR, and single-cell genomics. Here, we offer a brief primer on these approaches, describing how they emerged, how they work, and how they are transforming biology and medicine.
High-throughput genome sequencing
The tools for reading — or “sequencing” — the chemical letters that make up our DNA have evolved rapidly over the last two decades. Researchers can now gather information more quickly and at a lower cost than ever before.
What’s behind this rapid change? Well, in short, entirely new ways of reading DNA that diverge from the standard “first-generation” approach, known as the Sanger method. Named for its inventor, Frederick Sanger, this kind of sequencing was the scientific workhorse of the Human Genome Project (HGP), a sweeping, international effort to decode the full human genetic blueprint, which culminated with the completion of an initial draft genome sequence in 2001. With a length of roughly 3 billion bases, the first full human genome sequence took about ten years and nearly $3 billion to complete. To determine the order of the chemical letters (or “bases,” abbreviated as A, C, G, and T) that make up the genome, the Sanger method creates new copies of the DNA target of interest. The raw material for these copies comes from bases that carry special modifications, such as fluorescent tags that glow a different color depending on the type of base — for example, green for A, red for T, and so on. As these modified bases are incorporated into the newly synthesized DNA strand, it becomes possible to decipher the sequence.
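To make that idea concrete, here is a toy Python sketch of the logic (not any real instrument’s software): chain-terminated copies of every possible length are sorted by size, and the dye color on each copy’s final base reveals the sequence one letter at a time. The color-to-base mapping below is an illustrative assumption.

```python
# Toy illustration of the Sanger idea: copies terminated at every position,
# sorted by length, are "read" from the dye color on their final base.
# Purely illustrative -- real base calling works from chromatogram signals.

DYE_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}   # assumed color scheme
BASE_TO_DYE = {base: dye for dye, base in DYE_TO_BASE.items()}

def simulate_sanger(template: str) -> str:
    # Chain-terminating bases halt copying, yielding one fragment per position.
    fragments = [template[:i] for i in range(1, len(template) + 1)]
    # Capillary electrophoresis separates the fragments by size, shortest first.
    fragments.sort(key=len)
    # The detector records each fragment's terminal dye; mapping dye -> base
    # reconstructs the sequence one position at a time.
    colors = [BASE_TO_DYE[frag[-1]] for frag in fragments]
    return "".join(DYE_TO_BASE[color] for color in colors)

print(simulate_sanger("GATTACA"))   # prints GATTACA
```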
Eventually, however, Sanger sequencing hit a technological wall. If sequencing whole genomes was to become more commonplace (and take less time and money), an entirely new approach was needed. Enter so-called “next-generation” sequencing (NGS), which relies on different kinds of chemistry than Sanger sequencing.
While multiple NGS methods emerged that each differed in the nitty-gritty details, most shared a handful of key properties. First, instead of aiming to maximize read length, they used what are known as short-read sequencing protocols, which first chop the genome into fairly short pieces of DNA, from as short as 50 bases to a few hundred bases. The technologies also miniaturized the sequencing process — decoding each piece of DNA within a very, very tiny space, allowing many other pieces of DNA to be sequenced simultaneously. (This is why these NGS methods are often called massively parallel sequencing.) Once the small pieces have been sequenced, they are assembled into the full sequence, using a previously assembled genome as a reference.
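As a rough illustration of the short-read idea, the following Python sketch (a simplification, not a production pipeline, which would use a real aligner such as BWA and handle sequencing errors) chops a sample sequence into many short reads and then places each read back onto a previously assembled reference by exact matching.

```python
# Minimal sketch of short-read sequencing and reference-based assembly.
# Real pipelines handle errors, repeats and structural variation;
# this toy version uses exact matching against the reference only.
import random

def shred(genome: str, read_len: int = 6, coverage: int = 5) -> list[str]:
    """Chop the genome into many short, randomly positioned reads."""
    n_reads = coverage * len(genome) // read_len
    return [genome[i:i + read_len]
            for i in (random.randrange(len(genome) - read_len + 1) for _ in range(n_reads))]

def map_reads(reads: list[str], reference: str) -> dict[int, str]:
    """Place each read at the position where it exactly matches the reference."""
    placements = {}
    for read in reads:
        pos = reference.find(read)
        if pos != -1:
            placements[pos] = read
    return placements

reference = "ACGTTAGCCGATTACAGGCT"   # previously assembled genome used as a guide
sample = reference                   # pretend the sample matches the reference
reads = shred(sample)
print(map_reads(reads, reference))   # reads placed back onto the reference by position
```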
Short-read technology provides fast, accurate and cost-effective sequencing, and for many years almost all genomic sequencing relied on it. But the need to sequence short pieces of DNA and then reassemble them comes with limitations. For example, long stretches of repetitive DNA are difficult or impossible to resolve, and important genomic features such as duplications, inversions and other so-called structural variants are difficult to detect. Therefore, a lot of effort has gone into developing long-read sequencing methods, which can sequence segments many thousands or even millions of bases long. So while short-read sequencing still works well for many purposes, long-read technologies are filling in gaps that short reads leave in genomic data.
The combined effect of these new technologies is that even as sequencing has become more and more accurate and powerful, costs have fallen. And driving down costs means scientists get more DNA sequence for their laboratory dollar. Now, it is economically feasible to sequence not one genome, but hundreds of them, even thousands.
This evolution has ushered in a new era of biomedicine, in which it is possible to probe the human genome on an individual level, revealing variations in one person’s DNA that may have significance for understanding disease biology and could even guide treatment. Many laboratories here at JAX are harnessing these new capabilities. For example, Charles Lee, Ph.D., FACMG, professor and scientific director at The Jackson Laboratory for Genomic Medicine, is a world leader in applying sequencing and other genome-scale technologies to reveal how individual genomes vary from one another. He is part of an international consortium that reported the completion of a pioneering sequencing project, the 1000 Genomes Project, which sequenced the genomes of over 2,000 people from 26 populations across the globe. More recently, Chia-Lin Wei, Ph.D., professor and director of genome technologies, has been seeking to more fully explore the complex structural variants of the cancer genome. A deeper understanding of these structural variants could have major implications for the treatment of cancerous tumors.
CRISPR
If sequencing is like reading DNA, then CRISPR provides the power to edit — to correct the typos, or “mutations,” that can arise in genomes — and to do so with an unprecedented level of precision. This genome-editing technology has taken the scientific world by storm with its breakneck pace of growth: The approach was first shown to work in mouse and human cells less than a decade ago and has already been applied to a wide range of biological systems and disease areas. Indeed, it has captivated researchers’ imaginations with the remarkable opportunities it opens up in the laboratory and is even moving into the clinic.
CRISPR’s power stems not only from its precision, but also its ease of use. Genome-editing experiments that previously took months, even years, to complete can now be done in a fraction of the time. Combined with the recent growth of DNA sequencing, which has led to a dramatic rise in the number of genes and gene mutations associated with disease, CRISPR packs a powerful, one-two punch — giving scientists the tools to study the biology behind these mutated genes and to correct them.
Short for Clustered Regularly Interspaced Short Palindromic Repeats, CRISPR’s name reflects its beginnings: a collection of DNA sequences, unusual for their highly repetitive nature. These sequences were first found in the bacterium Escherichia coli in the late 1980s and garnered little fanfare. But as more and more microbes’ genomes were sequenced, these strange repeats kept popping up. At first, they were dismissed as genomic junk, but researchers later came to appreciate their importance: sandwiched between the repeats are tiny snippets of DNA left behind by pathogens (specifically viruses) that had infected the bacteria. These remnants are a physical record, forming a kind of primitive immune system that enables bacteria to defend themselves against future viral attacks.
In the course of probing the ins and outs of this type of bacterial immunity, two important features emerged. First, one of the CRISPR systems includes an enzyme, called Cas9, which can cut DNA. Second, the system is programmable, meaning it can be directed to precise spots in the genome by virtue of a special guide molecule made of RNA. These discoveries helped launch what has become arguably the hottest technology in biomedicine since the dawn of recombinant DNA.
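A small Python sketch can illustrate what “programmable” means in practice: the commonly used Cas9 is steered by a roughly 20-base guide sequence and cuts only where a matching stretch of DNA sits next to an “NGG” PAM motif. The DNA, the guide and the cut-site offset below are simplified assumptions for illustration, not a model of any particular experiment.

```python
# Toy model of CRISPR-Cas9 target finding: scan a DNA sequence for a site that
# matches the guide RNA's 20-base target and is followed by an "NGG" PAM.
# Simplified: one strand only, exact matching, no off-target scoring.

def find_cas9_sites(dna: str, guide_target: str) -> list[int]:
    sites = []
    g = len(guide_target)
    for i in range(len(dna) - g - 2):
        protospacer = dna[i:i + g]
        pam = dna[i + g:i + g + 3]
        if protospacer == guide_target and pam[1:] == "GG":   # PAM = NGG
            # Cas9 typically cuts about 3 bases upstream of the PAM.
            sites.append(i + g - 3)
    return sites

dna = "TTACGGATCGTTGCAATGCCTAGCTAAGGTTTT"      # hypothetical target region
guide_target = "ATCGTTGCAATGCCTAGCTA"           # hypothetical 20-base guide sequence
print(find_cas9_sites(dna, guide_target))       # positions where Cas9 would cut
```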
Although CRISPR is indeed remarkable, the technology is relatively new, and the ethics of certain applications — such as editing out harmful traits or adding desirable traits in humans — are still being explored. But while there is intense interest — and understandable concern — about its use in humans, it is a revolutionary tool for research. JAX assistant professor Albert Cheng, Ph.D., is a leader in the development and use of genome editing tools, including CRISPR. Working at JAX, Cheng has created new methodology using CRISPR technology that can specifically target messenger RNA splicing, thus widening the potential uses of CRISPR. So, while the scientific community plots a careful and measured course with respect to CRISPR’s applications in the clinic, the genome-editing tool will surely continue to blaze new trails in the laboratory.
Single-cell genomics
The human body is made up of nearly 40 trillion cells and roughly 200 different cell types. Amidst this significant diversity, scientists have typically explored cells in bulk. Rather than examining just a single cell, researchers analyze thousands or millions at a time. And that means what can be gleaned from an experiment usually reflects an entire population of cells, rather than one particular cell.
One reason for this lack of individuality is technical — the amount of DNA (or RNA or protein) that can be extracted from a single cell is often not enough to support genome-scale analyses. Yet there are big questions in biology that stem from single cells. Cancer, for example, begins when the DNA of a cell is damaged (or “mutated”) in a way that allows it to grow out of control, leading to many other rogue cells and the formation of tumors. But cancer isn’t the only area where a deep knowledge of cells as individuals could be beneficial. The brain, the immune system, blood — simply put, many, if not all, of the body’s systems are built on the concept of cellular diversity. And understanding how this diversity is programmed, through changes in DNA, RNA and beyond, is an essential piece in the vast puzzle of human biology.
Recent advances in the techniques for isolating single cells, together with methods for amplifying their genetic material, now make it possible to explore the genomes of single cells. With the birth of this new field, aptly known as single-cell genomics, scientists can probe the full complement of DNA (or even RNA) that exists within a cell.
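To picture what single-cell data look like, the toy Python sketch below builds a tiny cells-by-genes count matrix and groups the cells by their expression profiles. The marker genes, the counts and the use of k-means clustering are illustrative assumptions only, not the workflow of any particular study.

```python
# Toy single-cell expression matrix: rows are individual cells, columns are genes.
# Bulk sequencing would average these rows together; single-cell methods keep
# each cell's profile separate, so distinct cell types can be told apart.
import numpy as np
from sklearn.cluster import KMeans

genes = ["CD3E", "CD19", "ALB"]          # illustrative marker genes
counts = np.array([
    [120,   2,   0],    # cell 1: resembles a T cell (high CD3E)
    [115,   5,   1],    # cell 2: another T cell
    [  3, 140,   0],    # cell 3: resembles a B cell (high CD19)
    [  1, 133,   2],    # cell 4: another B cell
    [  0,   1, 210],    # cell 5: resembles a liver cell (high ALB)
])

# Averaging across cells (the "bulk" view) hides the distinct cell types:
print("bulk average per gene:", counts.mean(axis=0))

# Clustering the per-cell profiles recovers the three groups:
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(counts)
print("cluster assignment per cell:", labels)
```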
Recognizing the remarkable opportunities to apply single cell technologies to major questions in biology and medicine, JAX launched a joint center for single cell genomics together with the University of Connecticut, including UConn Health, in 2015. Associate Professor Paul Robson, Ph.D., who serves as JAX Genomic Medicine's director of single cell genomics, believes the work of the new center will be critical to advancing the goals of precision medicine: “If you want better insight into how biology works, you need to look at its fundamental unit,” he says.
Single cell genomics methods are important for many of the studies being done at JAX. Recently, Professor Jacques Banchereau, Ph.D., led a massive effort to see what distinguishes lupus patients from their disease-free peers. Lupus remains extremely difficult to diagnose and treat, but after profiling hundreds of thousands of cells from lupus patients and healthy controls, Banchereau and his team identified a distinctive gene expression signature in the lupus patients. The work provides a road map for future therapeutic development. And Associate Professor Gareth Howell, Ph.D., is using the technology to further Alzheimer's research. In particular, Howell has examined a variety of mouse strains at the cellular level to determine how mouse models can be better used to develop targeted therapeutics for Alzheimer's disease.
Although the field of single cell genomics is still fairly new, it is already becoming clear that cells once thought to be genetically similar, if not identical, are in fact quite different. For example, cancer cells are not the only cells that can acquire changes in their DNA. These kinds of somatic (that is, non-inherited) mutations also appear in neurons, and could play a role in epilepsy, autism and other disorders of the developing brain. It is also possible that these genetic differences are somehow required for normal neuronal development. To be sure, we have only just scratched the surface in identifying and understanding the diversity that lies within our cells.
Indeed, as we ride the waves of single cell genomics — and genome sequencing and editing, too — we can see further than ever before. That means our biomedical knowledge is more advanced and more precise than ever, and it’s growing at a phenomenal pace. At the same time, we must also remember that there is still much more to learn.
Nicole Davis, Ph.D., is a freelance writer and communications consultant specializing in biomedicine and biotechnology. She has worked as a science communications professional for nearly a decade and earned her Ph.D. studying genetics at Harvard University.