JAX researchers used long-read mRNA sequencing to catalog all mRNAs in primary breast cancer samples and revealed previously undocumented mRNA isoforms that can change protein function and may contribute to cancer progression and therapy response.
The information contained in a gene is decoded through a multi-step process that starts with transcription of the DNA into an RNA molecule that is then spliced. Some segments, introns, are removed, and the rest, exons, are connected to form a mature messenger RNA (mRNA) that is then translated to generate a protein. Alternative RNA splicing, in which different introns and exons may be included or excluded, is part of normal gene expression and cellular function and leads to the generation of alternative versions of mRNA molecules called spliced isoforms.
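As a rough illustration of the splicing step described above, the toy Python sketch below joins exon segments from a made-up pre-mRNA string and shows how skipping one exon yields a different isoform. The sequence, the exon coordinates, and the `splice` helper are all hypothetical and chosen only for illustration. Real splicing depends on splice-site signals and regulatory factors that this sketch ignores.

```python
def splice(pre_mrna, exons):
    """Join the selected exon segments; everything else (the introns) is dropped."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical pre-mRNA: uppercase segments stand for exons, lowercase for introns.
pre_mrna = "AUGGCC" + "guaagu" + "UUCGGA" + "guccag" + "GAUUAA"

# Hypothetical exon coordinates as (start, end) half-open intervals.
exons = [(0, 6), (12, 18), (24, 30)]

# Isoform 1: all three exons are included.
full_isoform = splice(pre_mrna, exons)

# Isoform 2: the middle exon is skipped, one common form of alternative splicing.
skipped_isoform = splice(pre_mrna, [exons[0], exons[2]])

print(full_isoform)
print(skipped_isoform)
```

Both strings come from the same gene sequence, yet they would be translated into different proteins, which is why cataloging isoforms matters.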
Traditional RNA sequencing methods sequence only short mRNA segments at a time and, as a result, don’t detect complex alternative splicing events. Yet splicing aberrations have been associated with diseases such as cancer, making it vital to identify spliced isoforms to better understand disease states. A Jackson Laboratory (JAX) research team led by Professor Jacques Banchereau, Ph.D., Assistant Professor Christine Beck, Ph.D., and Assistant Professor Olga Anczukow, Ph.D., and including first authors Computational Scientist Diogo Veiga, Ph.D., and Predoctoral Associate Alex Nesta, examined the splicing landscape in human breast tumors using long-read mRNA sequencing, a novel technology that captures full-length isoforms. In “A comprehensive long-read isoform analysis platform and sequencing resource for breast cancer,” published in Science Advances, the team presents data obtained from 26 breast cancer and four normal breast tissue samples, the largest such data set to date. Their findings would be largely missed with traditional short-read sequencing approaches in these cancers. The results underscore the biological and clinical relevance of characterizing full-length mRNA isoforms in cancer.
Capturing important differences
Using Pacific Biosciences long-read mRNA sequencing technology, the researchers sequenced the transcriptomes (all mRNA found within the cells) of the 30 samples with high accuracy. Their analyses enabled the identification of 142,514 unique full-length isoforms, two-thirds of which are absent from the current reference human transcriptome. Additional analysis revealed that oncogenes (genes that are known to promote cancer) and breast cancer pathways are significantly overrepresented in the tumor-associated isoforms, while tumor suppressors are underrepresented. Importantly, their catalog of novel isoforms was backed up by a number of other experimental data sets.
Interestingly, the novel isoforms were not just differentially spliced: more than 25,000 were predicted to affect the function and cellular localization of the resultant protein products. The researchers found 18 novel isoforms of the estrogen receptor, a clinical biomarker of hormone-positive breast cancers and a target for endocrine cancer therapies. Seven of these isoforms are predicted to lack the DNA-binding domain of this protein, potentially affecting therapy response. One of these isoforms has recently been linked to endocrine resistance and breast cancer proliferation, demonstrating the promising clinical impact of discovering tumor-associated isoforms using long-read mRNA sequencing.
Finally, a novel modeling approach allowed the interrogation of isoforms in data from other sources, including The Cancer Genome Atlas (TCGA), the largest collection of genomic data from human tumors. The team identified 3,059 tumor-specific alternative splicing events that are recurrent in breast cancers, and 35 of these events were correlated with differences in survival across breast cancer patients. While the alternative splicing events are mostly restricted to subpopulations of patients, several are recurrent and affect more than half of TCGA patients. Importantly, 21 of the events had not been previously characterized, emphasizing the importance of long-read mRNA sequencing in revealing clinically relevant alternative splicing events.
A data resource
A key aspect of identifying cancer-specific isoforms is that they provide potential targets for immuno-oncology. Isoform-specific antibodies can be developed specifically for isoforms unique to cancer cells, providing a way to eliminate them and protect healthy tissues. They can also be used to generate peptides for cancer vaccination protocols. To this end, the researchers have constructed an interactive web portal (https://thejacksonlaboratory.shinyapps.io/brca-isoforms/) to make their data available to the research community.