Sequencing gut bacterial genes to improve colorectal cancer detection

From the Dey Lab, Translational Sciences and Therapeutics Division

Our guts are teeming with trillions of bacteria. These bacteria serve vital functions in the body: they aid in carbohydrate digestion, synthesize essential vitamins, and help modulate the function of our immune systems. Despite the indispensable nature of the gut microbiome, its exact composition varies from person to person. In addition to this baseline diversity, the composition of the microbiome changes when a person develops colorectal cancer (CRC). Because changes in gut bacteria are associated with patient outcomes in CRC, understanding which, if any, bacteria drive the disease is crucial to identifying people who may be susceptible to colorectal cancer based on the composition of their microbiome. Many researchers have tried to answer this question by identifying and comparing different species of bacteria in the guts of healthy people and CRC patients. The problem with this approach, however, is that it fails to capture the full breadth of microbiome diversity. “One person or another, we all have the same genes,” says Dr. Sam Minot, lead author of the study, “but one bacteria to another could have dozens…of different genes, even if we call them by the same name.” By grouping and comparing bacterial species using traditional taxonomic names, researchers lose information about how the genes that any given bacterium contains associate with or drive cancer. To tackle this problem, Drs. Neel Dey in the Translational Sciences and Therapeutics Division and Sam Minot in the Fred Hutch Data Core took a different approach.

Their analysis identified co-abundant genes (CAGs) that were associated with CRC or health. CAGs are genes present on the same fragment of sequenced bacterial DNA across people. This DNA-first approach had been described by other pioneers of microbiome research, but Dey and Minot described a new way to analyze the data. These sequencing datasets are enormous, making it almost impossible to pick out CAGs by just looking at the sequences. Because of the size of the data, traditional analysis approaches (like multiple sequence alignments) would take far too long to be practical. To overcome these problems, Minot took inspiration from an unexpected source. “I went down this path of analysis where the people who had done the most work were Spotify™️, trying to figure out who was listening to the same types of songs.” In the end, the software Minot developed was different, but the approach was the same. This new approach allowed the group to quickly pick out pieces of microbial DNA that were likely from the same genome and identify the CAGs. “It’s just another example of how data scientists around the world are coming up against the same problems in remarkably different contexts,” says Minot.
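To make the idea of co-abundance concrete, here is a minimal Python sketch. It is not the software Minot developed, and the article does not describe that software's algorithm. The sketch assumes each gene has an abundance value in every sample, and it groups genes whose abundance profiles rise and fall together across people. The distance measure (cosine distance), the 0.1 threshold, the greedy grouping, and all gene names and numbers are illustrative assumptions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two abundance profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # a gene never observed cannot be matched to anything
    return 1.0 - dot / (norm_a * norm_b)


def group_co_abundant_genes(abundance, max_distance=0.1):
    """Greedily group genes whose profiles across samples are similar.

    abundance maps a gene name to a list of abundances, one per sample.
    Each gene joins the first group whose first member is within
    max_distance of it; otherwise it starts a new group.
    """
    groups = []
    for gene, profile in abundance.items():
        for group in groups:
            representative = abundance[group[0]]
            if cosine_distance(profile, representative) <= max_distance:
                group.append(gene)
                break
        else:
            groups.append([gene])
    return groups


# Made-up abundances for five hypothetical genes across four samples.
abundance = {
    "gene_a": [10, 0, 5, 0],
    "gene_b": [20, 0, 10, 0],
    "gene_c": [0, 7, 0, 3],
    "gene_d": [0, 14, 0, 6],
    "gene_e": [1, 1, 1, 1],
}

groups = group_co_abundant_genes(abundance)
print(groups)
```

In this toy input, gene_a and gene_b track each other across samples, as do gene_c and gene_d, so each pair lands in one group, while gene_e ends up alone. Comparing every gene against every group this way would be far too slow for real sequencing data, which is why a faster similarity search of the kind the article alludes to would be needed at scale.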

The group isolated bacterial DNA from healthy donors and CRC patients. Then, they sequenced the bacterial genes to identify health- and disease-associated co-abundant genes. They associated those genes with bacterial species, leading them to discover bacteria that induce a precancerous state in the colon.

After identifying the CAGs associated with health or CRC, the group then determined which bacterial species in the gut harbored the genes. They found that almost 40% of gut bacteria could be associated with either healthy or cancerous guts. This is a staggering result considering that traditional approaches seeking to identify bacterial species have only identified a couple dozen or so species associated with CRC. The CAG-based approach identified bacteria that had never been associated with CRC before, a testament to the power of their analysis. To experimentally support their in silico findings, Dey and Minot created a mix of bacterial species identified by their approach that weren’t previously associated with CRC. They transplanted these bacteria into mice primed to develop intestinal tumors and saw that the mice given cancer-associated bacteria developed tumors at higher rates than those given health-associated bacteria. Single-cell RNA sequencing of colons from these mice suggested an unexpected mechanism of action: induction of precancer states.

Colorectal cancer is often preceded by precancerous growths called adenomas, which carry a high risk of becoming cancerous. Although the group did not include patients with adenomas in the datasets used to generate their list of CRC-associated CAGs, they found upon further analysis that CRC-associated genes were more abundant in groups with adenomas than in healthy groups. This suggests that CRC-associated CAGs can be used to identify a precancerous state in patients. Dey hopes that this research could be used to more effectively identify individuals who are susceptible to CRC and to test whether there are population-specific or diet-correlated trends in CAGs. “One could envision that this approach might offer both a generalizable and customizable…risk stratification for CRC patients,” says Dey. Overall, the innovative computational analysis used by Minot revealed new associations between the gut microbiome and colorectal cancer that could improve cancer detection and outcomes for patients with the disease.


This work was supported by the National Institutes of Health, the Microbiome Research Initiative at Fred Hutch, and the Fred Hutch/University of Washington/Seattle Children's Cancer Consortium.

Fred Hutch/University of Washington/Seattle Children's Cancer Consortium members Drs. Jason Dominitz, William Grady, and Neelendu Dey contributed to this work.

Minot SS, Li N, Srinivasan H, Ayers JL, Yu M, Koester ST, Stangis MM, Dominitz JA, Halberg RB, Grady WM, Dey N. 2024. Colorectal cancer-associated bacteria are broadly distributed in global microbiomes and drivers of precancerous change. Scientific Reports. 10.1038/s41598-024-70702-1.


Kelsey Woodruff is a PhD candidate in the Termini Lab at Fred Hutch Cancer Center. She studies how acute myeloid leukemia cells remodel the sugars on their membranes to reprogram cancer cell signaling. Originally from Indiana, she holds a bachelor's degree in Biochemistry from Ball State University. Outside of lab, you can find her crocheting and enjoying the Seattle summers.