Powerful new tool to advance genomics, disease research

UVA Health researchers have developed an important new tool to help scientists sort signal from noise as they probe the genetic causes of cancer and other diseases. In addition to advancing research and potentially accelerating new treatments, the new tool could help improve cancer diagnosis by making it easier for doctors to detect cancerous cells.

Developed by UVA’s Chongzhi Zang, PhD, and his team and collaborators, the new tool is a mathematical model that will help ensure the integrity of “big data” about the building blocks of our chromosomes, genetic material called chromatin. Chromatin — a combination of DNA and protein — plays an important role in directing the activity of our genes. When chromatin goes wrong, it can turn a healthy cell into cancer or contribute to other diseases.

Scientists now can study chromatin within individual cells using a cutting-edge technology called “single-cell ATAC-seq,” but this generates a tremendous amount of data, including much noise and bias. Zang’s new tool cuts through that, saving scientists from false leads and wasted efforts.

At the best of times, large-scale, single-cell genomics research is like “hunting a needle in a haystack,” Zang says. But his new tool will make it much easier by clearing away a lot of bad hay.

“Using the traditional way of analyzing the data, you might see some patterns that look like real signals of a particular chromatin state, but they are actually fake due to the bias of the experimental technology itself. Such fake signals can confuse scientists,” said Zang, a computational biologist with UVA’s Center for Public Health Genomics and UVA Health Cancer Center. “We developed a model to better capture and filter out such fake signals, so that the real needle we are looking for can more easily stand out of the hay.”

About the Genomics Tool

Zang’s new tool adapts a model from number theory and cryptology called “simplex encoding.” He and his colleagues used that to code DNA sequences into mathematical forms and, ultimately, convert the complex genome sequence into a much simpler mathematical form. They can then compare different forms to detect bias and noise in the sequence data that cannot be found easily using conventional approaches.
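
For readers curious what coding DNA "into mathematical forms" can look like, here is a minimal sketch of one way a simplex-style encoding might work. It is an illustration under stated assumptions, not the team's actual model: the vertex coordinates, the function names encode and squared_distance, and the example sequences are all hypothetical. The idea is to place the four bases at the corners of a regular tetrahedron (a 3-simplex), so that no base is numerically "closer" to another than any other pair.

```python
# Hypothetical vertex assignment: the four corners of a regular tetrahedron.
# These coordinates are an assumption for illustration, not taken from the article.
SIMPLEX = {
    "A": (1, -1, -1),
    "C": (-1, 1, -1),
    "G": (-1, -1, 1),
    "T": (1, 1, 1),
}


def encode(seq):
    """Flatten a DNA string into a numeric vector, three coordinates per base."""
    vec = []
    for base in seq.upper():
        vec.extend(SIMPLEX[base])
    return vec


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


if __name__ == "__main__":
    # Two made-up sequences that differ only in the last base.
    first = encode("ACGT")
    second = encode("ACGA")
    print(len(first))                       # 4 bases x 3 coordinates
    print(squared_distance(first, second))  # one mismatch between the sequences
```

Because every pair of distinct bases sits the same distance apart in this layout, the squared distance between two encoded sequences of equal length grows in direct proportion to the number of positions where they differ. That symmetry is what makes such a numeric form convenient for comparing sequences without favoring any particular base.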
