Approximately 8% of human DNA consists of repeating patterns called tandem repeats (TRs). These are sequences in your genetic code that repeat over and over again. There are two types: short tandem repeats (STRs), which are typically small, made up of repeating groups of 1 to 6 DNA nucleotides, and variable number tandem repeats (VNTRs), which are longer, consisting of 7 or more nucleotide repeating units.
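As a rough illustration of the size-based definitions above, the short Python sketch below labels a repeat unit as an STR or a VNTR from its length and counts how many times it repeats back to back in a sequence. The function names and example sequences are hypothetical and are not part of STRchive. Real genotyping tools also handle interrupted or imperfect repeats, which this sketch ignores.

```python
def classify_tandem_repeat(motif: str) -> str:
    """Label a repeat unit by length: 1 to 6 nucleotides is an STR, 7 or more is a VNTR."""
    if not motif:
        raise ValueError("motif must contain at least one nucleotide")
    return "STR" if len(motif) <= 6 else "VNTR"


def count_tandem_copies(sequence: str, motif: str, start: int = 0) -> int:
    """Count perfect, back-to-back copies of `motif` beginning at position `start`."""
    if not motif:
        raise ValueError("motif must contain at least one nucleotide")
    copies = 0
    position = start
    # Step through the sequence one motif length at a time while it keeps matching.
    while sequence.startswith(motif, position):
        copies += 1
        position += len(motif)
    return copies


# Hypothetical example: five copies of a 3-nucleotide unit followed by other sequence.
example_sequence = "CAG" * 5 + "TTA"
print(classify_tandem_repeat("CAG"), count_tandem_copies(example_sequence, "CAG"))
```

The copy count is what matters clinically in the sense described next: a repeat that has grown unusually large would show up here as a higher count at the same position.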
Changes in these repeating patterns can sometimes lead to genetic diseases, especially when they mutate to become unusually large. Although scientists are beginning to unravel their significance, TRs remain tricky to study and understand, especially when compared to simpler changes in DNA, such as single-letter swaps known as single nucleotide variants. To simplify the study of diseases caused by TRs, researchers, including Harriet Dashnow, PhD, assistant professor of biomedical informatics, developed STRchive, an online resource you can explore at http://strchive.org.
Dashnow emphasized the importance of building a resource that could evolve alongside scientific discoveries and technological advancements, explaining, “Recognizing the evolving landscape of tandem repeat diseases, we identified a crucial need for a dynamic resource—one that could continuously adapt to the community’s changing demands as emerging disease associations were uncovered and advancing genomic technologies reshaped the field.”
What is STRchive?
STRchive combines information from scientific research, clinical databases, and large-scale genomic studies, making it easier to analyze TRs and understand their role in diseases. TRs are often ignored because of their complexity, or interpreted using limited, scattered data. Having clinical and bioinformatic information assembled in one place has empowered researchers and clinicians to tackle TRs head-on. This groundbreaking tool marks a significant step forward in decoding the mysteries of our DNA.
STRchive is the result of a collaborative effort involving Laurel Hiatt, graduate student at the University of Utah, the Dashnow Lab at the University of Colorado School of Medicine, the gnomAD team at the Broad Institute, PacBio, and the Quinlan lab at the University of Utah. The resource is fully public and transparent, allowing anyone to freely access and reuse its data. Users can even review every change made to the database and the decisions surrounding the quality of clinically relevant data. Perhaps most importantly, users and community members can suggest updates to STRchive directly, ensuring it is as comprehensive and up-to-date as possible.
A Living Resource for DNA Repeat Studies
STRchive continues to evolve, with monthly updates incorporating the latest research findings. It serves as a living resource for the scientific community, a watering hole for TR enthusiasts across the globe. The team is also developing a protocol to determine whether the evidence for newly discovered genetic variants is strong enough for them to be applied in clinical settings, saving clinicians time and resources as they try to solve complex disease cases.
Additionally, researchers are improving knowledge about complex TR loci in populations by leveraging more long-read data, exploring how often people have entirely different motif sequences at these loci.
Dashnow shared that this tool will have an immediate and long-term impact: “STRchive will enhance the diagnosis of TR disorders by enabling centralized access to the latest information, ensuring patients receive more accurate assessments. Over time, it will also support research efforts, helping to uncover the causes of these disorders and explore potential treatments.”
STRchive empowers researchers to better understand genetic variations and their impact on human health, while supporting diagnostic efforts in this understudied area of genetic disease. By seamlessly integrating advanced technology with medical research, this tool paves the way for improved diagnoses and enhanced patient outcomes.
Additional Information
The paper “STRchive: a dynamic resource detailing population-level and locus-specific insights at tandem repeat disease loci” was published in Genome Medicine, part of Springer Nature, on Mar. 26, 2025.