Department of Biomedical Informatics

Systematic tissue annotations of genomics samples by modeling unstructured metadata

Written by Nature Communications | November 08, 2022

More than 1.3 million human -omics samples are currently publicly available. This valuable resource remains acutely underused because finding particular samples in this ever-growing data collection is a significant challenge. The major impediment is that sample attributes are routinely described using varied terminologies written in unstructured natural language.