More than 1.3 million human -omics samples are currently publicly available. This valuable resource remains acutely underused because finding specific samples in this ever-growing collection is still a significant challenge. The major impediment is that sample attributes are routinely described with varied terminology written in unstructured natural language.
Systematic tissue annotations of genomics samples by modeling unstructured metadata
by Nature Communications | November 8, 2022