Following an analysis of more than 20,000 articles by researchers in the Department of Biomedical Informatics (DBMI) at the University of Colorado School of Medicine, a major science publication is implementing new policies to improve the diversity of its sources.
The study, led by DBMI postdoctoral researcher Natalie Davidson, PhD, and Casey Greene, PhD, professor and founding chair of the DBMI, discovered that 69% of the people quoted directly in Nature’s journalism articles identified as male – a percentage the publication says is “unacceptably high.”
While the researchers provided thorough evidence of the disparity in this particular study, they point out that simple data-collection measures can be used to enhance diversity in academic authoring moving forward.
“We did a big lift here with this in-depth analysis to show that there are blind spots,” Davidson says of the study. “But that doesn’t mean that everyone has to do the big lift. If you can collect demographic information as you go, that’s a really easy way to accumulate and monitor your work. Then you don’t have to do these sorts of big lifts all the time.”
Driving policy changes
Nature reached out to Greene and his team about a possible analysis of sourcing after seeing the results of a previous study from Greene's lab on disparities among the people honored by major scientific societies. It was a broad request, Greene says, but he and Davidson ultimately decided it was research worth pursuing.
"There was a lot that went into this," says Davidson, who started the work in December 2021. "I was going to the website, scraping all the text, cleaning all the text, and figuring out what was actually part of an article and what was not. There's a lot of cleaning and care that has to be taken into account in order to make sure what we're getting is useful."
To analyze the 22,001 non-research articles in Nature, the pair of researchers extracted cited authors’ names and predicted gender and name origin. Next, the results were compared against a list of first and last authors of primary research articles in Nature and a subset of Springer Nature articles from the same time period.
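The comparison step described above can be illustrated with a minimal sketch. It assumes each extracted name already carries a predicted probability of being a male name; the function name, variable names, and all numbers below are hypothetical and are not data or code from the study.

```python
from statistics import mean


def predicted_male_share(probabilities):
    """Average the per-name P(male) estimates to get an expected male share."""
    return mean(probabilities)


# Hypothetical predictions for illustration only (not figures from the study).
quoted_speakers = [0.98, 0.95, 0.10, 0.90, 0.05, 0.99, 0.85, 0.20]
research_authors = [0.97, 0.92, 0.88, 0.15, 0.99, 0.90, 0.95, 0.30]

quote_share = predicted_male_share(quoted_speakers)
author_share = predicted_male_share(research_authors)

# A positive gap would mean quoted sources skew more male than the author baseline.
gap = quote_share - author_share

print(f"Predicted male share among quoted speakers: {quote_share:.1%}")
print(f"Predicted male share among research authors: {author_share:.1%}")
print(f"Difference (quotes minus authors): {gap:+.1%}")
```

Averaging probabilities rather than forcing each name into a single category is one way to carry the uncertainty of name-based prediction into the final estimate; whether the study did this is not stated in this article.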
The results showed a skew toward male sources, though Davidson and Greene note that quotation is trending toward equal representation faster than authorship rates in academic publishing.
Researchers also found that there was an over-representation of sources with names of Celtic/English origin and an under-representation of sources with names of East Asian origin.
Davidson noted the research only looked at people who were paraphrased or quoted in stories. Sources who played a role in educating journalists or providing background weren’t counted, as that sort of information-gathering data is much more difficult to capture.
Still, the results were enough to encourage Nature to adopt policies to increase diversity among sources.
"In April 2021, we began a pilot project to track aspects of diversity in some of Nature’s journalism more systematically," the publication wrote of its progress last month. "That September, we began tracking diversity in journalistic content across all relevant teams."
Benefits at Nature and beyond
Now, data reported by Nature shows that about 55% of the people quoted or paraphrased in written articles use he/him pronouns. About 36% use she/her pronouns. Of the 2,252 people quoted or paraphrased, 24 used they/them pronouns and five used other pronouns.
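As a rough illustration of the "collect as you go" approach Davidson describes, the sketch below keeps a running tally of pronouns and converts counts into shares. Only the total of 2,252 people and the counts of 24 and five come from the figures above; the `record_source` helper and the `share` function are hypothetical names, not part of any tool Nature uses.

```python
from collections import Counter

# Figures reported above: 2,252 people quoted or paraphrased in total.
TOTAL_PEOPLE = 2252
reported_counts = {"they/them": 24, "other": 5}


def share(count, total):
    """Return a count as a fraction of the total."""
    return count / total


for pronouns, count in reported_counts.items():
    print(f"{pronouns}: {count} of {TOTAL_PEOPLE} = {share(count, TOTAL_PEOPLE):.2%}")

# Hypothetical running tally, updated each time a source is quoted.
tally = Counter()


def record_source(pronouns):
    """Add one quoted or paraphrased source to the running tally."""
    tally[pronouns] += 1


for entry in ["she/her", "he/him", "he/him", "they/them"]:
    record_source(entry)

print(dict(tally))
```

The he/him and she/her groups are reported only as approximate percentages, so the sketch does not attempt to reconstruct their exact counts.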
For the most part, it’s hard to tell whether disparity exists in a single article. “That’s why Natalie's analysis at 20,000 articles is helpful. It examines a scale where you can understand whether disparities exist,” Greene explains.
The research duo says it’s wise for publications to look at potential blind spots in the editorial process. It can also be helpful for researchers and scientists to keep sourcing in mind, especially as being quoted can greatly affect the trajectory of a scientist’s career.
“If I am contacted by reporters, and people are looking for other sources, I try to be cognizant of the people that I recommend and make sure that demographically it makes sense,” Davidson says. “That way, I’m not just using the first names off the top of my head – which are typically the people from the largest labs that are already very famous. I try to think deeper into my network of who is an expert but isn’t always highlighted.”
Nature’s editors say that while they’ve made progress since seeing the results from Davidson and Greene, including recording the preferred pronouns and geographical location of the people quoted, there is still more work to do.
“We did not collect data on race or ethnicity for this phase of our project, but are working to widen the racial and ethnic diversity of our sources to make our reporting more representative of global science,” the publication wrote of its new efforts.