An ambitious project that set out 8 years ago to replicate findings from top cancer labs has drawn to a discouraging close. The Reproducibility Project: Cancer Biology (RP:CB) reports today that when it attempted to repeat experiments drawn from 23 high-impact papers published about 10 years ago, fewer than half yielded similar results.
The findings pose “challenges for the credibility of preclinical cancer biology,” says psychologist Brian Nosek, executive director of the Center for Open Science (COS), a co-organizer of the effort.
The project points to a need for authors to share more details of their experiments so others can try to reproduce them, he and others involved argue. Indeed, vague protocols and uncooperative authors, among other problems, ultimately prevented RP:CB from completing replications for 30 of the 53 papers it had initially flagged, the team reports in two capstone papers in eLife.
“It is useful to have this level of objective data about how challenging it can be to measure reproducibility,” says Charles Sawyers of Memorial Sloan Kettering Cancer Center, who reviewed the designs and results for some of the project’s early replication studies. But he wonders whether the project will have much impact. “It’s hard to know whether anything will change as a consequence.”
Nosek’s center and the company Science Exchange set up RP:CB in 2013, after two drug companies reported they could not reproduce many published preclinical cancer studies. The goal was to replicate key work from top papers in basic cancer biology published by journals such as Science, Nature, and Cell from 2010 to 2012. With funding from the Arnold Foundation (now Arnold Ventures), the organizers designed replication studies that were peer reviewed by eLife to ensure they would faithfully mimic the original experiments. Outside contract firms or academic service labs would do the experiments.
The project’s staff soon ran into problems because all the original papers lacked details such as underlying data, protocols, statistical code, and reagent sources. When contacted for this information, many authors spent months tracking down the details themselves. But only 41% of authors were very helpful; about one-third declined or did not respond, RP:CB reports in eLife. Additional problems surfaced once labs began experiments, such as tumor cells that did not behave as expected in a baseline study.
The project ended up paring an initial list of papers, comprising 193 experiments, down to just 23 papers with 50 experiments. The team attempted to replicate all the experiments in 18 of those papers and some of the experiments in the rest; starting in 2017, the results of each replication have been published, mostly as individual papers in eLife. All told, the experimental work cost $1.5 million.
Results from only five papers could be fully reproduced. Other replications yielded mixed results, and some were negative or inconclusive. Overall, only 46% of 112 reported experimental effects met at least three of five criteria for replication, such as a change in the same direction—increased cancer cell growth or tumor shrinkage, for example. Even more striking, the magnitude of the changes was usually much more modest, on average just 15% of the original effect, the project reports in a second eLife paper. “That has huge implications for the success of these things moving up the pipeline into the clinic. [Drug companies] want them to be big, strong, robust effects,” says Tim Errington, project leader at the COS.
The findings are “incredibly important,” Michael Lauer, deputy director for extramural research at the National Institutes of Health (NIH), told reporters last week, before the summary papers appeared. At the same time, Lauer noted the lower effect sizes are not surprising because they are “consistent with … publication bias”—that is, the fact that the most dramatic and positive effects are the most likely to be published. And the findings don’t mean “all science is untrustworthy,” Lauer said.
In an ironic twist, however, different replication efforts don’t always produce the same results. Other labs have reported findings that support most of the papers, including some that failed in RP:CB. And two animal studies that weren’t replicated by RP:CB have led to promising early clinical results—for an immunotherapy drug and a peptide designed to help drugs enter tumors. The RP:CB effort struggled to reproduce animal studies, notes stem cell biologist Sean Morrison of the University of Texas Southwestern Medical Center.
Still, the findings underscore how elusive reliable results can be in some areas, such as links between gut bacteria and colon cancer. Johns Hopkins University infectious diseases physician-scientist Cynthia Sears, who reviewed two papers in this area that were not fully replicated, says researchers have come to realize that simple changes in the experimental setup, such as the local bacteria in a lab’s animal quarters, can sway results. RP:CB has been “an instructive experience,” she says.
If there’s one key message, it’s that funders and journals need to beef up requirements that authors share their methods and materials, the project’s leaders say. “The perception [is] that doing these replications is really slow and hard,” says Science Exchange CEO Elizabeth Iorns. But if the data, protocol, and reagents were readily available, “it should be very fast.” Lauer says new NIH data sharing rules starting in January 2023 should help.
But some researchers worry new rules may not improve the rigor of discovery research. Sawyers says: “In the end, reproducibility will likely be determined by results that stand the test of time, with confirmation and extension of the key findings by other labs.”