Over the past few years, pharmaceutical companies have increasingly come to rely on external research to help fill their pipelines. This greater reliance on academic research is not without its risks, chief among them the lack of reproducibility. Amgen found that the results of 47 of 53 landmark oncology papers could not be reproduced, and Bayer reported similar findings. The links below describe a number of these experiences.
- The Need for Reproducibility in Academic Research
- Nature: Challenges in Irreproducible Research
- Economist: Trouble at the Lab
The NIH is taking steps to change this, and with any luck its efforts will have a positive impact on the current attrition rate. Even if they succeed, however, the real problem remains: there is no ready means of assessing the credibility of research.
Recently, I was looking for a way to catalog and characterize the target space for pancreatic cancer (my long-term pet project), and I found myself wondering how to assess credibility in research areas with which I was unfamiliar. Put simply, “how do I trust the assertions in a paper, when I’m unfamiliar with the methods, materials, and analytical approach used to arrive at these conclusions?”
Assessing the methods
Usually the Methods section of a paper describes the approach taken to gather and analyze the data in the experiment. My first thought was that there should be a “PubMed for Protocols”: a central repository that describes every type of protocol. This repository should work like a source code repository:
- If you design some variation of a protocol, you should be able to “fork” the original, and show how your new variation differs from the original.
- Protocols should be “reviewable” (just like code) – so that researchers can rate and comment on them.
- Protocols should be “testable” (again, like code) – so that researchers can see how this new protocol was verified. After all, if you’re going to be spending millions of dollars on a drug discovery program, you want to limit the amount of risk in the venture.
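As a rough sketch, the version-control analogy above might map onto a data model like the following. All of the class and method names here are hypothetical illustrations of the idea, not part of any existing service:

```python
from dataclasses import dataclass, field


@dataclass
class Review:
    """A researcher's review of a protocol, just like a code review."""
    reviewer: str
    rating: int  # e.g. 1-5
    comment: str


@dataclass
class Protocol:
    name: str
    steps: list[str]
    parent: "Protocol | None" = None  # set when this protocol is a fork
    reviews: list[Review] = field(default_factory=list)

    def fork(self, name: str) -> "Protocol":
        """Create a variant that records its lineage, like forking a repo."""
        return Protocol(name=name, steps=list(self.steps), parent=self)

    def diff(self) -> list[str]:
        """Show which steps differ from the parent protocol."""
        if self.parent is None:
            return self.steps
        return [s for s in self.steps if s not in self.parent.steps]

    def mean_rating(self) -> float:
        return sum(r.rating for r in self.reviews) / len(self.reviews)


# Fork a (made-up) protocol, change one step, and review the variant
original = Protocol("Western blot", ["lyse cells", "run gel", "wet transfer", "blot"])
variant = original.fork("Western blot (fast transfer)")
variant.steps[2] = "semi-dry transfer"
variant.reviews.append(Review("jdoe", 4, "Reproduced in our lab with minor tweaks"))
```

Here `variant.diff()` surfaces exactly the step that changed relative to the original, which is the kind of at-a-glance comparison a reader would want before trusting a protocol variation.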
Similar protocol repositories are common enough inside pharmaceutical companies, but after Googling around for a while and asking colleagues, I was rather surprised not to find one generally available. An offhand comment on Twitter quickly resulted in a pointer to a new project called HiveBench, which the developers hope will become just such an open repository. You can see an example protocol here, complete with a YouTube clip demonstrating the protocol.
In addition to HiveBench, the Kickstarter project protocols.io also appeared on my radar. Protocols.io provides both a web-based application and mobile apps for protocol management, and promises to provide ratings as well.
The only fly in the ointment with these approaches is that they’re not integrated into the metadata for the paper. Ultimately, I’d like to see some sort of ratings system directly in PubMed, so that when a search returns a hundred papers, I can filter the list down to the ones that seem the most credible. Even if I’m not an expert in a particular area, I should still be able to quickly identify the most credible articles.
Assessing the fitness of the protocol for the paper would then, at some level, boil down to determining whether the protocol was novel or standard, and whether it was followed appropriately. Even slight, undocumented variations in a protocol can sometimes lead to variations in results.
Assessing the materials
One of the trickier aspects of evaluating research is determining whether the materials used in the experiment were appropriate. For example, if you’re doing an in vitro comparison of a known standard of care against your new drug, you want to know the sources of the compounds and cell lines, the purities of the compounds, whether the author ran enough replicates to achieve the statistical power necessary to support their assertion, and so on.
Assessing the authors
Of course, any discussion of credibility relies in part on the publishing history and institutional associations of the paper’s authors, as well as any potential conflicts of interest.
Quis custodiet ipsos custodes?
Lastly, assuming that a ratings system were introduced both for papers and protocols, there would need to be some means of validating the credibility of the reviewers.
In a recent addition to PubMed, NCBI added the ability for people to comment on a paper. This has both advantages and disadvantages. The main advantage is that the original authors can get feedback about the paper, and anyone else reading it can see any criticism. The disadvantage is that (as the links at the top of this article reveal), with the current funding dry spell, negative feedback is often viewed as a career-limiting move. After all, the person whose work you critique could end up evaluating your grant proposal at some point in the future.
Imagine if each PubMed entry had a list of assertions associated with it. An assertion might be a snippet of text like “KRAS is mutated in 90% of pancreatic cancer patients”. Each assertion would, in turn, link to all papers that substantiate it and to all papers that contradict it. By providing both types of evidence, the reader could assess how much reliance to place on a particular assertion made in the paper. The tricky part is that an assertion substantiated only in the paper itself might be dismissed out of hand by the reader.
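A minimal sketch of such an assertion index follows. The structure, the class name, and the sample PubMed IDs are all my own invention for illustration, not a feature of PubMed:

```python
from collections import defaultdict


class AssertionIndex:
    """Maps an assertion string to the paper IDs that support or contradict it."""

    def __init__(self):
        self.supporting = defaultdict(set)
        self.contradicting = defaultdict(set)

    def add_evidence(self, assertion: str, pmid: str, supports: bool):
        target = self.supporting if supports else self.contradicting
        target[assertion].add(pmid)

    def evidence_summary(self, assertion: str) -> dict:
        n_for = len(self.supporting[assertion])
        n_against = len(self.contradicting[assertion])
        return {
            "supporting": n_for,
            "contradicting": n_against,
            # an assertion backed by only one paper deserves extra caution
            "independently_substantiated": n_for > 1,
        }


# Illustrative entries with made-up paper IDs
idx = AssertionIndex()
claim = "KRAS is mutated in 90% of pancreatic cancer patients"
idx.add_evidence(claim, "PMID:1111111", supports=True)
idx.add_evidence(claim, "PMID:2222222", supports=True)
idx.add_evidence(claim, "PMID:3333333", supports=False)
summary = idx.evidence_summary(claim)
```

The `independently_substantiated` flag captures the caveat above: a claim supported only by the paper making it would be flagged for the reader rather than silently counted as evidence.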
What if each section of the paper had a rating or score associated with it? The score would be based on some of the criteria outlined above, and could be provided as part of the standard peer review process, along with any reviewer comments associated with that rating.
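One hypothetical shape for such per-section scores: each reviewer rates each section, the scores are averaged, and a paper-level score weights the sections. The section names, the 1-5 scale, and the weights are all assumptions for the sake of illustration:

```python
from statistics import mean

# Hypothetical per-section scores (1-5) from three reviewers of one paper
reviews = [
    {"methods": 4, "materials": 5, "analysis": 3},
    {"methods": 3, "materials": 4, "analysis": 4},
    {"methods": 5, "materials": 4, "analysis": 3},
]


def section_scores(reviews: list[dict]) -> dict:
    """Average each section's score across all reviewers."""
    sections = reviews[0].keys()
    return {s: round(mean(r[s] for r in reviews), 2) for s in sections}


scores = section_scores(reviews)
# A paper-level score could weight sections, e.g. methods most heavily
overall = 0.5 * scores["methods"] + 0.25 * scores["materials"] + 0.25 * scores["analysis"]
```

A search interface could then filter or sort on `overall`, or on an individual section score, which is exactly the kind of at-a-glance credibility signal discussed earlier for PubMed results.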