Pancreatic Cancer: Are We Making Any Progress?

After my recent post on the current state of pancreatic cancer research, a reader got me wondering whether we are indeed making any measurable progress towards understanding and treating pancreatic cancer.

My first thought was that a simple year-by-year count of the papers on the subject would give me a crude measure. I’d been itching to try out some of the functions in the Google Docs spreadsheet, especially the ones that you don’t find in Microsoft Excel. Google Docs has an “ImportXML” function that downloads an XML document and extracts data from it using XPath expressions. In this case, I want to run a simple PubMed query and extract a count of the number of papers found in a particular year. NCBI has a set of RESTful web services called eUtils that we’ll use for this purpose.

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=pancreatic+cancer&mindate=1997&maxdate=1998/01/01&datetype=edat&retmax=100&usehistory=y

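The key pieces of the URL are the term parameter and the mindate and maxdate parameters. In this case, I’m looking for papers on pancreatic cancer dated between the mindate and maxdate. Note that datetype=edat applies the range to the date each record was added to PubMed, which is usually close to, but not the same as, the publication date. This query returns an XML document; you can click on the link to see what the query returns.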

We’re primarily interested in the number of results found in a given year, so we use the following XPath expression to extract the paper count: /eSearchResult/Count

Lastly, we create a Google Docs spreadsheet to put together the results using the formula shown below:

=ImportXML("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=pancreatic+cancer&mindate=1997&maxdate=1998/01/01&datetype=edat&retmax=100&usehistory=y","/eSearchResult/Count")
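For anyone who would rather script the count than use a spreadsheet, here is a minimal Python sketch of the same two steps: build the eUtils URL for a given year, then pull out /eSearchResult/Count. The function names (build_esearch_url, parse_count, fetch_count) are my own inventions for illustration, and the code assumes the response has the shape described above. NCBI has since moved its services to HTTPS, so the URL scheme may need changing before this will run against the live service.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Same endpoint as in the post; the scheme may need to be https today.
ESEARCH_URL = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, year):
    """Build the query URL for one year, mirroring the URL shown in the post."""
    params = {
        "db": "pubmed",
        "term": term,
        "mindate": str(year),
        "maxdate": f"{year + 1}/01/01",
        "datetype": "edat",
        "retmax": "100",
        "usehistory": "y",
    }
    # safe="/" keeps the slashes in maxdate unescaped, as in the original URL.
    return ESEARCH_URL + "?" + urlencode(params, safe="/")


def parse_count(xml_text):
    """Equivalent of the XPath /eSearchResult/Count on an esearch response."""
    root = ET.fromstring(xml_text)
    if root.tag != "eSearchResult":
        raise ValueError("unexpected root element: " + root.tag)
    # findtext("Count") only looks at direct children of the root, so any
    # Count elements nested deeper in the response are ignored.
    count = root.findtext("Count")
    if count is None:
        raise ValueError("no Count element under eSearchResult")
    return int(count)


def fetch_count(term, year):
    """Run the query over the network and return the paper count."""
    with urlopen(build_esearch_url(term, year), timeout=30) as response:
        return parse_count(response.read().decode("utf-8"))
```

Calling fetch_count("pancreatic cancer", year) in a loop over the years of interest would give the same column of numbers as the spreadsheet. It is worth pausing between requests so as not to hammer NCBI's servers.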

You can see the results here.

Admittedly, this is a rather crude measure of progress. All it really takes is one seminal paper to make a big difference in the field. It raises the question, though: what are the hallmarks of true progress? For example, if we looked at the papers produced in the run-up to the approval of imatinib (Gleevec), would there be any indications that a revolutionary drug was in the offing? At what point in the history of the literature do we have an indication that we’ve identified the right target for CML? And how much time elapsed between target identification/validation and the approval of imatinib?

About Mark Fortner

I write software for scientists. I'm interested in Java/Groovy/Grails, the Semantic Web and Cancer Biology.