Utopia Docs: Making Scientific Literature More Useful

Every weekend my PubMed search agent sends me a slew of papers to review.  The interesting ones I add to the Pancreatic Cancer Genomics group in Mendeley, and at last count over 500 papers have found their way into the collection.  The hard part of this process is simply reviewing the papers.  I read through each paper, examine its references for additional articles that might be useful for the collection, look up the genes it describes in Entrez Gene, and look up the pathways it describes in WikiPathways.

The standard tool I use for this has been Adobe Acrobat Reader; however, recently a Twitter friend sent me a link to an application called Utopia Documents, and I’ve found that it definitely makes it easier to review papers.

Utopia is more than just a PDF reader, though: it performs smarter searches on terms found in the papers.  For example, if I select the term “KRAS” in a paper I’m reading and click the “Explore” popup button, Utopia recognizes KRAS as a gene and displays the 3D structure of the KRAS protein from the Protein Data Bank.  It also displays the Wikipedia entry for KRAS.  If I select the drug “gemcitabine”, the ChEMBL entry for the drug appears, including the structure of the compound.

The video below shows some of Utopia’s features:

After using Utopia for a little over a week now, I’ve started to build a wish list of features that I’d like to see:

  • Support for additional file systems.  With the increasing popularity of cloud computing, it’s often useful to store PDFs in Google Drive, or Box.com, or other similar services.
  • Support for Mendeley.  Mendeley is a great tool for sharing papers and forming ad-hoc special interest groups around a particular subject.  Although the Utopia developers have integrated a Mendeley search into Utopia, it would be really useful to be able to open papers stored in Mendeley.  I tend to use Mendeley and other cloud services to make papers accessible from both work and home.
  • Support for MyPubMed.  PubMed offers a service called MyPubMed which allows you to save collections of papers, and searches.  It would be nice if Utopia could access those collections.
  • Text mining.  Once you have a collection of papers, it’s often useful to mine those papers for additional related information.  For example, “extract all common genes and pathways mentioned in my collection, and find similar papers in PubMed”, or “extract all of the common terms from the papers that I’ve starred in Mendeley and find similar papers in PubMed”.
  • Additional links.  Currently when you click on the “Explore” button to find out more about a selected term, you get information from ChEMBL, PubMed, PDB, or a number of other sites.  However, there are always going to be very specialized sites that you wish you could search but that aren’t included in Utopia.  It would be great if there were a way to add those searches and share them with other users.  Off the top of my head, here are a few of the sites I would add:
    • OMIM – the Online Mendelian Inheritance in Man project has some really useful information about the association between genes and diseases.
    • WikiPathways – this site contains one of the largest collections of pathway information.
    • Entrez Gene – this site helps you understand the function of a gene, and contains GeneRIFs (short statements about a gene’s function, each linked to a supporting paper), lists of gene interactions, and information about the structure and sequence of the gene.
    • COSMIC – the Catalogue Of Somatic Mutations In Cancer (COSMIC) site would provide information about common mutations in a selected gene.
    • DrugBank – it would be nice to be able to select a drug referenced in a paper and see drug target and related information in DrugBank.
  • An open API (preferably in Java).  As with any piece of software, there’s always room for improvement, and one of the ways that new features get added quickly is by providing an open application programming interface (API) for developers.  Since Java has a very large ecosystem of open source libraries, it makes sense to have the API in Java or Groovy.  Groovy would make the API more amenable to scripting without losing the ability to reference and use Java libraries.
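
To make the text-mining wish a bit more concrete, here is a minimal Python sketch of the “extract all common genes” idea.  It has nothing to do with how Utopia actually works: the function name `common_terms`, the `min_papers` parameter, and the tiny gene list are all made up for illustration, and it assumes the text of each paper has already been extracted from its PDF.

```python
import re
from collections import Counter


def common_terms(papers, vocabulary, min_papers=2):
    """Return the vocabulary terms that appear in at least `min_papers` papers.

    `papers` is a list of plain-text strings (hypothetical input; PDF text
    extraction is assumed to have happened elsewhere). Matching is
    case-insensitive and on whole alphanumeric tokens only.
    """
    counts = Counter()
    for text in papers:
        # Tokenize once per paper; using a set means a paper counts a term once.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        for term in vocabulary:
            if term.upper() in tokens:
                counts[term] += 1
    hits = [term for term, n in counts.items() if n >= min_papers]
    # Most widely mentioned terms first; ties are broken alphabetically.
    return sorted(hits, key=lambda term: (-counts[term], term))


if __name__ == "__main__":
    # Made-up stand-ins for the text of three papers in a collection.
    collection = [
        "Paper one mentions KRAS and TP53.",
        "Paper two mentions kras and CDKN2A.",
        "Paper three mentions SMAD4, TP53 and KRAS.",
    ]
    genes = ["KRAS", "TP53", "CDKN2A", "SMAD4"]
    print(common_terms(collection, genes))  # prints ['KRAS', 'TP53']
```

A real version would need a proper gene-name dictionary (synonyms and hyphenated symbols won’t match here), and the “find similar papers in PubMed” step is left out entirely.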

About Mark Fortner

I write software for scientists doing drug discovery and cancer research. I'm interested in Design Thinking, Agile Software Development, Web Components, Java, JavaScript, Groovy, Grails, MongoDB, Firebase, microservices, the Semantic Web, Drug Discovery, and Cancer Biology.

6 Responses to Utopia Docs: Making Scientific Literature More Useful

  1. Jan Velterop says:

    Great post! Just a tiny bit of clarification: OMIM and Entrez Gene are already incorporated in Utopia Documents 2.0. If you right-click (ctrl-click) a highlighted term in the PDF, you get a lookup menu which includes the federated NCBI databases, which in turn include OMIM and Entrez Gene (and many others, of course). And I understand that more resources are indeed being considered for inclusion in Utopiadocs, no doubt including the ones you are proposing if they weren’t already on the list.

    • aspenbio says:

      Hi Jan,
      Thanks for your comment. I didn’t even realize there was a popup menu available. The one thing that bothers me about the “Search NCBI Databases” feature is that it opens up a separate webpage, performs a search, and then makes you click around to find the result you’re after. I prefer the way the “Explore” feature works, because it shows you the result directly. It would be nice if there were a way to configure the Explore feature to specify which databases you wanted to search. I’m primarily interested in human genes, proteins, and pathways, but I could see the need for scientists in different disciplines to select their favorite organism or specialist database.

  2. Pingback: Build your Mendeley Network « Social Mexican

  3. A fellow blogger here, found your site via Fork, and I have a piece of advice: write more. Literally, it appears as though you relied on the video to make your point. You certainly have an understanding of what you’re discussing, so why waste your brains merely putting up videos to your blog when you could be giving us something illuminating to read?

    • aspenbio says:

      In general, people don’t tend to read long blog posts. In my case, the point of the article was to give other scientists a reason to try out Utopia Docs and to give feedback to the Utopia Docs developers, not to write an all-encompassing document that describes how to use the software. I’ll leave that task to the software developers working on the project.

  4. Pingback: Tinkering with Paperpile | Aspen Biosciences Blog
