Tinkering with Paperpile



In a previous post I wrote about how Google Docs (and the Research plugin) can be used for writing journal articles.  There were two shortcomings to this approach:

  • Google Docs doesn’t support end notes (only footnotes)
  • The Research plugin supports only a limited number of citation styles

Recently, I was contacted by the folks at Paperpile to have a look at their solution to this problem.  Here’s a short demo that gives you some idea of what you can do with it.

I started by importing my paper collection from Mendeley.  My primary collection has over 800 papers and took a while to import, but that was to be expected. Overall the performance was pretty good.

In addition to importing from Mendeley and Zotero collections, you can drag and drop papers into your collection, or import from search results in Google Scholar, PubMed or arXiv.

Paperpile did a nice job of importing all of the metadata for those papers including the abstracts, keywords, and Mendeley tags.

Once the papers were imported, I could explore them by tag, add new tags, and download the PDFs. You can also sync your collection with Google Drive, which means, to quote Buckaroo Banzai, “no matter where you go, there you are”: all of your papers in one spot, accessible from any computer you log into.

You can color-code tags to give additional meaning to your collection. In my collection, I typically tag papers with gene names, disease processes (like angiogenesis, desmoplasia, metastasis, etc.), pathways, and drugs. Although this worked well, it would have been nice if the tagging were more semantic in nature. In my case, I’d eventually like to be able to explore the collection as a graph of linked data. That’s probably outside the scope of what the authors of Paperpile envision, but it would make it easier to explore the collection.

Paperpile’s real value lies in how it makes the process of research writing easier. To explore how Paperpile integrates with Google Docs, I started working on a white paper to pull together the target-related information from my Pancreatic Cancer Genomics collection.

As the video shows, it was pretty easy to add references to a paper as you work. Unlike the Google Docs Research plugin, Paperpile lets you choose from a wide variety of citation formats. In my case, I selected the Nature Reviews Cancer format. I had started the paper using the default format, and realized partway through that I wanted to change the citation format. When I selected the new format, Paperpile zipped through my paper and reformatted my citations properly.

I talked with a number of colleagues in the industry to see how they use Google Docs and how Paperpile might fit into their workflow. Although scientists in pharmaceutical companies are typically tight-lipped about their work, Google Docs is being used for research reports and presentations. In the latter case, researchers provide an analysis for a specific target and range of indications along with supporting references from various journals. In both of these use cases, Paperpile makes a good deal of sense for organizing and sharing references, and for aiding in the process of writing itself. One colleague’s reaction to Paperpile was literally “WOW, just WOW”.

After playing with it for a few days, my wishlist for future additions to Paperpile was actually pretty small:

  • A way to highlight selected text in papers. Utopia Docs (a tool I blogged about previously) does a good job of handling this, and it would be nice to be able to do the same thing from within Google Docs. Ultimately, I’d like highlighted text to appear in a list of “key findings”. You can think of this as a manual form of text mining, and it’s a task that everyone already finds themselves doing with paper copies of journal articles.
  • Semantic tagging would also be pretty useful.  This would give you a standardized vocabulary for the types of tags, and allow you to organize related tags.
  • Text mining collections would also be pretty useful.  This would let users automatically generate semantic tags based on the content of the paper, and potentially identify key findings.
  • It would be useful if there were a way to synchronize tag and keyword changes back to the originating collection of papers. For me, this would mean being able to make changes in Paperpile and see them reflected in my Mendeley collection.
  • Lastly, I wish there were a way to do “more like this” searches in Paperpile. Imagine you have one or two references and want to expand the list. Ideally, you could select a few papers and then find similar papers based on the MeSH terms, keywords or authors in their PubMed records.
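
To make that last wish a bit more concrete, here is a minimal sketch in Python of how a “more like this” ranking could work. It is not how Paperpile or PubMed does anything: the `Paper` record, its fields, and the `more_like_this` function are hypothetical names I made up for illustration, and Jaccard similarity over pooled MeSH terms, keywords and authors is simply one easy way to score overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    """Hypothetical record; the fields are illustrative, not Paperpile's data model."""
    title: str
    mesh_terms: set = field(default_factory=set)
    keywords: set = field(default_factory=set)
    authors: set = field(default_factory=set)

    def features(self):
        # Prefix each value so a keyword and an author with the same text stay distinct.
        return ({"mesh:" + t.lower() for t in self.mesh_terms}
                | {"kw:" + k.lower() for k in self.keywords}
                | {"au:" + a.lower() for a in self.authors})


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def more_like_this(seeds, candidates, top_n=5):
    """Rank candidates by similarity to the pooled features of the seed papers."""
    seed_features = set()
    for paper in seeds:
        seed_features |= paper.features()

    scored = [(jaccard(seed_features, c.features()), c)
              for c in candidates if c not in seeds]
    # Drop papers that share nothing with the seeds, then sort best first.
    scored = [(score, c) for score, c in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(c.title, round(score, 2)) for score, c in scored[:top_n]]


# Made-up example records, purely for illustration.
seed = Paper("Seed paper",
             mesh_terms={"Pancreatic Neoplasms", "Neovascularization, Pathologic"},
             keywords={"angiogenesis"},
             authors={"Smith J"})
related = Paper("Related paper",
                mesh_terms={"Pancreatic Neoplasms"},
                keywords={"angiogenesis"},
                authors={"Lee K"})
unrelated = Paper("Unrelated paper",
                  mesh_terms={"Breast Neoplasms"},
                  authors={"Doe A"})

print(more_like_this([seed], [related, unrelated]))
```

In a real tool the MeSH terms would come from the papers’ PubMed records rather than being typed in by hand, and you would probably want to weight rare terms more heavily than common ones.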

Give Paperpile a try, and let me know what you think.

Paperpile Getting Started Guide


About Mark Fortner

I write software for drug discovery and cancer research scientists. I'm interested in Design Thinking, Agile Software Development, Web Components, Java, JavaScript, Groovy, Grails, MongoDB, Firebase, microservices, the Semantic Web, Drug Discovery, and Cancer Biology.
This entry was posted in Bioinformatics, Informatics, Science Blogging.

3 Responses to Tinkering with Paperpile

  1. Pingback: Literaturverwaltung kompakt 7/2013 – zweiter Teil: Communitynachrichten | Literaturverwaltung

  2. tedus says:

    Well, the biggest problem is using Google.
    For example, when writing in Google Docs you agree to their license terms, which state that they can do whatever they want with your content. To quote from their terms of service: “
    When you upload or otherwise submit content to our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content. ”

    NO clear-thinking scientist should actually do that. Also, most universities (which are the ones who own the knowledge) probably disallow such usage!

    This is the major problem with Paperpile, and it makes it unusable.

    • aspenbio says:

      On the surface that would seem concerning; however, I would point out that Google Docs is already in widespread use in a lot of universities, and Google’s marketing and licensing documentation clearly states that you own the copyright to all of your stuff. The reproduction clause is designed to allow Google to replicate your content from one server to another (so-called “edge loading”, where data is moved from one server to another in order to get your stuff closer to you or your collaborators). There are also a number of articles [1,2] on Google’s licensing terms that address the issue.
