Wednesday, July 7, 2010

Managing PDF books and papers

I have a small collection of mathematical and computer science books and papers as PDF files. Some are stashed in Google Docs, others in Read it Later and others as links in email, etc. It would be nice if:

  • all documents were organised together

  • I could access the documents from all my devices: work and home Macs and iPad

  • adding / removing documents automatically synchronised across all devices

  • there were a record of the URL each document came from

  • document contents were searchable

A cloud solution with device-specific apps would be a good start. I would be happy to pay an annual price similar to Flickr and Remember the Milk for a good product. From memory these are about USD$25 / year.

All the solutions I have come across are intended for academic bibliographic management. Most of my documents were collected ad hoc from the web, without access to restricted channels such as the ACM Digital Library. The tools are geared towards searching the popular open and paid publication repositories and extracting bibliographic details automatically. I can see how this would be very useful if you have access, but without access I found the tools had trouble matching my documents. Given my requirements this became more hassle than it was worth.

Some solutions integrate social networking. Being able to share your reading and observe what others are reading might be quite useful. Warrants future investigation.

I don't have any need to annotate, highlight or add notes to the PDF files at present.

Papers
A Mac application (USD$42) featuring document synchronisation with the corresponding iPhone and iPad apps (USD$17.99). Storing the database and documents on Dropbox should also enable sharing between Macs. Metadata and content are searchable and the full screen reading mode is great. You can add notes to documents, but not annotate or highlight portions. At work I am behind a strict corporate firewall and proxy. Papers gave a spurious error about internet connectivity and disabled all online actions. Following this forum post fixed things up.

I had high hopes for Papers. The iPad synchronisation is very compelling, but otherwise it currently doesn't do enough for me to warrant paying for (and putting my data in) Mac-only software. If I end up regularly using the supported repositories then I will probably reconsider.

Mendeley
A web application with a companion Mac/Windows/Linux desktop app. There is no iPad app, although it is apparently in the works. The desktop UI looks like it is Java and is a bit clunky on the Mac.

The desktop app is orientated around managing your documents and associated files. You can add your files, set their metadata and sync with your website account. Synchronisation is initiated manually via a UI button. There is a full screen viewer, although I couldn't figure out the zoom and navigation keys. Alternatively you can open documents with the default external application. Document annotations, highlighting and notes are all supported, as well as full text searching. Unfortunately the network proxy settings aren't automatically detected from System Preferences and I need to manually edit them when taking the MacBook Pro between work and home.

On the website you can manage your references and notes and view your files, but I couldn't figure out how to upload files or see annotations and highlighting. Otherwise, the website is all about social networking. You get 500MB of personal storage space and 500MB of shared space. The first upgrade is USD$4.99/month for 3.5GB + 3.5GB.

To make it attractive for me, Mendeley needs to tidy up the desktop app, automatically sync in the background, make the full screen view useful, release a synchronised iPad app and lower the price.

BibDesk + Dropbox + Skim + GoodReader
BibDesk is a Mac BibTeX editor and reference manager. I can record as much metadata as is convenient, associate URLs and PDF files, organise and search. By storing the BibDesk database file and PDF files on Dropbox, both content and metadata are automatically synchronised between all Macs. Dropbox provides 2GB of free storage, which is a great way to get started. Skim is a PDF reader with a decent full screen mode on the Mac. GoodReader (USD$1.19) is a PDF viewer with Dropbox integration on the iPad, although it is much quicker to transfer documents to GoodReader via iTunes.

The main downsides of this setup are the lack of BibDesk support on the iPad and that documents aren't all automatically synchronised locally to the iPad. It is sufficient for me now though, as I am currently only adding new documents periodically.

Web site sponsored by Springer. I haven't investigated it properly, but it looks like you can add references from the major publication repositories or add your own, upload the associated documents and set up social connections. There are no Mac or iPad apps, so you need to manage your local files, and how you view them, yourself.


charles said...

Good overview. Just a quick note: Papers actually **does** index the content of the PDF and that content is searched as well.


Charles Parnot

Brad Clow said...

Charles, yes you are correct. It looks like I was testing with a couple of older papers that have been scanned in as images embedded in PDF, which is why the full text search was not working. I will update the post.

Ian Mulvany said...

Hi Brad,

just wanted to let you know that we at Mendeley now have a preliminary iPad app available. One still needs to download pdf files item by item to that device, but we are working on improvements.

you can get it here: