It depends. Does the PDF contain only images, or does it contain text? What is the objective?

A great option is to use the tools available at the Linux command line. I use the `pdftotext` command-line tool to pull out the text, and for images you can use another command-line tool called `pdfimages` (see also How to Extract and Save Images from a PDF File in Linux). You will first need these tools installed; then call them from the Ruby script you are running, passing in any arguments the tool needs.

Install the Poppler utilities, for example on Ubuntu (Wily, 15.10): `sudo apt-get install poppler-utils`. `pdftotext` is Poppler's Portable Document Format (PDF) to text converter. Call `pdftotext` with no arguments and you will see its options.

Another option is to try the yob/pdf-reader gem, which seems to have a lot of followers and some activity.

The basic rule is to try it out. If your task is simple, it should be suitable. On Linux, use the tools that have already been built. That is the strength of Linux: each tool does one thing and does it well. String a few tools together and you have excellence.
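Calling the Poppler tools from Ruby could look roughly like the sketch below. It assumes `pdftotext` and `pdfimages` are on your PATH and that your Poppler build supports the `-png` flag of `pdfimages` (older builds may not). The method names (`pdftotext_command`, `pdfimages_command`, `run_tool`, `extract_text`, `extract_images`) are made up for illustration and are not part of any library.

```ruby
require "open3"

# Build the argument list for pdftotext.
# "-layout" keeps the physical layout of the page; "-" as the output
# file tells pdftotext to write the text to standard output.
def pdftotext_command(pdf_path, layout: false)
  cmd = ["pdftotext"]
  cmd << "-layout" if layout
  cmd + [pdf_path, "-"]
end

# Build the argument list for pdfimages.
# "-png" asks for PNG output (assumption: your Poppler version supports it).
def pdfimages_command(pdf_path, output_prefix)
  ["pdfimages", "-png", pdf_path, output_prefix]
end

# Run a command given as an array, so no shell is involved and file names
# with spaces or quotes are passed through untouched.
def run_tool(cmd)
  stdout, stderr, status = Open3.capture3(*cmd)
  raise "#{cmd.first} failed: #{stderr.strip}" unless status.success?
  stdout
rescue Errno::ENOENT
  raise "#{cmd.first} not found; install poppler-utils first"
end

# Return the text of the PDF as a String.
def extract_text(pdf_path, layout: false)
  run_tool(pdftotext_command(pdf_path, layout: layout))
end

# Write the embedded images next to output_prefix and return their paths.
# pdfimages names its files "<prefix>-NNN.png".
def extract_images(pdf_path, output_prefix)
  run_tool(pdfimages_command(pdf_path, output_prefix))
  Dir.glob("#{output_prefix}-*.png").sort
end
```

With that in place, something like `puts extract_text("report.pdf")` prints the text, and `extract_images("report.pdf", "img")` returns the list of image files that were written (here `report.pdf` and `img` are placeholder names). Passing the command as an array rather than one string is a deliberate choice: Ruby then runs the tool directly instead of going through a shell.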
I'm sure you are all great programmers.