How to Scan a PDF to OCR
Portable Document Format (PDF) files have been in use since the early 1990s. They let a user on almost any computer view a document the way the author intended, with all the fonts and formatting intact. Adobe, the inventor of the PDF format, says that there were over 250 million PDF files on the web as of 2010. Scanning a document typically produces a graphic PDF, in which the text is neither searchable nor extractable. However, Optical Character Recognition (OCR) can convert a graphic PDF into a searchable text file. One method is to use OCR software on your computer; another is to use the OCR feature of a web-based service such as Evernote.
Instructions
-
Converting Using OCR Software
-
1
Connect your scanner to your computer. Set up the scanner correctly so that the computer recognizes it. Often you do this by connecting the scanner to the computer with a USB cable.
-
2
Install any necessary scanner software and drivers on your computer. The scanner software may let you choose among several output file formats, such as JPG, TIFF, BMP and PDF. Some scanner packages also offer a setting called OCR PDF, or searchable PDF. This setting scans the document as a picture, or graphics file, and then processes the image to turn it into searchable text that your computer can read.
-
3
Lift the scanner lid and place the document face down on the glass. If your scanner has a sheet feeder that lets you stack multiple pages to feed through one at a time, you may use it for a multiple-page document. Select the setting in the software that starts the scan. When the scan is complete, the OCR software will activate and convert the picture document to text. Save the document.
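If your scanner software has no OCR PDF setting, the conversion can also be run on a graphic PDF you have already saved. The following is a minimal sketch, assuming the open-source command-line tool OCRmyPDF is installed on your computer; that tool is only one possible choice of OCR software, and the function names and file names here are hypothetical.

```python
import shutil
import subprocess


def build_ocr_command(scanned_pdf, searchable_pdf, language="eng"):
    """Return the OCRmyPDF command line that adds a text layer to a scanned PDF."""
    return [
        "ocrmypdf",
        "-l", language,   # language of the text on the page
        "--deskew",       # straighten pages scanned at a slight angle
        str(scanned_pdf),
        str(searchable_pdf),
    ]


def make_searchable(scanned_pdf, searchable_pdf, language="eng"):
    """Run OCRmyPDF if it is available; return True when it reports success."""
    if shutil.which("ocrmypdf") is None:
        # Assumption: the tool is on the PATH. If it is not, say so and stop.
        print("ocrmypdf not found; use your scanner's OCR PDF setting instead.")
        return False
    result = subprocess.run(build_ocr_command(scanned_pdf, searchable_pdf, language))
    return result.returncode == 0
```

Calling `make_searchable("scan.pdf", "scan_searchable.pdf")` would leave the original scan untouched and write the searchable copy to the second file name.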
Using the Web-based Service Evernote
-
4
Open an Evernote premium account. Evernote is a service that allows you to capture notes, photos, and files and sync these items across multiple devices, using a web-based interface, applications for computers or apps for smartphones. When you upload a PDF with an Evernote account, the service performs its own OCR on the document, without any extra software or processing on your part.
-
5
Scan the document by placing it face down on the scanner bed or in the scanner's sheet feeder. Set the software to scan the document as a PDF. Save the document on your computer.
-
6
Upload a copy of the document to Evernote by creating a "note" in any of the interfaces. Create a note by using the "New Note" button. Attach the PDF document to the note by dragging and dropping.
You can instead email the document as an attachment to the Evernote email address provided in the settings section.
-
7
Wait about ten minutes for Evernote to complete its processing. Then open your note in the Evernote interface and double-click the file attachment to open it. Save a copy of the opened document to a folder of your choice on your computer.
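The email option mentioned in the upload step can be scripted if you send scans often. The following is a minimal sketch that only builds the message, using Python's standard `email` library; the function name and every address shown are hypothetical placeholders, and your actual Evernote email address is the one listed in your account settings.

```python
from email.message import EmailMessage


def build_upload_email(pdf_bytes, filename, sender, evernote_address):
    """Build an email that carries a scanned PDF as an attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = evernote_address          # placeholder for your Evernote email address
    msg["Subject"] = "Scanned document"
    msg.set_content("Scanned document attached.")
    # Attach the PDF so it arrives as a file rather than as inline text.
    msg.add_attachment(
        pdf_bytes,
        maintype="application",
        subtype="pdf",
        filename=filename,
    )
    return msg
```

To send the message, pass it to `smtplib.SMTP(...).send_message(msg)` using the outgoing mail server details from your own email provider.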
Tips & Warnings
If using a scanner and software, set the resolution to 300 dpi. If you use OCR often, you may want to invest in higher-quality software than the free software that came with the scanner.
Evernote will recognize text in photos, and make that searchable as well.
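To see what the 300 dpi setting means in practice, the short sketch below works out the pixel dimensions and raw image size of one scanned page. The US Letter page size (8.5 by 11 inches) and the 3 bytes per pixel for color are assumptions chosen for illustration, and the function names are hypothetical; saved files are normally much smaller because they are compressed.

```python
def scan_pixels(width_in, height_in, dpi=300):
    """Return the (width, height) of a scan in pixels."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_megabytes(width_in, height_in, dpi=300, bytes_per_pixel=3):
    """Return the raw image size in megabytes (1 MB = 1,000,000 bytes)."""
    width_px, height_px = scan_pixels(width_in, height_in, dpi)
    return width_px * height_px * bytes_per_pixel / 1_000_000


# A US Letter page at 300 dpi:
print(scan_pixels(8.5, 11))              # (2550, 3300)
print(uncompressed_megabytes(8.5, 11))   # 25.245
```

Doubling the resolution to 600 dpi quadruples the pixel count, which is why a higher setting slows scanning and processing.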