Lucion Technologies Home

What is a "Searchable PDF"?

The PDF file format can be confusing, especially when it comes to understanding what constitutes a "searchable" PDF file. To understand whether a PDF file is searchable, you have to look at its origin.

Text-Based PDF

First, a PDF file can originate from a file on your computer, such as a Word document. Normally, you create the file in your software and then "print" it to a PDF printer, which converts the file to PDF format. These files are text-based PDFs, meaning that they retain the text and formatting of the original. Text-based PDF files are searchable because they contain real text.

Image-Based PDF

PDF files can also originate from a scan or a fax. These are image-based PDF files, meaning that each page is simply a picture of the original. To your computer, these images are no different from digital photos or graphics, and it does not see any text in them. To make these files searchable, you must "recognize" the text in the image using optical character recognition ("OCR"). OCR creates text from the "pictures" of the letters and then inserts that text invisibly behind the image. Without OCR, an image-based PDF file is not searchable.
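
For readers who prefer to script the OCR step, here is a minimal sketch in Python. It assumes the open-source OCRmyPDF command-line tool is installed and on your PATH. That tool is not mentioned in this article and is only one of several ways to add a text layer. The helper names build_ocr_command and run_ocr, and the file names, are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def build_ocr_command(src: PathLike, dst: PathLike, language: str = "eng") -> List[str]:
    """Build the argument list for an OCRmyPDF run (hypothetical helper).

    --skip-text leaves pages that already contain text untouched,
    so only image-only pages are sent through OCR.
    -l selects the recognition language.
    """
    return ["ocrmypdf", "--skip-text", "-l", language, str(src), str(dst)]


def run_ocr(src: PathLike, dst: PathLike, language: str = "eng") -> None:
    """Run OCRmyPDF; raises CalledProcessError if the tool reports a failure."""
    subprocess.run(build_ocr_command(src, dst, language), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_ocr(...) to actually process a file.
    print(" ".join(build_ocr_command("scan.pdf", "scan_searchable.pdf")))
```

The output file keeps the scanned image on each page and gains an invisible text layer behind it, which is what makes the PDF searchable.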

Test Whether a PDF Has Text

If you're in doubt, there's an easy way to see whether a PDF file is searchable. Open the file in Adobe Acrobat and choose "Select All" from the "Edit" menu. This selects all of the text in the file. If nothing is selected, there is no text and the file isn't searchable. To double-check, press Ctrl+C after selecting all to copy the text, then paste it into a word processor. If you are able to paste text from the PDF, the file is searchable.
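
The same test can be automated when you have many files to check. The sketch below is one possible approach, not part of Adobe Acrobat: it assumes the third-party pypdf library is installed, and the function names has_searchable_text and pdf_is_searchable are made up for illustration. It treats a file as searchable if any page yields non-whitespace text.

```python
from typing import Iterable


def has_searchable_text(page_texts: Iterable[str], min_chars: int = 1) -> bool:
    """Return True if the extracted page texts contain at least
    min_chars non-whitespace characters in total (hypothetical helper)."""
    total = sum(len("".join(text.split())) for text in page_texts)
    return total >= min_chars


def pdf_is_searchable(path: str) -> bool:
    """Extract text from every page of a PDF and check whether any was found.

    Assumes the third-party pypdf package (pip install pypdf).
    """
    from pypdf import PdfReader  # imported here so the helper above works without pypdf

    reader = PdfReader(path)
    # extract_text() may return an empty string for image-only pages.
    return has_searchable_text((page.extract_text() or "") for page in reader.pages)


if __name__ == "__main__":
    import sys

    for pdf_path in sys.argv[1:]:
        status = "searchable" if pdf_is_searchable(pdf_path) else "not searchable"
        print(f"{pdf_path}: {status}")
```

Raising min_chars can help ignore files whose only text is a stray stamp or page number, although the right threshold depends on your documents.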

 