Improve Your Paperless Document Searches
Financial advisers create large volumes of data due to the comprehensive nature of the financial planning process. Portions of the material generated include documents that can easily be searched, such as Word documents of financial plans, Excel worksheets with calculations, and email correspondence with clients and allied professionals.
There are also many types of documents that advisers do not generate, such as third-party brokerage statements and insurance policies supplied by a client during the data-gathering process. Adviser firms are adopting paperless office practices in increasing numbers to reduce the amount of paper stored for all of these documents. However, scanned versions of these third-party documents pose a problem to advisers, as they cannot quickly be searched for information critical to the comprehensive financial plan.
Optical Character Recognition (OCR)
Advisers can use Optical Character Recognition technology, or OCR, to create searchable, full-text indexes for documents that are not created in-house. There are a variety of OCR tools available, but one of the most prevalent (and likely to already be installed in most advisory firms) is the OCR utility included in Adobe Acrobat.
Most of the paperless office implementations I’ve seen involve scanning documents to Adobe PDFs to be stored on a server or in a Document Management System. By using the OCR tool integrated in Acrobat, advisers can add “invisible” text on top of a scanned document so it becomes searchable. The OCR will provide its best guess on the words and numbers in the document, so it will not be 100% accurate. Nevertheless, a little inaccuracy is not a deal-breaker for general search purposes as long as most of the critical information (e.g. client name, address, account number) is adequately identified.
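To illustrate why a little OCR inaccuracy need not defeat a search, here is a minimal sketch of approximate matching using Python's standard `difflib` module. It is not how Acrobat or any particular search tool works. The function name `fuzzy_find`, the 0.8 threshold, and the sample account number and client name are all made up for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_find(query, ocr_text, threshold=0.8):
    """Return (word, score) pairs from ocr_text that closely resemble query."""
    query = query.lower()
    hits = []
    for word in ocr_text.split():
        # Drop surrounding punctuation so "Smith," still compares as "smith".
        candidate = word.strip(".,;:()").lower()
        score = SequenceMatcher(None, query, candidate).ratio()
        if score >= threshold:
            hits.append((candidate, round(score, 2)))
    return hits


# Hypothetical OCR output: the scanner misread a zero as the letter O.
scanned = "Account number: 4471-O0932 Owner: Jane Sample"
print(fuzzy_find("4471-00932", scanned))
```

An exact search for the account number would miss this statement, while the approximate match still finds it. The trade-off is that a lower threshold returns more false positives, so the cutoff would need tuning against real scans.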
Batch Processing with OCR
Anything worth doing once is worth repeating over and over. Running the OCR tool within Acrobat file by file would succeed in producing searchable indexes, but it would be far too time-consuming across an entire archive of documents. Fortunately, Acrobat includes a batch processing option to perform the OCR process on a large number of PDFs.
By now, readers should know that I try to identify efficient and scalable solutions for a variety of tasks. So instead of duplicating work already completed by someone else, I’m providing a link to the Acrobat for Legal Professionals blog. Here is an excellent step-by-step tutorial on how to use Acrobat’s built-in OCR function in a batch process to extract full text from scans saved as PDF files. The tutorial is shown for Acrobat Professional 7.0, but the steps are still valid for the latest release.
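For firms that script their own workflow rather than using Acrobat's batch dialog, the general shape of a batch run can be sketched in a few lines of Python. This is only an outline under stated assumptions: `batch_ocr` and the `.ocr-done` marker files are hypothetical names, and `ocr_one` stands in for whatever OCR tool the firm actually uses. The sketch does not call Acrobat.

```python
from pathlib import Path


def batch_ocr(root, ocr_one, marker_suffix=".ocr-done"):
    """Apply ocr_one to every PDF under root, skipping files already done.

    ocr_one is a caller-supplied function (a placeholder in this sketch)
    that accepts the Path of one PDF and runs the firm's OCR tool on it.
    """
    processed, skipped, failed = [], [], []
    # Note: the "*.pdf" pattern is case-sensitive on some operating systems.
    for pdf in sorted(Path(root).rglob("*.pdf")):
        # An empty marker file next to the PDF records a finished OCR pass.
        marker = pdf.with_name(pdf.name + marker_suffix)
        if marker.exists():
            skipped.append(pdf)
            continue
        try:
            ocr_one(pdf)
        except Exception as exc:
            # One unreadable scan should not stop the rest of the batch.
            failed.append((pdf, str(exc)))
            continue
        marker.touch()
        processed.append(pdf)
    return processed, skipped, failed
```

A call such as `batch_ocr("scans", my_ocr_function)` returns three lists: files processed in this run, files skipped because they were finished earlier, and files that failed along with the error message. Because finished files are skipped, the same command can be rerun each night as new scans arrive.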
Searching the OCR Documents
With the full text of any scanned PDF document available, advisers can search for any and all types of data. Searches can be performed for specific words, phrases, client names, account numbers, securities, you name it. Most Document Management Systems include tools to search the full text of PDF documents. However, one does not need a DMS program to do the searching. The standard Windows Explorer search tool will work, but better yet, Google Desktop likely will operate faster and deliver more relevant search results (granted, after all the files to be searched have been indexed by Google Desktop).
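For readers curious about what a full-text index does behind the scenes, the following is a minimal sketch of an inverted index in Python. It is not how Windows search, Google Desktop, or any DMS is implemented. The function names, file names, and document text are invented for illustration, and the search simply requires every query word to appear somewhere in the document.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        matches &= index.get(token, set())
    return sorted(matches)


# Hypothetical OCR output, keyed by file name.
documents = {
    "smith_brokerage_2008-03.pdf": "Statement for John Smith account 55512 IBM 100 shares",
    "smith_life_policy.pdf": "Policy owner John Smith term life beneficiary Mary Smith",
    "jones_brokerage_2008-03.pdf": "Statement for Ann Jones account 77431 IBM 50 shares",
}
index = build_index(documents)
print(search(index, "smith ibm"))      # documents mentioning both words
print(search(index, "statement IBM"))  # queries are case-insensitive
```

The index is built once, so each search is a handful of set lookups rather than a scan of every file. This is why a search tool needs time to index the files first and is fast afterward.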
So this raises the question: if one is able to scan all documents, obtain full-text indexes for everything, and then search all documents with a powerful search tool, is there really a need for a Document Management System? Both sides of the DMS debate have persuasive arguments, so I think it is ultimately a matter of satisfying personal needs and requirements. Perhaps I’ll investigate the pros and cons in a future entry.
Nevertheless, if advisers are not generating full-text indexes for their paperless documents using some kind of OCR, they will never have the opportunity to quickly and efficiently search the contents of those documents.
Bonus Thoughts
Go outside the box when performing OCR on scanned documents. Don’t limit it just to client-provided documents. Scan and OCR:
- CFP® study materials so you can search for those questions on ILITs or QPRTs.
- Relevant magazine and journal articles (that aren’t available for viewing/printing on the web).
- Corporate and personal tax returns and supporting documents (e.g. receipts, W-2s).
Storage space for scanned documents is inexpensive, and powerful search tools continue to get better and better. Adopting this practice should result in a huge improvement in efficiency, especially for those advisers frustrated by never being able to find anything on the server. Sound familiar?
April 26th, 2008 at 4:58 am
Bill makes some valid arguments for the financial planning community to get serious about their document management challenges.
In most cases we do not see this as a core competency of many financial planning firms and would recommend that they evaluate their outsourcing options with a professional document imaging vendor. Yes, we are such a company, but there are many others in our industry that can provide these services.
Your community should be aware that there are very few companies (but we are one of them) that are SAS 70 Type II certified and PCI certified. We have made these major investments to assure our clients (in all industries) that there is a secure chain of custody of their documents from the time they leave their office until the digital images and related data are returned.
There is a science to every industry and we are one of the national leaders in document management systems and services and would be happy to discuss your paperwork challenges and provide you with options that can meet every budget and time frame.
We provide our services through nine full-service secure and seamlessly integrated production facilities strategically located throughout the United States.
Bob Zagami
DataBank IMX
General Manager, New England Region
781-83-2500, Ext 3552
bobzagami@databankimx.com