Cognitive Data Capture is the technology that allows you to capture data from semi-structured documents. Think invoices, bills of lading, CMRs, orders and the like, which make the perfect starting point for automating your document processing. The future is looking bright, and AI is already helping us get rid of menial, manual tasks today.
What semi-structured documents have in common is that they all contain some sort of visual cues which we humans understand at a glance, even if we’ve never seen this particular document before. The ultimate goal of Automatic Document Processing (ADP) is to mimic this and read a document just like a human would. Adding AI to the mix is bringing us one step closer to this moon shot. For now, though, count on AI to bring you added value at different stages of your document processing automation.
In the pre-processing stage, AI helps to ‘scan’ full documents or pages, creating a flow in which all the information is neatly classified. The AI recognises a signature as a signature, and doesn’t confuse it with a barcode, logo, stamp or whatever else might be on there. Whether it’s images or text, the technology reads semi-structured documents and relies on neural networks, which are easily trained by feeding them examples. Schooled with this info, the AI extracts the right features out of your documents, before classifying them correctly and storing the information where it belongs.
Where capturing is concerned, no need to reinvent the wheel: this can already be done via existing trained AI models. Especially for invoices, these trained models exist as smart online APIs. Yes, there are still some limitations. Some APIs are limited to header fields or bound to certain languages. But things are evolving rapidly. A lot of engineering and training effort is going into making these models more accurate, recognising many more types of documents and increasing language and geography coverage. The focus is on invoices, mostly, meaning there isn’t any ‘whatever document’ feature available just yet.
However, technology never ceases to amaze. If you’re looking to decipher a very specific type of document, it is possible to train the Azure Form Recognizer on your own examples. The results are not quite on par with the pre-trained models, but very close. And more is coming, because this feature will soon be integrated into O365 as a new Cortex service. What this means is you’ll be able to train and use your own AI models as part of your automation strategy. Looks like the future isn’t that far away after all.
AI’s role isn’t quite played out once the data is extracted, though. Once we’re in the next stage of processing the data, AI can play its role as a big data cruncher. It can detect malicious data patterns or help you with auto suggestions for data entry based on your previous activities, as well as assist end-users in completing the work more efficiently.
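To make the auto-suggestion idea a bit more tangible, here is a minimal sketch in Python. It is not how any particular product works: the `EntrySuggester` class, the supplier names and the cost-centre field are all hypothetical, and a simple frequency count stands in for whatever model would do this in practice. It just proposes the value a user entered most often for the same supplier.

```python
from collections import Counter, defaultdict
from typing import Optional


class EntrySuggester:
    """Hypothetical helper: suggests a field value based on past entries."""

    def __init__(self) -> None:
        # supplier -> how often each value was entered for that supplier
        self._history = defaultdict(Counter)

    def record(self, supplier: str, value: str) -> None:
        """Remember what the user entered for this supplier."""
        self._history[supplier][value] += 1

    def suggest(self, supplier: str) -> Optional[str]:
        """Return the most frequently entered value, or None if unknown."""
        counts = self._history.get(supplier)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


suggester = EntrySuggester()
suggester.record("ACME Ltd", "CC-100")
suggester.record("ACME Ltd", "CC-100")
suggester.record("ACME Ltd", "CC-200")
print(suggester.suggest("ACME Ltd"))
```

The end user still has the final say: the suggestion only pre-fills the field, and every confirmed entry feeds back into the history.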
Endless possibilities? Maybe, but we’re not quite there yet. Although AI clearly provides added value, it doesn’t solve everything and has some disadvantages, too. Let’s say a field or table element is not recognised, or is recognised wrongly: the AI won’t be able to immediately correct that mistake for the next document with the same layout.
Such a mistake persists for one of two reasons: either the model needs re-training before it can pick up the fix, or this specific layout is so far removed from the model’s trained ‘generic’ document understanding that it will never be trained on it at all. Translation: the correction loop from the end user back to the trained model is long, and he or she will have to manually overwrite that same error many, many times.
No need to despair, though, because there is a workaround for AI’s long correction loop. Combine your neural network with our layout-based, point-and-click learning system and its short correction loop, and you’ve got yourself the perfect mix. This setup will capture generic data for unknown layouts, as well as layout-specific data for repeating documents. The result? Optimised automated document processing, aka less work on the administrative side of things.
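As a rough illustration of that mix, consider the Python sketch below. Everything in it is an assumption made for the example: `generic_model` is a toy stand-in for a trained AI model, `HybridCapture` and its `teach` method are hypothetical names, and a regular expression stands in for whatever a point-and-click correction would actually store. The point is only the order of operations: the generic model runs first, and rules learned for a known layout override its output straight away, without any re-training.

```python
import re
from typing import Callable, Dict, Optional

Fields = Dict[str, Optional[str]]


def generic_model(text: str) -> Fields:
    """Toy stand-in for a pre-trained model: only knows 'Invoice No:' labels."""
    match = re.search(r"Invoice No:\s*(\S+)", text)
    return {"invoice_number": match.group(1) if match else None}


class HybridCapture:
    """Generic model for unknown layouts, learned rules for repeating ones."""

    def __init__(self, model: Callable[[str], Fields]) -> None:
        self.model = model
        # layout id -> field name -> extraction pattern learned from a correction
        self.layout_rules: Dict[str, Dict[str, str]] = {}

    def teach(self, layout_id: str, field_name: str, pattern: str) -> None:
        """Short correction loop: one user correction becomes a layout rule."""
        self.layout_rules.setdefault(layout_id, {})[field_name] = pattern

    def capture(self, layout_id: str, text: str) -> Fields:
        fields = dict(self.model(text))  # generic result first
        for name, pattern in self.layout_rules.get(layout_id, {}).items():
            match = re.search(pattern, text)
            if match:  # a layout rule wins over the generic result
                fields[name] = match.group(1)
        return fields


capture = HybridCapture(generic_model)
document = "ACME Ltd\nRef# A-1042\nTotal: 99.00"
before = capture.capture("acme", document)  # generic model misses the field
capture.teach("acme", "invoice_number", r"Ref#\s*(\S+)")
after = capture.capture("acme", document)   # fixed for every next ACME document
print(before, after)
```

In this sketch a single correction is enough for all later documents with the same layout, while documents with an unknown layout still fall back on the generic model.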