Can I use Deep Learning to segment specific regions in an invoice and then to apply an OCR method to recognize its content?

Yes, you can, but it's a bad idea.

Unlike numbers on a car license plate, invoices are not created in a fixed format. Many companies have tried extracting invoice information with templates that look for data in specific regions of the page. This approach does not scale. Every vendor and supplier has its own invoice template. Some list their business entity number at the top, others at the bottom. Some call out a specific tax type, others just say "Tax". The number of combinations of these variations is so large that a region-specific extraction approach will not get you anywhere unless everyone you deal with uses a fixed template.

But if you have that much say over the invoice format, you are better off requesting an XML file instead of a PDF and not bothering with extraction at all.
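To make the scaling problem concrete, here is a minimal sketch of region-based extraction. It assumes an OCR step has already produced words with normalized page coordinates; the `Word` class, the `TEMPLATE` regions and the two sample invoices are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge, as a fraction of page width
    y: float  # top edge, as a fraction of page height


# Hypothetical template tuned to one vendor's layout:
# each field is a fixed page region (x0, y0, x1, y1).
TEMPLATE = {
    "invoice_number": (0.60, 0.00, 1.00, 0.10),  # top right
    "total": (0.60, 0.85, 1.00, 1.00),           # bottom right
}


def extract_by_region(words, template):
    """Return, for each field, the text of the words inside its region."""
    result = {}
    for field, (x0, y0, x1, y1) in template.items():
        inside = [w.text for w in words if x0 <= w.x <= x1 and y0 <= w.y <= y1]
        result[field] = " ".join(inside)
    return result


# Vendor A matches the template; vendor B puts the same data elsewhere.
vendor_a = [Word("INV-1001", 0.70, 0.05), Word("118.00", 0.80, 0.90)]
vendor_b = [Word("INV-2002", 0.10, 0.92), Word("59.50", 0.15, 0.30)]

fields_a = extract_by_region(vendor_a, TEMPLATE)
fields_b = extract_by_region(vendor_b, TEMPLATE)
print(fields_a)
print(fields_b)
```

The template recovers both fields for vendor A and nothing for vendor B, even though both invoices contain the same kind of information. Supporting vendor B means writing a second template, and so on for every new layout.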

The right way to look at invoice data extraction is as a text analytics problem rather than an image recognition problem. You need algorithms that can understand what they are reading and what it means.
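As a rough illustration of treating extraction as a text problem, the sketch below looks for fields by what the text says rather than where it sits on the page. The `FIELD_LABELS` vocabulary and the sample invoices are hypothetical, and the hand-written patterns are only a stand-in: a real system would learn these cues from data instead of hard-coding them.

```python
import re

# Hypothetical label vocabulary: several wordings map to one field.
FIELD_LABELS = {
    "invoice_number": r"invoice\s*(?:no\.?|number|#)",
    "tax": r"vat|gst|sales\s+tax|tax",
    "total": r"total\s+due|amount\s+due|total",
}


def extract_fields(text):
    """Find 'label: value' lines anywhere in the OCR text."""
    found = {}
    for field, label in FIELD_LABELS.items():
        # The label must start a line and must not be a prefix of a longer
        # word; the value is whatever follows on that same line.
        pattern = re.compile(
            rf"^[^\S\n]*(?:{label})(?![A-Za-z])[^\S\n]*:?[^\S\n]*(\S[^\n]*?)[^\S\n]*$",
            re.IGNORECASE | re.MULTILINE,
        )
        match = pattern.search(text)
        if match:
            found[field] = match.group(1)
    return found


invoice_a = """ACME Corp
Invoice No: INV-1001
Subtotal: 100.00
VAT: 18.00
Total due: 118.00
"""

invoice_b = """Total: 59.50
Tax: 4.50
Thank you for your business
Invoice # INV-2002
"""

result_a = extract_fields(invoice_a)
result_b = extract_fields(invoice_b)
print(result_a)
print(result_b)
```

Both layouts yield the same three fields even though the order and the wording differ. The rules are still brittle (a label the vocabulary has never seen is simply missed), which is why the matching should ultimately be learned rather than written by hand.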

Think about this: show an accountant an invoice in any format and he figures out what's what with ease. His experience looking at thousands of invoices has taught him how to spot the elements of an invoice irrespective of their location.

You should aim to build an algorithm that learns like that accountant.
