2021-11-22T18:00:00Z

What is your recommended RPA tool for complex document data extraction?

GB
PeerSpot user

3 Answers

MW
Vendor
2021-11-23T21:56:53Z
Nov 23, 2021

Hello George,


My tool recommendation is EMMA RPA (Robotic Process Automation - see https://www.wianco.com/en/emma... for further details). 


Why is it my recommendation? As Managing Director of WIANCO OTT Robotics, it may seem obvious why I am recommending our own solution, EMMA RPA, but I can also put you in contact with a "Big4" customer of ours who carried out extensive research and tool testing on this topic in order to extract data from energy passports for real estate properties in various scan qualities.


Their result was that EMMA RPA extracted the data correctly from 94% of the scanned documents that had very bad quality. For the documents with good scan quality, the data was extracted correctly in 100% of cases.


Please let me know if you are interested in getting in touch with that contact. I can also send you the EMMA RPA presentation slides, which contain further benchmarks that are important when evaluating such a solution.


Have a great week and kind regards,


Michael

Shibu Babuchandran - PeerSpot reviewer
Real User
ExpertModerator
2021-11-26T02:55:09Z
Nov 26, 2021

Hi @George Bennett,


I have used Jiffy.ai at one of my customers' sites, with good feedback, and the project was closed out as the end users expected.


Below are some of the features Jiffy.ai provides for document extraction:


Document Processing is the conversion of paper-based and electronic documents into digital information using a combination of Intelligent Character Recognition (ICR), Optical Character Recognition (OCR), Machine Learning (ML) algorithms, and manual intervention where necessary.


Types of Documents


The types of documents and the nodes which are used to process them are listed below.



  • Documents in PDF format:
    • Use the Doc Reader node to process structured and semi-structured documents.
    • JIFFY.ai does not currently handle unstructured documents.
  • Use the Excel node to process documents in Excel format.


If the document contains images, install ABBYY FineReader to convert the images to editable text, then pass the result through the Doc Reader node to extract the data.
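
To make the routing above concrete, here is a minimal Python sketch. It is not JIFFY.ai code: the function name `plan_processing`, the step labels, and the `structure` and `contains_images` parameters are all made up for illustration, and it only mirrors the rules listed above (PDF to Doc Reader, Excel to the Excel node, an OCR pass first when images are involved).

```python
from pathlib import Path
from typing import List

# Hypothetical step labels that mirror the description above.
# They are illustrative only and are not JIFFY.ai identifiers.
EXCEL_SUFFIXES = {".xls", ".xlsx"}


def plan_processing(filename: str,
                    structure: str = "structured",
                    contains_images: bool = False) -> List[str]:
    """Return the ordered processing steps for one document."""
    suffix = Path(filename).suffix.lower()
    if suffix in EXCEL_SUFFIXES:
        return ["excel_node"]
    if suffix != ".pdf":
        raise ValueError(f"unsupported document type: {filename}")
    if structure == "unstructured":
        # Per the post, unstructured documents are not handled currently.
        raise ValueError("unstructured documents are not handled")
    if contains_images:
        # Image content is converted to editable text before extraction.
        return ["ocr_to_text", "doc_reader_node"]
    return ["doc_reader_node"]
```

For example, `plan_processing("scan.pdf", contains_images=True)` yields the OCR step followed by the Doc Reader step, while an Excel file goes straight to the Excel node.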


Out of the Box Capabilities


In JIFFY.ai, Invoice and Bill of Lading are provided as predefined schemas for ease of use. The Invoice schema comes with thirty-five predefined fields and the Bill of Lading schema with twelve. Jiffy.ai automatically extracts information from these documents without any training and provides out-of-the-box machine learning models for these document types.


The model is already trained for the predefined schemas. When an Invoice or Bill of Lading is processed through the Doc Reader node, you do not need to train the ML. The data is extracted automatically from the documents using the built-in extraction modules.


For other documents, you may have to train the model using the point-and-click familiarization environment provided.
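
As a rough illustration of the predefined-versus-custom distinction, here is a small Python sketch. The field counts are the ones quoted above; the dictionary, the function name `needs_training`, and the example schema name "Energy Passport" are assumptions for illustration and not part of the product.

```python
# Field counts as quoted in the post. The individual field names are
# not listed there, so only the counts are recorded here.
PREDEFINED_SCHEMAS = {
    "Invoice": 35,
    "Bill of Lading": 12,
}


def needs_training(schema_name: str) -> bool:
    """True when a schema is custom and the model must be familiarized first."""
    return schema_name not in PREDEFINED_SCHEMAS


# "Energy Passport" is a made-up custom schema name for this example.
for name in ("Invoice", "Bill of Lading", "Energy Passport"):
    print(name, "needs training:", needs_training(name))
```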


How is Document Processing Done in JIFFY.ai?


Document processing is achieved in four phases:



  • Create a Document Table with the required columns for the fields being extracted from the document. The Document Table is the persistence layer used to store, track, and present the extracted contents of the document being processed.
  • Design the task using the Doc Reader node to extract the fields from the document.
  • Execute the task to:
    • Categorize the documents: classify the document type and identify the classification group that the document falls into, based on the format of the document.
    • Populate the data into the Document Table to store, track, and present the extracted contents of the document being processed. If the Document Table is created using a predefined schema, ML auto-extracts the required data and assigns a category based on the template of the document.
  • Familiarize the document: a user-friendly interface is provided to:
    • Point to the labels and data to be extracted from the document, thereby training the model for the category of document being processed.
    • Verify and approve the fields extracted by the model.



If the Document Table is created using a custom schema, the fields are auto-extracted based on the existing trained model.


In an Invoice Processing HyperApp:


  1. A Document Table named InvoiceTable is created using the Invoice schema.

  2. A Task is designed with the Doc Reader node to extract the fields from the Invoice.

  3. The Task is executed to extract the fields.

The document is familiarized, saved, and approved to train the ML engine for the category of document being processed. The approved fields are populated into InvoiceTable for further processing.
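
The flow described above (create a table, categorize, populate, then review and approve) can be sketched in plain Python. This is only an illustration of the idea, not how JIFFY.ai implements it: the `DocumentTable` class, the `categorize` and `approve` helpers, the column names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DocumentTable:
    """Hypothetical stand-in for the persistence layer described above."""
    name: str
    columns: List[str]
    rows: List[Dict[str, str]] = field(default_factory=list)

    def populate(self, extracted: Dict[str, str], category: str) -> Dict[str, str]:
        # Keep only the declared columns; missing ones become empty strings.
        row = {col: extracted.get(col, "") for col in self.columns}
        row["category"] = category
        row["status"] = "pending_review"
        self.rows.append(row)
        return row


def categorize(layout_signature: str, known: Dict[str, str]) -> str:
    """Map a document layout to a category, creating one if it is unseen."""
    if layout_signature not in known:
        known[layout_signature] = f"category_{len(known) + 1}"
    return known[layout_signature]


def approve(row: Dict[str, str], corrections: Dict[str, str]) -> None:
    """Familiarization step: a reviewer corrects fields, then approves."""
    row.update(corrections)
    row["status"] = "approved"


# Made-up columns and values for the invoice example.
table = DocumentTable("InvoiceTable", ["invoice_number", "total"])
categories: Dict[str, str] = {}
category = categorize("vendor_a_layout", categories)
row = table.populate(
    {"invoice_number": "INV-001", "total": "100.00", "unused": "x"}, category
)
approve(row, {"total": "110.00"})
```

Keeping a status on each row reflects the point that extracted fields are verified and approved before they are used for further processing.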



JF
Reseller
2021-11-24T16:29:22Z
Nov 24, 2021

I would recommend evaluating HelpSystems' RPA product AutoMate! 
