Example of how to 1) upload files to AWS S3, 2) process the PDF files via AWS Textract and 3) send a link to a form for validating the data from the PDF. What you need to do is decide where the data from the form should go. But that is a different story and a different Blueprint :-)
Amazon Textract
Amazon Textract is a machine learning service that automatically extracts text, handwriting, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract data from forms and tables. Read more: https://aws.amazon.com/textract/
Here is an example of the PDF files that are in the "inbox".
Here is how Amazon Textract sees the PDF as a form.
Here is how the data ends up in the form in Onify.
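To make the step from Textract output to form data more concrete, here is a minimal sketch of how key-value pairs can be read from a Textract FORMS result. It assumes the standard Textract block structure (`KEY_VALUE_SET` blocks linked through `VALUE` and `CHILD` relationships). The helper names `textOf` and `extractFormFields` and the sample data are made up for illustration and are not taken from the Blueprint scripts.

```javascript
// Sketch only: helper names and sample data are hypothetical.

// Join the text of all CHILD blocks (words or checkboxes) of a block.
function textOf(block, blocksById) {
  const parts = [];
  for (const rel of block.Relationships || []) {
    if (rel.Type !== 'CHILD') continue;
    for (const id of rel.Ids) {
      const child = blocksById.get(id);
      if (!child) continue;
      if (child.BlockType === 'WORD') parts.push(child.Text);
      if (child.BlockType === 'SELECTION_ELEMENT') parts.push(child.SelectionStatus);
    }
  }
  return parts.join(' ');
}

// Build a plain { key: value } object from a list of Textract blocks.
function extractFormFields(blocks) {
  const blocksById = new Map(blocks.map((b) => [b.Id, b]));
  const fields = {};
  for (const block of blocks) {
    const isKey = block.BlockType === 'KEY_VALUE_SET' &&
      (block.EntityTypes || []).includes('KEY');
    if (!isKey) continue;
    // A KEY block points to its VALUE block through a VALUE relationship.
    const valueRel = (block.Relationships || []).find((r) => r.Type === 'VALUE');
    const valueBlock = valueRel ? blocksById.get(valueRel.Ids[0]) : undefined;
    fields[textOf(block, blocksById)] = valueBlock ? textOf(valueBlock, blocksById) : '';
  }
  return fields;
}

// Hand-made sample that mimics a tiny part of a Textract response.
const sampleBlocks = [
  { Id: 'k1', BlockType: 'KEY_VALUE_SET', EntityTypes: ['KEY'],
    Relationships: [{ Type: 'VALUE', Ids: ['v1'] }, { Type: 'CHILD', Ids: ['w1', 'w2'] }] },
  { Id: 'v1', BlockType: 'KEY_VALUE_SET', EntityTypes: ['VALUE'],
    Relationships: [{ Type: 'CHILD', Ids: ['w3'] }] },
  { Id: 'w1', BlockType: 'WORD', Text: 'Invoice' },
  { Id: 'w2', BlockType: 'WORD', Text: 'number:' },
  { Id: 'w3', BlockType: 'WORD', Text: '12345' },
];

console.log(extractFormFields(sampleBlocks));
```

An object like this is easy to map to form fields, which is roughly what the last step of the flow needs.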
- Onify Hub API 2.3.0 or later
- Mail configured in Onify Hub
- Onify Agent (tagged `agent`)
- Onify Flow license
- Node.js installed (on agent)
- Camunda Modeler 4.4 or later
- Amazon AWS services: S3 Bucket, SNS and SQS
- 1 x Flow
- 3 x Scripts (Node.js)
In order for this to work you need the following setup:
- Amazon S3 Bucket
- AWS user with permissions
- Note down the access key (`accessKeyId`) and secret access key (`secretAccessKey`) for the AWS user
NOTE: For more information, please read Configuring Amazon Textract for Asynchronous Operations
NOTE: Amazon Textract is not available in all regions. Also make sure the S3 bucket and Textract are in the same region.
- Copy files from `.\resources\agent\scripts` to the `.\scripts` folder on Onify Agent.
- Run `npm install` from the `.\scripts` folder.
- Update `aws_config.json` with AWS credentials and region.
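As a rough illustration of the last step, the sketch below shows what the content of `aws_config.json` might look like and a small check for missing values. It assumes the file holds the three settings mentioned in this README (`accessKeyId`, `secretAccessKey` and a region). The key name `region`, the placeholder values and the helper `missingConfigKeys` are assumptions for this example, so check them against the file shipped with the scripts.

```javascript
// Sketch only: assumed shape of aws_config.json, with placeholder values.
const exampleConfig = {
  accessKeyId: 'YOUR_ACCESS_KEY_ID',
  secretAccessKey: 'YOUR_SECRET_ACCESS_KEY',
  region: 'eu-west-1', // placeholder, use a region where Textract is available
};

// Return the names of required settings that are missing or empty.
// missingConfigKeys is a hypothetical helper, not part of the Blueprint.
function missingConfigKeys(config) {
  const required = ['accessKeyId', 'secretAccessKey', 'region'];
  return required.filter(
    (key) => typeof config[key] !== 'string' || config[key].trim() === ''
  );
}

console.log(missingConfigKeys(exampleConfig));
console.log(missingConfigKeys({ region: 'eu-west-1' }));
```

The first call reports no missing keys and the second one reports the two credential keys. A check like this can be handy to run on the agent before starting the flow.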
Update the flow (`aws-textract-pdf-to-form.bpmn`) with your own variables:
- `inboxPath` - Path to the PDF files
- `bucket` - S3 bucket to upload files to
- `mailTo` - Where to send the link to the form
- `onifyUrl` - URL to Onify APP (default is http://localhost:3000)
- `roleArn` - The Amazon Resource Name (ARN) of an IAM role that gives Amazon Textract publishing permissions to the Amazon SNS topic
- `snsTopicArn` - The Amazon SNS topic that Amazon Textract posts the completion status to
- `sqsQueueUrl` - Amazon SQS URL that is subscribed to the SNS topic
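To show how these variables fit together, here is a hedged sketch of the request parameters for an asynchronous Textract job and of reading the completion notification from the queue. It assumes the scripts use the `StartDocumentAnalysis` operation with the `FORMS` feature and that the SQS message body is the usual SNS envelope (raw message delivery turned off). All values, the object key and the helper names are hypothetical.

```javascript
// Sketch only: placeholder values for the flow variables.
const flowVars = {
  bucket: 'my-textract-inbox',
  roleArn: 'arn:aws:iam::123456789012:role/TextractSnsRole',
  snsTopicArn: 'arn:aws:sns:eu-west-1:123456789012:AmazonTextractTopic',
};

// Build parameters for Textract StartDocumentAnalysis.
// Textract publishes the job status to the SNS topic using the given role.
function buildStartAnalysisParams(vars, objectKey) {
  return {
    DocumentLocation: { S3Object: { Bucket: vars.bucket, Name: objectKey } },
    FeatureTypes: ['FORMS'],
    NotificationChannel: { RoleArn: vars.roleArn, SNSTopicArn: vars.snsTopicArn },
  };
}

// The SQS body is an SNS envelope whose Message field is itself a JSON string.
function parseCompletionMessage(sqsBody) {
  const envelope = JSON.parse(sqsBody);
  const notification = JSON.parse(envelope.Message);
  return {
    jobId: notification.JobId,
    succeeded: notification.Status === 'SUCCEEDED',
  };
}

// Hand-made sample of what a message from the queue could look like.
const sampleSqsBody = JSON.stringify({
  Type: 'Notification',
  Message: JSON.stringify({
    JobId: 'abc123',
    Status: 'SUCCEEDED',
    API: 'StartDocumentAnalysis',
  }),
});

console.log(buildStartAnalysisParams(flowVars, 'invoice.pdf'));
console.log(parseCompletionMessage(sampleSqsBody));
```

Once a job is reported as succeeded, the `jobId` is what you would use to fetch the analysis result from Textract.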
- Open `aws-textract-pdf-to-form.bpmn` in Camunda Modeler
- Click `Start current diagram`
- Community/forum: https://support.onify.co/discuss
- Documentation: https://support.onify.co/docs
- Support and SLA: https://support.onify.co/docs/get-support
This project is licensed under the MIT License - see the LICENSE file for details.