Textract
Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents.
It goes beyond simple optical character recognition (OCR) to identify, understand, and extract data from forms and tables.
It automatically extracts printed text, handwriting, and data from supported document types such as PDFs and scanned images.
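As a rough illustration of plain text extraction, the sketch below calls Textract's DetectDocumentText operation through boto3 and keeps only the LINE blocks from the response. The function names, the local file path argument, and the default region are assumptions made for this example; running detect_text also requires boto3 and valid AWS credentials.

```python
def extract_lines(response):
    """Return the text of every LINE block in a Textract response, in order."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_text(image_path, region_name="us-east-1"):
    """Send a local image to Textract and return its lines of text.

    Hypothetical helper: the region default is an assumption, and the call
    needs boto3 installed plus AWS credentials with Textract access.
    """
    import boto3  # imported here so extract_lines works without boto3

    client = boto3.client("textract", region_name=region_name)
    with open(image_path, "rb") as f:
        response = client.detect_document_text(Document={"Bytes": f.read()})
    return extract_lines(response)
```

The response also contains PAGE and WORD blocks; filtering on LINE is just one convenient way to get readable output.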
Features:
Optical character recognition (OCR)
Identifies relationships, structure, and text
Uses AI to extract text and structured data
Recognizes handwriting as well as printed text
Can extract from documents such as PDFs and images, including the forms and tables they contain
Understands context. For example, it knows which data to extract from a receipt or invoice
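To make the "relationships and structure" feature more concrete, here is a minimal sketch of how form fields could be read from an AnalyzeDocument response requested with FeatureTypes=["FORMS"]. In that response, KEY_VALUE_SET blocks of entity type KEY point to their VALUE blocks, and both point to CHILD word blocks that hold the text. The helper names are hypothetical, and the sketch ignores pagination and confidence scores.

```python
def _text_of(block, blocks_by_id):
    """Join the text of a block's CHILD words and selection marks."""
    parts = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                parts.append(child["Text"])
            elif child["BlockType"] == "SELECTION_ELEMENT":
                parts.append(child["SelectionStatus"])
    return " ".join(parts)


def extract_key_values(response):
    """Map each form key to its value text in an AnalyzeDocument response."""
    blocks_by_id = {b["Id"]: b for b in response.get("Blocks", [])}
    pairs = {}
    for block in blocks_by_id.values():
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        value_parts = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                # Follow the KEY -> VALUE link, then read the value's words.
                for value_id in rel["Ids"]:
                    value_parts.append(_text_of(blocks_by_id[value_id], blocks_by_id))
        pairs[_text_of(block, blocks_by_id)] = " ".join(value_parts)
    return pairs
```

A response would typically come from boto3's textract client, for example client.analyze_document(Document={"Bytes": data}, FeatureTypes=["FORMS"]). For receipts and invoices, Textract also offers a separate AnalyzeExpense operation with its own response shape, which this sketch does not cover.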