Folders
Use folders to group documents together. Each folder belongs to an account (the owner) and can contain one or more files.
Supported formats
Our IDP can process the following formats:
- TIFF, including multi-page TIFF
- PNG
Preprocessing
Our API preprocesses documents and their images while maintaining the source resolution. Images contained in PDF files are extracted in their original format and compression. If 95% or more of the pixels in the source file are grayscale, the destination file is always created in grayscale. Summary of processing events:
- Extract images
- Remove metadata
- Rotate if needed
- Deskew pages
- Remove punch holes
- Remove border lines
- Adjust for page margins
- Remove page artifacts (caused by scanners and smudges)
- Enhance text
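The grayscale rule described above can be illustrated with a small sketch. It assumes a pixel counts as grayscale when its red, green, and blue values are equal; how the API actually classifies pixels is not documented here, and the function name is hypothetical.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def is_mostly_grayscale(pixels: Iterable[Pixel], threshold_percent: int = 95) -> bool:
    """Return True when at least `threshold_percent` of the pixels are grayscale.

    Assumption: a pixel is grayscale when R == G == B.
    """
    total = 0
    gray = 0
    for r, g, b in pixels:
        total += 1
        if r == g == b:
            gray += 1
    if total == 0:
        return False  # no pixels, so nothing to base a decision on
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return gray * 100 >= threshold_percent * total


# 19 gray pixels and 1 colored pixel: exactly 95% grayscale.
sample = [(128, 128, 128)] * 19 + [(200, 30, 30)]
print(is_mostly_grayscale(sample))  # True
```

A page with a single colored stamp on an otherwise gray scan would pass this check, which matches the "95% or more" wording above.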
Documents
You can list, delete, upload, and download documents. Each document can be associated with a company. When you upload a document, you can choose from four different transformations.
transformToEntity
This transforms each page in the file into a set of structured entities
in JSON format. Think of it as an Excel sheet: the data is returned in
columns and rows. It handles the complexity of both bordered and
borderless tables.
transformToSearchablePDF
This transforms the source file into a searchable, visually pleasing,
linearized PDF that adheres to PDF version 1.4.
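One way to sanity-check such an output is to look at the first bytes of the file. The sketch below is a heuristic, not a validator: it reads the version from the `%PDF-` header and looks for the `/Linearized` key, which the PDF specification requires to appear within the first 1024 bytes of a linearized file. The function name is made up for this illustration.

```python
import re
from typing import Optional, Tuple


def inspect_pdf_header(data: bytes) -> Tuple[Optional[str], bool]:
    """Return (version, looks_linearized) based on the start of a PDF file."""
    head = data[:1024]
    # The header line looks like "%PDF-1.4".
    match = re.match(rb"%PDF-(\d\.\d)", head)
    version = match.group(1).decode("ascii") if match else None
    # Heuristic: the linearization dictionary carries a /Linearized key.
    looks_linearized = b"/Linearized" in head
    return version, looks_linearized


# Tiny hand-written fragment standing in for the start of a real file.
sample = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 12345 >>\nendobj\n"
print(inspect_pdf_header(sample))  # ('1.4', True)
```

With a real file, passing the result of `open(path, "rb").read(1024)` is enough, since only the first 1024 bytes are examined.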
transformToOCR
An hOCR result is provided that can be used to feed a search database.
It contains the location and the contents of each text box, along with
a full string representing the content of the page.
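As a rough illustration of consuming such a result, the sketch below pulls word boxes out of an hOCR fragment using only the Python standard library. It assumes the output marks words with the `ocrx_word` class and a `bbox x0 y0 x1 y1` entry in the `title` attribute, as the hOCR format describes; the sample markup is hand-written and the exact elements this API emits may differ.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

BBox = Tuple[int, int, int, int]  # x0, y0, x1, y1 in pixels


def parse_bbox(title: str) -> Optional[BBox]:
    """Extract 'bbox x0 y0 x1 y1' from an hOCR title attribute."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            x0, y0, x1, y1 = (int(p) for p in parts[1:])
            return (x0, y0, x1, y1)
    return None


class HocrWordParser(HTMLParser):
    """Collect (bbox, text) pairs from elements with class 'ocrx_word'."""

    def __init__(self) -> None:
        super().__init__()
        self.words: List[Tuple[BBox, str]] = []
        self._bbox: Optional[BBox] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if "ocrx_word" in (attr.get("class") or "").split():
            self._bbox = parse_bbox(attr.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append((self._bbox, "".join(self._text).strip()))
            self._bbox = None


# Hand-written sample, not actual API output.
sample = """
<div class='ocr_page' title='bbox 0 0 800 600'>
  <span class='ocr_line' title='bbox 36 92 210 116'>
    <span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Invoice</span>
    <span class='ocrx_word' title='bbox 110 92 210 116; x_wconf 91'>2024-001</span>
  </span>
</div>
"""
parser = HocrWordParser()
parser.feed(sample)
page_text = " ".join(text for _, text in parser.words)
print(parser.words)
print(page_text)  # Invoice 2024-001
```

Each `(bbox, text)` pair could be indexed as one record in a search database, with the box used to highlight hits on the page image.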
transformToMetadata
A JSON file is provided containing the original metadata of the file.
For example, it contains the format and resolution of images in PDF files.
Searchable PDF
When you upload a PDF, or when a PDF is created from images, all metadata is removed as the new searchable PDF is created. The new PDF is linearized, which means it is optimized for viewing in mobile and desktop apps: the viewer can download the pages incrementally.
How to upload a file
To upload a file, use a multipart form post, as the following simple cURL example shows:
cURL
Response
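For readers who prefer Python, a multipart form post can be assembled with the standard library alone. This is only a sketch: the endpoint URL, the form field names (`file`, `transformation`), and the file contents are placeholders invented for illustration, and any authentication the API needs is omitted. The request is built but not sent, so the snippet runs offline.

```python
import uuid
import urllib.request
from typing import Dict, Tuple


def build_multipart(fields: Dict[str, str], file_field: str, filename: str,
                    content: bytes, content_type: str) -> Tuple[bytes, str]:
    """Encode text fields and one file as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",
        content,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


# Every value below is a placeholder, not the real API.
body, header = build_multipart(
    fields={"transformation": "transformToSearchablePDF"},  # hypothetical field name
    file_field="file",                                      # hypothetical field name
    filename="scan.png",
    content=b"\x89PNG placeholder bytes",
    content_type="image/png",
)
request = urllib.request.Request(
    "https://api.example.com/documents",  # hypothetical endpoint
    data=body,
    headers={"Content-Type": header},
    method="POST",
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```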