Document Auto Capture
Document Auto Capture is a component that captures a photo of an identity document in the required quality without the user having to press the shutter button.
In order to successfully detect and classify an identity document from a photo and to achieve high accuracy in optical character recognition (OCR), it is important to capture an image with the highest quality possible. The Document Auto Capture functionality is part of the DOT mobile libraries and web components.
The Document Auto Capture supports taking pictures of ID document cards and single passport pages that have visible corners, have the correct size ratio, and are of light color tones.
Document Auto Capture Mobile UI Component
The Document Auto Capture UI component is provided for easy integration into apps. It shows a camera preview with a rectangular placeholder and text instructions in the middle of the screen. The component continuously looks for an identity document in the preview frames and analyzes the image parameters. Text instructions guide the user until the position and quality of the image are sufficient. On success, the component returns a high-quality document image suitable for further data extraction.
Document Auto Capture Web Component
The Document Auto Capture Web component is provided for integration into a web frontend. It provides the same functionality as the mobile UI component.
Video Stream Preview Scale Type
There are two available ways to scale the camera preview with mobile libraries:
- Fit Center - Scale the preview, maintaining the source aspect ratio, so that it is entirely contained within the UI component, and center it inside the view. This may show black rectangles on the sides of the auto capture component area. This is the recommended scale type for Optical Character Recognition, because it produces the maximum possible output image size.
- Fill Center - Scale the preview, maintaining the source aspect ratio, so that it fills the entire UI component, and center it in the view. Keep in mind that the output image might be smaller than with the Fit Center type, and therefore might not be optimal for Optical Character Recognition.
The Web component has to be wrapped inside the parent node with defined height and width. The media stream will be resized to fit inside that node, maintaining the source aspect ratio.
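The difference between the two scale types comes down to which of the two axis ratios is used as the scale factor. The following Python sketch illustrates this under stated assumptions: the `Size` type, the function names, and the `"fit_center"` / `"fill_center"` strings are hypothetical and are not part of the DOT libraries, and the second function assumes that only the visible part of the preview contributes to the output image.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Size:
    width: float
    height: float


def scale_preview(source: Size, view: Size, scale_type: str) -> Tuple[float, Size]:
    """Return (scale factor, scaled preview size), keeping the source aspect ratio."""
    ratio_w = view.width / source.width
    ratio_h = view.height / source.height
    if scale_type == "fit_center":
        # The whole preview stays visible; unused view area shows as bars.
        scale = min(ratio_w, ratio_h)
    elif scale_type == "fill_center":
        # The view is fully covered; the overflowing part of the preview is cropped.
        scale = max(ratio_w, ratio_h)
    else:
        raise ValueError(f"unknown scale type: {scale_type}")
    return scale, Size(source.width * scale, source.height * scale)


def visible_source_fraction(source: Size, view: Size, scale_type: str) -> float:
    """Fraction of the source frame area that remains visible inside the view."""
    _, scaled = scale_preview(source, view, scale_type)
    visible_w = min(scaled.width, view.width)
    visible_h = min(scaled.height, view.height)
    return (visible_w * visible_h) / (scaled.width * scaled.height)
```

For a 1600x1200 source in an 800x800 view, Fit Center keeps the whole frame visible, while Fill Center shows only about three quarters of it, which is consistent with the smaller output image mentioned above.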
The component shows the following text instructions, depending on the detected condition:
| Condition | Instruction |
| --- | --- |
| Document is not detected in the image | Scan document |
| Document is too small | Move closer |
| Document does not fit the placeholder | Fit document into rectangle |
| Sharpness is too low in the placeholder area | More light needed |
| Brightness is too low in the placeholder area | More light needed |
| Brightness is too high in the placeholder area | Less light needed |
| Hotspots are present in the placeholder area | Avoid reflections |
| Image is good enough | Hold still… |
The instructions can be localized.
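As a generic illustration of instruction localization, the sketch below looks up a translated text by locale and falls back to the default English wording from the table. It is not the localization mechanism of the DOT libraries: the condition keys, the function name, and the German sample text are all hypothetical.

```python
from typing import Mapping

# Hypothetical condition keys mapped to the default English instructions.
DEFAULT_INSTRUCTIONS = {
    "document_not_detected": "Scan document",
    "document_too_small": "Move closer",
    "document_out_of_bounds": "Fit document into rectangle",
    "sharpness_too_low": "More light needed",
    "brightness_too_low": "More light needed",
    "brightness_too_high": "Less light needed",
    "hotspots_present": "Avoid reflections",
    "image_ok": "Hold still…",
}


def localized_instruction(
    key: str,
    locale: str,
    translations: Mapping[str, Mapping[str, str]],
) -> str:
    """Return the translated instruction, or the English default if none exists."""
    return translations.get(locale, {}).get(key, DEFAULT_INSTRUCTIONS[key])


# Sample application-side translations (illustrative only).
TRANSLATIONS = {"de": {"document_too_small": "Näher herangehen"}}
```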
Controlling the process
You can also build your own UI on top of the Document Auto Capture process. The Document Auto Capture Controller non-UI component is designed for this purpose. This component also controls the process in the Document Auto Capture UI component.
The Document Auto Capture Controller continuously accepts image frames (from the camera preview), processes them, and returns the result for each frame through a callback. The component is configured with an ordered list of validators. A frame must pass all of them to be considered valid. If the quality is not sufficient, the component returns a Hint according to the first validator that fails. The UI handles the result (e.g. shows a text instruction).
The auto capture process runs as follows:
- The component continuously accepts image frames until there is a defined count of valid frames in a row.
- The component enters the Stay still phase, which means that the arrangement is good enough and the user should be instructed not to move. This phase lasts for a defined amount of time.
- The component selects the best image from all valid images, and returns it as the result of the auto capture process.
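The process described above can be sketched as a small state machine. This is a minimal Python illustration, not the DOT controller API: all class and field names are hypothetical. It also makes two assumptions the text does not state: "best image" is taken to mean the sharpest valid frame, and invalid frames arriving during the Stay still phase are skipped rather than restarting the process.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

HOLD_STILL = "Hold still…"


@dataclass
class Frame:
    """Hypothetical per-frame measurements."""
    timestamp: float      # seconds
    document_found: bool
    sharpness: float      # 0..1, higher is sharper
    brightness: float     # 0..1


@dataclass
class Validator:
    hint: str
    passes: Callable[[Frame], bool]


@dataclass
class Result:
    hint: Optional[str]
    captured: Optional[Frame] = None


@dataclass
class AutoCaptureController:
    validators: List[Validator]       # ordered: the first failure decides the hint
    min_valid_frames: int = 3         # must be at least 1
    stay_still_seconds: float = 1.0
    _candidates: List[Frame] = field(default_factory=list)
    _streak: int = 0
    _still_since: Optional[float] = None

    def process(self, frame: Frame) -> Result:
        failed = next((v for v in self.validators if not v.passes(frame)), None)
        if failed is None:
            self._candidates.append(frame)
            self._streak += 1
        elif self._still_since is None:
            # Valid frames must come in a row, so a failure restarts the count.
            self._streak = 0
            self._candidates.clear()

        if self._still_since is None and self._streak >= self.min_valid_frames:
            self._still_since = frame.timestamp  # enter the Stay still phase

        if self._still_since is not None:
            if frame.timestamp - self._still_since >= self.stay_still_seconds:
                # Assumption: the sharpest valid frame is the best one.
                best = max(self._candidates, key=lambda f: f.sharpness)
                return Result(hint=None, captured=best)
            return Result(hint=HOLD_STILL)
        return Result(hint=failed.hint if failed else None)


EXAMPLE_VALIDATORS = [
    Validator("Scan document", lambda f: f.document_found),
    Validator("More light needed", lambda f: f.brightness >= 0.3),
]
```

A caller feeds every preview frame to `process` and shows the returned hint until `captured` is set.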
Customize the frame validation
The DOT Mobile Kit libraries contain a predefined list of validators, which is used in the default configuration. You can define your own list of validators, or you can implement a custom validator that evaluates the available image data, such as sharpness, brightness, or document corner coordinates.
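A custom validator can be pictured as an object that inspects the frame data and returns a hint when the frame fails. The Python sketch below is a hypothetical illustration: `FrameData`, `FrameValidator`, `MinDocumentWidthValidator`, and the 0.5 width ratio are assumptions made for the example, and the real validator interface of the DOT libraries may differ.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Optional, Protocol, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class FrameData:
    """Hypothetical data available to a validator."""
    image_width: int
    image_height: int
    corners: List[Point]  # top-left, top-right, bottom-right, bottom-left
    sharpness: float
    brightness: float


class FrameValidator(Protocol):
    def validate(self, frame: FrameData) -> Optional[str]:
        """Return None if the frame passes, otherwise a hint for the user."""


@dataclass
class MinDocumentWidthValidator:
    """Fails when the document is too narrow relative to the image."""
    min_width_ratio: float = 0.5
    hint: str = "Move closer"

    def validate(self, frame: FrameData) -> Optional[str]:
        top_left, top_right, bottom_right, bottom_left = frame.corners
        # Average the top and bottom edge lengths to tolerate perspective skew.
        width = (dist(top_left, top_right) + dist(bottom_left, bottom_right)) / 2
        if width / frame.image_width < self.min_width_ratio:
            return self.hint
        return None


def first_hint(validators: Sequence[FrameValidator], frame: FrameData) -> Optional[str]:
    """Run validators in order and return the hint of the first one that fails."""
    for validator in validators:
        hint = validator.validate(frame)
        if hint is not None:
            return hint
    return None
```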
If you prefer to build your own Document Auto Capture solution, you can use DOT's document detection and image analysis technology directly.
Document image quality
Image quality is important for OCR. However, being too demanding on image quality during capture can also hurt the user experience. Therefore, the application should only require input quality that is sufficient for the specific use case. For example, login should be quick and require only basic adjustments, as opposed to passport-quality image capture, where correct lighting and background uniformity are required. Quality can be controlled by various quality attributes, but to simplify integration, pre-configured quality providers are available to cover the most common use cases.
Tutorial for user
Some effort from the user is needed to achieve the best picture quality. The capture component provides feedback to the user, but there are user mistakes that the capture component cannot detect. It is recommended to show a walkthrough with instructions or tips on how to avoid these mistakes. The following tips are provided in a GitHub repository as vector animations.
- Clean the camera’s lens by wiping it
- Place the document on a table that contrasts with it
- Make sure there are good light conditions in the room
- To prevent reflections on the document, turn it facing away from the light
Detect document in an image
Document Detector is a stateless non-UI component for identity document detection in an image.
Input images should meet the following requirements:
- The document card edges must be clearly visible and be placed at least 10px inside the image area.
- Images should not contain other objects or backgrounds with visible edges.
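The 10px margin requirement can be checked once the corner coordinates are known. The following Python sketch is a hypothetical helper, not part of the Document Detector API. It assumes corners are given as (x, y) pixel coordinates with the origin in the top-left corner of the image.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def corners_inside_margin(
    corners: Iterable[Point],
    image_width: int,
    image_height: int,
    margin: float = 10.0,
) -> bool:
    """Return True if every corner lies at least `margin` pixels inside the image."""
    return all(
        margin <= x <= image_width - margin and margin <= y <= image_height - margin
        for x, y in corners
    )
```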
Analyze image parameters
Image Parameters Analyzer is a stateless non-UI component for calculating sharpness, brightness, and hotspot presence in an image.
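To give a feel for what such parameters measure, the sketch below computes simple, generic versions of them on a grayscale image stored as rows of 0..255 values. These formulas are assumptions chosen for illustration (mean intensity, mean absolute Laplacian, share of near-white pixels). The source does not describe how the Image Parameters Analyzer actually computes its values, and the function names and the 250 threshold are hypothetical.

```python
from typing import Sequence

GrayImage = Sequence[Sequence[int]]  # rows of pixel intensities in 0..255


def brightness(image: GrayImage) -> float:
    """Mean pixel intensity scaled to 0..1."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / (255 * len(pixels))


def sharpness(image: GrayImage) -> float:
    """Mean absolute 4-neighbour Laplacian over interior pixels, scaled to 0..1."""
    height, width = len(image), len(image[0])
    count = (height - 2) * (width - 2)
    if height < 3 or width < 3:
        return 0.0  # no interior pixels to evaluate
    total = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total += abs(
                4 * image[y][x]
                - image[y - 1][x] - image[y + 1][x]
                - image[y][x - 1] - image[y][x + 1]
            )
    return total / (count * 4 * 255)


def hotspot_fraction(image: GrayImage, threshold: int = 250) -> float:
    """Share of pixels at or above the threshold (a rough proxy for glare)."""
    pixels = [p for row in image for p in row]
    return sum(1 for p in pixels if p >= threshold) / len(pixels)
```

A flat gray image yields zero sharpness and no hotspots, while a high-contrast pattern pushes the sharpness value toward 1.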