Machine Readable Zone data - MRZ
DOT extracts data directly from the MRZ of an ID document image.
Most travel identity documents nowadays are machine-readable, which means that the most relevant data is encoded in a format suitable for optical character recognition (OCR). The part of the identity document where this information is embedded is called the machine-readable zone (MRZ).
On-device MRZ data extraction
The DOT Document components for Android and iOS offer document auto capture with the MRZ reading and parsing step included. See the Android Document Auto Capture Fragment and the iOS Document Capture Placeholder View Controller for how to extract and parse data from the document during this process using the mrzReadingEnabled flag.
Parsing the MRZ data in the OCR process
During the OCR process, DOT Digital Identity Service checks whether the ID document photo contains an MRZ field. If it does, the data is automatically extracted and included in the API response. The check digits are also validated to verify the integrity of the MRZ field.
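The check digit validation mentioned above follows the scheme defined in ICAO Document 9303: each character is converted to a number (digits keep their value, A to Z map to 10 to 35, the filler `<` counts as 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The following is a minimal Python sketch of that scheme, not the code DOT runs internally; the function names are hypothetical and chosen for illustration only.

```python
def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value (ICAO Doc 9303 scheme)."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Compute the check digit of a field with the repeating weights 7, 3, 1."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


def field_is_valid(field: str, check_char: str) -> bool:
    """Return True when the printed check digit matches the computed one."""
    if not ("0" <= check_char <= "9"):
        return False
    return mrz_check_digit(field) == int(check_char)
```

For example, the document number `L898902C3` from the ICAO specimen passport yields the check digit 6, so `field_is_valid("L898902C3", "6")` is true, while any single misread character in the field changes the result.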
Short Extract of the MRZ specification
According to ICAO Document 9303, there are three standardized document types, which differ in document size and in the layout of the MRZ. These three types are:
- Size 1 Travel Document (TD1)
- Size 2 Travel Document (TD2)
- Size 3 Travel Document (TD3)
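Because each type has a distinct MRZ shape (TD1 is 3 lines of 30 characters, TD2 is 2 lines of 36, and TD3 is 2 lines of 44, as detailed in the sections below), the type can be inferred from already-recognized MRZ text alone. The sketch below illustrates this idea in Python; `detect_mrz_format` and `MRZ_SHAPES` are hypothetical names, not part of any DOT API.

```python
from typing import Optional

# (number of lines, characters per line) for each ICAO Doc 9303 format
MRZ_SHAPES = {
    (3, 30): "TD1",
    (2, 36): "TD2",
    (2, 44): "TD3",
}


def detect_mrz_format(mrz_text: str) -> Optional[str]:
    """Return "TD1", "TD2" or "TD3" based on the MRZ shape, or None if unknown."""
    # Ignore blank lines and surrounding whitespace left over from OCR output.
    lines = [line.strip() for line in mrz_text.splitlines() if line.strip()]
    if not lines:
        return None
    lengths = {len(line) for line in lines}
    if len(lengths) != 1:
        return None  # lines of unequal length cannot form a valid MRZ
    return MRZ_SHAPES.get((len(lines), lengths.pop()))
```

A result of `None` only means the text does not have one of the three expected shapes; it says nothing about whether the characters themselves are valid, which is what the check digits are for.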
Size 1 Travel Document (TD1)
TD1 is mostly used in identity cards. As space is limited, the MRZ is moved to the back, which results in the need to capture both the front and back of the document in order to both assess its validity and extract the required information. Furthermore, each issuing country can add additional content to the document, which is usually on the back of the document above the MRZ.
TD1 document example:
As seen in the picture above, the MRZ of TD1 spans 3 lines, and each line is 30 characters long. Optional content goes in the area above the MRZ. Furthermore, there are check digits added to the MRZ so that the data in the MRZ can be verified.
The following image explains the different fields that are present in the MRZ:
Size 2 Travel Document (TD2)
Although TD2 is also used in several identity cards, it is being replaced by TD1 because TD1 has a more manageable size.
One of the main benefits of TD2 is that it has the MRZ on the front side – making the back side less important. This means that only the front side of the document needs to be scanned to extract the required information.
TD2 document example:
As illustrated above, the MRZ of TD2 spans 2 lines, and each line is 36 characters long.
The following image explains the different fields that are present in the MRZ:
Size 3 Travel Document (TD3)
TD3 is used for most passports worldwide. Although the document is a booklet, all of the information is on a single data page. Therefore, only that page needs to be scanned, which makes the process easier for passport control officers as well as for data extraction software such as Innovatrics DOT Core Server.
TD3 document example:
As illustrated above, the MRZ of TD3 spans 2 lines, and each line is 44 characters long. There are also several check digits added to the MRZ so that the data in the MRZ can be verified.
The following image explains the different fields that are present in the MRZ:
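The TD3 field positions can also be expressed in code. The following Python sketch splits the second TD3 line into its fields at the positions defined in ICAO Document 9303 and verifies the four field check digits plus the composite check digit. It is an illustration under those assumptions, not the DOT implementation; `parse_td3_line2` and the dictionary keys are hypothetical names.

```python
def check_digit(field: str) -> int:
    """ICAO Doc 9303 check digit: weights 7, 3, 1; A-Z map to 10-35; '<' is 0."""
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10
        elif ch == "<":
            value = 0
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * (7, 3, 1)[i % 3]
    return total % 10


def matches(field: str, check_char: str) -> bool:
    """Compare a printed check digit with the one computed from the field."""
    return "0" <= check_char <= "9" and check_digit(field) == int(check_char)


def parse_td3_line2(line: str) -> dict:
    """Split the second line of a TD3 MRZ into fields and verify check digits."""
    if len(line) != 44:
        raise ValueError("a TD3 MRZ line must be 44 characters long")
    fields = {
        "document_number": line[0:9],
        "nationality": line[10:13],
        "date_of_birth": line[13:19],    # YYMMDD
        "sex": line[20],
        "date_of_expiry": line[21:27],   # YYMMDD
        "personal_number": line[28:42],  # optional data, padded with '<'
    }
    personal = fields["personal_number"]
    checks = {
        "document_number": matches(fields["document_number"], line[9]),
        "date_of_birth": matches(fields["date_of_birth"], line[19]),
        "date_of_expiry": matches(fields["date_of_expiry"], line[27]),
        # An unused personal number may carry '<' instead of a check digit.
        "personal_number": matches(personal, line[42])
        or (line[42] == "<" and set(personal) == {"<"}),
        # The composite digit covers positions 1-10, 14-20 and 22-43.
        "composite": matches(line[0:10] + line[13:20] + line[21:43], line[43]),
    }
    return {"fields": fields, "checks": checks}
```

Applied to the second line of the ICAO specimen passport, `L898902C36UTO7408122F1204159ZE184226B<<<<<10`, every entry in `checks` is true. Keeping the individual results separate, rather than returning a single boolean, makes it possible to report which field was misread.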