Accuracy
Face Identification
Face Identification accuracy describes how reliably the system can find the correct identity within a gallery of enrolled faces.
Two key metrics define identification performance:
- FPIR (False Positive Identification Rate) – the probability that a search for a person who is not enrolled in the gallery incorrectly returns a match.
- FNIR (False Negative Identification Rate) – the probability that a search for an enrolled person fails to return the correct identity.
These are opposing forces:
- Increasing the identification (matching) threshold reduces FPIR (fewer false matches), but increases FNIR (more missed matches).
- Decreasing the threshold reduces FNIR, but increases FPIR.
Choosing the right threshold depends on your use case:
| Scenario | Typical Priority |
|---|---|
| Border control / secure access | Low FPIR |
| Watchlist / surveillance | Low FNIR |
Two ways of expressing performance are often used:
- FPIR@FNIR = X% – FPIR measured when FNIR is fixed at X%.
- FNIR@FPIR = X% – FNIR measured when FPIR is fixed at X%.
A high threshold is not always better – if it is too strict, even genuine users may fail to match (high FNIR).
Conversely, a low threshold may yield too many false matches (high FPIR).
The goal is to find a balanced operating point for your gallery size and use case.
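To make the trade-off concrete, here is a minimal sketch (plain NumPy, not SmartFace code) of how FPIR and FNIR follow from the matching threshold; the score arrays are assumed to come from your own evaluation run.

```python
# A minimal sketch (plain NumPy, not SmartFace code) of how FPIR and FNIR
# follow from the matching threshold. `nonmated_scores` are the top match
# scores of searches for people who are NOT enrolled; `mated_scores` are
# the correct-identity scores of searches for people who ARE enrolled.
import numpy as np

def fpir(nonmated_scores: np.ndarray, threshold: float) -> float:
    """Fraction of non-mated searches that wrongly return a match."""
    return float(np.mean(nonmated_scores >= threshold))

def fnir(mated_scores: np.ndarray, threshold: float) -> float:
    """Fraction of mated searches that miss the correct identity."""
    return float(np.mean(mated_scores < threshold))

def threshold_for_fpir(nonmated_scores: np.ndarray, target_fpir: float) -> float:
    """Approximate lowest threshold whose FPIR meets the target (e.g. 1/1000)."""
    # The (1 - target) quantile of the non-mated scores is the cut-off.
    return float(np.quantile(nonmated_scores, 1.0 - target_fpir))
```

Raising the threshold lowers FPIR and raises FNIR, and vice versa – exactly the trade-off described above.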
Template Extraction Algorithms
SmartFace provides two algorithms for extracting biometric templates:
| Algorithm | Description | Performance | Recommended Use |
|---|---|---|---|
| Balanced | Default extractor providing a good trade-off between speed and accuracy. | Fast | Video surveillance, small to mid-size galleries |
| Accurate | Enhanced extractor optimized for maximum precision. | Average | Access control systems, large galleries, high-security applications |
The Accurate extractor yields tighter score distributions (better separation of genuine and impostor scores), improving FNIR at low FPIR levels.
However, it is computationally heavier and therefore not the default.
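The separation claim can be checked on your own evaluation data. Below is a minimal sketch (the score arrays are assumed inputs, not a SmartFace API) computing the decidability index d′, where a higher value indicates tighter, better-separated genuine and impostor distributions.

```python
# A minimal sketch (assumed score arrays, not a SmartFace API) that
# quantifies genuine/impostor separation with the decidability index d'.
# A higher d' means tighter, better-separated score distributions, which
# is what improves FNIR at low FPIR levels.
import numpy as np

def d_prime(genuine: np.ndarray, impostor: np.ndarray) -> float:
    """Decidability index: distance between score means in pooled std units."""
    mu_g, mu_i = genuine.mean(), impostor.mean()
    var_g, var_i = genuine.var(), impostor.var()
    return float(abs(mu_g - mu_i) / np.sqrt((var_g + var_i) / 2.0))

# Run the same evaluation pairs through both extractors and compare, e.g.:
# d_prime(genuine_accurate, impostor_accurate) vs.
# d_prime(genuine_balanced, impostor_balanced)
```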
Importance of Dataset and Quality
Identification accuracy depends strongly on:
- The dataset (lighting, pose, demographics, sensor type)
- Enrollment image quality (sharp, frontal, ICAO-compliant images)
- Probe image quality (pose, illumination, blur, occlusions)
Our results are based on internal evaluation datasets, but you must always validate performance on your own data.
Real-world conditions vary greatly and can influence optimal thresholds.
The higher the threshold you configure, the higher the image quality you must ensure – both for enrollment and identification.
Poor-quality or non-frontal captures will lead to degraded results even at the same threshold.
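As one concrete example of a quality gate, the sketch below rejects blurry captures using the variance-of-Laplacian sharpness proxy; the OpenCV approach and the cut-off value are illustrative assumptions, not SmartFace settings.

```python
# A minimal sketch of one quality gate mentioned above: rejecting blurry
# enrollment images. Uses the variance of the Laplacian as a sharpness
# proxy; the cut-off (100.0) is an illustrative assumption that you should
# calibrate on your own imagery, not a SmartFace setting.
import cv2

def is_sharp_enough(image_path: str, min_laplacian_var: float = 100.0) -> bool:
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)
    return cv2.Laplacian(gray, cv2.CV_64F).var() >= min_laplacian_var
```

In practice you would apply such gates before enrollment and tune them together with the identification threshold.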
Measured Results
The following tables present real measured results for identification accuracy across different gallery sizes.
Each scenario corresponds to a typical deployment scale and complexity.
Gallery ≈5.6 k images
| FPIR Level | FNIR Balanced (%) | Threshold Balanced | FNIR Accurate (%) | Threshold Accurate |
|---|---|---|---|---|
| 1:2 | 48.138 | 20.010 | 35.240 | 20.652 |
| 1:5 | 56.276 | 23.832 | 41.843 | 24.554 |
| 1:10 | 63.724 | 26.442 | 47.908 | 26.840 |
| 1:20 | 70.633 | 29.102 | 54.741 | 29.229 |
| 1:50 | 79.616 | 33.345 | 65.950 | 33.213 |
| 1:100 | 84.031 | 35.770 | 76.161 | 37.260 |
| 1:200 | 91.286 | 41.370 | 87.946 | 43.927 |
| 1:500 | 98.004 | 55.300 | 99.347 | 66.100 |
| 1:1000 | 99.309 | 65.056 | 99.693 | 68.627 |
| 1:2000 | 99.655 | 69.705 | 99.770 | 70.481 |
Gallery ≈10 k images
| FPIR Level | FNIR Balanced (%) | Threshold Balanced | FNIR Accurate (%) | Threshold Accurate |
|---|---|---|---|---|
| 1:2 | 0.000 | 29.318 | 0.000 | 28.568 |
| 1:5 | 0.000 | 33.239 | 0.000 | 32.620 |
| 1:10 | 0.000 | 35.937 | 0.000 | 35.294 |
| 1:20 | 0.000 | 38.682 | 0.000 | 37.212 |
| 1:50 | 0.000 | 41.516 | 0.000 | 40.306 |
| 1:100 | 0.000 | 43.988 | 0.000 | 42.259 |
| 1:200 | 0.000 | 45.749 | 0.000 | 44.394 |
| 1:500 | 0.000 | 49.185 | 0.000 | 45.466 |
| 1:1000 | 0.000 | 49.418 | 0.000 | 46.438 |
Gallery ≈100 k images
| FPIR Level | FNIR Balanced (%) | Threshold Balanced | FNIR Accurate (%) | Threshold Accurate |
|---|---|---|---|---|
| 1:2 | 0.047 | 35.338 | 0.023 | 34.375 |
| 1:5 | 0.090 | 39.921 | 0.027 | 38.223 |
| 1:10 | 0.123 | 42.920 | 0.033 | 40.982 |
| 1:20 | 0.197 | 45.568 | 0.033 | 43.460 |
| 1:50 | 0.317 | 49.471 | 0.047 | 46.495 |
| 1:100 | 0.420 | 52.439 | 0.067 | 49.680 |
| 1:200 | 0.577 | 55.270 | 0.097 | 52.670 |
| 1:500 | 0.823 | 58.942 | 0.150 | 56.472 |
| 1:1000 | 1.107 | 62.328 | 0.277 | 61.368 |
| 1:2000 | 1.510 | 65.709 | 0.437 | 64.943 |
| 1:5000 | 4.057 | 72.195 | 3.453 | 70.649 |
| 1:10000 | 8.180 | 76.240 | 10.007 | 74.262 |
| 1:20000 | 12.317 | 78.844 | 13.927 | 76.611 |
Gallery ≈1.7 M images
| FPIR Level | FNIR Balanced (%) | Threshold Balanced | FNIR Accurate (%) | Threshold Accurate |
|---|---|---|---|---|
| 1:2 | 0.474 | 59.814 | 0.190 | 57.635 |
| 1:5 | 0.669 | 66.713 | 0.244 | 64.896 |
| 1:10 | 0.949 | 69.133 | 0.298 | 66.694 |
| 1:20 | 1.297 | 70.629 | 0.370 | 67.575 |
| 1:50 | 1.784 | 72.350 | 0.614 | 68.608 |
| 1:100 | 2.367 | 73.637 | 0.836 | 69.283 |
| 1:200 | 3.095 | 75.010 | 1.179 | 69.930 |
| 1:500 | 4.924 | 77.072 | 1.956 | 71.111 |
| 1:1000 | 8.543 | 79.657 | 3.899 | 72.491 |
| 1:2000 | 20.071 | 83.716 | 22.935 | 81.938 |
| 1:5000 | 83.674 | 95.280 | 63.914 | 90.745 |
| 1:10000 | 99.968 | 99.278 | 99.937 | 98.386 |
| 1:20000 | 99.968 | 99.700 | 99.968 | 99.412 |
Choosing the Right Threshold
Identification accuracy is a trade-off between FPIR, FNIR, and processing performance. To choose an operating point:
1. Determine the acceptable FPIR for your system.
   - For 1 false match in 1,000 searches → FPIR = 1:1000.
   - For 1 false match in 10,000 searches → FPIR = 1:10000.
2. Select the corresponding threshold from the tables above for your gallery size and algorithm (a lookup sketch follows these steps).
3. Validate on your own data – real operational conditions will influence the final FNIR.
4. Maintain image quality.
   - High thresholds demand high-quality, ICAO-compliant, frontal, well-lit images.
   - Poor images may require lowering the threshold or improving the camera setup.
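A minimal lookup sketch for step 2, encoding a few measured rows of the ≈100 k gallery table above; extend the mapping with the rows that match your own gallery size and target FPIR.

```python
# A minimal sketch of the threshold-selection step, using a few measured
# rows of the ~100 k gallery table above. Keys are target FPIR levels;
# values are the measured thresholds per extractor.
THRESHOLDS_100K = {
    # FPIR level: (Balanced threshold, Accurate threshold)
    "1:1000":  (62.328, 61.368),
    "1:2000":  (65.709, 64.943),
    "1:10000": (76.240, 74.262),
}

def pick_threshold(target_fpir: str, algorithm: str = "Accurate") -> float:
    balanced, accurate = THRESHOLDS_100K[target_fpir]
    return balanced if algorithm == "Balanced" else accurate

print(pick_threshold("1:10000", "Balanced"))  # 76.24 – a starting point to validate
```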
Example
For a gallery of 100,000 identities targeting 1 false match per 10,000 searches (FPIR = 1:10000), you can start with thresholds around ~76 for Balanced or ~74 for Accurate, then fine-tune using your own dataset.
Face Liveness
Passive Liveness
The final decision of whether a face is real or a spoof should be determined by the passive liveness score and threshold. If the score is above the threshold, the face is accepted as real; if the score is below the threshold, it is rejected as a spoof.
Setting the correct threshold depends on the security/convenience balance required for the specific use case.
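A minimal sketch of this decision rule (the score is assumed to come from your SmartFace integration; the threshold is one of the measured operating points from the tables below):

```python
# A minimal sketch of the decision rule described above; the score is an
# assumed input from your integration, the threshold comes from the tables.
def is_live(passive_liveness_score: float, threshold: float) -> bool:
    """Accept as a real face when the score is above the threshold."""
    return passive_liveness_score > threshold
```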
The results below are for the currently used IFace version 5.14.
Thresholds for distant passive liveness:
| False Accept Rate (Level) | False Reject Rate (%) | Threshold |
|---|---|---|
| 1:5 | 0.055 | 67.017 |
| 1:10 | 0.068 | 69.288 |
| 1:50 | 0.806 | 78.488 |
| 1:100 | 2.432 | 83.847 |
| 1:500 | 10.700 | 90.170 |
| 1:1000 | 18.092 | 92.826 |
| 1:10000 | 51.722 | 97.274 |
Thresholds for nearby passive liveness:
| False Accept Rate (Level) | False Reject Rate (%) | Threshold |
|---|---|---|
| 1:5 | 0.305 | 74.634 |
| 1:10 | 1.536 | 81.771 |
| 1:50 | 8.800 | 89.434 |
| 1:100 | 13.523 | 91.354 |
| 1:500 | 26.425 | 94.282 |
| 1:1000 | 33.646 | 95.557 |
| 1:5000 | 54.978 | 97.117 |
| 1:10000 | 59.248 | 97.442 |
Example
Let’s set the threshold for distant passive liveness to 83.847. On a representative set of 10,000 real faces, about 243 will on average be incorrectly marked as spoofs even though they are real (False Reject Rate of 2.432%). On a set of 10,000 spoofs, about 100 will on average be wrongly accepted as real faces (False Accept Rate of 1:100).
To make the liveness check more strict, choose a higher threshold: a spoof is then less likely to be accepted as a real face, while a real face is more likely to be rejected as a spoof. If you set the threshold to 92.826 instead, about 10 out of 10,000 spoofs will be incorrectly accepted (FAR of 1:1000) and about 1,809 out of 10,000 real faces will be rejected as spoofs (FRR of 18.092%).
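The arithmetic of this example can be reproduced directly from the table: the FRR comes from the measured percentage and the FAR from the level. A minimal sketch:

```python
# A minimal sketch reproducing the arithmetic of the example above for the
# distant passive liveness operating points (FRR percentage from the table,
# FAR from the level).
def expected_errors(n: int, frr_percent: float, far_level: int) -> tuple[int, int]:
    """Expected (false rejects out of n real faces, false accepts out of n spoofs)."""
    return round(n * frr_percent / 100.0), round(n / far_level)

print(expected_errors(10_000, 2.432, 100))    # (243, 100)  at threshold ~83.847
print(expected_errors(10_000, 18.092, 1000))  # (1809, 10)  at threshold ~92.826
```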