Version: 3.16.1 (latest)

Face Capturing

Face detection consists of several stages:

  1. Face detection in the image. This stage results in a rectangle (a frame) around the detected face. Several types of detectors are described below.
  2. Fitting (positioning) of anthropometric points. This stage results in a set of anthropometric points with 2D/3D coordinates linked to a specific detected face. Several types of fitters that use different sets of anthropometric points are described below.
  3. Calculating head rotation angles relative to the observation axis. The result of this stage is three head rotation angles: pitch, yaw, roll. The accuracy of these angles depends on the set of anthropometric points used.

The Capturer class is used for face detection. A configuration file must be specified when you create a class object. The configuration file contains the detector type to be used and the type of set of anthropometric points to be used (see Configuration files). You can also configure various detection parameters in the configuration file that affect the quality and speed of the entire algorithm. The detector and the set of anthropometric points are specified in the name of the configuration file. For example: common_capturer_blf_fda_front.xml - the blf detector, the fda set of points.
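
The naming convention above can be illustrated with a small standalone helper. This is a minimal sketch, not part of the SDK: the CapturerConfigName struct and the parseCapturerConfigName function are hypothetical, and the token lists are simply the detector and point-set names mentioned on this page. Real file names may not follow the convention strictly, so treat the result as a hint only.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of the SDK: guesses the detector and the
// point set from a configuration file name by looking for known tokens.
struct CapturerConfigName {
    std::string detector;   // "lbf", "blf", "refa", "uld", or empty if not found
    std::string point_set;  // "esr", "singlelbf", "doublelbf", "fda", "mesh", or empty
};

inline CapturerConfigName parseCapturerConfigName(const std::string& file_name) {
    // Token lists are assumptions based on the names used in this document.
    static const std::vector<std::string> detectors = {"lbf", "blf", "refa", "uld"};
    static const std::vector<std::string> point_sets = {"esr", "singlelbf", "doublelbf", "fda", "mesh"};

    // Drop the extension (if any), then split the stem on '_'.
    const std::string stem = file_name.substr(0, file_name.rfind('.'));
    std::istringstream tokens(stem);
    std::string token;
    CapturerConfigName result;
    while (std::getline(tokens, token, '_')) {
        for (const std::string& d : detectors) {
            if (token == d && result.detector.empty()) result.detector = d;
        }
        for (const std::string& p : point_sets) {
            if (token == p && result.point_set.empty()) result.point_set = p;
        }
    }
    return result;
}
```

For common_capturer_blf_fda_front.xml this yields the blf detector and the fda point set. For a name with no detector token, such as common_video_capturer_fda.xml, the detector field stays empty.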

Face detection can also be combined with face tracking in a video stream. In this case the algorithm assumes that consecutive frames of a video are provided as input, so faces are tracked from frame to frame. In addition to the stages above, a face tracking stage is added and a unique identifier is assigned to each face. This id stays the same from the moment the face is detected in the video stream until the face is lost. Such configurations contain the word video in the file name, for example, common_video_capturer_fda.xml. Currently, two trackers are available:

  • common_video_capturer - Provides high speed, but the quality is lower compared to fda_tracker_capturer.
  • fda_tracker_capturer - Provides high quality, but the speed is lower compared to common_video_capturer.


Currently, the following detectors are available:

  • LBF – An outdated detector, not recommended for use.
  • BLF – A detector that provides higher quality and faster detection than LBF for faces of a medium size and larger (including masked faces). On Android you can use GPU acceleration (enabled by default).
  • REFA – A detector that is slower than the LBF and BLF detectors but provides better detection quality for faces of various sizes (including masked faces). Recommended for use in expert systems.
  • ULD – A new detector that is faster than REFA. This detector lets you detect faces of various sizes (including masked faces).

For BLF, REFA, ULD detectors you can get the detection confidence level. To do this, call the RawSample.getScore() method. As a result, you'll receive a float number in the range of [0, 1].
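
A typical use of the confidence level is to drop weak detections before further processing. The sketch below is standalone and does not call the SDK: ScoredFace is a hypothetical stand-in for a RawSample together with the value returned by RawSample.getScore(), and keepConfident is an illustrative name.

```cpp
#include <vector>

// Hypothetical stand-in for a detected face and its confidence in [0, 1].
struct ScoredFace {
    int id;
    float score;
};

// Keeps only the faces whose score is at least the given threshold.
inline std::vector<ScoredFace> keepConfident(const std::vector<ScoredFace>& faces,
                                             float threshold) {
    std::vector<ScoredFace> kept;
    for (const ScoredFace& face : faces) {
        if (face.score >= threshold) kept.push_back(face);
    }
    return kept;
}
```

A higher threshold reduces false detections at the cost of missing some faces, so the value is usually tuned per detector.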

For LBF, REFA, ULD detectors you can set the minimum size of detected faces using the min_size parameter (see the Detailed Information about Capturer Configuration Parameters section). Decreasing the value of this parameter increases the detection time.

See below the examples of the operation of different detectors in different conditions.

Table (images omitted). Columns: BLF (score_threshold=0.6); REFA (min_size=0.2, score_threshold=0.89); ULD (min_size=10, score_threshold=0.7)

See below the examples of the operation of different detectors at different thresholds.

Table (images omitted). Columns: ULD (score_threshold=0.4); ULD (score_threshold=0.7); REFA (score_threshold=0.89)

Once a Capturer object is created, you can use it to detect or track faces. There are two ways to pass an image to it:

  • Pass the data of the decoded image to the method Capturer.capture(RawImage image), using the RawImage class (see Samples).
  • Pass the data of the encoded image in JPG, PNG, TIF or BMP format to the method Capturer.capture(byte[] data).

In both cases, the result is the vector of detected / tracked faces (RawSample is the object that stores the captured face).

For a tracker you can also call the Capturer.resetHistory method to start tracking on a new video sequence.

Anthropometric Points

Note: You can learn how to display anthropometric points and head rotation angles in our tutorial.

Five sets of anthropometric points are available: esr, singlelbf, doublelbf, fda, mesh.

  • The esr set is our first set that was the only set available in previous SDK versions. The esr set contains 47 points.
  • The singlelbf and doublelbf sets provide higher accuracy than esr. The singlelbf set contains 31 points. The doublelbf set contains 101 points. The doublelbf set actually consists of two concatenated sets: the last 31 points of doublelbf duplicate the singlelbf set (in the same order).
  • The fda set provides high accuracy in a wide range of facial angles (up to the full profile), in contrast to the previous sets, so we recommend using detectors with this set. However, recognition algorithms still require face samples to be close to frontal. The fda set contains 21 points.
  • At the moment, the mesh set is the newest. It contains 470 3D points of a face. Use this set to get a 3D face mesh.
fda set of points. RawSample.getLeftEye returns point 7. RawSample.getRightEye returns point 10.
esr set of points. RawSample.getLeftEye returns point 16. RawSample.getRightEye returns point 17.
singlelbf set of points. RawSample.getLeftEye returns point 29. RawSample.getRightEye returns point 30.
First 70 points of the doublelbf set of points (the remaining 31 points are taken from singlelbf). RawSample.getLeftEye returns point 68. RawSample.getRightEye returns point 69.
mesh set of points. RawSample.getLeftEye returns point 468. RawSample.getRightEye returns point 469.
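
Because the last 31 points of doublelbf duplicate singlelbf in the same order, the singlelbf subset can be sliced out of a doublelbf landmark vector. The following is a minimal standalone sketch: Point2D is a stand-in for the SDK point type, and singlelbfTail is a hypothetical helper name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for the SDK point type (assumption for this sketch).
struct Point2D {
    float x;
    float y;
};

const std::size_t kDoublelbfSize = 101;  // total points in doublelbf
const std::size_t kSinglelbfSize = 31;   // trailing points that duplicate singlelbf

// Returns the last 31 points of a doublelbf vector, i.e. the singlelbf subset.
inline std::vector<Point2D> singlelbfTail(const std::vector<Point2D>& doublelbf) {
    if (doublelbf.size() != kDoublelbfSize) {
        throw std::invalid_argument("expected 101 doublelbf points");
    }
    const std::ptrdiff_t tail = static_cast<std::ptrdiff_t>(kSinglelbfSize);
    return std::vector<Point2D>(doublelbf.end() - tail, doublelbf.end());
}
```

With zero-based indexing the returned points correspond to indices 70 through 100 of the doublelbf vector.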

Extended Set of Eye Points

In addition to the standard set of anthropometric points, you can get an extended set of eye points, which includes points of pupils and eyelids. To get this set, call the RawSample.getIrisLandmarks() method. This will return a vector of 40 points for the left and right eyes in the order shown in the image below. For each eye 20 points are returned: the first 5 points refer to the pupil (its center and points on the circle), the remaining 15 points form the contour of the eyelids. A rendering example is available in demo (C++/Java/C#).

To get this set, turn on the iris_enabled parameter in the configuration file (for example, using the method for overriding parameters in the configuration file: overrideParameter). If the parameter is turned off, the vectors will be empty.
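
The layout of the returned vector (20 points per eye, 5 pupil points followed by 15 eyelid points) can be unpacked as sketched below. This is a standalone illustration, not SDK code: Point2D, EyeLandmarks, IrisLandmarks and splitIrisLandmarks are hypothetical names, and the sketch assumes that the left eye comes first in the vector, which should be checked against the reference image.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for the SDK point type (assumption for this sketch).
struct Point2D {
    float x;
    float y;
};

struct EyeLandmarks {
    std::vector<Point2D> pupil;    // 5 points: the center and points on the circle
    std::vector<Point2D> eyelids;  // 15 points forming the eyelid contour
};

struct IrisLandmarks {
    EyeLandmarks left;
    EyeLandmarks right;
};

const std::ptrdiff_t kPupilPoints = 5;
const std::ptrdiff_t kPointsPerEye = 20;

// Copies the 20 points of one eye, starting at the given offset.
inline EyeLandmarks sliceEye(const std::vector<Point2D>& pts, std::ptrdiff_t offset) {
    EyeLandmarks eye;
    eye.pupil.assign(pts.begin() + offset, pts.begin() + offset + kPupilPoints);
    eye.eyelids.assign(pts.begin() + offset + kPupilPoints,
                       pts.begin() + offset + kPointsPerEye);
    return eye;
}

// Splits the 40-point vector into per-eye pupil and eyelid points.
// An empty input (iris_enabled turned off) yields empty landmarks.
inline IrisLandmarks splitIrisLandmarks(const std::vector<Point2D>& pts) {
    IrisLandmarks result;
    if (pts.empty()) return result;
    if (pts.size() != static_cast<std::size_t>(2 * kPointsPerEye)) {
        throw std::invalid_argument("expected 40 iris landmarks");
    }
    result.left = sliceEye(pts, 0);               // assumption: left eye first
    result.right = sliceEye(pts, kPointsPerEye);  // assumption: right eye second
    return result;
}
```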

Capturer Class Reference

To capture faces, create a Capturer object using FacerecService.createCapturer and pass the path to the configuration file or the FacerecService.Config object. If you pass the path to the configuration file, the default settings will be used. By using FacerecService.Config you can override any numerical option inside the config file. Some parameters can also be changed in an existing Capturer object with the Capturer.setParameter method. See Capturer Usage Examples below.

The type and characteristics of the capturer depend on the configuration file or the FacerecService.Config object passed to the FacerecService.createCapturer member function.

Note: We recommend using VideoWorker for face tracking on video streams. When VideoWorker is created with matching_thread=0 and processing_thread=0, the standard Face Detector license is used.

Capturer Usage Examples

First Example

pbio::Capturer::Ptr capturer = service->createCapturer("common_capturer4.xml");

Second Example

pbio::FacerecService::Config capturer_config("common_capturer4.xml");
capturer_config.overrideParameter("min_size", 200);
pbio::Capturer::Ptr capturer = service->createCapturer(capturer_config);

Third Example

pbio::Capturer::Ptr capturer = service->createCapturer("common_capturer4.xml");
capturer->setParameter("min_size", 200);
capturer->setParameter("max_size", 800);
// capturer->capture(...);
// ...
capturer->setParameter("min_size", 100);
capturer->setParameter("max_size", 400);
// capturer->capture(...);

Face Cropping

A face can be cropped by one of the following methods:

  • RawSample.cutFaceImage: the cropped face is saved to the specified stream (for example, to a file); the encoding format is selected via RawSample.ImageFormat.
  • RawSample.cutFaceRawImage: the cropped face is returned in the RawImage format, which stores the decoded image pixels in the RGB/BGR/GRAY format (the format is selected via RawImage.Format).

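Example of using RawSample.cutFaceRawImage:

// Assumed arguments (truncated in the source): BGR pixels to match CV_8UC3, base cut type.
auto raw_image_crop = sample->cutFaceRawImage(pbio::RawImage::Format::FORMAT_BGR, pbio::RawSample::FACE_CUT_BASE);
// Assumed: the pixel buffer is exposed as raw_image_crop.data; the cv::Mat wraps it without copying.
cv::Mat img_crop(raw_image_crop.height, raw_image_crop.width, CV_8UC3, (void*) raw_image_crop.data);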

Available face cropping types (RawSample.FaceCutType):

  • FACE_CUT_BASE - unspecified cropping (any sample type).
  • FACE_CUT_FULL_FRONTAL - ISO/IEC 19794-5 Full Frontal (for ID, travel documents) (only frontal sample type). It is used for saving face images in electronic biometric documents.
  • FACE_CUT_TOKEN_FRONTAL - ISO/IEC 19794-5 Token Frontal (fixed eye positions) (only frontal sample type).

To preview the cropping, call the RawSample.getFaceCutRectangle method by specifying the cropping type. As a result, you'll have four points – the corners of the rectangle that will be used for cropping.
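
Since the cropping rectangle is given as four corner points and may be rotated, a common preview step is to compute its axis-aligned bounding box, for example to check that the crop fits inside the image. The sketch below is standalone: Point2D, Box and boundingBox are hypothetical names, not SDK types, and the result does not depend on the order of the corners.

```cpp
#include <algorithm>
#include <array>

// Stand-in for the SDK point type (assumption for this sketch).
struct Point2D {
    float x;
    float y;
};

// Axis-aligned box in image coordinates.
struct Box {
    float left;
    float top;
    float right;
    float bottom;
};

// Computes the smallest axis-aligned box that contains all four corners.
inline Box boundingBox(const std::array<Point2D, 4>& corners) {
    Box box = {corners[0].x, corners[0].y, corners[0].x, corners[0].y};
    for (const Point2D& p : corners) {
        box.left = std::min(box.left, p.x);
        box.top = std::min(box.top, p.y);
        box.right = std::max(box.right, p.x);
        box.bottom = std::max(box.bottom, p.y);
    }
    return box;
}
```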

See the example of usage in Samples.

RawSample Class Reference

With RawSample you can do the following:

  • get track id (RawSample.getID) – only if the sample was captured from a tracker.
  • get a face rectangle (RawSample.getRectangle), angles (RawSample.getAngles), left / right eyes (RawSample.getLeftEye / RawSample.getRightEye, see Anthropometric Points), anthropometric points (RawSample.getLandmarks, see Anthropometric Points) – only if the face is frontal (i.e., captured with a frontal detector / tracker).
  • crop the face (see Face Cropping).
  • downscale an internal face image to suitable size (RawSample.downscaleToPreferredSize).
  • serialize an object to a binary stream (RawSample.save or RawSample.saveWithoutImage). You can deserialize it later using FacerecService.loadRawSample or FacerecService.loadRawSampleWithoutImage.
  • pass it to the methods that estimate age, gender, quality and liveness (see Face Estimation).
  • provide it to Recognizer.processing to create a template (see Face Identification, test_identify).

Detailed Information about Capturer Configuration Parameters

The following parameters inside the configuration files can be changed using the FacerecService.Config.overrideParameter method:
  • max_processed_width and max_processed_height – (for trackers only) Limits the size of the image passed to the internal detector of new faces.
  • min_size and max_size – Minimum and maximum face size for detection (for trackers: the size is defined for an image already downscaled according to the restrictions max_processed_width and max_processed_height).
  • min_neighbors – An integer detector parameter. Note that large values require higher detection confidence. You can change this parameter based on the situation. For example, increase the value if a large number of false detections is observed, or decrease the value if a large number of faces is not detected. If you aren't sure, do not change this parameter.
  • min_detection_period – (for trackers only) A real number that means the minimum time (in seconds) between two runs of the internal detector. A zero value means ‘no restrictions’. Used to reduce the processor load. Large values increase the latency in detection of new faces.
  • max_detection_period – (for trackers only) An integer that means the max time (in frames) between two runs of the internal detector. A zero value means ‘no restrictions’. For example, if you process a video offline, you can set the value to 1 so as not to miss a single person.
  • max_occlusion_time_wait – (for trackers only) A real number in seconds. When the tracker detects face occlusion, it holds the face position and tries to track it on new frames during this time.
  • fda_max_bad_count_wait – An integer. When fda_tracker detects the decline in the face quality, it tries to track this face with the general purpose tracker (instead of the fda method designed and tuned for faces) during at most fda_max_bad_count_wait frames.
  • base_angle – An integer: 0, 1, 2, or 3. Set camera orientation: 0 means standard (default), 1 means +90 degrees, 2 means -90 degrees, 3 means 180 degrees.
  • fake_detections_cnt – An integer. Number of start positions to search a face using video_worker_fdatracker_fake_detector.xml.
  • fake_detections_period – An integer. Each start position will be used once in fake_detections_period frames.
  • fake_rect_center_xN, fake_rect_center_yN, fake_rect_angleN, fake_rect_sizeN – Real numbers. Parameters of start positions. N ranges from 0 to fake_detections_cnt - 1 inclusive. fake_rect_center_xN – x coordinate of the center relative to the image width. fake_rect_center_yN – y coordinate of the center relative to the image height. fake_rect_angleN – roll angle in degrees. fake_rect_sizeN – size relative to max(image width, image height).
  • downscale_rawsamples_to_preferred_size – an integer, 1 means enabled, 0 means disabled. Default value is enabled. When enabled, Capturer downscales each sample to the suitable size (see RawSample.downscaleToPreferredSize) in order to reduce memory consumption. However, it decreases the performance. It is recommended to disable downscale_rawsamples_to_preferred_size and use RawSample.downscaleToPreferredSize manually for RawSamples that you need to save or keep in RAM for a long time.
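
The fake_rect_* parameters in the list above are relative values, so it can help to see how one start position maps to pixels for a given image size. This is a minimal standalone sketch based only on the definitions in the list: FakeRectParams, PixelStartPosition and toPixels are hypothetical names, not SDK types.

```cpp
#include <algorithm>

// One start position as written in the configuration (relative values).
struct FakeRectParams {
    double center_x;  // fake_rect_center_xN, relative to the image width
    double center_y;  // fake_rect_center_yN, relative to the image height
    double angle;     // fake_rect_angleN, roll angle in degrees
    double size;      // fake_rect_sizeN, relative to max(width, height)
};

// The same start position expressed in pixels.
struct PixelStartPosition {
    double center_x;
    double center_y;
    double angle_deg;
    double size;
};

// Converts relative start-position parameters to pixel units.
inline PixelStartPosition toPixels(const FakeRectParams& params,
                                   int image_width, int image_height) {
    PixelStartPosition out;
    out.center_x = params.center_x * image_width;
    out.center_y = params.center_y * image_height;
    out.angle_deg = params.angle;  // the angle is not scaled
    out.size = params.size * std::max(image_width, image_height);
    return out;
}
```

For a 640x480 image, a start position with center (0.5, 0.25) and size 0.5 maps to a center at (320, 120) pixels and a size of 320 pixels.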