Version: 3.12.0 (latest)

Face Capturing

Face detection consists of several stages:

  1. Detecting a face in the image. The result of this stage is a rectangle (a frame) around the detected face. There are several types of detectors that are described below.
  2. Fitting (positioning) of anthropometric points. The result of this stage is a set of anthropometric points with 2D/3D coordinates linked to a specific detected face. There are several types of fitters that use different sets of anthropometric points, which are described below.
  3. Calculating head rotation angles relative to the observation axis. The result of this stage is three head rotation angles: pitch, yaw, roll. The accuracy of these angles depends on the set of anthropometric points used.

The Capturer class is used for face detection. Specify a configuration file when creating an object of this class. The configuration file defines the detector type and the set of anthropometric points to use (see Configuration files). It also contains detection parameters that affect the quality and speed of the entire algorithm. The detector and the set of anthropometric points are indicated in the name of the configuration file. For example, common_capturer_blf_fda_front.xml uses the blf detector and the fda set of points.
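The naming convention can be read mechanically. The following is a minimal sketch, assuming that file names are underscore-separated tokens and that the tokens match the detector and point-set names listed in this document. CapturerConfigInfo and parseCapturerConfigName are hypothetical helpers for illustration, not part of the SDK.

```cpp
#include <array>
#include <sstream>
#include <string>

// Hypothetical helper type, not part of the SDK.
struct CapturerConfigInfo {
    std::string detector;   // e.g. "blf"; empty if no detector token is found
    std::string point_set;  // e.g. "fda"; empty if no point-set token is found
    bool is_video = false;  // true if the name contains the token "video"
};

// Splits a configuration file name on '_' and matches each token against
// the detector and point-set names mentioned in the documentation.
inline CapturerConfigInfo parseCapturerConfigName(const std::string& file_name) {
    static const std::array<const char*, 4> kDetectors = {{"lbf", "blf", "refa", "uld"}};
    static const std::array<const char*, 5> kPointSets = {
        {"esr", "singlelbf", "doublelbf", "fda", "mesh"}};

    // Drop the extension, if any (rfind returns npos when there is no dot,
    // in which case substr keeps the whole name).
    const std::string stem = file_name.substr(0, file_name.rfind('.'));

    CapturerConfigInfo info;
    std::istringstream tokens(stem);
    std::string token;
    while (std::getline(tokens, token, '_')) {
        if (token == "video") info.is_video = true;
        for (const char* d : kDetectors)
            if (info.detector.empty() && token == d) info.detector = token;
        for (const char* p : kPointSets)
            if (info.point_set.empty() && token == p) info.point_set = token;
    }
    return info;
}
```

For common_capturer_blf_fda_front.xml this yields the detector blf and the point set fda. Names that carry no such tokens, such as common_capturer4.xml, leave both fields empty.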

Face detection can also be combined with face tracking in a video stream. In this case, the algorithm assumes that consecutive frames of a video are provided as input, so faces are tracked from frame to frame. In addition to the stages described above, a face tracking stage is added, and a unique identifier is assigned to each face. This id does not change from the moment a face is detected in the video stream until the moment it is lost. Such configurations contain the word video in the file name, for example, common_video_capturer_fda.xml. Currently, two trackers are available:

  • common_video_capturer - provides high speed, but the quality is lower compared to fda_tracker_capturer
  • fda_tracker_capturer - provides high quality, but the speed is lower compared to common_video_capturer

Detectors

Currently, the following detectors are available:

  • LBF – an outdated detector, not recommended for use;
  • BLF – a detector that provides higher quality and faster detection than LBF for faces of a medium size and larger (including masked faces). On Android, you can use GPU acceleration (enabled by default);
  • REFA – a detector that is slower than the LBF and BLF detectors but provides better detection quality for faces of various sizes (including masked faces). Recommended for use in expert systems;
  • ULD – a new detector that is faster than REFA. This detector allows you to detect faces of various sizes (including masked faces).

For the BLF, REFA, and ULD detectors, you can get the detection confidence level by calling the RawSample.getScore() method. It returns a float in the range [0, 1].
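A common use of the confidence level is to discard weak detections. The sketch below is self-contained and does not call the SDK: ScoredFace is a stand-in type, and in real code the score field would be filled from the value returned by getScore(). The threshold value is application-specific.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Stand-in for a detected face (not an SDK type).
struct ScoredFace {
    int id;
    float score;  // detection confidence in [0, 1]
};

// Returns only the faces whose confidence is at least `threshold`.
inline std::vector<ScoredFace> keepConfident(const std::vector<ScoredFace>& faces,
                                             float threshold) {
    std::vector<ScoredFace> kept;
    std::copy_if(faces.begin(), faces.end(), std::back_inserter(kept),
                 [threshold](const ScoredFace& f) { return f.score >= threshold; });
    return kept;
}
```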

For the LBF, REFA, and ULD detectors, you can set the minimum size of faces to detect using the min_size parameter (see the Detailed Info about Capturer Configuration Parameters section). Decreasing the value of this parameter increases the detection time.

Below you can see the examples of the operation of different detectors in different conditions.

Detectors and settings compared in the table: BLF (score_threshold=0.6), REFA (min_size=0.2, score_threshold=0.89), ULD (min_size=10, score_threshold=0.7).

Below you can see the examples of the operation of different detectors at different thresholds.

Detectors and thresholds compared in the table: ULD (score_threshold=0.4), ULD (score_threshold=0.7), REFA (score_threshold=0.89).

When a detector is created, you can use it to detect / track faces. There are two ways to pass an image to the detector:

  • pass the decoded image data to the Capturer.capture(RawImage image) method, using the RawImage class (see Samples)
  • pass the encoded image data in JPG, PNG, TIF or BMP format to the Capturer.capture(byte[] data) method

In both cases, the result is a vector of detected / tracked faces (RawSample is the object storing the captured face).
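Before passing an encoded buffer, it can be useful to check that it looks like one of the supported formats. The sketch below is not part of the SDK. It is a hypothetical helper, sniffImageFormat, that inspects the well-known leading signature bytes of JPG, PNG, TIF and BMP files. It checks only the signature, not whether the rest of the file is valid.

```cpp
#include <algorithm>
#include <cstdint>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical helper, not part of the SDK: guesses the format of an
// encoded image from its leading signature bytes.
// Returns "jpg", "png", "tif", "bmp", or "unknown".
inline std::string sniffImageFormat(const std::vector<std::uint8_t>& data) {
    auto startsWith = [&data](std::initializer_list<std::uint8_t> magic) {
        return data.size() >= magic.size() &&
               std::equal(magic.begin(), magic.end(), data.begin());
    };
    if (startsWith({0xFF, 0xD8, 0xFF})) return "jpg";
    if (startsWith({0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A})) return "png";
    if (startsWith({0x49, 0x49, 0x2A, 0x00}) ||   // little-endian TIFF ("II*\0")
        startsWith({0x4D, 0x4D, 0x00, 0x2A})) {   // big-endian TIFF ("MM\0*")
        return "tif";
    }
    if (startsWith({0x42, 0x4D})) return "bmp";   // "BM"
    return "unknown";
}
```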

For a tracker, you can also call the Capturer.resetHistory method to start tracking on a new video sequence.

Anthropometric Points

Note: Learn how to display anthropometric points and head rotation angles in our tutorial.

There are five sets of anthropometric points: esr, singlelbf, doublelbf, fda, mesh.

  • The esr set is our first set that was the only set available in previous SDK versions. The esr set contains 47 points.
  • The singlelbf and doublelbf sets provide higher accuracy than esr. The singlelbf set contains 31 points. The doublelbf set contains 101 points. The doublelbf set consists of two concatenated sets: the last 31 points of doublelbf duplicate the singlelbf set (in the same order).
  • The fda set provides high accuracy in a wide range of facial angles (up to the full profile), in contrast to the previous sets, so we recommend using detectors with this set. However, recognition algorithms still require face samples to be close to frontal. The fda set contains 21 points.
  • At the moment, the mesh set is the newest. It contains 470 3D points of a face. Use this set to get a 3D face mesh.

fda set of points. RawSample.getLeftEye returns point 7. RawSample.getRightEye returns point 10.
esr set of points. RawSample.getLeftEye returns point 16. RawSample.getRightEye returns point 17.
singlelbf set of points. RawSample.getLeftEye returns point 29. RawSample.getRightEye returns point 30.
First 70 points of the doublelbf set of points (the remaining 31 points are taken from singlelbf). RawSample.getLeftEye returns point 68. RawSample.getRightEye returns point 69.
mesh set of points. RawSample.getLeftEye returns point 468. RawSample.getRightEye returns point 469.
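
The sizes and eye indices stated in this section can be captured in a small lookup. The sketch below is self-contained and assumes zero-based indexing, which the mesh set suggests (470 points, eyes at 468 and 469). Point3, eyeIndices and singleLbfFromDoubleLbf are illustrative names, not SDK API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct Point3 { float x, y, z; };  // stand-in for the SDK's point type

struct EyeIndices { std::size_t left, right; };

// Eye point indices per point set, as stated in this section.
inline EyeIndices eyeIndices(const std::string& point_set) {
    static const std::map<std::string, EyeIndices> kTable = {
        {"fda", {7, 10}},        {"esr", {16, 17}},   {"singlelbf", {29, 30}},
        {"doublelbf", {68, 69}}, {"mesh", {468, 469}},
    };
    const auto it = kTable.find(point_set);
    if (it == kTable.end()) throw std::invalid_argument("unknown point set: " + point_set);
    return it->second;
}

constexpr std::size_t kSingleLbfCount = 31;
constexpr std::size_t kDoubleLbfCount = 101;

// Returns the last 31 points of a doublelbf set, which duplicate the
// singlelbf set in the same order.
inline std::vector<Point3> singleLbfFromDoubleLbf(const std::vector<Point3>& doublelbf) {
    assert(doublelbf.size() == kDoubleLbfCount);
    const auto first = doublelbf.begin() +
        static_cast<std::ptrdiff_t>(kDoubleLbfCount - kSingleLbfCount);
    return std::vector<Point3>(first, doublelbf.end());
}
```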

Extended Set of Eye Points

In addition to the standard set of anthropometric points, you can get an extended set of eye points, which includes points of pupils and eyelids. To get this set, call the RawSample.getIrisLandmarks() method. This will return a vector of 40 points for the left and right eyes in the order shown in the image below. For each eye, 20 points are returned: the first 5 points refer to the pupil (its center and points on the circle), the remaining 15 points form the contour of the eyelids. An example of rendering is available in demo (C++/Java/C#).

To get this set, turn on the iris_enabled parameter in the configuration file (for example, using the method for overriding parameters in the configuration file: overrideParameter). If the parameter is turned off, the returned vector will be empty.
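
The layout of the 40-point vector can be unpacked as follows. This is a sketch under two assumptions that the text does not confirm: the 20 points of each eye are contiguous, and the left eye comes first (the authoritative order is the one shown in the image). Point2, EyeLandmarks, IrisLandmarks and splitIrisLandmarks are illustrative names, not SDK API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point2 { float x, y; };  // stand-in for the SDK's point type

struct EyeLandmarks {
    std::vector<Point2> pupil;    // 5 points: center and points on the circle
    std::vector<Point2> eyelids;  // 15 points: eyelid contour
};

struct IrisLandmarks {
    EyeLandmarks left;   // ASSUMPTION: left eye occupies points 0..19
    EyeLandmarks right;  // ASSUMPTION: right eye occupies points 20..39
    bool available = false;  // false when the input vector is empty
};

constexpr std::size_t kPointsPerEye = 20;
constexpr std::size_t kPupilPoints = 5;

inline IrisLandmarks splitIrisLandmarks(const std::vector<Point2>& pts) {
    IrisLandmarks out;
    if (pts.empty()) return out;  // iris_enabled is turned off
    assert(pts.size() == 2 * kPointsPerEye);

    // Copies `count` points starting at index `from`.
    auto slice = [&pts](std::size_t from, std::size_t count) {
        const auto first = pts.begin() + static_cast<std::ptrdiff_t>(from);
        return std::vector<Point2>(first, first + static_cast<std::ptrdiff_t>(count));
    };
    out.left.pupil = slice(0, kPupilPoints);
    out.left.eyelids = slice(kPupilPoints, kPointsPerEye - kPupilPoints);
    out.right.pupil = slice(kPointsPerEye, kPupilPoints);
    out.right.eyelids = slice(kPointsPerEye + kPupilPoints, kPointsPerEye - kPupilPoints);
    out.available = true;
    return out;
}
```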

Capturer Class Reference

To capture faces, create a Capturer object using FacerecService.createCapturer, passing either the path to the configuration file or a FacerecService.Config object. If you pass the path to the configuration file, the default settings are used. With FacerecService.Config you can override any numerical option inside the config file. Some parameters can also be changed in an existing Capturer object with the Capturer.setParameter method. See the Capturer Usage Examples section.

The type and characteristics of the capturer depend on the configuration file or the FacerecService.Config object passed to the FacerecService.createCapturer member function.

Note: We recommend using VideoWorker for face tracking on video streams. When VideoWorker is created with matching_thread=0 and processing_thread=0, the standard Face Detector license is used.

Capturer Usage Examples

First Example

pbio::Capturer::Ptr capturer = service->createCapturer("common_capturer4.xml");

Second Example

pbio::FacerecService::Config capturer_config("common_capturer4.xml");
capturer_config.overrideParameter("min_size", 200);
pbio::Capturer::Ptr capturer = service->createCapturer(capturer_config);

Third Example

pbio::Capturer::Ptr capturer = service->createCapturer("common_capturer4.xml");
capturer->setParameter("min_size", 200);
capturer->setParameter("max_size", 800);
// capturer->capture(...);
// ...
capturer->setParameter("min_size", 100);
capturer->setParameter("max_size", 400);
// capturer->capture(...);

Face Cropping

A face can be cropped by one of the following methods:

  • RawSample.cutFaceImage: the cropped face is saved to the specified stream (for example, to a file), the encoding format is selected via RawSample.ImageFormat
  • RawSample.cutFaceRawImage: the cropped face is returned in the RawImage format, which stores the decoded image pixels in the RGB/BGR/GRAY format (the format is selected via RawImage.Format)

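Example of using RawSample.cutFaceRawImage:

// Crop the face as raw BGR pixels using the ISO/IEC 19794-5 Full Frontal cut.
auto raw_image_crop = sample->cutFaceRawImage(
    pbio::RawImage::Format::FORMAT_BGR,
    pbio::RawSample::FACE_CUT_FULL_FRONTAL);
// Wrap the pixels in an OpenCV matrix. cv::Mat does not copy external data,
// so raw_image_crop must stay alive while img_crop is in use.
cv::Mat img_crop(raw_image_crop.height, raw_image_crop.width, CV_8UC3, (void*) raw_image_crop.data);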

Available face cropping types (RawSample.FaceCutType):

  • FACE_CUT_BASE - unspecified cropping (any sample type).
  • FACE_CUT_FULL_FRONTAL - ISO/IEC 19794-5 Full Frontal (for ID, travel documents) (only frontal sample type). It is used for saving face images in electronic biometric documents.
  • FACE_CUT_TOKEN_FRONTAL - ISO/IEC 19794-5 Token Frontal (fixed eye positions) (only frontal sample type).

To preview the cropping, call the RawSample.getFaceCutRectangle method, specifying the cropping type. It returns four points: the corners of the rectangle that will be used for cropping.
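
Because the preview is given as four corner points rather than an origin and a size, the rectangle is not necessarily axis-aligned. One way to work with it, for example to check that the crop lies inside the frame, is to compute its axis-aligned bounding box. The sketch below is self-contained: Point2, Box, boundingBox and fitsInside are illustrative names, not SDK API.

```cpp
#include <algorithm>
#include <array>

struct Point2 { float x, y; };          // stand-in for the SDK's point type
struct Box { float x, y, width, height; };

// Axis-aligned bounding box of the four corner points.
inline Box boundingBox(const std::array<Point2, 4>& corners) {
    float min_x = corners[0].x, max_x = corners[0].x;
    float min_y = corners[0].y, max_y = corners[0].y;
    for (const Point2& p : corners) {
        min_x = std::min(min_x, p.x);
        max_x = std::max(max_x, p.x);
        min_y = std::min(min_y, p.y);
        max_y = std::max(max_y, p.y);
    }
    return Box{min_x, min_y, max_x - min_x, max_y - min_y};
}

// True if the box lies entirely inside an image of the given size.
inline bool fitsInside(const Box& b, int image_width, int image_height) {
    return b.x >= 0.f && b.y >= 0.f &&
           b.x + b.width <= static_cast<float>(image_width) &&
           b.y + b.height <= static_cast<float>(image_height);
}
```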

See the example of usage in Samples.

RawSample Class Reference

With RawSample you can:

  • get track id (RawSample.getID) – only if the sample was captured from a tracker
  • get a face rectangle (RawSample.getRectangle), angles (RawSample.getAngles), left / right eyes (RawSample.getLeftEye / RawSample.getRightEye, see Anthropometric Points), anthropometric points (RawSample.getLandmarks, see Anthropometric Points) – only if the face is frontal (i.e., captured with a frontal detector / tracker)
  • crop the face (see Face Cropping)
  • downscale an internal face image to a suitable size (RawSample.downscaleToPreferredSize)
  • serialize an object in a binary stream (RawSample.save or RawSample.saveWithoutImage), you can deserialize it later using FacerecService.loadRawSample or FacerecService.loadRawSampleWithoutImage
  • pass it to the methods that estimate age, gender, quality and liveness (see Face Estimation)
  • provide it to Recognizer.processing for template creation (see Face Identification, test_identify)

Detailed Info about Capturer Configuration Parameters

The following parameters inside the configuration files can be changed using the FacerecService.Config.overrideParameter method:
  • max_processed_width and max_processed_height – (for trackers only) limit the size of the image that is passed to the internal detector of new faces.
  • min_size and max_size – minimum and maximum face size for detection (for trackers: the size is defined for an image already downscaled according to the restrictions max_processed_width and max_processed_height).
  • min_neighbors – an integer detector parameter. Please note that large values require higher detection confidence. You can change this parameter based on the situation, for example, increase the value if a large number of false detections are observed or decrease the value if a large number of faces are not detected. Do not change this parameter if you are not sure.
  • min_detection_period – (for trackers only) a real number that means the minimum time (in seconds) between two runs of the internal detector. A zero value means ‘no restrictions’. Used to reduce the processor load. Large values increase the latency in detection of new faces.
  • max_detection_period – (for trackers only) an integer that means the max time (in frames) between two runs of the internal detector. A zero value means ‘no restrictions’. For example, if you are processing a video offline, you can set the value to 1 so as not to miss a single person.
  • max_occlusion_time_wait – (for trackers only) a real number in seconds. When the tracker detects face occlusion, it holds the face position and tries to track it on new frames during this time.
  • fda_max_bad_count_wait – an integer. When fda_tracker detects a decline in the face quality, it tries to track this face with the general purpose tracker (instead of the fda method designed and tuned for faces) for at most fda_max_bad_count_wait frames.
  • base_angle – an integer: 0, 1, 2, or 3. Set camera orientation: 0 means standard (default), 1 means +90 degrees, 2 means -90 degrees, 3 means 180 degrees.
  • fake_detections_cnt – an integer. Number of start positions to search a face using video_worker_fdatracker_fake_detector.xml.
  • fake_detections_period – an integer. Each start position will be used once in fake_detections_period frames.
  • fake_rect_center_xN, fake_rect_center_yN, fake_rect_angleN, fake_rect_sizeN – real numbers. Parameters of start positions. N ranges from 0 to fake_detections_cnt – 1 inclusive. fake_rect_center_xN – x coordinate of the center relative to the image width. fake_rect_center_yN – y coordinate of the center relative to the image height. fake_rect_angleN – roll angle in degrees. fake_rect_sizeN – size relative to max(image width, image height).
  • downscale_rawsamples_to_preferred_size – an integer, 1 means enabled, 0 means disabled. Default value is enabled. When enabled, Capturer downscales each sample to the suitable size (see RawSample.downscaleToPreferredSize) in order to reduce memory consumption. However, it decreases the performance. It's recommended to disable downscale_rawsamples_to_preferred_size and use RawSample.downscaleToPreferredSize manually for RawSamples that you need to save or keep in RAM for a long time.
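
The relative fake_rect_* values and the base_angle codes described above translate into pixels and degrees as sketched below. The code is self-contained and illustrative only: FakeRect, PixelRect, toPixels and baseAngleToDegrees are hypothetical names, and the SDK performs its own conversion internally.

```cpp
#include <algorithm>
#include <stdexcept>

// One start position as written in the configuration (relative units).
struct FakeRect {
    double center_x;   // relative to the image width
    double center_y;   // relative to the image height
    double angle_deg;  // roll angle in degrees
    double size;       // relative to max(image width, image height)
};

// The same start position in pixels.
struct PixelRect {
    double center_x;
    double center_y;
    double angle_deg;
    double size;
};

inline PixelRect toPixels(const FakeRect& r, int image_width, int image_height) {
    PixelRect p;
    p.center_x = r.center_x * image_width;
    p.center_y = r.center_y * image_height;
    p.angle_deg = r.angle_deg;
    p.size = r.size * std::max(image_width, image_height);
    return p;
}

// Maps the base_angle code to the camera rotation in degrees.
inline int baseAngleToDegrees(int base_angle) {
    switch (base_angle) {
        case 0: return 0;    // standard orientation (default)
        case 1: return 90;
        case 2: return -90;
        case 3: return 180;
        default: throw std::invalid_argument("base_angle must be 0, 1, 2, or 3");
    }
}
```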