In this tutorial, you'll learn how to perform liveness detection in a video stream with Face SDK and an RGBD sensor. As a rule, liveness detection is used to prevent spoofing attacks (when a person tries to subvert or attack a face recognition system by using a picture or a video and thereby gaining illegitimate access).
With Face SDK, you can perform liveness detection by analyzing either a depth map or an RGB image from your sensor. The depth-based method is more accurate, which is why we'll use it in this tutorial.
This tutorial is based on Face Recognition in a Video Stream and the corresponding project. In this project, we'll also use a ready-made database of faces for recognition. After you run the project, you'll see RGB and depth maps, which you can use to correct your position relative to the sensor: to ensure the stable performance of a liveness detector, your face should be at a suitable distance from a sensor, and the quality of the depth map should be sufficient.
A detected and recognized face is highlighted with a green rectangle on the RGB image. Next to the detected face, you'll see a picture and the name of the person from the database, along with the liveness status REAL. If a person isn't recognized, the liveness status will still be REAL, but the bounding rectangle will be red. If a detected face is taken from a picture or a video, the bounding rectangle will be red and recognition won't be performed; in this case, the liveness status will be FAKE.
Besides Face SDK and Qt, you'll need:
- An RGBD sensor with OpenNI2 or RealSense2 support (for example, ASUS Xtion or RealSense D415);
- OpenNI2 or RealSense2 distribution package.
You can find the tutorial project in Face SDK: `examples/tutorials/depth_liveness_in_face_recognition`
- First of all, we have to import the necessary libraries to work with the depth camera. You can use either an OpenNI2 sensor (for example, ASUS Xtion) or a RealSense2 sensor. Depending on the camera you're using, specify the corresponding condition in the .pro file.
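A minimal sketch of such a switch in the .pro file might look like the fragment below. The `with_openni2` / `with_realsense` flag names and `WITH_OPENNI2` / `WITH_REALSENSE` defines are assumptions for illustration, not necessarily the project's actual names:

```qmake
# Hypothetical backend switch -- enable exactly one of these.
CONFIG += with_openni2
#CONFIG += with_realsense

with_openni2 {
    DEFINES += WITH_OPENNI2    # compile the OpenNI2 capture path
}
with_realsense {
    DEFINES += WITH_REALSENSE  # compile the RealSense2 capture path
}
```

The `DEFINES` entries can then guard the backend-specific includes and source files in the project.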
- [For OpenNI2 sensors] Specify the path to the OpenNI2 distribution package and also the paths to the necessary OpenNI2 libraries and headers.
Note: For Windows, you have to install OpenNI2 and specify the path to the installation directory. For Linux, you just need to specify the path to the unpacked archive.
- [For RealSense sensors] Specify the path to the RealSense2 distribution package and the paths to the necessary RealSense2 libraries and headers. In the `win32` block, we determine the platform bitness to set the correct paths to the RealSense libraries.
Note: For Windows, you have to install RealSense2 and specify the path to the installation directory. For Linux, you have to install RealSense2 as described at the Intel RealSense website.
- At this stage, we need to retrieve a depth frame from the RGBD sensor using the OpenNI2 API or RealSense2 API, depending on the camera used. We won't elaborate on retrieving the depth frames; instead, we'll use the headers from one of the Face SDK samples (video_recognition_demo). In the .pro file of the project, specify the path to the folder `examples/cpp/video_recognition_demo/src` from Face SDK.
- Specify the necessary headers to work with OpenNI2 and RealSense2 cameras. You can find detailed information about retrieving the depth frames in the specified files.
- To use mathematical constants, define `_USE_MATH_DEFINES`; `cmath` is already imported in the project.
- In previous projects, we retrieved the image from a webcam using the `QCameraCapture` object. However, in this project we have to retrieve both RGB and depth frames. To do this, let's create a new class `DepthSensorCapture`: Add New > C++ > C++ Class > Choose… > Class name – DepthSensorCapture > Base class – QObject > Next > Project Management (default settings) > Finish.
- In `depthsensorcapture.h`, import the `ImageAndDepthSource` header. Also import `QSharedPointer` to handle pointers, `QThread` to process threads, `QByteArray` to work with byte arrays, and `memory` and `atomic` to handle smart pointers and atomic types, respectively. In `depthsensorcapture.cpp`, import the headers `OpenniSource` and `RealSenseSource` to retrieve the depth frames, and also import `depthsensorcapture.h`. We use `assert.h` to handle errors and `QMessageBox` to display error messages.
- Define `RGBFramePtr`, which is a pointer to an RGB frame, and `DepthFramePtr`, which is a pointer to a depth frame. The `DepthSensorCapture` class constructor takes a parent widget and a pointer to a worker. The sensor data is received in an endless loop. To prevent the main thread, where the interface is rendered, from waiting for this loop to finish, we create another thread and move the `DepthSensorCapture` object into it.
- In `DepthSensorCapture::start`, we start the thread where the data is received, and we stop it in `DepthSensorCapture::stop`. In `DepthSensorCapture::frameUpdatedThread`, we process a new frame from the sensor in an endless loop and pass it to `addFrame`. If an error occurs, an error message box is displayed.
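The start/stop pattern above can be sketched framework-free with the standard library. This is a minimal illustration of the idea (an endless capture loop in its own thread, stopped by an atomic flag), not the project's actual Qt-based implementation; all names are illustrative:

```cpp
#include <atomic>
#include <functional>
#include <thread>

// Minimal sketch of the capture-thread pattern: the endless loop runs in
// its own thread so the UI thread never blocks; an atomic flag stops it.
class CaptureLoop {
public:
    explicit CaptureLoop(std::function<void()> grab_one_frame)
        : grab_(std::move(grab_one_frame)) {}

    void start() {
        running_ = true;
        thread_ = std::thread([this] {
            while (running_) {
                grab_();  // read one RGB-D frame and hand it to addFrame()
            }
        });
    }

    void stop() {
        running_ = false;      // the loop exits on its next iteration
        if (thread_.joinable())
            thread_.join();    // wait for the capture thread to finish
    }

private:
    std::function<void()> grab_;
    std::atomic<bool> running_{false};
    std::thread thread_;
};
```

In the real project the loop body also catches exceptions and shows a `QMessageBox` with the error text.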
- The `VideoFrame` object should contain an RGB frame from the sensor. In `videoframe.h`, import the `depthsensorcapture` header to work with the depth sensor. The `IRawImage` interface allows us to receive a pointer to the image data, as well as the image height and width.
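As a rough sketch, an `IRawImage`-style interface exposes raw pixel data plus dimensions. The concrete `RgbFrame` class below is an illustrative stand-in, not Face SDK's actual implementation:

```cpp
#include <cstdint>
#include <vector>

// Abstract view of an image: raw bytes plus dimensions.
class IRawImage {
public:
    virtual ~IRawImage() = default;
    virtual const uint8_t* data() const = 0;  // pointer to pixel data
    virtual int width() const = 0;
    virtual int height() const = 0;
};

// Illustrative implementation wrapping a 3-channel RGB buffer.
class RgbFrame : public IRawImage {
public:
    RgbFrame(int w, int h) : w_(w), h_(h), pixels_(w * h * 3, 0) {}
    const uint8_t* data() const override { return pixels_.data(); }
    int width() const override { return w_; }
    int height() const override { return h_; }

private:
    int w_, h_;
    std::vector<uint8_t> pixels_;
};
```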
- In the `runProcessing` method, we start the camera, and in the `stopProcessing` method, we stop it.
- In `worker.h`, we import the necessary headers. The `SharedImageAndDepth` structure contains the pointers to an RGB frame and a depth frame from the sensor, as well as the `pbio::DepthMapRaw` structure with information about the depth map parameters (width, height, etc.). The pointers are used in `Worker`. Due to some delay in frame processing, a certain number of frames is queued for rendering; to save memory, we store pointers to the frames instead of the frames themselves. We pass the `SharedImageAndDepth` frame, which holds the RGB and depth frames for rendering, to the drawing queue. In `TrackingCallback`, we extract the image corresponding to the last received result from the frame queue.
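The pointer-based queue can be sketched as follows. This is an illustrative model of the idea (shared pointers avoid copying whole images while frames wait between capture and the tracking callback); the structure and method names are assumptions, not the project's exact code:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// Frames are shared, not copied, while queued for rendering.
using FramePtr = std::shared_ptr<std::vector<uint8_t>>;

struct SharedImageAndDepth {
    FramePtr rgb;    // pointer to the RGB frame
    FramePtr depth;  // pointer to the depth frame
};

// Frames keyed by id so the callback can fetch the image its result
// belongs to, then discard everything older.
class FrameQueue {
public:
    void add(int64_t id, SharedImageAndDepth f) { frames_[id] = std::move(f); }

    // Pop the frame for `id` and drop all frames up to and including it.
    SharedImageAndDepth take(int64_t id) {
        SharedImageAndDepth f = frames_[id];
        frames_.erase(frames_.begin(), frames_.upper_bound(id));
        return f;
    }

    std::size_t size() const { return frames_.size(); }

private:
    std::map<int64_t, SharedImageAndDepth> frames_;
};
```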
- In `Worker::Worker`, override the values of some parameters of the `VideoWorker` object to process the depth map, namely: `depth_data_flag` ("1" turns on depth frame processing to confirm face liveness) and `weak_tracks_in_tracking_callback` ("1" means that all samples, even the ones flagged as `weak == true`, are passed to `TrackingCallback`). The `weak` flag becomes `true` if a sample doesn't pass certain tests, for example:
- if there are too many shadows on a face (insufficient lighting)
- if an image is blurry
- if a face is turned at a great angle
- if the size of a face in the frame is too small
- if a face hasn't passed the liveness test (for example, if it's taken from a photo or a video)
You can find detailed information about lighting conditions, camera positioning, etc. in Guidelines for Cameras. As a rule, samples that haven't passed the tests are not processed and not used for recognition. However, in this project we want to highlight all the faces, even the ones taken from a picture (i.e., the ones that haven't passed the liveness test). Therefore, we have to pass all samples to `TrackingCallback`, even if they're flagged as `weak`.
- In `worker.h`, specify the enumeration `pbio::DepthLivenessEstimator`, which holds the result of liveness estimation. All in all, there are four liveness statuses:
- NOT_ENOUGH_DATA means that face information is insufficient. This situation may occur if the depth map quality is poor or the user is too close/too far from the sensor.
- REAL means that the face belongs to a real person.
- FAKE means that the face is taken from a picture or a video.
- NOT_COMPUTED means that the face wasn't checked. This situation may occur, for example, if the frames from the sensor are not synchronized (an RGB frame is received but a corresponding depth frame wasn't found in a certain time range).
The liveness test result is stored in the `face.liveness_status` variable for further rendering.
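The mapping from the four verdicts to the labels rendered next to a face can be sketched like this. The enum and function names are illustrative; the SDK's actual type lives in `pbio::DepthLivenessEstimator`:

```cpp
#include <string>

// The four liveness verdicts described above.
enum class Liveness { NOT_ENOUGH_DATA, REAL, FAKE, NOT_COMPUTED };

// Illustrative mapping from verdict to the on-screen label.
std::string liveness_label(Liveness v) {
    switch (v) {
        case Liveness::REAL:            return "real";
        case Liveness::FAKE:            return "fake";
        case Liveness::NOT_ENOUGH_DATA: return "not_enough_data";
        default:                        return "not_computed";
    }
}
```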
- In `Worker::addFrame`, pass the last depth frame to Face SDK using the `VideoWorker::addDepthFrame` method and store it for further processing.
- Prepare and pass the RGB image to Face SDK using `VideoWorker::addVideoFrame`. If the format of the received image is BGR instead of RGB, the byte order is changed so that the image colors are displayed correctly. If a depth frame wasn't received together with an RGB frame, the last received depth frame is used. A pair of the depth and RGB frames is queued in `_frames` in order to find the data corresponding to the processing result in `TrackingCallback`.
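The BGR-to-RGB fix-up mentioned above amounts to swapping the first and third byte of every 3-byte pixel. A minimal sketch (the function name is illustrative):

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Swap the B and R channels of a tightly packed 3-channel buffer in place,
// so a BGR image renders with correct colors as RGB.
void bgr_to_rgb(std::vector<uint8_t>& pixels) {
    for (std::size_t i = 0; i + 2 < pixels.size(); i += 3)
        std::swap(pixels[i], pixels[i + 2]);
}
```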
- Let's modify the drawing function. The `frame` field of the `Worker::DrawingData` structure contains the pointers to the RGB frame data and depth frame data, as well as the depth frame parameters (width, height, etc.). For convenience, we'll create the references `const QImage& color_image`, `const QByteArray& depth_array`, and `const pbio::DepthMapRaw& depth_options` to refer to these data. The RGB image and depth map will be displayed in the `QImage result`, which can be considered a sort of "background" containing both images (the RGB image at the top and the depth map at the bottom). Before that, we have to convert the 16-bit depth values to 8-bit values so that the depth map is displayed correctly (in grayscale). In the `max_depth_mm` value, specify the maximum distance from the sensor to the user (usually 10 meters).
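The 16-bit to 8-bit conversion can be sketched as a simple linear scale of the raw millimeter depth into a grayscale byte, clamped at `max_depth_mm`. The function name is illustrative, not the project's actual helper:

```cpp
#include <cstdint>

// Scale a raw 16-bit depth (in millimeters) into an 8-bit grayscale
// value, clamping everything at or beyond max_depth_mm (10 m by default).
uint8_t depth_to_gray(uint16_t depth_mm, uint16_t max_depth_mm = 10000) {
    if (depth_mm >= max_depth_mm)
        return 255;
    return static_cast<uint8_t>(depth_mm * 255 / max_depth_mm);
}
```

Applied per pixel, this yields the grayscale depth map rendered below the RGB image.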
- Form the depth image from the converted values. Create the `result` object, which is used to display the RGB image (at the top) and the depth map (at the bottom), and render both images.
- Display the liveness status next to the face, depending on the information received from the liveness detector. Specify the label parameters (color, line, size). A bounding rectangle is also drawn on the depth map so that you can make sure that the RGB frame and the depth frame are aligned.
- Run the project. You should see an RGB image and a depth map from the sensor, as well as the following information about a detected face:
- detection and recognition status, indicated by the color of the bounding rectangle: green means that a person is detected and found in the database; red means that a person is not recognized or the face is taken from an image or a video;
- information about the recognized person (his/her image and name from the database);
- liveness status: real means that the person is real, fake means that the face is taken from an image or a video, not_enough_data means that the depth map quality is poor or the person is too close to/too far from the sensor, and not_computed means that the RGB and depth frames are not synchronized.