Monday, March 2, 2015

Real-time hand gesture recognition using finger segmentation


Introduction
In the proposed method, the hand region is first extracted from the background using the background subtraction method. Then, the palm and fingers are segmented so as to detect and recognize the fingers. Finally, a rule classifier is applied to predict the labels of the hand gestures.

The novelties of the proposed method are as follows:

·         The hand gesture recognition is based on the result of finger recognition. Therefore, the recognition is accomplished by a simple and efficient rule classifier.

·         Some previous works need users to wear a data glove to acquire hand gesture data; however, the special sensors of a data glove are expensive, which hinders its wide application in real life. Other works use a TOF camera (i.e., the Kinect sensor) to capture the depth of the environment, together with a special tape worn across the wrist to detect the hand region. The proposed approach uses only a normal camera to capture the visual information of the hand gesture and does not need the help of a special tape to detect the hand region.

·         The third advantage of the proposed method is that it is highly efficient and suitable for real-time applications.

Method
As explained in the introduction, the hand is first detected using the background subtraction method, and the result of hand detection is transformed into a binary image. Then, the fingers and palm are segmented so as to facilitate finger recognition. Next, the fingers are detected and recognized. Finally, hand gestures are recognized using a simple rule classifier.

1.       Hand detection

The images are captured with a normal camera and taken under the same conditions, so the background of these images is identical. It is therefore easy and effective to detect the hand region in the original image using the background subtraction method. However, in some cases, other moving objects are included in the result of background subtraction. Skin color can be used to discriminate the hand region from the other moving objects; the color of the skin is measured with the HSV (hue, saturation, and value) model.
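A minimal OpenCV sketch of this hand-detection step is given below, assuming the camera is static so that a single background frame can be used; the HSV skin thresholds and the morphological clean-up are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

def detect_hand(frame_bgr, background_bgr,
                hsv_lower=(0, 30, 60), hsv_upper=(25, 180, 255)):
    """Return a binary mask of the hand region (white = hand)."""
    # Background subtraction: keep pixels that differ from the static background.
    diff = cv2.absdiff(frame_bgr, background_bgr)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, moving = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)

    # Skin-color check in HSV to reject other moving objects.
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    skin = cv2.inRange(hsv, np.array(hsv_lower, np.uint8),
                       np.array(hsv_upper, np.uint8))

    # Hand = moving AND skin-colored, cleaned with a small morphological opening.
    hand = cv2.bitwise_and(moving, skin)
    hand = cv2.morphologyEx(hand, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    return hand
```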

2.       Finger and Palm segmentation

The output of the hand detection is a binary image in which the white pixels are members of the hand region, while the black pixels belong to the background. Then, the following five procedures are applied to the binary hand image to segment the fingers and palm.

1.       Palm point
The palm point is defined as the center point of the palm. It is found using the distance transform (distance map) of the binary hand image. In the distance transform image, each pixel records the distance between it and the nearest boundary pixel; the city block distance is used to measure these distances. The pixel with the largest distance in the distance transform image of the binary hand image is chosen as the palm point.

2.       Inner circle of the maximal radius

When the palm point is found, a circle can be drawn inside the palm with the palm point as its center; the largest such circle that still fits inside the palm is the inner circle of maximal radius.
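The palm point and the maximal inner circle both fall out of the distance transform directly, as in the sketch below; the city block (L1) metric follows the description above, while the visualization at the end is only an illustrative usage example.

```python
import cv2

def palm_point_and_radius(hand):
    """hand: binary mask from the hand-detection step (white = hand)."""
    # City block (L1) distance from every white pixel to the nearest black pixel.
    dist = cv2.distanceTransform(hand, cv2.DIST_L1, 3)
    # The maximum of the distance map is the palm point, and its value is the
    # radius of the largest circle that fits inside the palm.
    _, max_val, _, max_loc = cv2.minMaxLoc(dist)
    return max_loc, max_val          # (x, y) palm point, maximal radius

# Usage example: draw the inner circle of maximal radius.
# palm_pt, radius = palm_point_and_radius(hand)
# vis = cv2.cvtColor(hand, cv2.COLOR_GRAY2BGR)
# cv2.circle(vis, palm_pt, int(radius), (0, 0, 255), 2)
```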

3.       Wrist points and palm mask

Some points are sampled uniformly along the circle. For each sampled point on the circle, its nearest boundary point is found and connected to it. A boundary point is identified in a simple way: if the 8 neighbors of a pixel include both white and black pixels, the pixel is labeled as a boundary point. Connected in order, these nearest boundary points form the palm mask, which roughly covers the palm region.

The two wrist points are the two end points of the wrist line, which runs across the bottom of the hand.
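A sketch of the palm-mask construction is shown below. The sampling circle is taken slightly larger than the maximal inner circle; the 1.2 scale factor and the number of sampled points are assumptions for illustration, and the nearest boundary points are found against the hand contour rather than with the 8-neighbor test, which is a simplification. The two wrist points can then be read off from the bottom of the resulting mask (not shown).

```python
import cv2
import numpy as np

def palm_mask_from_circle(hand, palm_pt, radius, n_samples=48, scale=1.2):
    """Sample points on an enlarged inner circle and join their nearest
    hand-boundary pixels into a filled polygon used as the palm mask."""
    # Outer contour of the hand (OpenCV 4 findContours signature assumed).
    contours, _ = cv2.findContours(hand, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    boundary = max(contours, key=cv2.contourArea).reshape(-1, 2)   # (N, 2) x, y

    cx, cy = palm_pt
    angles = np.linspace(0, 2 * np.pi, n_samples, endpoint=False)
    samples = np.stack([cx + scale * radius * np.cos(angles),
                        cy + scale * radius * np.sin(angles)], axis=1)

    # For each sampled circle point, take the nearest boundary pixel.
    d = np.linalg.norm(boundary[None, :, :] - samples[:, None, :], axis=2)
    nearest = boundary[np.argmin(d, axis=1)]

    # Joining the nearest boundary points in order gives the palm mask.
    mask = np.zeros_like(hand)
    cv2.fillPoly(mask, [nearest.astype(np.int32)], 255)
    return mask, nearest
```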

4.       Hand Rotation

When the palm point and the wrist points are obtained, an arrow can be drawn pointing from the palm point to the midpoint of the wrist line at the bottom of the hand. The arrow is then adjusted to the direction of north, and the hand image is rotated synchronously so as to make the hand gesture invariant to rotation.
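A sketch of the rotation step under these definitions follows; the reference direction used here (the palm-to-wrist arrow pointing straight down, so that the wrist stays at the bottom of the frame) is an assumption, since what matters is only that every frame is rotated to the same reference.

```python
import cv2
import numpy as np

def rotate_hand(hand, palm_pt, wrist_p1, wrist_p2):
    """Rotate the hand image about the palm point so that the arrow from the
    palm point to the midpoint of the wrist line has a fixed direction."""
    wrist_mid = ((wrist_p1[0] + wrist_p2[0]) / 2.0,
                 (wrist_p1[1] + wrist_p2[1]) / 2.0)
    # Angle of the palm-point -> wrist-midpoint arrow (image y axis points down).
    dx, dy = wrist_mid[0] - palm_pt[0], wrist_mid[1] - palm_pt[1]
    current = np.degrees(np.arctan2(dy, dx))
    target = 90.0            # assumed reference: arrow pointing straight down
    M = cv2.getRotationMatrix2D(palm_pt, current - target, 1.0)
    return cv2.warpAffine(hand, M, (hand.shape[1], hand.shape[0]))
```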

5.       Fingers and Palm segmentation

With the help of the palm mask, fingers and the palm can be segmented easily. The part of the hand that is covered by the palm mask is the palm, while the other parts of the hand are fingers.
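Given the binary hand image and the palm mask from the previous steps, this segmentation amounts to a couple of mask operations, as in the short sketch below.

```python
import cv2

def split_palm_and_fingers(hand, palm_mask):
    palm = cv2.bitwise_and(hand, palm_mask)   # hand pixels covered by the palm mask
    fingers = cv2.subtract(hand, palm)        # remaining hand pixels are the fingers
    return palm, fingers
```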

3.       Finger recognition

In the segmented finger image, a labeling algorithm is applied to mark the regions of the fingers. Detected regions in which the number of pixels is too small are regarded as noisy regions and discarded; only regions of sufficient size are regarded as fingers and retained. For each remaining region, that is, each finger, the minimal bounding box that encloses the finger is found, and the center of this bounding box is used to represent the center point of the finger.
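A sketch of this step using OpenCV's connected-component analysis is given below; the minimum-area threshold is an illustrative assumption, and an axis-aligned bounding box is used for simplicity (the hand has already been rotated upright).

```python
import cv2

def finger_regions(fingers, min_area=100):
    """fingers: binary image of the segmented fingers (white = finger pixels)."""
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(fingers)
    regions = []
    for i in range(1, n):                       # label 0 is the background
        x, y, w, h, area = stats[i]
        if area < min_area:
            continue                            # too small: treat as noise and discard
        center = (x + w // 2, y + h // 2)       # center of the bounding box
        regions.append({"bbox": (x, y, w, h), "center": center})
    return regions
```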


1.       Thumb detection and recognition.

The center of each finger is connected to the palm point by a line, and the angles between these lines and the wrist line are computed. If any angle is smaller than 50 degrees, the thumb appears in the hand image, and the corresponding center is the center point of the thumb; the detected thumb is marked with the number 1. If all the angles are larger than 50 degrees, the thumb does not exist in the image.
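The thumb rule translates directly into an angle test, sketched below; the 50-degree threshold comes from the description above, while the helper itself is a hypothetical illustration.

```python
import numpy as np

def find_thumb(finger_centers, palm_pt, wrist_p1, wrist_p2, thresh_deg=50.0):
    """Return the index of the thumb among the finger centers, or None."""
    wrist_vec = np.array(wrist_p2, float) - np.array(wrist_p1, float)
    for idx, center in enumerate(finger_centers):
        finger_vec = np.array(center, float) - np.array(palm_pt, float)
        cos_a = np.dot(finger_vec, wrist_vec) / (
            np.linalg.norm(finger_vec) * np.linalg.norm(wrist_vec) + 1e-9)
        angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
        angle = min(angle, 180.0 - angle)       # angle between lines, not vectors
        if angle < thresh_deg:
            return idx                          # this finger is the thumb (finger 1)
    return None                                 # all angles above the threshold: no thumb
```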

2.       Detection and recognition of other fingers.

To detect and recognize the other fingers, the palm line is first searched for. The palm line is parallel to the wrist line and is found as follows: starting from the row of the wrist line, a line parallel to the wrist line is drawn across the hand for each row. If there is only one connected set of white pixels in the intersection of the line and the hand, the line shifts upward. Once there is more than one connected set of white pixels in the intersection, the line is regarded as a candidate for the palm line. If the thumb is present, the line continues to move upward, now starting from the edge point of the palm instead of the thumb; since the thumb is excluded, there is again only one connected set of pixels in the intersection of the line and the hand. Once the number of connected sets of white pixels turns to two again, the palm line is found.
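A simplified sketch of this row scan is shown below; it counts the connected runs of white pixels in each row while moving upward from the wrist row, and for brevity it omits the special handling of the thumb described above.

```python
import numpy as np

def count_runs(row):
    """Number of connected sets of white pixels in one image row."""
    white = row > 0
    starts = white & ~np.concatenate(([False], white[:-1]))   # a run begins after a black pixel
    return int(np.sum(starts))

def find_palm_line(hand, wrist_row):
    """Scan upward from the wrist row until a row crosses more than one part of the hand."""
    for y in range(wrist_row, -1, -1):
        if count_runs(hand[y]) > 1:
            return y          # candidate palm line (exact in the no-thumb case)
    return None
```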

After the palm line is obtained, it is divided into four equal parts. If a finger falls into the first part, it is the forefinger; if it belongs to the second part, it is the middle finger; the third part corresponds to the ring finger; and the fourth part to the little finger.

In some cases, two or more fingers are held close together with no interval between them. To discriminate this case from that of a single finger, the width of the minimal bounding box is used as a discrimination index: if the width is close to the usual value, the detected region is a single finger; if the width is several times the usual value, the detected region corresponds to several fingers held together closely.
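Together, the last two paragraphs suggest a small naming routine like the sketch below: the palm line is split into four equal parts, each region is assigned the name of the part its center falls into, and a region whose bounding box is much wider than a typical finger is expanded into several adjacent finger names. The "usual width" value is an illustrative assumption.

```python
NAMES = ["forefinger", "middle finger", "ring finger", "little finger"]

def name_fingers(regions, palm_line_x0, palm_line_x1, usual_width=30):
    """regions: list of dicts with 'bbox' and 'center', as produced above."""
    quarter = (palm_line_x1 - palm_line_x0) / 4.0
    named = []
    for region in regions:
        x, y, w, h = region["bbox"]
        part = min(3, max(0, int((region["center"][0] - palm_line_x0) / quarter)))
        count = max(1, int(round(w / usual_width)))   # how many finger widths wide the box is
        if count == 1:
            named.append((NAMES[part], region))                  # a single finger
        else:
            merged = NAMES[part:min(4, part + count)]            # adjacent fingers held together
            named.append((" + ".join(merged), region))
    return named
```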


4.       Recognition of hand gestures

When the fingers are detected and recognized, the hand gesture can be recognized using a simple rule classifier. In the rule classifier, the hand gesture is predicted according to the number and content of the detected fingers, where the content of the fingers means which fingers are detected.
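A minimal sketch of such a rule classifier is a lookup table keyed by the set of detected finger names; the gesture labels used here are made-up examples, not the gesture set of the paper.

```python
# Map from the set of detected fingers to a gesture label (example rules only).
RULES = {
    frozenset(): "fist",
    frozenset({"thumb"}): "thumbs up",
    frozenset({"forefinger"}): "pointing",
    frozenset({"forefinger", "middle finger"}): "victory",
    frozenset({"thumb", "forefinger", "middle finger",
               "ring finger", "little finger"}): "open palm",
}

def classify(detected_fingers):
    """detected_fingers: iterable of finger names, e.g. {'forefinger'}."""
    return RULES.get(frozenset(detected_fingers), "unknown gesture")
```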


Conclusion and future work

In future work, machine learning methods may be used to address the complex-background problem and to improve the robustness of hand detection.


Applicability

We can apply the hand and finger segmentation methods of this paper to our project.
