Introduction
In the proposed method, the hand region is first extracted from the background using background subtraction. Then the palm and fingers are segmented so that the fingers can be detected and recognized. Finally, a rule classifier is applied to predict the labels of the hand gestures.
The novelty of the proposed method is as follows:
· Hand gesture recognition is based on the result of finger recognition. Therefore, the recognition can be accomplished by a simple and efficient rule classifier.
· Some previous works require users to wear a data glove to acquire hand gesture data. However, the special sensors of a data glove are expensive, which hinders its wide application in real life. In another work, the authors use a TOF camera, that is, a Kinect sensor, to capture the depth of the environment, and a special tape worn across the wrist to detect the hand region. The proposed approach uses only a normal camera to capture the visual information of the hand gesture and does not need the special tape to detect hand regions.
· The third advantage of the proposed method is that it is highly efficient and fit for real-time applications.
Method
As explained in the introduction, the hand is first detected using the background subtraction method, and the result of hand detection is transformed into a binary image. Then the fingers and palm are segmented to facilitate finger recognition. Next, the fingers are detected and recognized. Last, hand gestures are recognized using a simple rule classifier.
1. Hand detection
The images are captured with a normal camera under the same conditions, and their background is identical. So it is easy and effective to detect the hand region in the original image using the background subtraction method. However, in some cases the result of background subtraction includes other moving objects. Skin color can be used to discriminate the hand region from these objects. The color of the skin is measured with the HSV (hue, saturation, and value) model.
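To make the two tests concrete, here is a minimal sketch in plain Python that combines background subtraction with an HSV skin check. The function name `hand_mask` and every threshold value are assumptions chosen for illustration; the post does not give the actual thresholds.

```python
import colorsys

def hand_mask(frame, background, diff_thresh=30,
              hue_max=0.14, sat_min=0.15, val_min=0.35):
    """Return a binary mask (1 = hand, 0 = background).

    frame and background are equally sized 2-D lists of (r, g, b)
    tuples in the range 0-255. All thresholds are illustrative
    assumptions, not values from the original method.
    """
    mask = []
    for row_f, row_b in zip(frame, background):
        out_row = []
        for (r, g, b), (rb, gb, bb) in zip(row_f, row_b):
            # Background subtraction: the pixel must differ from the reference.
            moved = abs(r - rb) + abs(g - gb) + abs(b - bb) > diff_thresh
            # Skin test in HSV: low hue, enough saturation and brightness.
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            skin = h <= hue_max and s >= sat_min and v >= val_min
            out_row.append(1 if moved and skin else 0)
        mask.append(out_row)
    return mask
```

A pixel is kept only when it both differs from the background and looks like skin, so a moving object that is not skin colored is rejected.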
2. Finger and palm segmentation
The output of hand detection is a binary image in which the white pixels belong to the hand region and the black pixels belong to the background. The following five procedures are then applied to the binary hand image to segment the fingers and palm.
1. Palm point
The palm point is defined as the center point of the palm. It is found with the distance transform. The distance transform, also called a distance map, is a representation of an image in which each pixel records the distance between itself and the nearest boundary pixel. The city block distance is used to measure the distances between the pixels and the nearest boundary pixels. In the distance transform of the binary hand image, the pixel with the largest distance is chosen as the palm point.
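The palm point search can be sketched with a two-pass city block distance transform. This is a simplified illustration: it measures the distance to the nearest background pixel instead of the nearest boundary pixel (the maximum falls at the same place), and it assumes the mask has at least one background pixel. The function names are hypothetical.

```python
def city_block_distance_transform(mask):
    """City block distance from each hand pixel (1) to the nearest 0 pixel."""
    rows, cols = len(mask), len(mask[0])
    inf = rows + cols  # larger than any city block distance in the image
    dist = [[0 if mask[y][x] == 0 else inf for x in range(cols)]
            for y in range(rows)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(rows):
        for x in range(cols):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(rows - 1, -1, -1):
        for x in range(cols - 1, -1, -1):
            if y < rows - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < cols - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist

def palm_point(mask):
    """Return ((row, col), distance) of the pixel farthest from the background."""
    dist = city_block_distance_transform(mask)
    best_y, best_x = 0, 0
    for y, row in enumerate(dist):
        for x, d in enumerate(row):
            if d > dist[best_y][best_x]:
                best_y, best_x = y, x
    return (best_y, best_x), dist[best_y][best_x]
```

The returned distance can also serve as a first estimate of the radius of a circle that fits inside the palm.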
2. Inner circle of the maximal radius
Once the palm point is found, a circle can be drawn inside the palm with the palm point as its center. The largest such circle that still fits inside the palm is the inner circle of the maximal radius.
3. Wrist points and palm mask
Some points are sampled uniformly along the circle. For each sampled point on the circle, its nearest boundary point is found and connected to it with a line. A boundary point is identified in a simple way: if the 8 neighbors of a pixel include both white and black pixels, the pixel is labeled as a boundary point.
The two wrist points are the two end points of the wrist line across the bottom of the hand.
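The boundary test and the sampling step can be sketched as follows. The helper names are hypothetical, pixels outside the image are assumed to be background, and the nearest boundary point is found by a brute-force scan that is only suitable for small images.

```python
import math

def is_boundary(mask, y, x):
    """True if the 8 neighbours of (y, x) include both white (1) and black (0)."""
    rows, cols = len(mask), len(mask[0])
    values = set()
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            ny, nx = y + dy, x + dx
            # Assumption: pixels outside the image count as background.
            inside = 0 <= ny < rows and 0 <= nx < cols
            values.add(mask[ny][nx] if inside else 0)
    return values == {0, 1}

def sample_circle(center, radius, count):
    """Return `count` points spaced uniformly on a circle, as (y, x) pairs."""
    cy, cx = center
    return [(cy + radius * math.sin(2 * math.pi * k / count),
             cx + radius * math.cos(2 * math.pi * k / count))
            for k in range(count)]

def nearest_boundary_point(mask, point):
    """Brute-force search for the boundary pixel closest to `point`."""
    py, px = point
    candidates = [(y, x)
                  for y in range(len(mask))
                  for x in range(len(mask[0]))
                  if is_boundary(mask, y, x)]
    return min(candidates, key=lambda p: (p[0] - py) ** 2 + (p[1] - px) ** 2)
```

In practice the list of boundary pixels would be computed once and reused for every sampled point.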
4. Hand rotation
Once the palm point and the wrist points are obtained, an arrow can be drawn from the palm point to the middle point of the wrist line at the bottom of the hand. The arrow is then adjusted to point north, and the hand image is rotated along with it so that the hand gesture is invariant to rotation.
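The rotation angle can be computed from the palm point and the two wrist points. The sketch below assumes image coordinates given as (row, column) with rows growing downward, so north means decreasing row. It rotates point coordinates only; rotating a whole image would also need resampling, which is left out here. Both function names are hypothetical.

```python
import math

def rotation_to_north(palm, wrist_a, wrist_b):
    """Angle in degrees, counter-clockwise on screen, that turns the
    arrow from the palm point to the wrist midpoint so it points north."""
    py, px = palm
    my = (wrist_a[0] + wrist_b[0]) / 2
    mx = (wrist_a[1] + wrist_b[1]) / 2
    dx, dy = mx - px, my - py
    # Rows grow downward, so north is the direction of decreasing y.
    return math.degrees(math.atan2(dx, -dy))

def rotate_point(point, center, angle_deg):
    """Rotate a (y, x) point about `center`, counter-clockwise on screen."""
    t = math.radians(angle_deg)
    y, x = point[0] - center[0], point[1] - center[1]
    new_x = x * math.cos(t) + y * math.sin(t)
    new_y = -x * math.sin(t) + y * math.cos(t)
    return (center[0] + new_y, center[1] + new_x)
```

Applying `rotate_point` with the returned angle to the wrist midpoint places it directly above the palm point, which is a quick way to check the sign convention.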
5. Fingers and palm segmentation
With the help of the palm mask, the fingers and the palm can be segmented easily. The part of the hand that is covered by the palm mask is the palm, while the other parts of the hand are fingers.
3. Finger recognition
In the segmentation image of the fingers, a labeling algorithm is applied to mark the finger regions. In the labeling result, detected regions that contain too few pixels are regarded as noise and discarded. Only regions of sufficient size are regarded as fingers and kept. For each remaining region, that is, each finger, the minimal bounding box that encloses the finger is found. The center of the minimal bounding box is then used to represent the center point of the finger.
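The labeling and filtering steps can be sketched with a flood fill. Several choices here are assumptions that the post does not specify: regions are 8-connected, the size threshold `min_pixels` is arbitrary, and the bounding box is axis-aligned, which is reasonable only after the hand has been rotated upright.

```python
from collections import deque

def finger_centers(finger_mask, min_pixels=3):
    """Label connected white regions, drop small ones, and return the
    centers of the remaining regions' bounding boxes as (y, x) pairs."""
    rows, cols = len(finger_mask), len(finger_mask[0])
    seen = [[False] * cols for _ in range(rows)]
    centers = []
    for y0 in range(rows):
        for x0 in range(cols):
            if finger_mask[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first flood fill collects one connected region.
            queue, region = deque([(y0, x0)]), []
            seen[y0][x0] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and finger_mask[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(region) < min_pixels:
                continue  # too few pixels: treated as noise
            ys = [p[0] for p in region]
            xs = [p[1] for p in region]
            centers.append(((min(ys) + max(ys)) / 2, (min(xs) + max(xs)) / 2))
    return centers
```

Regions are returned in the order in which their first pixel is met when scanning the image row by row.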
1. Thumb detection and recognition
The centers of the fingers are connected to the palm point with lines. Then the angles between these lines and the wrist line are computed. If one of the angles is smaller than 50°, the thumb appears in the hand image, and the corresponding center is the center point of the thumb. The detected thumb is marked with the number 1. If all the angles are larger than 50°, the thumb is not present in the image.
2. Detection and recognition of other fingers
To detect and recognize the other fingers, the palm line is searched for first. The palm line is parallel to the wrist line. The search starts from the row of the wrist line. For each row, a line parallel to the wrist line crosses the hand. If the intersection of the line and the hand contains only one connected set of white pixels, the line shifts upward. Once the intersection contains more than one connected set of white pixels, the line is regarded as a candidate for the palm line. If the thumb is present, the line continues to move upward, with the edge point of the palm, rather than of the thumb, as the starting point of the line. Since the thumb has been excluded, the intersection of the line and the hand again contains only one connected set of white pixels. Once the number of connected sets of white pixels becomes 2 again, the palm line is found.
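The core of the search is counting connected sets of white pixels in a row. The sketch below covers only the candidate step for a hand without a visible thumb, and it assumes the hand has already been rotated so that the wrist line is a horizontal image row and "upward" means a decreasing row index. The function names are hypothetical.

```python
def count_runs(row):
    """Number of connected sets of white pixels (1s) in one image row."""
    runs, previous = 0, 0
    for value in row:
        if value == 1 and previous == 0:
            runs += 1
        previous = value
    return runs

def find_candidate_palm_line(mask, wrist_row):
    """Scan upward from the wrist row and return the first row index
    whose intersection with the hand has more than one run, or None."""
    for y in range(wrist_row, -1, -1):
        if count_runs(mask[y]) > 1:
            return y
    return None
```

Handling the thumb would mean repeating the same scan on the part of each row that starts at the palm edge instead of the thumb edge.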
After the palm line is obtained, it is
divided into 4 parts. If the finger falls into the first part, it is the
forefinger. If the finger belongs to the second part, it is the middle finger.
The third part corresponds to the ring finger. The fourth part is the little
finger.
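Mapping a finger to a part of the palm line is a small calculation. This sketch assumes the four parts are of equal width and that the first part starts at the `palm_left` end of the palm line; which end that is depends on which hand is shown. The names are hypothetical.

```python
FINGER_NAMES = ["forefinger", "middle finger", "ring finger", "little finger"]

def name_finger(center_x, palm_left, palm_right):
    """Map a finger's x position to one of four parts of the palm line."""
    width = palm_right - palm_left
    part = int(4 * (center_x - palm_left) / width)
    part = max(0, min(3, part))  # clamp fingers that lean past either end
    return FINGER_NAMES[part]
```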
In some cases, two or more fingers are held close together with no gap between them. To distinguish this case from that of a single finger, the width of the minimal bounding box is used as a discrimination index. If the width of the minimal bounding box is equal to a usual value, the detected region is a single finger. If the width is several times the usual value, the detected region corresponds to several fingers held close together.
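One simple way to turn the width rule into a count is to round the ratio of the box width to the usual finger width. The rounding is an assumption, as is the hypothetical function name; the post only says that a region several times wider than usual holds several fingers.

```python
def fingers_in_region(box_width, usual_width):
    """Estimate how many fingers a region holds from its bounding-box width."""
    return max(1, round(box_width / usual_width))
```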
4. Recognition of hand gestures
Once the fingers are detected and recognized, the hand gesture can be recognized using a simple rule classifier. The rule classifier predicts the hand gesture according to the number and content of the detected fingers, where the content means which fingers are detected.
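A rule classifier of this kind can be as small as a lookup table keyed by the set of detected fingers. The table below is hypothetical: the gesture labels and the rules are made up for illustration and are not the ones used in the original method.

```python
def classify_gesture(fingers, rules):
    """Look up a gesture label from the collection of detected fingers."""
    return rules.get(frozenset(fingers), "unknown")

# Hypothetical rule table; the labels are illustrative only.
EXAMPLE_RULES = {
    frozenset(): "fist",
    frozenset({"forefinger"}): "one",
    frozenset({"forefinger", "middle finger"}): "two",
    frozenset({"thumb", "forefinger", "middle finger",
               "ring finger", "little finger"}): "five",
}
```

Because the key is a set, the number of fingers is captured by its size and the content by its members, and the order of detection does not matter.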
Conclusion and future work
In future work, machine learning methods may be used to address the complex background problem and improve the robustness of hand detection.
Applicability
We can apply the hand and finger segmentation methods to our project.
