Thursday, March 19, 2015


The International Arab Conference on Information Technology (ACIT’2013)
Real Time Finger Binary Recognition Using
Relative Finger-Tip Position
ASADUZ ZAMAN #1, MD ABDUL AWAL ǂ2, CHIL WOO LEE §3 and MD ZAHIDUL ISLAM #4
#Dept. of Information & Communication Engineering, Islamic University, Kushtia, Bangladesh
1asadiceiu@gmail.com, 4zahidimage@gmail.com
ǂDept. of Computer Science, College of Engineering & Technology, IUBAT, Dhaka, Bangladesh
2 bongoabdo@gmail.com
§School of Computer Engineering, Chonnam National University, Gwangju, South Korea



Abstract: We propose a method to recognize, in real time, finger binary numbers shown by hand gestures, using relative finger-tip positions in a procedural and logical way. Using knowledge of relative finger-tip positions, the process can identify a binary sequence of finger-tip states, opening the way to recognizing hundreds of finger gestures using only two hands. The proposed method uses color-based segmentation to identify skin areas in image frames and connected component detection to detect the hand region.

1.   Introduction
In a Finger Binary [1] recognition system, a computer or machine responds to the binary number represented by the hand. The binary string is simply a combination of 1s and 0s representing the hand's finger tips: 1 for a shown finger tip and 0 for a hidden one. With one hand we can represent 0 to 31 (0 to 2^5-1), and with two hands at a time 0 to 1023 (0 to 2^10-1). We can also represent negative numbers by using one finger as a flag for the sign.
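The counting scheme above can be sketched in code. The finger ordering (thumb as the least significant bit) is an assumption for illustration, not something the paper fixes:

```python
# Illustrative sketch (finger ordering assumed): convert a sequence of
# shown/hidden finger flags, thumb first as the least significant bit,
# into the corresponding finger binary value.

def finger_binary(fingers):
    value = 0
    for bit, shown in enumerate(fingers):
        value |= (1 if shown else 0) << bit
    return value

print(finger_binary([1, 1, 1, 1, 1]))  # one hand, all fingers shown -> 31
print(finger_binary([1] * 10))         # two hands, all fingers shown -> 1023
```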
A fully capable version of this method will be able to recognize 32 numeric counting gestures from one hand, whereas the traditional approach is limited to 5 to 10 numeric counting gestures.

2. Proposed Framework
The method we propose uses color based segmentation approach to detect skin area and connected component detection for hand region detection. Relative finger-tip position is acquired in features extraction stage. The brief block diagram of finger binary recognition process is shown in Figure 1.

A. Image Acquisition
Each frame of the input video stream, whether real-time video or locally stored video, is processed for finger binary recognition. Each frame is resized to 320x240 if it is larger.


B. Segmentation
The input frame is segmented for further processing in this stage. Several features can be used for image segmentation, such as skin color, shape, motion and an anatomical model of the hand. We used a skin-color based segmentation approach. For skin-color segmentation, illumination-invariant color spaces are preferred; normalized RGB, HSV, YCrCb and YUV are some illumination-invariant color spaces. We used the YCrCb color space for the segmentation process.
After skin-color segmentation, the frame is passed through Gaussian blur, erosion and dilation to remove noise from the image.
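As a rough sketch of the thresholding step, skin pixels can be selected by bounding the Cr and Cb channels. The threshold values below are common defaults, not the paper's exact values; in practice OpenCV's cvtColor and inRange would be used, followed by GaussianBlur, erode and dilate:

```python
import numpy as np

# Minimal sketch of YCrCb skin segmentation. The Cr/Cb bounds are assumed
# defaults, not the values used in the paper.

def skin_mask(ycrcb, cr_range=(133, 173), cb_range=(77, 127)):
    """ycrcb: HxWx3 uint8 array in Y, Cr, Cb channel order.
    Returns a uint8 binary mask (255 = skin, 0 = background)."""
    cr = ycrcb[..., 1]
    cb = ycrcb[..., 2]
    mask = ((cr >= cr_range[0]) & (cr <= cr_range[1]) &
            (cb >= cb_range[0]) & (cb <= cb_range[1]))
    return mask.astype(np.uint8) * 255
```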

C. Hand Region Detection
The binary image frame produced by the segmentation stage is the input of this stage. We assume that the biggest part of the input frame is the hand region. The biggest part of the skin image is found by detecting contours in the binary skin image and taking the biggest contour as the hand region.

To be on the safe side, the input color image is passed through a Haar-classifier based face detector with a scale factor of 1.5 and a minimum size of one-fifth of the original image for faster processing. If a face is found, hand region detection is performed excluding the face area. Figure 3 shows the result of this process with the hand region and face area marked with rectangles.
Figure 3: (a) input skin color segmented image, (b) contour detection with hand region and face area marked. Notice that if face areas are excluded, the biggest contour is hand region


D. Features Extraction


For features extraction, we first find the contour of the hand region found in the hand region detection stage. Then we compute the convex hull of that contour. The convex hull provides a set of convexity defects. Every convexity defect contains four pieces of information:
A. Start point
B. End point
C. Depth point
D. Depth
Convexity defects of a hand figure are shown in Figure 4.



Figure 4. Convexity defects: the dark contour line is a convex hull around the hand; the gridded regions (A–H) are convexity defects in the hand contour relative to the convex hull. Image courtesy: Learning OpenCV by Gary Bradski and Adrian Kaehler

Each start point and end point is a possible finger-tip position. To prevent the system from detecting false finger-tips, the system uses equation (1).

f(x) = 1 if depth(x) ≥ α · max, 0 otherwise        (1)

Where x is a convexity defect, depth(x) is its depth, max is the maximum depth over all convexity defects, and α is a threshold value. The output of 1 or 0 simply indicates whether the convexity defect is a potential finger-tip bearer or not.
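This defect filter can be sketched as a one-line predicate; the value of α below is an assumption for illustration, as the paper does not state it:

```python
# Sketch of the defect filter: keep a convexity defect as a potential
# finger-tip bearer only if its depth is a large enough fraction (alpha,
# an assumed threshold) of the maximum defect depth.

def is_potential(defect_depth, max_depth, alpha=0.3):
    return 1 if max_depth > 0 and defect_depth >= alpha * max_depth else 0
```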

After that, the start and end points of all potential convexity defects are taken as possible finger-tip positions. Actual finger-tip positions are detected using equation (2).

f(x) = 1 if FT = NULL, or ED(x, y) ≥ α for every y in FT; 0 otherwise        (2)

Where x is a potential finger-tip position, y ranges over the members of FT, FT is the finger-tip array, which is initially set to NULL, ED(x, y) is the Euclidean distance between x and y, and α is a threshold value giving the minimum distance between two distinct finger-tips. The function returns 1 if FT is NULL, and x is stored in FT. If ED(x, y) is less than α for any member y, the point is discarded; otherwise x is stored in FT.
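The deduplication rule can be sketched as a short loop; the value of α is an assumed threshold, not the paper's exact value:

```python
import math

# Sketch of the finger-tip deduplication: accept a candidate only if it
# lies at least alpha pixels away from every finger-tip accepted so far.
# alpha is an assumed threshold.

def dedup_fingertips(candidates, alpha=20.0):
    tips = []  # FT, initially empty
    for x in candidates:
        if all(math.dist(x, y) >= alpha for y in tips):
            tips.append(x)  # far from every stored tip: keep it
        # otherwise discard: too close to an existing finger-tip
    return tips
```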

This stage extracts all finger-tip positions along with a center point of the hand region, which is the center of the bounding rectangle of the detected hand region.

Finally, this stage stores some information for later use. Initially the system asks the user to show the binary number 11111, i.e. the all-finger-tips-shown position. When the user shows binary number 11111, the system learns features that make further communication smooth. In this stage the system stores a structure with the information below:

1. Angular relation from each finger to all other fingers, e.g. t2ir, t2mr, i2mr, meaning the thumb-to-index, thumb-to-middle and index-to-middle angular relations.
2. c2tA [Palm center to thumb angle with reference to x-axis]
3. c2tD [Palm center to thumb distance]
4. hand [bounding rectangle of hand region]
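The stored structure might look like the following sketch; the field names follow the paper's notation, but all values are made-up examples and the angle unit (degrees) is assumed:

```python
# Sketch of the calibration record stored when the user shows 11111.
# All numeric values are illustrative examples, not measured data.

calibration = {
    "relations": {       # angular relation between finger pairs
        "t2ir": 25.0,    # thumb to index (example value)
        "t2mr": 40.0,    # thumb to middle (example value)
        "i2mr": 15.0,    # index to middle (example value)
        # ... remaining pairs
    },
    "c2tA": 150.0,       # palm center to thumb angle vs. x-axis (degrees)
    "c2tD": 80.0,        # palm center to thumb distance (pixels)
    "hand": (50, 60, 120, 140),  # bounding rectangle (x, y, w, h)
}
```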

E. Decision Making

This stage provides the recognized finger binary number as the system output. Recognition of binary 00000 and binary 00001 is processed separately, as they provide quite distinguishable features. All other recognition is done by predicting whether each specific finger-tip is shown or not.

i) 00000 Recognizing
Recognizing 00000 is quite an easy task: when 00000 is shown by a user, the hand region has its smallest area. If the current hand region's height and width are less than a threshold level, we detect the case as 00000. The system uses equation (3) for this case.

f(x) = 1 if width(x) < α · hand.width and height(x) < β · hand.height, 0 otherwise        (3)

Where x is the current frame's hand region and hand is the bounding rectangle stored in the features extraction stage. This case is shown in Figure 5.
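This check reduces to two comparisons; the defaults below follow the α and β of 0.8 given in Figure 5:

```python
# Sketch of the fist (00000) test: the current hand bounding box must be
# much smaller than the calibrated open-hand box in both dimensions.
# alpha and beta default to 0.8 as in Figure 5.

def is_fist(cur_w, cur_h, hand_w, hand_h, alpha=0.8, beta=0.8):
    return 1 if cur_w < alpha * hand_w and cur_h < beta * hand_h else 0
```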



Figure 5. 00000 Recognizing. Here α and β are both taken as 0.8

ii). 00001 Recognizing
Recognizing 00001 is almost the same as recognizing 00000. The only difference between 00000 and 00001 is whether the thumb is shown or not: the thumb extends the width of the 00000 hand region above a threshold fraction of the actual hand width. The system uses equation (4) for this case.

f(x) = 1 if width(x) ≥ α · hand.width and height(x) < β · hand.height, 0 otherwise        (4)

Where x is the current frame's hand region. This case is shown in Figure 6.


Figure 6. 00001 Recognizing. Here α and β are both taken as 0.8

iii). Predicting Thumb Position
Using the information stored in the features extraction stage, the thumb position is predicted in each frame. Predicting the thumb position is very important because the system uses it as the reference position for finding relative finger-tip positions. The system uses equations (5) and (6) to predict the thumb position.

T.x = center.x + c2tD · cos(c2tA)        (5)
T.y = center.y − c2tD · sin(c2tA)        (6)

Where T is the thumb point, c2tD and c2tA are the center-to-thumb distance and the center-to-thumb angle with reference to the x-axis from the saved features, and center is the current frame's center position; the sign of the sine term reflects the downward image y-axis. The red dots in Figures 7(a), 7(b), 7(d), 7(e), 7(g), 7(h), 7(i) and 7(j) are predicted thumb positions.
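The prediction is a simple polar-to-Cartesian conversion; this sketch assumes the angle is stored in degrees and that image y grows downward:

```python
import math

# Sketch of the thumb prediction: place the thumb at the calibrated
# distance (c2tD) and angle (c2tA, degrees vs. the x-axis) from the
# current palm center. Image y grows downward, hence the minus sign.

def predict_thumb(center, c2tD, c2tA):
    a = math.radians(c2tA)
    tx = center[0] + c2tD * math.cos(a)
    ty = center[1] - c2tD * math.sin(a)
    return (tx, ty)
```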



iv). Predicting Finger-tips Shown or not
To predict whether a finger-tip is shown or not, we first measure the angle between the predicted thumb position, the current center position, and each of the finger-tips found in the features extraction stage. Then we label each finger-tip as shown or not. Let the measured angle be a. The system then decides which finger is shown using equations (7) to (11). Note that angles are measured in degrees.




From equations (7) to (11), we can see that some angles are omitted. The omitted angle ranges are 80-95 and 100-105. These angle ranges are possible positions for more than one finger: the range 80-95 is a possible position for the index and middle fingers, and the range 100-105 for the middle and ring fingers. To determine which finger is actually shown, we use our stored information and update it in every frame using equation (12).



Where R is the previously stored angular relation between fingers. We then compute the summed distance of the relations using equation (13).

w = Σ_i |sr_i − cr_i|        (13)

Where sr_i is the i'th stored relationship and cr_i is the corresponding current relationship found by equation (12). The finger-tip with the minimum value of w is taken as the predicted finger-tip.
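The minimum-w matching can be sketched as follows; the relation dictionaries and labels are illustrative placeholders, not the paper's data:

```python
# Sketch of the ambiguity resolution: score each candidate finger label by
# the summed absolute difference between its stored angular relations and
# the current ones, and pick the label with the minimum score w.

def relation_distance(stored, current):
    return sum(abs(stored[k] - current[k]) for k in stored)

def best_match(candidates, current):
    """candidates: {label: stored-relation dict}. Returns the label
    whose stored relations are closest to the current ones."""
    return min(candidates,
               key=lambda lbl: relation_distance(candidates[lbl], current))
```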


3. Experimental Result

Some frames are affected by the dynamics of the real-time approach; for this reason, some gestures could not be recognized. Some gestures are also very hard to perform because of the articulation of the human hand. At this point of development, the system is not scale invariant.

The most significant issue is skin-color segmentation. The process would do better if a better skin-color segmentation approach were applied. But as our focus is not skin-color segmentation, we have just used the traditional approach with the best possible localization.

4. Conclusion

Although the recognition accuracy in our experimental results is nearly 80%, it is noteworthy that the total number of individual gestures is greatly increased by this process. It is also notable that the process uses very simple mathematical calculations to recognize gestures, which is computationally very inexpensive.


The system's accuracy could be increased by using a more sophisticated skin-color segmentation approach. Lighting conditions also affect the system's performance at present. If we use both hands for gesture recognition, it is possible to recognize all 1024 finger binary gestures.




Relation to Our Project Speak4Me

Using this method we can easily identify all the numbers of ASL, and we can also identify some letters of the alphabet. We can feed the resulting binary input to an AI component to identify the correct letter accordingly.








