Saturday, June 6, 2015

Title

Real-Time Palm Tracking and Hand Gesture Estimation Based on Fore-Arm Contour

Introduction

This thesis proposes an image processing system that uses a single web camera. Unlike other hand recognition methods, it marks the important features of the hand (the fingertips and the palm center) by computational geometry calculations and provides real-time interaction between gestures and the system. Thanks to the computational geometry approach, the system can accurately locate the palm center even when the fore-arm is in the image. The system also tolerates a certain amount of rotation of the palm and fore-arm, which gives the user more freedom of movement during palm center estimation.

Overview


1) Pre-Processing and Convex Hull Method

This section describes how to extract the region of interest from a color image. The authors use a single web camera to capture a series of images. After transforming each image into the HSV color space, the region of interest can be extracted by defining ranges for hue, saturation, and value. This produces a binary image, on which morphological processing is performed: erosion eliminates noise, while dilation fills in defects and smooths the contour of the region of interest.

The image captured from the web camera is composed of RGB values, which are easily influenced by lighting. We must therefore convert from the RGB color space to another color space that is less sensitive to light variation. Besides RGB, other commonly used color spaces include normalized RGB, HSV, YCbCr, and so forth. To achieve real-time processing and adapt to most environments, we choose a color model with simple conversion formulas and low sensitivity to lighting. In this thesis, we choose the HSV color space to extract human skin areas. The skin color areas are represented as a binary image, on which we perform morphological processing consisting of erosion and dilation: erosion eliminates noise, and dilation smooths the contour of the skin color area.

1.1) Skin color detection using HSV color space

The HSV model comprises three components of a color: hue, saturation, and value. It separates the value from the hue and saturation. Hue represents the gradation of a color within the visible spectrum of light. Saturation is the intensity of a specific hue, based on the purity of the color. The practicality of the HSV model can be attributed to two main factors:
1) the value is separated from the color information of the image, and
2) both the hue and the saturation relate to human vision.

All colors composed of the three primary colors (red, green, and blue) lie inside the chromatic circle. The hue H is the angle of a vector with respect to the red axis. When H = 0, the color is red. The saturation S is the degree to which a color has not been diluted with white. It is proportional to the distance from the color's location to the center of the circle: the farther a color is from the center of the chromatic circle, the more saturated it is. The value V is measured along the line that is orthogonal to the chromatic circle and passes through its center. Colors tend toward white along this center line toward the apex.

The relationship between the HSV and RGB models is expressed below:
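
The formula itself did not survive in these notes. As a reference point only, and assuming the thesis uses the standard hexcone RGB-to-HSV conversion (an assumption, not confirmed by the source), the relationship for R, G, B in [0, 1] can be written as:

```latex
M = \max(R, G, B), \qquad m = \min(R, G, B), \qquad \Delta = M - m

V = M, \qquad
S = \begin{cases} 0 & \text{if } M = 0 \\ \Delta / M & \text{otherwise} \end{cases}

H = \begin{cases}
  0^\circ & \text{if } \Delta = 0 \\
  60^\circ \cdot \left( \frac{G - B}{\Delta} \bmod 6 \right) & \text{if } M = R \\
  60^\circ \cdot \left( \frac{B - R}{\Delta} + 2 \right) & \text{if } M = G \\
  60^\circ \cdot \left( \frac{R - G}{\Delta} + 4 \right) & \text{if } M = B
\end{cases}
```

With this convention H is an angle in degrees in [0, 360) and S, V lie in [0, 1], which matches the units of the thresholds quoted in these notes.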



We distinguish skin color from non-skin color regions by setting upper and lower bound thresholds. In our experimental environment, we choose H from 2 to 39 and from 300 to 359 (in degrees), and S between 0.1 and 0.9, as the range of skin colors.
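
The thresholding step can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: it assumes 8-bit RGB input, uses Python's standard `colorsys` module for the conversion, and the function names `is_skin` and `skin_mask` are hypothetical.

```python
import colorsys


def is_skin(r, g, b):
    """Classify one 8-bit RGB pixel using the hue/saturation ranges quoted above."""
    h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0  # colorsys returns hue in [0, 1); convert to degrees
    hue_ok = 2 <= hue_deg <= 39 or 300 <= hue_deg <= 359
    sat_ok = 0.1 <= s <= 0.9
    return hue_ok and sat_ok


def skin_mask(image):
    """Turn rows of (r, g, b) tuples into a binary image (rows of 0/1)."""
    return [[1 if is_skin(*pixel) else 0 for pixel in row] for row in image]
```

A real-time system would normally vectorize this over whole frames; the per-pixel loop here only shows the logic of the range test.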


1.2) Morphology Processing

After generating the binary image, we perform morphological operations on it to smooth the hand segments and eliminate some of the noise. The two basic morphological operations are dilation and erosion, described below.

For two sets A and B in $Z^2$, the dilation of A by B, denoted $A \oplus B$, is defined as

$$A \oplus B = \{\, z \mid (\hat{B})_z \cap A \neq \emptyset \,\}$$

This equation is based on obtaining the reflection of B about its origin, $\hat{B}$, and shifting this reflection by z. The dilation of A by B is then the set of all displacements z such that $\hat{B}$ and A overlap in at least one element.


Set B is commonly referred to as the structuring element in dilation, as well as in other morphological operations. All points inside this boundary constitute the dilation of A by B.

Given two sets A and B in $Z^2$, the erosion of A by B, denoted $A \ominus B$, is defined as

$$A \ominus B = \{\, z \mid (B)_z \subseteq A \,\}$$

In words, the erosion of A by B is the set of all points z such that B, translated by z, is contained in A. As in the case of dilation, this equation is not the only definition of erosion, but it is usually favored in practical implementations of mathematical morphology for the same reasons given for dilation. In the thesis's figure, the original set is shown for reference, and the solid line marks the limit beyond which any further displacement of the origin of B would cause B to no longer be completely contained in A. Consequently, all points inside this boundary constitute the erosion of A by B.


Erosion and dilation are both conventional, popular methods for cleaning up anomalies in objects, and combining the two operations also performs well. Therefore, a moving object region is morphologically eroded (once) and then dilated (twice).
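
The erode-once, dilate-twice clean-up can be sketched as follows. This is a minimal pure-Python illustration, assuming a 3x3 square structuring element and treating pixels outside the image as background; the structuring element used in the thesis is not stated in these notes, and the function names are hypothetical.

```python
def _apply(mask, rule):
    """Slide a 3x3 window over a binary image and apply `rule` to each window."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                mask[yy][xx] if 0 <= yy < height and 0 <= xx < width else 0
                for yy in (y - 1, y, y + 1)
                for xx in (x - 1, x, x + 1)
            ]
            out[y][x] = 1 if rule(window) else 0
    return out


def erode(mask):
    # A pixel survives only if the whole 3x3 element fits inside the region.
    return _apply(mask, all)


def dilate(mask):
    # A pixel is set if the 3x3 element overlaps the region in at least one pixel.
    return _apply(mask, any)


def clean(mask):
    """Erode once, then dilate twice, as described above."""
    return dilate(dilate(erode(mask)))
```

Because the region is dilated one more time than it is eroded, the cleaned region ends up slightly larger than the original, while isolated noise pixels removed by the erosion do not come back.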

1.3) Contour Finding

After the previous processing, we have a binary image in which the white pixels represent skin color regions. The next step is contour finding. A contour is a sequence of points that are the boundary pixels of a region. We find the contours of all regions so that we can disregard the small ones and focus on the fore-arm area we are interested in. This is done by comparing contour lengths: the longest contour is the one we are looking for.

The thesis uses Theo Pavlidis' algorithm to find contours.
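
The selection step (not the contour tracing itself) can be sketched as below. It assumes each contour is a list of (x, y) boundary points and that "length" means the number of boundary points, which is one plausible reading of the notes; `longest_contour` is a hypothetical helper name.

```python
def longest_contour(contours):
    """Return the contour with the most boundary points, or None if there are none."""
    if not contours:
        return None
    return max(contours, key=len)
```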



1.4) Convex Hull

Finding a convex hull is a well-known problem in computational geometry. Imagine several nails in a wall with a rubber band stretched around them. Only the outermost nails touch the rubber band, while the nails inside do not affect it. The shape of the rubber band is roughly what the convex hull of the nails looks like. A human can see the shape of a convex hull at a glance, but a machine needs an algorithm.

We calculate the convex hull of the fore-arm contour in order to find the desired information. The arm usually has a smooth contour and does not contain any important information. The hand has more convex and concave contour segments and usually contains the information we want. Comparing a fore-arm contour with its convex hull, we find that the convexity defects lie around the palm area. Hence, in each convexity defect we can find the point with the longest distance to the convex hull. Those points are on the edge of the palm, since the convexity defects are around the palm area. Because the fore-arm is relatively smooth, a neighboring defect usually does not produce such a farthest point on the arm contour. Using these points, we can determine the position of the palm even when the fore-arm is included in the contour.

The thesis uses the Three Coin Algorithm to find the convex hull.
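
For illustration, here is a minimal convex hull sketch. Note that it does not implement the Three Coin Algorithm named above: it swaps in Andrew's monotone chain algorithm, which is short and works on an unordered point set. The function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (for a y-up coordinate system)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]
```

The interior point in `convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])` is discarded, like a nail that the rubber band never touches.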

1.5) Convexity Defects

We calculate the convex hull of the fore-arm contour in order to get the convexity defects of the contour. They provide useful information for understanding the shape of a contour.


By connecting the start point, the end point, and the contour points between them, we obtain a convexity defect area.


The depth of the defect is the longest distance from any point in the defect to the convex hull edge of the defect. The point in the defect with the longest distance to that convex hull edge is the depth point.
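
The depth computation can be sketched as follows. This is a minimal illustration, assuming a defect is given by its start point, its end point (the two ends of the hull edge), and the contour points between them; the function names are hypothetical.

```python
import math


def point_to_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def defect_depth(start, end, between):
    """Return (depth_point, depth) for one convexity defect."""
    depth_point = max(between, key=lambda p: point_to_segment_distance(p, start, end))
    return depth_point, point_to_segment_distance(depth_point, start, end)
```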




2) Palm Tracking and Fingertip Detection

We can extract the point with the greatest depth in each convexity defect. Convexity defects with a depth larger than a certain threshold tend to appear around the palm.

Gathering the depth points gives us useful information for deciding where the palm is. The thesis describes 2 ways to determine the palm position.
1). Minimum Enclosing Circle
It calculates the smallest circle that covers all the depth points.

2). Calculate average position within a minimum enclosing circle.

2.1) Minimum Enclosing Circle

2.1.1) Find circle through 3 points

Any 3 different points in a plane that do not lie on one line determine a unique circle. The equation of a circle in a plane has three unknowns.
Given 3 points in a plane, we can substitute their x and y coordinates into the equation to obtain 3 equations in 3 unknowns, which has a unique solution. Instead of solving the simultaneous equations, we can first calculate the center of the circle from its geometric properties. Once the center is known, the radius is easy to obtain.


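This circle is formed by the 3 points P1 = $(x_1, y_1)$, P2 = $(x_2, y_2)$ and P3 = $(x_3, y_3)$. Two lines can be formed through two pairs of the three points: line A passes through P1 and P2, and line B passes through P2 and P3. The equations of these two lines are

$$y_a = m_a (x - x_1) + y_1, \qquad y_b = m_b (x - x_2) + y_2,$$

where $m_a = (y_2 - y_1)/(x_2 - x_1)$ and $m_b = (y_3 - y_2)/(x_3 - x_2)$ are the slopes of line A and line B.

The center of the circle is the intersection of the two lines that are perpendicular to line A and line B and pass through the midpoints of P1P2 and P2P3. A line perpendicular to a line with slope m has slope -1/m, so the equations of the perpendiculars to line A and line B through the midpoints of P1P2 and P2P3 are

$$y'_a = -\frac{1}{m_a}\left(x - \frac{x_1 + x_2}{2}\right) + \frac{y_1 + y_2}{2}, \qquad y'_b = -\frac{1}{m_b}\left(x - \frac{x_2 + x_3}{2}\right) + \frac{y_2 + y_3}{2}$$

These two lines intersect at the center point, so the coordinates of the center are the common solution of the equations above. Setting $y'_a = y'_b$ and solving gives the x-coordinate of the center point:

$$x = \frac{m_a m_b (y_1 - y_3) + m_b (x_1 + x_2) - m_a (x_2 + x_3)}{2 (m_b - m_a)}$$

$(m_b - m_a)$ is 0 only when line A and line B are parallel, and since both pass through P2 they must then be coincident, which means the three points lie on one line. The thesis applies the calculation only when the 3 input points are different in order to avoid that situation; note that three different points on one line would still give $m_b = m_a$. Substituting the x value into the following equation gives the y-coordinate of the center point:

$$y = -\frac{1}{m_a}\left(x - \frac{x_1 + x_2}{2}\right) + \frac{y_1 + y_2}{2}$$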
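
The construction can be sketched in code. Two assumptions are made here and should be read as such. First, the center is computed with the determinant form of the perpendicular-bisector intersection, which gives the same point as the slope-based derivation but avoids dividing by zero when a chord is vertical or horizontal. Second, the notes do not say which minimum enclosing circle algorithm the thesis uses, so the brute-force search below (practical only for a handful of depth points) is a stand-in. All function names are hypothetical.

```python
import math
from itertools import combinations


def circle_from_3_points(p1, p2, p3):
    """Return ((cx, cy), radius) of the circle through 3 points, or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if d == 0:
        return None  # the three points lie on one line
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return (cx, cy), math.hypot(x1 - cx, y1 - cy)


def _contains_all(circle, points, eps=1e-9):
    (cx, cy), radius = circle
    return all(math.hypot(x - cx, y - cy) <= radius + eps for x, y in points)


def min_enclosing_circle(points):
    """Brute force: the answer is defined by either 2 or 3 of the input points."""
    pts = list(set(points))
    if not pts:
        return None
    if len(pts) == 1:
        return (float(pts[0][0]), float(pts[0][1])), 0.0
    candidates = []
    for a, b in combinations(pts, 2):
        # Circle with the segment a-b as its diameter.
        center = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)
        candidates.append((center, math.hypot(a[0] - b[0], a[1] - b[1]) / 2.0))
    for a, b, c in combinations(pts, 3):
        circle = circle_from_3_points(a, b, c)
        if circle is not None:
            candidates.append(circle)
    valid = [c for c in candidates if _contains_all(c, pts)]
    return min(valid, key=lambda c: c[1])
```

The brute-force search tries on the order of n^3 candidate circles and checks each against all n points, which is acceptable for the few depth points of one hand but not for large point sets.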



2.2) Fingertip Detection

Algorithm




2.2.1) Buffer for palm position


Since the contour we extract can change slightly from frame to frame, the palm position may move a little in every frame, so the palm position can appear to shake. To reduce this, we apply a buffer to the palm position.

The buffer records the position and radius of the palm circles in the past 3 frames and the current frame. The palm circle drawn in the current frame is the average of the buffer.
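
A minimal sketch of such a buffer, assuming a plain unweighted average over the last 4 frames (the past 3 plus the current one); the class name `PalmSmoother` is hypothetical.

```python
from collections import deque


class PalmSmoother:
    """Average the palm circle over the most recent frames to reduce jitter."""

    def __init__(self, size=4):
        # deque with maxlen drops the oldest frame automatically.
        self._frames = deque(maxlen=size)

    def update(self, cx, cy, radius):
        """Record the current palm circle and return the smoothed (cx, cy, radius)."""
        self._frames.append((cx, cy, radius))
        n = len(self._frames)
        return tuple(sum(frame[i] for frame in self._frames) / n for i in range(3))
```

During the first 3 frames the buffer is not yet full, so this sketch averages over however many frames it has seen so far.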

2.2.2) Mix of palm position and average depth point position

When the fore-arm takes up a large part of the image, the depth point at the wrist may slip away from the palm. This enlarges the palm circle and makes the palm estimate wrong.


We calculate the average position of the current depth points, so that a slipped depth point affects only 1/(number of depth points) of the result.

We then calculate the average of this average depth point position and the center of the palm circle. This reduces the effect of the slipped depth point, and the estimate of the palm position remains accurate.
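
The mixing step can be sketched as below, assuming equal weighting between the mean depth point and the center of the minimum enclosing circle (the notes only say "average", so the weighting is an assumption); `mixed_palm_center` is a hypothetical name.

```python
def mixed_palm_center(depth_points, circle_center):
    """Average the mean depth point with the minimum enclosing circle center."""
    n = len(depth_points)
    mean_x = sum(x for x, _ in depth_points) / n
    mean_y = sum(y for _, y in depth_points) / n
    # Equal weights for the two estimates (assumed).
    return ((mean_x + circle_center[0]) / 2.0, (mean_y + circle_center[1]) / 2.0)
```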




2.2.3) Add an extra point when there are 2 or fewer depth points

Another condition we need to handle often occurs when the hand is closed: the depths of all convexity defects may be smaller than the threshold, except for the 2 convexity defects beside the wrist. The palm center will then be incorrect.


To avoid this problem, when the total number of depth points is 2 or fewer, the system adds an extra point for the minimum enclosing circle. This point is obtained by finding the point at the top of the contour.
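
A minimal sketch of this fallback, assuming image coordinates in which y grows downward, so the top of the contour is the point with the smallest y; the function name is hypothetical.

```python
def points_for_palm_circle(depth_points, contour):
    """Return the points to feed to the minimum enclosing circle."""
    points = list(depth_points)
    if len(points) <= 2 and contour:
        # Topmost contour point: smallest y in image coordinates (assumed y-down).
        points.append(min(contour, key=lambda p: p[1]))
    return points
```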




3) Results


Applications

We can use their contour finding and convex hull finding algorithms in our system. We can also use their palm position and fingertip detection algorithms.




























