Improving human-computer interaction has driven research in the field of gesture recognition. Researchers have done extensive work on gesture recognition for different languages, and many gesture recognizers for different languages are already on the market. Unfortunately, International Sign Language has received less attention. In Pakistan, India and the Middle East, there are more than a hundred million people who are deaf and mute and have difficulty communicating with hearing people. As the literacy rate in these areas is very low, there is a barrier of technological learning and awareness; therefore, we propose a gesture recognizer.
In gesture recognition, a gesture is converted to text, which makes communication easier. This project studies the processing techniques of International Sign Language and the conversion of gestures, together with the methodologies needed to build a gesture recognition system for communication. This system would help users communicate with anyone who is deaf or mute without depending on an interpreter to translate their signs. Our overall goal is to implement existing gesture recognition techniques and develop a gesture recognizer for International Sign Language (alphabets) using existing technologies.

Mute Life Envoy (Hand Gesture Recognition System)

1.1. Overview:
For decades,
humans have wanted to interact with computers using their natural languages. Interaction between machine and human languages requires computational support that makes human language understandable to the computer. Mute Life Envoy is a basic step in automating computer-human interaction, with particular attention to the Deaf and Mute (DNM) community. Hand gesture recognition is a broad topic with applications in telecommunication, multimedia and other fields. Gesture recognition is now considered a cutting-edge technology, and there is increasing market demand for applications that use gesture commands. Future concepts for computers, cars, telephones, and mobile phones incorporate gesture technology, reducing reliance on keyboards and touch panels/screens and reducing diversion of attention.

1.2. Hand Gesture Recognition:
Gesture
recognition refers to the process of conversion of human
hand motions to a sequence of words
(text). These words are recognized and then can be used as a command to control
systems or for typing documents. Gesture recognition also refers to the interaction between humans and computers in which a gesture given as input causes the system to perform a specific task after recognizing it.

1.3. Need for Hand Gesture Recognition:
Increased interest of researchers in
improving human-computer interaction has driven research in the field of gesture recognition. Interfaces through which humans gesture to interact with machines are becoming popular. Only a limited number of people know how to access and operate computers, so interfaces that let more people communicate and interact with computers in their own languages are needed. Without progress in this field, a large number of people would remain unable to communicate with others. Gesture recognition systems benefit people with disabilities.

1.4. Applications of Hand Gesture Recognition:
Hand gesture recognition refers to the process in
which hand gestures are converted to text. Hand gesture recognition applications are widely used in areas such as healthcare, the military, training traffic controllers, and telephony. Some hand gesture
recognition applications that use gesture user interfaces are mentioned below:
· Gesture-To-Speech
· Speech-To-Gesture
· Text-To-Gesture
· Gesture-To-Text

2.
MLE Gesture Recognizer:
For decades, people have wanted to set up human-like interactions with computers using their natural languages. Development of the Mute Life Envoy (MLE) system is a basic step towards that goal, especially bearing in mind those with special
needs.

2.1. Working of the Hand Gesture Recognizer System:
Hand gesture recognition is also called gesture-to-text processing. In general, gesture-to-text processing involves the following steps:
1. Data Processing
2. Gesture Recognition
3. Interpreted Gesture Transmission

2.1.1. Data Preparation:
Data has to be prepared before the hand gesture recognition process can start. The hand's interior structural details are captured through the hardware, which records hand motion in some binary form on the computer. This is then converted into a form the recognizer can understand.

2.1.2. Data Processing:
When a DNM person makes static hand gestures within range of the infrared sensors, the gesture recognition process starts. The input gestures are converted by the infrared sensors into digital signals that can be stored (after verification) on the computer. The recognizer then converts these signals into meaningful strings of English text and stores the sequence in the database before transmission. The gesture recognizer uses stored reference information for the conversion of the input signals.

2.1.3. Gesture Recognition:
The recognition process matches input sequences against segments stored on the computer. It searches all candidates produced by image acquisition and gesture segmentation, and finds the match for the input by applying the most efficient algorithm.

2.1.4. Transmission:
In MLE, the interpreted gestures, which have been translated into English text, are stored in the database before transmission. They are then transmitted through Internet services and protocols from one device to another. To accomplish this, there must be rules for each transmission protocol.
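The transmission step could be sketched as below. The length-prefixed framing, the function names, and the use of a local socket pair to stand in for two MLE devices are illustrative assumptions, not the project's actual protocol.

```python
import socket
import struct

def send_text(sock: socket.socket, text: str) -> None:
    """Frame the interpreted gesture text with a 4-byte length prefix and send it."""
    data = text.encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)

def recv_text(sock: socket.socket) -> str:
    """Read one length-prefixed message and return the decoded text."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length).decode("utf-8")

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before full message arrived")
        buf += chunk
    return buf

# A local pair of connected sockets standing in for two MLE devices.
a, b = socket.socketpair()
send_text(a, "HELLO")        # sender side: interpreted gesture text
print(recv_text(b))          # receiver side: prints HELLO
a.close(); b.close()
```

The explicit length prefix matters because TCP is a byte stream: without framing, two short messages could arrive merged into one `recv` call.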
2.2. Challenges to the Mute Life Envoy System:
Developing a static hand gesture recognition system is a challenging task that raises the following questions:
· Could the vocabulary be a problem?
· Does our system have continuous-gesture capabilities?
· Do we have limited resources, and could that limitation be a problem?
· Does any algorithm have 100% accuracy?
Researchers in the gesture recognition field are trying to resolve these
challenges. The purpose of the MLE system is to develop computational methods for converting an input gesture into a set of words. At the moment no generic recognition system exists that recognizes an alphabet's sign language, or sign languages from different regions, made by any user in any environment. Therefore, the following variables are restricted for a hand gesture recognition system to make the problem tractable.

2.2.1. Language:
Gesture recognition systems are language specific and are trained for a specific positioning of gestures, as there is wide variation in hands and fingers.

2.2.3. Size of Vocabulary:
Vocabulary
size varies between systems: some use a small vocabulary, some a large one. The vocabulary size has to be determined for the MLE system to
ensure accuracy.

2.2.4. Class of Signers:
On the basis of the following characteristics, signers may be classified into different classes.
§ Dialects: Dialects in any language lead to different pronunciations, so the particular dialect should be defined in a gesture recognition system. Different people have different dialects, which may vary from one another, so the MLE system should be trained as a pidgin for people with different dialects.
§ Type: On the basis of hand usage, signers may be classified into two types.
§ Single-Handed: Hand gestures made by the signer using only one hand.
§ Double-Handed: Hand gestures made by the signer using both hands simultaneously.
3. Scope:
There is no application related to hand gesture recognition for IS language. Seeing the increasing market demand for hand gesture recognition, we propose a hand gesture recognizer to dictate and transmit gestures from IS language into English text. Our project is primarily for DNM people who want to interact with each other and with the general public independently, without a physical interpreter. Our proposed system benefits people with disabilities: those who are deaf or hard of hearing, and people with learning or hand disabilities, can use the MLE system's software in many ways to ease their daily life. This system may also assist members of the general public who want to interact with DNM people but cannot because of pidgin issues. A gesture recognizer for IS language is proposed to overcome these problems and to lower the error rate as well.

3.1.
Statement:
Nowadays, gesture recognition applications are becoming useful, and many gesture-interactive applications are available in the market, but they all focus on American Sign Language (ASL) or other pidgins. This system studies the processing techniques of ISL and integrates the linguistics with computing methodologies to achieve a hand gesture recognition system that dictates gestures to a word processor. The system would interpret an input gesture when the user dictates it and then transmit it to the other system. Our overall goal is to implement existing gesture recognition techniques and develop a hand gesture recognizer for IS language using existing technologies.

3.2. Goals:
This project studies the processing techniques of the Leap Motion controller's infrared sensors applied to hand motion for static gestures, and integrates the linguistics with computing methodologies to achieve a hand gesture recognition system for interaction through peer-to-peer communication. Our goal is to implement gesture recognition techniques and develop a hand gesture recognizer for International Sign Language using existing technologies.

3.3.
Objectives:
The objective of this system is to enable deaf and mute people in our community to work and interact in a normal environment. It would help them dictate their signs to people who have partial or no training in Sign Language, give independent online interviews or lectures, and communicate on a computer. With this software, users may interact with each other or with the general public by dictating their signs into text, lowering the error rate as well.
Hand gestures are the most effective, dominant and basic way for DNM people to interact or communicate with each other and with the general public. They commonly interact with each other using only their regional sign language, which makes communication with people from other regions, who have their own language(s), uncomfortable and difficult. They would prefer to interact with the general public and with computers just like anyone else. The increased interest of researchers in improving human-computer interaction has therefore driven work in the hand gesture recognition field, and hand gesture recognition systems have been developed. In such systems the input gesture signal is converted into words.

Classification of Gesture Recognition Systems:
Gesture recognition systems can be classified into different categories on the basis of the recognition approach, the class of signers, and the size of the vocabulary. There are many challenges in the development of a hand gesture system. These are described briefly as follows:
Hand gesture recognition systems are built using the following types of approaches:
Template Matching: The simplest method for recognizing hand postures is template matching, a method for checking whether a given data record can be classified as a member of a set of stored data records. Recognizing hand postures with template matching has two parts. The first is to create the templates by collecting data values for each posture in the posture set. The second is to find the posture template that most closely matches the current data record by comparing the current sensor readings with the stored set.
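The two parts above can be sketched in a few lines. The five-element "sensor readings" (say, one flex value per finger) and the posture templates are invented for illustration; a real system would record its templates during a calibration phase.

```python
# Hypothetical calibrated templates: one five-finger flex reading per posture.
TEMPLATES = {
    "A": (0.9, 0.9, 0.9, 0.9, 0.1),   # fist, thumb out
    "B": (0.1, 0.1, 0.1, 0.1, 0.9),   # flat hand, thumb folded
    "V": (0.9, 0.1, 0.1, 0.9, 0.9),   # index and middle extended
}

def classify(reading, templates=TEMPLATES):
    """Return the posture whose template is closest to the current sensor
    readings, using squared Euclidean distance as the similarity measure."""
    def distance(name):
        return sum((r - t) ** 2 for r, t in zip(reading, templates[name]))
    return min(templates, key=distance)

print(classify((0.85, 0.15, 0.05, 0.95, 0.88)))  # prints V
```

A production system would also reject readings whose best distance exceeds a threshold, rather than always returning the nearest template.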
Feature Extraction and Analysis: In feature extraction and analysis, low-level information from the raw data is analyzed to produce higher-level semantic information, which is then used to recognize postures and gestures. A system using this approach recognized gestures with over 97% accuracy. It is a robust way to recognize hand postures and gestures, and it can handle both simple and complex ones.
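A minimal illustration of turning raw data into higher-level information follows. The specific features chosen here (per-finger extension from the palm, and horizontal hand span) are invented for illustration, not features prescribed by the approach.

```python
import math

def extract_features(fingertips, palm=(0.0, 0.0)):
    """Turn raw fingertip (x, y) coordinates into two higher-level features:
    per-finger extension (distance from the palm) and horizontal hand span."""
    extensions = [math.dist(tip, palm) for tip in fingertips]
    xs = [x for x, _ in fingertips]
    return {"extensions": extensions, "span": max(xs) - min(xs)}

# A spread hand: five fingertips to the upper right of the palm origin.
features = extract_features([(1, 4), (2, 5), (3, 5), (4, 5), (5, 4)])
print(features["span"])  # prints 4
```

A classifier (such as the template matcher above) would then operate on these features instead of the raw coordinates, which is what makes the approach robust to noise.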
Active Shape Models: Active shape models, or "smart snakes", are a technique for locating a feature within a still image. A contour that is roughly the shape of the feature to be tracked is placed on the image, then manipulated by iteratively moving it toward nearby edges that deform it to fit the feature. For video, the active shape model is applied to each frame, and the position of the feature in that frame is used as an initial approximation for the next frame.
Principal Component Analysis: Principal component analysis (PCA) is a statistical technique for reducing the dimensionality of a data set with many interrelated variables while retaining the variation in the data set. The data set is reduced by transforming the old data to a new set of variables ordered so that the first few variables contain most of the variation present in the original variables. The transformation is computed from the eigenvectors and eigenvalues of the data set's covariance matrix. A drawback when dealing with image data is that PCA is highly sensitive to the position, orientation, and scaling of the hand in the image.
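To make the eigenvector computation concrete, the following pure-Python sketch finds the principal axis of a set of 2-D points (imagine two hand-shape measurements per sample); the closed-form 2x2 eigendecomposition keeps the math explicit, whereas a real system would use a linear-algebra library on much higher-dimensional data.

```python
import math

def pca_2d(points):
    """Return the principal axis (unit eigenvector of the covariance matrix
    for its largest eigenvalue) of a set of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Covariance matrix entries [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Largest eigenvalue of the symmetric 2x2 matrix (quadratic formula).
    trace, det = sxx + syy, sxx * syy - sxy * sxy
    lam = trace / 2 + math.sqrt(max(trace * trace / 4 - det, 0.0))
    # Corresponding eigenvector, normalized to unit length.
    if abs(sxy) > 1e-12:
        vx, vy = lam - syy, sxy
    else:
        vx, vy = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)

# Points scattered near the line y = x vary mostly along the diagonal,
# so the principal axis comes out roughly (0.71, 0.71).
print(pca_2d([(0.0, 0.0), (1.0, 1.1), (2.0, 1.9), (3.0, 3.05)]))
```

Projecting each point onto the first few such axes is what reduces the dimensionality while keeping most of the variation.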
Linear Fingertip Models: This model assumes that most finger movements are linear and involve very little rotational movement. It uses only the fingertips as input data, representing each fingertip trajectory through space as a simple vector. Once the fingertips are detected, their trajectories are calculated using motion correspondence. The postures themselves are modeled from a small training set by storing a motion code, the gesture name, and direction and magnitude vectors for each fingertip. A posture is recognized if all the direction and magnitude vectors match (within some threshold) a gesture record in the training set. System testing showed good recognition accuracy (greater than 90%), but the system did not run in real time, and the posture and gesture set would need to be expanded to determine whether the technique is robust.
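The matching rule above (all direction and magnitude vectors agreeing within a threshold) can be sketched as follows; the gesture name, the tolerance values and the one-displacement-per-fingertip data layout are illustrative assumptions.

```python
import math

def vectors_match(observed, stored, angle_tol=0.3, mag_tol=0.2):
    """observed/stored: one (dx, dy) displacement vector per fingertip.
    Match only if every fingertip agrees in direction AND magnitude."""
    for (ox, oy), (sx, sy) in zip(observed, stored):
        diff = abs(math.atan2(oy, ox) - math.atan2(sy, sx))
        diff = min(diff, 2 * math.pi - diff)  # wrap angle difference
        if diff > angle_tol:
            return False  # direction differs too much
        if abs(math.hypot(ox, oy) - math.hypot(sx, sy)) > mag_tol:
            return False  # magnitude differs too much
    return True

def recognize(observed, training_set):
    """Return the first training gesture whose stored vectors all match."""
    for name, stored in training_set.items():
        if vectors_match(observed, stored):
            return name
    return None

training = {"swipe-right": [(1.0, 0.0)] * 5}     # hypothetical gesture record
print(recognize([(0.95, 0.05)] * 5, training))   # prints swipe-right
```

Because each fingertip is reduced to a single vector, comparison is cheap; the cost of the approach is that curved or rotational finger motions violate the linearity assumption.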
Causal Analysis: Causal analysis is a vision-based recognition technique that stems from work in scene analysis. The technique extracts information from a video stream by using high-level knowledge about actions in the scene and how they relate to one another and to the physical environment. The system captures information on shoulder, elbow and wrist joint positions in the image plane. From these positions, it extracts a feature set that includes wrist acceleration and deceleration, work done against gravity, size of gesture, area between arms, angle between forearms, nearness to body, and verticality. Gesture filters normalize and combine the features and use causal knowledge of how humans interact with objects in the physical world to recognize gestures such as opening, lifting, patting, pushing, stopping, and clutching. It is not clear how accurate this method is, and the system has the disadvantage of not using data from the fingers. More research is needed to determine whether the technique is robust enough for nontrivial applications.
Following are the system types on the basis of the class of signers they support:
1. Signer-Dependent Systems: These systems are designed for a specific signer. They can be developed more easily than signer-independent systems and are more accurate, but less flexible.
2. Signer-Independent Systems: A variety of signers can use this type of system. They are more difficult to develop than signer-dependent systems and are less accurate, but more flexible.
Vocabulary size may increase the complexity of the ISL recognition process. In the MLE system, vocabularies come in various sizes: small, medium, large, and very large, plus the out-of-vocabulary case. Handling a very large vocabulary is more difficult than handling a small one.
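The out-of-vocabulary case can be illustrated with a toy lookup; the gesture IDs, the words, and the `<UNK>` marker are invented for illustration, not the MLE system's actual vocabulary.

```python
# Hypothetical vocabulary mapping recognized gesture IDs to English words.
VOCABULARY = {"g01": "HELLO", "g02": "THANK", "g03": "YOU"}

def to_words(gesture_ids, vocab=VOCABULARY, oov="<UNK>"):
    """Map a sequence of recognized gesture IDs to words, marking any
    out-of-vocabulary gesture with a placeholder instead of failing."""
    return [vocab.get(g, oov) for g in gesture_ids]

print(to_words(["g01", "g99", "g03"]))  # prints ['HELLO', '<UNK>', 'YOU']
```

Marking unknown gestures explicitly, rather than silently dropping them, lets the receiving side see that part of the message was not recognized.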