Recognizing Postures in Vietnamese Sign Language with MEMS Accelerometers

IEEE SENSORS JOURNAL, VOL. 7, NO. 5, MAY 2007

Recognizing Postures in Vietnamese Sign Language With MEMS Accelerometers

The Duy Bui and Long Thang Nguyen, Member, IEEE

Abstract—In this paper, we discuss the application of microelectromechanical system (MEMS) accelerometers to recognizing postures in Vietnamese Sign Language (VSL). We develop a device similar to the AcceleGlove [6] for the recognition of VSL. In addition to the five sensors of the AcceleGlove, we place one more sensor on the back of the hand to improve the recognition process, and we use a completely different classification method, leading to very promising results. This paper concentrates on signing with postures, in which the user spells each word with finger signs corresponding to the letters of the alphabet. We therefore focus on the recognition of postures that represent the 23 Vietnamese base letters, together with two postures for "space" and "punctuation." The data obtained from the sensing device are transformed into relative angles between the fingers and the palm. Each character is recognized by a fuzzy rule-based classification system, which admits vagueness in recognition. In addition, a set of Vietnamese spelling rules is applied to improve the classification results. The recognition rate is high even when the postures are not performed perfectly, e.g., a finger is not bent completely or the palm is not straight.

Index Terms—Human computer interaction, microelectromechanical system (MEMS) sensors, sign language recognition, Vietnamese Sign Language (VSL).

I. INTRODUCTION

Gesture recognition has received much attention from research communities such as human computer interaction and image processing, and it has contributed significantly to improving the interaction between humans and computers. Another application of gesture recognition is sign language translation. Among many types of
gestures, sign languages seem to be the most structured. Each gesture in a sign language is usually associated with a predefined meaning, and the application of strong rules of context and grammar makes a sign language easier to recognize [13].

Manuscript received June 15, 2006; revised August 28, 2006; accepted August 29, 2006. The associate editor coordinating the review of this paper and approving it for publication was Dr. Subhas Mukhopadhyay. T. D. Bui is with the Faculty of Information Technology, College of Technology, Vietnam National University, Hanoi, 144 Xuan Thuy, Hanoi, Vietnam (e-mail: duybt@vnu.edu.vn). T. L. Nguyen is with the Faculty of Electrical Engineering and Telecommunication, College of Technology, Vietnam National University, Hanoi, 144 Xuan Thuy, Hanoi, Vietnam (e-mail: longnt@vnu.edu.vn). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSEN.2007.894132

There are two main approaches to sign language recognition. The first is vision based, using color cameras to track the hands and interpret the sign language. The second uses expensive sensing gloves to extract parameters, such as joint angles, that describe the shape and position of the hand.

With the vision-based approach, Uras and Verri [17] obtained some success recognizing hand shapes using the "size function" concept on a Sun SPARCstation. A recognition rate of 93% for the most recognizable letter, and of 70% for the most difficult case, was obtained by Lamar and Bhuiyan [9] using colored gloves and neural networks. Starner and Pentland track the hands, by their color, with a camera and then use hidden Markov models to interpret American Sign Language (ASL) [13]. In this approach, instead of attempting a fine description of hand shape during the tracking stage, only a coarse description of hand shape, orientation, and trajectory is produced. This information is
used as the input to the hidden Markov models to understand ASL. This approach was developed further by Starner et al. [14], with 98% accuracy; notably, they abandoned the idea of recognizing individual hand postures.

In sign languages, many signs look similar to one another. For example, in the ASL alphabet, the letters "A," "M," "N," "S," and "T" are all signed with a closed fist (see [2]). At first sight, the postures for these five letters appear to be the same, and a vision-based system would encounter difficulties in distinguishing them.

One approach to overcoming these difficulties is to use sensing gloves. There is a considerable body of work on gesture recognition based on sensing gloves. For example, in order to enter ASCII characters into a computer, Grimes [4] developed the Data Entry Glove using switches and other sensors sewn onto the glove. Kramer and Leifer [8] used a lookup table with the patented CyberGlove to recognize the 26 letters of the alphabet. Alternatively, Erenshteyn et al. [3] used a method involving coded output, such as Hamming, Golay, and other hybrid codes, together with the CyberGlove. Zimmerman invented the VPL DataGlove [23], which has been used to recognize postures in different sign languages: a set of 51 basic postures of Taiwanese Sign Language was handled by Liang and Ouhyoung [11] with probability models, and 36 ASL postures were recognized with this glove in the work of Waldron and Kim [18] using a two-stage neural network. These gloves, however, are very expensive. A more affordable option was proposed by Kadous [7]: a system for Australian Sign Language based on Mattel's Power Glove. However, because of a lack of sensors on the pinky finger, that glove could not be used to recognize the alphabet hand shapes. With accelerometers at the fingertips, Perng et al. [12] developed a text editor in which each hand gesture refers to a letter of the alphabet. For more detailed reviews of gesture recognition with sensing gloves, see [15]
and [21].

1530-437X/$25.00 © 2007 IEEE

Together with the rapid development of advanced sensor technology, researchers are paying increasing attention to making this interaction more adaptive, flexible, human oriented and, especially, more affordable. MEMS sensors such as accelerometers, geophones, and gyros [1], [10], [16], [19], [20], thanks to their small size and weight, modest power consumption and cost, and high reliability, allow the development of portable and affordable human computer interaction systems. Recently, Hernandez et al. [6] proposed the AcceleGlove, a whole-hand input device using MEMS accelerometers to manipulate three different virtual objects: a virtual hand, icons on a virtual desktop, and a virtual keyboard using the 26 postures of the ASL alphabet. When this device is used as a finger-spelling translator, a multiclass pattern recognition algorithm is applied [5]. First, the data are collected and analyzed offline on a PC. The obtained data are transformed into vectors in the posture space and then divided into subclasses. This makes it possible to apply simple linear discrimination of the postures in 2-D space, and Bayes' rule in those cases where classes have overlapping feature distributions. The algorithm can be implemented as a sequence of "if-then-else" statements in the microcontroller, allowing real-time processing. This device has much potential and can be developed further into a more comprehensive system and for other sign languages.

In this paper, we discuss the application of MEMS accelerometers to recognizing postures in Vietnamese Sign Language (VSL). We develop a device similar to the AcceleGlove [6] for the recognition of VSL. In addition to the five sensors of the AcceleGlove [6], we place one more sensor on the back of the hand to improve the recognition process. In addition, we use a
completely different method for the classification, leading to very promising results. This paper concentrates on signing with postures, in which the user spells each word with finger signs corresponding to the letters of the alphabet. We therefore focus on the recognition of postures that represent the 23 Vietnamese base letters, together with two postures for "space" and "punctuation." The data obtained from the sensing device are transformed into relative angles between the fingers and the palm. Characters are recognized by a fuzzy rule-based classification system, which admits vagueness in recognition. In addition, a set of Vietnamese spelling rules is applied to improve the classification results. The recognition rate is high even when the postures are not performed perfectly, e.g., a finger is not bent completely or the palm is not straight.

II. VIETNAMESE ALPHABET SYSTEM

Vietnamese was originally written with a Chinese-like script. During the 17th century, a Latin-based orthography for Vietnamese was introduced by Roman Catholic missionaries. Until the early 20th century, both orthographies were used in parallel; today, the Latin-based script is the only orthography used in Vietnam. The Latin-based Vietnamese alphabet is:

A Ă Â B C D Đ E Ê G H I K L M N O Ô Ơ P Q R S T U Ư V X Y

Fig. 1. Alphabet system in VSL.

The letters J, W, and Z are also used, but only in foreign loan words. In addition, Vietnamese is a tonal language with six tones, marked as follows: level, high rising, low (falling), dipping rising, high rising glottalized, and low glottalized. Since the Vietnamese alphabet system is more complicated than the English one, more signs are required for VSL than for ASL. However, it is possible to implement finger spelling of Vietnamese words in a way similar to the ASL system. In principle, VSL is based on the well-established ASL. According to the ASL dictionary [2], four components are used to describe a sign: hand
shape, location in relation to the body, movement of the hands, and orientation of the palms. A popular concept in sign language, the "posture," is formed by the hand shape (the position of the fingers with respect to the palm), the static component of the sign, and the orientation of the palm. The ASL alphabet, which consists of 26 unique, distinguishable postures, is used to spell names or uncommon words that are not well defined in the dictionary. VSL consists of 23 base letters and some additional signs for the accents and the tones. The 23 base letters are: A B C D Đ E G H I K L M N O P Q R S T U V X Y. In this paper, we concentrate on the recognition of the postures for these base letters, which are shown in Fig. 1.

III. THE SENSING DEVICE

One of the most successful MEMS sensors on the market is the ADXL202 accelerometer from Analog Devices, Inc. (www.analog.com). The ADXL202 is a low-cost, low-power, complete two-axis accelerometer on a single IC chip with a measurement range of ±2 g. It can measure both dynamic acceleration (e.g., vibration) and static acceleration (e.g., gravity). The accelerometer is fabricated by surface-micromachining technology and is composed of a small mass suspended by springs. Capacitive sensors distributed along two orthogonal axes (X and Y) provide a measurement proportional to the displacement of the mass with respect to its rest position. Because the mass is displaced from the center either by acceleration or by an inclination with respect to the gravitational vector, the sensor can be used to measure absolute angular position. The outputs are digital signals whose duty cycles (the ratio of pulsewidth to period) are proportional to the acceleration in each of the two sensitive axes.

Fig. 2. Functional block diagram of the ADXL202 (from Analog Devices, Inc.).
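As a hedged sketch of this duty-cycle output model, the tilt angle of one sensor axis can be recovered as follows. The 50% zero-g duty cycle and the 12.5% duty-cycle change per g are assumed nominal values (consistent with the 37.5%–62.5% swing over a ±90° tilt); this is an illustration, not the authors' implementation.

```python
import math

# Assumed nominal ADXL202 duty-cycle model: 50% at 0 g, 12.5% per g.
ZERO_G_DUTY = 0.50      # duty cycle when the axis is horizontal (0 g)
SENSITIVITY = 0.125     # duty-cycle change per g of static acceleration

def duty_to_tilt_deg(duty: float) -> float:
    """Convert a measured duty cycle to an absolute tilt angle in degrees.

    Under static conditions the axis sees a = g * sin(theta), so the
    duty cycle is ZERO_G_DUTY + SENSITIVITY * sin(theta).
    """
    a_in_g = (duty - ZERO_G_DUTY) / SENSITIVITY   # acceleration in g
    a_in_g = max(-1.0, min(1.0, a_in_g))          # clamp measurement noise
    return math.degrees(math.asin(a_in_g))

# A horizontal sensor reads 50%; tilting to +/-90 deg gives 62.5% / 37.5%.
print(duty_to_tilt_deg(0.50))    # ~0.0
print(duty_to_tilt_deg(0.625))   # ~90.0
print(duty_to_tilt_deg(0.375))   # ~-90.0
```

The clamp matters in practice: sensor noise can push the inferred acceleration slightly past ±1 g, where `asin` is undefined.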
Fig. 3. Sensing glove with six accelerometers and a BASIC Stamp microcontroller.

Fig. 4. The X axis and Y axis of the sensor on each finger and of the sensor on the back of the palm.

The output period is adjustable from 0.5 to 10 ms via a single resistor. If a voltage output is desired, a voltage proportional to acceleration is available from the filtered outputs, or it may be reconstructed by filtering the duty-cycle outputs. The bandwidth of the ADXL202 may be set from 0.01 Hz to 5 kHz via capacitors CX and CY. The typical noise floor is 500 µg/√Hz, allowing signals below 5 mg to be resolved for bandwidths below 60 Hz. The functional block diagram of the ADXL202 is shown in Fig. 2.

Our sensing device, shown in Fig. 3, consists of six ADXL202 accelerometers attached to a glove: five on the fingers and one on the back of the palm. The Y axis of the sensor on each finger points toward the fingertip, providing a measure of joint flexion (see Fig. 4). The Y axis of the sensor on the back of the palm measures the flexing angle of the palm. The X axis of the sensor on the back of the palm can be used to extract hand roll, while the X axis of the sensor on each finger provides information on individual finger abduction.

Data are collected by measuring the duty cycle of a 1-kHz pulse train. When a sensor is in its horizontal position, the duty cycle is 50%; when it is tilted from −90° to +90°, the duty cycle varies from 37.5% (0.375 ms) to 62.5% (0.625 ms), respectively (see Fig. 5).

Fig. 5. Dependence of accelerometer output on tilt angle.

In our device, the duty cycle is measured using a BASIC Stamp microcontroller. The Parallax BASIC Stamp module is a small, low-cost, general-purpose I/O computer programmed in a simple form of BASIC (from Parallax, Inc., www.parallax.com). The pulsewidth-modulated output of the ADXL202 can be read directly by the BASIC Stamp module, so no ADC is necessary. Twelve pulsewidths are read sequentially by the microcontroller, the X axis first and then the Y axis, thumb first. The data are then sent through the serial port to a PC for further analysis.

IV. DATA PROCESSING

Our sensing glove produces raw data represented as a vector of 12 measurements: two axes per finger, with the last two axes for the palm. We first convert the data to angles; we then subtract the X and Y values of the palm from the X and Y values of the fingers, respectively. Note that our sensing device has a sensor on the back of the palm, which measures the rolling and flexing angles of the palm. By processing the data this way, we convert the raw data into relative angles between the fingers and the palm. We perform the classification based on the X and Y values of the palm and the relative angles between the fingers and the palm.

Our approach differs from the one proposed in [6], which recognizes postures by extracting features directly from the raw data, for two reasons. The first reason is that the raw data are pulsewidths that relate to the rolling or flexing angles through cosine functions; since the cosine function is not linear, the sum of pulsewidths measured on the fingers does not represent the hand shape accurately. The second reason is that the sum is not a good function for extracting hand-shape features: many different hand shapes can result in the same value of such a feature.
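As an illustration of this preprocessing, the sketch below converts the 12 duty-cycle readings into tilt angles and then into finger-minus-palm relative angles. The reading layout (an X/Y pair per digit, palm pair last), the duty-cycle model, and all names are assumptions for illustration, not the authors' code.

```python
import math

def duty_to_angle(duty: float) -> float:
    """Duty cycle -> static tilt angle in degrees (assumed 50% at 0 g,
    12.5% duty-cycle change per g)."""
    a = max(-1.0, min(1.0, (duty - 0.5) / 0.125))  # acceleration in g
    return math.degrees(math.asin(a))

def to_relative_angles(duties):
    """Return per-finger (x_rel, y_rel) angles plus the palm (x, y) angles.

    `duties` is assumed to hold 12 values: X,Y for thumb, index, middle,
    ring, pinky, and finally the palm sensor.
    """
    angles = [duty_to_angle(d) for d in duties]
    palm_x, palm_y = angles[10], angles[11]        # last pair is the palm
    fingers = []
    for i in range(5):                             # thumb .. pinky
        fx, fy = angles[2 * i], angles[2 * i + 1]
        fingers.append((fx - palm_x, fy - palm_y)) # relative to the palm
    return fingers, (palm_x, palm_y)

# A flat hand (every sensor horizontal) yields all-zero relative angles,
# regardless of any common offset the palm and fingers share.
flat = [0.5] * 12
fingers, palm = to_relative_angles(flat)
```

Working in relative angles is what makes the classifier tolerant of a tilted palm: tilting the whole hand shifts the palm and finger angles together, leaving the differences nearly unchanged.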
V. CLASSIFICATION

The classification system can be seen in Fig. 6. First of all, we use the palm's angle value to divide the postures into three categories: "Vertical," consisting of the postures of the letters "A," "B," "E," "H," "I," "K," "L," "O," "R," "T," "U," "V," and "Y"; "Sloping," consisting of the postures of the letters "C," "D," "Đ," and "S"; and "Horizontal," consisting of the postures of the letters "G," "M," "N," "P," "Q," and "X." After the postures are divided into three categories, we use three fuzzy rule-based systems to perform further classification.

Fig. 6. An overview of the classification system.

Human beings often need to deal with input that is not in precise numerical form. Inspired by this observation, Zadeh [22] developed fuzzy set theory, which allows concepts that do not have well-defined, sharp boundaries. In contrast to classical set theory, in which an object is either a full member or a full nonmember of a set, an object in fuzzy set theory can possess partial membership of a fuzzy set. A fuzzy proposition of the form "x is A" is partially satisfied if the object (usually a crisp value x) has partial membership of the fuzzy set A. Based on this, fuzzy logic was developed to deal with fuzzy "if-then" rules, where the "if" condition of a rule is a Boolean combination of fuzzy propositions. When the "if" condition is partially satisfied, the conclusion of a fuzzy rule is drawn based on the degree to which the condition is satisfied.

We have found that the concept of a fuzzy set is well suited to the problem of posture classification, because a posture is normally defined in a vague way, e.g., "the index finger bends a little bit." Moreover, with a fuzzy rule-based system, the classification can be solved by a set of rules in natural language that look like:

if all fingers bend maximally then it is the posture of letter "A"

if all fingers do not bend then it is the posture of letter "B"

The fuzzy rule-based system allows immediate classification at a high recognition rate without having to collect training samples. Moreover, a new rule can easily be added for a new posture without changing the existing rules; we would miss out on these properties with other models such as neural networks and hidden Markov models.

We model the level of bending or flexing of the fingers with five fuzzy sets (Fig. 7): Very-Low, Low, Medium, High, and Very-High. The fuzzy classification rules look like:

if the thumb's bending is Low and the index finger's bending is Very-Low and the middle finger's bending is Very-Low and the ring finger's bending is Very-Low and the pinky finger's bending is Very-Low then the posture is recognized as letter "B"

Fig. 7. The five fuzzy sets representing the level of bending or flexing of the fingers.

We have created 22 fuzzy rules to classify the VSL postures; the posture of the letter "G" is recognized directly from the palm's angle value. With these fuzzy rules, the classification process works as follows. Every time we receive data from the sensing device, we first verify whether the hand is static by comparing with the previous data, and we wait until the hand stops moving before starting the recognition process. The preprocessed data are used to calculate the membership values, i.e., the degrees to which the data belong to the fuzzy sets. We then calculate the degree to which the current data set matches each of the 22 fuzzy rules; the matching degree is calculated as the product of the membership values of the fuzzy sets appearing in the rule. Finally, the data set is recognized by the rule with the highest matching degree.

The recognition process is further enhanced by the use of Vietnamese spelling rules. Vietnamese has a very special characteristic: all words are monosyllabic. Moreover, Vietnamese spelling rules are very strict; the combination of consonants and vowels must follow a set of predefined rules, and in most cases a consonant cannot be followed by another consonant. Taking advantage of these rules when recognizing words formed by letter postures, we can eliminate many misclassifications.

VI. RESULTS

The system was implemented in C++ and was tested using a total of 200 samples per letter to measure the recognition rate. To collect the samples, we asked five different persons to perform the posture of each letter 40 times. Twenty of the 23 letters reached a 100% recognition rate.
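The fuzzy classification described in Section V can be sketched as follows. The triangular membership shapes, the 0–90° bending range, and the two sample rules are illustrative assumptions; the paper's actual 22 rules and fuzzy-set definitions are not reproduced here.

```python
# Illustrative fuzzy rule-based classifier: five triangular fuzzy sets over
# an assumed 0-90 deg bending range, rule matching as the product of
# membership values, and the best-matching rule wins.

CENTERS = {"Very-Low": 0.0, "Low": 22.5, "Medium": 45.0,
           "High": 67.5, "Very-High": 90.0}
HALF_WIDTH = 22.5  # assumed half-width of each triangular set

def membership(angle: float, label: str) -> float:
    """Triangular membership of a bending angle in one fuzzy set."""
    return max(0.0, 1.0 - abs(angle - CENTERS[label]) / HALF_WIDTH)

# Each rule maps a letter to the required fuzzy set per finger
# (thumb, index, middle, ring, pinky). Only two sample rules are shown.
RULES = {
    "B": ("Low", "Very-Low", "Very-Low", "Very-Low", "Very-Low"),
    "A": ("Very-High",) * 5,   # all fingers bent maximally (closed fist)
}

def classify(bend_angles):
    """Return the rule with the highest matching degree (product of memberships)."""
    best, best_deg = None, -1.0
    for letter, sets in RULES.items():
        deg = 1.0
        for angle, label in zip(bend_angles, sets):
            deg *= membership(angle, label)
        if deg > best_deg:
            best, best_deg = letter, deg
    return best, best_deg

# An imperfect "B" (fingers not perfectly straight) still matches rule "B",
# because each membership degrades gracefully instead of failing outright.
letter, degree = classify([20.0, 8.0, 3.0, 2.0, 5.0])
print(letter)   # B
```

The product combination means one badly violated condition (membership 0) vetoes a rule, while small deviations merely lower its score, which is exactly the tolerance to imperfect postures reported above.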
This is a very good recognition rate compared with that of vision-based approaches: in front of a single camera, many postures look very similar. This is not an issue with glove-based approaches, because the joint flexion and abduction of each individual finger can be measured. Compared with other glove-based approaches using expensive, comprehensive gloves, this is also a competitive rate.

In our approach, the problem arises with the three letters "R," "U," and "V." They produce ambiguity because the data representing these letters are similar; their recognition rates are 90%, 79%, and 93%, respectively. The current two-axis MEMS accelerometers cannot detect well the differences among the postures of these three letters; vision-based or other glove-based approaches might not suffer from this problem. We improved the situation by applying Vietnamese spelling rules to word spelling in our system, after which the recognition rates for "R," "U," and "V" increased significantly, to 94%, 90%, and 96%, respectively.

The novelty of our system is that the postures can be recognized even when they are not performed perfectly, e.g., a finger is not bent completely or the palm is not straight. This is because we carry out the classification on the relative angles between the fingers and the palm instead of on the raw data, as in [6]. It is also a result of the fuzzy rule-based system, which admits vagueness in recognition.

VII. CONCLUSION

In this paper, we presented work on understanding VSL through the use of MEMS accelerometers. The system consists of six ADXL202 accelerometers for sensing the hand posture, a BASIC Stamp microcontroller, and a PC for data acquisition and sign language recognition. The classification is done by a fuzzy rule-based system on the preprocessed data. In addition, we applied a set of Vietnamese spelling rules to improve the classification results. We have
achieved very high recognition rates. Moreover, the postures can be recognized even when they are not performed perfectly.

One advantage of glove-based approaches is their potential for mobility: the glove can be used independently with an embedded processor, or it can connect wirelessly to mobile devices such as mobile phones or PDAs. This calls for future work on wireless communication for the system, of which energy consumption is the most challenging problem, and it also requires porting the software part of the system to mobile devices. In the future, we also want to add sensors of different types, such as three-axis MEMS accelerometers and strain gauges, to our sensing device in order to recognize more complex forms of VSL, as well as gestures for other human computer interaction applications. The approach presented in this paper can easily be applied to other sign languages, mainly by modifying the rules in our fuzzy rule-based system. Other potential applications for our sensing glove include a wireless wearable mouse pointing device, a wireless wearable keyboard, hand motion and gesture recognition tools, virtual musical instruments, computer sporting games, and work training in a simulated environment.

REFERENCES

[1] N. Chaimanonart and D. J. Young, "Remote RF powering system for wireless MEMS strain sensors," IEEE Sensors J., vol. 6, no. 2, pp. 484–489, Apr. 2006.
[2] E. Costello, Random House Webster's Concise American Sign Language Dictionary. New York: Random House, 1999.
[3] R. Erenshteyn, D. Saxe, P. Laskov, and R. Foulds, "Distributed output encoding for multi-class pattern recognition," in Proc. Int. Conf. Image Anal. Process., 1999, pp. 229–234.
[4] G. Grimes, "Digital data entry glove interface device," U.S. Patent 4 414 537, 1983.
[5] J. Hernandez, N. Kyriakopoulos, and R. Lindeman, "The AcceleGlove: A whole-hand input for virtual reality," in Proc. SIGGRAPH 2002, San Antonio, TX, 2002, p. 259.
[6] J.
Hernandez, R. Lindeman, and N. Kyriakopoulos, "A multi-class pattern recognition system for practical finger spelling translation," in Proc. 4th IEEE Int. Conf. Multimodal Interfaces, Pittsburgh, PA, 2002, pp. 185–190.
[7] M. W. Kadous, "GRASP: Recognition of Australian Sign Language using instrumented gloves," M.S. thesis, Univ. New South Wales, Sydney, Australia, 1995.
[8] J. Kramer and L. Leifer, "The talking glove: An expressive and receptive 'verbal' communication aid for the deaf, deaf-blind, and nonvocal," in Proc. SIGCAPH 39, 1988, pp. 12–15.
[9] M. V. Lamar and M. S. Bhuiyan, "Hand alphabet recognition using morphological PCA and neural networks," in Proc. Int. Joint Conf. Neural Netw., Washington, DC, 1999, vol. 4, pp. 2839–2844.
[10] S. Lei, C. A. Zorman, and S. L. Garverick, "An oversampled capacitance-to-voltage converter IC with application to time-domain characterization of MEMS resonators," IEEE Sensors J., vol. 5, no. 6, pp. 1353–1361, Dec. 2005.
[11] R. Liang and M. Ouhyoung, "A real-time continuous gesture recognition system for sign language," in Proc. 3rd IEEE Int. Conf. Autom. Face Gesture Recogn., 1998, pp. 558–567.
[12] J. Perng, B. Fisher, S. Hollar, and K. S. J. Pister, "Acceleration sensing glove (ASG)," in Proc. Int. Symp. Wearable Comput. (ISWC), San Francisco, CA, 1999, pp. 178–180.
[13] T. Starner and A. Pentland, "Real-time American Sign Language recognition from video using hidden Markov models," MIT Media Lab, Perceptual Computing Group, Cambridge, MA, Tech. Rep. 375, 1995.
[14] T. Starner, J. Weaver, and A. Pentland, "A wearable computer based American Sign Language recognizer," MIT Media Lab, Cambridge, MA, Tech. Rep. 425, 1998.
[15] D. J. Sturman and D. Zeltzer, "A survey of glove-based input," IEEE Comput. Graphics Appl., pp. 30–39, 1994.
[16] S. X. P. Su, H. S. Yang, and A. M. Agogino, "A resonant accelerometer with two-stage microleverage mechanisms fabricated by SOI-MEMS technology," IEEE Sensors J., vol. 5, no. 6, pp. 1214–1223, Dec. 2005.
[17] C. Uras and A.
Verri, "On the recognition of the alphabet of the sign language through size functions," in Proc. 12th IAPR Int. Conf. Pattern Recogn., 1994, vol. 2, pp. 334–338.
[18] M. B. Waldron and S. Kim, "Isolated ASL sign recognition system for deaf persons," IEEE Trans. Rehabil. Eng., vol. 3, no. 3, pp. 261–271, Sep. 1995.
[19] R. Wang, W. H. Ko, and D. J. Young, "Silicon-carbide MESFET-based 400 °C MEMS sensing and data telemetry," IEEE Sensors J., vol. 5, no. 6, pp. 1389–1394, 2005.
[20] Y. Wang, X. Li, T. Li, H. Yang, and J. Jiao, "Nanofabrication based on MEMS technology," IEEE Sensors J., vol. 6, no. 3, pp. 686–690, Jun. 2006.
[21] R. Watson, "A survey of gesture recognition techniques," Dept. Comput. Sci., Trinity College, Dublin, Ireland, Tech. Rep. TCD-CS-93-11, 1993.
[22] L. A. Zadeh, "Fuzzy sets," Inf. Control, vol. 8, pp. 338–353, 1965.
[23] T. Zimmerman, "Optical flex sensor," U.S. Patent 4 542 291, 1987.

The Duy Bui received the B.S. degree in computer science from the University of Wollongong, Wollongong, NSW, Australia, in 2000, and the Ph.D. degree in computer science from the University of Twente, Enschede, The Netherlands, in 2004. Since 2004, he has been with the College of Technology, Vietnam National University, Hanoi, as a Lecturer. His main research interests are human computer interaction, virtual reality, and multimedia.

Long Thang Nguyen (S'03–M'04) received the M.S. degree from the International Institute of Materials Science, Hanoi University of Technologies, Hanoi, Vietnam, in 1998, and the Doctor of Engineering degree from the University of Twente, Enschede, The Netherlands, in 2004. He has been a Lecturer with the Faculty of Electronics and Telecommunications, College of Technology, Hanoi National University, since 2004. His main activities are related to the design and application of MEMS sensors. He has been involved in several projects, such as the design of a patient monitoring system and the integration of inertial MEMS sensors and GPS for navigation.