Information for topics
Topic ID:
548
Partner Email:
L.J.M.Rothkrantz@tudelft.nl
Project Title:
Building a visual speech recognizer
Abstract:
This thesis describes how an automatic lip reader was realized. Visual speech recognition is a precondition for more robust speech recognition in general. The development of the software comprised the following steps: gathering training data, extracting meaningful features from the obtained video material, training the speech recognizer, and finally evaluating the resulting product. First, research was done to gain insight into the theoretical aspects of automatic lip reading, the state of the art, speech corpus development, face tracking, and feature extraction. The results of a visual speech recognizer based on training data from a single person depend on the utterance type of the unlabeled data. For the simple word-level task of digit recognition, 78% was recognized correctly, with a word recognition rate of 68%. For letter recognition tasks it did not perform nearly as well, but considering the limitations that the use of visemes instead of phonemes imposes, these results are at the expected level. The data corpus and visual speech recognizer will be a valuable asset to future research.
Advisor:
Leon Rothkrantz
Link:
Degree:
Master
Keywords:
Computer Software
Artificial intelligence & Neural networks
Data mining
Data modeling
Information systems