
Patent Issued for Microphone-Array-Based Speech Recognition System and Method

June 17, 2014



By a News Reporter-Staff News Editor at Journal of Technology -- A patent by the inventor Liao, Hsien-Cheng (Taipei, TW), filed on October 12, 2011, was published online on June 3, 2014, according to news reporting originating from Alexandria, Virginia, by VerticalNews correspondents.

Patent number 8744849 is assigned to Industrial Technology Research Institute (Hsinchu, TW).

The following quote was obtained by the news editors from the background information supplied by the inventors: "Recently users of mobile devices such as flat panel computer, cellular phone increase dramatically, vehicle electronics and robotics are also developing rapidly. The speech applications of these areas may be seen growing in near future. Google's Nexus One and Motorola's Droid introduce active noise cancellation (ANC) technology to the mobile phone market, improve the input of speech applications, and make the back-end speech recognition or its application performing better, so that users may get better experience. In recent years, more mobile phone manufacturers are also actively involved in research of noise cancellation technology.

"Common robust speech recognition technology includes two types. One type is the two-stage robust speech recognition technology, such kind of technology first enhances speech signal, and then transmits the enhanced signal to a speech recognition device for recognition. For example, uses two adaptive filters or combined algorithm of pre-trained speech and noise models to adjust an adaptive filter, enhances speech signal, and transmits the enhanced-signal to the speech recognition device. Another type uses a speech model as the basis for adaptive filter adjustment, but does not consider the information of noise interference. The criteria of this speech signal enhancement is based on maximum likelihood, that is, the better the enhanced speech signal more similar to the speech model.

"FIG. 1 illustrates an exemplary schematic diagram of filter parameter adjustment process in a dual-microphone-based speech enhancement technology. The speech enhancement technology uses a re-recorded and filtered corpus to train a speech model 110, then uses the criterion of maximized similarity to adjust the noise filtering parameter y, that is, the criteria of the speech enhancement technique is determined by the better the enhanced speech signal 105a from the phase-difference-based time-frequency filtering 105 more similar to the speech model 110. The corpus for the training of the speech model 110 is needed to be re-recorded and filtered, and no noise information is considered, thus the setting for test and training conditions may be mismatched.

"Dual microphone or microphone-array noise cancellation technology has a good anti-noise effect. However, in different usage environments, the ability of anti-noise is not the same. It is worth for research and development work on adjusting parameters of microphone array to increase speech recognition accuracy and provide better user experience."

In addition to the background information obtained for this patent, VerticalNews journalists also obtained the inventor's summary information for this patent: "The present disclosure generally relates to a microphone-array-based speech recognition system and method.

"In an exemplary embodiment, the disclosed relates to a microphone-array-based speech recognition system. This system combines a noise masking module for cancelling noise of input speech signals from an array of microphones, according to an inputted threshold. The system comprises at least a speech model and at least a filler model to receive respectively a noise-cancelled speech signal outputted from the noise masking module, a confidence measure score computation module, and a threshold adjustment module. The confidence measure score computation module computes a confidence measure score with the at least a speech model and the at least a filler model of the noise-cancelled speech signal. The threshold adjustment module adjusts the threshold and provides it to the noise masking module to continue the noise cancelling for achieving a maximum confidence measure score through confidence measure score computation module, thereby outputting a speech recognition result related to the maximum confidence measure score.

"In another exemplary embodiment, the disclosed relates to a microphone-array-based speech recognition system. The system combines a noise masking module to process noise cancelling of input speech signals from an array of microphones, according to each of a plurality of inputted thresholds within a predetermined range. The system comprises at least a speech model and at least a filler model to receive respectively a noise-cancelled speech signal outputted from the noise masking module, a confidence measure score computation module, and a threshold adjustment module. The confidence measure score computation module computes a confidence measure score with the at least a speech model and the at least a filler model for each given threshold within the predetermined range and the noise-cancelled speech signal. The maximum confidence measure score computation module determines a maximum confidence measure score from all confidence measure scores computed by the confidence measure score computation module and finds a threshold corresponding to the maximum confidence measure score among all the computed confidence measure scores, and outputs a speech recognition result.

"Yet in another exemplary embodiment, the disclosed relates to a microphone-array-based speech recognition method. This method is implemented by a computer system, and may comprise following acts executed by the computer system: performing noise cancelling of input speech signals from an array of microphones, according to at least an inputted threshold, and transmits a noise-cancelled speech signal to at least a speech model and at least a filler model respectively, using at least a processor to compute a corresponding confidence measure score based on score information obtained by each of the at least a speech model and score obtained by each of the at least a filler model, and finds a threshold corresponding to the maximum confidence measure score among all the computed corresponding confidence measure scores, and outputs a speech recognition result.

"The foregoing and other features, aspects and advantages of the exemplary embodiments will become better understood from a careful reading of a detailed description provided herein below with appropriate reference to the accompanying drawings."
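
The threshold search described in the summary can be illustrated with a short sketch. The Python below is a toy illustration under stated assumptions, not the patented implementation. The helper names (noise_mask, template_score, best_threshold) are hypothetical, a simple amplitude gate stands in for the noise masking module, template-distance scores stand in for the speech and filler models, and the confidence measure is assumed to be the best speech-model score minus the filler-model score. The article does not give the actual formulas.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Signal = List[float]
Scorer = Callable[[Signal], float]


def noise_mask(signal: Signal, threshold: float) -> Signal:
    """Toy stand-in for the noise masking module: zero out samples whose
    magnitude is below the threshold (an assumption for illustration)."""
    return [x if abs(x) >= threshold else 0.0 for x in signal]


def template_score(template: Signal) -> Scorer:
    """Build a toy model whose score is the negative squared distance
    between the signal and a fixed template (higher means more similar)."""
    def score(signal: Signal) -> float:
        return -sum((x - t) ** 2 for x, t in zip(signal, template))
    return score


def best_threshold(
    signal: Signal,
    thresholds: Iterable[float],
    speech_models: Dict[str, Scorer],
    filler_model: Scorer,
) -> Tuple[float, str, float]:
    """Try every threshold and keep the one with the highest confidence.

    Confidence is assumed here to be the best speech-model score minus
    the filler-model score for the same noise-cancelled signal.
    """
    best = None
    for threshold in thresholds:
        cleaned = noise_mask(signal, threshold)
        # Pick the speech model that best matches the cleaned signal.
        label, speech_score = max(
            ((name, model(cleaned)) for name, model in speech_models.items()),
            key=lambda pair: pair[1],
        )
        confidence = speech_score - filler_model(cleaned)
        if best is None or confidence > best[2]:
            best = (threshold, label, confidence)
    if best is None:
        raise ValueError("at least one threshold is required")
    return best


# Hypothetical toy data: two word templates, a flat filler, and a noisy input.
speech_models = {
    "yes": template_score([0.0, 1.0, 0.0, 0.9, 0.0]),
    "no": template_score([1.0, 0.0, 1.0, 0.0, 1.0]),
}
filler_model = template_score([0.4] * 5)
noisy = [0.1, 1.0, 0.2, 0.9, 0.15]

threshold, label, confidence = best_threshold(
    noisy, [0.0, 0.12, 0.25, 0.95, 1.5], speech_models, filler_model
)
print(threshold, label, round(confidence, 2))
```

With this toy data the gate at 0.25 removes the low-level samples without touching the two strong ones, so that threshold gives the highest confidence and the recognized label is "yes". Thresholds that are too low leave noise in, and thresholds that are too high remove speech, so both score lower.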

For the URL and more information on this patent, see: Liao, Hsien-Cheng. Microphone-Array-Based Speech Recognition System and Method. U.S. Patent Number 8744849, filed October 12, 2011, and published online on June 3, 2014. Patent URL: http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=8744849.PN.&OS=PN/8744849RS=PN/8744849

Keywords for this news article include: Industrial Technology Research Institute.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


Source: Journal of Technology